Compute Query and Document Similarity using Explicit Semantic Analysis

Sangeetha, M.; Keerthika, P.; Devendran, K.; Sridhar, S.; Raagav, S. Shree; Vigneshwar, T.

doi:10.1109/iccmc53470.2022.9754087

2022 6th International Conference on Computing Methodologies and Communication (ICCMC)


Cited by 1 publication

References 14 publications


Topical and Non-Topical Approaches to Measure Similarity between Arabic Questions

Daoud

2022

BDCC


Questions are crucial expressions in any language. Many Natural Language Processing (NLP) and Natural Language Understanding (NLU) applications, such as question-answering systems, chatbots, digital virtual assistants, and opinion mining, can benefit from accurately and efficiently identifying similar questions. We detail methods for identifying similarities between Arabic questions posted online by Internet users and organizations. Our novel approach combines a non-topical rule-based methodology with topical information (textual, lexical, and semantic similarity) to determine whether a pair of Arabic questions are paraphrases of each other. Our method measures the lexical and linguistic distances between the two questions in a pair. Additionally, it classifies questions by format and scope using expert hypotheses (rules) that have been shown experimentally to be useful and practical. For example, a When question (a Timex factoid, inquiring about time) and a Who question (an Enamex factoid, asking about a named entity) are not judged similar even if they have a high degree of lexical similarity. In an experiment using 2200 question pairs, our method attained an accuracy of 0.85, which is remarkable given the simplicity of the solution and the fact that we did not employ any language models or word embeddings. To cover common queries posed by Arabic Internet users, we gathered the questions from various online forums and resources. The method does not require intensive processing, a sizable linguistic corpus, or a costly semantic repository. Because rich Arabic textual resources are scarce, this is especially important for processing informal Arabic text on the Internet.
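The combination described in the abstract above, a question-type rule applied alongside a lexical similarity score, can be illustrated with a minimal sketch. This is not the authors' implementation: the type table, the Jaccard measure, the 0.5 threshold, and every function name are assumptions chosen for illustration, and English question words stand in for the Arabic ones the paper handles.

```python
import re

# Hypothetical mapping from a leading question word to a coarse question type.
# English words are used for readability; the paper works on Arabic questions.
QUESTION_TYPES = {
    "when": "TIMEX",     # asks about a time
    "who": "ENAMEX",     # asks about a named entity
    "where": "LOCATION",
    "how": "MANNER",
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def question_type(question):
    """Return the coarse type implied by the first token, or 'OTHER'."""
    tokens = tokenize(question)
    if not tokens:
        return "OTHER"
    return QUESTION_TYPES.get(tokens[0], "OTHER")


def jaccard(a, b):
    """Jaccard overlap between the token sets of two strings."""
    set_a, set_b = set(tokenize(a)), set(tokenize(b))
    if not set_a and not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)


def are_similar(q1, q2, threshold=0.5):
    """Apply the non-topical rule first (types must match), then the lexical score."""
    if question_type(q1) != question_type(q2):
        return False
    return jaccard(q1, q2) >= threshold
```

With these assumptions, "When was the city founded?" and "Who founded the city?" share half of their combined vocabulary, yet `are_similar` rejects the pair because the question types differ, which mirrors the When/Who example in the abstract.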
