Search engines are an indispensable part of a web user's life. A vast majority of these web users experience difficulties caused by the keyword-based search engines such as inaccurate results for queries and irrelevant URLs even though the given keyword is present in them. Also, relevant URLs may be lost as they may have the synonym of the keyword and not the original one. This condition is known as the polysemy problem. To alleviate these problems, we propose an algorithm called automatic discovery and ranking of synonyms for search keywords in the web (ADRS). The proposed method generates a list of candidate synonyms for individual keywords by employing the relevance factor of the URLs associated with the synonyms. Then, ranking of these candidate synonyms is done using co-occurrence frequencies and various page count-based measures. One of the major advantages of our algorithm is that it is highly scalable which makes it applicable to online data on the dynamic, domain-independent and unstructured World Wide Web. The experimental results show that the best results are obtained using the proposed algorithm with WebJaccard.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.