The success of information-retrieval-style keyword search on the web has led to the emergence of XML-based keyword search. The differences between text databases and XML databases lead to three new challenges: 1) the user's search intention must be identified, i.e., the XML node types that the user wants to search for and search via; 2) similarities in tag names, tag values, and tag structure must be identified; 3) a new scoring function is needed to estimate the relevance of the search results (XML documents) to the given query. Existing systems do not address these challenges, which leads to low-quality results in terms of query relevance. In this paper, an IR-style approach is proposed that uses the statistics of the underlying XML data to address these challenges. First, specific guidelines are proposed that a search engine should meet in both search intention identification and relevance-oriented ranking of search results. Then, based on these guidelines, a novel XML TF*IDF ranking strategy is proposed to rank the individual matches of all possible search intentions.
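To make the idea of ranking XML matches with data statistics more concrete, the following is a minimal sketch, not the ranking strategy proposed in the paper. It assumes the search-for node type is already known, treats each element of that type as a "document" made of the tag names and text in its subtree, and scores it with a plain TF*IDF sum. The names `subtree_terms`, `rank_nodes`, and `SAMPLE_XML`, as well as the smoothed IDF formula, are illustrative assumptions.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample data, used only for illustration.
SAMPLE_XML = """
<library>
  <book><title>XML search</title><author>Lee</author></book>
  <book><title>Web retrieval</title><author>Kim</author></book>
  <book><title>XML XML ranking</title><author>Lee</author></book>
</library>
"""


def subtree_terms(elem):
    """Collect lower-cased tokens from tag names and text in elem's subtree."""
    terms = []
    for node in elem.iter():
        terms.append(node.tag.lower())
        if node.text:
            terms.extend(node.text.lower().split())
    return terms


def rank_nodes(xml_text, node_type, query):
    """Rank all elements of node_type by a simple TF*IDF score for the query.

    Assumption: statistics are computed over elements of node_type only,
    and IDF is smoothed as log(1 + N / df). Returns (score, element) pairs,
    highest score first.
    """
    root = ET.fromstring(xml_text)
    candidates = list(root.iter(node_type))
    bags = [Counter(subtree_terms(c)) for c in candidates]
    total = len(candidates)
    keywords = query.lower().split()

    scored = []
    for cand, bag in zip(candidates, bags):
        score = 0.0
        for kw in keywords:
            # Document frequency: number of candidate subtrees containing kw.
            df = sum(1 for b in bags if kw in b)
            if df == 0:
                continue
            score += bag[kw] * math.log(1 + total / df)
        scored.append((score, cand))

    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


if __name__ == "__main__":
    for score, book in rank_nodes(SAMPLE_XML, "book", "xml ranking"):
        print(round(score, 3), book.find("title").text)
```

Because tag names are indexed alongside text values, a keyword such as `author` can match structure as well as content, which loosely mirrors the "search via" node types mentioned above. Identifying the search-for node type itself is left out of this sketch.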
Personalized web search is an effective means of providing precise results to different users when they submit the same query. As the amount of web information grows rapidly, an efficient personalization approach is required that modifies the appearance of a website's content to satisfy a specific user's instructions or preferences. It is also essential to keep track of changes in the user's interests over time. An approach is developed that combines a concept-based user profiling strategy with clickthrough data and keyword-based search. Concepts are split into content concepts and location concepts, which are maintained separately to monitor the gradual transition in a user's interests over time. The user's interests are captured from the clickthrough information. Based on the links clicked and the concepts returned, users' information access behavior is analyzed and re-ranking is performed to obtain relevant results.
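The profiling and re-ranking steps described above can be sketched as follows. This is an illustrative assumption, not the method of the paper: content and location concepts are kept in separate weight tables, every click decays older weights so that recent interests dominate, and results are re-ranked by the summed weights of their concepts. The class `ConceptProfile`, the function `rerank`, the `decay` parameter, and the result dictionary layout are all hypothetical.

```python
from collections import defaultdict


class ConceptProfile:
    """Per-user weights for content and location concepts, kept separately."""

    def __init__(self, decay=0.9):
        # decay < 1 ages old interests; the value 0.9 is an arbitrary assumption.
        self.decay = decay
        self.content = defaultdict(float)
        self.location = defaultdict(float)

    def record_click(self, content_concepts, location_concepts):
        """Update the profile from the concepts of one clicked result."""
        # Age existing weights first so that recent clicks count more.
        for weights in (self.content, self.location):
            for concept in weights:
                weights[concept] *= self.decay
        for concept in content_concepts:
            self.content[concept] += 1.0
        for concept in location_concepts:
            self.location[concept] += 1.0

    def score(self, result):
        """Sum the profile weights of the concepts attached to a result."""
        content_score = sum(self.content.get(c, 0.0) for c in result["content"])
        location_score = sum(self.location.get(c, 0.0) for c in result["location"])
        return content_score + location_score


def rerank(results, profile):
    """Return results ordered by descending profile score (stable for ties)."""
    return sorted(results, key=profile.score, reverse=True)


if __name__ == "__main__":
    results = [
        {"url": "a", "content": {"pizza"}, "location": {"chicago"}},
        {"url": "b", "content": {"seafood"}, "location": {"boston"}},
    ]
    profile = ConceptProfile()
    profile.record_click({"seafood"}, {"boston"})
    print([r["url"] for r in rerank(results, profile)])
```

Keeping the two tables separate makes it possible to weight content and location interests differently later on; the sketch simply adds them.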