Joanne S. Claussen scite author profile

Joanne S. Claussen

3 Publications

2 Citation Statements Received

21 Citation Statements Given

Publications

Effective collection metasearch in a hierarchical environment

Conrad, Yang, Claussen

2002

We compare standard global IR searching with user-centric localized techniques to address the database selection problem. We conduct a series of experiments to compare the retrieval effectiveness of three separate search modes applied to a hierarchically structured data environment of textual database representations. The data environment is represented as a tree-like directory containing over 15,000 unique databases and over 100,000 total leaf nodes. Our search modes consist of varying degrees of browse and search, from a global search at the root node to a refined search at a sub-node using dynamically calculated inverse document frequencies (idfs) to score candidate databases for probable relevance. Our findings indicate that a browse and search approach that relies upon localized searching from sub-nodes is capable of producing the most effective results.

INTRODUCTION

The continued growth of online databases has made the work of finding the most relevant collections increasingly difficult. Until recently, executing a 'search' in a database directory and 'drilling down' into its hierarchical structure have largely been regarded as separate activities. If either approach does not provide desired results, large numbers of users exit online systems with unmet information needs. Yahoo! and the Open Directory Project are exceptions that permit integrated browse and search. Research has begun to explore categorization and retrieval in such environments [4]. We hypothesized that if users could first browse to a potentially relevant sub-node in a large directory, results from a search in the sub-directory would be more precise than results from a search in the entire directory. To test the effectiveness of browse plus search functionality, we designed and conducted a series of experiments on three search modes. Using the same set of real user queries, these search modes included: (1) a global search of the directory from the root node, (2) a localized search of the relevant sub-directories using global idfs, and (3) a localized search of the relevant sub-directories using the appropriate dynamically calculated local idfs.
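As a rough illustration of how the three search modes in the abstract differ, the following is a minimal sketch only, not the authors' implementation. The toy directory, the database names, the `log(N / df)` form of idf, and the plain tf × idf sum are all assumptions made for the example; the abstract does not specify the scoring formula.

```python
import math
from collections import Counter

# Hypothetical toy directory: sub-node name -> {database name -> profile text}.
DIRECTORY = {
    "law": {
        "tax_cases": "tax court cases tax appeal",
        "patent_cases": "patent court cases patent claims",
        "tax_statutes": "tax statutes code",
    },
    "news": {
        "business_news": "business news tax market",
        "sports_news": "sports news scores",
    },
}


def databases(subnode=None):
    """Return {name: term Counter} for one sub-node, or for the whole directory."""
    nodes = [subnode] if subnode else list(DIRECTORY)
    return {
        name: Counter(text.split())
        for node in nodes
        for name, text in DIRECTORY[node].items()
    }


def idf_table(dbs):
    """Assumed idf form: idf(t) = log(N / df(t)) over the given databases."""
    n = len(dbs)
    df = Counter(term for counts in dbs.values() for term in counts)
    return {term: math.log(n / d) for term, d in df.items()}


def rank(query, candidates, idf):
    """Score each candidate database by the sum of tf * idf over query terms."""
    scores = {
        name: sum(counts[t] * idf.get(t, 0.0) for t in query.split())
        for name, counts in candidates.items()
    }
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def search(query, mode, subnode=None):
    """Run one of the three modes described in the abstract."""
    everything = databases()
    if mode == "global":  # (1) search from the root node
        return rank(query, everything, idf_table(everything))
    local = databases(subnode)
    if mode == "local_global_idf":  # (2) sub-node candidates, global idfs
        return rank(query, local, idf_table(everything))
    if mode == "local_local_idf":  # (3) sub-node candidates, local idfs
        return rank(query, local, idf_table(local))
    raise ValueError(f"unknown mode: {mode}")
```

For example, `search("tax cases", "local_local_idf", subnode="law")` ranks only the three databases under the hypothetical `law` sub-node, with idfs recomputed from those three alone, so a term that is rare across the whole directory but common inside the sub-node carries less weight there.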

Global vs. localized search: A comparison of database selection methods in a hierarchical environment

Conrad, Claussen, Yang

2002

Proc of Assoc for Info

In this work, we compare standard global IR searching with more localized techniques to address the database selection problem. We conduct a series of experiments to compare the retrieval effectiveness of three separate search modes using a hierarchically structured data environment of textual database representations. The data environment is represented as a tree-like structure containing over 15,000 unique databases and approximately 100,000 total leaf nodes. The search modes consist of varying degrees of browse and search, from a global search at the root node to a refined search at a sub-node using dynamically calculated inverse document frequencies (idfs) to score the candidate databases for probable relevance. Our findings indicate that a browse plus search approach that relies upon localized searching from sub-nodes in this environment produces the most effective results.

Introduction

The continued growth of online databases has made the work of finding the most relevant databases increasingly challenging. Until recently, searching a metadata repository and 'drilling down' into its hierarchical structure, e.g., as in a data directory, have largely remained separate activities. That is, browse and search tasks in the same repository have often been presented as mutually exclusive. As a result, large numbers of users exit online systems with unmet information needs when they fail to find relevant sources of interest. This was the case with the Westlaw (Database) Directory. We hypothesized that if users could first browse to a potentially relevant sub-directory in the large directory, results from a search in the sub-directory would be more precise than results from a search on the entire directory. To test the effectiveness of browse plus search functionality, we designed and conducted a series of experiments on three search modes, using the same set of real user queries. These search modes include (1) a global search of the directory from the root node, (2) a localized search of the relevant sub-directories using global idfs, and (3) a localized search of the relevant sub-directories using the appropriate local idfs.

In the next section we review related work. Section 3 briefly describes our operational environment, while section 4 discusses the underlying data. Section 5 describes the user queries harnessed for this investigation. Section 6 addresses the particular tf-idf scoring algorithm used. Our experiments are outlined in section 7 and our results are presented in section 8. In section 9 we draw our conclusions, and in section 10 we mention future applications of this browse and search technology.

Previous Work

An appreciable body of work has focused on searching distributed databases of textual documents for relevant information in response to user queries (Gravano, 1994; Callan, 1995; Yuwono, 1997; French, 1999). Yet such fully automated retrieval and the corpus of related research which followed have been performed independent of additional user involvement. For this reason, the IR commu...
