Matan Mandelbrod scite author profile

Most of the work on XML query and search has stemmed from the publishing and database communities, mostly for the needs of business applications. Recently, the Information Retrieval community began investigating the XML search issue to answer information discovery needs. Following this trend, we present here an approach where information needs can be expressed in an approximate manner as pieces of XML documents or "XML fragments" of the same nature as the documents that are being searched. We present an extension of the vector space model for searching XML collections via XML fragments and ranking results by relevance. We describe how we have extended a full-text search engine to comply with this model. The value of the proposed method is demonstrated by the relatively high precision of our system, which was among the top performers in the recent INEX workshop. Our results indicate that certain queries are more appropriate than others for the extended vector space model. Specifically, queries with relatively specific contexts but vague information needs are best situated to reap the benefit of this model. Finally, our results show that one method may not fit all types of queries and that it could be worthwhile to use different solutions for different applications.
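
The abstract above does not spell out how the vector space model is extended, so the following is only a minimal sketch under a stated assumption: each indexing unit is a (term, context) pair, where the context is the XML path the term occurs under, and documents are ranked by cosine similarity against a query fragment represented the same way. The function names, the sample paths, and the exact-match treatment of contexts are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def to_vector(pairs):
    """Count (term, context) pairs; each distinct pair is one dimension.

    Assumption: contexts must match exactly. A real system could score
    partially matching contexts instead.
    """
    return Counter(pairs)

def cosine(query_vec, doc_vec):
    """Cosine similarity between two sparse (term, context) vectors."""
    dot = sum(w * doc_vec.get(key, 0) for key, w in query_vec.items())
    q_norm = math.sqrt(sum(w * w for w in query_vec.values()))
    d_norm = math.sqrt(sum(w * w for w in doc_vec.values()))
    if q_norm == 0 or d_norm == 0:
        return 0.0
    return dot / (q_norm * d_norm)

# Hypothetical toy collection: terms paired with the path they occur under.
docs = {
    "d1": to_vector([
        ("xml", "/article/title"),
        ("retrieval", "/article/title"),
        ("ranking", "/article/body"),
    ]),
    "d2": to_vector([
        ("xml", "/article/body"),
        ("database", "/article/title"),
    ]),
}

# Query fragment <article><title>xml</title></article>, flattened to pairs.
query = to_vector([("xml", "/article/title")])

# Rank document ids by similarity to the query fragment, best first.
ranked = sorted(docs, key=lambda d: cosine(query, docs[d]), reverse=True)
```

In this toy setup only `d1` contains the term "xml" under the title path, so it ranks ahead of `d2`, which mentions "xml" only in the body.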


Component Ranking and Automatic Query Refinement for XML Retrieval

Mass, Mandelbrod 2005


Queries over XML documents challenge search engines to return the most relevant XML components that satisfy the query concepts. In a previous work [6] we described an algorithm to retrieve the most relevant XML components that performed relatively well in INEX'03. In this paper we show an improvement to that algorithm by introducing a document pivot that compensates for missing term statistics in small components. Using this new algorithm we achieved improvements of 30%-50% in the Mean Average Precision over the previous algorithm. We then describe a general mechanism to apply existing Automatic Query Refinement (AQR) methods on top of our XML retrieval algorithm and demonstrate one such method that achieved top results in INEX'04.
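
The abstract does not give the formula behind the document pivot. One plausible reading, sketched here purely as an assumption, is a linear interpolation between a component's own score and the score of the document that contains it, so that small components with sparse term statistics borrow evidence from their parent document. The function name `pivot_score`, the parameter `doc_pivot`, and the sample scores are hypothetical.

```python
def pivot_score(component_score, document_score, doc_pivot=0.5):
    """Blend a component's score with its containing document's score.

    Assumption: a simple linear interpolation. doc_pivot = 0 uses only the
    component score; doc_pivot = 1 uses only the document score.
    """
    if not 0.0 <= doc_pivot <= 1.0:
        raise ValueError("doc_pivot must be between 0 and 1")
    return doc_pivot * document_score + (1.0 - doc_pivot) * component_score

# Hypothetical scores: (component id, component score, parent document score).
candidates = [
    ("doc1/sec[2]", 0.30, 0.90),
    ("doc2/sec[1]", 0.40, 0.20),
]

# Re-rank the components by their blended score, best first.
reranked = sorted(
    candidates,
    key=lambda c: pivot_score(c[1], c[2], doc_pivot=0.5),
    reverse=True,
)
```

With these made-up numbers the component from `doc1` overtakes the one from `doc2` after blending, even though its standalone score is lower, because its parent document scores much higher.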


Searching XML documents via XML fragments

Carmel, Maarek, Mandelbrod et al. 2003


Using the INEX Environment as a Test Bed for Various User Models for XML Retrieval

Mass, Mandelbrod


Relevance Feedback for XML Retrieval

Mass, Mandelbrod 2005

