Searching XML documents via XML fragments

Carmel, David; Maarek, Yoelle; Mandelbrod, Matan; Mass, Yosi; Soffer, Aya

doi:10.1145/860435.860464

Cited by 140 publications

(82 citation statements)

References 9 publications


“…The ancestor context similarity ancSim between two nodes (n_i, n_j) is based on the resemblance measure between their paths (p_i, p_j). This is done by calculating three scores established in [30]. These scores are combined and weighted by the linguistic similarity between (n_i, n_j) to compute the ancestor context similarity:…”

Section: Structural Similarity Computation (mentioning)

confidence: 99%

Schema matching for integrating multimedia metadata

Amir, Bilasco, Danışman et al. 2010

2010 International Conference on Machine and Web Intelligence


“…Let us advert that in VSM model, two documents are presented in a space whose dimensions correspond each to a distinct indexing unit [2]. Indexing units are words that are in their root forms.…”

Section: Text Document Similarity (mentioning)

confidence: 99%

Towards structural Web Services matching based on Kernel methods

Nan et al. 2007

Front. Comput. Sc. China


This paper describes a kernel methods based Web Services matching mechanism for Web Services discovery and integration. The matching mechanism tries to exploit the latent semantics by the structure of Web Services. In this paper, Web Services are schemed by WSDL (Web Services Description Language) as tree-structured XML documents, and their matching degree is calculated by our novel algorithm designed for loosely tree matching against the traditional methods. In order to achieve the task, we bring forward the concept of path subsequence to model WSDL documents in the vector space. Then, an advanced n-spectrum kernel function is defined, so that the similarity of two WSDL documents can be drawn by implementing the kernel function in the space. Using textual similarity and n-spectrum kernel values as features of low-level and mid-level, we build up a model to estimate the functional similarity between Web Services, whose parameters are learned by a ranking-SVM. Finally, a set of experiments were designed to verify the model, and the results showed that several metrics for the retrieval of Web Services have been improved by our approach.
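The abstract above mentions modelling tree-structured documents as path sequences and comparing them with an n-spectrum kernel. As a rough illustration only, the sketch below implements the plain (textbook) n-spectrum kernel over element-label paths: it counts the contiguous label n-grams two documents share. It is not the "advanced" kernel the paper defines, and the function names, the path representation, and the sample WSDL-like labels are all assumptions made for this example.

```python
from collections import Counter
from math import sqrt


def ngrams(seq, n):
    """Return all contiguous n-grams of a label sequence as tuples."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]


def spectrum_features(paths, n):
    """Count every label n-gram occurring in a document's root-to-node paths.

    `paths` is a hypothetical representation: a list of lists of element names.
    """
    counts = Counter()
    for path in paths:
        counts.update(ngrams(path, n))
    return counts


def spectrum_kernel(paths_a, paths_b, n=2):
    """Plain n-spectrum kernel: dot product of the two n-gram count vectors."""
    fa = spectrum_features(paths_a, n)
    fb = spectrum_features(paths_b, n)
    return sum(fa[g] * fb[g] for g in fa.keys() & fb.keys())


def normalized_kernel(paths_a, paths_b, n=2):
    """Cosine-style normalization so that identical documents score 1.0."""
    k_aa = spectrum_kernel(paths_a, paths_a, n)
    k_bb = spectrum_kernel(paths_b, paths_b, n)
    if k_aa == 0 or k_bb == 0:
        return 0.0  # a document with no n-grams of this length
    return spectrum_kernel(paths_a, paths_b, n) / sqrt(k_aa * k_bb)


# Illustrative, made-up paths loosely shaped like WSDL element nesting.
doc_a = [
    ["definitions", "portType", "operation", "input"],
    ["definitions", "portType", "operation", "output"],
]
doc_b = [["definitions", "portType", "operation", "input"]]

print(spectrum_kernel(doc_a, doc_b, n=2))
print(normalized_kernel(doc_a, doc_b, n=2))
```

Because the kernel is a dot product of n-gram counts, two documents score higher the more nested label sequences they have in common, even when their trees do not match exactly.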


“…At INEX 2002), a broad spectrum of techniques was used to exploit non-content aspects of XML documents in addressing the XML element retrieval task. For instance, the JuruXML system by Mass et al (2003) and Carmel et al (2003) extends the traditional vector space model by allowing XML collections to be searched through so-called "XML fragments" which combine content and structure features. Similarly, Gövert et al (2003) exploit content and structure features to identify relevant elements and to redistribute relevancy from elements to their enclosing elements.…”

Section: XML Retrieval (mentioning)

confidence: 99%

The Importance of Length Normalization for XML Retrieval

2005


XML retrieval is a departure from standard document retrieval in which each individual XML element, ranging from italicized words or phrases to full blown articles, is a retrievable unit. The distribution of XML element lengths is unlike what we usually observe in standard document collections, prompting us to revisit the issue of document length normalization. We perform a comparative analysis of arbitrary elements versus relevant elements, and show the importance of element length as a parameter for XML retrieval. Within the language modeling framework, we investigate a range of techniques that deal with length either directly or indirectly. We observe a length-bias introduced by the amount of smoothing, and show the importance of extreme length bias for XML retrieval. We also show that simply removing shorter elements from the index (by introducing a cut-off value) does not create an appropriate element length normalization. Even after restricting the minimal size of XML elements occurring in the index, the importance of an extreme explicit length bias remains.

