Mariano P. Consens scite author profile

The W3C XQuery language recommendation, based on a hierarchical and ordered document model, supports a wide variety of constructs and use cases. There is a diversity of approaches and strategies for evaluating XQuery expressions, in many cases only dealing with limited subsets of the language. In this paper we describe an implementation approach that handles XQuery with arbitrarily nested FLWR expressions, element constructors and built-in functions (including structural comparisons). Our proposal maps an XQuery expression to a single equivalent SQL query using a novel dynamic interval encoding of a collection of XML documents as relations, augmented with information tied to the query evaluation environment. The dynamic interval technique enables (suitably enhanced) relational engines to produce predictably good query plans that do not restrict the use of sort-merge join query operators. The benefits are realized despite the challenges presented by intermediate results that create arbitrary documents and the need to preserve document order as prescribed by the semantics of XQuery. Finally, our experimental results demonstrate that (native or relational) XML systems can benefit from the above technique to avoid a quadratic scale-up penalty that effectively prevents the evaluation of nested FLWR expressions for large documents.
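The abstract does not spell out the dynamic interval encoding, so the following is only a minimal sketch of the general idea it builds on: a plain, static interval numbering of XML elements stored as relational-style rows. The function names `interval_encode` and `descendants` are hypothetical and are not taken from the paper. Each element receives a (start, end) pair from a depth-first walk, so that ancestry becomes interval containment and document order becomes the order of start positions.

```python
import xml.etree.ElementTree as ET


def interval_encode(xml_text):
    """Return rows (tag, start, end), one per element, in document order.

    Assumption: a static begin/end numbering, not the paper's dynamic scheme.
    """
    root = ET.fromstring(xml_text)
    rows = []
    counter = 0

    def visit(node):
        nonlocal counter
        start = counter          # position where the element opens
        counter += 1
        for child in node:
            visit(child)
        end = counter            # position where the element closes
        counter += 1
        rows.append((node.tag, start, end))

    visit(root)
    rows.sort(key=lambda row: row[1])  # document order = order of start positions
    return rows


def descendants(rows, ancestor_tag, descendant_tag):
    """Pair each ancestor row with the rows strictly nested inside its interval."""
    return [
        (a, d)
        for a in rows if a[0] == ancestor_tag
        for d in rows if d[0] == descendant_tag
        and a[1] < d[1] and d[2] < a[2]
    ]


rows = interval_encode("<a><b><c/></b><c/></a>")
print(rows)
print(descendants(rows, "b", "c"))
```

The containment test in `descendants` is a pair of inequalities on the interval columns, which is the kind of predicate a relational engine can evaluate with a join over sorted inputs.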


ExpLOD: Summary-Based Exploration of Interlinking and RDF Usage in the Linked Open Data Cloud

Khatchadourian

Consens

2010


Publishing interlinked RDF datasets as links between data items identified using dereferenceable URIs on the web brings forward a number of issues. A key challenge is to understand the data, the schema, and the interlinks that are actually used both within and across linked datasets. Understanding actual RDF usage is critical in the increasingly common situations where terms from different vocabularies are mixed. In this paper we describe a tool, ExpLOD, that supports exploring summaries of RDF usage and interlinking among datasets from the Linked Open Data cloud. ExpLOD's summaries are based on a novel mechanism that combines text labels and bisimulation contractions. The labels assigned to RDF graphs are hierarchical, enabling summarization at different granularities. The bisimulation contractions are applied to subgraphs defined via queries, providing for summarization of arbitrarily large or small graph neighbourhoods. Also, ExpLOD can generate SPARQL queries from a summary. Experimental results, using several collections from the Linked Open Data cloud, compare the two summary creation approaches implemented by ExpLOD (graph-based vs. SPARQL-based).
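To make the notion of a bisimulation contraction concrete, here is a minimal sketch that groups the nodes of a small labeled graph into bisimulation blocks by repeated refinement. It is not ExpLOD's implementation: the function name `bisimulation_blocks`, the input shapes, and the choice of forward (outgoing-edge) bisimulation are all assumptions made for illustration.

```python
def bisimulation_blocks(labels, edges):
    """Group nodes so that nodes in one block share a label and, per predicate,
    point to the same set of blocks.

    labels: dict mapping node -> label
    edges:  list of (source, predicate, target) triples
    Assumption: forward bisimulation only (outgoing edges).
    """
    block = dict(labels)  # initial partition: one block per label
    while True:
        # A node's signature is its label plus the (predicate, target block) pairs.
        signature = {
            n: (labels[n],
                frozenset((p, block[t]) for s, p, t in edges if s == n))
            for n in labels
        }
        ids = {}
        new_block = {n: ids.setdefault(signature[n], len(ids)) for n in labels}
        # Each round can only split blocks, so an unchanged count means stability.
        if len(set(new_block.values())) == len(set(block.values())):
            return new_block
        block = new_block


labels = {"a1": "Person", "a2": "Person",
          "p1": "Paper", "p2": "Paper", "v": "Venue"}
edges = [("a1", "wrote", "p1"), ("a2", "wrote", "p2"),
         ("p1", "publishedIn", "v")]
blocks = bisimulation_blocks(labels, edges)
print(blocks["p1"] == blocks["p2"])  # the two papers differ in their outgoing edges
```

Contracting each block to a single summary node then yields a graph that is typically much smaller than the input while preserving which labels are linked by which predicates.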


Optimizing queries on files

Consens

Milo

1994


We present a framework that allows the user to access and manipulate data uniformly, regardless of whether it resides in a database or in the file system (or in both). A key issue is the performance of the system. We show that text indexing, combined with newly developed optimization techniques, can be used to provide an efficient high-level interface to information stored in files. Furthermore, using these techniques, some queries can be evaluated significantly faster than in standard database implementations. We also study the tradeoff between efficiency and the amount of indexing.


Expressing structural hypertext queries in graphlog

Consens

Mendelzon

1989




Text / relational database management systems: Harmonizing SQL and SGML

Blake

Consens

Kilpeläinen

et al. 1994


Algebras for querying text regions (extended abstract)

Consens

Milo

1995


Low complexity aggregation in graphlog and Datalog

Consens

Mendelzon

1990


Algebras for Querying Text Regions: Expressive Power and Optimization

Consens

Milo

1998

Journal of Computer and System Sciences


There is a significant amount of interest in combining and extending database and information retrieval technologies to manage textual data. The challenge is becoming more relevant due to increased availability of documents in digital form. Document data has a natural hierarchical structure, which may be made explicit due to the use of markup conventions (as with SGML). An important aspect of managing structured and semistructured textual data consists of supporting the efficient retrieval of text components based both on their content and on their structure. In this paper we study issues related to the expressive power and optimization of a class of algebras that support combining string (or pattern) searches with queries on the hierarchical structure of the text. The region algebra studied is a set-at-a-time algebra for manipulating text regions (substrings of the text) that supports finding out nesting and ordering properties of the text regions. This algebra is part of the language in use in commercial text retrieval systems and can form the basis for supporting SQL-like access to textual data. By presenting a close relationship between the region algebra and the monadic first-order theory of finite binary trees, we show that queries in the algebra can be optimized, in the sense that equivalence to less expensive expressions can be tested. This optimization can be difficult (co-NP-hard in the general case), but there is an important class of queries that can be optimized in polynomial time. On the negative side, we show that the language is incapable of capturing some important properties of the text structure, related to the nesting and ordering of text regions. We conclude by suggesting possible extensions to increase the expressive power of the language and consider one such example. Academic Press
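As a rough illustration of what a set-at-a-time region algebra looks like, the sketch below represents text regions as (start, end) offsets and combines a pattern search with a nesting operator. The operator names `regions` and `including`, and the tiny marked-up string, are assumptions chosen for this example; they are not the operator set defined in the paper.

```python
import re


def regions(text, pattern):
    """Pattern search: the set of (start, end) regions matching a regex."""
    return {m.span() for m in re.finditer(pattern, text)}


def including(outer, inner):
    """Regions of `outer` that properly include at least one region of `inner`."""
    return {
        r for r in outer
        if any(r[0] <= s[0] and s[1] <= r[1] and r != s for s in inner)
    }


text = "<s><t>XML</t>data</s><s><t>SQL</t>XML</s>"
sections = regions(text, r"<s>.*?</s>")
titles = regions(text, r"<t>.*?</t>")
hits = regions(text, r"XML")

# Sections whose title (not merely their body) contains the word "XML".
result = including(sections, including(titles, hits))
for start, end in sorted(result):
    print(text[start:end])
```

Every operator takes and returns whole sets of regions, so a query is an expression tree over sets, which is what makes rewriting an expression into a cheaper equivalent one a meaningful optimization question.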

