Patterns and Types for Querying XML Documents

Castagna, Giuseppe

doi:10.1007/11601524_1

Cited by 7 publications

(4 citation statements)

References 40 publications

(33 reference statements)

“…In particular it seems worthwhile to consider more theoretically sound formalisms for tree queries such as, for instance, MSO formulas or tree automata. The latter in particular would allow us to reuse our pruning algorithm for pattern-matching based languages (such as the CDuce language [1] and its pattern-based query language CQL [28,6,14]). It is also known that tree automata (as well as MSO formulas) have better closure properties than XPath expressions and support fine-grained set-theoretic operations (intersection, union, complement) that have been used with success to devise very precise type systems for XML [22].…”

Section: Discussion (mentioning)

confidence: 99%

Optimizing XML querying using type-based document projection

Benzaken

Castagna

Colazzo

et al. 2013

ACM Trans. Database Syst.

Self Cite

… we denote by f@i the unique subtree t of f such that t = s_i or t = l_i[f]. The set of identifiers of a forest f is then defined as Ids(f) = {i | ∃t. f@i = t}.

Henceforth we will consider only well-formed forests and confound the notion of a node with that of its identifier.

Definition 2.4 (Root id). Let t be a tree. If t = s_i or t = l_i[f], we define RootId(t) = i.

Types and Validation

In this work, we present our approach for an abstract model of types, namely regular tree grammars. It is well known that regular tree grammars encompass most of the features of well-established schema specifications such as DTDs, XML Schemas, RelaxNG definitions, and XDuce's and CDuce's regular expression types. This is for instance documented in Murata et al. [2005], from where we borrow the definition of regular tree grammar.

Definition 2.5 (Regular tree grammar). A regular tree grammar is a pair (S, E) where S is a set of distinguished names (actually, nonterminal metavariables) and E is a set of production rules of the form {X_1 → R_1, ..., X_n → R_n} such that:

(1) each R_i is either the terminal String, denoting string content, or the terminal Any, denoting any tree, or l[r] where l ranges over valid element names and r is a regular expression on the nonterminal symbols X_1, ..., X_n, that is: […] (henceforth, we use r+ for r r* and r? for ε|r);
(2) S ⊆ {X_1, ..., X_n} is the set of start symbols;
(3) for any two production rules with the same left-hand side […]

The intuition is that a regular tree grammar describes (i.e., it "types") a set of trees of the data model. Notice that the left-hand sides of the rules in E do not need to be pairwise distinct. Allowing two rules to have the same left-hand side lets us freely take the union of two sets of rules and also simplifies some definitions. Furthermore, given a regular tree grammar, it is always possible to rewrite it equivalently so that condition 3 holds: if there are two rules X_i → l[r] and X_i → l[r'], then they can be merged into a single rule, X_i → l[r|r'].

Definition 2.6 (Names of a regular expression). Given a regular expression r, we denote by Names(r) the set of nonterminals occurring in it, namely:

    Names(ε)         = ∅
    Names(r_1 r_2)   = Names(r_1) ∪ Names(r_2)
    Names(r_1 | r_2) = Names(r_1) ∪ Names(r_2)
    Names(r*)        = Names(r)
    Names(X)         = {X}

By extension, given a set of rules E = {X_0 → R_0, ..., X_n → R_n}, we define Names(E) = ⋃_{i ∈ {0,...,n}} Names(R_i).

Definition 2.7 (Defined name). Given a rule X → R, we call X the defined name of the rule and denote it by Dn(X → R). By extension, given a set E = {X_0 → R_0, ..., X_n → R_n}, we define Dn(E) = {X_0, ..., X_n}. Note that in general, Names(E) ⊆ Dn(E).

We also say that r is a regular expression over (S, E) if r is a regular expression over names in Dn(E). We will denote by L(r) the language recognized by the regular expression r. We will use W, X, Y, Z to range over names. We use Greek letters to range over sets of rules. As (S, E) represents a regular tree grammar we ...
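The definitions of Names, Dn, and the rule-merging step can be made concrete with a small Python sketch. This is only an illustration under stated assumptions, not code from the cited paper: the class names (Eps, Name, Seq, Alt, Star, Elem), the function names, and the representation of a rule as a (name, right-hand side) pair are all hypothetical, and the sketch assumes that Names(l[r]) = Names(r) and that the terminals String and Any contribute no names.

```python
from dataclasses import dataclass
from typing import Union

# Regular expressions over nonterminal names (hypothetical encoding).
@dataclass(frozen=True)
class Eps:
    pass

@dataclass(frozen=True)
class Name:
    name: str

@dataclass(frozen=True)
class Seq:
    left: "Regex"
    right: "Regex"

@dataclass(frozen=True)
class Alt:
    left: "Regex"
    right: "Regex"

@dataclass(frozen=True)
class Star:
    body: "Regex"

Regex = Union[Eps, Name, Seq, Alt, Star]

# Right-hand side l[r]; the terminals are encoded as plain strings.
@dataclass(frozen=True)
class Elem:
    label: str
    content: Regex

STRING = "String"
ANY = "Any"

def names(r):
    """Names(r) as in Definition 2.6."""
    if isinstance(r, Eps):
        return frozenset()
    if isinstance(r, Name):
        return frozenset({r.name})
    if isinstance(r, (Seq, Alt)):
        return names(r.left) | names(r.right)
    if isinstance(r, Star):
        return names(r.body)
    raise TypeError(f"not a regular expression: {r!r}")

def names_of_rules(rules):
    """Names(E): union over all rules (assumption: terminals add nothing)."""
    out = frozenset()
    for _, rhs in rules:
        if isinstance(rhs, Elem):
            out |= names(rhs.content)
    return out

def defined_names(rules):
    """Dn(E) as in Definition 2.7."""
    return frozenset(x for x, _ in rules)

def merge_rules(rules):
    """Merge X -> l[r] and X -> l[r'] into the single rule X -> l[r|r']."""
    out = []
    index = {}  # (name, label) -> position of the merged rule in `out`
    for x, rhs in rules:
        if isinstance(rhs, Elem):
            key = (x, rhs.label)
            if key in index:
                i = index[key]
                prev = out[i][1]
                out[i] = (x, Elem(rhs.label, Alt(prev.content, rhs.content)))
                continue
            index[key] = len(out)
        out.append((x, rhs))
    return out

# A made-up grammar with two rules sharing the left-hand side Item.
rules = [
    ("Doc", Elem("doc", Star(Name("Item")))),
    ("Item", Elem("item", Name("Text"))),
    ("Item", Elem("item", Eps())),
    ("Text", STRING),
]
merged = merge_rules(rules)
```

On this example grammar, names_of_rules(rules) is a subset of defined_names(rules), as the note after Definition 2.7 states, and merge_rules collapses the two Item rules into one whose content is the alternation of the two original contents.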

“…html X XAct: http://www.brics.dk/Xact/ Xduce: http://xduce.sourceforge.net XHaskell: http://taichi.ddns.comp.nus.edu.sg/taichiwiki/XhaskellHomePage Xtatic: http://www.cis.upenn.edu/~bcpierce/xtatic A gentle introduction to exact typechecking for both XML-to-XML transformations and XML Publishing can be found in [37]. A nontechnical presentation of Regular Expression Types and Patterns and their use in query languages can be found in the joint DBPL and XSym 2005 invited talk [4]. For a more complete presentation of Regular Expression Types and Patterns and the associated type-checking and subtyping algorithms we recommend the reader to refer to the seminal JFP article by Hosoya, Pierce, and Vouillon [22].…”

Section: Url To Code (mentioning)

confidence: 99%

XML Typechecking

Benzaken

Castagna

Hosoya

et al. 2016

Encyclopedia of Database Systems

Self Cite

“…Frisch et al. (2008) extended semantic subtyping to function types and propositional types, with typetest, resulting in the language CDuce (Benzaken et al., 2003). (An excellent overview of the use of semantic subtyping in the context of querying XML documents was given by Castagna, 2005.) In the end, the XQuery working group resorted to a more conventional pure named type system (Siméon & Wadler, 2003) with a simpler notion of subtyping based on ordinary regular expression inclusion (as opposed to XDuce's use of tree regular expressions).…”

Section: Related Work (mentioning)

confidence: 99%

Semantic subtyping with an SMT solver

Bierman

Gordon

Hriţcu

et al. 2012

J. Funct. Prog.

We study a first-order functional language with the novel combination of the ideas of refinement type (the subset of a type to satisfy a Boolean expression) and type-test (a Boolean expression testing whether a value belongs to a type). Our core calculus can express a rich variety of typing idioms; for example, intersection, union, negation, singleton, nullable, variant, and algebraic types are all derivable. We formulate a semantics in which expressions denote terms, and types are interpreted as first-order logic formulas. Subtyping is defined as valid implication between the semantics of types. The formulas are interpreted in a specific model that we axiomatize using standard first-order theories. On this basis, we present a novel type-checking algorithm able to eliminate many dynamic tests and to detect many errors statically. The key idea is to rely on a Satisfiability Modulo Theories solver to compute subtyping efficiently. Moreover, using a satisfiability modulo theories solver allows us to show the uniqueness of normal forms for non-deterministic expressions, provide precise counterexamples when type-checking fails, detect empty types, and compute instances of types statically and at run-time.
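The idea that types denote predicates and that subtyping is implication between them can be illustrated with a short Python sketch. It is not the algorithm of the abstract: instead of a Satisfiability Modulo Theories solver, it checks the implication by brute force over a small finite sample of values, which can find counterexamples but cannot prove subtyping in general. All names here (refine, union, counterexamples, is_subtype, DOMAIN) are hypothetical.

```python
from typing import Callable, Iterable, List

# A type is modelled as a predicate on values (an assumption of this sketch).
Type = Callable[[object], bool]

def is_int(v: object) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(v, int) and not isinstance(v, bool)

def refine(base: Type, pred: Callable[[object], bool]) -> Type:
    """Refinement type: values of `base` that also satisfy `pred`."""
    return lambda v: base(v) and pred(v)

def union(a: Type, b: Type) -> Type:
    return lambda v: a(v) or b(v)

def intersection(a: Type, b: Type) -> Type:
    return lambda v: a(v) and b(v)

def negation(a: Type) -> Type:
    return lambda v: not a(v)

def counterexamples(a: Type, b: Type, domain: Iterable[object]) -> List[object]:
    """Sampled values that belong to `a` but not to `b`."""
    return [v for v in domain if a(v) and not b(v)]

def is_subtype(a: Type, b: Type, domain: Iterable[object]) -> bool:
    """a <: b on the sample: membership in a implies membership in b."""
    return not counterexamples(a, b, domain)

# Finite sample standing in for the solver's model (hypothetical choice).
DOMAIN = list(range(-5, 6)) + ["", "xml", None]

positive = refine(is_int, lambda v: v > 0)
non_negative = refine(is_int, lambda v: v >= 0)
nullable_int = union(is_int, lambda v: v is None)
empty = intersection(is_int, negation(is_int))
```

With these definitions, is_subtype(positive, non_negative, DOMAIN) holds on the sample, while counterexamples(non_negative, positive, DOMAIN) reports the value 0 as a witness that the converse fails. A type with no inhabitants in the sample, such as empty, is trivially a subtype of every other type.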
