Nested Sibling Tree Automata

Françoise Gire; Jean-Marc Talbot

RAIRO - Theoretical Informatics and Applications (2009)

  • Volume: 43, Issue: 2, page 379-402
  • ISSN: 0988-3754

Abstract

top
In the XML standard, data are represented as unranked labeled ordered trees. Regular unranked tree automata provide a useful formalism for the validation of schemas enforcing regular structural constraints on XML documents. However some concrete application contexts need the expression of more general constraints than the regular ones. In this paper we propose a new framework in which context-free style structural constraints can be expressed and validated. This framework is characterized by: (i) the introduction of a new notion of trees, the so-called typed unranked labeled trees (tulab trees for short) in which each node receives one of three possible types (up, down or fix), and (ii) the definition of a new notion of tree automata, the so-called nested sibling tulab tree automata, able to enforce context-free style structural constraints on tulab tree languages. During their structural control process, such automata are using visibly pushdown languages of words [R. Alur and P. Madhusudan, Visibly pushdown languages, 36th ACM symposium on Theory of Computing, Chicago, USA (2004) 202–211] on their alphabet of states. We show that the resulting class NSTL of tulab tree languages recognized by nested sibling tulab tree automata is robust, i.e. closed under Boolean operations and with decision procedures for the classical membership, emptiness and inclusion problems. We then give three characterizations of NSTL: a logical characterization by defining an adequate logic in which NSTL happens to coincide with the models of monadic second order sentences; the two other characterizations are using adequate encodings and map together languages of NSTL with some regular sets of 3-ary trees or with particular sets of binary trees.

How to cite

top

Gire, Françoise, and Talbot, Jean-Marc. "Nested Sibling Tree Automata." RAIRO - Theoretical Informatics and Applications 43.2 (2009): 379-402. <http://eudml.org/doc/250576>.

@article{Gire2009,
abstract = { In the XML standard, data are represented as unranked labeled ordered trees. Regular unranked tree automata provide a useful formalism for the validation of schemas enforcing regular structural constraints on XML documents. However some concrete application contexts need the expression of more general constraints than the regular ones. In this paper we propose a new framework in which context-free style structural constraints can be expressed and validated. This framework is characterized by: (i) the introduction of a new notion of trees, the so-called typed unranked labeled trees (tulab trees for short) in which each node receives one of three possible types (up, down or fix), and (ii) the definition of a new notion of tree automata, the so-called nested sibling tulab tree automata, able to enforce context-free style structural constraints on tulab tree languages. During their structural control process, such automata are using visibly pushdown languages of words [R. Alur and P. Madhusudan, Visibly pushdown languages, 36th ACM symposium on Theory of Computing, Chicago, USA (2004) 202–211] on their alphabet of states. We show that the resulting class NSTL of tulab tree languages recognized by nested sibling tulab tree automata is robust, i.e. closed under Boolean operations and with decision procedures for the classical membership, emptiness and inclusion problems. We then give three characterizations of NSTL: a logical characterization by defining an adequate logic in which NSTL happens to coincide with the models of monadic second order sentences; the two other characterizations are using adequate encodings and map together languages of NSTL with some regular sets of 3-ary trees or with particular sets of binary trees. },
author = {Gire, Françoise, Talbot, Jean-Marc},
journal = {RAIRO - Theoretical Informatics and Applications},
keywords = {Automata; logic; unranked trees; XML schemas. ; automata; XML schemas},
language = {eng},
month = {3},
number = {2},
pages = {379-402},
publisher = {EDP Sciences},
title = {Nested Sibling Tree Automata},
url = {http://eudml.org/doc/250576},
volume = {43},
year = {2009},
}

TY - JOUR
AU - Gire, Françoise
AU - Talbot, Jean-Marc
TI - Nested Sibling Tree Automata
JO - RAIRO - Theoretical Informatics and Applications
DA - 2009/3//
PB - EDP Sciences
VL - 43
IS - 2
SP - 379
EP - 402
AB - In the XML standard, data are represented as unranked labeled ordered trees. Regular unranked tree automata provide a useful formalism for the validation of schemas enforcing regular structural constraints on XML documents. However some concrete application contexts need the expression of more general constraints than the regular ones. In this paper we propose a new framework in which context-free style structural constraints can be expressed and validated. This framework is characterized by: (i) the introduction of a new notion of trees, the so-called typed unranked labeled trees (tulab trees for short) in which each node receives one of three possible types (up, down or fix), and (ii) the definition of a new notion of tree automata, the so-called nested sibling tulab tree automata, able to enforce context-free style structural constraints on tulab tree languages. During their structural control process, such automata are using visibly pushdown languages of words [R. Alur and P. Madhusudan, Visibly pushdown languages, 36th ACM symposium on Theory of Computing, Chicago, USA (2004) 202–211] on their alphabet of states. We show that the resulting class NSTL of tulab tree languages recognized by nested sibling tulab tree automata is robust, i.e. closed under Boolean operations and with decision procedures for the classical membership, emptiness and inclusion problems. We then give three characterizations of NSTL: a logical characterization by defining an adequate logic in which NSTL happens to coincide with the models of monadic second order sentences; the two other characterizations are using adequate encodings and map together languages of NSTL with some regular sets of 3-ary trees or with particular sets of binary trees.
LA - eng
KW - Automata; logic; unranked trees; XML schemas. ; automata; XML schemas
UR - http://eudml.org/doc/250576
ER -

References

top
  1. S. Abiteboul and P. Buneman, Data on the Web: From Relations to Semi-structured Data and XML (1999).  
  2. R. Alur, S. Chaudhuri and P. Madhusudan, A fixpoint calculus for local and global program flows, Symposium on Principles of Programming Languages, POPL 2006 (2006) 153–165.  
  3. R. Alur, S. Chaudhuri and P. Madhusudan, Languages of nested trees, Computer-Aided Verification, CAV 2006 (2006) 329–342.  
  4. Rajeev Alur and P. Madhusudan, Visibly pushdown languages, Chicago, USA. 36th ACM symposium on Theory of Computing (2004) 202–211.  
  5. M. Bojanczyck and T. Colcombet, Tree-walking automata cannot be determinized. ICALP (2004) 246–256.  
  6. T. Bray, J.P. Paoh and C.M. Sperberg-McQueen, Extensible markup language (xml) 1.0, http://www.w3org/TR/1998/REC-xml-19980210/ (1998).  
  7. A. Bruggemann, M. Murata and D. Wood, Regular tree and regular hedge languages over unranked alphabets, Technical Report 2001–05. HKUST TCS Center (1998).  
  8. H. Comon and V. Cortier, Tree automata with one memory set constraints and cryptographic protocols. Theoret. Comput. Sci.331 (2005) 143–214.  
  9. H. Comon, M. Dauchet, R. Gilleron, C. Löding, F. Jacquemard, D. Lugiez, S. Tison and M. Tommasi, Tree automata techniques and applications, available on: (2007), release October, 12th (2007).  URIhttp://www.grappa.univ-lille3.fr/tata
  10. H. Comon-Lundh, F. Jacquemard and N. Perrin, Tree automata with memory, visibility and structural constraints, In Foundations of Software Science and Computational Structures, 10th International Conference, FOSSACS 2007. Lect. Notes Comput. Sci.4423 (2007) 168–182.  
  11. D. Lugiez and S. DalZilio, Xml schema, tree logic and sheaves automata, Technical Report RR-4631, Inria (2002).  
  12. F. Neven, Automata, logic and xml. CSL (2002) 2–26.  
  13. C. Pitcher, Visibly pushdown expression effects for xml stream processing, Programming Languages Technologies for XML. PLAN-X (2005) 5–19.  
  14. T. Schwentick, Tree, automata and xml. Paris, PODS (2004).  
  15. L. Segoufin, Typing and quering xml documents: some complexity bounds. San Diego CA, PODS (2003) 167–178.  
  16. H. Seidl, T. Schwentick and A. Muscholl, Numerical document queries. San Diego CA, PODS (2003).  
  17. J.W. Thatcher and J.B. Wright, Generalized finite automata with an application to a decision problem of second-order logic. Mathematical System Theory2 57–82 (1968).  

NotesEmbed ?

top

You must be logged in to post comments.

To embed these notes on your page include the following JavaScript code on your page where you want the notes to appear.

Only the controls for the widget will be shown in your chosen language. Notes will be shown in their authored language.

Tells the widget how many notes to show per page. You can cycle through additional notes using the next and previous controls.

    
                

Note: Best practice suggests putting the JavaScript code just before the closing </body> tag.