Tree compression pushdown automaton
Jan Janoušek; Bořivoj Melichar; Martin Poliak
Kybernetika (2012)
- Volume: 48, Issue: 3, page 429-452
- ISSN: 0023-5954
Access Full Article
topAbstract
topHow to cite
topJanoušek, Jan, Melichar, Bořivoj, and Poliak, Martin. "Tree compression pushdown automaton." Kybernetika 48.3 (2012): 429-452. <http://eudml.org/doc/247233>.
@article{Janoušek2012,
abstract = {A new kind of a deterministic pushdown automaton, called a Tree Compression Automaton, is presented. The tree compression automaton represents a complete compressed index of a set of trees for subtrees and accepts all subtrees of given trees. The algorithm for constructing our pushdown automaton is incremental. For a single tree with $n$ nodes, the automaton has at most $n+1$ states, its transition function cardinality is at most $4n$ and there are $2n+1$ pushdown store symbols. If hashing is used for storing automaton’s transitions, thus removing a factor of $log n$, the construction of the automaton takes linear time and space with respect to the length $n$ of the input tree(s). Our pushdown automaton construction can also be used for finding all subtree repeats without augmenting the overall complexity.},
author = {Janoušek, Jan, Melichar, Bořivoj, Poliak, Martin},
journal = {Kybernetika},
keywords = {trees; pushdown automata; compression; indexing trees; arbology; deterministic pushdown automata; Tree Compression Automaton; trees; indexing trees; arbology; compression},
language = {eng},
number = {3},
pages = {429-452},
publisher = {Institute of Information Theory and Automation AS CR},
title = {Tree compression pushdown automaton},
url = {http://eudml.org/doc/247233},
volume = {48},
year = {2012},
}
TY - JOUR
AU - Janoušek, Jan
AU - Melichar, Bořivoj
AU - Poliak, Martin
TI - Tree compression pushdown automaton
JO - Kybernetika
PY - 2012
PB - Institute of Information Theory and Automation AS CR
VL - 48
IS - 3
SP - 429
EP - 452
AB - A new kind of a deterministic pushdown automaton, called a Tree Compression Automaton, is presented. The tree compression automaton represents a complete compressed index of a set of trees for subtrees and accepts all subtrees of given trees. The algorithm for constructing our pushdown automaton is incremental. For a single tree with $n$ nodes, the automaton has at most $n+1$ states, its transition function cardinality is at most $4n$ and there are $2n+1$ pushdown store symbols. If hashing is used for storing automaton’s transitions, thus removing a factor of $log n$, the construction of the automaton takes linear time and space with respect to the length $n$ of the input tree(s). Our pushdown automaton construction can also be used for finding all subtree repeats without augmenting the overall complexity.
LA - eng
KW - trees; pushdown automata; compression; indexing trees; arbology; deterministic pushdown automata; Tree Compression Automaton; trees; indexing trees; arbology; compression
UR - http://eudml.org/doc/247233
ER -
References
top- Arbology www pages:, Available on http://www.arbology.org/, March 2012.
- Aho, A. V., Ullman, J. D., The Theory of Parsing, Translation, and Compiling, Prentice-Hall Englewood Cliffs, N.J. 1972. MR0408321
- Berman, H. M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T. N., Weissig, H., Shindyalov, I. N., Bourne, P. E., 10.1093/nar/28.1.235, Nucleic Acids Res. 28 (2000), 235–242. DOI10.1093/nar/28.1.235
- Bille, P., Pattern Matching in Trees and Strings, Ph.D. Thesis, FIT University of Copenhagen 2008.
- Busatto, G., Lohrey, M., Maneth, S., Grammar-based Tree Compression, Technical Report 2004.
- Cleophas, L., Tree Algorithms. Two Taxonomies and a Toolkit, Ph.D. Thesis, Technische Universiteit Eindhoven 2008.
- Cole, R., Hariharan, R., Indyk, P., Tree pattern matching and subset matching in deterministic -time, SODA (1999), 245–254.
- Comon, H., Dauchet, M., Gilleron, R., Löding, C., Jacquemard, F., Lugiez, D., Tison, S., Tommasi, M., Tree automata techniques and applications, Available on: http://www.grappa.univ-lille3.fr/tata (2007).
- Gecseg, F., Steinby, M., Tree languages, In: Handbook of Formal Languages (G. Rozenberg and A. Salomaa, eds.), Vol. 3 Beyond Words. Springer-Verlag, Berlin 1997, pp. 1–68. MR1470018
- Flouri, T., Janoušek, J., Melichar, B., Subtree and tree pattern pushdown automata for trees in prefix notation (2009), Comput. Sci. Inform. Systems 7 (2010), 2, 331–357.
- Hoffmann, Ch., O'Donnell, M., 10.1145/322290.322295, J. Assoc. Comput. Mach. 29 (1982), 1, 68–95. Zbl0477.68067MR0662611DOI10.1145/322290.322295
- Hopcroft, J. E., Motwani, R., Ullman, J. D., Introduction to Automata Theory, languages, and Computation. Second edition, Addison-Wesley, Boston 2001. MR0645539
- Wu, C. H., Apweiler, R., Bairoch, A., Natale, D. A., Barker, W. C., Boeckmann, B., Ferro, S., Gasteiger, E., Huang, H., Lopez et al, R., The Universal Protein Resource (UniProt) – An Expanding Universe of Protein Information, Database issue, Nucleic Acids Research, Oxford University Press, D187–D191 2006.
- Schmidt, A. R., Waas, F., Kersten, M. L., Carey, M. J., Manolescu, I., Busse, R., XMark – A benchmark for XML data management, In: Proc. International Conference on Very Large Data Bases (VLDB), Hong Kong 2002, pp. 974–985.
- Van Der Beek, L., Bouma, G., Malouf, R., Van Noord, G., Rijksuniversiteit Groningen, The Alpino dependency treebank, In: Computational Linguistics in the Netherlands (CLIN) 2002, pp. 1686–1691.
- Welch, T., 10.1109/MC.1984.1659158, Computer 17 (1984), 6, 8–19. DOI10.1109/MC.1984.1659158
- Ziv, J., Lempel, A., 10.1109/TIT.1977.1055714, IEEE Trans. Inform. Theory 23 (1977), 3, 337–343. Zbl0379.94010MR0530215DOI10.1109/TIT.1977.1055714
NotesEmbed ?
topTo embed these notes on your page include the following JavaScript code on your page where you want the notes to appear.