A perfect hashing incremental scheme for unranked trees using pseudo-minimal automata

Rafael C. Carrasco; Jan Daciuk

RAIRO - Theoretical Informatics and Applications (2009)

  • Volume: 43, Issue: 4, page 779-790
  • ISSN: 0988-3754

Abstract

We describe a technique that maps unranked trees to arbitrary hash codes using a bottom-up deterministic tree automaton (DTA). In contrast to other hashing techniques based on automata, our procedure builds a pseudo-minimal DTA for this purpose. A pseudo-minimal automaton may be larger than the minimal one accepting the same language but, in turn, it contains proper elements (states or transitions which are unique) for every input accepted by the automaton. Therefore, pseudo-minimal DTA are a suitable structure to implement stable hashing schemes, that is, schemes where the output for every key can be determined prior to the automaton construction. We provide incremental procedures to build the pseudo-minimal DTA and the mapping that associates an integer value to every transition that will be used to compute the hash codes. This incremental construction allows for the incorporation of new trees and their hash codes without the need to rebuild the whole DTA from scratch.
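
As a rough illustration of the idea of attaching an integer to every transition and deriving the hash code from the transitions used, the following Python sketch runs a small bottom-up automaton over unranked trees and sums the weights it encounters. It is not the construction from the paper: the transition table, the weights, the accepting states and the names `TRANSITIONS`, `ACCEPTING`, `run` and `hash_code` are all hypothetical, and the automaton is written by hand rather than built incrementally.

```python
from typing import List, Optional, Tuple

# An unranked tree: a label plus any number of children.
Tree = Tuple[str, List["Tree"]]

# Hypothetical hand-written automaton (an assumption, not taken from the paper).
# Each transition maps (label, states of the children) to (target state, weight).
TRANSITIONS = {
    ("a", ()): (0, 0),
    ("b", ()): (1, 0),
    ("f", (0,)): (2, 10),
    ("f", (1,)): (3, 20),
    ("f", (0, 1)): (4, 30),
}
ACCEPTING = {2, 3, 4}


def run(tree: Tree) -> Optional[Tuple[int, int]]:
    """Evaluate the tree bottom-up; return (state, summed weight) or None."""
    label, children = tree
    child_states = []
    total = 0
    for child in children:
        result = run(child)
        if result is None:
            return None  # some subtree has no matching transition
        state, weight = result
        child_states.append(state)
        total += weight
    step = TRANSITIONS.get((label, tuple(child_states)))
    if step is None:
        return None
    target, weight = step
    return target, total + weight


def hash_code(tree: Tree) -> Optional[int]:
    """Return the sum of transition weights if the tree is accepted, else None."""
    result = run(tree)
    if result is None or result[0] not in ACCEPTING:
        return None
    return result[1]
```

In this toy table every accepted tree ends in a transition that no other accepted tree uses, so the weight on that transition can be chosen freely. This loosely mirrors the role the abstract gives to proper elements, under the assumptions stated above.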

How to cite

Carrasco, Rafael C., and Daciuk, Jan. "A perfect hashing incremental scheme for unranked trees using pseudo-minimal automata." RAIRO - Theoretical Informatics and Applications 43.4 (2009): 779-790. <http://eudml.org/doc/250583>.

@article{Carrasco2009,
abstract = { We describe a technique that maps unranked trees to arbitrary hash codes using a bottom-up deterministic tree automaton (DTA). In contrast to other hashing techniques based on automata, our procedure builds a pseudo-minimal DTA for this purpose. A pseudo-minimal automaton may be larger than the minimal one accepting the same language but, in turn, it contains proper elements (states or transitions which are unique) for every input accepted by the automaton. Therefore, pseudo-minimal DTA are a suitable structure to implement stable hashing schemes, that is, schemes where the output for every key can be determined prior to the automaton construction. We provide incremental procedures to build the pseudo-minimal DTA and the mapping that associates an integer value to every transition that will be used to compute the hash codes. This incremental construction allows for the incorporation of new trees and their hash codes without the need to rebuild the whole DTA from scratch. },
author = {Carrasco, Rafael C. and Daciuk, Jan},
journal = {RAIRO - Theoretical Informatics and Applications},
keywords = {Perfect hashing; deterministic tree automata; pseudo-minimal automata; incremental automata},
language = {eng},
month = {9},
number = {4},
pages = {779-790},
publisher = {EDP Sciences},
title = {A perfect hashing incremental scheme for unranked trees using pseudo-minimal automata},
url = {http://eudml.org/doc/250583},
volume = {43},
year = {2009},
}

TY - JOUR
AU - Carrasco, Rafael C.
AU - Daciuk, Jan
TI - A perfect hashing incremental scheme for unranked trees using pseudo-minimal automata
JO - RAIRO - Theoretical Informatics and Applications
DA - 2009/9//
PB - EDP Sciences
VL - 43
IS - 4
SP - 779
EP - 790
AB - We describe a technique that maps unranked trees to arbitrary hash codes using a bottom-up deterministic tree automaton (DTA). In contrast to other hashing techniques based on automata, our procedure builds a pseudo-minimal DTA for this purpose. A pseudo-minimal automaton may be larger than the minimal one accepting the same language but, in turn, it contains proper elements (states or transitions which are unique) for every input accepted by the automaton. Therefore, pseudo-minimal DTA are a suitable structure to implement stable hashing schemes, that is, schemes where the output for every key can be determined prior to the automaton construction. We provide incremental procedures to build the pseudo-minimal DTA and the mapping that associates an integer value to every transition that will be used to compute the hash codes. This incremental construction allows for the incorporation of new trees and their hash codes without the need to rebuild the whole DTA from scratch.
LA - eng
KW - Perfect hashing; deterministic tree automata; pseudo-minimal automata; incremental automata
UR - http://eudml.org/doc/250583
ER -

