On the Stack-Size of General Tries

Jérémie Bourdon; Markus Nebel; Brigitte Vallée

RAIRO - Theoretical Informatics and Applications (2010)

  • Volume: 35, Issue: 2, page 163-185
  • ISSN: 0988-3754

Abstract

top
Digital trees or tries are a general purpose flexible data structure that implements dictionaries built on words. The present paper is focussed on the average-case analysis of an important parameter of this tree-structure, i.e., the stack-size. The stack-size of a tree is the memory needed by a storage-optimal preorder traversal. The analysis is carried out under a general model in which words are produced by a source (in the information-theoretic sense) that emits symbols. Under some natural assumptions that encompass all commonly used data models (and more), we obtain a precise average-case and probabilistic analysis of stack-size. Furthermore, we study the dependency between the stack-size and the ordering on symbols in the alphabet: we establish that, when the source emits independent symbols, the optimal ordering arises when the most probable symbol is the last one in this order.

How to cite

top

Bourdon, Jérémie, Nebel, Markus, and Vallée, Brigitte. "On the Stack-Size of General Tries." RAIRO - Theoretical Informatics and Applications 35.2 (2010): 163-185. <http://eudml.org/doc/222100>.

@article{Bourdon2010,
abstract = { Digital trees or tries are a general purpose flexible data structure that implements dictionaries built on words. The present paper is focussed on the average-case analysis of an important parameter of this tree-structure, i.e., the stack-size. The stack-size of a tree is the memory needed by a storage-optimal preorder traversal. The analysis is carried out under a general model in which words are produced by a source (in the information-theoretic sense) that emits symbols. Under some natural assumptions that encompass all commonly used data models (and more), we obtain a precise average-case and probabilistic analysis of stack-size. Furthermore, we study the dependency between the stack-size and the ordering on symbols in the alphabet: we establish that, when the source emits independent symbols, the optimal ordering arises when the most probable symbol is the last one in this order. },
author = {Bourdon, Jérémie, Nebel, Markus, Vallée, Brigitte},
journal = {RAIRO - Theoretical Informatics and Applications},
keywords = {Average-case analysis of data-structures; information theory; trie; Mellin analysis.; digital trees; stack-size},
language = {eng},
month = {3},
number = {2},
pages = {163-185},
publisher = {EDP Sciences},
title = {On the Stack-Size of General Tries},
url = {http://eudml.org/doc/222100},
volume = {35},
year = {2010},
}

TY - JOUR
AU - Bourdon, Jérémie
AU - Nebel, Markus
AU - Vallée, Brigitte
TI - On the Stack-Size of General Tries
JO - RAIRO - Theoretical Informatics and Applications
DA - 2010/3//
PB - EDP Sciences
VL - 35
IS - 2
SP - 163
EP - 185
AB - Digital trees or tries are a general purpose flexible data structure that implements dictionaries built on words. The present paper is focussed on the average-case analysis of an important parameter of this tree-structure, i.e., the stack-size. The stack-size of a tree is the memory needed by a storage-optimal preorder traversal. The analysis is carried out under a general model in which words are produced by a source (in the information-theoretic sense) that emits symbols. Under some natural assumptions that encompass all commonly used data models (and more), we obtain a precise average-case and probabilistic analysis of stack-size. Furthermore, we study the dependency between the stack-size and the ordering on symbols in the alphabet: we establish that, when the source emits independent symbols, the optimal ordering arises when the most probable symbol is the last one in this order.
LA - eng
KW - Average-case analysis of data-structures; information theory; trie; Mellin analysis.; digital trees; stack-size
UR - http://eudml.org/doc/222100
ER -

References

top
  1. J. Clément, P. Flajolet and B. Vallée, Dynamical Sources in Information Theory: A General Analysis of Trie Structures. Algorithmica29 (2001) 307-369.  
  2. H. Daudé, P. Flajolet and B. Vallée, An average-case analysis of the Gaussian algorithm for lattice reduction. Combina. Probab. Comput.6 (1997) 397-433.  
  3. N.G. De Bruijn, D.E. Knuth and S.O. Rice, The average height of planted plane trees, Graph Theory and Computing. Academic Press (1972) 15-22.  
  4. L. Devroye and P. Kruszewski, On the Horton-Strahler number for Random Tries. RAIRO: Theoret. Informatics Appl.30 (1996) 443-456.  
  5. P. Flajolet, On the performance of evaluation of extendible hashing and trie searching. Acta Informatica20 (1983) 345-369.  
  6. P. Flajolet, X. Gourdon and P. Dumas, Mellin transforms and asymptotics: Harmonic sums. Theoret. Comput. Sci.144 (1995) 3-58.  
  7. P. Flajolet and C. Puech, Partial match retrieval of multidimensional data. J. ACM33 (1986) 371-407.  
  8. E.H. Fredkin, Trie Memory. Comm. ACM3 (1990) 490-500.  
  9. G.H. Gonnet and R. Baeza-Yates, Handbook of Algorithms and Data Structures: in Pascal and C. Addison-Wesley (1991).  
  10. A. Grothendieck, Produit tensoriels topologiques et espaces nucléaires. Mem. Amer. Math. Soc. 16 (1955).  
  11. A. Grothendieck, La Théorie de Fredholm. Bull. Soc. Math. France 84, 319-384.  
  12. P. Jacquet and W. Szpankowski, Analytical Depoissonization and its Applications. Theoret. Comput. Sci. 201 in ``Fundamental Study'' (1998) 1-62.  
  13. P. Kirschenhofer and H. Prodinger, On the Recursion Depth of Special Tree Traversal Algorithms. Inform. and Comput.74 (1987) 15-32.  
  14. R. Kemp, The average height of  
  15. D.E. Knuth, The Art of Computer Programming, Vol. 3: Sorting and Searching. Addison-Wesley (1973).  
  16. M; Krasnoselskii, Positive solutions of operator equations. P. Noordhoff, Groningen (1964).  
  17. H.M. Mahmoud, Evolution of Random Search Trees. Wiley-Interscience Series (1992).  
  18. M.E. Nebel, The Stack-Size of Tries, a Combinatorial Study. Theoret. Comput. Sci. (to appear).  
  19. M.E. Nebel, The Stack-Size of Uniform Random Tries Revisited (submitted).  
  20. M.E. Nebel, On the Horton-Strahler Number for Combinatorial Tries. RAIRO: Theoret. Informatics Appl.34 (2000) 279-296.  
  21. M. Régnier, Trie hashing analysis, in Proc. 4th Int.Conf. Data Eng.. Los Angeles, CA (1988) 377-387.  
  22. M. Régnier, On the average height of trees in in digital search and dynamic hashing. Inform. Process. Lett.13 (1982) 64-66.  
  23. R.L. Rivest, Partial match retrieval algorithms. SIAM J. Comput.5 (1976) 19-50.  
  24. R. Sedgewick, Algorithms. Addison-Wesley (1988).  
  25. W. Szpankowski, On the height of digital trees and related problem. Algorithmica6 (1991) 256-277.  
  26. W. Szpankowski, Some results on  
  27. L. Trabb Pardo, Set representation and set intersection, Technical Report. Stanford University (1998).  
  28. B. Vallée, Dynamical Sources in Information Theory: Fundamental Intervals and Word Prefixes. Algorithmica29 (2001) 162-306.  
  29. X.G. Viennot, Trees Everywhere, in Proc. CAAP'90. Springer, Lecture Notes in Comput. Sci. 431 (1990) 18-41.  
  30. A. Yao, A note on the analysis of extendible hashing. Inform. Process. Lett.11 (1980) 84-86.  

NotesEmbed ?

top

You must be logged in to post comments.

To embed these notes on your page include the following JavaScript code on your page where you want the notes to appear.

Only the controls for the widget will be shown in your chosen language. Notes will be shown in their authored language.

Tells the widget how many notes to show per page. You can cycle through additional notes using the next and previous controls.

    
                

Note: Best practice suggests putting the JavaScript code just before the closing </body> tag.