On the Stack-Size of General Tries

Jérémie Bourdon; Markus Nebel; Brigitte Vallée

RAIRO - Theoretical Informatics and Applications (2010)

  • Volume: 35, Issue: 2, page 163-185
  • ISSN: 0988-3754

Abstract

top
Digital trees or tries are a general purpose flexible data structure that implements dictionaries built on words. The present paper is focussed on the average-case analysis of an important parameter of this tree-structure, i.e., the stack-size. The stack-size of a tree is the memory needed by a storage-optimal preorder traversal. The analysis is carried out under a general model in which words are produced by a source (in the information-theoretic sense) that emits symbols. Under some natural assumptions that encompass all commonly used data models (and more), we obtain a precise average-case and probabilistic analysis of stack-size. Furthermore, we study the dependency between the stack-size and the ordering on symbols in the alphabet: we establish that, when the source emits independent symbols, the optimal ordering arises when the most probable symbol is the last one in this order.

How to cite

top

Bourdon, Jérémie, Nebel, Markus, and Vallée, Brigitte. "On the Stack-Size of General Tries." RAIRO - Theoretical Informatics and Applications 35.2 (2010): 163-185. <http://eudml.org/doc/222100>.

@article{Bourdon2010,
abstract = { Digital trees or tries are a general purpose flexible data structure that implements dictionaries built on words. The present paper is focussed on the average-case analysis of an important parameter of this tree-structure, i.e., the stack-size. The stack-size of a tree is the memory needed by a storage-optimal preorder traversal. The analysis is carried out under a general model in which words are produced by a source (in the information-theoretic sense) that emits symbols. Under some natural assumptions that encompass all commonly used data models (and more), we obtain a precise average-case and probabilistic analysis of stack-size. Furthermore, we study the dependency between the stack-size and the ordering on symbols in the alphabet: we establish that, when the source emits independent symbols, the optimal ordering arises when the most probable symbol is the last one in this order. },
author = {Bourdon, Jérémie, Nebel, Markus, Vallée, Brigitte},
journal = {RAIRO - Theoretical Informatics and Applications},
keywords = {Average-case analysis of data-structures; information theory; trie; Mellin analysis.; digital trees; stack-size},
language = {eng},
month = {3},
number = {2},
pages = {163-185},
publisher = {EDP Sciences},
title = {On the Stack-Size of General Tries},
url = {http://eudml.org/doc/222100},
volume = {35},
year = {2010},
}

TY - JOUR
AU - Bourdon, Jérémie
AU - Nebel, Markus
AU - Vallée, Brigitte
TI - On the Stack-Size of General Tries
JO - RAIRO - Theoretical Informatics and Applications
DA - 2010/3//
PB - EDP Sciences
VL - 35
IS - 2
SP - 163
EP - 185
AB - Digital trees or tries are a general purpose flexible data structure that implements dictionaries built on words. The present paper is focussed on the average-case analysis of an important parameter of this tree-structure, i.e., the stack-size. The stack-size of a tree is the memory needed by a storage-optimal preorder traversal. The analysis is carried out under a general model in which words are produced by a source (in the information-theoretic sense) that emits symbols. Under some natural assumptions that encompass all commonly used data models (and more), we obtain a precise average-case and probabilistic analysis of stack-size. Furthermore, we study the dependency between the stack-size and the ordering on symbols in the alphabet: we establish that, when the source emits independent symbols, the optimal ordering arises when the most probable symbol is the last one in this order.
LA - eng
KW - Average-case analysis of data-structures; information theory; trie; Mellin analysis.; digital trees; stack-size
UR - http://eudml.org/doc/222100
ER -

References

top
  1. J. Clément, P. Flajolet and B. Vallée, Dynamical Sources in Information Theory: A General Analysis of Trie Structures. Algorithmica29 (2001) 307-369.  Zbl1035.68039
  2. H. Daudé, P. Flajolet and B. Vallée, An average-case analysis of the Gaussian algorithm for lattice reduction. Combina. Probab. Comput.6 (1997) 397-433.  Zbl0921.11072
  3. N.G. De Bruijn, D.E. Knuth and S.O. Rice, The average height of planted plane trees, Graph Theory and Computing. Academic Press (1972) 15-22.  Zbl0247.05106
  4. L. Devroye and P. Kruszewski, On the Horton-Strahler number for Random Tries. RAIRO: Theoret. Informatics Appl.30 (1996) 443-456.  Zbl0867.68087
  5. P. Flajolet, On the performance of evaluation of extendible hashing and trie searching. Acta Informatica20 (1983) 345-369.  Zbl0515.68048
  6. P. Flajolet, X. Gourdon and P. Dumas, Mellin transforms and asymptotics: Harmonic sums. Theoret. Comput. Sci.144 (1995) 3-58.  Zbl0869.68057
  7. P. Flajolet and C. Puech, Partial match retrieval of multidimensional data. J. ACM33 (1986) 371-407.  
  8. E.H. Fredkin, Trie Memory. Comm. ACM3 (1990) 490-500.  
  9. G.H. Gonnet and R. Baeza-Yates, Handbook of Algorithms and Data Structures: in Pascal and C. Addison-Wesley (1991).  Zbl0719.68001
  10. A. Grothendieck, Produit tensoriels topologiques et espaces nucléaires. Mem. Amer. Math. Soc. 16 (1955).  
  11. A. Grothendieck, La Théorie de Fredholm. Bull. Soc. Math. France 84, 319-384.  
  12. P. Jacquet and W. Szpankowski, Analytical Depoissonization and its Applications. Theoret. Comput. Sci. 201 in ``Fundamental Study'' (1998) 1-62.  Zbl0902.68087
  13. P. Kirschenhofer and H. Prodinger, On the Recursion Depth of Special Tree Traversal Algorithms. Inform. and Comput.74 (1987) 15-32.  Zbl0625.68048
  14. R. Kemp, The average height of  Zbl06657087
  15. D.E. Knuth, The Art of Computer Programming, Vol. 3: Sorting and Searching. Addison-Wesley (1973).  Zbl0302.68010
  16. M; Krasnoselskii, Positive solutions of operator equations. P. Noordhoff, Groningen (1964).  
  17. H.M. Mahmoud, Evolution of Random Search Trees. Wiley-Interscience Series (1992).  Zbl0762.68033
  18. M.E. Nebel, The Stack-Size of Tries, a Combinatorial Study. Theoret. Comput. Sci. (to appear).  Zbl0988.68137
  19. M.E. Nebel, The Stack-Size of Uniform Random Tries Revisited (submitted).  Zbl0995.68071
  20. M.E. Nebel, On the Horton-Strahler Number for Combinatorial Tries. RAIRO: Theoret. Informatics Appl.34 (2000) 279-296.  Zbl0966.05019
  21. M. Régnier, Trie hashing analysis, in Proc. 4th Int.Conf. Data Eng.. Los Angeles, CA (1988) 377-387.  
  22. M. Régnier, On the average height of trees in in digital search and dynamic hashing. Inform. Process. Lett.13 (1982) 64-66.  Zbl0472.68058
  23. R.L. Rivest, Partial match retrieval algorithms. SIAM J. Comput.5 (1976) 19-50.  Zbl0331.68064
  24. R. Sedgewick, Algorithms. Addison-Wesley (1988).  
  25. W. Szpankowski, On the height of digital trees and related problem. Algorithmica6 (1991) 256-277.  Zbl0711.68035
  26. W. Szpankowski, Some results on  Zbl0637.68072
  27. L. Trabb Pardo, Set representation and set intersection, Technical Report. Stanford University (1998).  
  28. B. Vallée, Dynamical Sources in Information Theory: Fundamental Intervals and Word Prefixes. Algorithmica29 (2001) 162-306.  Zbl1009.94003
  29. X.G. Viennot, Trees Everywhere, in Proc. CAAP'90. Springer, Lecture Notes in Comput. Sci. 431 (1990) 18-41.  
  30. A. Yao, A note on the analysis of extendible hashing. Inform. Process. Lett.11 (1980) 84-86.  Zbl0447.68077

NotesEmbed ?

top

You must be logged in to post comments.

To embed these notes on your page include the following JavaScript code on your page where you want the notes to appear.

Only the controls for the widget will be shown in your chosen language. Notes will be shown in their authored language.

Tells the widget how many notes to show per page. You can cycle through additional notes using the next and previous controls.

    
                

Note: Best practice suggests putting the JavaScript code just before the closing </body> tag.