# Exponential inequalities for VLMC empirical trees

Antonio Galves; Véronique Maume-Deschamps; Bernard Schmitt

ESAIM: Probability and Statistics (2008)

- Volume: 12, page 219-229
- ISSN: 1292-8100

## Access Full Article

top## Abstract

top## How to cite

topGalves, Antonio, Maume-Deschamps, Véronique, and Schmitt, Bernard. "Exponential inequalities for VLMC empirical trees." ESAIM: Probability and Statistics 12 (2008): 219-229. <http://eudml.org/doc/250385>.

@article{Galves2008,

abstract = {
A seminal paper by Rissanen, published in 1983, introduced the class
of Variable Length Markov Chains and the algorithm Context which
estimates the probabilistic tree generating the chain. Even if the
subject was recently considered in several papers, the central
question of the rate of convergence of the algorithm remained
open. This is the question we address here. We provide an
exponential upper bound for the probability of incorrect estimation
of the probabilistic tree, as a function of the size of the
sample. As a consequence we prove the almost sure consistency of the
algorithm Context. We also derive exponential upper bounds for type
I errors and for the probability of underestimation of the context tree.
The constants appearing in the bounds are all
explicit and obtained in a constructive way.
},

author = {Galves, Antonio, Maume-Deschamps, Véronique, Schmitt, Bernard},

journal = {ESAIM: Probability and Statistics},

keywords = {Variable Length Markov Chain; context tree; algorithm context; weak dependance; variable length Markov chain},

language = {eng},

month = {1},

pages = {219-229},

publisher = {EDP Sciences},

title = {Exponential inequalities for VLMC empirical trees},

url = {http://eudml.org/doc/250385},

volume = {12},

year = {2008},

}

TY - JOUR

AU - Galves, Antonio

AU - Maume-Deschamps, Véronique

AU - Schmitt, Bernard

TI - Exponential inequalities for VLMC empirical trees

JO - ESAIM: Probability and Statistics

DA - 2008/1//

PB - EDP Sciences

VL - 12

SP - 219

EP - 229

AB -
A seminal paper by Rissanen, published in 1983, introduced the class
of Variable Length Markov Chains and the algorithm Context which
estimates the probabilistic tree generating the chain. Even if the
subject was recently considered in several papers, the central
question of the rate of convergence of the algorithm remained
open. This is the question we address here. We provide an
exponential upper bound for the probability of incorrect estimation
of the probabilistic tree, as a function of the size of the
sample. As a consequence we prove the almost sure consistency of the
algorithm Context. We also derive exponential upper bounds for type
I errors and for the probability of underestimation of the context tree.
The constants appearing in the bounds are all
explicit and obtained in a constructive way.

LA - eng

KW - Variable Length Markov Chain; context tree; algorithm context; weak dependance; variable length Markov chain

UR - http://eudml.org/doc/250385

ER -

## References

top- G. Bejerano and G. Yona, Variations on probabilistic suffix trees: statistical modeling and prediction of protein families. Bioinformatics17 (2001) 23–43.
- P. Bühlmann and A. Wyner, Variable length Markov chains. Ann. Statist.27 (1999) 480–513. Zbl0983.62048
- I. Csiszár, Large-scale typicality of Markov sample paths and consistency of MDL order estimators. Special issue on Shannon theory: perspective, trends, and applications. IEEE Trans. Inform. Theory48 (2002) 1616–1628. Zbl1060.62092
- I. Csiszár and Z. Talata, Context tree estimation for not necessarily finite memory processes via BIC and MDL, manuscript (2005). Zbl1284.94027
- J. Dedecker and P. Doukhan, A new covariance inequality and applications. Stochastic Process. Appl.106 (2003) 63–80. Zbl1075.60513
- J. Dedecker and C. Prieur, New dependence coefficients. Examples and applications to statistics. Prob. Theory Relat. Fields132 (2005) 203–236. Zbl1061.62058
- P. Ferrari and A. Galves, Coupling and regeneration for stochastic processes. Notes for a minicourse presented in XIII Escuela Venezolana de Matematicas. Can be downloaded from www.ime.usp.br/~pablo/book/abstract.html (2000).
- F. Ferrari and A. Wyner, Estimation of general stationary processes by variable length Markov chains. Scand. J. Statist.30 (2003) 459–480. Zbl1034.62080
- F. Leonardi and A. Galves, Sequence Motif identification and protein classification using probabilistic trees. Lect. Notes Comput. Sci.3594 (2005) 190–193.
- V. Maume-Deschamps, Exponential inequalities and estimation of conditional probabilities in Dependence in probability and statistics, Lect. Notes in Stat., Vol. 187, P. Bertail, P. Doukhan and P. Soulier Eds. Springer (2006). Zbl1177.60025
- J. Rissanen, A universal data compression system. IEEE Trans. Inform. Theory29 (1983) 656–664. Zbl0521.94010
- T.J. Tjalkens and F.M.J.F. Willems, Implementing the context-tree weighting method: arithmetic coding. Recent advances in interdisciplinary mathematics (Portland, ME, 1997). J. Combin. Inform. System Sci.25 (2000) 49-58. Zbl1219.94071
- F.M. Willems, Y.M. Shtarkov and T.J Tjalkens, The context-tree weighting method: basic properties. IEEE Trans. Inform. Theory41 (1995) 653–664. Zbl0837.94011

## NotesEmbed ?

topTo embed these notes on your page include the following JavaScript code on your page where you want the notes to appear.