On naive Bayes in speech recognition

László Tóth; András Kocsor; János Csirik

International Journal of Applied Mathematics and Computer Science (2005)

  • Volume: 15, Issue: 2, pages 287-294
  • ISSN: 1641-876X

Abstract

The currently dominant speech recognition technology, hidden Markov modeling, has long been criticized for its simplistic assumptions about speech, and especially for the naive Bayes combination rule inherent in it. Many sophisticated alternative models have been suggested over the last decade. These, however, have demonstrated only modest improvements and brought no paradigm shift in technology. The goal of this paper is to examine why HMM performs so well in spite of its incorrect bias due to the naive Bayes assumption. To do this we create an algorithmic framework that allows us to experiment with alternative combination schemes and helps us understand the factors that influence recognition performance. From the findings we argue that the bias peculiar to the naive Bayes rule is not really detrimental to phoneme classification performance. Furthermore, it ensures consistent behavior in outlier modeling, allowing efficient management of insertion and deletion errors.
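
As a rough illustration of the naive Bayes combination rule mentioned in the abstract, the sketch below scores a segment by multiplying per-frame class likelihoods (summing them in the log domain) and contrasts this with an averaging rule as one possible alternative combination scheme. The function names, the toy likelihood values, and the choice of averaging as the alternative are assumptions made for illustration; they are not taken from the paper.

```python
import math

def naive_bayes_combine(frame_likelihoods, priors):
    """Product rule: log-score(c) = log P(c) + sum over frames of log p(x_t | c)."""
    return {
        c: math.log(prior) + sum(math.log(f[c]) for f in frame_likelihoods)
        for c, prior in priors.items()
    }

def average_combine(frame_likelihoods, priors):
    """Averaging rule: score(c) = P(c) * mean over frames of p(x_t | c)."""
    n = len(frame_likelihoods)
    return {
        c: prior * sum(f[c] for f in frame_likelihoods) / n
        for c, prior in priors.items()
    }

def best_class(scores):
    """Return the class label with the highest score."""
    return max(scores, key=scores.get)

# Hypothetical toy data: two phoneme classes, three frames.
# The third frame is an outlier for class "a".
priors = {"a": 0.5, "e": 0.5}
frames = [
    {"a": 0.6, "e": 0.4},
    {"a": 0.6, "e": 0.4},
    {"a": 0.01, "e": 0.4},
]

product_scores = naive_bayes_combine(frames, priors)
average_scores = average_combine(frames, priors)
print(best_class(product_scores), best_class(average_scores))
```

In this toy case a single very unlikely frame is enough to change the decision under the product rule, while the averaging rule is less affected by it. This only illustrates how the two rules differ in their treatment of outlier frames; it says nothing about which rule performs better on real speech data.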

How to cite

Tóth, László, Kocsor, András, and Csirik, János. "On naive Bayes in speech recognition." International Journal of Applied Mathematics and Computer Science 15.2 (2005): 287-294. <http://eudml.org/doc/207743>.

@article{Tóth2005,
abstract = {The currently dominant speech recognition technology, hidden Markov modeling, has long been criticized for its simplistic assumptions about speech, and especially for the naive Bayes combination rule inherent in it. Many sophisticated alternative models have been suggested over the last decade. These, however, have demonstrated only modest improvements and brought no paradigm shift in technology. The goal of this paper is to examine why HMM performs so well in spite of its incorrect bias due to the naive Bayes assumption. To do this we create an algorithmic framework that allows us to experiment with alternative combination schemes and helps us understand the factors that influence recognition performance. From the findings we argue that the bias peculiar to the naive Bayes rule is not really detrimental to phoneme classification performance. Furthermore, it ensures consistent behavior in outlier modeling, allowing efficient management of insertion and deletion errors.},
author = {Tóth, László and Kocsor, András and Csirik, János},
journal = {International Journal of Applied Mathematics and Computer Science},
keywords = {segment-based speech recognition; naive Bayes; hidden Markov model},
language = {eng},
number = {2},
pages = {287-294},
title = {On naive Bayes in speech recognition},
url = {http://eudml.org/doc/207743},
volume = {15},
year = {2005},
}

TY - JOUR
AU - Tóth, László
AU - Kocsor, András
AU - Csirik, János
TI - On naive Bayes in speech recognition
JO - International Journal of Applied Mathematics and Computer Science
PY - 2005
VL - 15
IS - 2
SP - 287
EP - 294
AB - The currently dominant speech recognition technology, hidden Markov modeling, has long been criticized for its simplistic assumptions about speech, and especially for the naive Bayes combination rule inherent in it. Many sophisticated alternative models have been suggested over the last decade. These, however, have demonstrated only modest improvements and brought no paradigm shift in technology. The goal of this paper is to examine why HMM performs so well in spite of its incorrect bias due to the naive Bayes assumption. To do this we create an algorithmic framework that allows us to experiment with alternative combination schemes and helps us understand the factors that influence recognition performance. From the findings we argue that the bias peculiar to the naive Bayes rule is not really detrimental to phoneme classification performance. Furthermore, it ensures consistent behavior in outlier modeling, allowing efficient management of insertion and deletion errors.
LA - eng
KW - segment-based speech recognition; naive Bayes; hidden Markov model
UR - http://eudml.org/doc/207743
ER -

