The use of information and information gain in the analysis of attribute dependencies
Krzysztof Moliński; Anita Dobek; Kamila Tomaszyk
Biometrical Letters (2012)
- Volume: 49, Issue: 2, page 149-158
- ISSN: 1896-3811
Abstract

This paper demonstrates the possible conclusions which can be drawn from an analysis of entropy and information. Because of its universality, entropy can be widely used in different fields, especially in biomedicine. Based on simulated data, the similarities and differences between the grouping of attributes and the testing of their independence are shown. It follows that a complete exploration of data sets requires both of these elements. A new concept introduced in this paper is that of normed information gain, which allows the use of any logarithm in the definition of entropy.
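As a quick illustration of the quantities named above, the Python sketch below computes entropy, information gain, and a normed information gain for discrete attributes. It is a minimal sketch under assumptions made here: the function names are invented for this example, and the normalization by H(Y) is only one plausible choice (the paper's own definition of normed information gain may differ; Rajski (1961), cited below, normalizes by the joint entropy instead). The sketch does exhibit the property the abstract claims: numerator and denominator scale by the same factor when the logarithm base changes, so the normed value is base-independent.

import math
from collections import Counter

def entropy(values, base=2.0):
    # Shannon entropy of a discrete attribute, in units of the chosen log base.
    n = len(values)
    return -sum((c / n) * math.log(c / n, base) for c in Counter(values).values())

def information_gain(y, x, base=2.0):
    # IG(Y; X) = H(Y) - H(Y | X): the reduction in uncertainty about y
    # once the value of x is known.
    n = len(y)
    groups = {}
    for xi, yi in zip(x, y):
        groups.setdefault(xi, []).append(yi)
    h_y_given_x = sum(len(g) / n * entropy(g, base) for g in groups.values())
    return entropy(y, base) - h_y_given_x

def normed_information_gain(y, x, base=2.0):
    # Assumed normalization: divide by H(Y). Both terms carry the same
    # 1 / log(base) factor, so the ratio is the same for any base.
    h_y = entropy(y, base)
    return information_gain(y, x, base) / h_y if h_y > 0 else 0.0

For example, with y = ['a', 'a', 'b', 'b'] and x = [0, 0, 1, 1], x determines y completely, so normed_information_gain(y, x, base=2) and normed_information_gain(y, x, base=10) both return 1.0.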
How to cite
Krzysztof Moliński, Anita Dobek, and Kamila Tomaszyk. "The use of information and information gain in the analysis of attribute dependencies." Biometrical Letters 49.2 (2012): 149-158. <http://eudml.org/doc/268905>.
@article{KrzysztofMoliński2012,
abstract = {This paper demonstrates the possible conclusions which can be drawn from an analysis of entropy and information. Because of its universality, entropy can be widely used in different fields, especially in biomedicine. Based on simulated data, the similarities and differences between the grouping of attributes and the testing of their independence are shown. It follows that a complete exploration of data sets requires both of these elements. A new concept introduced in this paper is that of normed information gain, which allows the use of any logarithm in the definition of entropy.},
author = {Krzysztof Moliński, Anita Dobek, Kamila Tomaszyk},
journal = {Biometrical Letters},
keywords = {dendrogram; entropy; information gain},
language = {eng},
number = {2},
pages = {149-158},
title = {The use of information and information gain in the analysis of attribute dependencies},
url = {http://eudml.org/doc/268905},
volume = {49},
year = {2012},
}
TY - JOUR
AU - Krzysztof Moliński
AU - Anita Dobek
AU - Kamila Tomaszyk
TI - The use of information and information gain in the analysis of attribute dependencies
JO - Biometrical Letters
PY - 2012
VL - 49
IS - 2
SP - 149
EP - 158
AB - This paper demonstrates the possible conclusions which can be drawn from an analysis of entropy and information. Because of its universality, entropy can be widely used in different fields, especially in biomedicine. Based on simulated data, the similarities and differences between the grouping of attributes and the testing of their independence are shown. It follows that a complete exploration of data sets requires both of these elements. A new concept introduced in this paper is that of normed information gain, which allows the use of any logarithm in the definition of entropy.
LA - eng
KW - dendrogram; entropy; information gain
UR - http://eudml.org/doc/268905
ER -
References
- Bezzi M. (2007): Quantifying the information transmitted in a single stimulus. Biosystems 89: 4-9.
- Brunsell N.A. (2010): A multiscale information theory approach to assess spatial-temporal variability of daily precipitation. Journal of Hydrology 385: 165-172.
- Jakulin A. (2005): Machine learning based on attribute interactions. PhD Dissertation, University of Ljubljana.
- Jakulin A., Bratko I., Smrke D., Demšar J., Zupan B. (2003): Attribute interactions in medical data analysis. In: 9th Conference on Artificial Intelligence in Medicine in Europe (AIME 2003), October 18-22, 2003, Protaras, Cyprus.
- Jakulin A., Bratko I. (2003): Analyzing attribute dependencies. In: 7th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD 2003), September 22-26, Cavtat, Croatia.
- Jakulin A., Bratko I. (2004a): Quantifying and visualizing attribute interactions: an approach based on entropy. http://arxiv.org/abs/cs.AI/0308002v3.
- Jakulin A., Bratko I. (2004b): Testing the significance of attribute interaction. Proc. 21st International Conference on Machine Learning. Banff, Canada.
- Kang G., Yue W., Zhang J., Cui Y., Zuo Y., Zhang D. (2008): An entropy-based approach for testing genetic epistasis underlying complex diseases. Journal of Theoretical Biology 250: 362-374.
- Kullback S., Leibler R.A. (1951): On information and sufficiency. Annals of Mathematical Statistics 22(1): 79-86. Zbl 0042.38403
- Matsuda H. (2000): Physical nature of higher-order mutual information: intrinsic correlation and frustration. Physical Review E 62: 3096-3102.
- McGill W.J. (1954): Multivariate information transmission. Psychometrika 19(2): 97-116. Zbl 0058.35706
- Moniz L.J., Cooch E.G., Ellner S.P., Nichols J.D., Nichols J.M. (2007): Application of information theory methods to food web reconstruction. Ecological Modelling 208: 145-158.
- Moore J.H., Gilbert J.C., Tsai C.-T., Chiang F.-T., Holden T., Barney N., White B.C. (2006): A flexible computational framework for detecting, characterizing, and interpreting statistical patterns of epistasis in genetic studies of human disease susceptibility. Journal of Theoretical Biology 241: 252-261.
- Rajski C. (1961): A metric space of discrete probability distributions. Information and Control 4: 373-377. Zbl 0103.35805
- Shannon C. (1948): A mathematical theory of communication. Bell System Technical Journal 27: 379-423, 623-656. Zbl 1154.94303
- Yan Z., Wang Z., Xie H. (2008): The application of mutual information-based feature selection and fuzzy LS-SVM-based classifier in motion classification. Computer Methods and Programs in Biomedicine 90: 275-284.