# The use of information and information gain in the analysis of attribute dependencies

Krzysztof Moliński; Anita Dobek; Kamila Tomaszyk

Biometrical Letters (2012)

- Volume: 49, Issue: 2, page 149-158
- ISSN: 1896-3811

## Abstract

This paper demonstrates the possible conclusions which can be drawn from an analysis of entropy and information. Because of its universality, entropy can be widely used in different subjects, especially in biomedicine. Based on simulated data the similarities and differences between the grouping of attributes and testing of their independencies are shown. It follows that a complete exploration of data sets requires both of these elements. A new concept introduced in this paper is that of normed information gain, allowing the use of any logarithm in the definition of entropy.

## How to cite

Krzysztof Moliński, Anita Dobek, and Kamila Tomaszyk. "The use of information and information gain in the analysis of attribute dependencies." Biometrical Letters 49.2 (2012): 149-158. <http://eudml.org/doc/268905>.

```bibtex
@article{KrzysztofMoliński2012,
  abstract = {This paper demonstrates the possible conclusions which can be drawn from an analysis of entropy and information. Because of its universality, entropy can be widely used in different subjects, especially in biomedicine. Based on simulated data the similarities and differences between the grouping of attributes and testing of their independencies are shown. It follows that a complete exploration of data sets requires both of these elements. A new concept introduced in this paper is that of normed information gain, allowing the use of any logarithm in the definition of entropy.},
  author = {Krzysztof Moliński, Anita Dobek, Kamila Tomaszyk},
  journal = {Biometrical Letters},
  keywords = {dendrogram; entropy; information gain},
  language = {eng},
  number = {2},
  pages = {149-158},
  title = {The use of information and information gain in the analysis of attribute dependencies},
  url = {http://eudml.org/doc/268905},
  volume = {49},
  year = {2012},
}
```

```text
TY  - JOUR
AU  - Krzysztof Moliński
AU  - Anita Dobek
AU  - Kamila Tomaszyk
TI  - The use of information and information gain in the analysis of attribute dependencies
JO  - Biometrical Letters
PY  - 2012
VL  - 49
IS  - 2
SP  - 149
EP  - 158
AB  - This paper demonstrates the possible conclusions which can be drawn from an analysis of entropy and information. Because of its universality, entropy can be widely used in different subjects, especially in biomedicine. Based on simulated data the similarities and differences between the grouping of attributes and testing of their independencies are shown. It follows that a complete exploration of data sets requires both of these elements. A new concept introduced in this paper is that of normed information gain, allowing the use of any logarithm in the definition of entropy.
LA  - eng
KW  - dendrogram; entropy; information gain
UR  - http://eudml.org/doc/268905
ER  -
```
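The abstract's "normed information gain" is introduced so that the choice of logarithm base in the entropy definition no longer matters. As a rough illustration of why a normalization achieves this (this sketch is not necessarily the paper's exact definition), dividing the information gain I(X;Y) = H(X) + H(Y) − H(X,Y) by the joint entropy H(X,Y) cancels the base, since changing the base rescales every entropy term by the same constant; compare Rajski (1961), who uses the related quantity 1 − I(X;Y)/H(X,Y) as a metric. A minimal Python version:

```python
from collections import Counter
from math import e, log

def entropy(values, base=2.0):
    """Shannon entropy of the empirical distribution of `values`."""
    n = len(values)
    return -sum((c / n) * log(c / n, base) for c in Counter(values).values())

def normed_information_gain(x, y, base=2.0):
    """I(X;Y) / H(X,Y): information gain normalized by joint entropy.

    Illustrative normalization only (an assumption, not taken from the
    paper). Changing `base` multiplies numerator and denominator by the
    same factor, so the ratio is base-free.
    """
    h_xy = entropy(list(zip(x, y)), base)
    gain = entropy(x, base) + entropy(y, base) - h_xy
    return gain / h_xy if h_xy > 0 else 0.0

# Two partially dependent binary attributes: the ratio is the same
# whether entropy is measured in bits (base 2) or nats (base e).
x = [0, 0, 1, 1, 0, 1, 0, 1]
y = [0, 0, 1, 1, 1, 1, 0, 0]
g_bits = normed_information_gain(x, y, base=2.0)
g_nats = normed_information_gain(x, y, base=e)
print(g_bits, g_nats)
```

For the unnormalized gain the two bases would differ by a factor of ln 2, which is exactly the invariance the normed version removes.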

## References

- Bezzi M. (2007): Quantifying the information transmitted in a single stimulus. Biosystems 89: 4-9.
- Brunsell N.A. (2010): A multiscale information theory approach to assess spatial-temporal variability of daily precipitation. Journal of Hydrology 385: 165-172.
- Jakulin A. (2005): Machine learning based on attribute interactions. PhD Dissertation. University of Ljubljana.
- Jakulin A., Bratko I., Smrke D., Demsar J., Zupan B. (2003): Attribute interactions in medical data analysis. In: 9th Conference on Artificial Intelligence in Medicine in Europe (AIME 2003), October 18-22, 2003, Protaras, Cyprus.
- Jakulin A., Bratko I. (2003): Analyzing attribute dependencies. In: 7th European Conference on Principles and Practice of Knowledge Discovery in Databases (PKDD 2003), September 22-26, Cavtat, Croatia.
- Jakulin A., Bratko I. (2004a): Quantifying and visualizing attribute interactions: an approach based on entropy. http://arxiv.org/abs/cs.AI/0308002v3.
- Jakulin A., Bratko I. (2004b): Testing the significance of attribute interaction. Proc. 21st International Conference on Machine Learning. Banff, Canada.
- Kang G., Yue W., Zhang J., Cui Y., Zuo Y., Zhang D. (2008): An entropy-based approach for testing genetic epistasis underlying complex diseases. Journal of Theoretical Biology 250: 362-374.
- Kullback S., Leibler R.A. (1951): On information and sufficiency. Annals of Mathematical Statistics 22(1): 79-86. Zbl 0042.38403
- Matsuda H. (2000): Physical nature of higher-order mutual information: Intrinsic correlation and frustration. Physical Review E 62: 3096-3102.
- McGill W.J. (1954): Multivariate information transmission. Psychometrika 19(2): 97-116. Zbl 0058.35706
- Moniz L.J., Cooch E.G., Ellner S.P., Nichols J.D., Nichols J.M. (2007): Application of information theory methods to food web reconstruction. Ecological Modelling 208: 145-158.
- Moore J.H., Gilbert J.C., Tsai C.-T., Chiang F.-T., Holden T., Barney N., White B.C. (2006): A flexible computational framework for detecting, characterizing, and interpreting statistical patterns of epistasis in genetic studies of human disease susceptibility. Journal of Theoretical Biology 241: 252-261.
- Rajski C. (1961): A metric space of discrete probability distributions. Information and Control 4: 373-377. Zbl 0103.35805
- Shannon C. (1948): A mathematical theory of communication. Bell System Technical Journal 27: 379-423, 623-656. Zbl 1154.94303
- Yan Z., Wang Z., Xie H. (2008): The application of mutual information-based feature selection and fuzzy LS-SVM-based classifier in motion classification. Computer Methods and Programs in Biomedicine 90: 275-284.
