Clustering of Symbolic Data based on Affinity Coefficient: Application to a Real Data Set

Áurea Sousa; Helena Bacelar-Nicolau; Fernando C. Nicolau; Osvaldo Silva

Biometrical Letters (2013)

  • Volume: 50, Issue: 1, pages 27-38
  • ISSN: 1896-3811

Abstract

In this paper, we illustrate an application of Ascendant Hierarchical Cluster Analysis (AHCA) to complex data taken from the literature (interval data), based on the standardized weighted generalized affinity coefficient, by the method of Wald and Wolfowitz. The probabilistic aggregation criteria used belong to a parametric family of methods under the probabilistic approach of AHCA, named VL methodology. Finally, we compare the results achieved using our approach with those obtained by other authors.
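
As a rough illustration of the kind of similarity the abstract refers to, the sketch below computes an affinity between interval-valued units and runs a naive agglomerative clustering on it. It rests on assumptions that are not taken from the paper: each interval is read as a uniform density, so the Matusita affinity (the integral of the square root of the product of the two densities) reduces to the overlap length divided by the geometric mean of the two lengths; variables get equal weights unless weights are supplied; and plain average linkage stands in for the VL aggregation criteria. The Wald and Wolfowitz standardization is not reproduced. The function names and the toy data are hypothetical.

```python
import math

def interval_affinity(a, b):
    """Affinity between two intervals, each read as a uniform density.

    Assumes both intervals have positive length.
    """
    (a_lo, a_hi), (b_lo, b_hi) = a, b
    overlap = max(0.0, min(a_hi, b_hi) - max(a_lo, b_lo))
    return overlap / math.sqrt((a_hi - a_lo) * (b_hi - b_lo))

def weighted_affinity(x, y, weights=None):
    """Weighted sum of per-variable interval affinities (equal weights by default)."""
    if weights is None:
        weights = [1.0 / len(x)] * len(x)
    return sum(w * interval_affinity(a, b) for w, a, b in zip(weights, x, y))

def average_link_merges(units):
    """Naive agglomerative clustering on similarities; returns the merge history.

    Average linkage is a stand-in here, not the VL methodology.
    """
    n = len(units)
    sim = {(i, j): weighted_affinity(units[i], units[j])
           for i in range(n) for j in range(i + 1, n)}

    def between(c1, c2):
        # Mean pairwise similarity between two clusters of unit indices.
        pairs = [(min(i, j), max(i, j)) for i in c1 for j in c2]
        return sum(sim[p] for p in pairs) / len(pairs)

    clusters = [[i] for i in range(n)]
    merges = []
    while len(clusters) > 1:
        # Pick the most similar pair of clusters and merge it.
        s, p, q = max(
            ((between(clusters[p], clusters[q]), p, q)
             for p in range(len(clusters))
             for q in range(p + 1, len(clusters))),
            key=lambda t: t[0],
        )
        merges.append((clusters[p], clusters[q], s))
        merged = clusters[p] + clusters[q]
        clusters = [c for k, c in enumerate(clusters) if k not in (p, q)] + [merged]
    return merges

# Hypothetical toy data: three units described by two interval variables.
units = [
    [(0.0, 2.0), (10.0, 14.0)],
    [(1.0, 3.0), (11.0, 15.0)],
    [(8.0, 9.0), (0.0, 1.0)],
]

for left, right, s in average_link_merges(units):
    print(left, right, round(s, 3))
```

On this toy data the first merge joins units 0 and 1 at affinity 0.625, and unit 2, whose intervals overlap neither of the others, joins last at affinity 0.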

How to cite

Áurea Sousa, et al. "Clustering of Symbolic Data based on Affinity Coefficient: Application to a Real Data Set." Biometrical Letters 50.1 (2013): 27-38. <http://eudml.org/doc/268875>.

@article{ÁureaSousa2013,
abstract = {In this paper, we illustrate an application of Ascendant Hierarchical Cluster Analysis (AHCA) to complex data taken from the literature (interval data), based on the standardized weighted generalized affinity coefficient, by the method of Wald and Wolfowitz. The probabilistic aggregation criteria used belong to a parametric family of methods under the probabilistic approach of AHCA, named VL methodology. Finally, we compare the results achieved using our approach with those obtained by other authors.},
author = {Áurea Sousa and Helena Bacelar-Nicolau and Fernando C. Nicolau and Osvaldo Silva},
journal = {Biometrical Letters},
keywords = {Ascendant Hierarchical Cluster Analysis; Symbolic Data; Interval Data; Affinity Coefficient; VL Methodology},
language = {eng},
number = {1},
pages = {27--38},
title = {Clustering of Symbolic Data based on Affinity Coefficient: Application to a Real Data Set},
url = {http://eudml.org/doc/268875},
volume = {50},
year = {2013},
}

TY - JOUR
AU - Áurea Sousa
AU - Helena Bacelar-Nicolau
AU - Fernando C. Nicolau
AU - Osvaldo Silva
TI - Clustering of Symbolic Data based on Affinity Coefficient: Application to a Real Data Set
JO - Biometrical Letters
PY - 2013
VL - 50
IS - 1
SP - 27
EP - 38
AB - In this paper, we illustrate an application of Ascendant Hierarchical Cluster Analysis (AHCA) to complex data taken from the literature (interval data), based on the standardized weighted generalized affinity coefficient, by the method of Wald and Wolfowitz. The probabilistic aggregation criteria used belong to a parametric family of methods under the probabilistic approach of AHCA, named VL methodology. Finally, we compare the results achieved using our approach with those obtained by other authors.
LA - eng
KW - Ascendant Hierarchical Cluster Analysis; Symbolic Data; Interval Data; Affinity Coefficient; VL Methodology
UR - http://eudml.org/doc/268875
ER -

