Seeking Relationships in Big Data: A Bayesian Perspective

Singpurwalla, Nozer

Serdica Journal of Computing (2014)

  • Volume: 8, Issue: 2, page 97-110
  • ISSN: 1312-6555

Abstract

The real purpose of collecting big data is to identify causality in the hope that this will facilitate credible predictivity. But the search for causality can trap one into infinite regress, and thus one takes refuge in seeking associations between variables in data sets. Regrettably, the mere knowledge of associations does not enable predictivity. Associations need to be embedded within the framework of probability calculus to make coherent predictions. This is so because associations are a feature of probability models, and hence they do not exist outside the framework of a model. Measures of association, like correlation, regression, and mutual information, merely refute a preconceived model. Estimated measures of associations do not lead to a probability model; a model is the product of pure thought. This paper discusses these and other fundamentals that are germane to seeking associations in particular, and machine learning in general. ACM Computing Classification System (1998): H.1.2, H.2.4, G.3.

How to cite


Singpurwalla, Nozer. "Seeking Relationships in Big Data: A Bayesian Perspective." Serdica Journal of Computing 8.2 (2014): 97-110. <http://eudml.org/doc/269894>.

@article{Singpurwalla2014,
abstract = {The real purpose of collecting big data is to identify causality in the hope that this will facilitate credible predictivity. But the search for causality can trap one into infinite regress, and thus one takes refuge in seeking associations between variables in data sets. Regrettably, the mere knowledge of associations does not enable predictivity. Associations need to be embedded within the framework of probability calculus to make coherent predictions. This is so because associations are a feature of probability models, and hence they do not exist outside the framework of a model. Measures of association, like correlation, regression, and mutual information, merely refute a preconceived model. Estimated measures of associations do not lead to a probability model; a model is the product of pure thought. This paper discusses these and other fundamentals that are germane to seeking associations in particular, and machine learning in general. ACM Computing Classification System (1998): H.1.2, H.2.4, G.3.},
author = {Singpurwalla, Nozer},
journal = {Serdica Journal of Computing},
keywords = {Association; Correlation; Dependence; Mutual Information; Prediction; Regression; Retrospective Data},
language = {eng},
number = {2},
pages = {97-110},
publisher = {Institute of Mathematics and Informatics Bulgarian Academy of Sciences},
title = {Seeking Relationships in Big Data: A Bayesian Perspective},
url = {http://eudml.org/doc/269894},
volume = {8},
year = {2014},
}

TY - JOUR
AU - Singpurwalla, Nozer
TI - Seeking Relationships in Big Data: A Bayesian Perspective
JO - Serdica Journal of Computing
PY - 2014
PB - Institute of Mathematics and Informatics Bulgarian Academy of Sciences
VL - 8
IS - 2
SP - 97
EP - 110
AB - The real purpose of collecting big data is to identify causality in the hope that this will facilitate credible predictivity. But the search for causality can trap one into infinite regress, and thus one takes refuge in seeking associations between variables in data sets. Regrettably, the mere knowledge of associations does not enable predictivity. Associations need to be embedded within the framework of probability calculus to make coherent predictions. This is so because associations are a feature of probability models, and hence they do not exist outside the framework of a model. Measures of association, like correlation, regression, and mutual information, merely refute a preconceived model. Estimated measures of associations do not lead to a probability model; a model is the product of pure thought. This paper discusses these and other fundamentals that are germane to seeking associations in particular, and machine learning in general. ACM Computing Classification System (1998): H.1.2, H.2.4, G.3.
LA - eng
KW - Association; Correlation; Dependence; Mutual Information; Prediction; Regression; Retrospective Data
UR - http://eudml.org/doc/269894
ER -
