Detection of influential points by convex hull volume minimization

Petr Tichavský; Pavel Boček

Kybernetika (1998)

  • Volume: 34, Issue: 5, page [515]-534
  • ISSN: 0023-5954

Abstract

A method of geometrical characterization of multidimensional data sets, including construction of the convex hull of the data and calculation of the volume of the convex hull, is described. This technique, together with the concept of minimum convex hull volume, can be used for detection of influential points or outliers in multiple linear regression. An approximation to the true concept is achieved by ordering the data into a linear sequence such that the volume of the convex hull of the first $n$ terms in the sequence grows as slowly as possible with $n$. The performance of the method is demonstrated on four well-known data sets. The average computational complexity needed for the ordering is estimated by $O(N^{2+(p-1)/(p+1)})$ for large $N$, where $N$ is the number of observations and $p$ is the data dimension, i.e. the number of predictors plus 1.
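The ordering idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes two-dimensional data ($p = 2$, so "volume" is area), starts from the point nearest the coordinate-wise median, and greedily appends whichever remaining point enlarges the hull least. The function names `hull_area` and `greedy_hull_ordering`, the starting rule, and the tie-breaking by distance to the median are all illustrative assumptions, and the sketch makes no attempt to reach the complexity estimate quoted above.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def hull_area(points: Sequence[Point]) -> float:
    """Area of the 2-D convex hull (0.0 for degenerate point sets)."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return 0.0

    def half_hull(seq) -> List[Point]:
        # One pass of Andrew's monotone chain; drops the last point
        # because it is the first point of the other half.
        chain: List[Point] = []
        for p in seq:
            while len(chain) >= 2 and _cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]

    hull = half_hull(pts) + half_hull(reversed(pts))
    # Shoelace formula over the hull vertices.
    twice_area = 0.0
    for i, (x1, y1) in enumerate(hull):
        x2, y2 = hull[(i + 1) % len(hull)]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def greedy_hull_ordering(points: Sequence[Point]):
    """Order points so the hull area of each prefix grows slowly.

    Assumption (not from the paper): a purely greedy forward search,
    with ties broken by squared distance to the coordinate-wise median.
    Returns the ordered points and the hull area after each addition.
    """
    remaining = list(points)
    if not remaining:
        return [], []
    xs = sorted(p[0] for p in remaining)
    ys = sorted(p[1] for p in remaining)
    med = (xs[len(xs) // 2], ys[len(ys) // 2])

    def dist2(p: Point) -> float:
        return (p[0] - med[0]) ** 2 + (p[1] - med[1]) ** 2

    ordered: List[Point] = []
    areas: List[float] = []
    while remaining:
        best = min(remaining,
                   key=lambda p: (hull_area(ordered + [p]), dist2(p)))
        remaining.remove(best)
        ordered.append(best)
        areas.append(hull_area(ordered))
    return ordered, areas


if __name__ == "__main__":
    data = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0),
            (0.5, 0.5), (9.0, 2.0)]
    order, areas = greedy_hull_ordering(data)
    for point, area in zip(order, areas):
        print(point, area)
```

In such an ordering, points that appear late and cause a sharp jump in the prefix hull area are the natural candidates for influential points or outliers; where to place the cut-off is left open here.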

How to cite


Tichavský, Petr, and Boček, Pavel. "Detection of influential points by convex hull volume minimization." Kybernetika 34.5 (1998): [515]-534. <http://eudml.org/doc/33385>.

@article{Tichavský1998,
abstract = {A method of geometrical characterization of multidimensional data sets, including construction of the convex hull of the data and calculation of the volume of the convex hull, is described. This technique, together with the concept of minimum convex hull volume, can be used for detection of influential points or outliers in multiple linear regression. An approximation to the true concept is achieved by ordering the data into a linear sequence such that the volume of the convex hull of the first $n$ terms in the sequence grows as slowly as possible with $n$. The performance of the method is demonstrated on four well-known data sets. The average computational complexity needed for the ordering is estimated by $O(N^{2+(p-1)/(p+1)})$ for large $N$, where $N$ is the number of observations and $p$ is the data dimension, i.e. the number of predictors plus 1.},
author = {Tichavský, Petr and Boček, Pavel},
journal = {Kybernetika},
keywords = {multiple linear regression; detection of influential points},
language = {eng},
number = {5},
pages = {[515]-534},
publisher = {Institute of Information Theory and Automation AS CR},
title = {Detection of influential points by convex hull volume minimization},
url = {http://eudml.org/doc/33385},
volume = {34},
year = {1998},
}

TY - JOUR
AU - Tichavský, Petr
AU - Boček, Pavel
TI - Detection of influential points by convex hull volume minimization
JO - Kybernetika
PY - 1998
PB - Institute of Information Theory and Automation AS CR
VL - 34
IS - 5
SP - [515]
EP - 534
AB - A method of geometrical characterization of multidimensional data sets, including construction of the convex hull of the data and calculation of the volume of the convex hull, is described. This technique, together with the concept of minimum convex hull volume, can be used for detection of influential points or outliers in multiple linear regression. An approximation to the true concept is achieved by ordering the data into a linear sequence such that the volume of the convex hull of the first $n$ terms in the sequence grows as slowly as possible with $n$. The performance of the method is demonstrated on four well-known data sets. The average computational complexity needed for the ordering is estimated by $O(N^{2+(p-1)/(p+1)})$ for large $N$, where $N$ is the number of observations and $p$ is the data dimension, i.e. the number of predictors plus 1.
LA - eng
KW - multiple linear regression; detection of influential points
UR - http://eudml.org/doc/33385
ER -

