Displaying similar documents to “A new method for identifying outlying subsets of data”

Nonparametric statistical analysis for multiple comparison of machine learning regression algorithms

Bogdan Trawiński, Magdalena Smętek, Zbigniew Telec, Tadeusz Lasota (2012)

International Journal of Applied Mathematics and Computer Science

In the paper we present some guidelines for the application of nonparametric statistical tests and post-hoc procedures devised to perform multiple comparisons of machine learning algorithms. We emphasize that it is necessary to distinguish between pairwise and multiple comparison tests. We show that the pairwise Wilcoxon test, when applied to multiple comparisons, will lead to overoptimistic conclusions. We carry out intensive normality examination employing ten different tests showing...

Detecting atypical data in air pollution studies by using shorth intervals for regression

Cécile Durot, Karelle Thiébot (2005)

ESAIM: Probability and Statistics

To validate pollution data, subject-matter experts in Airpl (an organization that maintains a network of air pollution monitoring stations in western France) perform daily visual examinations of the data and check their consistency. In this paper, we describe these visual examinations and propose a formalization for this problem. The examinations consist of comparisons of so-called shorth intervals, so we build a statistical test that compares such intervals in a nonparametric regression...

Linear discriminant analysis with a generalization of the Moore-Penrose pseudoinverse

Tomasz Górecki, Maciej Łuczak (2013)

International Journal of Applied Mathematics and Computer Science

The Linear Discriminant Analysis (LDA) technique is an important and well-developed area of classification, and to date many linear (and also nonlinear) discrimination methods have been put forward. A complication in applying LDA to real data occurs when the number of features exceeds that of observations. In this case, the covariance estimates do not have full rank, and thus cannot be inverted. There are a number of ways to deal with this problem. In this paper, we propose improving...

On equivalence and bioequivalence testing.

Jordi Ocaña, M. Pilar Sánchez O., Álex Sánchez, Josep Lluís Carrasco (2008)

SORT

Equivalence testing is the natural approach to many statistical problems. First, its main application, bioequivalence testing, is reviewed. The basic concepts of bioequivalence testing (2×2 crossover designs, TOST, interval inclusion principle, etc.) and its problems (TOST biased character, the carryover problem, etc.) are considered. Next, equivalence testing is discussed more generally. Some applications and methods are reviewed and the relation of equivalence testing and distance-based...

Detection of outlying observations using the Akaike information criterion

Andrzej Kornacki (2013)

Biometrical Letters

For the detection of outliers (observations which are seemingly different from the others) the method of testing hypotheses is most often used. This approach, however, depends on the level of significance adopted by the investigator. Moreover, it can lead to the undesirable effect of “masking” of the outliers. This paper presents an alternative method of outlier detection based on the Akaike information criterion. The theory presented is applied to the analysis of the results of beet leaf...