A new method for identifying outlying subsets of data

Marta Zalewska; Antoni Grzanka; Wojciech Niemiro; Bolesław Samoliński

Displaying similar documents to “A new method for identifying outlying subsets of data”

Nonparametric statistical analysis for multiple comparison of machine learning regression algorithms

Bogdan Trawiński, Magdalena Smętek, Zbigniew Telec, Tadeusz Lasota (2012)

International Journal of Applied Mathematics and Computer Science

In this paper we present some guidelines for the application of nonparametric statistical tests and post-hoc procedures devised to perform multiple comparisons of machine learning algorithms. We emphasize that it is necessary to distinguish between pairwise and multiple comparison tests. We show that the pairwise Wilcoxon test, when applied to multiple comparisons, leads to overoptimistic conclusions. We carry out an intensive normality examination employing ten different tests showing...

Detecting atypical data in air pollution studies by using shorth intervals for regression

Cécile Durot, Karelle Thiébot (2005)

ESAIM: Probability and Statistics

To validate pollution data, subject-matter experts at Airpl (an organization that maintains a network of air pollution monitoring stations in western France) perform daily visual examinations of the data and check their consistency. In this paper, we describe these visual examinations and propose a formalization of this problem. The examinations consist of comparisons of so-called shorth intervals, so we build a statistical test that compares such intervals in a nonparametric regression...

Linear discriminant analysis with a generalization of the Moore-Penrose pseudoinverse

Tomasz Górecki, Maciej Łuczak (2013)

International Journal of Applied Mathematics and Computer Science

The Linear Discriminant Analysis (LDA) technique is an important and well-developed area of classification, and to date many linear (and also nonlinear) discrimination methods have been put forward. A complication in applying LDA to real data occurs when the number of features exceeds that of observations. In this case, the covariance estimates do not have full rank, and thus cannot be inverted. There are a number of ways to deal with this problem. In this paper, we propose improving...

Outlier detection as a method for knowledge extraction from digital resources

Eugenia Stoimenova, Plamen Mateev, Milena Dobreva (2006)

Review of the National Center for Digitization

Data mining methods for gene selection on the basis of gene expression arrays

Michał Muszyński, Stanisław Osowski (2014)

International Journal of Applied Mathematics and Computer Science

On equivalence and bioequivalence testing.

Jordi Ocaña, M. Pilar Sánchez O., Álex Sánchez, Josep Lluís Carrasco (2008)

SORT

Equivalence testing is the natural approach to many statistical problems. First, its main application, bioequivalence testing, is reviewed. The basic concepts of bioequivalence testing (2×2 crossover designs, TOST, interval inclusion principle, etc.) and its problems (the biased character of TOST, the carryover problem, etc.) are considered. Next, equivalence testing is discussed more generally. Some applications and methods are reviewed and the relation of equivalence testing and distance-based...

Components of the Pearson-Fisher chi-squared statistic.

Rayner, G. D. (2002)

Journal of Applied Mathematics and Decision Sciences

Structural breaks in dependent, heteroscedastic, and extremal panel data

Matúš Maciak, Barbora Peštová, Michal Pešta (2018)

Kybernetika

New statistical procedures for a change-in-means problem within a very general panel data structure are proposed. Unlike classical inference tools used for the changepoint problem in the panel data framework, we allow for mutually dependent panels, unequal variances across the panels, and possibly an extremely short follow-up period. Two competitive ratio-type test statistics are introduced and their asymptotic properties are derived for a large number of available panels. The proposed...

Detection of outlying observations using the Akaike information criterion

Andrzej Kornacki (2013)

Biometrical Letters

For the detection of outliers (observations which are seemingly different from the others) the method of testing hypotheses is most often used. This approach, however, depends on the level of significance adopted by the investigator. Moreover, it can lead to the undesirable effect of “masking” of the outliers. This paper presents an alternative method of outlier detection based on the Akaike information criterion. The theory presented is applied to analysis of the results of beet leaf...

One sample tests for the location of modes of nonnormal data.

Carolan, Anthony M., Rayner, J.C.W. (2001)

Journal of Applied Mathematics and Decision Sciences
