Displaying 401 – 420 of 1021

Graphical display in outlier diagnostics; adequacy and robustness.

Nethal K. Jajo (2005)

SORT

Graphical outlier-robust diagnostics using Robustly Studentized Robust Residuals (RSRR) and Partial Robustly Studentized Robust Residuals (PRSRR) are established. One problem with some robust residual plots is that the residuals retain information from certain predicted values (Velilla, 1998). The RSRR and PRSRR techniques are unaffected by this complication and, as a result, provide more interpretable results.

Hazard rate model and statistical analysis of a compound point process

Petr Volf (2005)

Kybernetika

A stochastic process cumulating random increments at random moments is studied. We model it as a two-dimensional random point process and study advantages of such an approach. First, a rather general model allowing for the dependence of both components mutually as well as on covariates is formulated, then the case where the increments depend on time is analyzed with the aid of the multiplicative hazard regression model. Special attention is devoted to the problem of prediction of process behaviour....

Heavy tailed durations of regional rainfall

Harry Pavlopoulos, Jan Picek, Jana Jurečková (2008)

Applications of Mathematics

Durations of rain events and drought events over a given region provide important information about the water resources of the region. Of particular interest is the shape of upper tails of the probability distributions of such durations. Recent research suggests that the underlying probability distributions of such durations have heavy tails of hyperbolic type, across a wide range of spatial scales from 2 km to 120 km. These findings are based on radar measurements of spatially averaged rain rate...

High-dimensional gaussian model selection on a gaussian design

Nicolas Verzelen (2010)

Annales de l'I.H.P. Probabilités et statistiques

We consider the problem of estimating the conditional mean of a real gaussian variable $Y=\sum_{i=1}^{p}\theta_i X_i+\varepsilon$ where the vector of the covariates $(X_i)_{1\le i\le p}$ follows a joint gaussian distribution. This issue often occurs when one aims at estimating the graph or the distribution of a gaussian graphical model. We introduce a general model selection procedure which is based on the minimization of a penalized least squares type criterion. It handles a variety of problems such as ordered and complete variable selection,...

Histogram selection in non Gaussian regression

Marie Sauvé (2009)

ESAIM: Probability and Statistics

We deal with the problem of choosing a piecewise constant estimator of a regression function s mapping 𝒳 into ℝ. We consider a non Gaussian regression framework with deterministic design points, and we adopt the non asymptotic approach of model selection via penalization developed by Birgé and Massart. Given a collection of partitions of 𝒳, with possibly exponential complexity, and the corresponding collection of piecewise constant estimators, we propose a penalized least squares criterion which...

How many bins should be put in a regular histogram

Lucien Birgé, Yves Rozenholc (2006)

ESAIM: Probability and Statistics

Given an n-sample from some unknown density f on [0,1], it is easy to construct a histogram of the data based on some given partition of [0,1], but not so much is known about an optimal choice of the partition, especially when the data set is not large, even if one restricts to partitions into intervals of equal length. Existing methods are either rules of thumb or based on asymptotic considerations and often involve some smoothness properties of f. Our purpose in this paper is to give an automatic,...

How powerful are data driven score tests for uniformity

Tadeusz Inglot, Alicja Janic (2009)

Applicationes Mathematicae

We construct a new class of data driven tests for uniformity, which have greater average power than existing ones for finite samples. Using a simulation study, we show that these tests as well as some "optimal maximum test" attain an average power close to the optimal Bayes test. Finally, we prove that, in the middle range of the power function, the loss in average power of the "optimal maximum test" with respect to the Neyman-Pearson tests, constructed separately for each alternative, in the Gaussian...

How the result of graph clustering methods depends on the construction of the graph

Markus Maier, Ulrike von Luxburg, Matthias Hein (2013)

ESAIM: Probability and Statistics

We study the scenario of graph-based clustering algorithms such as spectral clustering. Given a set of data points, one first has to construct a graph on the data points and then apply a graph clustering algorithm to find a suitable partition of the graph. Our main question is if and how the construction of the graph (choice of the graph, choice of parameters, choice of weights) influences the outcome of the final clustering result. To this end we study the convergence of cluster quality measures...

How to get Central Limit Theorems for global errors of estimates

Alain Berlinet (1999)

Applications of Mathematics

The asymptotic behavior of global errors of functional estimates plays a key role in hypothesis testing and confidence interval building. Whereas for pointwise errors asymptotic normality often easily follows from standard Central Limit Theorems, global errors asymptotics involve some additional techniques such as strong approximation, martingale theory and Poissonization. We review these techniques in the framework of density estimation from independent identically distributed random variables,...

Improving feature selection process resistance to failures caused by curse-of-dimensionality effects

Petr Somol, Jiří Grim, Jana Novovičová, Pavel Pudil (2011)

Kybernetika

The purpose of feature selection in machine learning is at least two-fold - saving measurement acquisition costs and reducing the negative effects of the curse of dimensionality with the aim to improve the accuracy of the models and the classification rate of classifiers with respect to previously unknown data. Yet it has been shown recently that the process of feature selection itself can be negatively affected by the very same curse of dimensionality - feature selection methods may easily over-fit...

Inferring the residual waiting time for binary stationary time series

Gusztáv Morvai, Benjamin Weiss (2014)

Kybernetika

For a binary stationary time series define $\sigma_n$ to be the number of consecutive ones up to the first zero encountered after time $n$, and consider the problem of estimating the conditional distribution and conditional expectation of $\sigma_n$ after one has observed the first $n$ outputs. We present a sequence of stopping times and universal estimators for these quantities which are pointwise consistent for all ergodic binary stationary processes. In case the process is a renewal process with zero the renewal state...

Intermittent estimation for finite alphabet finitarily Markovian processes with exponential tails

Gusztáv Morvai, Benjamin Weiss (2021)

Kybernetika

We give some estimation schemes for the conditional distribution and conditional expectation of the next output following the observation of the first n outputs of a stationary process where the random variables may take finitely many possible values. Our schemes are universal in the class of finitarily Markovian processes that have an exponential rate for the tail of the look back time distribution. In addition, explicit rates are given. A necessary restriction is that the scheme proposes an...
