Displaying similar documents to “Optimization of the maximum likelihood estimator for determining the intrinsic dimensionality of high-dimensional data”

Survival analysis on data streams: Analyzing temporal events in dynamically changing environments

Ammar Shaker, Eyke Hüllermeier (2014)

International Journal of Applied Mathematics and Computer Science

In this paper, we introduce a method for survival analysis on data streams. Survival analysis (also known as event history analysis) is an established statistical method for the study of temporal “events” or, more specifically, questions regarding the temporal distribution of the occurrence of events and their dependence on covariates of the data sources. To make this method applicable in the setting of data streams, we propose an adaptive variant of a model that is closely related to...

A Taxonomy of Big Data for Optimal Predictive Machine Learning and Data Mining

Fokoue, Ernest (2014)

Serdica Journal of Computing

Big data comes in various ways, types, shapes, forms and sizes. Indeed, almost all areas of science, technology, medicine, public health, economics, business, linguistics and social science are bombarded by ever increasing flows of data begging to be analyzed efficiently and effectively. In this paper, we propose a rough idea of a possible taxonomy of big data, along with some of the most commonly used tools for handling each particular category of bigness. The dimensionality p of...

A complete gradient clustering algorithm formed with kernel estimators

Piotr Kulczycki, Małgorzata Charytanowicz (2010)

International Journal of Applied Mathematics and Computer Science

The aim of this paper is to provide a gradient clustering algorithm in its complete form, suitable for direct use without requiring a deeper statistical knowledge. The values of all parameters are effectively calculated using optimizing procedures. Moreover, an illustrative analysis of the meaning of particular parameters is shown, followed by the effects resulting from possible modifications with respect to their primarily assigned optimal values. The proposed algorithm does not demand...

Survival analysis with coarsely observed covariates.

Soren Feodor Nielsen (2003)

SORT

In this paper we consider analysis of survival data with incomplete covariate information. We model the incomplete covariates as a random coarsening of the complete covariate, and an overview of the theory of coarsening at random is given. Various ways of estimating the parameters of the model for the survival data given the covariates are discussed and compared.

Protecting micro-data by micro-aggregation: the experience in Eurostat.

Daniel Defays (1997)

Qüestiió

A natural strategy to protect the confidentiality of individual data is to aggregate them at the lowest possible level. Some studies carried out at Eurostat on this topic will be presented: properties of classifications in clusters of fixed sizes, micro-aggregation as a generic method to protect the confidentiality of individual data, application to the Community Innovation Survey. The work performed at Eurostat will be put in line with other projects conducted at European level on the...

Detecting a data set structure through the use of nonlinear projections search and optimization

Victor L. Brailovsky, Michael Har-Even (1998)

Kybernetika

Detecting a cluster structure is considered. This means solving either the problem of discovering a natural decomposition of data points into groups (clusters) or the problem of detecting clouds of data points of a specific form. In this paper both these problems are considered. To discover a cluster structure of a specific arrangement or a cloud of data of a specific form a class of nonlinear projections is introduced. Fitness functions that estimate to what extent a given subset of...

Ridge estimation of covariance matrix from data in two classes

Yi Zhou, Bin Zhang (2024)

Applications of Mathematics

This paper deals with the problem of estimating a covariance matrix from data in two classes: (1) good data with the covariance matrix of interest and (2) contamination coming from a Gaussian distribution with a different covariance matrix. The ridge penalty is introduced to address the challenges of high dimensionality in estimating the covariance matrix from the two-class data model. A ridge estimator of the covariance matrix has a uniform expression and keeps positive-definite,...

On quantile optimization problem based on information from censored data

Petr Volf (2018)

Kybernetika

A stochastic optimization problem is, as a rule, formulated in terms of an expected cost function. However, a criterion based on averaging does not take into account the possible variability of the random variables involved. That is why the criterion considered in the present contribution uses selected quantiles. Moreover, it is assumed that the stochastic characteristics of the optimized system are estimated from the data, in a non-parametric setting, and that the data may be randomly right-censored....