Significance tests to identify regulated proteins based on a large number of small samples

Frank Klawonn

Kybernetika (2012)

Volume: 48, Issue: 3, page 478-493
ISSN: 0023-5954

Abstract

Modern biology is interested in better understanding mechanisms within cells. For this purpose, products of cells such as metabolites, peptides, proteins or mRNA are measured and compared under different conditions, for instance healthy cells vs. infected cells. Such experiments usually yield regulation or expression values (the abundance or absence of a cell product in one condition compared to another) for a large number of cell products, but with only a few replicates. In order to distinguish random fluctuations and noise from true regulation, suitable significance tests are needed. Here we propose a simple model based on the assumption that the regulation factors follow normal distributions with different expected values but the same standard deviation. Before suitable significance tests can be derived from this model, a reliable estimate of the standard deviation in the context of many small samples is needed. We therefore also include a discussion of the properties of the sample MAD (Median Absolute Deviation from the median) and the sample standard deviation for small sample sizes.
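
The setting described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the paper's procedure: it draws many small normal samples that share one standard deviation but have different expected values, and averages two per-sample scale estimates, the sample standard deviation and the sample MAD. The function names, the sample size of 3, the range of expected values, and the use of the asymptotic normal consistency factor 1.4826 for the MAD are all assumptions made for illustration.

```python
import random
import statistics

# Asymptotic factor that makes the MAD consistent for sigma under normality.
# Using it for tiny samples is an assumption of this sketch.
MAD_TO_SIGMA = 1.4826


def sample_mad(values):
    """Median absolute deviation from the median of one sample."""
    med = statistics.median(values)
    return statistics.median([abs(v - med) for v in values])


def simulate(n_samples=20000, sample_size=3, sigma=1.0, seed=0):
    """Average both scale estimates over many small normal samples.

    Each simulated sample gets its own expected value (hypothetical range
    -2..2) but all samples share the same standard deviation sigma.
    Returns (mean sample standard deviation, mean scaled sample MAD).
    """
    rng = random.Random(seed)
    sds, mads = [], []
    for _ in range(n_samples):
        mu = rng.uniform(-2.0, 2.0)
        xs = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        sds.append(statistics.stdev(xs))
        mads.append(MAD_TO_SIGMA * sample_mad(xs))
    return statistics.mean(sds), statistics.mean(mads)


if __name__ == "__main__":
    mean_sd, mean_mad = simulate()
    print(f"true sigma: 1.0, mean sample SD: {mean_sd:.3f}, "
          f"mean scaled MAD: {mean_mad:.3f}")
```

For normal data with three replicates, the expected value of the sample standard deviation is about 0.886 times the true standard deviation, so an average of per-sample estimates is biased low. This is one reason why the behaviour of such estimators at small sample sizes needs separate attention.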

How to cite

Klawonn, Frank. "Significance tests to identify regulated proteins based on a large number of small samples." Kybernetika 48.3 (2012): 478-493. <http://eudml.org/doc/246400>.

@article{Klawonn2012,
author = {Klawonn, Frank},
journal = {Kybernetika},
keywords = {MAD; standard deviation; small samples; significance test},
language = {eng},
number = {3},
pages = {478-493},
publisher = {Institute of Information Theory and Automation AS CR},
title = {Significance tests to identify regulated proteins based on a large number of small samples},
url = {http://eudml.org/doc/246400},
volume = {48},
year = {2012},
}

TY - JOUR
AU - Klawonn, Frank
TI - Significance tests to identify regulated proteins based on a large number of small samples
JO - Kybernetika
PY - 2012
PB - Institute of Information Theory and Automation AS CR
VL - 48
IS - 3
SP - 478
EP - 493
LA - eng
KW - MAD; standard deviation; small samples; significance test
UR - http://eudml.org/doc/246400
ER -
