Properties of a singular value decomposition based dynamical model of gene expression data

Krzysztof Simek

International Journal of Applied Mathematics and Computer Science (2003)

  • Volume: 13, Issue: 3, page 337-345
  • ISSN: 1641-876X

Abstract

Recently, data on multiple gene expression at sequential time points were analyzed using the Singular Value Decomposition (SVD) as a means to capture dominant trends, called characteristic modes, followed by the fitting of a linear discrete-time dynamical system in which the expression values at a given time point are linear combinations of the values at a previous time point. We attempt to address several aspects of the method. To obtain the model, we formulate a nonlinear optimization problem and show how to solve it numerically using standard MATLAB procedures. We use freely available data to test the approach. We discuss the possible consequences of data regularization, sometimes called "polishing", on the outcome of the analysis, especially when the model is to be used for prediction purposes. Then, we investigate the sensitivity of the method to missing measurements and its ability to reconstruct the missing data. In summary, we point out that approximation of multiple gene expression data preceded by SVD provides some insight into the dynamics, but may also lead to unexpected difficulties, such as overfitting problems.
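
The two-step procedure outlined in the abstract (SVD to extract characteristic modes, then a linear discrete-time model on those modes) can be illustrated with a minimal Python/NumPy sketch. It is not the paper's MATLAB procedure: it assumes equally spaced time points and a complete data matrix, and it replaces the nonlinear optimization problem with an ordinary least-squares fit. The function names `characteristic_modes`, `fit_transition_matrix` and `simulate` are hypothetical, and the data in the usage part are synthetic.

```python
import numpy as np


def characteristic_modes(expr, r):
    """Return the r dominant temporal modes of a (genes x time) matrix.

    Each row is a right singular vector scaled by its singular value,
    i.e. one characteristic mode sampled at every time point.
    """
    _, s, vt = np.linalg.svd(expr, full_matrices=False)
    return s[:r, None] * vt[:r, :]


def fit_transition_matrix(modes):
    """Least-squares estimate of M in x(t+1) ~ M x(t).

    Assumption: samples are equally spaced, so one matrix maps every
    time point to the next one.
    """
    x_prev, x_next = modes[:, :-1], modes[:, 1:]
    # lstsq solves x_prev.T @ M.T ~ x_next.T in the least-squares sense.
    m_t, *_ = np.linalg.lstsq(x_prev.T, x_next.T, rcond=None)
    return m_t.T


def simulate(m, x0, steps):
    """Iterate x(t+1) = M x(t); returns an array of shape (len(x0), steps + 1)."""
    out = [np.asarray(x0, dtype=float)]
    for _ in range(steps):
        out.append(m @ out[-1])
    return np.stack(out, axis=1)


if __name__ == "__main__":
    # Synthetic example: 50 genes driven by a damped two-dimensional rotation.
    rng = np.random.default_rng(0)
    theta = 0.5
    m_true = 0.95 * np.array([[np.cos(theta), -np.sin(theta)],
                              [np.sin(theta), np.cos(theta)]])
    latent = simulate(m_true, [1.0, 0.0], 11)         # 2 x 12
    expr = rng.normal(size=(50, 2)) @ latent          # 50 genes x 12 time points

    modes = characteristic_modes(expr, 2)
    m_fit = fit_transition_matrix(modes)
    print("eigenvalue magnitudes of fitted M:", np.abs(np.linalg.eigvals(m_fit)))
```

With noise-free data the fitted matrix is similar to the generating one, so its eigenvalues carry the same decay and oscillation information. With noisy data, or with as many modes as time points allow, the same fit can reproduce the training samples closely while predicting poorly, which is the overfitting risk the abstract mentions.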

How to cite

Simek, Krzysztof. "Properties of a singular value decomposition based dynamical model of gene expression data." International Journal of Applied Mathematics and Computer Science 13.3 (2003): 337-345. <http://eudml.org/doc/207648>.

@article{Simek2003,
author = {Simek, Krzysztof},
journal = {International Journal of Applied Mathematics and Computer Science},
keywords = {multiple gene expression; dynamical model of gene expression data; singular value decomposition},
language = {eng},
number = {3},
pages = {337-345},
title = {Properties of a singular value decomposition based dynamical model of gene expression data},
url = {http://eudml.org/doc/207648},
volume = {13},
year = {2003},
}

TY - JOUR
AU - Simek, Krzysztof
TI - Properties of a singular value decomposition based dynamical model of gene expression data
JO - International Journal of Applied Mathematics and Computer Science
PY - 2003
VL - 13
IS - 3
SP - 337
EP - 345
LA - eng
KW - multiple gene expression; dynamical model of gene expression data; singular value decomposition
UR - http://eudml.org/doc/207648
ER -

References

  1. Alter O., Brown P.O., and Botstein D. (2000): Singular value decomposition for genome-wide expression data processing and modeling. - Proc. Natl. Acad. Sci., Vol. 97, No. 18, pp. 10101-10106. 
  2. Alter O., Brown P.O. and Botstein D. (2001): Processing and modeling genome-wide expression data using singular value decomposition. - Proc. SPIE, Vol. 4266, No. 2, pp. 171-186.
  3. Bellman R. (1960): Introduction to Matrix Analysis. - New York: McGraw-Hill. Zbl0124.01001
  4. Branch M.A. and Grace A. (1996): Matlab Optimization Toolbox. User's Guide. - Natick, MA: MathWorks. 
  5. Everitt B.S. and Dunn G. (2001): Applied Multivariate Data Analysis. - New York: Oxford University Press. Zbl1010.62040
  6. Golub G.H. and van Loan C.F. (1996): Matrix Computations. - Baltimore: Johns Hopkins University Press.
  7. Holter N.S., Mitra M., Maritan A., Cieplak M., Banavar J.R. and Fedoroff N.V. (2000): Fundamental patterns underlying gene expression profiles: Simplicity from complexity. - Proc. Natl. Acad. Sci., Vol. 97, No. 15, pp. 8409-8414. 
  8. Holter N.S., Mitra M., Maritan A., Cieplak M., Fedoroff N.V. and Banavar J.R. (2001): Dynamic modeling of gene expression data. - Proc. Natl. Acad. Sci., Vol. 98, No. 4, pp. 1693-1698.
  9. Jackson J.E. (1991): A User's Guide to Principal Components. - New York: Wiley. Zbl0743.62047
  10. Kim S., Dougherty E.R., Bittner M.L., Chen Y., Krishnamoorthy S., Meltzer P. and Trent J.M. (2001): General nonlinear framework for the analysis of gene interaction via multivariate expression arrays. - J. Biomed. Optics, Vol. 5, No. 4, pp. 411-424. 
  11. Radmacher M.D., Simon R., Desper R., Taetle R., Schaffer A.A. and Nelson M.A. (2001): Graph models of oncogenesis with an application to melanoma. - J. Theor. Biol., Vol. 212, No. 4, pp. 535-548. 
  12. Raychaudhuri S., Stuart J.M. and Altman R. (2000): Principal components analysis to summarize microarray experiments: Application to sporulation time series. - Proc. Pac. Symp. Biocomput'2000, Singapore: World Scientific, pp. 455-466.
  13. Spellman P.T., Sherlock G., Zhang M.Q., Iyer V.R., Anders K., Eisen M.B., Brown P.O., Botstein D. and Futcher B. (1998): Comprehensive identification of cell cycle-regulated genes of the yeast Saccharomyces cerevisiae by microarray hybridization. - Mol. Biol. Cell, Vol. 9, No. 12, pp. 3273-3297.
  14. Velculescu V.E., Zhang L., Vogelstein B. and Kinzler K.W. (1995): Serial analysis of gene expression. - Science, Vol. 270, No. 5235, pp. 484-487. 
  15. Vogelstein B., Fearon E.R., Hamilton S.R., Kern S.E., Preisinger A.C., Leppert M., Nakamura Y., White R., Smits A.M. and Bos J.L. (1988): Genetic alterations during colorectal-tumor development. - N. Engl. J. Med., Vol. 319, No. 9, pp. 525-532. 
  16. Wall M.E., Dyck P.A. and Brettin T.S. (2001): SVDMAN-singular value decomposition analysis of microarray data. - Bioinformatics, Vol. 17, No. 6, pp. 566-568.
  17. Watkins D.S. (1991): Fundamentals of Matrix Computations. - New York: Wiley. Zbl0746.65022
