Random projections and Hotelling's T² statistics for change detection in high-dimensional data streams

Ewa Skubalska-Rafajłowicz

International Journal of Applied Mathematics and Computer Science (2013)

  • Volume: 23, Issue: 2, page 447-461
  • ISSN: 1641-876X

Abstract

The method of change (or anomaly) detection in high-dimensional discrete-time processes using a multivariate Hotelling chart is presented. We use normal random projections as a method of dimensionality reduction. We indicate diagnostic properties of the Hotelling control chart applied to data projected onto a random subspace of R^n. We examine the random projection method using artificial noisy image sequences as examples.
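
The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the N(0, 1) entry scaling, the synthetic Gaussian data, the dimensions, and the helper names (`normal_projection_matrix`, `fit_in_control`, `hotelling_t2`) are all assumptions made for illustration. Observations in R^n are projected to k dimensions by a random normal matrix, and Hotelling's T² is computed in the projected space against in-control estimates.

```python
import numpy as np


def normal_projection_matrix(k, n, rng):
    """Return a k x n matrix of i.i.d. N(0, 1) entries (assumed scaling)."""
    return rng.standard_normal((k, n))


def fit_in_control(projected):
    """Estimate mean and inverse covariance from projected in-control data."""
    mean = projected.mean(axis=0)
    cov = np.cov(projected, rowvar=False)  # k x k sample covariance
    return mean, np.linalg.inv(cov)


def hotelling_t2(v, mean, cov_inv):
    """Hotelling's T^2 statistic of one projected observation v."""
    d = v - mean
    return float(d @ cov_inv @ d)


rng = np.random.default_rng(0)
n, k, m = 1000, 5, 200            # ambient dim, projected dim, reference size
S = normal_projection_matrix(k, n, rng)

reference = rng.standard_normal((m, n))        # synthetic in-control data
mean, cov_inv = fit_in_control(reference @ S.T)

x_ok = rng.standard_normal(n)                  # new in-control observation
x_shifted = rng.standard_normal(n) + 3.0       # mean shift in every coordinate

t2_ok = hotelling_t2(S @ x_ok, mean, cov_inv)
t2_shifted = hotelling_t2(S @ x_shifted, mean, cov_inv)
print(t2_ok, t2_shifted)
```

In a control chart, an observation would be flagged when its T² value exceeds a control limit, commonly taken from an F or chi-square quantile; the sketch leaves that step out. Only the k x k covariance has to be estimated and inverted, which is what makes the chart usable when n is much larger than the number of reference observations.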

How to cite


Ewa Skubalska-Rafajłowicz. "Random projections and Hotelling's T² statistics for change detection in high-dimensional data streams." International Journal of Applied Mathematics and Computer Science 23.2 (2013): 447-461. <http://eudml.org/doc/257111>.

@article{EwaSkubalska2013,
abstract = {The method of change (or anomaly) detection in high-dimensional discrete-time processes using a multivariate Hotelling chart is presented. We use normal random projections as a method of dimensionality reduction. We indicate diagnostic properties of the Hotelling control chart applied to data projected onto a random subspace of R^n. We examine the random projection method using artificial noisy image sequences as examples.},
author = {Ewa Skubalska-Rafajłowicz},
journal = {International Journal of Applied Mathematics and Computer Science},
keywords = {change detection; multidimensional control charts; dimensionality reduction; random projections; process monitoring},
language = {eng},
number = {2},
pages = {447-461},
title = {Random projections and Hotelling's T² statistics for change detection in high-dimensional data streams},
url = {http://eudml.org/doc/257111},
volume = {23},
year = {2013},
}

TY - JOUR
AU - Ewa Skubalska-Rafajłowicz
TI - Random projections and Hotelling's T² statistics for change detection in high-dimensional data streams
JO - International Journal of Applied Mathematics and Computer Science
PY - 2013
VL - 23
IS - 2
SP - 447
EP - 461
AB - The method of change (or anomaly) detection in high-dimensional discrete-time processes using a multivariate Hotelling chart is presented. We use normal random projections as a method of dimensionality reduction. We indicate diagnostic properties of the Hotelling control chart applied to data projected onto a random subspace of R^n. We examine the random projection method using artificial noisy image sequences as examples.
LA - eng
KW - change detection; multidimensional control charts; dimensionality reduction; random projections; process monitoring
UR - http://eudml.org/doc/257111
ER -

References

  1. Achlioptas, D. (2001). Database friendly random projections, Proceedings of the 20th ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems, Santa Barbara, CA, USA, pp. 274-281.
  2. Ailon, N. and Chazelle, B. (2006). Approximate nearest neighbors and the fast Johnson-Lindenstrauss transform, Proceedings of the 38th Annual ACM Symposium on Theory of Computing, Seattle, WA, USA, pp. 557-563. Zbl1301.68232
  3. Arriaga, R. and Vempala, S. (1999). An algorithmic theory of learning: Robust concepts and random projection, Proceedings of the 40th Annual IEEE Symposium on the Foundations of Computer Science, New York, NY, USA, pp. 616-623. Zbl1095.68092
  4. Biau, G., Devroye, L. and Lugosi, G. (2008). On the performance of clustering in Hilbert spaces, IEEE Transactions on Information Theory 54(2): 781-790. Zbl1304.62088
  5. Bodnar, O. and Schmid, W. (2005). Multivariate control charts based on a projection approach, Allgemeines Statistisches Archiv 89(1): 75-93. Zbl05244505
  6. Chandola, V., Banerjee, A. and Kumar, V. (2009). Anomaly detection: A survey, ACM Computing Surveys 41(3): 15:1-15:58. 
  7. Cramer, H. and Wold, H. (1936). Some theorems on distribution functions, Journal of the London Mathematical Society 11(2): 290-295. Zbl62.0596.04
  8. Cuesta-Albertos, J.A., del Barrio, E., Fraiman, R. and Matran, C. (2007). The random projection method in goodness of fit for functional data, Computational Statistics and Data Analysis 51(10): 4814-4831. Zbl1162.62363
  9. Cuturi, M., Vert, J-P. and d'Aspremont, A. (2009). White functionals for anomaly detection in dynamical systems, in Y. Bengio, D. Schuurmans, J. Lafferty, C.K.I. Williams and A. Culotta (Eds.), Advances in Neural Information Processing Systems, Vol. 22, MIT Press, Vancouver, pp. 432-440.
  10. Dasgupta, S. and Gupta, A. (2003). An elementary proof of a theorem of Johnson and Lindenstrauss, Random Structures and Algorithms 22(1): 60-65. Zbl1018.51010
  11. Donoho, D.L. (2000). High-dimensional data analysis: The curses and blessings of dimensionality, Technical report, Department of Statistics, Stanford University, Stanford, CA.
  12. Frankl, P. and Maehara, H. (1987). The Johnson-Lindenstrauss lemma and the sphericity of some graphs, Journal of Combinatorial Theory A 44(3): 355-362. Zbl0675.05049
  13. Forbes, C., Evans, M., Hastings, N. and Peacock, B. (2011). Statistical Distributions, 4th Edn., John Wiley and Sons, Inc., Hoboken, NJ. Zbl1258.62012
  14. Hyvärinen, A., Karhunen, J. and Oja, E. (2001). Independent Component Analysis, Wiley, New York, NY. 
  15. Hotelling, H. (1931). The generalization of Student's ratio, The Annals of Mathematical Statistics 2(3): 360-378. Zbl0004.26503
  16. Indyk, P. and Motwani, R. (1998). Approximate nearest neighbors: Towards removing the curse of dimensionality, Proceedings of the 30th Annual ACM Symposium on the Theory of Computing, Dallas, TX, USA, pp. 604-613. Zbl1029.68541
  17. Indyk, P. and Naor, A. (2007). Nearest neighbor preserving embeddings, ACM Transactions on Algorithms 3(3): 31:1-31:12. Zbl1192.68748
  18. Jolliffe, I.T. (1986). Principal Component Analysis, Springer-Verlag, New York, NY. Zbl1011.62064
  19. Johnson, W.B. and Lindenstrauss, J. (1984). Extensions of Lipschitz mapping into Hilbert space, Contemporary Mathematics 26: 189-206. Zbl0539.46017
  20. Korbicz, J., Kościelny, J.M., Kowalczuk, Z. and Cholewa, W. (Eds.) (2004). Fault Diagnosis. Models, Artificial Intelligence, Applications. Springer Verlag, Berlin/Heidelberg/New York, NY. Zbl1074.93004
  21. Lee, J.A. and Verleysen, M. (2007). Nonlinear Dimensionality Reduction, Springer, New York, NY. Zbl1128.68024
  22. Li, P., Hastie, T.J. and Church, K.W. (2006a). Nonlinear estimators and tail bounds for dimension reduction in L1 using Cauchy random projections, Technical report, Department of Statistics, Stanford University, Stanford, CA. Zbl1203.68160
  23. Li, P., Hastie, T.J. and Church, K.W. (2006b). Sub-Gaussian random projections, Technical report, Department of Statistics, Stanford University, Stanford, CA. 
  24. Mason, R.L., Tracy, N.D. and Young, J.C. (1992). Multivariate control charts for individual observations, Journal of Quality Technology 24(2): 88-95.
  25. Mason, R.L. and Young, J.C. (2002). Multivariate Statistical Process Control with Industrial Application, SIAM, Philadelphia, PA. Zbl0989.62075
  26. Mathai, A.M. and Provost, S.B. (1992). Quadratic Forms in Random Variables: Theory and Applications, Marcel Dekker, New York, NY. Zbl0792.62045
  27. Matoušek, J. (2008). On variants of the Johnson-Lindenstrauss lemma, Random Structures and Algorithms 33(2): 142-156. Zbl1154.51002
  28. Milman, V. (1971). A new proof of the theorem of A. Dvoretzky on sections of convex bodies, Functional Analysis and Its Applications 5(4): 28-37, (English translation).
  29. Montgomery, D.C. (1996). Introduction to Statistical Quality Control, 3rd Edn., John Wiley and Sons, New York, NY. Zbl0997.62503
  30. Qin, S.J. (2003). Statistical process monitoring: Basics and beyond, Journal of Chemometrics 17(8-9): 480-502.
  31. Rao, C.R. (1973). Linear Statistical Inference and Its Applications, John Wiley and Sons, New York, NY/London/Sydney/Toronto. Zbl0256.62002
  32. Runger, G.C. (1996). Projections and the U-squared multivariate control chart, Journal of Quality Technology 28(3): 313-319. 
  33. Runger, G., Barton, R., Del Castillo, E. and Woodall, W.H. (2007). Optimal monitoring of multivariate data for fault patterns, Journal of Quality Technology 39(2): 159-172. 
  34. Skubalska-Rafajłowicz, E. (2006). RBF neural network for probability density function estimation and detecting changes in multivariate processes, in L. Rutkowski, R. Tadeusiewicz, L.A. Zadeh and J. Żurada (Eds.), Artificial Intelligence and Soft Computing, Lecture Notes in Computer Science, Vol. 4029, Springer-Verlag, Berlin/Heidelberg, pp. 133-141. 
  35. Skubalska-Rafajłowicz, E. (2008). Random projection RBF nets for multidimensional density estimation, International Journal of Applied Mathematics and Computer Science 18(4): 455-464, DOI: 10.2478/v10006-008-0040-9. Zbl1155.93428
  36. Skubalska-Rafajłowicz, E. (2009). Neural networks with sigmoidal activation functions dimension reduction using normal random projection, Nonlinear Analysis 71(12): e1255-e1263. 
  37. Skubalska-Rafajłowicz, E. (2011). Fast and efficient method of change detection in statistically monitored high-dimensional data streams, Proceedings of the 10th International Science and Technology Conference on Diagnostics of Processes and Systems, Zamość, Poland, pp. 256-260. 
  38. Srivastava, M.S. (2009). A review of multivariate theory for high dimensional data with fewer observations, in A. SenGupta (Ed.), Advances in Multivariate Statistical Methods, Vol. 9, World Scientific, Singapore, pp. 25-52. 
  39. Sullivan, J.H. and Woodall, W.H. (2000). Change-point detection of mean vector or covariance matrix shifts using multivariate individual observations, IIE Transactions 32(6): 537-549.
  40. Tsung, F. and Wang, K. (2010). Adaptive charting techniques: Literature review and extensions, in H.-J. Lenz, P.-T. Wilrich and W. Schmid (Eds.), Frontiers in Statistical Quality Control, Vol. 9, Springer-Verlag, Berlin/Heidelberg, pp. 19-35.
  41. Vempala, S. (2004). The Random Projection Method, American Mathematical Society, Providence, RI. Zbl1058.68063
  42. Wang, K. and Jiang, W. (2009). High-dimensional process monitoring and fault isolation via variable selection, Journal of Quality Technology 41(3): 247-258. 
  43. Wang, J. (2012). Geometric Structure of High-Dimensional Data and Dimensionality Reduction, Higher Education Press, Beijing/Springer-Verlag, Berlin/Heidelberg. Zbl1250.68010
  44. Wold, H. (1966). Estimation of principal components and related models by iterative least squares, in P. Krishnaiah (Ed.), Multivariate Analysis, Academic Press, New York, NY, pp. 391-420.
  45. Zorriassatine, F., Tannock, J.D.T. and O'Brien, C. (2003). Using novelty detection to identify abnormalities caused by mean shifts in bivariate processes, Computers and Industrial Engineering 44(3): 385-408.
