# Latent Semantic Indexing using eigenvalue analysis for efficient information retrieval

Cherukuri Kumar; Suripeddi Srinivas

International Journal of Applied Mathematics and Computer Science (2006)

- Volume: 16, Issue: 4, page 551-558
- ISSN: 1641-876X

## Access Full Article

top## Abstract

top## How to cite

topKumar, Cherukuri, and Srinivas, Suripeddi. "Latent Semantic Indexing using eigenvalue analysis for efficient information retrieval." International Journal of Applied Mathematics and Computer Science 16.4 (2006): 551-558. <http://eudml.org/doc/207813>.

@article{Kumar2006,

abstract = {Text retrieval using Latent Semantic Indexing (LSI) with truncated Singular Value Decomposition (SVD) has been intensively studied in recent years. However, the expensive complexity involved in computing truncated SVD constitutes a major drawback of the LSI method. In this paper, we demonstrate how matrix rank approximation can influence the effectiveness of information retrieval systems. Besides, we present an implementation of the LSI method based on an eigenvalue analysis for rank approximation without computing truncated SVD, along with its computational details. Significant improvements in computational time while maintaining retrieval accuracy are observed over the tested document collections.},

author = {Kumar, Cherukuri, Srinivas, Suripeddi},

journal = {International Journal of Applied Mathematics and Computer Science},

keywords = {eigenvalues; latent semantic indexing; vector space method; rank reduction; information retrieval; singular value decomposition},

language = {eng},

number = {4},

pages = {551-558},

title = {Latent Semantic Indexing using eigenvalue analysis for efficient information retrieval},

url = {http://eudml.org/doc/207813},

volume = {16},

year = {2006},

}

TY - JOUR

AU - Kumar, Cherukuri

AU - Srinivas, Suripeddi

TI - Latent Semantic Indexing using eigenvalue analysis for efficient information retrieval

JO - International Journal of Applied Mathematics and Computer Science

PY - 2006

VL - 16

IS - 4

SP - 551

EP - 558

AB - Text retrieval using Latent Semantic Indexing (LSI) with truncated Singular Value Decomposition (SVD) has been intensively studied in recent years. However, the expensive complexity involved in computing truncated SVD constitutes a major drawback of the LSI method. In this paper, we demonstrate how matrix rank approximation can influence the effectiveness of information retrieval systems. Besides, we present an implementation of the LSI method based on an eigenvalue analysis for rank approximation without computing truncated SVD, along with its computational details. Significant improvements in computational time while maintaining retrieval accuracy are observed over the tested document collections.

LA - eng

KW - eigenvalues; latent semantic indexing; vector space method; rank reduction; information retrieval; singular value decomposition

UR - http://eudml.org/doc/207813

ER -

## References

top- April K. and Pottenger W.M. (2006): A framework for understanding latent semantic indexing performance. - J. Inf. Process. Manag., Vol. 42, No. 1, pp. 56-73.
- Aswani Kumar Ch., Gupta A., Batool M. and Trehan S. (2005): An information retrieval model based on latent semantic indexing with intelligent preprocessing. - J. Inf. Knowl. Manag., Vol. 4, No. 4, pp. 1-7.
- Aswani Kumar Ch. and Srinivas S. (2006): On the effect of rank approximation on information retrieval. - Proc. Int. Conf. s Systemics Cybernetics and Informatics, Hyderabad, India, pp. 876-880.
- Balinski J. and Danilowicz C. (2005): Ranking method based on inter document distances. - Inf. Process. Manag., Vol. 41, No. 4,pp. 759-775. Zbl1080.68583
- Bass D. and Behrens C. (2003): Distributed LSI: Scalable concept based information retrieval with high semantic resolution. - Proc. 2003 Text Mining Workshop, San Francisco, CA, USA, pp. 72-82.
- Bast H. and Weber I. (2005): Insights from viewing ranked retrieval as rank aggregation. - Proc. Workshop s Challenges in Web Information Retrieval and Integration, WIRI05, Tokyo, Japan, pp. 232-239.
- Berry M.W. (1992): Large scale singular value computations. - Int. J. Super Comput. Appli., Vol. 6, No. 1, pp. 13-49.
- Berry M.W. and Dumasis S.T. (1995): Using linear algebra for intelligent information retrieval. - SIAM Rev., Vol. 37, No. 4, pp. 573-995.
- Berry M.W., Drmac Z. and Jessup E.R. (1999): Matrices, vector spaces, and information retrieval. - SIAM Rev., Vol. 41, No. 2, pp. 335-362. Zbl0924.68069
- Berry M.W. and Shakhina A.P. (2005): Computing sparse reduced-rank approximation to sparse matrices. - ACM Trans. Math. Software, 2005, Vol. 31, No. 2, pp. 252-269. Zbl1070.65539
- Cheng X.Z. and Lafferty J.(2006): A risk minimization framework for information retrieval. - Inf. Process. Manag., Vol. 42, No. 1, pp. 31-55. Zbl1101.68544
- Deerwester S. (1990): Indexing by latent semantic analysis. - J. Ameri. Soci. Inf. Sci., Vol. 41, No. 6, pp. 391-407.
- Fan J., Ravi K., Littman M.L. and Santosh V. (1999): Efficient singular value decomposition via document samplings. - Tech. Rep. CS-1999-5, Dept. Computer Science, Duke University, North Carolina.
- Gao J. and Zhang J. (2003): Sparsification strategies in latent semantic indexing. - Proc. 2003 Mining Workshop, San Francisco, CA, USA, pp. 93-103.
- Gao J. and Zhang J. (2005): Clustered SVD strategies in latent semantic indexing. - Inf. Process. Manag., Vol. 41, No. 5, pp. 1051-1063. Zbl1101.68538
- Golub G.H. and Van Loan C.F. (1996): Matrix Computations. -Baltimore: The John Hopkins University Press.
- Hua Y. (2000): Searching beyond SVD for rank reduction. - Proc. IEEE Workshop s Sensor Array and Multi Channel Signal Processing, Cambridge, MA, USA, pp. 395-397.
- Husbands P., Simon H. and Ding C. (2001): On the use of singular value decomposition for text retrieval. - SIAM Comput. Inf. Retrieval, pp. 145-156. Zbl0995.68045
- Husbands P. and Ding C. (2005): Term norm distribution and its effects on latent semantic indexing. - Inf. Process. Manag., Vol. 41, No. 4, pp. 777-787.
- Kontostathis A. and Pottenger W.M. (2002a): Detecting patterns in the latent semantic indexing term-term matrix. - Proc. Workshop s Foundations of Data Mining and Discovery, IEEE Int. Conf. Data Mining, (ICDM 02), Maeybaski, Japan.
- Kontostathis A. and Pottenger W.M. (2002b): A mathematical view of latent semantic indexing: Tracing term co-occurrences. - Tech. Report, LU-CSE-02-006, Lehigh University.
- Landauer T.K., Foltz P.W. and Laham D. (1998): Introduction to latent semantic analysis. - Discourse Processes, Vol. 25, pp. 259-284.
- Park H. and Elden L. (2003): Matrix rank reduction for data analysis and feature extraction. - Tech. Rep., Dept. Computer Science and Engineering, University of Minnesota.
- Praks P., Dvorsky J., Snasel V. and Cernohorsky J.D. (2003): On SVD free latent semantic indexing for image retrieval for application in a hard real time environment. - Proc. IEEE Int. Conf.Industrial Technology, Maribor, Slovenia, pp. 466-471.
- Schisler M.L. (2004): Latent semantic indexing: using eigenvalues and the singular value decomposition in lexicographical search techniques. - Tech. Rep., University of Missouri.
- Tang J.T. (2003): Application of principle direction divisive partitioning and SVD in information retrieval. - Masters Proj. Rep., Dept. Computer Science, University of Kentucky, Lexington, KY.
- Yates R.B. and Neto B.R. (1999): Modern Information Retrieval. -New Delhi: Pearson Education.
- Ye Y.Q. (2000): Comparing matrix methods in text-based information retrieval. - Tech. Rep., School of Mathematical Sciences, Peking University.
- Zang Z., Zha H. and Simon H. (2002): Low rank approximations with sparse factors: Basic algorithms and error analysis. - SIAM J. Matrix Anal. Appl., Vol. 23, No. 3, pp. 706-727. Zbl1003.65041