Large File Operations Support Using Order Preserving Perfect Hashing Functions
Because the already huge database of patent documents keeps growing, classifying, updating and retrieving patent documents has become an acute necessity. We therefore investigate the efficiency of applying Latent Semantic Indexing, an automatic indexing method for information retrieval, to some classes of patent documents from the United States Patent Classification System. We present some experiments that provide the optimal number of dimensions for the Latent Semantic Space and...
Text retrieval using Latent Semantic Indexing (LSI) with truncated Singular Value Decomposition (SVD) has been intensively studied in recent years. However, the high computational cost of the truncated SVD is a major drawback of the LSI method. In this paper, we demonstrate how matrix rank approximation can influence the effectiveness of information retrieval systems. In addition, we present an implementation of the LSI method based on an eigenvalue analysis for rank approximation...
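As a rough illustration of the rank approximation discussed above, the following minimal sketch applies a plain truncated SVD (via NumPy) to a toy term-document matrix and ranks documents against a query in the reduced space. It is not the eigenvalue-based implementation the abstract refers to; the function names (`lsi_fit`, `fold_in_query`, `rank_documents`), the toy matrix and the choice k = 2 are all assumptions made for this example.

```python
import numpy as np

def lsi_fit(term_doc, k):
    """Return the rank-k SVD factors (U_k, s_k, Vt_k) of a term-document matrix."""
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

def fold_in_query(query_vec, U_k, s_k):
    """Project a term-space query vector into the k-dimensional latent space."""
    return (query_vec @ U_k) / s_k

def rank_documents(query_vec, U_k, s_k, Vt_k):
    """Rank documents by cosine similarity to the query in the latent space."""
    q = fold_in_query(query_vec, U_k, s_k)
    docs = Vt_k.T  # one row per document
    norms = np.linalg.norm(docs, axis=1) * np.linalg.norm(q) + 1e-12
    sims = (docs @ q) / norms
    return np.argsort(-sims), sims

# Toy data (assumed): rows are terms, columns are documents.
A = np.array([
    [1, 1, 0, 0],  # "patent"
    [1, 0, 0, 0],  # "claim"
    [0, 1, 0, 0],  # "invention"
    [0, 0, 1, 1],  # "matrix"
    [0, 0, 1, 0],  # "eigenvalue"
], dtype=float)

U_k, s_k, Vt_k = lsi_fit(A, k=2)

# Rank-k reconstruction; its error depends only on the discarded singular values.
A_k = (U_k * s_k) @ Vt_k

query = np.array([1, 0, 0, 0, 0], dtype=float)  # query containing only "patent"
order, sims = rank_documents(query, U_k, s_k, Vt_k)
```

In this toy setup the two documents that share vocabulary with "patent" come first in `order`, including the one reached only through the latent dimension rather than through exact term overlap. Varying `k` changes both the reconstruction error of `A_k` and the ranking, which is the kind of effect the abstract attributes to rank approximation.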
For a specific query, merging the results returned by multiple search engines, in the form of a metasearch aggregation, can significantly improve the quality of the relevant documents retrieved. This paper suggests a minimax linear programming (LP) formulation for fusing the results of multiple search engines. The paper also proposes a weighting method that incorporates the importance weights of the underlying search engines. This is a two-phase approach in which, in the first phase, a new method for computing the importance...
Information retrieval in information systems (IS) with large amounts of data depends not only on an effective IS architecture and design and on the technical parameters of the computer technology used to operate the IS, but also on easy and intuitive orientation among the offers and information the IS provides. Such retrieval, however, is frequently carried out with indeterminate information, which requires other models of orientation in the IS environment.
The area of Information Retrieval (IR) deals with problems of storage and retrieval within a huge collection of text documents. In IR models, the semantics of a document is usually characterized by a set of terms. A need common to various IR models is efficient term retrieval provided via a term index. Existing approaches to term indexing, e.g. the inverted list, efficiently support only simple queries asking for a term occurrence. In practice, we would like to exploit some more sophisticated...
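To make the "simple queries asking for a term occurrence" concrete, here is a minimal sketch of an inverted list built over a few toy documents. The helper names (`build_inverted_index`, `lookup`), the whitespace tokenization and the sample documents are assumptions for illustration only; the sketch does not attempt the more sophisticated queries the abstract alludes to.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the set of document ids in which it occurs."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():  # naive whitespace tokenization
            index[term].add(doc_id)
    return index

def lookup(index, term):
    """Answer a term-occurrence query: which documents contain `term`?"""
    return sorted(index.get(term.lower(), set()))

# Toy collection (assumed data).
docs = {
    1: "latent semantic indexing of patent documents",
    2: "term index for text retrieval",
    3: "patent retrieval with a term index",
}

index = build_inverted_index(docs)
patent_docs = lookup(index, "patent")  # documents containing "patent"
index_docs = lookup(index, "index")    # documents containing "index"
missing = lookup(index, "fuzzy")       # a term that occurs nowhere
```

A lookup costs a single dictionary access regardless of collection size, which is why this structure handles occurrence queries well. Any query that cannot be phrased as an exact term match gets no direct help from it.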
This paper proposes an interface that combines flexible (fuzzy) querying with data mining functionality. The point of departure is the fuzzy querying interface previously designed and implemented by the present authors, which makes it possible to formulate and execute, against a traditional (crisp) database, queries containing imprecisely specified conditions. Here we discuss possibilities for extending it with some data mining features. More specifically, linguistic summarization of data (databases),...
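One common way to evaluate imprecisely specified conditions against crisp records is to attach a membership function to each linguistic term and combine the resulting degrees. The sketch below assumes trapezoidal membership functions, the minimum as conjunction and a matching threshold; these choices, the names (`trapezoid`, `fuzzy_select`) and the sample rows are illustrative assumptions, not the interface described by the authors.

```python
def trapezoid(a, b, c, d):
    """Return a membership function that is 1 on [b, c] and 0 outside (a, d)."""
    def mu(x):
        if b <= x <= c:
            return 1.0
        if x <= a or x >= d:
            return 0.0
        if x < b:
            return (x - a) / (b - a)  # rising edge
        return (d - x) / (d - c)      # falling edge
    return mu

def fuzzy_select(rows, conditions, threshold):
    """Return (name, degree) pairs whose matching degree reaches the threshold.

    `conditions` is a list of (column, membership_function) pairs that are
    combined with the minimum, i.e. a fuzzy AND.
    """
    matches = []
    for row in rows:
        degree = min(mu(row[column]) for column, mu in conditions)
        if degree >= threshold:
            matches.append((row["name"], degree))
    return sorted(matches, key=lambda pair: -pair[1])

# Linguistic terms (assumed definitions).
young = trapezoid(0, 0, 25, 35)
high_salary = trapezoid(40_000, 60_000, 10**9, 10**9)

# Crisp records (assumed data).
rows = [
    {"name": "Ann", "age": 22, "salary": 70_000},
    {"name": "Bob", "age": 30, "salary": 50_000},
    {"name": "Eve", "age": 50, "salary": 90_000},
]

# "Find employees who are young AND have a high salary."
result = fuzzy_select(rows, [("age", young), ("salary", high_salary)], threshold=0.4)
```

The stored data stays crisp; only the query is fuzzy, so each record receives a matching degree between 0 and 1 instead of a yes/no answer.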
The paper presents a new technique for cognitive analysis and recognition of pathological wrist bone lesions. This method uses AI techniques and mathematical linguistics to automatically evaluate the structure of these bones from radiological images of the palm. Computer interpretation of selected images, based on the methodology of automatic medical image understanding introduced by the authors, was made possible by the introduction of an original relational description...