1996
DOI: 10.1002/(sici)1099-1506(199607/08)3:4<301::aid-nla84>3.0.co;2-s

Low-rank Orthogonal Decompositions for Information Retrieval Applications

Abstract: Current methods to index and retrieve documents from databases usually depend on a lexical match between query terms and keywords extracted from documents in a database. These methods can produce incomplete or irrelevant results due to the use of synonyms and polysemous words. The association of terms with documents (or implicit semantic structure) can be derived using large sparse term-by-document matrices. In fact, both terms and documents can be matched with user queries using representations in k-space…
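A minimal sketch of the retrieval scheme the abstract outlines: build a term-by-document matrix, take a rank-k truncated SVD, and match a query against documents in k-space. The toy corpus, variable names, and cosine scoring below are illustrative assumptions, not the paper's implementation (which concerns low-rank orthogonal decompositions more generally).

# Latent-semantic-indexing sketch (illustrative only).
import numpy as np

docs = [
    "car engine repair",
    "automobile motor maintenance",
    "baking bread recipe",
]
terms = sorted({w for d in docs for w in d.split()})

# Term-by-document matrix: A[i, j] = count of term i in document j.
A = np.array([[d.split().count(t) for d in docs] for t in terms], dtype=float)

# Rank-k truncated SVD: A ~ U_k diag(s_k) V_k^T.
k = 2
U, s, Vt = np.linalg.svd(A, full_matrices=False)
Uk, sk, Vtk = U[:, :k], s[:k], Vt[:k, :]

# Fold the query into k-space (q_k = Sigma_k^{-1} U_k^T q) and score
# documents by cosine similarity against the rows of V_k.
q = np.array([1.0 if t in ("engine", "repair") else 0.0 for t in terms])
q_k = (Uk.T @ q) / sk
doc_k = Vtk.T
scores = doc_k @ q_k / (np.linalg.norm(doc_k, axis=1) * np.linalg.norm(q_k))
print(scores)  # the first (car/engine) document should score highest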

Year Published: 1997–2012


Cited by 51 publications (25 citation statements, all classified as "mentioning"; 0 supporting, 0 contrasting)
References 12 publications

Citation statements, ordered by relevance:
“…The sparse SVD function SVDS is based on the Arnoldi methods described in [36]. Note that, for practical purposes, less expensive factorizations such as QR or ULV may suffice in place of the SVD [10].…”
Section: Sparsity
confidence: 99%
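This statement contrasts an Arnoldi-based sparse truncated SVD with cheaper factorizations such as QR or ULV. A hedged sketch under those assumptions, using SciPy's ARPACK-backed svds and a column-pivoted QR as the cheaper stand-in; the sparse matrix is placeholder random data, and the ULV alternative is not shown.

# Arnoldi-based sparse truncated SVD vs. a cheaper pivoted-QR basis.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import svds
from scipy.linalg import qr

A = sp.random(500, 200, density=0.01, format="csc", random_state=0)

# Truncated SVD of a sparse matrix via ARPACK (Arnoldi/Lanczos iteration).
k = 10
U, s, Vt = svds(A, k=k)

# Cheaper alternative: column-pivoted QR, truncated to k columns, gives a
# rank-k orthogonal basis without computing singular values.
Q, R, piv = qr(A.toarray(), mode="economic", pivoting=True)
Qk = Q[:, :k]

# Compare rank-k reconstruction errors in the Frobenius norm.
Ad = A.toarray()
err_svd = np.linalg.norm(Ad - U @ np.diag(s) @ Vt)
err_qr = np.linalg.norm(Ad - Qk @ (Qk.T @ Ad))
print(err_svd, err_qr)  # SVD error is optimal; QR is close but cheaper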
“…Compare the above expression with (3). Choosing the function φ(x) appropriately will allow us to interpret this approach as a compromise between the vector space and the TSVD approaches.…”
Section: Latent Semantic Indexing by Polynomial Filtering
confidence: 99%
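The φ(x) in this statement is a spectral filter: scoring documents via s = Aᵀ φ(AAᵀ) q interpolates between the plain vector-space model (φ(x) = 1) and a TSVD-like projection (φ approximating a step function on the spectrum). A rough sketch under those assumptions, with a crude power filter standing in for the polynomial used in the cited work:

# Polynomial-filtering idea: damp the small (noisy) spectral directions
# of A A^T acting on the query, without forming an SVD.
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((100, 40))    # stand-in term-by-document matrix
q = rng.random(100)          # stand-in query vector

B = A @ A.T
B /= np.linalg.norm(B, 2)    # scale the spectrum into [0, 1]; a cheap
                             # norm bound would be used in practice

# phi(x) = x^p suppresses small eigenvalues relative to large ones;
# larger p approaches a sharper high-pass spectral filter.
p = 8
y = q.copy()
for _ in range(p):
    y = B @ y                # y = phi(B) q with phi(x) = x^p

scores = A.T @ y             # document scores, cf. s = A^T phi(AA^T) q
print(scores[:5])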
“…Most of the current implementations of LSI rely on matrix decompositions (see e.g., [3], [13]), with the truncated SVD (TSVD) being the most popular [1], [2]. In TSVD it is assumed that the smallest singular triplets are noisy and therefore only the largest singular triplets are used for the rank-k representation of the term-by-document matrix A.…”
Section: Introduction
confidence: 99%
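The rank-k TSVD representation this statement refers to has a standard closed form; restated below with notation assumed here (not quoted from the cited papers), where A is the m×n term-by-document matrix:

\[
  A \approx A_k = U_k \Sigma_k V_k^{T} = \sum_{i=1}^{k} \sigma_i\, u_i v_i^{T},
  \qquad \sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_k > 0,
\]

where $(\sigma_i, u_i, v_i)$ are the $k$ largest singular triplets of $A$. Discarding the trailing triplets treats them as noise; by the Eckart–Young theorem, $A_k$ is the closest rank-$k$ matrix to $A$ in both the 2-norm and the Frobenius norm.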
“…These approximations identify hidden structures in word usage, thus enabling searches that go beyond simple keyword matching (see, for example [14]). Rank reduction techniques (SDD [40], SVD [11,12,47,10]) for type clustering are applicable here as they have been shown to be especially appropriate for latent semantic indexing in information retrieval.…”
Section: Clustering and Topical Compression
confidence: 99%
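A small sketch of the reduced-rank term clustering this statement describes: terms are embedded via their truncated-SVD coordinates, then grouped with a plain k-means loop. The data, cluster count, and use of the SVD (rather than the SDD the statement also mentions) are assumptions for illustration.

# Cluster terms in a rank-k space obtained from a truncated SVD.
import numpy as np

rng = np.random.default_rng(1)
A = rng.random((300, 80))              # stand-in term-by-document matrix

k = 5
U, s, Vt = np.linalg.svd(A, full_matrices=False)
term_coords = U[:, :k] * s[:k]         # term positions in rank-k space

# Plain k-means on the reduced coordinates.
n_clusters, iters = 4, 20
centers = term_coords[rng.choice(len(term_coords), n_clusters, replace=False)]
for _ in range(iters):
    d = np.linalg.norm(term_coords[:, None] - centers[None], axis=2)
    labels = d.argmin(axis=1)          # nearest center for each term
    for c in range(n_clusters):
        if np.any(labels == c):
            centers[c] = term_coords[labels == c].mean(axis=0)

print(np.bincount(labels, minlength=n_clusters))  # cluster sizes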