In information retrieval, it is common to model index terms and documents as vectors in a suitably defined vector space. The main difficulty with this approach is that the explicit representation of term vectors is not known a priori. For this reason, the vector space model adopted by Salton for the SMART system treats the terms as a set of orthogonal vectors. In such a model it is often necessary to adopt a separate, corrective procedure to take into account the correlations between terms. In this paper, we propose a systematic method (the generalized vector space model) to compute term correlations directly from the automatic indexing scheme. We also demonstrate how such correlations can be included with minimal modification in existing vector-based information retrieval systems. The preliminary experimental results obtained from the new model are very encouraging.
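The difference between orthogonal and correlated term vectors can be illustrated with a minimal sketch. It is not the procedure proposed in the paper: the correlation matrix `corr` below is a made-up example rather than one derived from an indexing scheme, and the function name `similarity` is hypothetical. The sketch only shows that, once term correlations are available, they can enter the usual cosine-style score through the inner product.

```python
from math import sqrt


def similarity(query, doc, term_corr=None):
    """Cosine-style similarity between two term-weight vectors.

    term_corr[i][j] is an assumed inner product between unit term
    vectors i and j. Passing None treats the terms as orthogonal,
    which reduces the score to the ordinary cosine similarity.
    """
    n = len(query)
    if term_corr is None:
        # Orthogonal terms: the correlation matrix is the identity.
        term_corr = [[1.0 if i == j else 0.0 for j in range(n)]
                     for i in range(n)]

    def inner(a, b):
        # Inner product of two vectors expressed in the term basis.
        return sum(a[i] * term_corr[i][j] * b[j]
                   for i in range(n) for j in range(n))

    denom = sqrt(inner(query, query) * inner(doc, doc))
    return inner(query, doc) / denom if denom else 0.0


# Hypothetical two-term vocabulary: the query uses only term 0,
# the document uses only term 1.
query = [1.0, 0.0]
doc = [0.0, 1.0]

# Assumed correlation of 0.5 between the two terms (illustrative only).
corr = [[1.0, 0.5],
        [0.5, 1.0]]

orthogonal_score = similarity(query, doc)
correlated_score = similarity(query, doc, corr)
```

With orthogonal terms the query and document share nothing and score 0.0; with the assumed correlation the same pair scores 0.5, which is the kind of effect a separate corrective procedure would otherwise have to supply.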
Color image quantization is the process of representing an image with a small number of well‐selected colors. In this article an algorithm for multidimensional data clustering (termed the variance‐based algorithm), based on the criterion of minimization of the sum‐of‐squared error, is applied to the problem of reducing the number of colors used to represent a given color image. The suitability of the sum‐of‐squared error criterion for measuring the similarity between the original and quantized images is examined using a digitized image and a computer‐generated image. The experimental results indicate that this error measure is basically consistent with the perceived quality of the quantized image. The performance of the variance‐based algorithm is compared with that of other algorithms for color image quantization in terms of quantization error. Images generated using the colors selected by the variance‐based and the median‐cut algorithms are also presented.
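The sum-of-squared error criterion itself can be sketched in a few lines. The example below does not implement the variance-based or median-cut algorithms. It assumes a hand-picked palette and a tiny list of RGB pixels (both hypothetical), maps each pixel to its nearest palette color, and measures the resulting error. The function names are illustrative.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two RGB triples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_color(pixel, palette):
    """Return the palette color closest to the pixel."""
    return min(palette, key=lambda color: squared_distance(pixel, color))


def quantize(pixels, palette):
    """Replace every pixel with its nearest palette color."""
    return [nearest_color(p, palette) for p in pixels]


def sum_squared_error(pixels, quantized):
    """Total squared error between original and quantized pixels."""
    return sum(squared_distance(p, q) for p, q in zip(pixels, quantized))


# Hypothetical four-pixel image: two reddish and two bluish pixels.
pixels = [(250, 10, 10), (240, 0, 0), (10, 10, 250), (0, 0, 240)]

# Hand-picked two-color palette (an assumption, not algorithm output).
palette = [(245, 5, 5), (5, 5, 245)]

quantized = quantize(pixels, palette)
error = sum_squared_error(pixels, quantized)
```

A quantization algorithm that minimizes this criterion searches for the palette that makes `error` as small as possible for a given number of colors. Here the palette is fixed, so the sketch only evaluates the measure.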
Notations and definitions necessary to identify the concepts and relationships that are important in modelling information retrieval objects and processes in the context of vector spaces are presented. Earlier work on the use of the vector model is evaluated in terms of the concepts introduced, and certain problems and inconsistencies are identified. More importantly, this investigation should lead to a clear understanding of the issues and problems in using the vector space model in information retrieval.