The first part of this paper reports a comparative study of the document classifications produced by the single linkage, complete linkage, group average, and Ward clustering methods. Studies of cluster membership and of the effectiveness of cluster searches support previous findings that the single linkage classifications are rather different from those produced by the other three methods. These latter methods all produce large numbers of small clusters containing just pairs of documents. This finding motivates the work reported in the second part of the paper, which considers the use of clusters consisting of a document together with the document to which it is most similar. A comparison of such clusters with conventional best match searches, using seven document test collections, suggests that the two types of search are of comparable effectiveness but retrieve noticeably different sets of relevant documents.
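The nearest-neighbour clusters described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the cosine similarity measure, the toy documents, the function names, and the rule of scoring a cluster by the mean query similarity of its two members are all assumptions made for illustration.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts (assumed measure)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = sqrt(sum(w * w for w in a.values()))
    norm_b = sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def nearest_neighbour_clusters(docs):
    """Pair every document with the single document most similar to it."""
    clusters = {}
    for doc_id, vec in docs.items():
        best_id, best_sim = None, -1.0
        for other_id, other_vec in docs.items():
            if other_id == doc_id:
                continue
            sim = cosine(vec, other_vec)
            if sim > best_sim:
                best_id, best_sim = other_id, sim
        clusters[doc_id] = (doc_id, best_id)
    return clusters


def cluster_search(query, docs, clusters):
    """Rank clusters by the mean query similarity of their members (assumed rule)."""
    scored = []
    for members in clusters.values():
        score = sum(cosine(query, docs[m]) for m in members) / len(members)
        scored.append((score, members))
    # Highest-scoring cluster first; ties keep their original order.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical toy collection: term -> weight.
docs = {
    "d1": {"cluster": 1.0, "search": 1.0},
    "d2": {"cluster": 1.0, "search": 1.0, "retrieval": 1.0},
    "d3": {"inscription": 1.0, "sanskrit": 1.0},
    "d4": {"inscription": 1.0, "sanskrit": 1.0, "pali": 1.0},
}

clusters = nearest_neighbour_clusters(docs)
ranked = cluster_search({"cluster": 1.0, "retrieval": 1.0}, docs, clusters)
```

A conventional best match search would instead rank the individual documents by `cosine(query, doc)`. Because the cluster score here averages over a document and its neighbour, a document that matches the query weakly can still surface through its pair, which is one plausible reason the two kinds of search could retrieve different relevant documents.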
This paper considers the classifications produced by applying the single linkage, complete linkage, group average and Ward clustering methods to the Keen and Cranfield document test collections. Experiments were carried out to study the structure of the hierarchies produced by the different methods, the extent to which the methods distort the input similarity matrices during the generation of a classification, and the retrieval effectiveness obtainable in cluster-based retrieval. The results suggest that the single linkage method, which has been used extensively in previous work on document clustering, is not the most effective procedure of those tested, although it should be emphasized that the experiments used only small document test collections.
This article examines a group of ten Indonesian inscriptions citing a range of gāthās, mantras and dhāraṇīs. The texts, contextualized and in some cases read and identified for the first time, underline the pan-Asian character of Buddhism and the integral place the Indonesian archipelago once held in the ancient Buddhist world. The identification of the sources of several of these texts in known Sanskrit scriptures raises the question whether some of these texts, none of which survives as such in the archipelago, were once transmitted there in manuscript form.
In the first millennium CE, before the arrival of the Burmese ethnic group, central Burma was home to an important urban system. Scholars and the general public alike know its culture by the name "Pyu". The written traces of the Pyus take the form of inscriptions on stone or other supports, composed in three languages, each with its own type of Indic script. Pyu, a vernacular language of the Sino-Tibetan family, predominates, but the cosmopolitan languages Sanskrit and Pali are also represented. This study presents the archaeological context of the epigraphic corpus as well as the history of previous research on the Pyu language; it establishes the method and notation that future research can use to analyze and represent epigraphic data in Pyu; and it summarizes what our research has so far allowed us to understand better about Pyu script and language. Knowledge in this field is enriched by an edition, with linguistic analysis, of the bilingual Sanskrit-Pyu inscription from the Kan Wet Khaung mound. Finally, the inventory of inscriptions belonging to Pyu culture fixes a stable identifier for each entry, linked to the relevant data (places of conservation, visual documentation, references, etc.).