2010
DOI: 10.1016/j.jbi.2010.04.001

Reflective random indexing for semi-automatic indexing of the biomedical literature

Abstract: The rapid growth of biomedical literature is evident in the increasing size of the MEDLINE research database. Medical Subject Headings (MeSH), a controlled set of keywords, are used to index all the citations contained in the database to facilitate search and retrieval. This volume of citations calls for efficient tools to assist indexers at the US National Library of Medicine (NLM). Currently, the Medical Text Indexer (MTI) system provides assistance by recommending MeSH terms based on the title and abstract …

Cited by 22 publications
(24 citation statements)
References 9 publications
“…However automatic systems also present problems because of the complexity of natural language processing (Sinkkilä et al, 2011). Consequently the semi-automatic indexing approach is a good solution, because in addition to obviating the problems of the automatic indexing system it facilitates the task of indexers by providing suitable term suggestions (Vasuki and Cohen, 2010).…”
Section: Research Objectives
Citation type: mentioning
confidence: 99%
“…Vasuki and Cohen [10] use an interesting approach that employs reflective random indexing to find the nearest neighbors in the training dataset and use the indexing based similarity scores to rank the terms from the neighboring citations. A recent effort by Jimeno-Yepes et al [11] uses a large dataset and uses meta-learning to train custom binary classifiers for each MeSH term and index the best performing model for each term for usage on new testing citations; we request the reader to refer to their work for a recent review of machine learning approaches used for MeSH term assignment.…”
Section: Background and Related Work
Citation type: mentioning
confidence: 99%
“…We experiment with two public datasets used by Huang et al [1]. The NLM2007 dataset has 200 test citations and is used by other recent studies on this subject [10]. The L1000 dataset is curated by Huang et al by random selection for the purposes of their work to test their methods on a larger dataset that spanned a large number of years.…”
Section: Datasets and Evaluation Metrics
Citation type: mentioning
confidence: 99%
“…Work on MeSH in English has used the probabilistic model [4] and machine learning techniques such as Bayesian networks [5] and k-nearest neighbors for document classification [2, 6-8]. Aronson et al [2] also use the MetaMap tool [9] and the trigram method (which determines the similarity between two sentences) to extract Unified Medical Language System (UMLS) concepts, which are then restricted to MeSH concepts.…”
Section: Introduction
Citation type: unclassified