2005
DOI: 10.1186/1471-2105-6-s1-s12

Data preparation and interannotator agreement: BioCreAtIvE Task 1B

Abstract: Background: We prepared and evaluated training and test materials for an assessment of text mining methods in molecular biology. The goal of the assessment was to evaluate the ability of automated systems to generate a list of unique gene identifiers from PubMed abstracts for the three model organisms Fly, Mouse, and Yeast. This paper describes the preparation and evaluation of answer keys for training and testing. These consisted of lists of normalized gene names found in the abstracts, generated by adapting …
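The core operation evaluated in Task 1B, mapping gene mentions in an abstract to a list of unique identifiers, can be illustrated with a minimal dictionary-lookup sketch. The lexicon entries and function names below are hypothetical illustrations, not the systems assessed in the challenge; real systems must handle synonymy, ambiguity, and tokenization far more carefully.

```python
# Minimal sketch of gene normalization: map gene mentions in an abstract to
# unique identifiers via a synonym lexicon. Lexicon contents and names here
# are illustrative only, not the evaluated systems' actual resources.

def build_lookup(lexicon):
    """Invert {gene_id: [synonyms]} into a case-insensitive synonym -> id map."""
    lookup = {}
    for gene_id, synonyms in lexicon.items():
        for name in synonyms:
            lookup[name.lower()] = gene_id
    return lookup

def normalize_genes(abstract, lookup):
    """Return the set of unique gene ids whose synonyms occur in the text.

    Matching is naive lowercase substring search, a deliberate simplification.
    """
    text = abstract.lower()
    return {gene_id for name, gene_id in lookup.items() if name in text}

# Hypothetical Fly lexicon entries, for illustration only.
lexicon = {
    "FBgn0000490": ["dpp", "decapentaplegic"],
    "FBgn0003717": ["tkv", "thickveins"],
}
abstract = "Decapentaplegic (Dpp) signals through the receptor Thickveins (Tkv)."
print(sorted(normalize_genes(abstract, build_lookup(lexicon))))
# ['FBgn0000490', 'FBgn0003717']
```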

Cited by 27 publications (29 citation statements) | References 7 publications
“…This meant that we had to edit the gene lists to make them correspond to genes mentioned in the abstract, rather than all the genes curated in the full text article. We developed a procedure to automatically remove genes not found in the abstract and were able to provide a large quantity of "noisy" training data for the three organisms, together with small collections of carefully corrected development and test data [11]. We estimated the quality of the noisy training data for the three organisms.…”
Section: Results
confidence: 99%
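The pruning step this excerpt describes, cutting a gene list curated from a full-text article down to the genes actually mentioned in the abstract, can be sketched as follows. The naive lowercase substring matching and the lexicon structure are assumptions for illustration, not the authors' actual procedure [11].

```python
def filter_gene_list(curated_ids, abstract, lexicon):
    """Keep only curated gene ids with at least one known synonym in the abstract.

    `lexicon` maps gene_id -> list of synonyms; matching is naive lowercase
    substring search, a simplification of any real matcher. Genes curated from
    the full text but absent from the abstract are dropped, which is why the
    resulting training data is "noisy" rather than hand-corrected.
    """
    text = abstract.lower()
    return [
        gene_id
        for gene_id in curated_ids
        if any(name.lower() in text for name in lexicon.get(gene_id, []))
    ]
```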
“…There are 6 papers for task 1b, including an overview [10], an article describing preparation of the test sets and inter-annotator agreement experiments [11], and four articles describing systems and results for task 1b [9,12-14]. …”
Section: Introduction
confidence: 99%
“…Whenever new annotators joined the project, they had to be trained using previously annotated examples and follow the guideline. Colosimo et al [5] and Tanabe et al [28] also conduct corpus annotation in the biology domain and conclude that clear annotation guidelines are important, and the annotations should be validated by proper interannotator-agreement experiments.…”
Section: How To Annotate Properly: What Have We Learnt?
confidence: 99%
“…3.6. Ideally, each topic would be judged by at least three judges and use a consensus-driven process which would provide maximum consistency throughout the entire process of gold standard creation (Hripcsak and Wilcox 2002; Colosimo et al 2005). The time scale of TREC and the resources required for a consensus approach made this impractical.…”
Section: Factors Influencing Inter-annotator Agreement
confidence: 99%
“…For the 2002 KDD cup and year 1 of the Genomics track, assessment was performed by using existing curation in FlyBase and Entrez Gene, respectively (Hersh and Bhupatiraju 2003; Yeh et al 2003). Task 1a of the first BioCreAtIvE challenge took a multipronged approach by developing guidelines for their three assessors, performing three-way comparisons of replicated judgments, and using pooled results from participants to uncover potential false negatives and false positives (Colosimo et al 2005). For judging duties that went beyond relevance assessment, the TREC Genomics track employed guidelines, training sessions, scaled-down “mini-topics”, and moderated assessment by an experienced former judge.…”
Section: Factors Influencing Inter-annotator Agreement
confidence: 99%