Proceedings of the Joint Conferences on New Methods in Language Processing and Computational Natural Language Learning - NeMLaP 1998
DOI: 10.3115/1603899.1603909

Automation of treebank annotation

Abstract: This paper describes applications of stochastic and symbolic NLP methods to treebank annotation. In particular, we focus on (1) the automation of treebank annotation, (2) the comparison of conflicting annotations for the same sentence and (3) the automatic detection of inconsistencies. These techniques are currently employed for building a German treebank.

Cited by 27 publications (23 citation statements)
References 8 publications
“…There is a long history of scaling for language models, for both the model and dataset sizes. Brants et al. (2007) showed the benefits of using language models trained on 2 trillion tokens, resulting in 300 billion n-grams, on the quality of machine translation. In the context of neural language models, Jozefowicz et al. (2016) obtained state-of-the-art results on the Billion Word benchmark by scaling LSTMs to 1 billion parameters.…”
Section: Related Work
confidence: 99%
“…Dickinson and Meurers [24] introduced an algorithm to detect POS tag errors in gold-standard annotations. They present three error detection methods, which are related to the common inter-annotator agreement evaluation strategy [25]. Thiele et al. [26] applied a similar technique to detect POS errors and developed a graphical interface that enables users to find and evaluate annotation errors.…”
Section: Visualization and Computational Linguistics
confidence: 99%
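The error-detection idea referenced in the excerpt above can be illustrated with a minimal sketch: flag tokens that occur in the corpus with more than one POS tag, since such variation is a candidate inconsistency. This is a deliberately simplified unigram version (the cited work uses richer context); the corpus and tag names below are invented for illustration.

```python
from collections import defaultdict

def find_pos_variation(tagged_corpus):
    """Return {word: set_of_tags} for words annotated with more
    than one POS tag across the corpus -- candidate inconsistencies."""
    tags_seen = defaultdict(set)
    for word, tag in tagged_corpus:
        tags_seen[word].add(tag)
    return {w: tags for w, tags in tags_seen.items() if len(tags) > 1}

# Invented toy corpus of (word, tag) pairs.
corpus = [("the", "DT"), ("can", "MD"), ("can", "NN"), ("rusts", "VBZ")]
print(find_pos_variation(corpus))  # only "can" varies
```

In practice each flagged word would be inspected in context, since genuine ambiguity (e.g. "can" as modal vs. noun) also produces variation.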
“…Since in ² All trees in this contribution follow the data format for trees defined by the NEGRA project of the Sonderforschungsbereich 378 at the University of the Saarland, Saarbrücken. They were printed by the NEGRA annotation tool [5]. ³ Memory-based learning has recently been applied to a variety of NLP classification tasks, including part-of-speech tagging, noun phrase chunking, grapheme-phoneme conversion, word sense disambiguation, and pp attachment (see [9], [14], [15] for details).…”
Section: Similarity-based Tree Construction
confidence: 99%
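Memory-based learning, as mentioned in the excerpt above, stores all training cases and classifies a new instance by majority vote over its nearest stored neighbours. The sketch below shows the core idea with a Hamming distance over fixed-length feature tuples; the feature scheme and toy instance base are invented, not taken from the cited systems.

```python
from collections import Counter

def knn_classify(instance_base, query, k=3):
    """Memory-based classification: keep every training case and
    label a query by majority vote over its k nearest neighbours.
    Instances are (features, label); distance = count of differing features."""
    def dist(a, b):
        return sum(x != y for x, y in zip(a, b))
    nearest = sorted(instance_base, key=lambda fc: dist(fc[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Invented instance base: (previous tag, word suffix) -> POS tag.
base = [(("DT", "og"), "NN"), (("DT", "at"), "NN"),
        (("NN", "ns"), "VBZ"), (("MD", "un"), "VB")]
print(knn_classify(base, ("DT", "ig")))  # two DT-context neighbours vote NN
```

Real memory-based taggers use weighted feature metrics and efficient indexing rather than a full sort, but the classification principle is the same.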
“…67,000 fully annotated sentences or sentence fragments. ⁵ The evaluation consisted of a ten-fold cross-validation test, where the training data provide an instance base of already seen cases for TüSBL's tree construction module.…”
Section: Quantitative Evaluation
confidence: 99%
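The ten-fold cross-validation setup mentioned in the excerpt above partitions the data into ten folds, holding each fold out once as test data while the remaining nine form the instance base. A minimal sketch of that splitting scheme (the data here are stand-in integers, not the actual treebank sentences):

```python
def k_fold_splits(items, k=10):
    """Yield (train, test) partitions for k-fold cross-validation:
    each of the k folds serves exactly once as held-out test data."""
    folds = [items[i::k] for i in range(k)]  # round-robin assignment
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test

sentences = list(range(20))  # stand-in for annotated sentences
for train, test in k_fold_splits(sentences):
    assert len(test) == 2 and len(train) == 18  # 10% held out per fold
```

Averaging an evaluation metric over the ten test folds gives a score that uses every sentence for testing exactly once.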