Proceedings of the 2nd Workshop on Learning Language in Logic and the 4th Conference on Computational Natural Language Learning 2000
DOI: 10.3115/1117601.1117611

Memory-based learning for article generation

Abstract: Article choice can pose difficult problems in applications such as machine translation and automated summarization. In this paper, we investigate the use of corpus data to collect statistical generalizations about article use in English in order to be able to generate articles automatically to supplement a symbolic generator. We use data from the Penn Treebank as input to a memory-based learner (TiMBL 3.0; Daelemans et al., 2000) which predicts whether to generate an article with respect to an English base noun…
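To make the memory-based setup concrete, the following is a minimal sketch in the spirit of TiMBL's IB1 algorithm: k-nearest-neighbour classification over stored training instances with a feature-overlap similarity metric. The symbolic features used here (head noun, POS tag, grammatical role) and the toy training instances are hypothetical stand-ins, not the paper's actual Penn Treebank feature set.

```python
from collections import Counter

# Stored training instances: (features, article-class) pairs. Features are
# symbolic, as in TiMBL; the class is "the", "a/an", or "null" (no article).
TRAIN = [
    (("dog", "NN", "subj"), "the"),
    (("dog", "NN", "obj"), "a/an"),
    (("water", "NN", "obj"), "null"),
    (("idea", "NN", "subj"), "a/an"),
]

def overlap(x, y):
    """Count matching feature values (TiMBL's default overlap metric)."""
    return sum(a == b for a, b in zip(x, y))

def predict(features, k=1):
    """Majority vote over the k stored instances most similar to `features`."""
    ranked = sorted(TRAIN, key=lambda inst: overlap(features, inst[0]),
                    reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

print(predict(("dog", "NN", "subj")))  # -> "the"
```

With k=1 this is instance-based learning in its purest form: no abstraction over the training data, just retrieval of the most similar stored case, which is the core idea behind memory-based learning.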

Cited by 31 publications (36 citation statements); references 9 publications (6 reference statements).
“…Knight and Chander [25] took the first step toward using a machine learning algorithm for article generation, although their method only deals with a/the selection. Minnen et al. [26] extended this work to three-way classification. However, their method also depends on information such as the functional tags in the Penn Treebank, which may not be reliable in essay writing.…”
Section: Relation to Previous Work (mentioning)
confidence: 99%
“…Our approach significantly improves upon the work of Minnen et al (2000). We also use additional automatically parsed data from the North American News Text Corpus (Graff, 1995), further improving our results.…”
Section: Introduction (mentioning)
confidence: 99%
“…As with (Minnen et al., 2000), we train the language model on the Penn Treebank (Marcus et al., 1993). As far as we know, language modeling always improves with additional training data, so we add data from the North American News Text Corpus (NANC) (Graff, 1995), automatically parsed with the Charniak parser, to train our language model on up to 20 million additional words.…”
Section: Training the Model (mentioning)
confidence: 99%
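As a rough illustration of the language-modelling route this citing work takes, the sketch below generates the candidate surface strings for each article choice (the / a / no article) and keeps the one a bigram model scores highest. The bigram log-probabilities are invented for the example; a real system would estimate them from the Penn Treebank and the parsed NANC data.

```python
# Hypothetical bigram log-probabilities; a real model would estimate these
# from corpus counts (Penn Treebank plus parsed NANC text, per the quote).
BIGRAM_LOGPROB = {
    ("<s>", "the"): -1.0, ("the", "dog"): -2.0,
    ("<s>", "a"): -1.2,   ("a", "dog"): -2.5,
    ("<s>", "dog"): -4.0, ("dog", "barked"): -1.5,
}

def score(tokens, floor=-10.0):
    """Sum bigram log-probabilities, using a floor value for unseen pairs."""
    pairs = zip(["<s>"] + tokens, tokens)
    return sum(BIGRAM_LOGPROB.get(pair, floor) for pair in pairs)

candidates = [["the", "dog", "barked"],
              ["a", "dog", "barked"],
              ["dog", "barked"]]  # the null-article candidate
print(" ".join(max(candidates, key=score)))  # -> "the dog barked"
```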
“…At the syntactic sentence level TiMBL has been applied to part-of-speech tagging (Zavrel and Daelemans, 1999; Van Halteren, Zavrel, and Daelemans, 2001); PP attachment (Zavrel, Daelemans, and Veenstra, 1997); subcategorization (Buchholz, 1998); phrase chunking (Veenstra, 1998; Tjong Kim Sang and Veenstra, 1999); shallow parsing (Buchholz, Veenstra, and Daelemans, 1999; Yeh, 2000); clause identification (Orȃsan, 2000; Tjong Kim Sang, 2001); detecting the scope of negation markers (Morante, Liekens, and Daelemans, 2008); sentence-boundary detection (Stevenson and Gaizauskas, 2000); predicting the order of prenominal adjectives for generation (Malouf, 2000) and article generation (Minnen, Bond, and Copestake, 2000); and, beyond the sentence level, to anaphora resolution (Preiss, 2002; Mitkov, Evans, and Orasan, 2002; Hoste, 2005). More recently, memory-based learning has been integrated as a classifier engine in more complicated dependency parsing systems (Nivre, Hall, and Nilsson, 2004; Sagae and Lavie, 2005), or dependency parsing in combination with semantic role labeling (Morante, Van Asch, and Van den Bosch, 2009).…”
Section: NLP Applications of TiMBL (mentioning)
confidence: 99%