Leaf area of annatto (Bixa orellana L.) seedlings estimated by different methods: a comparative analysis

Universal Dependencies (UD) is a framework for morphosyntactic annotation of human language, which to date has been used to create treebanks for more than 100 languages. In this article, we outline the linguistic theory of the UD framework, which draws on a long tradition of typologically oriented grammatical theories. Grammatical relations between words are used centrally to explain how predicate–argument structures are encoded morphosyntactically in different languages, while morphological features and part-of-speech classes give the properties of words. We argue that this theory is a good basis for cross-linguistically consistent annotation of typologically diverse languages in a way that supports computational natural language understanding as well as broader linguistic studies.

Ethical Considerations in NLP Shared Tasks

Escartín, Reijers, Lynn et al. 2017

Shared tasks are increasingly common in our field, and new challenges are suggested at almost every conference and workshop. However, as this has become an established way of pushing research forward, it is important to discuss how we researchers organise and participate in shared tasks, and to make that information available to the community to allow further research improvements. In this paper, we present a number of ethical issues along with other areas of concern that are related to the competitive nature of shared tasks. As such issues could potentially impact on research ethics in the Natural Language Processing community, we also propose the development of a framework for the organisation of and participation in shared tasks that can help mitigate these issues.

Minority Language Twitter: Part-of-Speech Tagging and Analysis of Irish Tweets

Lynn, Scannell, Maguire 2015

Noisy user-generated text poses problems for natural language processing. In this paper, we show that this statement also holds true for the Irish language. Irish is regarded as a low-resourced language, with limited annotated corpora available to NLP researchers and linguists to fully analyse the linguistic patterns in language use in social media. We contribute to recent advances in this area of research by reporting on the development of a part-of-speech annotation scheme and an annotated corpus for Irish language tweets. We also report on state-of-the-art tagging results of training and testing three existing POS taggers on our new dataset.

Cross-lingual Transfer Parsing for Low-Resourced Languages: An Irish Case Study

Lynn, Foster, Dras et al. 2014

We present a study of cross-lingual direct transfer parsing for the Irish language. First, we discuss the mapping of the annotation scheme of the Irish Dependency Treebank to a universal dependency scheme. We explain our dependency label mapping choices and the structural changes required in the Irish Dependency Treebank. We then experiment with the universally annotated treebanks of ten languages from four language family groups to assess which languages are the most useful for cross-lingual parsing of Irish: we use these treebanks to train delexicalised parsing models, which are then applied to sentences from the Irish Dependency Treebank. The best results are achieved when using Indonesian, a language from the Austronesian language family.
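The delexicalisation step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not specify a file format, so the sketch assumes a 10-column, tab-separated CoNLL-U-style treebank, and the function name `delexicalise_conllu` and the two-token English sample are hypothetical. The idea is that word forms and lemmas are blanked so that a parser trained on one language relies only on language-independent information such as part-of-speech tags.

```python
def delexicalise_conllu(text: str) -> str:
    """Blank the FORM and LEMMA columns of 10-column token lines.

    Assumes CoNLL-U-style input: ID, FORM, LEMMA, UPOS, XPOS, FEATS,
    HEAD, DEPREL, DEPS, MISC, separated by tabs. This format is an
    assumption, not something stated in the abstract.
    """
    out = []
    for line in text.splitlines():
        # Comment lines and blank sentence separators pass through unchanged.
        if not line or line.startswith("#"):
            out.append(line)
            continue
        cols = line.split("\t")
        if len(cols) == 10:
            cols[1] = "_"  # FORM: drop the surface word
            cols[2] = "_"  # LEMMA: drop the dictionary form
        out.append("\t".join(cols))
    return "\n".join(out)


# Hypothetical two-token sample, for illustration only.
sample = "\n".join([
    "# text = Dogs bark",
    "1\tDogs\tdog\tNOUN\t_\t_\t2\tnsubj\t_\t_",
    "2\tbark\tbark\tVERB\t_\t_\t0\troot\t_\t_",
])

delexicalised = delexicalise_conllu(sample)
print(delexicalised)
```

After this step the part-of-speech tags, head indices, and dependency labels are untouched, which is what allows a model trained on a source-language treebank to be applied to target-language sentences tagged with the same scheme.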

Foreebank: Syntactic Analysis of Customer Support Forums

Kaljahi, Foster, Roturier et al. 2015

We present a new treebank of English and French technical forum content which has been annotated for grammatical errors and phrase structure. This double annotation allows us to empirically measure the effect of errors on parsing performance. While it is slightly easier to parse the corrected versions of the forum sentences, the errors are not the main factor in making this kind of text hard to parse.


Teresa Lynn

Universal Dependencies
