2019
DOI: 10.1371/journal.pone.0221639

Towards a top-down approach for an automatic discourse analysis for Basque: Segmentation and Central Unit detection tool

Abstract: Lately, discourse structure has received considerable attention due to the benefits its application offers in several NLP tasks such as opinion mining, summarization, question answering, text simplification, among others. When automatically analyzing texts, discourse parsers typically perform two different tasks: (i) identification of basic discourse units (text segmentation); (ii) linking discourse units by means of discourse relations, building structures such as …

Cited by 5 publications (3 citation statements)
References 43 publications
“…the CEFR word list since the complex word identification is getting attention in the NLP community (Yimam et al, 2018). Moreover, we would like to keep on adding more features: more morphological features and discourse based features based on a multilingual discourse parser (Atutxa et al, 2019). Applying MultiAzterTest in other text classification tasks is also one of our future aims.…”
Section: Discussion (mentioning)
confidence: 99%
“…RST parsers have been created for several languages, including English [29,32], Chinese [12], Basque [1], Spanish, Portuguese, German, and Dutch [3,21]. The architecture of most of the existing RST-style parsers contains two modules: for discourse segmentation and for DT construction.…”
Section: Related Work (mentioning)
confidence: 99%
“…Examples of segmentation mistakes are given in Table 3 in Appendix. The first issue is related to the fact that some EDUs in the corpus contain multiple sentences or multiple paragraphs (1). It often occurs in the "Blogs" subsection with image-replacement tags and image captions, as well as with lists manually annotated as a single EDU, because such text fragments are usually treated by the system as separate paragraphs.…”
Section: Qualitative Analysis (mentioning)
confidence: 99%