2015
DOI: 10.1145/2641567

An EDU-Based Approach for Thai Multi-Document Summarization and Its Application

Abstract: Due to lack of a word/phrase/sentence boundary, summarization of Thai multiple documents has several challenges in unit segmentation, unit selection, duplication elimination, and evaluation dataset construction. In this article, we introduce Thai Elementary Discourse Units (TEDUs) and their derivatives, called Combined TEDUs (CTEDUs), and then present our three-stage method of Thai multi-document summarization, that is, unit segmentation, unit-graph formulation, and unit selection and summary generation. To ex…


Cited by 5 publications (6 citation statements)
References 22 publications

“…(1) Low Accuracy of Language Processing Tools. The aforementioned unique characteristics of Thai such as, Thai text does not contain any word/sentence boundary, makes it harder for Thai language processing tools such as TLTK 2 to yield a high accuracy [5,6,16,18]. Consequently, the stylometric features extraction process for Thai is noisier in comparison to English.…”
Section: Limitations Of Existing Study
Citation type: mentioning
confidence: 99%
“…More than 90 million people speak Kra-dai languages and Thai is the most widely spoken Kra-dai language. However, most of the existing authorship identification solutions are designed for English [15,16,18]. These solutions are not directly applicable to Thai due to linguistic differences between English and Thai.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Thus forming summary based on frequency of words related to topic has found suitable application in several area [18] [19] of text analysis. A typical approach to generate summary is to identify important or keywords, which is usually carried out by eliminating stop words such as is, are, a, the, then, etc.…”
Section: Text Summarization With Pronoun Frequency
Citation type: mentioning
confidence: 99%
“…The extractive summarization methods discussed as in [19]- [21] are intend to choose words, sentences and phrases from the given text to obtain the summary. Forming summary based on frequency of words related to the topic has found suitable application in several area [22], [23] of text analysis. It is observed that in a given document the words that are occurring more frequently indicates the subject on which the text is pivoted.…”
Section: Introduction
Citation type: mentioning
confidence: 99%