Automatic text summarization based on semantic analysis approach for documents in Indonesian language

Tardan, Pandu Prakoso; Erwin, Alva; Eng, Kho I; Muliady, Wahyu

doi:10.1109/iciteed.2013.6676209

Cited by 12 publications

(10 citation statements)

References 12 publications


“…It has been proven that preprocessing improves the performance of automatic text summarization systems [5,11,12]. Therefore, in this stage, tokenization was applied to texts so that texts are split into sentences and sentences into words (or terms).…”

Section: Preprocessing (mentioning)

confidence: 99%
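
As an illustration of the tokenization step this citation statement describes (text split into sentences, sentences split into terms), here is a minimal sketch. The function names and the punctuation-based splitting rule are assumptions made for this example; neither the citing paper nor the cited paper specifies its tokenizer here.

```python
import re


def split_sentences(text):
    # Assumed rule: a sentence ends at '.', '!' or '?' followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def split_words(sentence):
    # Lowercase the sentence and keep runs of word characters as terms.
    return re.findall(r"\w+", sentence.lower())


def tokenize(text):
    # One list of terms per sentence.
    return [split_words(s) for s in split_sentences(text)]
```

For example, `tokenize("Harga naik. Pasar turun!")` yields one term list per sentence. A real system would likely need abbreviation handling and language-specific rules that this sketch omits.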

“…On the other hand, in the extractive text summarization, the sentences chosen for the summary are not changed. Since the development of the system using this summarization method is less complex than abstractive summarization, many studies have preferred this approach rather than abstractive summarization [3][4][5][6][7].…”

Section: Introduction (mentioning)

confidence: 99%

“…In [4], which is the first study in the Turkish language, term frequencies, and the location of sentences play prominent roles for text summarization. In [5], the authors proposed a text summarization system in the Indonesian language designed with a semantic analysis approach aiming to determine the similarity between sentences by using vector values of each sentence with the title. Further, in [6], the scores of sentences were used to determine which sentences were in summary with the help of a neural network.…”

Section: Introduction (mentioning)

confidence: 99%
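
The idea summarized in this statement, scoring each sentence by the similarity of its vector to the title's vector, can be sketched as follows. This is a hedged illustration only: raw term counts stand in for the semantic vectors of the cited paper, whose construction is not described on this page, and the names `term_vector`, `cosine_similarity`, and `rank_by_title` are hypothetical.

```python
import math
import re
from collections import Counter


def term_vector(text):
    # Assumption: a bag-of-words count vector stands in for a semantic vector.
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(a, b):
    # Dot product over shared terms, divided by the product of vector norms.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_title(title, sentences):
    # Return (score, sentence) pairs, most title-like sentence first.
    title_vec = term_vector(title)
    scored = [(cosine_similarity(term_vector(s), title_vec), s) for s in sentences]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

An extractive summarizer built on this would keep the top-ranked sentences, unchanged, up to the desired summary length.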


A New Model on Automatic Text Summarization for Turkish

Bal

Günal

2021

Eskişehir Technical University Journal of Science and Technology a - Applied Sciences and Engineering


The amount of data available in the electronic environment is increasing day by day with the development of technology. It becomes challenging and time-consuming for the users to access the information they desire within this increasing amount of data. Automatic text summarization systems have been developed to reach the desired information within texts in a shorter time than manual text summarization. In this paper, a new extractive text summarization model is proposed. In the proposed model, the inclusion of sentences of a given text in the summary is decided based on a classification approach. Also, the effectiveness of widely used features for automatic text summarization in the Turkish language is evaluated using sequential feature selection methods. The evaluations were carried out specifically for Turkish texts in the categories of economy, art, and sports. The experimental work justified the proposed text summarization method's performance and revealed how effective the features are.


“…Similarities between the title of news articles are being pre-computed by using semantic analysis, cosine similarity to be exact. Few algorithms to compute sentence similarity had been researched by [23] and semantic analysis has been chosen in this paper because of its robust performance and fast computation. The result of similarity between two article titles by using semantic analysis gives a higher rating compared to similarity using statistical approach.…”

Section: J) Recommended News From Association Rule Discovery (mentioning)

confidence: 99%

News recommendation in Indonesian language based on user click behavior

Desyaputri

Erwin

Galinium

et al. 2013

2013 International Conference on Information Technology and Electrical Engineering (ICITEE)

Self Cite


Recommendation system has been proposed for years as the solution of information era problem. This research strives to develop an intelligent recommendation system based on user click behavior on news websites. We extracted frequent item sets and association rules from the web server log of a news website, performed a pre-computation of similarity between news articles, and then proposed a three-level recommendation system: based on association rule discovery, news articles on the same category, and similarity between news articles. By combining collaborative filtering approach and content-based filtering, experiment results show that the technique produces reliable news recommendation.


“…Abstractive summarization creates a brief useful summary by generating new sentences [4]. Some researchers have conducted automatic text summarization in Indonesian with several methods, including summarizing text using sentence scoring and decision trees [5], Text Summarization Based on Semantic Analysis [6], Sentence structure-based summarization [7], and query-based summarization [8]. The study did not apply word order calculations in sentences.…”

Section: Introduction (mentioning)

confidence: 99%

Automatic Text Summarization Based on Semantic Networks and Corpus Statistics

Yulita

Priyanta

Sn

2019

Indonesian J. Comput. Cybern. Syst.


Abstract: One simple automatic text summarization method that can minimize redundancy in the summary is the Maximum Marginal Relevance (MMR) method. The MMR method has the disadvantage that the resulting summary can contain parts that are separated from each other and not semantically connected. Therefore, this study aims to compare summary results from semantic-based MMR and non-semantic-based MMR. The semantic-based MMR method uses WordNet Bahasa and a corpus in processing text summaries. The non-semantic-based MMR method uses the TF-IDF method. This study also applied summary compression rates of 30%, 20%, and 10%. The research data consist of 50 online news texts. The summaries were evaluated using the ROUGE toolkit. The results show that the best average f-score for the semantic-based MMR method is 0.561, while the best f-score for the non-semantic-based MMR method is 0.598. These values were obtained by adding stemming as a preprocessing step and using a 30% summary compression. The difference in values is attributed to the incompleteness of WordNet Bahasa and to several words in the news titles that do not conform to EYD (KBBI). Keywords: automatic text summarization, MMR method, semantic, non-semantic
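
For readers unfamiliar with Maximum Marginal Relevance, the greedy selection named in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: cosine similarity over raw term counts stands in for their TF-IDF and WordNet-based similarities, the query is assumed to be the news title, and `mmr_select`, `lam`, and `k` are hypothetical names.

```python
import math
import re
from collections import Counter


def vec(text):
    # Assumption: raw term counts instead of TF-IDF or WordNet-based weights.
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def mmr_select(query, sentences, k, lam=0.7):
    # Greedily pick k sentences, trading relevance to the query (weight lam)
    # against redundancy with sentences already picked (weight 1 - lam).
    q = vec(query)
    vecs = [vec(s) for s in sentences]
    selected = []
    candidates = list(range(len(sentences)))

    def score(i):
        relevance = cosine(vecs[i], q)
        redundancy = max((cosine(vecs[i], vecs[j]) for j in selected), default=0.0)
        return lam * relevance - (1 - lam) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    # Return the chosen sentences in their original document order.
    return [sentences[i] for i in sorted(selected)]
```

A compression rate such as 30% would correspond to choosing `k` as roughly 30% of the sentence count. A lower `lam` penalizes redundancy more strongly, so near-duplicate sentences are less likely to be selected together.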
