Text summarization is an emerging field of research in Natural Language Processing (NLP). The bulk of the existing work concerns texts in English and other widely used languages. This paper presents some of the early attempts at single-document extractive Automatic Text Summarization (ATS) on Konkani-language documents; Konkani is an under-researched language in the domain of ATS. The input documents are first cleaned of punctuation, and a score is then calculated for each sentence in the document. Each sentence score is computed from the Term Frequency-Inverse Document Frequency (TF-IDF) of the sentence's constituent words, its overlap with the title of the story, and its positional value. The K-means algorithm is then applied to determine clusters of sentences for the formation of the final summary. The value of 'K' is determined using the Elbow method. The dataset was specially constructed by the authors for these experiments and consists of folk tales drawn from books on Konkani literature. The performance assessment indicated that the summaries obtained using three clusters were better than those obtained using two clusters. The proposed system exhibited promising results, considering that no language-dependent domain knowledge or training corpus was used.
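The scoring and clustering steps described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not give the exact formulas, so the sketch assumes that each sentence is treated as a "document" for the IDF term, that title overlap is the fraction of title words found in the sentence, and that earlier sentences receive a higher positional value. The function names and the English toy story are hypothetical.

```python
import math
import random
import string
from collections import Counter

PUNCT = string.punctuation + "\u0964"  # ASCII punctuation plus the Devanagari danda

def tokenize(sentence):
    # Lowercase, split on whitespace, and strip punctuation from each word.
    words = (w.strip(PUNCT) for w in sentence.lower().split())
    return [w for w in words if w]

def sentence_features(title, sentences):
    # One (tfidf, title_overlap, position) tuple per sentence (assumed definitions).
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    doc_freq = Counter(w for t in tokens for w in set(t))
    title_words = set(tokenize(title))
    features = []
    for i, words in enumerate(tokens):
        tf = Counter(words)
        tfidf = sum((c / len(words)) * math.log(n / doc_freq[w])
                    for w, c in tf.items()) if words else 0.0
        overlap = len(title_words & set(words)) / len(title_words) if title_words else 0.0
        position = 1.0 - i / n  # assumption: earlier sentences score higher
        features.append((tfidf, overlap, position))
    return features

def kmeans(points, k, iterations=50, seed=0):
    # Plain Lloyd's algorithm; returns cluster labels and within-cluster squared error.
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: math.dist(p, centroids[c]))
            clusters[nearest].append(p)
        updated = [tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centroids[j]
                   for j, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    labels = [min(range(k), key=lambda c: math.dist(p, centroids[c])) for p in points]
    inertia = sum(math.dist(p, centroids[l]) ** 2 for p, l in zip(points, labels))
    return labels, inertia

# Hypothetical toy story (English used only for readability).
TITLE = "The clever crow"
SENTENCES = [
    "A thirsty crow found a pot of water.",
    "The water was too low to reach.",
    "The clever crow dropped pebbles into the pot.",
    "The water rose and the crow drank.",
]

features = sentence_features(TITLE, SENTENCES)
# Elbow method: inspect how the within-cluster error falls as K grows.
inertia_by_k = {k: kmeans(features, k)[1] for k in range(1, len(features) + 1)}
```

With the Elbow method, `inertia_by_k` would be plotted against K and the value chosen where the curve stops dropping sharply. How sentences are then picked from each cluster to form the summary is not specified in the abstract and is left out here.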
Automatic text summarization has gained immense popularity in research. Several methods have previously been explored for obtaining effective summaries. However, most of the work pertains to the most widely spoken languages in the world. In this paper, we explore extractive automatic text summarization using a deep learning approach and apply it to the Konkani language, which is a low-resource language because there are limited resources, such as data, tools, speakers and/or experts, in Konkani. In the proposed technique, Facebook’s fastText pre-trained word embeddings are used to obtain a vector representation for sentences. A deep multi-layer perceptron is then employed to auto-generate summaries from the feature vectors, framed as a supervised binary classification task. Using pre-trained fastText word embeddings eliminated the need for a large training set and reduced training time. The system-generated summaries were evaluated against the ‘gold-standard’ human-generated summaries with the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) toolkit. The results showed that the performance of the proposed system closely matched that of the human annotators in generating summaries.
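One common way to obtain a sentence vector from word embeddings is to average the vectors of the words in the sentence. The abstract does not state which pooling the authors used, so averaging is an assumption here, and the tiny two-dimensional table `TOY_EMBEDDINGS` is a hypothetical stand-in for real pre-trained fastText vectors. The sketch covers only the feature-extraction step, not the multi-layer perceptron classifier.

```python
# Hypothetical 2-dimensional embedding table; real fastText vectors are much
# larger and are loaded from a pre-trained model file.
TOY_EMBEDDINGS = {
    "crow": [1.0, 0.0],
    "water": [0.0, 1.0],
    "pot": [1.0, 1.0],
}

def sentence_vector(sentence, embeddings, dim=2):
    # Average the vectors of the words found in the table (assumed pooling).
    # Unknown words are skipped in this sketch; fastText itself can build
    # vectors for unseen words from character n-grams.
    vectors = [embeddings[w] for w in sentence.lower().split() if w in embeddings]
    if not vectors:
        return [0.0] * dim
    return [sum(component) / len(vectors) for component in zip(*vectors)]

example_vector = sentence_vector("crow water", TOY_EMBEDDINGS)
```

Each sentence vector produced this way would then be paired with a binary label (in summary or not) and passed to the classifier.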
For the protection of cultural heritage, modern techniques have been used alongside traditional methods in recent years. In addition to two modern measurement techniques (unmanned aerial vehicle photogrammetry and terrestrial laser scanning), which have been the subject of many studies on cultural heritage documentation, the Wearable Mobile Laser Scanner (WMLS) three-dimensional (3D) data collection technique has started to be used. In cultural heritage documentation especially, it is essential to obtain data that are accurate and precise as well as fast to collect and of high quality. This study presents a visual and statistical comparison, in terms of accuracy and precision, of the WMLS measurement method, which works with the simultaneous localization and mapping (SLAM) algorithm and enables fast data collection. To assess the accuracy of the three measurement approaches, eighteen (18) checkpoints (ChP), whose coordinates were treated as absolute reference values, were measured using total-station techniques. With these data, the root mean square error (RMSE) of the points was determined for each of the three measurement techniques, and the directional and statistical errors were calculated. The terrestrial laser scanner method gave the best result, with an RMSE of 0.8 cm, while RMSE values of 2.64 cm and 4.92 cm were calculated for the unmanned aerial vehicle photogrammetry and WMLS methods, respectively. At the end of the study, the theory and limitations of all three approaches were considered. It was observed that the accuracy obtained with all three methods satisfies the measurement principles of cultural heritage and that a modern measurement tool such as WMLS is a significant innovation.
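The checkpoint comparison described above comes down to computing an RMSE between reference coordinates and the coordinates measured by each technique. The sketch below shows one plausible way to do that, assuming a 3D RMSE over point-to-point distances and a per-axis RMSE for the directional errors; the abstract does not give the exact formulas, and the coordinates are made-up illustrative values, not data from the study.

```python
import math

def rmse_3d(reference, measured):
    # RMSE of the 3D distances between matching checkpoints.
    total = sum(math.dist(ref, obs) ** 2 for ref, obs in zip(reference, measured))
    return math.sqrt(total / len(reference))

def rmse_per_axis(reference, measured):
    # Directional (X, Y, Z) RMSE values, one per coordinate axis.
    sums = [0.0, 0.0, 0.0]
    for ref, obs in zip(reference, measured):
        for axis in range(3):
            sums[axis] += (obs[axis] - ref[axis]) ** 2
    return tuple(math.sqrt(s / len(reference)) for s in sums)

# Hypothetical checkpoints in metres: total-station reference vs. one technique.
reference_points = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
measured_points = [(0.03, 0.04, 0.0), (10.0, 0.0, 0.0)]

overall_rmse = rmse_3d(reference_points, measured_points)
axis_rmse = rmse_per_axis(reference_points, measured_points)
```

Running the same two functions once per technique against the same reference checkpoints gives directly comparable figures.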