Text summarization is an emerging field of research in Natural Language Processing (NLP). The bulk of the work concerns texts in English and other widely used languages. This paper presents some early work on single-document extractive Automatic Text Summarization (ATS) for Konkani, an under-researched language in this domain. The input documents are first cleaned of punctuation, and a score is then calculated for each sentence in the document. Each sentence score combines the Term Frequency-Inverse Document Frequency (TF-IDF) of the sentence's constituent words, its overlap with the title of the story, and its positional value. The K-means algorithm is applied to group the sentences into clusters from which the final summary is formed. The value of K is determined using the Elbow method. The dataset was designed by the authors specifically for these experiments and consists of folk tales drawn from books on Konkani literature. The performance assessment indicated that summaries obtained using three clusters were better than those obtained using two clusters. The proposed system showed promising results, considering that no language-dependent domain knowledge or training corpora were used.
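The sentence-scoring step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract does not say how the three components are weighted or how IDF is computed, so the sketch assumes an unweighted sum, treats each sentence as a "document" for the IDF term, and uses a simple linear positional value. The names `tokenize` and `score_sentences` are hypothetical, and the K-means clustering and Elbow steps are not shown.

```python
import math
import string
from collections import Counter

# Assumption: ASCII punctuation plus the danda (।) used in Devanagari text.
PUNCT = string.punctuation + "।"

def tokenize(text):
    # Split on whitespace, lowercase, and strip punctuation from word edges.
    words = (w.strip(PUNCT) for w in text.lower().split())
    return [w for w in words if w]

def score_sentences(title, sentences):
    """Return one score per sentence: TF-IDF + title overlap + position.

    The equal weighting of the three components is an assumption.
    """
    docs = [tokenize(s) for s in sentences]
    n = len(docs)
    # Document frequency: number of sentences that contain each word.
    df = Counter(word for doc in docs for word in set(doc))
    title_words = set(tokenize(title))
    scores = []
    for i, doc in enumerate(docs):
        if not doc:
            scores.append(0.0)
            continue
        tf = Counter(doc)
        # Sum of TF-IDF over the distinct words of the sentence.
        tfidf = sum((c / len(doc)) * math.log(n / df[w]) for w, c in tf.items())
        # Fraction of title words that also appear in the sentence.
        overlap = len(title_words & set(doc)) / len(title_words) if title_words else 0.0
        # Earlier sentences receive a higher positional value.
        position = 1.0 - i / n
        scores.append(tfidf + overlap + position)
    return scores
```

Calling `score_sentences(title, sentences)` with a story title and its list of sentences yields one score per sentence; those scores could then serve as features for whatever clustering step follows.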
The aim of this research is to create a model that measures hate speech and its content. The descriptive analysis method of data science was used to describe and summarize raw data from a dataset. We used Twitter as the social networking website for this research to analyze and measure hate speech and its classifications. A dataset from Kaggle was used. To produce statistical results, we used the MonkeyLearn machine learning libraries, integrated with a Python program, to design and develop a model that classifies and measures hate speech and its types and that can be trained and tested using sentiment analysis. The researchers found that the majority of the tweets involve racist and ethnicity-based hate speech, and that sex- and religion-based hate speech is also widespread.
Assessing complexity can contribute significantly to attaining the quality attributes associated with a system. Avoidable complexity can be identified and reduced on the basis of the assessment, which is key to the success of the system being developed. Various evaluation methods exist, each with its own objectives and basis, and all contribute to enhancing product quality. In this paper, a Complexity Assessment approach based on Activity Diagrams (CAAD) is proposed to evaluate the process view of a system's architecture. The proposed approach estimates the complexity of a system, class, or function from the UML representation of the process view of the architecture in the form of activity diagrams. This complexity measure may be used to estimate the time and effort required to develop the system. The approach can estimate coding complexity in terms of size without actually developing the code for the system, class, or function. The paper focuses on calculating a complexity factor C from a given activity diagram and then developing a relationship between C and the lines-of-code (LOC) metric.
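The abstract does not define how the complexity factor C is computed, so the sketch below substitutes a well-known stand-in: a cyclomatic-style count (edges minus nodes plus two) over an activity diagram represented as a directed graph. It is meant only to illustrate the idea of deriving a number from the diagram's structure, and it is not the CAAD formula. The function name `cyclomatic_complexity`, the dictionary representation, and the example diagram are all hypothetical, and the count assumes a single connected diagram with one start and one end node.

```python
def cyclomatic_complexity(diagram):
    """Cyclomatic-style complexity of a directed graph: E - N + 2.

    `diagram` maps each activity node to the list of nodes it flows into.
    This is a stand-in measure, not the complexity factor C of the paper.
    """
    nodes = set(diagram)
    for targets in diagram.values():
        nodes.update(targets)
    edges = sum(len(targets) for targets in diagram.values())
    return edges - len(nodes) + 2

# Hypothetical activity diagram with a single decision point.
example = {
    "start": ["check_input"],
    "check_input": ["process", "report_error"],
    "process": ["end"],
    "report_error": ["end"],
    "end": [],
}

print(cyclomatic_complexity(example))  # prints 2
```

Each additional decision branch in the diagram raises the count by one, which is the kind of structural signal one might try to relate to code size.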