News categorization, a common application of text classification, is the task of automatically annotating news articles with predefined categories. With the rise of deep learning in machine learning, neural embedding models have been widely used to capture hidden relationships and similarities among textual representations of news articles. In this study, we approach Turkish news categorization as an ad hoc retrieval task and investigate how effectively paragraph vector models compute and exploit document-wise similarities among Turkish news articles. We propose an ensemble categorization approach that consists of three main stages: document processing, paragraph vector learning, and document similarity estimation. Extensive experiments on the TTC-3600 dataset show that the proposed system reaches up to 93.5% classification accuracy, which compares favorably with the baseline and state-of-the-art methods. Moreover, the Distributed Bag of Words version of Paragraph Vectors performs better than the Distributed Memory Model of Paragraph Vectors in terms of both accuracy and computational performance.
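The document similarity estimation stage can be illustrated with a minimal sketch. It assumes each article has already been mapped to a fixed-length vector (for example by a paragraph vector model) and assigns a category by majority vote among the most similar labeled articles under cosine similarity. The voting rule, the value of `k`, the function names, and the toy vectors in `TRAIN` are all assumptions made for illustration; the abstract does not specify how the similarities are combined.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def categorize(query_vec, labeled_vecs, k=3):
    """Return the majority label among the k labeled vectors most similar to query_vec.

    labeled_vecs is a list of (vector, label) pairs. The k-nearest voting rule
    is an illustrative assumption, not the method confirmed by the abstract.
    """
    ranked = sorted(
        labeled_vecs,
        key=lambda item: cosine(query_vec, item[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 3-dimensional document vectors; real paragraph vectors
# typically have tens to hundreds of dimensions.
TRAIN = [
    ([0.9, 0.1, 0.0], "sports"),
    ([0.8, 0.2, 0.1], "sports"),
    ([0.1, 0.9, 0.2], "economy"),
    ([0.0, 0.8, 0.3], "economy"),
    ([0.1, 0.2, 0.9], "technology"),
]

if __name__ == "__main__":
    print(categorize([0.85, 0.15, 0.05], TRAIN, k=3))
```

Treating categorization as retrieval in this way means no separate classifier is trained on top of the vectors: the labeled collection itself acts as the model, and the quality of the result depends on how well the learned vectors place similar articles close together.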
Social media, which has become an active communication tool in today’s education, offers a fast and easy way to share information by bringing students and educational institutions together. Although the interaction between these participants could provide implicit feedback on education services, there is only limited research on identifying trending topics in open and distance education. This study examines Twitter content to reveal the primary topics of social media conversations related to open and distance education in Turkey. An experimental study is conducted on a collection of 20,010 unique tweets matching the #aöf and #açıköğretim hashtags. The tweets in this collection, which consist of hashtags and actual tweet texts, are analyzed with two statistical inference methods: co-occurrence modeling determines the most frequently used hashtags in the education domain, while Latent Dirichlet Allocation extracts the core topics of the tweet texts. The analyses reveal that social media interactions in open and distance education cluster around semantic groups such as exams, registration periods, course materials, and exam results. Consequently, social media can be used to better understand students’ problems and demands, and thus to enhance the quality of distance education services.
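The hashtag side of this analysis can be sketched as simple pair counting. The example below is a minimal illustration, assuming that co-occurrence means two hashtags appearing in the same tweet. The function names, the case-insensitive matching, and the sample tweets are hypothetical, and the abstract does not describe the actual co-occurrence model in this detail.

```python
import re
from collections import Counter
from itertools import combinations

# \w is Unicode-aware in Python 3, so Turkish letters such as ö, ç, ı are matched.
HASHTAG = re.compile(r"#\w+")


def extract_hashtags(text):
    """Return the distinct hashtags in a tweet, lowercased and sorted."""
    return sorted({tag.lower() for tag in HASHTAG.findall(text)})


def cooccurrence_counts(tweets):
    """Count how many tweets contain each unordered pair of hashtags."""
    pairs = Counter()
    for text in tweets:
        tags = extract_hashtags(text)
        # Sorting the tags first gives every pair one canonical order.
        pairs.update(combinations(tags, 2))
    return pairs


# Hypothetical tweets, invented for illustration only.
SAMPLE_TWEETS = [
    "Exam results are out #aöf #sınav",
    "Registration starts next week #aöf #kayıt",
    "Studying for finals #AÖF #sınav",
]

if __name__ == "__main__":
    for pair, count in cooccurrence_counts(SAMPLE_TWEETS).most_common():
        print(pair, count)
```

Pairs with high counts point to hashtags that are habitually used together, which is one simple way to surface the dominant themes before applying a topic model to the tweet texts.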