Selecting a major can be difficult for prospective college students. The choice may affect not only their academic life but also their career path. Because of restrictions imposed during the COVID-19 pandemic, universities must find novel ways to reach prospective students and assist them in choosing their majors, one of which is a college major recommendation system. Such a system can help prospective students determine the most appropriate majors for them based on data from current students. Unlike existing systems that employ either a rule-based or a fuzzy model, this study employs a machine learning approach using data from undergraduate students at Universitas Islam Indonesia. This paper compares several clustering models (i.e., K-means, Agglomerative, Birch, and DBSCAN) for categorizing current students; the resulting clusters are then used for classification under various approaches (i.e., single-stage vs. multistage), algorithms (i.e., multinomial logistic regression, random forest, and support vector machine), and scenarios (i.e., with or without a GPA-based label). Our findings indicate that the K-means model outperformed all other clustering models and that the single-stage random forest classification model performed best across all scenarios.
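The cluster-then-classify idea in this abstract can be illustrated with a minimal sketch. It is not the study's implementation: the two features per student, the toy values, and the names `kmeans` and `assign_cluster` are all hypothetical, and a simple nearest-centroid rule stands in for the classifiers the study actually compares (multinomial logistic regression, random forest, and support vector machine).

```python
import math
import random


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm. Returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct points as initial centroids
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = tuple(sum(col) / len(members) for col in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[c])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # centroids stopped moving
        centroids = new_centroids
    return centroids, labels


def assign_cluster(centroids, point):
    """Stand-in for the classification stage: pick the nearest centroid."""
    return min(range(len(centroids)), key=lambda c: math.dist(point, centroids[c]))


# Hypothetical, invented data: (quantitative score, verbal score) per current student.
students = [
    (0.90, 0.20), (0.80, 0.30), (0.85, 0.25),
    (0.20, 0.90), (0.30, 0.80), (0.25, 0.85),
]

centroids, labels = kmeans(students, k=2)

# A prospective student is matched to the group of current students they resemble most.
prospective = (0.88, 0.22)
group = assign_cluster(centroids, prospective)
```

In a fuller pipeline, the cluster labels would serve as training targets for a supervised classifier, and the number of clusters would be chosen by comparing clustering quality across models, as the abstract describes.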
Since the end of May 2020, a massive wave of protests in the United States has been directed at the government over police violence against Black people. On May 25, 2020, George Floyd, a 46-year-old Black American man, was killed in Minneapolis during an arrest for allegedly using a counterfeit bill. This study analyzes the public’s reactions to the Black Lives Matter campaign using a supervised machine learning approach. The proposed model uses logistic regression with word vectors as its features. The model classifies the public’s reactions, represented by tweets crawled using Spark Streaming, into sentiment classes, i.e., positive, negative, and neutral. In addition, named entity recognition analysis was conducted to find whose rights, besides George Floyd’s, the public has fought for. SparkNLP is used to build the logistic regression model and to perform sentiment analysis and named entity recognition. This study finds that most of the public tweets had a negative tone, directed at the Floyd incident specifically and at violence against Black people in general. Another finding is that the campaign fought not only for George Floyd but also for other victims such as Rayshard Brooks, Dominique Fells, and Eric Garner.
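To make "logistic regression with word vectors as its features" concrete, the following is a rough sketch only. It does not use SparkNLP or Spark Streaming, it covers two sentiment classes rather than the three in the study, and the 2-dimensional word vectors, the example texts, and the names `embed`, `train`, and `predict` are all invented for illustration.

```python
import math

# Hypothetical 2-dimensional word vectors, invented for illustration only.
WORD_VECTORS = {
    "justice": (0.9, 0.1),
    "peace": (0.8, 0.2),
    "hope": (0.7, 0.1),
    "violence": (0.1, 0.9),
    "brutality": (0.2, 0.8),
    "killed": (0.1, 0.7),
}


def embed(text):
    """Average the vectors of known words; unknown words are skipped."""
    vectors = [WORD_VECTORS[w] for w in text.lower().split() if w in WORD_VECTORS]
    if not vectors:
        return (0.0, 0.0)
    return tuple(sum(col) / len(vectors) for col in zip(*vectors))


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, lr=0.5, epochs=500):
    """Binary logistic regression fitted by batch gradient descent."""
    weights, bias = [0.0, 0.0], 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for x, y in zip(samples, labels):
            # Gradient of the log loss with respect to the linear score.
            error = sigmoid(weights[0] * x[0] + weights[1] * x[1] + bias) - y
            grad_w[0] += error * x[0]
            grad_w[1] += error * x[1]
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def predict(model, text):
    """Label a text as positive when the predicted probability is at least 0.5."""
    weights, bias = model
    x = embed(text)
    score = sigmoid(weights[0] * x[0] + weights[1] * x[1] + bias)
    return "positive" if score >= 0.5 else "negative"


# Invented training texts with labels: 1 = positive, 0 = negative.
texts = ["justice and peace", "hope for justice", "police violence", "brutality killed"]
targets = [1, 1, 0, 0]

model = train([embed(t) for t in texts], targets)
```

Extending this to the three classes named in the abstract would require a multinomial (softmax) formulation or a one-vs-rest scheme, and real word vectors would come from a pretrained embedding model rather than a hand-written table.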