Today’s societies are connected to a degree never seen before. The COVID-19 pandemic has exposed the vulnerabilities of such an unprecedentedly connected world. As of 19 November 2020, over 56 million people have been infected, with nearly 1.35 million deaths, and the numbers are growing. State-of-the-art social media analytics for understanding COVID-19-related phenomena remain limited and call for many more studies. This paper proposes a software tool comprising unsupervised Latent Dirichlet Allocation (LDA) machine learning and other methods for the analysis of Twitter data in Arabic, with the aim of detecting government pandemic measures and public concerns during the COVID-19 pandemic. The tool is described in detail, including its architecture, five software components, and algorithms. Using the tool, we collect a dataset comprising 14 million tweets from the Kingdom of Saudi Arabia (KSA) for the period 1 February 2020 to 1 June 2020. We detect 15 government pandemic measures and public concerns and six macro-concerns (economic sustainability, social sustainability, etc.), and formulate their information-structural, temporal, and spatio-temporal relationships. For example, we are able to detect the timewise progression of events: the public discussions on COVID-19 cases in mid-March, the first curfew on 22 March, financial loan incentives on 22 March, the increased quarantine discussions during March–April, the discussions on reduced mobility levels from 24 March onwards, the blood donation shortfall from late March onwards, the government’s 9 billion SAR (Saudi Riyal) salary incentives on 3 April, the lifting of the ban on five daily prayers in mosques on 26 May, and finally the return-to-normal government measures on 29 May 2020.
These findings show the effectiveness of Twitter in detecting important events, government measures, public concerns, and other information in both time and space with no prior knowledge about them.
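The abstract above names unsupervised LDA as the core method but gives no implementation details. As a rough illustration only, the sketch below fits LDA with collapsed Gibbs sampling in plain Python. The function names (`lda_gibbs`, `top_words`), the hyperparameters, and the tiny English token lists are all hypothetical stand-ins and are not taken from the tool or its Arabic dataset.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling (illustrative, not the paper's code).

    docs is a list of token lists. Returns per-document topic counts and
    per-topic word counts after the final sweep.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]          # tokens of doc d in topic k
    topic_word = [Counter() for _ in range(num_topics)]   # count of word w in topic k
    topic_total = [0] * num_topics                        # tokens assigned to topic k
    assignments = []

    # Random initial topic for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic (up to a constant).
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return doc_topic, topic_word


def top_words(topic_word, n=3):
    """Return the n most frequent words currently assigned to each topic."""
    return [[w for w, c in counts.most_common(n) if c > 0] for counts in topic_word]


# Hypothetical, already-tokenised stand-ins for preprocessed tweets.
docs = [
    ["curfew", "night", "curfew", "police"],
    ["loan", "bank", "incentive", "loan"],
    ["curfew", "police", "night"],
    ["bank", "salary", "incentive"],
]
doc_topic, topic_word = lda_gibbs(docs, num_topics=2)
print(top_words(topic_word))
```

On a corpus of millions of tweets, a distributed or library implementation would be the practical choice; the sampler here is only meant to make the topic-assignment step concrete.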
Digital societies can be characterized by their increasing desire to express themselves and interact with others. This is being realized through digital platforms such as social media, which have increasingly become convenient and inexpensive sensors compared to physical sensors in many sectors of smart societies. One such major sector is road transportation, which is the backbone of modern economies and accounts for 1.25 million deaths and 50 million injuries worldwide each year. The state of the art in big data-enabled social media analytics for transportation-related studies is limited. This paper brings a range of technologies together to detect road traffic-related events using big data and distributed machine learning. The most specific contribution of this research is an automatic labelling method for machine learning-based traffic-related event detection from Twitter data in the Arabic language. The proposed method has been implemented in a software tool called Iktishaf+ (an Arabic word meaning discovery) that is able to detect traffic events automatically from tweets in the Arabic language using distributed machine learning over Apache Spark. The tool is built using nine components and a range of technologies including Apache Spark, Parquet, and MongoDB. Iktishaf+ uses a light stemmer for the Arabic language developed by us. We also use a location extractor, developed by us, that allows us to extract and visualize spatio-temporal information about the detected events. The data used in this work comprises 33.5 million tweets collected from Saudi Arabia using the Twitter API. Using support vector machines, naïve Bayes, and logistic regression-based classifiers, we are able to detect and validate several real events in Saudi Arabia without prior knowledge, including a fire in Jeddah, rains in Makkah, and an accident in Riyadh. The findings show the effectiveness of Twitter in detecting important events with no prior knowledge about them.
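Of the three classifier families named in the abstract above, naïve Bayes is the simplest to show in a few lines. The following is a minimal single-machine sketch of a multinomial naïve Bayes text classifier with add-one smoothing. It is not the Iktishaf+ implementation, which runs over Apache Spark; the class name `NaiveBayesText`, the labels, and the English example tokens are hypothetical stand-ins for stemmed Arabic tweets.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesText:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)        # documents per class
        self.word_counts = defaultdict(Counter)    # word frequencies per class
        for tokens, label in zip(docs, labels):
            self.word_counts[label].update(tokens)
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, tokens):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus smoothed log likelihood of each known token.
            score = math.log(n_docs / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for w in tokens:
                if w in self.vocab:  # words unseen in training are ignored
                    score += math.log((self.word_counts[label][w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labelled examples (stand-ins for automatically labelled tweets).
train_docs = [
    ["accident", "on", "ring", "road"],
    ["heavy", "traffic", "jam", "road"],
    ["road", "closed", "after", "fire"],
    ["great", "coffee", "this", "morning"],
    ["football", "match", "tonight"],
    ["happy", "birthday", "friend"],
]
train_labels = ["traffic", "traffic", "traffic", "other", "other", "other"]

model = NaiveBayesText().fit(train_docs, train_labels)
print(model.predict(["accident", "road", "closed"]))  # prints "traffic"
print(model.predict(["coffee", "tonight"]))           # prints "other"
```

Working in log space avoids numerical underflow when tweets contain many tokens, and the add-one smoothing keeps a single unseen word-class pair from zeroing out a class.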