The COVID-19 pandemic has caused problems across many sectors of human life. After more than a year of the pandemic, many studies have investigated technological innovations and applications to combat the virus, which has claimed many lives, and the use of Big Data technology to mitigate the pandemic's threats has accelerated. This survey therefore explores Big Data technology research on fighting the pandemic. The relevance of Big Data technology is analyzed, and its contributions to five main areas are highlighted: healthcare, social life, government policy, business and management, and the environment. Analytical techniques from machine learning, deep learning, statistics, and mathematics for addressing pandemic-related issues are discussed. The data sources used in previous studies are also presented; they comprise government official data, institutional service data, IoT-generated data, online media, and open data. This study thus presents the role of Big Data technologies in advancing COVID-19 research, provides insight into the current state of knowledge in the domain, and offers references for extending existing work or starting new studies.
The influence of social media in disseminating information, especially during the COVID-19 pandemic, can be observed over time intervals, so the probability of the number of tweets posted by netizens on social media can be modeled. The nonhomogeneous Poisson process (NHPP) is a Poisson process whose rate parameter depends on time, with mutually independent, exponentially distributed interarrival times having unequal parameters. In the initial state, the probability that no event occurs is one and the probability that an event occurs is zero. The NHPP is used in this paper to predict and count the number of tweet posts containing the keywords "coronavirus" and "COVID-19" over fixed daily time intervals. Tweets posted in one interval do not affect those in the next, and the number of tweets differs between intervals. The dataset was obtained by crawling COVID-19 tweets three times a day, 20 minutes per crawl, for 13 days, giving 39 time intervals. The study yields predictions and calculated probabilities of tweet counts, reflecting netizens' tendency to post about the COVID-19 pandemic situation.
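As a minimal sketch of the NHPP model described above (not the authors' code; the intensity function used here is a hypothetical example), the probability of observing k tweets in an interval [a, b] is Poisson with mean Λ = ∫λ(t)dt:

```python
import math

def nhpp_prob(k, lam, a, b, steps=10_000):
    """P(N(b) - N(a) = k) for an NHPP with time-dependent intensity
    lam(t), using a trapezoidal estimate of the mean Lambda."""
    h = (b - a) / steps
    mean = sum((lam(a + i * h) + lam(a + (i + 1) * h)) / 2 * h
               for i in range(steps))
    return math.exp(-mean) * mean ** k / math.factorial(k)

# Hypothetical intensity: tweet rate rising linearly within a 20-minute crawl.
lam = lambda t: 0.05 + 0.005 * t

# Probability of exactly 0 and exactly 5 tweets in minutes 0..20.
p0 = nhpp_prob(0, lam, 0.0, 20.0)
p5 = nhpp_prob(5, lam, 0.0, 20.0)
```

Note the initial-state property stated above holds here: with a zero-length interval the mean is 0, so the probability of no event is 1 and of any event is 0.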
Solar irradiance must be estimated to manage power consumption and save energy; this demand is met by providing prediction facilities. Time-series data pose complex problems. A multilayer perceptron (MLP) and an autoregressive integrated moving average (ARIMA) model with multivariate input were used to predict solar irradiance. The dataset was collected from a solar irradiance sensor by an online monitoring station at 10-minute intervals over 18 months. Predictions were run with t, t-2, and t-6 data inputs, where t denotes the current day, to obtain the (t+1) predictive model. The ARIMA model was optimized with the (t-6) input, ARIMA(1,1,2) achieving a minimum RMSE of 43.91 W/m², whereas the MLP model, with a single layer of ten neurons and a ReLU activation function, achieved a minimum RMSE of 8.68 W/m² with the (t) input. The deep learning model outperformed the statistical model in this experiment. RMSE, MSE, MAE, MAPE, and R² were used to evaluate model performance.
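The evaluation metrics named above (RMSE, MSE, MAE, MAPE, R²) can be sketched in plain Python; this is a generic illustration with made-up values, not the authors' implementation:

```python
import math

def mse(y, yhat):
    return sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y)

def rmse(y, yhat):
    return math.sqrt(mse(y, yhat))

def mae(y, yhat):
    return sum(abs(a - b) for a, b in zip(y, yhat)) / len(y)

def mape(y, yhat):
    # Percentage error; assumes no true value is zero.
    return 100 * sum(abs((a - b) / a) for a, b in zip(y, yhat)) / len(y)

def r2(y, yhat):
    mean_y = sum(y) / len(y)
    ss_res = sum((a - b) ** 2 for a, b in zip(y, yhat))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1 - ss_res / ss_tot

# Toy irradiance values (W/m^2) against hypothetical predictions.
y_true = [100.0, 200.0, 300.0, 400.0]
y_pred = [110.0, 190.0, 310.0, 390.0]
```

R² close to 1 and a small RMSE indicate a good fit, which is how the MLP's 8.68 W/m² is judged better than ARIMA's 43.91 W/m².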
Sentiment analysis of short texts is challenging because of their limited context. It becomes harder still in a low-resource language such as Bahasa Indonesia. However, various deep learning techniques can yield fairly good accuracy. This paper explores several deep learning methods, namely the multilayer perceptron (MLP), the convolutional neural network (CNN), and long short-term memory (LSTM), and builds combinations of these three architectures to harness the strengths of each model. The MLP takes the previous model's output to produce the classification; the CNN layer extracts word feature vectors from text sequences; and the LSTM repeatedly selects or discards features in a sequence based on their context. These advantages are useful across domain datasets. Experiments on sentiment analysis of short texts in Bahasa Indonesia show that hybrid models achieve better performance and that the same architecture can be applied directly to another domain-specific dataset.
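To illustrate the CNN step described above, which extracts word-level features from a text sequence, here is a minimal pure-Python 1D convolution over a sequence of word vectors (the embeddings and filter weights are hypothetical; a real model would learn them and feed the resulting feature sequence into the LSTM):

```python
def conv1d(sequence, filt):
    """Slide a filter of width k over a sequence of word vectors,
    producing one feature value per window (valid padding)."""
    k = len(filt)
    dim = len(sequence[0])
    out = []
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        # Dot product of the flattened window with the flattened filter.
        out.append(sum(window[j][d] * filt[j][d]
                       for j in range(k) for d in range(dim)))
    return out

# Four 3-dimensional word embeddings (made up for illustration).
embeddings = [[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [1.0, 1.0, 0.0]]
# One filter of width 2 (also made up).
filt = [[1.0, 1.0, 1.0],
        [1.0, 1.0, 1.0]]
features = conv1d(embeddings, filt)  # one value per 2-word window
```

A hybrid model would stack many such filters, pass the feature sequences through an LSTM, and finish with an MLP classification layer.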
Weather prediction is usually performed as a reference for planning future activities. The prediction considers several parameters, such as temperature, air pressure, humidity, wind, and rainfall. In this study, temperature, as one of the weather parameters, is predicted using time-series data from January 2015 to December 2017, obtained from the Lembaga Ilmu Pengetahuan Indonesia (LIPI) weather measurement station in Muaro Anai, Padang. The predictions were carried out using a Convolutional Neural Network (CNN), a Multilayer Perceptron (MLP), and a hybrid CNN-MLP method. The parameters used in the CNN method (the number of filters and the kernel size) and in the MLP method (the number of hidden layers and the number of neurons) were selected through a hyperparameter tuning procedure. After obtaining the best parameters for both methods, their performance was evaluated by calculating the Root Mean Square Error (RMSE) and R². Based on the results, the prediction by the CNN is more accurate than the other methods, as indicated by the CNN achieving the highest R² value.
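The hyperparameter tuning procedure mentioned above can be sketched as an exhaustive grid search; the parameter grid and the scoring function below are hypothetical stand-ins for training and validating the actual CNN:

```python
from itertools import product

def grid_search(param_grid, score_fn):
    """Try every combination in param_grid and keep the one with the
    lowest validation score (e.g. RMSE)."""
    best_params, best_score = None, float("inf")
    for combo in product(*param_grid.values()):
        params = dict(zip(param_grid.keys(), combo))
        score = score_fn(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical CNN grid: number of filters and kernel size.
grid = {"filters": [16, 32, 64], "kernel_size": [2, 3, 5]}

# Stand-in for "train the model and return validation RMSE".
def fake_rmse(p):
    return abs(p["filters"] - 32) / 10 + abs(p["kernel_size"] - 3)

best, score = grid_search(grid, fake_rmse)
```

In the study itself, the scoring function would be a full train-and-validate cycle, and the same loop would tune the MLP's hidden-layer and neuron counts.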