Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which originated in Wuhan, China, has since spread throughout the world and is affecting millions of people. When a novel virus outbreak occurs, it is crucial to determine quickly whether the epidemic is caused by the novel virus or by a well-known virus. We propose a deep learning algorithm that combines a convolutional neural network (CNN) with a bi-directional long short-term memory (Bi-LSTM) neural network to classify SARS-CoV-2 amongst coronaviruses. In addition, we classify whether or not a genome sequence contains candidate regulatory motifs. Regulatory motifs are short sequences to which transcription factors bind, and transcription factors regulate the expression of genes. The experimental results show that, at peak performance, the proposed CNN-Bi-LSTM model achieves a classification accuracy of 99.95%, an area under the receiver operating characteristic curve (AUC ROC) of 100.00%, a specificity of 99.97%, a sensitivity of 99.97%, a Cohen's Kappa of 0.9978, and a Matthews Correlation Coefficient (MCC) of 0.9978 for the classification of SARS-CoV-2 amongst coronaviruses. The CNN-Bi-LSTM also correctly detects whether a sequence has candidate regulatory motifs or binding sites, with a classification accuracy of 99.76%, an AUC ROC of 100.00%, a specificity of 99.76%, a sensitivity of 99.76%, an MCC of 0.9980, and a Cohen's Kappa of 0.9970 at peak performance. These results are encouraging enough to recognise deep learning algorithms as alternative avenues for detecting SARS-CoV-2 and for detecting regulatory motifs in SARS-CoV-2 genes.
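For readers less familiar with the metrics reported above, the following is a minimal sketch of how accuracy, sensitivity, specificity, MCC, and Cohen's Kappa can be computed from a binary confusion matrix. It is an illustration only, not the authors' evaluation code; the function name `binary_metrics` and the example counts are assumptions made here, and the abstract does not report its confusion matrices.

```python
import math


def _ratio(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Compute common binary classification metrics from confusion-matrix counts.

    tp, fp, tn, fn are the counts of true positives, false positives,
    true negatives, and false negatives (hypothetical inputs for illustration).
    """
    n = tp + fp + tn + fn
    accuracy = _ratio(tp + tn, n)
    sensitivity = _ratio(tp, tp + fn)   # true positive rate
    specificity = _ratio(tn, tn + fp)   # true negative rate

    # Matthews Correlation Coefficient
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = _ratio(tp * tn - fp * fn, mcc_den)

    # Cohen's Kappa: observed agreement corrected for chance agreement
    p_observed = accuracy
    p_expected = _ratio((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp), n * n)
    kappa = _ratio(p_observed - p_expected, 1.0 - p_expected)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Made-up counts, chosen only to exercise the formulas.
    for name, value in binary_metrics(tp=40, fp=5, tn=45, fn=10).items():
        print(f"{name}: {value:.4f}")
```

MCC and Cohen's Kappa both correct for agreement expected by chance, which is why they are often reported alongside accuracy when classes may be imbalanced.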
Following the declaration by the World Health Organisation (WHO) on 11 March 2020 that the global COVID-19 outbreak had become a pandemic, South Africa implemented a full lockdown for 21 days from 27 March 2020. The full lockdown was implemented after the publication of the National Disaster Regulations (NDR) gazette on 18 March 2020. The regulations included lockdowns, public health measures, movement restrictions, social distancing measures, and social and economic measures. We developed a hybrid model that consists of a long short-term memory autoencoder (LSTMAE) and the kernel quantile estimator (KQE) algorithm to detect change-points. Thereafter, we utilised Bayesian structural time series models (BSTSMs) to estimate the causal effect of the lockdown measures. The LSTMAE and KQE successfully detected the change-point that resulted from the full lockdown imposed on 27 March 2020. Additionally, we quantified the causal effect of the full lockdown measure on population mobility in residential places, workplaces, transit stations, parks, grocery and pharmacy, and retail and recreation. In relative terms, population mobility at grocery and pharmacy places decreased significantly, with a relative effect of −17,137.04% (p-value = 0.001 < 0.05). Population mobility at transit stations, retail and recreation, workplaces, parks, and residential places also decreased significantly, with relative effects of −998.59% (p-value = 0.001 < 0.05), −1277.36% (p-value = 0.001 < 0.05), −2175.86% (p-value = 0.001 < 0.05), −370.00% (p-value = 0.001 < 0.05), and −22.73% (p-value = 0.001 < 0.05), respectively. Therefore, the full Level 5 lockdown imposed on 27 March 2020 had a causal effect on population mobility in these categories of places.
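As a rough illustration of what a relative effect measures, the sketch below compares observed post-intervention values with a counterfactual prediction (what the model expects had no intervention occurred). It assumes the common convention of dividing the average effect by the average counterfactual; the abstract does not state the exact formula used, and the function name `relative_effect` and the numbers are hypothetical.

```python
def relative_effect(observed, counterfactual):
    """Average effect divided by the average counterfactual prediction.

    observed:        post-intervention values actually recorded
    counterfactual:  model predictions for the same period without intervention
    Both are hypothetical inputs; a real analysis would take the
    counterfactual from a fitted time series model.
    """
    if not observed or len(observed) != len(counterfactual):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_observed = sum(observed) / len(observed)
    mean_counterfactual = sum(counterfactual) / len(counterfactual)
    if mean_counterfactual == 0:
        raise ZeroDivisionError("relative effect undefined for a zero baseline")
    return (mean_observed - mean_counterfactual) / mean_counterfactual


if __name__ == "__main__":
    # Made-up series expressed as percent change from a baseline, so values
    # may be negative and the counterfactual may sit close to zero.
    observed = [-40.0, -42.0, -38.0]
    counterfactual = [2.0, 2.0, 2.0]
    print(f"{relative_effect(observed, counterfactual):.1%}")
```

Because the denominator is the counterfactual mean, the magnitude of a relative effect can exceed 100% whenever that mean is small compared with the gap between observed and predicted values. Whether this explains the magnitudes reported above is not stated in the abstract.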
Logistic regression is a popular method for estimating the propensity scores that are used to estimate causal effects in observational studies. We examine the use of deep learning models, namely the deep neural network (DNN), PropensityNet (PN), convolutional neural network (CNN), and convolutional neural network-long short-term memory network (CNN-LSTM), to estimate propensity scores and evaluate causal inference. Deep learning models, unlike logistic regression, do not depend on assumptions regarding (i) how variables are selected, (ii) specification of the correct functional form, (iii) statistical distributions of the variables, and (iv) how interactions are specified. If these assumptions are not met when using logistic regression, one may obtain biased estimates of treatment effects because covariate balance is not achieved. We conducted studies using simulated data with different sample sizes (N = 500, N = 1000, N = 2000), 15 covariates, a continuous outcome, and a binary exposure. These data were used in seven scenarios that differed in the degree of nonlinearity and non-additivity in the associations between the exposure and covariates. The estimation of propensity scores was treated as a classification task, and performance metrics that included the classification accuracy, the area under the receiver operating characteristic curve (AUCROC), the covariate balance, the standard error, the absolute bias, and the 95% confidence interval coverage were evaluated for each model. Overall, CNN and CNN-LSTM achieved good results for covariate balance, classification accuracy, AUCROC, and Cohen's Kappa. Logistic regression provided substantially better bias reduction, but it had subpar performance based on classification accuracy, AUCROC, Cohen's Kappa, and 95% confidence interval coverage. The results suggest that deep learning methods, especially CNN, may be useful for estimating propensity scores that are used to estimate causal effects.
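To make the notion of covariate balance concrete, here is a minimal sketch that takes already-estimated propensity scores, forms inverse probability weights, and computes a weighted standardized mean difference for one covariate. Inverse probability weighting is an assumption chosen for illustration; the abstract does not say how the propensity scores were applied, and the names `ipw_weights` and `standardized_mean_difference` are hypothetical.

```python
import math


def ipw_weights(treatment, propensity):
    """Inverse probability weights: 1/e for exposed, 1/(1-e) for unexposed."""
    weights = []
    for t, e in zip(treatment, propensity):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        weights.append(1.0 / e if t == 1 else 1.0 / (1.0 - e))
    return weights


def _weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def standardized_mean_difference(x, treatment, weights=None):
    """(Weighted) difference in covariate means divided by the pooled SD.

    Values near zero suggest the covariate is balanced between groups.
    """
    if weights is None:
        weights = [1.0] * len(x)
    stats = {}
    for group in (1, 0):
        xs = [xi for xi, t in zip(x, treatment) if t == group]
        ws = [wi for wi, t in zip(weights, treatment) if t == group]
        mean = _weighted_mean(xs, ws)
        var = _weighted_mean([(xi - mean) ** 2 for xi in xs], ws)
        stats[group] = (mean, var)
    pooled_sd = math.sqrt((stats[1][1] + stats[0][1]) / 2.0)
    if pooled_sd == 0:
        return 0.0
    return (stats[1][0] - stats[0][0]) / pooled_sd


if __name__ == "__main__":
    # Made-up covariate, exposure, and propensity scores.
    x = [1.0, 2.0, 3.0, 4.0]
    treatment = [1, 1, 0, 0]
    propensity = [0.8, 0.6, 0.4, 0.2]
    w = ipw_weights(treatment, propensity)
    print("unweighted SMD:", standardized_mean_difference(x, treatment))
    print("weighted SMD:  ", standardized_mean_difference(x, treatment, w))
```

In practice the propensity scores would come from a fitted model (logistic regression or one of the deep learning models named above), and balance would be checked for every covariate rather than a single one.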