The COVID-19 pandemic presented enormous data challenges in the United States. Policy makers, epidemiological modelers, and health researchers all require up-to-date data on the pandemic and relevant public behavior, ideally at fine spatial and temporal resolution. The COVIDcast API is our attempt to fill this need: operational since April 2020, it provides open access to both traditional public health surveillance signals (cases, deaths, and hospitalizations) and many auxiliary indicators of COVID-19 activity, such as signals extracted from deidentified medical claims data, massive online surveys, cell phone mobility data, and internet search trends. These are available at a fine geographic resolution (mostly at the county level) and are updated daily. The COVIDcast API also tracks all revisions to historical data, allowing modelers to account for the frequent revisions and backfill that are common for many public health data sources. All of the data are available in a common format through the API and accompanying R and Python software packages. This paper describes the data sources and signals, and provides examples demonstrating that the auxiliary signals in the COVIDcast API present information relevant to tracking COVID-19 activity, augmenting traditional public health reporting and empowering research and decision-making.
Unfolding is an ill-posed inverse problem in particle
physics aiming to infer a true particle-level spectrum from smeared
detector-level data. For computational and practical reasons, these
spaces are typically discretized using histograms, and the smearing
is modeled through a response matrix corresponding to a discretized
smearing kernel of the particle detector. This response matrix
depends on the unknown shape of the true spectrum, leading to a
fundamental systematic uncertainty in the unfolding problem. To
handle the ill-posed nature of the problem, common approaches
regularize the problem either directly via methods such as Tikhonov
regularization, or implicitly by using wide bins in the true space
that match the resolution of the detector. Unfortunately, both of
these methods lead to a non-trivial bias in the unfolded estimator,
thereby hampering frequentist coverage guarantees for confidence
intervals constructed from these methods. We propose two new
approaches to addressing the bias in the wide-bin setting through
methods called One-at-a-time Strict Bounds (OSB) and Prior-Optimized
(PO) intervals. The OSB intervals are a bin-wise modification of an
existing guaranteed-coverage procedure, while the PO intervals are
based on a decision-theoretic view of the problem. Importantly, both
approaches provide well-calibrated frequentist confidence intervals
even in constrained and rank-deficient settings. These methods are
built upon a more general answer to the wide-bin bias problem,
involving unfolding with fine bins first, followed by constructing
confidence intervals for linear functionals of the fine-bin
counts. We test and compare these methods to other available
methodologies in a wide-bin deconvolution example and a realistic
particle physics simulation of unfolding a steeply falling particle
spectrum.
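
The dependence of a wide-bin response matrix on the shape of the true spectrum can be illustrated with a small sketch. This is not the OSB or PO procedure. It is a toy setup under stated assumptions: a Gaussian smearing kernel with no efficiency loss, eight fine bins collapsed into two wide bins, and two hypothetical spectra (one steeply falling, one flat). All function and variable names are illustrative. The sketch shows that the wide-bin forward model reproduces the fine-bin smearing exactly only when the response matrix is built from the correct within-bin shape.

```python
import math


def fine_response(n_bins, sigma):
    """Fine-bin smearing matrix K[i][j] = P(observed bin i | true bin j).

    Assumption: Gaussian kernel, columns normalized to 1 (no efficiency loss).
    """
    K = [[math.exp(-0.5 * ((i - j) / sigma) ** 2) for j in range(n_bins)]
         for i in range(n_bins)]
    for j in range(n_bins):
        col_sum = sum(K[i][j] for i in range(n_bins))
        for i in range(n_bins):
            K[i][j] /= col_sum
    return K


def wide_response(K, assumed_spectrum, width):
    """Collapse fine true bins into wide ones, weighting the fine columns
    by an assumed spectrum shape inside each wide bin."""
    n_obs = len(K)
    n_wide = len(assumed_spectrum) // width
    R = [[0.0] * n_wide for _ in range(n_obs)]
    for w in range(n_wide):
        cols = range(w * width, (w + 1) * width)
        total = sum(assumed_spectrum[j] for j in cols)
        for i in range(n_obs):
            R[i][w] = sum(K[i][j] * assumed_spectrum[j] for j in cols) / total
    return R


def aggregate(spectrum, width):
    """Wide-bin counts: a linear functional (block sums) of fine-bin counts."""
    return [sum(spectrum[w * width:(w + 1) * width])
            for w in range(len(spectrum) // width)]


def smear(M, v):
    """Matrix-vector product: expected detector-level counts."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


WIDTH = 4
K = fine_response(8, sigma=1.0)
falling = [1000.0 * math.exp(-0.5 * j) for j in range(8)]  # hypothetical truth
flat = [100.0] * 8                                         # wrong shape guess

R_true = wide_response(K, falling, WIDTH)  # built from the correct shape
R_flat = wide_response(K, flat, WIDTH)     # built from the wrong shape

wide_truth = aggregate(falling, WIDTH)
exact = smear(K, falling)             # fine-bin forward model
via_true = smear(R_true, wide_truth)  # wide-bin model, correct shape
via_flat = smear(R_flat, wide_truth)  # wide-bin model, wrong shape

print("max error, correct shape:", max(abs(a - b) for a, b in zip(exact, via_true)))
print("max error, flat shape:   ", max(abs(a - b) for a, b in zip(exact, via_flat)))
```

In this toy setting, the mismatch from the flat-shape response matrix is a forward-model error that any wide-bin unfolding would inherit. The `aggregate` function is an example of a linear functional of the fine-bin counts, the kind of quantity for which intervals are constructed after unfolding with fine bins.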
This has been a long and arduous journey, but nevertheless a worthwhile life experience because of the many great professors at SJSU and my beloved friends. I am grateful for this opportunity to thank my advisor, Dr. Wu, who has been a constant source of support not only during the thesis but throughout my whole master's degree. This work would not have been possible without his trust and his belief in my ability to do good research. I also want to thank Dr. Potika and Dr. Orang for agreeing to serve on my committee and for their valuable input, without which the project would not have been successful. Finally, I would like to thank my parents, my sister Priya, and my beloved friend Shweta for supporting and encouraging me throughout my graduate studies.