Chromosome arm aneuploidies (CAAs) are pervasive in cancers. However, how they affect cancer development, prognosis and treatment remains largely unknown. Here, we analyse CAA profiles of 23,427 tumours, identifying aspects of tumour evolution including probable orders in which CAAs occur and CAAs predicting tissue-specific metastasis. Both haematological and solid cancers initially gain chromosome arms, while only solid cancers subsequently preferentially lose multiple arms. 72 CAAs and 88 synergistically co-occurring CAA pairs multivariately predict good or poor survival for 58% of 6977 patients, with negligible impact of whole-genome doubling. Additionally, machine learning identifies 31 CAAs that robustly alter response to 56 chemotherapeutic drugs across cell lines representing 17 cancer types. We also uncover 1024 potential synthetic lethal pharmacogenomic interactions. Notably, in predicting drug response, CAAs substantially outperform mutations and focal deletions/amplifications combined. Thus, CAAs predict cancer prognosis, shape tumour evolution, metastasis and drug response, and may advance precision oncology.
How do fine modifications to social distancing measures really affect COVID-19 spread? A major problem for health authorities is that we do not know. In an imaginary world, we might develop a harmless biological virus that spreads just like COVID-19, but is traceable via a cheap and reliable diagnosis. By introducing such an imaginary virus into the population and observing how it spreads, we would have a way of learning about COVID-19 because the benign virus would respond to population behaviour and social distancing measures in a similar manner. Such a benign biological virus does not exist. Instead, we propose a safe and privacy-preserving digital alternative. Our solution is to mimic the benign virus by passing virtual tokens between electronic devices when they move into close proximity. As Bluetooth transmission is the most likely method used for such inter-device communication, and as our suggested "virtual viruses" do not harm individuals' software or intrude on privacy, we call these Safe Blues. In contrast to many app-based methods that inform individuals or governments about actual COVID-19 patients or hazards, Safe Blues does not provide information about individuals' locations or contacts. Hence the privacy concerns associated with Safe Blues are much lower than those of other methods. However, from the point of view of data collection, Safe Blues has two major advantages:
- Data about the spread of Safe Blues is uploaded to a central server in real time, which can give authorities a more up-to-date picture in comparison to actual COVID-19 data, which is only available retrospectively.
- Sampling of Safe Blues data is not biased by being applied only to people who have shown symptoms or who have come into contact with known positive cases.
These features mean that there would be real statistical value in introducing Safe Blues. In the medium term and end game of COVID-19, information from Safe Blues could help health authorities make informed decisions with respect to social distancing and other measures. In this paper we outline the general principles of Safe Blues and we illustrate how Safe Blues data together with neural networks may be used to infer characteristics of the progress of the COVID-19 pandemic in real time. Further information is on the Safe Blues website: https://safeblues.org/.
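The token-passing idea can be illustrated with a toy simulation. The sketch below is only a rough illustration, not the Safe Blues protocol: the function `simulate_strand`, the random-contact model, and every parameter (transmission probability, how long a device carries a token, contacts per step) are hypothetical choices made for this example. It reports only aggregate counts of token-carrying devices, never locations or contact pairs.

```python
import random


def simulate_strand(n_devices, contacts_per_step, p_transmit,
                    active_steps, n_steps, seed=0):
    """Toy spread of one virtual token ("strand") over random proximity contacts.

    All parameters are hypothetical; returns the number of devices carrying
    the token after each step (an aggregate count only).
    """
    rng = random.Random(seed)
    remaining = [0] * n_devices      # steps left carrying the token (0 = none)
    ever_held = [False] * n_devices  # a device picks up a given strand at most once

    first = rng.randrange(n_devices)  # seed the strand on one device
    remaining[first] = active_steps
    ever_held[first] = True

    counts = []
    for _ in range(n_steps):
        newly = set()
        for _ in range(contacts_per_step):
            a, b = rng.sample(range(n_devices), 2)  # two devices come into proximity
            for src, dst in ((a, b), (b, a)):
                if (remaining[src] > 0 and not ever_held[dst]
                        and rng.random() < p_transmit):
                    newly.add(dst)
        # Tokens age by one step, then newly passed tokens become active.
        for d in range(n_devices):
            if remaining[d] > 0:
                remaining[d] -= 1
        for d in newly:
            remaining[d] = active_steps
            ever_held[d] = True
        counts.append(sum(1 for r in remaining if r > 0))
    return counts


counts = simulate_strand(n_devices=500, contacts_per_step=300,
                         p_transmit=0.3, active_steps=10, n_steps=60, seed=1)
print(counts)
```

Lowering `contacts_per_step` mimics stricter social distancing, and the change shows up in the aggregate counts without any per-device reporting.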
We consider the problem of best subset selection in linear regression, where the goal is to find, for every model size k, the subset of k features that best fits the response. This is particularly challenging when the total number of available features is very large compared to the number of data samples. We propose COMBSS, a novel continuous-optimization-based method that identifies a solution path, a small set of models of varying size, that consists of candidates for the best subset in linear regression. COMBSS turns out to be very fast, making subset selection possible when the number of features is well in excess of thousands. Simulation results are presented to highlight the performance of COMBSS in comparison to existing popular methods such as Forward Stepwise, the Lasso and Mixed-Integer Optimization. Because of the outstanding overall performance, framing the best subset selection challenge as a continuous optimization problem opens new research directions for feature extraction for a large variety of regression models.
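For very small problems, the best-subset target can be made concrete by exhaustive enumeration. The sketch below is not COMBSS; it is the brute-force baseline, whose cost grows as 2^p and which therefore motivates faster methods. The function name `best_subsets` and the synthetic data are assumptions made for illustration.

```python
from itertools import combinations

import numpy as np


def best_subsets(X, y):
    """For each size k, return (feature indices, RSS) of the lowest-RSS subset.

    Exhaustive search: feasible only for a handful of features.
    """
    n, p = X.shape
    path = {}
    for k in range(1, p + 1):
        best = None
        for subset in combinations(range(p), k):
            cols = list(subset)
            # Ordinary least squares restricted to the chosen columns.
            beta, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
            rss = float(np.sum((y - X[:, cols] @ beta) ** 2))
            if best is None or rss < best[1]:
                best = (subset, rss)
        path[k] = best
    return path


# Synthetic example: only features 1 and 4 carry signal.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 6))
y = 3.0 * X[:, 1] - 2.0 * X[:, 4] + 0.1 * rng.standard_normal(50)

path = best_subsets(X, y)
for k, (subset, rss) in path.items():
    print(k, subset, round(rss, 3))
```

With 6 features this fits 63 models; with 1,000 features the same loop would need about 2^1000 fits, which is why exhaustive search is not an option in the regime the abstract describes.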
Viral spread is a complicated function of biological properties, the environment, preventative measures such as sanitation and masks, and the rate at which individuals come within physical proximity. It is these last two elements that governments can control through social-distancing directives. However, infection measurements are almost always delayed, making real-time estimation nearly impossible. Safe Blues is one way of addressing the problem caused by this time lag via online measurements combined with machine learning methods that exploit the relationship between counts of multiple forms of the Safe Blues strands and the progress of the actual epidemic. The Safe Blues protocols and techniques have been developed together with an experimental minimal viable product, presented as an app on Android devices with a server backend. Following initial exploration via simulation experiments, we are now preparing for a university-wide experiment of Safe Blues.
Multiclass open queueing networks find wide applications in communication, computer, and fabrication networks. Steady-state performance measures associated with these networks are often of interest. Conceptually, under mild conditions, a sequence of regeneration times exists in multiclass networks, making them amenable to regenerative simulation for estimating steady-state performance measures. However, typically, identification of such a sequence in these networks is difficult. A well-known exception is when all interarrival times are exponentially distributed, where the instants corresponding to customer arrivals to an empty network constitute a sequence of regeneration times. In this article, we consider networks in which the interarrival times are generally distributed but have exponential or heavier tails. We show that these distributions can be decomposed into a mixture of sums of independent random variables such that at least one of the components is exponentially distributed. This allows an easily implementable embedded sequence of regeneration times in the underlying Markov process. We show that among all such interarrival time decompositions, the one with an exponential component that has the largest mean minimizes the asymptotic variance of the standard deviation estimator. We also show that under mild conditions on the network primitives, the regenerative mean and standard deviation estimators are consistent and satisfy a joint central limit theorem useful for constructing asymptotically valid confidence intervals. ACM Reference Format: Sarat Babu Moka and Sandeep Juneja. 2015. Regenerative simulation for queueing networks with exponential or heavier tail arrival distributions. ACM Trans. Model. Comput.
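The well-known exponential case can be sketched on the simplest possible network, a single M/M/1 station, where each arrival to an empty system starts a new regenerative cycle. This is a minimal illustration under stated assumptions, not the article's decomposition for general interarrival times: the function name, the choice of mean waiting time as the performance measure, and the parameter values are all hypothetical. The estimator is the standard ratio of cycle sums, with a confidence interval based on its central limit theorem.

```python
import math
import random


def regenerative_mean_wait(lam, mu, n_customers, seed=0):
    """Regenerative ratio estimate of the steady-state mean wait in an M/M/1 queue.

    Returns (estimate, 95% CI half-width, number of completed cycles).
    """
    rng = random.Random(seed)
    cycles = []                      # (sum of waits, number of customers) per cycle
    wait = 0.0                       # waiting time of the current customer
    cycle_sum, cycle_len = 0.0, 0
    for _ in range(n_customers):
        # A customer who does not wait arrived to an empty system: regeneration.
        if wait == 0.0 and cycle_len > 0:
            cycles.append((cycle_sum, cycle_len))
            cycle_sum, cycle_len = 0.0, 0
        cycle_sum += wait
        cycle_len += 1
        service = rng.expovariate(mu)
        interarrival = rng.expovariate(lam)
        # Lindley recursion for the next customer's waiting time.
        wait = max(0.0, wait + service - interarrival)
    # The incomplete final cycle is discarded.
    m = len(cycles)
    total_y = sum(y for y, _ in cycles)
    total_n = sum(n for _, n in cycles)
    ratio = total_y / total_n
    s2 = sum((y - ratio * n) ** 2 for y, n in cycles) / (m - 1)
    half_width = 1.96 * math.sqrt(s2 / m) / (total_n / m)
    return ratio, half_width, m


estimate, half_width, n_cycles = regenerative_mean_wait(
    lam=0.5, mu=1.0, n_customers=200_000, seed=1)
print(estimate, half_width, n_cycles)
```

For these parameters the exact steady-state mean wait in queue is ρ/(μ − λ) = 1, which gives a reference value for checking the estimate and its interval.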
This work introduces and compares approaches for estimating rare-event probabilities related to the number of edges in the random geometric graph on a Poisson point process. In the one-dimensional setting, we derive closed-form expressions for a variety of conditional probabilities related to the number of edges in the random geometric graph and develop conditional Monte Carlo algorithms for estimating rare-event probabilities on this basis. We prove rigorously a reduction in variance when compared to the crude Monte Carlo estimators and illustrate the magnitude of the improvements in a simulation study. In higher dimensions, we use conditional Monte Carlo to remove the fluctuations in the estimator coming from the randomness in the Poisson number of nodes. Finally, building on conceptual insights from large-deviations theory, we illustrate that importance sampling using a Gibbsian point process can further substantially reduce the estimation variance.
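A simplified one-dimensional illustration of conditional Monte Carlo is sketched below. It is not one of the paper's algorithms: as an assumption made for this example, the rare event is "the graph has no edges" on the unit interval, and the conditioning is on the number of nodes N, using the classical spacing result that n independent uniform points on [0, 1] have all consecutive gaps larger than r with probability max(0, 1 − (n − 1)r)^n. Function names and parameter values are hypothetical.

```python
import math

import numpy as np


def exact_no_edge_probability(lam, r):
    """Exact P(no edges): Poisson mixture of the spacing probabilities."""
    total, n = 0.0, 0
    while n < 2 or (n - 1) * r < 1.0:
        base = 1.0 if n < 2 else 1.0 - (n - 1) * r
        total += math.exp(-lam) * lam ** n / math.factorial(n) * base ** n
        n += 1
    return total


def crude_mc(lam, r, n_samples, rng):
    """Crude Monte Carlo: sample the whole point pattern and check for edges."""
    hits = 0
    for _ in range(n_samples):
        n = rng.poisson(lam)
        pts = np.sort(rng.uniform(0.0, 1.0, size=n))
        # In one dimension, no edges means every consecutive gap exceeds r.
        if n < 2 or np.all(np.diff(pts) > r):
            hits += 1
    return hits / n_samples


def conditional_mc(lam, r, n_samples, rng):
    """Conditional Monte Carlo: sample only N and average P(no edges | N)."""
    n = rng.poisson(lam, size=n_samples)
    base = np.clip(1.0 - (n - 1) * r, 0.0, None)
    return float(np.mean(base ** n))


rng = np.random.default_rng(1)
truth = exact_no_edge_probability(20.0, 0.05)
crude = crude_mc(20.0, 0.05, 20_000, rng)
cmc = conditional_mc(20.0, 0.05, 500_000, rng)
print(truth, crude, cmc)
```

Because the node locations are integrated out analytically, the conditional estimator averages a smooth function of N instead of a 0/1 indicator, so its variance can never exceed that of the crude estimator for the same number of samples.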