We analyze two communication-efficient algorithms for distributed optimization in statistical settings involving large-scale data sets. The first algorithm is a standard averaging method that distributes the N data samples evenly to m machines, performs separate minimization on each subset, and then averages the estimates. We provide a sharp analysis of this average mixture algorithm, showing that under a reasonable set of conditions, the combined parameter achieves mean-squared error (MSE) that decays as $O(N^{-1} + (N/m)^{-2})$. Whenever $m \le \sqrt{N}$, this guarantee matches the best possible rate achievable by a centralized algorithm having access to all N samples. The second algorithm is a novel method, based on an appropriate form of bootstrap subsampling. Requiring only a single round of communication, it has mean-squared error that decays as $O(N^{-1} + (N/m)^{-3})$, and so is more robust to the amount of parallelization. In addition, we show that a stochastic gradient-based method attains mean-squared error decaying as $O(N^{-1} + (N/m)^{-3/2})$, easing computation at the expense of a potentially slower MSE rate. We also provide an experimental evaluation of our methods, investigating their performance both on simulated data and on a large-scale regression problem from the internet search domain. In particular, we show that our methods can be used to efficiently solve an advertisement prediction problem from the Chinese SoSo Search Engine, which involves logistic regression with $N \approx 2.4 \times 10^{8}$ samples and $d \approx 740{,}000$ covariates.
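The averaging step described in the abstract above can be illustrated with a minimal sketch. It assumes a toy problem that is not from the paper: one-dimensional least squares through the origin, with the "machines" simulated by slicing a list in a single process. The names `local_minimizer` and `average_mixture` are hypothetical, and the bootstrap-subsampling and stochastic-gradient variants are not shown.

```python
import random

def local_minimizer(xs, ys):
    """Least-squares slope through the origin: argmin over theta of sum (y - theta*x)^2."""
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return sxy / sxx

def average_mixture(xs, ys, m):
    """Split the samples evenly across m simulated machines, solve locally, average."""
    n = len(xs) // m  # samples per machine; any remainder is dropped
    estimates = [
        local_minimizer(xs[i * n:(i + 1) * n], ys[i * n:(i + 1) * n])
        for i in range(m)
    ]
    return sum(estimates) / m

# Synthetic data (an assumption for illustration): y = theta_true * x + noise.
rng = random.Random(0)
theta_true = 2.0
N, m = 10_000, 10
xs = [rng.gauss(0.0, 1.0) for _ in range(N)]
ys = [theta_true * x + rng.gauss(0.0, 1.0) for x in xs]

theta_avgm = average_mixture(xs, ys, m)   # one round of communication: m scalars
theta_central = local_minimizer(xs, ys)   # centralized baseline on all N samples
```

Here m = 10 is well below the square root of N = 10,000, the regime in which the abstract states that averaging matches the centralized rate, so the two estimates should land close to each other.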
We consider distributed convex optimization problems that originate from sample average approximation of stochastic optimization, or from empirical risk minimization in machine learning. We assume that each machine in the distributed computing system has access to a local empirical loss function, constructed with i.i.d. data sampled from a common distribution. We propose a communication-efficient distributed algorithm to minimize the overall empirical loss, which is the average of the local empirical losses. The algorithm is based on an inexact damped Newton method, where the inexact Newton steps are computed by a distributed preconditioned conjugate gradient method. We analyze its iteration complexity and communication efficiency for minimizing self-concordant empirical loss functions, and discuss the results for distributed ridge regression, logistic regression and binary classification with a smoothed hinge loss. In a standard setting for supervised learning, the required number of communication rounds of the algorithm does not increase with the sample size, and only grows slowly with the number of machines.
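As a rough illustration of the idea in the abstract above, the following single-process sketch runs a damped Newton iteration whose steps are computed inexactly by conjugate gradient. Several simplifications are assumptions and not the authors' algorithm: the conjugate gradient solver is plain (no preconditioner), nothing is distributed, the loss is a small regularized logistic loss on made-up data, and the step size 1/(1 + delta) is the textbook damped Newton rule with delta the approximate Newton decrement. All function names are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def axpy(alpha, x, y):
    """Return alpha * x + y, elementwise."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]

def gradient(w, X, y, lam):
    """Gradient of (1/n) * sum log(1 + exp(-y_i * x_i.w)) + (lam/2) * |w|^2."""
    n = len(X)
    g = [lam * wi for wi in w]
    for xi, yi in zip(X, y):
        s = -yi / (1.0 + math.exp(yi * dot(xi, w)))
        g = axpy(s / n, xi, g)
    return g

def hessian_vec(w, X, y, lam, v):
    """Hessian-vector product of the same loss, without forming the Hessian."""
    n = len(X)
    hv = [lam * vi for vi in v]
    for xi in X:
        p = 1.0 / (1.0 + math.exp(-dot(xi, w)))
        hv = axpy(p * (1.0 - p) * dot(xi, v) / n, xi, hv)
    return hv

def conjugate_gradient(hv, g, tol=1e-20, max_iter=50):
    """Approximately solve H v = g using only Hessian-vector products."""
    v = [0.0] * len(g)
    r = list(g)          # residual g - H v, with v = 0 initially
    p = list(r)          # search direction
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs <= tol:
            break
        hp = hv(p)
        alpha = rs / dot(p, hp)
        v = axpy(alpha, p, v)
        r = axpy(-alpha, hp, r)
        rs_new = dot(r, r)
        p = axpy(rs_new / rs, p, r)   # p = r + beta * p
        rs = rs_new
    return v

def damped_newton(X, y, lam, iters=50):
    w = [0.0] * len(X[0])
    for _ in range(iters):
        g = gradient(w, X, y, lam)
        hv = lambda v, w=w: hessian_vec(w, X, y, lam, v)
        v = conjugate_gradient(hv, g)               # inexact Newton direction
        delta = math.sqrt(max(dot(v, hv(v)), 0.0))  # approximate Newton decrement
        w = axpy(-1.0 / (1.0 + delta), v, w)        # damped step
    return w

# Made-up toy data: six points in two dimensions with labels in {+1, -1}.
X = [[1.0, 2.0], [2.0, 1.0], [-1.0, -1.5], [-2.0, -0.5], [0.5, -1.0], [-0.5, 1.0]]
y = [1, 1, -1, -1, 1, -1]
lam = 0.1

w_hat = damped_newton(X, y, lam)
grad_norm = math.sqrt(dot(gradient(w_hat, X, y, lam), gradient(w_hat, X, y, lam)))
```

The solver touches the Hessian only through `hessian_vec`, which is what makes a conjugate gradient inner loop attractive when the data, and hence the Hessian, are spread over several machines.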
Light field imaging has emerged as a technology that captures richer visual information from our world. As opposed to traditional photography, which captures a 2D projection of the light in the scene by integrating over the angular domain, light fields collect radiance from rays in all directions, demultiplexing the angular information lost in conventional photography. On the one hand, this higher-dimensional representation of visual data offers powerful capabilities for scene understanding, and substantially improves performance on traditional computer vision problems such as depth sensing, post-capture refocusing, segmentation, video stabilization, and material classification. On the other hand, the high dimensionality of light fields also brings new challenges in terms of data capture, data compression, content editing and display. Taking these two elements together, research in light field image processing has become increasingly popular in the computer vision, computer graphics and signal processing communities. In this article, we present a comprehensive overview and discussion of research in this field over the past 20 years. We focus on all aspects of light field image processing, including basic light field representation and theory, acquisition, super-resolution, depth estimation, compression, editing, processing algorithms for light field display, and computer vision applications of light field data. Index Terms: Light field imaging, light field processing.
Current treatments for chronic diabetic wounds remain unsatisfactory due to the lack of ideal wound dressings that can integrate matching mechanical strength, fast self‐healability, facile dressing change, and multiple therapeutic effects into one system. In this work, benefiting from the catechol groups and therapeutic effect of epigallocatechin‐3‐gallate (EGCG, a green tea derivative), a smart hydrogel dressing can be conveniently obtained through copolymerization of acrylamide with the complex formed by EGCG and 3‐acrylamido phenylboronic acid (APBA) via boronate ester bonds. The resulting hydrogel features adequate mechanical properties, self‐healing capability, and tissue adhesiveness. Moreover, the substantial release of EGCG not only provides antioxidant, antibacterial, anti‐inflammatory, and proangiogenic effects and modulates macrophage polarization to accelerate wound healing, but also facilitates easy dressing change. This advanced hydrogel provides a facile and effective approach to diabetic chronic wound management and may be extended to the therapy of other complicated wounds.
Molecular subtyping of cancer is a critical step towards more individualized therapy and provides important biological insights into cancer heterogeneity. Although gene expression signature-based classification has been widely demonstrated to be an effective approach in the last decade, its widespread implementation has long been limited by platform differences, batch effects, and the difficulty of classifying individual patient samples. Here, we describe a novel supervised cancer classification framework, deep cancer subtype classification (DeepCC), based on deep learning of functional spectra quantifying activities of biological pathways. In two case studies on colorectal and breast cancer classification, DeepCC classifiers and DeepCC single sample predictors both achieved overall higher sensitivity, specificity, and accuracy compared with other widely used classification methods such as random forests (RF), support vector machine (SVM), gradient boosting machine (GBM), and multinomial logistic regression algorithms. Simulation analysis based on random subsampling of genes demonstrated the robustness of DeepCC to missing data. Moreover, deep features learned by DeepCC captured biological characteristics associated with distinct molecular subtypes, enabling more compact within-subtype distribution and between-subtype separation of patient samples, and therefore greatly reducing the number of previously unclassifiable samples. In summary, DeepCC provides a novel cancer classification framework that is platform independent, robust to missing data, and usable for single sample prediction, facilitating clinical implementation of cancer molecular subtyping.
The recently identified Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) is the cause of the COVID-19 pandemic. How this novel beta-coronavirus, and coronaviruses more generally, alter cellular metabolism to support massive production of ~30 kb viral genomes and subgenomic viral RNAs remains largely unknown. To gain insights, transcriptional and metabolomic analyses are performed 8 hours after SARS-CoV-2 infection, an early timepoint at which the viral lifecycle is completed but before overt effects on host cell growth or survival. Here, we show that SARS-CoV-2 remodels host folate and one-carbon metabolism at the post-transcriptional level to support de novo purine synthesis, bypassing viral shutoff of host translation. Intracellular glucose and folate are depleted in SARS-CoV-2-infected cells, and viral replication is exquisitely sensitive to inhibitors of folate and one-carbon metabolism, notably methotrexate. Host metabolism-targeted therapy could add to the armamentarium against future coronavirus outbreaks.
The application of RNA interference (RNAi) for inflammatory bowel disease (IBD) therapy has been limited by the lack of non-cytotoxic, efficient and targetable small interfering RNA (siRNA) carriers. TNF-α is the major pro-inflammatory cytokine, mainly secreted by macrophages during IBD. Here, a mannosylated bioreducible cationic polymer (PPM) was synthesized and spontaneously assembled into nanoparticles (NPs) with the assistance of sodium triphosphate (TPP). The TPP-PPM/siRNA NPs exhibited high uniformity (polydispersity index = 0.004), a small particle size (211–275 nm), excellent bioreducibility, and enhanced cellular uptake. Additionally, the generated NPs had negative cytotoxicity compared to control NPs fabricated from siRNA and either branched polyethylenimine (bPEI, 25 kDa) or Oligofectamine (OF). In vitro gene silencing experiments revealed that TPP-PPM/TNF-α siRNA NPs with a weight ratio of 40:1 showed the most efficient inhibition of the expression and secretion of TNF-α (approximately 69.9%, which was comparable to the 71.4% obtained using OF/siRNA NPs), and their RNAi efficiency was highly inhibited in the presence of mannose (20 mM). Finally, TPP-PPM/siRNA NPs showed potential therapeutic effects on colitis tissues, remarkably reducing the TNF-α level. Collectively, these results suggest that non-toxic TPP-PPM/siRNA NPs can be exploited as efficient, macrophage-targeted carriers for IBD therapy.