Daniel Golovin scite author profile

Submodular Function Maximization

Submodularity is a property of set functions with deep theoretical consequences and far-reaching applications. At first glance it appears very similar to concavity; in other ways it resembles convexity. It appears in a wide variety of applications: in Computer Science it has recently been identified and utilized in domains such as viral marketing (Kempe et al., 2003), information gathering, image segmentation (Boykov and Jolly, 2001; Kohli et al., 2009; Jegelka and Bilmes, 2011a), document summarization (Lin and Bilmes, 2011), and speeding up satisfiability solvers. In this survey we will introduce submodularity and some of its generalizations, illustrate how it arises in various applications, and discuss algorithms for optimizing submodular functions. Our emphasis here is on maximization; there are many important results and applications related to minimizing submodular functions that we do not cover.
As a concrete running example, we will consider the problem of deploying sensors in a drinking water distribution network (see Figure 1) in order to detect contamination. In this domain, we may have a model of how contaminants, accidentally or maliciously introduced into the network, spread over time. Such a model then allows us to quantify the benefit f(A) of deploying sensors at a particular set A of locations (junctions or pipes in the network) in terms of the detection performance (such as average time to detection). Based on this notion of utility, we then wish to find an optimal subset A ⊆ V of locations maximizing the utility, max_A f(A), subject to some constraints (such as bounded cost). This application requires solving a difficult real-world optimization problem that can be handled with the techniques discussed in this chapter (Krause et al., 2008b, show in detail how submodular optimization can be applied in this domain).
We will also discuss more complex settings, for example how one can incorporate complex constraints on the feasible sets A, robustly optimize against adversarially chosen objective functions f, or adaptively select sensors based on previous observations. Several algorithms for submodular optimization described in this survey are implemented in an open source Matlab toolbox (Krause, 2010).
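The abstract does not name a specific algorithm, so the following is only a minimal sketch of one standard approach: the greedy heuristic, which for monotone submodular functions under a cardinality constraint is known to achieve a (1 - 1/e) approximation. The function names, the `DETECTS` table, and the coverage objective are hypothetical stand-ins for the sensor placement utility f(A); they are not taken from the survey or its toolbox.

```python
from typing import Callable, Hashable, Iterable, Set


def greedy_maximize(
    f: Callable[[Set[Hashable]], float],
    ground_set: Iterable[Hashable],
    k: int,
) -> Set[Hashable]:
    """Pick up to k elements, each time adding the one with the largest marginal gain."""
    candidates = sorted(ground_set)  # sorted only to make tie-breaking deterministic
    selected: Set[Hashable] = set()
    for _ in range(k):
        base = f(selected)
        best_item, best_gain = None, 0.0
        for item in candidates:
            if item in selected:
                continue
            gain = f(selected | {item}) - base  # marginal benefit of adding item
            if gain > best_gain:
                best_item, best_gain = item, gain
        if best_item is None:  # no remaining element has positive gain
            break
        selected.add(best_item)
    return selected


# Hypothetical toy data: each candidate location detects a set of
# contamination scenarios (an assumption made purely for illustration).
DETECTS = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
    "d": {1, 7},
}


def coverage(locations: Set[Hashable]) -> float:
    """Number of scenarios detected by at least one chosen location (submodular)."""
    covered = set()
    for loc in locations:
        covered |= DETECTS[loc]
    return float(len(covered))


chosen = greedy_maximize(coverage, DETECTS.keys(), k=2)
print(sorted(chosen), coverage(chosen))
```

A coverage count is used here because it is a simple monotone submodular function; a realistic utility such as expected reduction in detection time would replace `coverage` without changing the greedy loop.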

An Online Algorithm for Maximizing Submodular Functions

Streeter

Golovin

2007

143

254

University of California, Irvine, Irvine, CA 92697-1875

LMO4 is highly expressed in breast epithelial cells and is related to cell proliferation and/or invasion in vivo. Because these cellular features are associated with breast carcinogenesis and since LMO4 is overexpressed in more than 50% of breast cancer cases, we hypothesize that LMO4 may play roles in oncogenesis of breast epithelial cells by regulating proliferation, invasion and/or other cellular features. Using an LMO4 over-expression or shRNA expression system in vitro, I found that LMO4 plays crucial roles in the regulation of cell proliferation and apoptosis of normal mammary gland epithelial cells or breast cancer cells. Furthermore, I observed that deletion of LMO4 impaired the function and development of the mammary gland in LMO4 conditional knockout mice, indicating that LMO4 protein is necessary for maintaining the normal development of the mouse mammary gland. In addition, I demonstrated that LMO4 can modulate TGFβ signaling and regulate the proliferative response of epithelial cells to TGFβ signaling, thereby linking LMO4 to a conserved signaling pathway that plays important roles in epithelial homeostasis. With the support of the grant, I received excellent training in bioinformatics.
By combining previously described functional methods with bioinformatics approaches, we used DNA microarrays to discover LMO4-responsive genes, and identified BMP7 as a key downstream gene of LMO4. We also found a significant correlation between LMO4 and BMP7 transcript levels in a large dataset of human breast cancers, providing additional support that BMP7 is a bona fide target gene of LMO4. Finally, we demonstrated that LMO4 binds to HDAC2 and that they are recruited together to the BMP7 promoter. We also suggested a novel mechanism for LMOs: LMO4, Clim2 and HDAC2 are part of a transcriptional complex, and alterations in LMO4 levels can disrupt the complex, leading to decreased HDAC2 recruitment and increased promoter activity. These results strengthen the hypothesis that LMO4 may contribute to the oncogenesis of breast tissue, indicating that our work will play a role in solving the breast cancer problem with the support of the Army Breast Cancer Research Program.

Google Vizier

et al. 2017

Any sufficiently complex system acts as a black box when it becomes easier to experiment with than to understand. Hence, black-box optimization has become increasingly important as systems have become more complex. In this paper we describe Google Vizier, a Google-internal service for performing black-box optimization that has become the de facto parameter tuning engine at Google. Google Vizier is used to optimize many of our machine learning models and other systems, and also provides core capabilities to Google's Cloud Machine Learning HyperTune subsystem. We discuss our requirements, infrastructure design, underlying algorithms, and advanced features such as transfer learning and automated early stopping that the service provides.

Ad click prediction

et al. 2013

Predicting ad click-through rates (CTR) is a massive-scale learning problem that is central to the multi-billion dollar online advertising industry. We present a selection of case studies and topics drawn from recent experiments in the setting of a deployed CTR prediction system. These include improvements in the context of traditional supervised learning based on an FTRL-Proximal online learning algorithm (which has excellent sparsity and convergence properties) and the use of per-coordinate learning rates. We also explore some of the challenges that arise in a real-world system that may appear at first to be outside the domain of traditional machine learning research. These include useful tricks for memory savings, methods for assessing and visualizing performance, practical methods for providing confidence estimates for predicted probabilities, calibration methods, and methods for automated management of features. Finally, we also detail several directions that did not turn out to be beneficial for us, despite promising results elsewhere in the literature. The goal of this paper is to highlight the close relationship between theoretical advances and practical engineering in this industrial setting, and to show the depth of challenges that appear when applying traditional machine learning methods in a complex dynamic system.
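The abstract mentions FTRL-Proximal with per-coordinate learning rates but gives no update rule. The sketch below follows the per-coordinate form in which the algorithm is commonly presented for logistic regression with L1 and L2 regularization. The class name, the hyperparameter names (`alpha`, `beta`, `l1`, `l2`), their values, and the toy features are all assumptions made for illustration, not details of the deployed system.

```python
import math
from collections import defaultdict
from typing import Dict


class FTRLProximal:
    """Sketch of per-coordinate FTRL-Proximal for sparse logistic regression."""

    def __init__(self, alpha: float = 0.1, beta: float = 1.0,
                 l1: float = 0.1, l2: float = 1.0) -> None:
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z: Dict[str, float] = defaultdict(float)  # adjusted gradient sums
        self.n: Dict[str, float] = defaultdict(float)  # sums of squared gradients

    def weight(self, i: str) -> float:
        """Closed-form weight; exactly zero while |z_i| <= l1, which gives sparsity."""
        z = self.z[i]
        if abs(z) <= self.l1:
            return 0.0
        sign = 1.0 if z > 0 else -1.0
        denom = (self.beta + math.sqrt(self.n[i])) / self.alpha + self.l2
        return -(z - sign * self.l1) / denom

    def predict(self, x: Dict[str, float]) -> float:
        """Predicted click probability for a sparse feature vector x."""
        score = sum(self.weight(i) * v for i, v in x.items())
        score = max(min(score, 35.0), -35.0)  # clamp to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-score))

    def update(self, x: Dict[str, float], y: int) -> float:
        """One online step on example (x, y) with y in {0, 1}; returns the prediction."""
        p = self.predict(x)
        for i, v in x.items():
            g = (p - y) * v  # gradient of log loss for coordinate i
            w = self.weight(i)
            # The per-coordinate learning rate enters through sigma.
            sigma = (math.sqrt(self.n[i] + g * g) - math.sqrt(self.n[i])) / self.alpha
            self.z[i] += g - sigma * w
            self.n[i] += g * g
        return p


model = FTRLProximal()
for _ in range(100):
    model.update({"ad_a": 1.0}, 1)  # hypothetical feature that is always clicked
    model.update({"ad_b": 1.0}, 0)  # hypothetical feature that is never clicked
print(model.predict({"ad_a": 1.0}), model.predict({"ad_b": 1.0}))
```

Storing only `z` and `n` per coordinate and deriving each weight lazily keeps the state small for features that are rarely seen.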

Online distributed sensor selection

Golovin

Faulkner

Krause

2010

A key problem in sensor networks is to decide which sensors to query when, in order to obtain the most useful information (e.g., for performing accurate prediction), subject to constraints (e.g., on power and bandwidth). In many applications the utility function is not known a priori, must be learned from data, and can even change over time. Furthermore, for large sensor networks, solving a centralized optimization problem to select sensors is not feasible, and thus we seek a fully distributed solution. In this paper, we present Distributed Online Greedy (DOG), an efficient, distributed algorithm for repeatedly selecting sensors online, only receiving feedback about the utility of the selected sensors. We prove very strong theoretical no-regret guarantees that apply whenever the (unknown) utility function satisfies a natural diminishing returns property called submodularity. Our algorithm has extremely low communication requirements, and scales well to large sensor deployments. We extend DOG to allow observation-dependent sensor selection. We empirically demonstrate the effectiveness of our algorithm on several real-world sensing tasks.
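For reference, the diminishing returns property named in this abstract is usually stated as follows. The symbols are standard notation rather than taken from the abstract: V is assumed to be the ground set of sensors and f the utility function on subsets of V.

```latex
f(A \cup \{e\}) - f(A) \;\ge\; f(B \cup \{e\}) - f(B)
\quad \text{for all } A \subseteq B \subseteq V,\ e \in V \setminus B.
```

In words, adding a sensor e to a smaller selection A helps at least as much as adding it to any larger selection B that contains A.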
