Proceedings of the Fortieth Annual ACM Symposium on Theory of Computing 2008
DOI: 10.1145/1374376.1374384
Algorithms for subset selection in linear regression

Abstract: We study the problem of selecting a subset of k random variables to observe that will yield the best linear prediction of another variable of interest, given the pairwise correlations between the observation variables and the predictor variable. Under approximation preserving reductions, this problem is also equivalent to the "sparse approximation" problem of approximating signals concisely. We propose and analyze exact and approximation algorithms for several special cases of practical interest. We give an FPT…
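As a rough illustration of the problem the abstract describes, the sketch below runs greedy forward selection: given the pairwise correlations, it repeatedly adds the variable with the largest marginal gain in R², the squared multiple correlation of the best linear predictor. All names and the toy data are assumptions for illustration, not the paper's exact algorithm or analysis.

```python
# Greedy forward selection for subset selection in linear regression.
# Assumes unit-variance variables, so R^2(S) = b_S^T C_S^{-1} b_S, where
# C is the correlation matrix of the candidates and b holds their
# correlations with the target. Toy data; not from the paper.
import numpy as np

def r_squared(C, b, S):
    """R^2 of the best linear predictor using subset S."""
    if not S:
        return 0.0
    idx = np.array(sorted(S))
    C_S = C[np.ix_(idx, idx)]
    b_S = b[idx]
    return float(b_S @ np.linalg.solve(C_S, b_S))

def greedy_subset(C, b, k):
    """Greedily add the variable with the largest marginal gain in R^2."""
    S, rest = set(), set(range(len(b)))
    for _ in range(k):
        best = max(rest, key=lambda j: r_squared(C, b, S | {j}))
        S.add(best)
        rest.remove(best)
    return S

# Three candidates: 0 and 1 are highly redundant (correlation 0.9),
# 2 is nearly independent of both.
C = np.array([[1.0, 0.9, 0.1],
              [0.9, 1.0, 0.1],
              [0.1, 0.1, 1.0]])
b = np.array([0.6, 0.6, 0.5])
print(sorted(greedy_subset(C, b, 2)))  # → [0, 2]
```

On this toy instance the greedy rule correctly prefers the less redundant pair {0, 2} over the highly correlated pair {0, 1}, even though variable 1 has a higher individual correlation with the target than variable 2.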


Cited by 162 publications (173 citation statements)
References 36 publications
“…defined above is submodular (Das and Kempe, 2008; Krause et al., 2008a), we found empirically that in our case ρ(·) is not quite submodular (see section 4.4 for details). Nevertheless, the greedy algorithm (and the lazy implementation) proved to be quite effective in practice, as we will see below.…”
Section: Selecting Observation Locations Via Submodular Optimization mentioning
confidence: 45%
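The "not quite submodular" remark refers to the diminishing-returns property: f(S ∪ {x}) − f(S) ≥ f(T ∪ {x}) − f(T) whenever S ⊆ T and x ∉ T. A brute-force check of that property, shown here on a toy coverage function standing in for ρ(·), looks like the following (names and data are illustrative, not from the cited work):

```python
# Brute-force check of the diminishing-returns property that defines
# submodularity. The coverage function here is a toy stand-in; it IS
# submodular, unlike the rho(.) discussed in the quoted passage.
from itertools import chain, combinations

def coverage(sets, S):
    """f(S) = number of elements covered by the chosen sets."""
    return len(set().union(*(sets[i] for i in S))) if S else 0

def is_submodular(f, ground):
    """Check f(S|{x}) - f(S) >= f(T|{x}) - f(T) for all S <= T, x not in T."""
    subsets = [frozenset(c) for c in chain.from_iterable(
        combinations(ground, r) for r in range(len(ground) + 1))]
    for S in subsets:
        for T in subsets:
            if not S <= T:
                continue
            for x in ground - T:
                if f(S | {x}) - f(S) < f(T | {x}) - f(T):
                    return False
    return True

sets = {0: {1, 2}, 1: {2, 3}, 2: {3, 4, 5}}
f = lambda S: coverage(sets, S)
print(is_submodular(f, frozenset(sets)))  # → True
```

The check is exponential in the ground set, so it is only practical for empirically probing small instances, which is presumably how a near-but-not-quite-submodular objective would be detected in practice.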
“…Selecting the optimal set of observations O is NP-hard in general (Das and Kempe, 2008; Krause et al., 2008a), and therefore we must rely on approximate methods to design an optimal sampling scheme. There is a significant body of work on maximizing objective functions that are submodular (see, e.g., (Krause et al., 2008b) and references therein).…”
Section: Selecting Observation Locations Via Submodular Optimization mentioning
confidence: 99%
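One workhorse from that body of work is the lazy (accelerated) greedy algorithm mentioned in the earlier quotation: for a monotone submodular objective, a stale marginal gain popped from a priority queue is an upper bound, so it only needs re-evaluation when it reaches the top. A minimal sketch on a toy coverage function (names and data are assumptions for illustration):

```python
# Lazy greedy for monotone submodular maximization: cached marginal gains
# are upper bounds by submodularity, so most re-evaluations are skipped.
# Toy coverage objective; not code from any cited work.
import heapq

def lazy_greedy(f, ground, k):
    """Pick k elements greedily, refreshing stale gains only at the heap top."""
    S = set()
    f_S = f(S)
    # Max-heap via negated gains; entries hold possibly stale gains.
    heap = [(-(f({x}) - f_S), x) for x in ground]
    heapq.heapify(heap)
    while len(S) < k and heap:
        _, x = heapq.heappop(heap)
        fresh = f(S | {x}) - f_S          # recompute this gain only
        if heap and fresh < -heap[0][0]:  # stale: push back and retry
            heapq.heappush(heap, (-fresh, x))
        else:                             # still best, by submodularity
            S.add(x)
            f_S += fresh
    return S

sets = {0: {1, 2}, 1: {2, 3}, 2: {3, 4, 5}, 3: {1, 5}}
f = lambda S: len(set().union(*(sets[i] for i in S))) if S else 0
print(sorted(lazy_greedy(f, set(sets), 2)))  # → [0, 2]
```

The selected set matches plain greedy (which carries the classic 1 − 1/e guarantee for monotone submodular objectives), but each iteration typically evaluates f far fewer than |ground| times.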
“…Over the past year, we have obtained the following key results (which are currently under submission [1]). The quality of approximation is characterized precisely in [1], but omitted here due to space constraints. This result improves on those of [6, 17], in that it analyzes a more commonly used algorithm and obtains somewhat improved bounds.…”
Section: A Mathematical Formulation Of Sample Selection mentioning
confidence: 99%
“…Hence, we also study the case of sensors embedded in a metric space, where the covariance between sensors' readings is a monotone decreasing function of their distance. The general version of this problem is the subject of ongoing work, but [1] contains a promising initial finding:…”
Section: Theorem 2 If The Pairs Of Variables X I With High Covariance… mentioning
confidence: 99%