Max Simchowitz scite author profile

Delayed Impact of Fair Machine Learning

Fairness in machine learning has predominantly been studied in static classification settings without concern for how decisions change the underlying population over time. Conventional wisdom suggests that fairness criteria promote the long-term well-being of those groups they aim to protect. We study how static fairness criteria interact with temporal indicators of well-being, such as long-term improvement, stagnation, and decline in a variable of interest. We demonstrate that even in a one-step feedback model, common fairness criteria in general do not promote improvement over time, and may in fact cause harm in cases where an unconstrained objective would not. We completely characterize the delayed impact of three standard criteria, contrasting the regimes in which these exhibit qualitatively different behavior. In addition, we find that a natural form of measurement error broadens the regime in which fairness criteria perform favorably. Our results highlight the importance of measurement and temporal modeling in the evaluation of fairness criteria, suggesting a range of new challenges and trade-offs.


Low-rank Solutions of Linear Matrix Equations via Procrustes Flow

Tu¹, Boczar², Simchowitz³ et al. 2015

Preprint

190


In this paper we study the problem of recovering a low-rank matrix from linear measurements. Our algorithm, which we call Procrustes Flow, starts from an initial estimate obtained by a thresholding scheme followed by gradient descent on a non-convex objective. We show that as long as the measurements obey a standard restricted isometry property, our algorithm converges to the unknown matrix at a geometric rate. In the case of Gaussian measurements, such convergence occurs for an n₁ × n₂ matrix of rank r when the number of measurements exceeds a constant times (n₁ + n₂)r.
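As a rough illustration of the setting described in this abstract, the recovery problem can be written as a factored least-squares objective. This is a sketch under assumptions: the symbols \mathcal{A} (the linear measurement operator), b (the measurement vector), M (the unknown matrix), U and V (the low-rank factors), m (the number of measurements), and C (an absolute constant) are notation introduced here, and the objective used in the paper itself may contain additional terms.

```latex
\min_{U \in \mathbb{R}^{n_1 \times r},\; V \in \mathbb{R}^{n_2 \times r}}
  f(U, V) = \tfrac{1}{2} \bigl\| \mathcal{A}(U V^{\top}) - b \bigr\|_2^2,
\qquad b = \mathcal{A}(M), \qquad m \ge C\,(n_1 + n_2)\,r .
```

The objective is non-convex in (U, V) jointly, which is why the abstract stresses both the initial estimate and the restricted isometry assumption.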


First-order methods almost always avoid strict saddle points

Panageas² et al. 2019


We establish that first-order methods avoid saddle points for almost all initializations. Our results apply to a wide variety of first-order methods, including gradient descent, block coordinate descent, mirror descent and variants thereof. The connecting thread is that such algorithms can be studied from a dynamical systems perspective in which appropriate instantiations of the Stable Manifold Theorem allow for a global stability analysis. Thus, neither access to second-order derivative information nor randomness beyond initialization is necessary to provably avoid saddle points. * This paper significantly extends the special case of gradient descent dynamics developed in the conference proceedings of the authors [24,33].
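The claim in this abstract can be illustrated on a toy problem. The sketch below is not taken from the paper: the test function f(x, y) = x²/2 + y⁴/4 − y²/2, the step size, the iteration count, and all function names are assumptions chosen for illustration. This f has a strict saddle at the origin and minima at (0, ±1); plain gradient descent from random starting points ends up at a minimum, while a start exactly on the line y = 0 (a measure-zero set) stays on it and converges to the saddle.

```python
import random


def grad(x, y):
    """Gradient of the illustrative function f(x, y) = x**2/2 + y**4/4 - y**2/2."""
    return x, y**3 - y


def gradient_descent(start, step=0.1, iters=2000):
    """Run plain gradient descent with a fixed step size (assumed values)."""
    x, y = start
    for _ in range(iters):
        gx, gy = grad(x, y)
        x, y = x - step * gx, y - step * gy
    return x, y


def run_random_trials(trials=20, seed=0):
    """Start from random points in [-1, 1]^2 and return the final iterates."""
    rng = random.Random(seed)
    starts = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(trials)]
    return [gradient_descent(s) for s in starts]


if __name__ == "__main__":
    for x, y in run_random_trials(trials=5):
        print(f"random start -> ({x:.6f}, {y:.6f})")
    # A start exactly on the saddle's stable manifold (y = 0) never leaves it.
    print("start on y = 0 ->", gradient_descent((1.0, 0.0)))
```

No noise is injected during the iterations; the only randomness is in the initialization, which is the regime the abstract describes.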


Reward-Free Exploration for Reinforcement Learning

Jin¹, Krishnamurthy², Simchowitz³ et al. 2020

Preprint


Exploration is widely regarded as one of the most challenging aspects of reinforcement learning (RL), with many naive approaches succumbing to exponential sample complexity. To isolate the challenges of exploration, we propose a new "reward-free RL" framework. In the exploration phase, the agent first collects trajectories from an MDP M without a pre-specified reward function. After exploration, it is tasked with computing near-optimal policies for M for a collection of given reward functions. This framework is particularly suitable when there are many reward functions of interest, or when the reward function is shaped by an external agent to elicit desired behavior. We give an efficient algorithm that conducts Õ(S²A·poly(H)/ε²) episodes of exploration and returns ε-suboptimal policies for an arbitrary number of reward functions. We achieve this by finding exploratory policies that visit each "significant" state with probability proportional to its maximum visitation probability under any possible policy. Moreover, our planning procedure can be instantiated by any black-box approximate planner, such as value iteration or natural policy gradient. We also give a nearly-matching Ω(S²AH²/ε²) lower bound, demonstrating the near-optimality of our algorithm in this setting.


Non-Asymptotic Gap-Dependent Regret Bounds for Tabular MDPs

Simchowitz¹, Jamieson² 2019

Preprint


This paper establishes that optimistic algorithms attain gap-dependent and non-asymptotic logarithmic regret for episodic MDPs. In contrast to prior work, our bounds do not suffer a dependence on diameter-like quantities or ergodicity, and smoothly interpolate between the gap-dependent logarithmic regret and the O(√(HSAT)) minimax rate. The key technique in our analysis is a novel "clipped" regret decomposition which applies to a broad family of recent optimistic algorithms for episodic MDPs.



Max Simchowitz

Delayed Impact of Fair Machine Learning
