You need to log in to edit.

You can create a new account if you don't have one.

Or, discuss a change on Slack.

You can create a new account if you don't have one.

Or, discuss a change on Slack.

no code implementations • ICML 2020 • Dylan Foster, Max Simchowitz

We consider the problem of online control in a known linear dynamical system subject to adversarial noise.

no code implementations • NeurIPS 2021 • Edgar Minasyan, Paula Gradu, Max Simchowitz, Elad Hazan

On the positive side, we give an efficient algorithm that attains a sublinear regret bound against the class of Disturbance Response policies up to the aforementioned system variability term.

no code implementations • NeurIPS 2021 • Juan C. Perdomo, Jack Umenberger, Max Simchowitz

Stabilizing an unknown control system is one of the most fundamental problems in control systems engineering.

no code implementations • 5 Aug 2021 • Andrew Wagenmaker, Max Simchowitz, Kevin Jamieson

We show that this is not possible -- there exists a fundamental tradeoff between achieving low regret and identifying an $\epsilon$-optimal policy at the instance-optimal rate.

no code implementations • NeurIPS 2021 • Max Simchowitz, Christopher Tosh, Akshay Krishnamurthy, Daniel Hsu, Thodoris Lykouris, Miroslav Dudík, Robert E. Schapire

We prove that the expected reward accrued by Thompson sampling (TS) with a misspecified prior differs by at most $\tilde{\mathcal{O}}(H^2 \epsilon)$ from TS with a well specified prior, where $\epsilon$ is the total-variation distance between priors and $H$ is the learning horizon.

no code implementations • 27 Mar 2021 • Tyler Westenbroek, Max Simchowitz, Michael I. Jordan, S. Shankar Sastry

The widespread adoption of nonlinear Receding Horizon Control (RHC) strategies by industry has led to more than 30 years of intense research efforts to provide stability guarantees for these methods.

no code implementations • 19 Mar 2021 • Juan C. Perdomo, Max Simchowitz, Alekh Agarwal, Peter Bartlett

We study the problem of adaptive control of the linear quadratic regulator for systems in very high, or even infinite dimension.

no code implementations • 28 Feb 2021 • Max Simchowitz, Aleksandrs Slivkins

However, the algorithm controls the flow of information, and can incentivize the agents to explore via information asymmetry.

no code implementations • 10 Feb 2021 • Andrew Wagenmaker, Max Simchowitz, Kevin Jamieson

Along the way, we establish that certainty equivalence decision making is instance- and task-optimal, and obtain the first algorithm for the linear quadratic regulator problem which is instance-optimal.

no code implementations • NeurIPS 2020 • Zakaria Mhammedi, Dylan J. Foster, Max Simchowitz, Dipendra Misra, Wen Sun, Akshay Krishnamurthy, Alexander Rakhlin, John Langford

We introduce a new algorithm, RichID, which learns a near-optimal policy for the RichLQR with sample complexity scaling only with the dimension of the latent state space and the capacity of the decoder function class.

no code implementations • NeurIPS 2020 • Max Simchowitz

Recent literature has made much progress in understanding \emph{online LQR}: a modern learning-theoretic take on the classical control problem in which a learner attempts to optimally control an unknown linear dynamical system with fully observed state, perturbed by i. i. d.

1 code implementation • NeurIPS 2020 • Kianté Brantley, Miroslav Dudik, Thodoris Lykouris, Sobhan Miryoosefi, Max Simchowitz, Aleksandrs Slivkins, Wen Sun

We propose an algorithm for tabular episodic reinforcement learning with constraints.

1 code implementation • ICML 2020 • Esther Rolf, Max Simchowitz, Sarah Dean, Lydia T. Liu, Daniel Björkegren, Moritz Hardt, Joshua Blumenstock

Our theoretical results characterize the optimal strategies in this class, bound the Pareto errors due to inaccuracies in the scores, and show an equivalence between optimal strategies and a rich class of fairness-constrained profit-maximizing policies.

no code implementations • 29 Feb 2020 • Dylan J. Foster, Max Simchowitz

We introduce a new algorithm for online linear-quadratic control in a known system subject to adversarial disturbances.

no code implementations • ICML 2020 • Chi Jin, Akshay Krishnamurthy, Max Simchowitz, Tiancheng Yu

We give an efficient algorithm that conducts $\tilde{\mathcal{O}}(S^2A\mathrm{poly}(H)/\epsilon^2)$ episodes of exploration and returns $\epsilon$-suboptimal policies for an arbitrary number of reward functions.

no code implementations • ICML 2020 • Max Simchowitz, Dylan J. Foster

Our upper bound is attained by a simple variant of $\textit{{certainty equivalent control}}$, where the learner selects control inputs according to the optimal controller for their estimate of the system while injecting exploratory random noise.

no code implementations • 25 Jan 2020 • Max Simchowitz, Karan Singh, Elad Hazan

We consider the problem of controlling a possibly unknown linear dynamical system with adversarial perturbations, adversarially chosen convex loss functions, and partially observed states, known as non-stochastic control.

no code implementations • 20 Nov 2019 • Thodoris Lykouris, Max Simchowitz, Aleksandrs Slivkins, Wen Sun

We initiate the study of multi-stage episodic reinforcement learning under adversarial corruptions in both the rewards and the transition probabilities of the underlying system extending recent results for the special case of stochastic bandits.

no code implementations • 6 Nov 2019 • Mark Braverman, Elad Hazan, Max Simchowitz, Blake Woodworth

We investigate the computational complexity of several basic linear algebra primitives, including largest eigenvector computation and linear regression, in the computational model that allows access to the data via a matrix-vector product oracle.

no code implementations • NeurIPS 2019 • Max Simchowitz, Kevin Jamieson

This paper establishes that optimistic algorithms attain gap-dependent and non-asymptotic logarithmic regret for episodic MDPs.

1 code implementation • 2 Feb 2019 • Max Simchowitz, Ross Boczar, Benjamin Recht

We analyze a simple prefiltered variation of the least squares estimator for the problem of estimation with biased, semi-parametric noise, an error model studied more broadly in causal statistics and active learning.

no code implementations • 27 Sep 2018 • Esther Rolf, David Fridovich-Keil, Max Simchowitz, Benjamin Recht, Claire Tomlin

We study an adaptive source seeking problem, in which a mobile robot must identify the strongest emitter(s) of a signal in an environment with background emissions.

no code implementations • 29 Aug 2018 • Lydia T. Liu, Max Simchowitz, Moritz Hardt

We show that under reasonable conditions, the deviation from satisfying group calibration is upper bounded by the excess risk of the learned score relative to the Bayes optimal score function.

no code implementations • 14 Aug 2018 • Max Simchowitz, Kevin Jamieson, Jordan W. Suchow, Thomas L. Griffiths

In this paper, we introduce the first principled adaptive-sampling procedure for learning a convex function in the $L_\infty$ norm, a problem that arises often in the behavioral and social sciences.

no code implementations • 24 Jul 2018 • Max Simchowitz

Minimizing a convex, quadratic objective of the form $f_{\mathbf{A},\mathbf{b}}(x) := \frac{1}{2}x^\top \mathbf{A} x - \langle \mathbf{b}, x \rangle$ for $\mathbf{A} \succ 0 $ is a fundamental problem in machine learning and optimization.

no code implementations • 4 Apr 2018 • Max Simchowitz, Ahmed El Alaoui, Benjamin Recht

We show that for every $\mathtt{gap} \in (0, 1/2]$, there exists a distribution over matrices $\mathbf{M}$ for which 1) $\mathrm{gap}_r(\mathbf{M}) = \Omega(\mathtt{gap})$ (where $\mathrm{gap}_r(\mathbf{M})$ is the normalized gap between the $r$ and $r+1$-st largest-magnitude eigenvector of $\mathbf{M}$), and 2) any algorithm $\mathsf{Alg}$ which takes fewer than $\mathrm{const} \times \frac{r \log d}{\sqrt{\mathtt{gap}}}$ queries fails (with overwhelming probability) to identity a matrix $\widehat{\mathsf{V}} \in \mathbb{R}^{d \times r}$ with orthonormal columns for which $\langle \widehat{\mathsf{V}}, \mathbf{M} \widehat{\mathsf{V}}\rangle \ge (1 - \mathrm{const} \times \mathtt{gap})\sum_{i=1}^r \lambda_i(\mathbf{M})$.

2 code implementations • ICML 2018 • Lydia T. Liu, Sarah Dean, Esther Rolf, Max Simchowitz, Moritz Hardt

Fairness in machine learning has predominantly been studied in static classification settings without concern for how decisions change the underlying population over time.

no code implementations • 22 Feb 2018 • Max Simchowitz, Horia Mania, Stephen Tu, Michael. I. Jordan, Benjamin Recht

We prove that the ordinary least-squares (OLS) estimator attains nearly minimax optimal performance for the identification of linear dynamical systems from a single observed trajectory.

no code implementations • 4 Jan 2018 • Reinhard Heckel, Max Simchowitz, Kannan Ramchandran, Martin J. Wainwright

Accordingly, we study the problem of finding approximate rankings from pairwise comparisons.

no code implementations • 20 Oct 2017 • Jason D. Lee, Ioannis Panageas, Georgios Piliouras, Max Simchowitz, Michael. I. Jordan, Benjamin Recht

We establish that first-order methods avoid saddle points for almost all initializations.

no code implementations • 14 Apr 2017 • Max Simchowitz, Ahmed El Alaoui, Benjamin Recht

We prove a \emph{query complexity} lower bound on rank-one principal component analysis (PCA).

no code implementations • 16 Feb 2017 • Max Simchowitz, Kevin Jamieson, Benjamin Recht

Moreover, our lower bounds zero-in on the number of times each \emph{individual} arm needs to be pulled, uncovering new phenomena which are drowned out in the aggregate sample complexity.

no code implementations • 9 Mar 2016 • Max Simchowitz, Kevin Jamieson, Benjamin Recht

This paper studies the Best-of-K Bandit game: At each time the player chooses a subset S among all N-choose-K possible options and observes reward max(X(i) : i in S) where X is a random vector drawn from a joint distribution.

no code implementations • 16 Feb 2016 • Jason D. Lee, Max Simchowitz, Michael. I. Jordan, Benjamin Recht

We show that gradient descent converges to a local minimizer, almost surely with random initialization.

Cannot find the paper you are looking for? You can
Submit a new open access paper.

Contact us on:
hello@paperswithcode.com
.
Papers With Code is a free resource with all data licensed under CC-BY-SA.