Article in conference proceedings

Regret Bounds for the Adaptive Control of Linear Quadratic Systems

COLT, pages 1-26 (July 2011)

Abstract

We study the average-cost Linear Quadratic (LQ) control problem with unknown model parameters, also known as the adaptive control problem in the control community. We design an algorithm and prove that, apart from logarithmic factors, its regret up to time T is O(T^{1/2}). Unlike previous approaches that use a forced-exploration scheme, we construct a high-probability confidence set around the model parameters and design an algorithm that plays optimistically with respect to this confidence set. The construction of the confidence set is based on recent results from online least-squares estimation and leads to an improved worst-case regret bound for the proposed algorithm. To the best of our knowledge, this is the first time that a regret bound has been derived for the LQ control problem.
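The confidence-set idea in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a scalar system x_{t+1} = a*x_t + b*u_t + w_{t+1}, drives it with random inputs purely to generate data (the paper plays optimistically instead), and uses a radius of the self-normalized form common in online least-squares analyses. All function names, the constants `lam`, `delta`, `s_bound`, and the example parameter values are assumptions made for illustration.

```python
import math
import random


def simulate_and_estimate(a=0.5, b=1.0, sigma=0.1, lam=1.0, steps=2000, seed=0):
    """Regularized least squares for x_{t+1} = a*x_t + b*u_t + w_{t+1}.

    Returns the estimate (a_hat, b_hat) and the entries (v11, v12, v22) of the
    symmetric 2x2 matrix V = lam*I + sum_t z_t z_t^T, where z_t = (x_t, u_t).
    """
    rng = random.Random(seed)
    v11, v12, v22 = lam, 0.0, lam
    s1, s2 = 0.0, 0.0  # s = sum_t z_t * x_{t+1}
    x = 0.0
    for _ in range(steps):
        u = rng.gauss(0.0, 1.0)  # assumption: random inputs, only to excite the system
        x_next = a * x + b * u + rng.gauss(0.0, sigma)
        v11 += x * x
        v12 += x * u
        v22 += u * u
        s1 += x * x_next
        s2 += u * x_next
        x = x_next
    # theta_hat = V^{-1} s, using the closed-form inverse of a 2x2 matrix.
    det = v11 * v22 - v12 * v12
    a_hat = (v22 * s1 - v12 * s2) / det
    b_hat = (v11 * s2 - v12 * s1) / det
    return (a_hat, b_hat), (v11, v12, v22)


def confidence_radius(v, sigma, lam, delta, s_bound):
    """Assumed radius: sigma*sqrt(2*log(det(V)^(1/2) / (det(lam*I)^(1/2) * delta)))
    + sqrt(lam)*s_bound, where s_bound is an assumed bound on the norm of (a, b)."""
    v11, v12, v22 = v
    det = v11 * v22 - v12 * v12
    log_term = 0.5 * math.log(det) - 0.5 * math.log(lam * lam) - math.log(delta)
    return sigma * math.sqrt(2.0 * log_term) + math.sqrt(lam) * s_bound


def in_confidence_set(theta, theta_hat, v, radius):
    """Check (theta - theta_hat)^T V (theta - theta_hat) <= radius^2."""
    v11, v12, v22 = v
    d1 = theta[0] - theta_hat[0]
    d2 = theta[1] - theta_hat[1]
    quad = v11 * d1 * d1 + 2.0 * v12 * d1 * d2 + v22 * d2 * d2
    return quad <= radius * radius


if __name__ == "__main__":
    theta_hat, v = simulate_and_estimate()
    radius = confidence_radius(v, sigma=0.1, lam=1.0, delta=0.05, s_bound=2.0)
    print("estimate:", theta_hat)
    print("true parameters inside set:", in_confidence_set((0.5, 1.0), theta_hat, v, radius))
```

An optimistic controller would, at each update, pick the parameter pair inside this ellipsoid with the lowest optimal average cost and act as if it were the true system. That selection step is omitted here.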
