Online Optimal Control with Affine Constraints

This paper considers online optimal control with linear constraints on the states and actions of a noisy linear dynamical system. The convex stage cost functions change adversarially and are unknown before the stage actions are selected. The dynamical system and the constraints are time-invariant and known beforehand. We propose an online control algorithm, Online Gradient Descent with Buffer Zone (OGD-BZ), which guarantees that the system satisfies all the constraints despite the random process noise. We investigate the policy regret of OGD-BZ, defined as the difference between OGD-BZ’s total cost and the total cost of an optimal linear policy in hindsight. We show that OGD-BZ achieves \tilde{O}(\sqrt{T}) regret, where \tilde{O}(\cdot) absorbs logarithmic terms of T.

This paper has been published at AAAI 2021.
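
The abstract names the algorithm but does not spell out its steps, so the following is only a toy sketch of the general idea that the name suggests: run projected online gradient descent on a constraint set that has been shrunk by a buffer, so that a bounded disturbance cannot push the decision outside the original set. It is not the paper's OGD-BZ. The scalar interval constraint, the quadratic costs, the bounded disturbance, and the names project_interval, ogd_with_buffer, buffer, and step_size are all assumptions made for illustration.

```python
import random


def project_interval(value, lower, upper):
    """Euclidean projection of a scalar onto the interval [lower, upper]."""
    return max(lower, min(upper, value))


def ogd_with_buffer(gradients, lower, upper, buffer, step_size, start=0.0):
    """Toy projected online gradient descent on a tightened interval.

    Hypothetical illustration only: decisions are kept inside
    [lower + buffer, upper - buffer], so adding a disturbance of
    magnitude at most `buffer` still leaves them in [lower, upper].
    `gradients` is a sequence of callables, one per stage, each
    returning the gradient of that stage's cost at a given point.
    """
    if 2 * buffer > upper - lower:
        raise ValueError("buffer is too large for the constraint interval")
    tight_lower, tight_upper = lower + buffer, upper - buffer
    point = project_interval(start, tight_lower, tight_upper)
    iterates = []
    for grad_fn in gradients:
        # The decision is committed before the stage cost is revealed.
        iterates.append(point)
        # After the cost is revealed, take a gradient step and project
        # back onto the tightened interval.
        point = project_interval(
            point - step_size * grad_fn(point), tight_lower, tight_upper
        )
    return iterates


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed stage costs (p - c_t)^2 with arbitrary targets c_t.
    targets = [rng.uniform(-3.0, 3.0) for _ in range(200)]
    grads = [lambda p, c=c: 2.0 * (p - c) for c in targets]
    iterates = ogd_with_buffer(grads, -1.0, 1.0, buffer=0.2, step_size=0.05)
    # Assumed disturbance bounded by the buffer size.
    disturbed = [p + rng.uniform(-0.2, 0.2) for p in iterates]
    print(all(-1.0 - 1e-12 <= d <= 1.0 + 1e-12 for d in disturbed))
```

In this sketch a larger buffer gives more protection against the disturbance but restricts the decisions to a smaller set, which is the kind of trade-off a buffer-based design has to balance.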