Improving motion primitive tracking using reinforcement learning

Presentation Abstract:

This work considers the problem of trajectory tracking for a UAV under model uncertainty. A standard approach to trajectory tracking is receding-horizon LQR. However, as the model becomes more inaccurate, we expect performance to degrade. One alternative that addresses this is L1 adaptive control. In this work, we consider a different approach based on reinforcement learning. We begin by considering trajectories as finite sequences of motion primitives. One approach to tracking such trajectories is to compute a feedback policy for each primitive, such as finite- or infinite-horizon LQR, and switch between these policies at the appropriate time. However, because of large model inaccuracies, tracking along a single primitive can exhibit poor performance and even instability. When tracking a sequence of primitives and switching between controllers, these problems can be amplified by the state error that accumulates at each switching point. Recent work in apprenticeship learning demonstrates techniques that improve the tracking of particular trajectories. We use a similar approach to learn improved feedback policies for a finite set of primitives, and study the effect of this learning on switching stability.
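As a rough illustration of the switched per-primitive LQR baseline described above (not the method developed in this work), the sketch below computes an infinite-horizon LQR gain for each primitive's linearized dynamics and switches gains with the active primitive. The (A, B) pairs, cost weights, and function names are illustrative placeholders, not quantities taken from the abstract.

```python
# Minimal sketch: switched per-primitive infinite-horizon LQR tracking.
# All dynamics matrices and weights below are placeholder values.
import numpy as np
from scipy.linalg import solve_discrete_are


def lqr_gain(A, B, Q, R):
    """Discrete-time infinite-horizon LQR gain K, so that u = u_ref - K (x - x_ref)."""
    P = solve_discrete_are(A, B, Q, R)
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)


# One linearization (A_i, B_i) per motion primitive (placeholder models).
primitives = [
    (np.array([[1.0, 0.1], [0.0, 1.0]]), np.array([[0.005], [0.1]])),
    (np.array([[1.0, 0.1], [0.0, 0.9]]), np.array([[0.005], [0.1]])),
]
Q, R = np.eye(2), 0.1 * np.eye(1)
gains = [lqr_gain(A, B, Q, R) for A, B in primitives]


def tracking_control(x, x_ref, u_ref, primitive_index):
    """Apply the feedback gain of the currently active primitive."""
    K = gains[primitive_index]
    return u_ref - K @ (x - x_ref)
```

Under large model error, the per-primitive gains computed this way track poorly, and the residual state error handed to the next controller at each switch is what can amplify the degradation across the sequence.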