Hey Everyone,
This post is more of a theoretical question (esp. as I work through this resource). As I've been learning more about reinforcement learning and MPC, I've begun to wonder about the real differences between them. From what I understand, the advantage of MPC is that it's an online controller: it re-solves a finite-horizon optimization at every real time step, and because it re-plans from the latest state measurements, it can still function even if the model's dynamics are fuzzy/inaccurate. Meanwhile, RL uses a policy trained offline that is ideally generalized to the goal task (e.g. walking) by presenting the agent with randomized environments, system parameters, etc. during training.
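To make sure I'm picturing the two loops right, here's a toy sketch of the contrast (not rigorous at all: the grid search stands in for a real QP solver, and the fixed gain stands in for a trained policy network):

```python
import numpy as np

# Toy 1-D double integrator, discretized with dt = 0.1
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([0.005, 0.1])

def mpc_action(x, horizon=10):
    # Stand-in for a real solver: search a grid of constant inputs,
    # pick the one whose predicted rollout has the lowest cost,
    # apply only its first step, then re-plan at the next step.
    best_u, best_cost = 0.0, np.inf
    for u in np.linspace(-1, 1, 41):
        xk, cost = x.copy(), 0.0
        for _ in range(horizon):
            xk = A @ xk + B * u          # predict forward with the model
            cost += xk @ xk + 0.01 * u**2
        if cost < best_cost:
            best_u, best_cost = u, cost
    return best_u

def rl_action(x, K=np.array([2.0, 2.5])):
    # Stand-in for a policy network: a fixed mapping learned offline,
    # evaluated with no online optimization at runtime.
    return float(np.clip(-K @ x, -1.0, 1.0))

x_mpc = np.array([1.0, 0.0])
x_rl = np.array([1.0, 0.0])
for t in range(50):
    x_mpc = A @ x_mpc + B * mpc_action(x_mpc)  # online: re-plan every step
    x_rl = A @ x_rl + B * rl_action(x_rl)      # offline: just evaluate policy
print(x_mpc, x_rl)  # both should drive the state toward the origin
```

The structural difference I'm pointing at is just that MPC pays an optimization cost at every control step (but can use the model and fresh measurements), while the RL policy pays it all up front in training.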
My question is: do you think one methodology will take over the other in terms of practicality and versatility in the future? Will we just have two different methods going forward, chosen based on whatever the task space is? Or will robotics ideally converge on some kind of fusion, and what would a true fusion of control theory and RL even look like?
I was once thinking of RL maybe being the "link" between different controllers, so you'd use an RL policy to decide when and how to transition between two different locomotion controllers. But now it feels like some RL policies outright outperform traditional controllers (e.g. Cassie running).
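The "link" idea I had in mind looks something like this (names and the hand-written switching rule are totally hypothetical; in a real version the gating would be a learned policy, not an if-statement):

```python
def walk_controller(x):
    # Placeholder low-level controller (e.g. a tuned gait controller)
    return -1.0 * x

def stand_controller(x):
    # Placeholder low-level controller (e.g. a balance controller)
    return -3.0 * x

def gating_policy(x):
    # Stand-in for a learned discrete policy that selects a controller
    # from the current state; here it's just a threshold for illustration.
    return walk_controller if abs(x) > 0.5 else stand_controller

x = 2.0
for _ in range(20):
    controller = gating_policy(x)  # RL decides *which* controller runs
    x += 0.1 * controller(x)       # the controller decides *the action*
print(x)
```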
What do you guys think?