MPC vs RL or Future Combo Discussion

Hey Everyone,

This post is more of a theoretical question (esp. after working through this resource). As I’ve been learning more about reinforcement learning and MPC, I’ve begun to wonder about the real differences. From what I understand, the advantage of MPC is that it is an online controller: it plans into the future at every real time step, and because it adapts to real-world feedback it can still function even if the model’s dynamics are fuzzy or inaccurate. Meanwhile, RL uses a policy trained offline that is ideally generalized to the goal task (e.g. walking) by presenting the agent with randomized environments, system parameters, etc.
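
To make the contrast concrete for myself, I sketched the two loops on a made-up 1-D double integrator. Everything here is a hypothetical placeholder (the dynamics, cost weights, and "learned" policy parameters), and the brute-force search just stands in for a real MPC solver, but it shows where the work happens: MPC re-plans online every step, RL only evaluates a policy trained offline.

```python
import numpy as np

# Toy 1-D double integrator: state = [position, velocity], control = acceleration.
DT = 0.05

def step(x, u):
    pos, vel = x
    return np.array([pos + DT * vel, vel + DT * u])

# --- MPC-style loop: re-plan over a short horizon at every time step ---
def mpc_action(x, horizon=20):
    # Brute-force search over a few constant accelerations, standing in for
    # the online optimization a real MPC solver would run each step.
    best_u, best_cost = 0.0, np.inf
    for u in np.linspace(-1.0, 1.0, 21):
        xk, cost = x.copy(), 0.0
        for _ in range(horizon):
            xk = step(xk, u)
            cost += xk[0] ** 2 + 0.1 * xk[1] ** 2 + 0.01 * u ** 2
        if cost < best_cost:
            best_u, best_cost = u, cost
    return best_u

# --- RL-style loop: evaluate a fixed policy obtained offline ---
def rl_action(x, theta=np.array([-2.0, -1.0])):
    # theta stands in for parameters learned offline (e.g. in simulation).
    return float(np.clip(theta @ x, -1.0, 1.0))

x_mpc = np.array([1.0, 0.0])
x_rl = np.array([1.0, 0.0])
for t in range(100):
    x_mpc = step(x_mpc, mpc_action(x_mpc))  # optimization runs online, every step
    x_rl = step(x_rl, rl_action(x_rl))      # only a cheap policy evaluation online
print(x_mpc, x_rl)
```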

My question is: do you think that one methodology will take over the other in terms of practicality and versatility in the future? Will we just have two different methods going forward that we implement based on whatever the task space is? Or will robotics ideally come to some kind of fusion, and what would a true fusion of control theory and RL even look like?

I was once thinking of RL maybe being the “link” between different controllers, so you would use an RL policy to decide when to transition between two different locomotion controllers. But now it feels like some RL policies simply perform better than traditional controllers (e.g. Cassie running).

What do you guys think?


Hi Roy,

This is a very general question and still open for research. The short answer is that both frameworks try to find an optimal control policy. MPC uses a model to plan into the future and solves an optimal control problem online, cast as a nonlinear program, while RL resorts to trial and error and function approximation to find an optimal policy.

Based on my experience, RL can find more performant policies for single tasks (as we have seen in many examples by now, such as the latest running of Cassie), but it does not generalize, and one needs to train a new policy for every new task. On the other hand, MPC can potentially generate different motions without changing its structure, but the price to pay is that it is computationally expensive and normally does not admit highly performant policies like RL, due to the assumptions used in writing down the optimal control problem. This is the most general thing I could say, but there is much more detail in that …
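
If it helps, here is a minimal sketch of what “cast as a nonlinear program and solved online” can look like in practice, using CasADi and IPOPT (my choice of tooling for illustration, not something specific from the discussion above). The toy model, horizon, weights, and bounds are all placeholder assumptions; a real legged-robot MPC is of course much richer.

```python
import casadi as ca
import numpy as np

# One receding-horizon solve for a toy 1-D double integrator
# (state = [position, velocity], control = acceleration).
DT, N = 0.05, 20

opti = ca.Opti()
X = opti.variable(2, N + 1)   # state trajectory decision variables
U = opti.variable(1, N)       # control trajectory decision variables
x0 = opti.parameter(2)        # current measured state

cost = 0
for k in range(N):
    # Dynamics constraint: explicit Euler integration of the toy model.
    x_next = ca.vertcat(X[0, k] + DT * X[1, k], X[1, k] + DT * U[0, k])
    opti.subject_to(X[:, k + 1] == x_next)
    cost += X[0, k] ** 2 + 0.1 * X[1, k] ** 2 + 0.01 * U[0, k] ** 2
    opti.subject_to(opti.bounded(-1, U[0, k], 1))  # actuator limits

opti.subject_to(X[:, 0] == x0)
opti.minimize(cost)
opti.solver("ipopt")

# At every control step: set the latest measurement, re-solve the NLP,
# apply only the first control, then repeat at the next step.
opti.set_value(x0, np.array([1.0, 0.0]))
sol = opti.solve()
print(sol.value(U[0, 0]))
```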

Best,
Majid


Very nice question and discussion going on here!

In addition to the MPC-RL workshop, another reference I recommend is this presentation from Marco Hutter’s group at an ICRA workshop: [09] M. Hutter, 6th Workshop on Legged Robots ICRA'22 - YouTube
