OmniH2O (Omni Human2HumanOid) aims to provide a universal whole-body control interface for full-size humanoids with dexterous hands. By learning a robust whole-body loco-manipulation tracking policy and carefully designing the control interface (sparse kinematic poses), a single
Introducing OmniH2O, a learning-based system for whole-body humanoid teleoperation and autonomy:
🦾Robust loco-manipulation policy
🦸Universal teleop interface: VR, verbal, RGB
🧠Autonomy via
@chatgpt4o
or imitation
🔗Release the first whole-body humanoid dataset
I'll be joining
@CarnegieMellon
as an Assistant Professor in
@CMU_Robotics
and
@SCSatCMU
in Fall 2023. Deep thanks to my Ph.D. advisors
@yisongyue
and Soon-Jo Chung, collaborators, and many friends who have supported me all the time. Looking forward to a new journey at CMU!
Our Neural-Fly paper was published by
@SciRobotics
!
Neural-Fly enables rapid learning for agile flight in time-variant strong winds, with theoretical guarantees!
Paper:
@Caltech
news:
Explainer video:
Today, I received a tragic review with the lowest possible rating for my NeurIPS paper. I was all prepared to enter the ring of scholarly debate and start doubting my research taste, but then I found:
I wrote a blog post: "Neural-Control Family: What Deep Learning + Control Enables in the Real World." This post discusses some key principles of our work on learning robotic agility in safety-critical systems (e.g., Neural-Swarm below).
Officially Dr. Shi! Many thanks to the committee (Prof.
@yisongyue
@AdamWierman
Soon-Jo Chung and Joel Burdick)!
Before joining
@CMU_Robotics
as an Assistant Professor in Fall 2023, I will spend one year at
@uwcse
as a postdoc with Byron Boots, focusing on 🤖️learning.
Excited to teach Intro to Robot Learning () at CMU this spring (starting next week)!
Can't stop thinking about "how fast this field moves and how interdisciplinary it is" when preparing slides.
Diffusion models have shown strong capabilities in generating high-fidelity trajectories. However, standard diffusion processes cannot efficiently adapt to new scenarios beyond demonstrations (e.g., new robots with different dynamics).
MBD (Model-Based Diffusion) is a
🚀Introducing Model-Based Diffusion (MBD), a diffusion-based trajectory optimization method that directly computes the score function using model information. MBD doesn't require data, but can be integrated with data of diverse quality.
🌐
🧵1/6
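To make the thread's core idea concrete, here is a minimal, hypothetical 1-D sketch (not the paper's implementation): the score of the noised target distribution p(u) ∝ exp(-J(u)/λ) is estimated by Monte Carlo evaluations of a model-based cost, so no demonstration data or learned score network is needed. The cost function, sample count, and noise schedule below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def cost(u):
    # Toy stand-in for a model-based rollout cost J(u);
    # the optimum is at u = 2.0 by construction.
    return (u - 2.0) ** 2

def mbd_step(u, sigma, lam=0.1, n=256):
    """One reverse-diffusion step. The score of the noised target
    p(u) ~ exp(-cost(u)/lam) is estimated by Monte Carlo through the
    model, instead of by a network trained on demonstrations."""
    samples = u + sigma * rng.standard_normal(n)
    c = cost(samples)
    w = np.exp(-(c - c.min()) / lam)          # model-based weights
    # weighted mean ~= u + sigma**2 * score (Tweedie's identity)
    return np.sum(w * samples) / np.sum(w)

u = -3.0
for sigma in np.linspace(1.0, 0.05, 30):      # annealed noise schedule
    u = mbd_step(u, sigma)
print(u)  # ends close to the optimum u = 2.0
```

In the actual method the decision variable is a full trajectory and J(·) comes from rolling out the dynamics model; the annealed reverse process then plays the role of trajectory optimization.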
My second blog post: "Learning-theoretic Perspectives on MPC via Competitive Control." This post discusses Model Predictive Control (MPC) from an online-learning perspective: why is MPC a competitive online learner, and what can we learn from it?
Such a cool idea! I was trying to use the Kolmogorov-Arnold representation theorem to prove the expressiveness of the heterogeneous deep sets structure in the Neural-Swarm2 paper (, figure below).
I think the Deep Sets paper () and
MLPs are so foundational, but are there alternatives? MLPs place activation functions on neurons, but can we instead place (learnable) activation functions on weights? Yes, we KAN! We propose Kolmogorov-Arnold Networks (KAN), which are more accurate and interpretable than MLPs.🧵
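As a rough illustration of the edge-activation idea (not the authors' implementation, which parameterizes each edge with a B-spline), a toy KAN layer can be sketched with a learnable polynomial on every weight:

```python
import numpy as np

class KANLayer:
    """Toy KAN layer: every edge (i, j) carries its own learnable 1-D
    function phi_ij (here a cubic polynomial; the paper uses B-splines),
    and output j is the sum of phi_ij over incoming edges."""
    def __init__(self, n_in, n_out, degree=3, seed=0):
        rng = np.random.default_rng(seed)
        # c[i, j, k]: k-th polynomial coefficient of edge (i, j)
        self.c = 0.1 * rng.standard_normal((n_in, n_out, degree + 1))

    def __call__(self, x):                    # x: (batch, n_in)
        k = self.c.shape[-1]
        basis = np.stack([x ** p for p in range(k)], axis=-1)
        # y[b, j] = sum_i phi_ij(x[b, i])
        return np.einsum('bik,iok->bo', basis, self.c)

layer = KANLayer(4, 2)
y = layer(np.ones((5, 4)))
print(y.shape)  # (5, 2)
```

Contrast with an MLP layer, where the weights are scalars and a fixed nonlinearity sits on the output neurons; here all the nonlinearity lives on the (learnable) edges.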
I wrote a tutorial on diffusion models for undergrad and grad students. I tried my best to give intuitive explanations for complicated equations.
Your feedback is much appreciated
Thanks to those who suggested various reading materials to me
Quadrupedal robots are mastering agile skills like running, standing, and parkouring, yet face the peril of damaging falls.
Introducing🛡️Guardians as You Fall (GYF), a safe falling and recovery framework that can actively tumble and recover to stable modes to minimize damage in
Grad school applicants (hope it's not too late): my group at
@CMU_Robotics
is hiring Ph.D. and master's students in Fall 2023!
If you are interested in learning & control & robotics, please consider applying to our graduate programs (, deadline is Dec 12).
How to start the data flywheel for human-like embodied intelligence? We think real-time, whole-body teleoperation of a humanoid🤖 will be a solution. The embodiment alignment allows seamless integration of human cognitive skills with versatile humanoid capabilities.
🤖 Introducing H2O (Human2HumanOid):
- 🧠 An RL-based human-to-humanoid real-time whole-body teleoperation framework
- 💃 Scalable retargeting and training using large human motion dataset
- 🎥 With just an RGB camera, everyone can teleoperate a full-sized humanoid to perform
For humanoid loco-manipulation tasks involving specific sequential contacts (e.g., the clapping & dancing video below), we found that the contact sequence itself is naturally an ideal representation for:
- Task decomposition to reduce the exploration burden;
- Simple &
🚨 Without Any Motion Priors, how to make humanoids do versatile parkour jumping🦘, clapping dance🤸, cliff traversal🧗, and box pick-and-move📦 with a unified RL framework?
Introducing WoCoCo:
🧗 Whole-body humanoid Control with sequential Contacts
🎯Unified designs for minimal
🚀Excited to introduce CoVO-MPC (Covariance-Optimal MPC)! Thanks to its flexibility and parallelizability, sampling-based MPC has been a practical approach in many domains, especially model-based RL.
However, there is no convergence analysis or principled way to tune
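For context, the baseline can be sketched minimally: vanilla sampling-based MPC (MPPI-style) on a toy double integrator, where the isotropic sampling covariance `sigma` is exactly the hand-tuned quantity that a covariance-optimal method would set in a principled way. All numbers below are illustrative assumptions, not the paper's settings.

```python
import numpy as np

rng = np.random.default_rng(0)

def rollout_cost(x0, U, dt=0.1):
    """Model-based rollout of a toy double integrator; the cost
    penalizes distance to the origin plus control effort."""
    x, v, c = x0[0], x0[1], 0.0
    for u in U:
        v += dt * u
        x += dt * v
        c += x * x + 0.01 * u * u
    return c

def sampling_mpc_step(x0, U, n=128, sigma=0.5, lam=1.0):
    """Vanilla sampling-based MPC: perturb the nominal control
    sequence, weight rollouts by exp(-cost/lam), and average.
    sigma (the sampling covariance) is hand-tuned here."""
    eps = sigma * rng.standard_normal((n, len(U)))
    costs = np.array([rollout_cost(x0, U + e) for e in eps])
    w = np.exp(-(costs - costs.min()) / lam)
    return U + (w[:, None] * eps).sum(axis=0) / w.sum()

x0 = np.array([1.0, 0.0])
U = np.zeros(10)
for _ in range(20):
    U = sampling_mpc_step(x0, U)
print(rollout_cost(x0, U) < rollout_cost(x0, np.zeros(10)))  # True
```

The weighted-average update is gradient-free and trivially parallel over the n rollouts, which is why this family is popular in model-based RL.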
Two theory papers about online learning and control accepted by
#NeurIPS2020
!
In the first paper, we propose a new class of online optimization with memory and connect it with control; in the second, we study how valuable predictions are in online control and analyze MPC.
We open-sourced everything for the Agile But Safe (ABS) project! Including hardware installation, system setup, simulation training, and real-world deployment.
👉
Code:
Led by
@TairanHe99
@ChongZitaZhang
Legged robots are mastering agile skills like parkouring, but how to unleash their agility in cluttered environments, where collision avoidance is a must?
We introduce ABS, a fully onboard and autonomous control framework that enables agile and collision-free locomotion for
Trajectory tracking is a standard control task, but how does a drone track arbitrary, potentially infeasible trajectories (e.g., the triangle or star shapes below) with large disturbances?
Introducing Deep Adaptive Trajectory Tracking (DATT,
@corl_conf
'23 oral), an RL and control
Unfortunately, I am not at
#ICRA2024
, but students from
@LeCARLab
will present several papers about learning and control in agile robotics. Check them out!
I am going to present "Unifying Semantic and Physical Intelligence for Generalist Humanoids" at
#CVPR2024
(The Computer Vision in the Wild Workshop). 11:30am on Jun 17 at Arch 3B. Will cover:
- Interface between semantic and physical intelligence: H2O, OmniH2O
- Learning
🛫 to ATL for
@corl_conf
! Present 4 papers about learning and control for various agile robots ✈️ 🐩 🚗:
1. Safe Deep Policy Adaptation (Deployable Workshop Mon 10:40) jointly tackles the problems of policy adaptation and safe RL under unseen disturbances.
I'll be presenting two works this week at
@NeurIPSConf
:
(Tue, poster session 1, E2) Analyzing MPC from learning-theoretic views in LTV systems - MPC is a competitive online learner
(Thu, session 7, E0) Meta-adaptive control - end-to-end guarantees for multi-task nonlinear control
How valuable are predictions in online control? How many predictions are needed to achieve O(1) dynamic regret? How well does MPC perform?
We answer these in our new paper, joint with Chenkai Yu,
@yisongyue
, Soon-Jo Chung and Adam Wierman.
Two papers accepted by
#NeurIPS2023
!
@LeCARLab
1. (Spotlight) Which parameters in the dynamics model are most critical for MBRL, and how to quickly learn those parameters?
2. To learn a representation for multi-task robotics, which tasks are most
How to optimally explore for MBRL in unknown nonlinear systems?
We formally quantify which model parameters are most relevant to learning a good policy and provide a statistically optimal algorithm matching the lower bound!
w/ Andrew Wagenmaker and
@kgjamieson
I had the pleasure of visiting
@CMU_Robotics
over the past two days to give a VASC seminar talk and a guest lecture. Thanks
@GuanyaShi
for being an amazing host! 🙌
The seminar talk was about our recent work on "Foundation Models for Robotic Manipulation": 🤖
Mind-blowing dexterous manipulation results, at 200Hz!
Cannot wait to see what happens when it is combined with some agile & robust lower-body policy.
We will make H2O () better to get closer to this goal : )
With OpenAI, Figure 01 can now have full conversations with people
-OpenAI models provide high-level visual and language intelligence
-Figure neural networks deliver fast, low-level, dexterous robot actions
Everything in this video is a neural network:
Thrilled to co-organize the Agile Robotics workshop. We believe robotic agility is far beyond a low-level control problem - it requires unification and reconciliation of reasoning, perception, planning, and control, especially in the era of large foundation models!
Consider
Traveling to Japan for ICRA? Consider showcasing your recent or upcoming work at the Agile Robotics workshop!
The deadline to contribute an extended abstract is 𝗠𝗮𝗿𝗰𝗵 𝟮𝟵.
Then, join on Monday, May 13 for an exciting slate of invited speakers:
Here are my thoughts on improving review quality for conferences like
@NeurIPSConf
: (1) Don't increase the number of reviews per paper to 6; this only increases the review burden. (2) Reduce the review burden for junior reviewers. (3) Consider ACs' evaluations/feedback for reviewers
Multi-task control is hard (e.g., a drone flying in different winds at Caltech CAST).
Excited to share OMAC, an online multi-task nonlinear control algorithm with non-asymptotic end-to-end guarantees! Key idea: integrate meta-learning with adaptive control
#ICRA2019
We will present Neural Lander today @ POD 22, 4-5pm
One of the first instances of provably robust deep learning based controllers! We show Lyapunov stability while using a deep NN as part of the controller design. Experiments on real robots too!
Finally, we released the method behind this "magic" (the minimum distance between drones is only 24cm):
We introduce Heterogeneous Deep Sets to learn the complex interactions with permutation invariance, and use them to design stable controllers and planners
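A minimal sketch of the permutation-invariance idea, using an ordinary Deep Sets architecture with random weights (the heterogeneous version in the paper uses a separate encoder per neighbor type):

```python
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.standard_normal((3, 8))   # shared per-neighbor encoder weights
W2 = rng.standard_normal((8, 1))   # decoder weights

def deep_set(neighbors):
    """f(X) = rho(sum_i phi(x_i)): pooling the per-neighbor embeddings
    by summation makes f invariant to the ordering of the neighbors."""
    pooled = np.tanh(neighbors @ W1).sum(axis=0)   # phi, then sum-pool
    return np.tanh(pooled) @ W2                    # rho

X = rng.standard_normal((5, 3))    # 5 neighbor states, 3-D each
out1 = deep_set(X)
out2 = deep_set(X[::-1])           # same neighbors, reversed order
print(np.allclose(out1, out2))     # True: permutation invariant
```

Because the sum is order-independent, the learned interaction model applies to any number and ordering of nearby drones, which is what makes it usable inside a controller and planner.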
This is not a choreographed "dance" - these robots are flying autonomously and not bumping into one another thanks to a system developed by Caltech engineers.
@antoine_leeman
I think it really depends on what "RL" means and what solvers we use for TO. If RL here refers to sim2real + PPO-style algorithms and TO uses first-order or second-order methods with simplified models, I cannot really imagine in which cases TO will dominate. Nevertheless, RL +
Incredible achievements! The best agile robotics paper I've read this year. It once again proves the power and necessity of (1) residual learning and real-to-sim for robot learning and (2) perception-control-in-a-loop.
I am very much looking forward to seeing champion-level
We are thrilled to share our groundbreaking paper published today in
@Nature
: "Champion-Level Drone Racing using Deep Reinforcement Learning." We introduce "Swift," the first autonomous vision-based drone that beat human world champions in several fair head-to-head races! PDF
Happy Holidays from our robots to you and your robots! With all the best wishes for a wonderful holiday season.
From the Caltech ARCL lab, edited by
@lupusorina
The key is that we meta-learn a DNN from 6 different wind conditions (12 min of data in total), and use adaptive control to fine-tune it in real time.
Neural-Fly significantly reduces the control error of the current SOTA and can generalize to unseen stronger winds and unseen drones
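A toy sketch of the adapt-only-the-last-layer idea: offline meta-learning produces a shared basis φ, and online only the linear coefficients on top of it are updated, here via recursive least squares on synthetic data. The basis φ and all numbers below are stand-ins, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(0)

def phi(x):
    """Stand-in for the meta-learned basis (a DNN in Neural-Fly):
    features of the state shared across all wind conditions."""
    return np.array([np.sin(x), np.cos(x), x])

# unknown wind-specific residual force: f(x) = phi(x) @ a_true
a_true = np.array([1.5, -0.7, 0.3])

a_hat = np.zeros(3)                # online-adapted linear coefficients
P = 10.0 * np.eye(3)               # recursive-least-squares covariance
for _ in range(200):
    x = rng.uniform(-2.0, 2.0)
    f = phi(x) @ a_true + 0.01 * rng.standard_normal()  # noisy measurement
    p = phi(x)
    K = P @ p / (1.0 + p @ P @ p)  # RLS gain
    a_hat = a_hat + K * (f - p @ a_hat)
    P = P - np.outer(K, p @ P)
print(np.round(a_hat, 2))          # close to a_true
```

Adapting only a low-dimensional linear head is what makes real-time adaptation fast and stable, while the deep basis carries the generalization across wind conditions.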
I will present these online learning & control papers at
#NeurIPS2020
in
(1) Poster Session 1
Mon, Dec 7 @ 9–11pm PST
GatherTown: RL & planning (Town C1-Spot A3)
(2) Session 7
Thu, Dec 10 @ 9–11pm PST
Optimization (A1-B0)
TBH, I doubted this way of interacting at first, but now I love it!
In NOLA for
@NeurIPSConf
! Present two papers and one workshop paper. Excited to chat about RL, control, and robotics. DM if you want to meet up : )
1. Optimal Exploration for Model-based RL in Nonlinear Systems (). Thu 10:45-12:45pm
#1507
. TL;DR: Not all
#JetsonMeetUp18
So many DL applications in robotics! My favorite demo is
@Skydio
self-flying camera, where DL is used for object tracking, depth estimation, and even SLAM! I am interested in understanding high-dimensional dynamics with DL to help robot control.
@NVIDIAEmbedded
The Carnegie Bosch Institute is inviting the first cohort of postdoc fellows (two-year, fully-funded) in the field of AI and cybersecurity.
Come work w/ me and others at
@CarnegieMellon
@SCSatCMU
. Support letters are needed, so reach out if interested!
Joint work with
@BenRiviere2
@yisongyue
Soon-Jo Chung and Wolfgang Hoenig. Paper links: . To achieve the cool results shown in the video, the key is to use permutation-invariant networks to learn the complex interactions between drones
Just arrived in Atlanta for
#ACC2022
. Looking forward to meeting friends and brainstorming new research ideas!
I will present our work on online learning and control with inaccurate predictions:
CAJun shows the great potential of hierarchical RL and optimization-based control: RL for a versatile & adaptive centroidal policy, and control for robust & reactive tracking
Benefits: Efficient (20 mins, one GPU); Performance (40% wider jumps than SOTA); Robust (critical for continuous jumps)
We present CAJun, a hierarchical learning-control framework that achieves continuous, long-distance jumps (up to 70cm) on quadrupedal robots!
Paper:
Video:
Website:
Neural-Fly is the third child in the "Neural-Control" family (see my blog post ), a family of deep-learning-based nonlinear control methods. Unlike two earlier systems (Neural-Lander/Swarm), Neural-Fly learns and adapts in real-time!
In both papers, we use metrics (competitive ratio or dynamic regret) that compare directly against the globally optimal policy.
Paper links: , . Will write a blog soon!
With
@LinYiheng
, Chenkai Yu,
@soonjochung
,
@yisongyue
, and
@AdamWierman
.
Let’s think about humanoid robots beyond carrying boxes. How about having the humanoid come out the door, interact with humans, and even dance?
Introducing Expressive Whole-Body Control for Humanoid Robots:
See how our robot performs rich, diverse,
OMAC is also compatible with deep representation learning (we provide demo PyTorch code for an inverted pendulum task)!
Joint work with
@kazizzad
, Soon-Jo Chung, and
@yisongyue
.
Great work by
@zipengfu
and the team! As long as we can solve the whole-body control problem, the humanoid is such an exciting generalist physical intelligence platform, because of the human-to-humanoid embodiment alignment.
Introducing HumanPlus - the shadowing part
Humanoids are born to use human data. We build a real-time shadowing system using a single RGB camera and a whole-body policy for cloning human motion. Examples:
- boxing🥊
- playing the piano🎹/ping pong
- tossing
- typing
Open-sourced!
I am interested in:
1. Safe robotic control with learned agility (aerial robots, locomotion, ground vehicles, swarms)
2. Learning & control theory
3. Offline learning + online adaptation
4. Model-based + model-free
5. Safer and more structured RL
And many other exciting topics!
I am very impressed by this example, which clearly shows that GPT-4 has common sense grounding.
But I am also confused by the definition of "zero-shot" in this report: given the unprecedented scale of data, how do we ensure these examples are not included in the training data?
4. CAJun: Continuous Adaptive Jumping using a Learned Centroidal Controller (poster 12:00-12:45 on Wed). CAJun is a hierarchical learning and control framework that enables legged robots to jump continuously with adaptive distances.
Introducing 🤖🏃Humanoid Parkour Learning
Using vision and proprioception, our humanoid can jump over hurdles and platforms, leap over gaps, walk up/down stairs, and much more.
🖥️Check our website at
📺Stay tuned for more videos.
If you have probabilistic ML models (e.g., GP) in the nonlinear dynamics with safety constraints, you will have to consider uncertainty propagation and safety violations (formulated as chance constraints). We propose a new framework for robust learning, exploration, and planning.
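For the Gaussian case, the chance-constraint reformulation mentioned above is a one-liner: a probabilistic constraint P(g(x) ≤ 0) ≥ 1 − δ tightens into a deterministic one via the standard-normal quantile. This is the textbook identity, not the paper's full propagation framework.

```python
from statistics import NormalDist

def chance_constraint_ok(mu, sigma, delta=0.05):
    """For g(x) ~ N(mu, sigma**2), the chance constraint
    P(g(x) <= 0) >= 1 - delta is equivalent to the deterministic,
    tightened constraint mu + z_{1-delta} * sigma <= 0."""
    z = NormalDist().inv_cdf(1.0 - delta)   # standard-normal quantile
    return mu + z * sigma <= 0.0

print(chance_constraint_ok(-0.5, 0.1))  # True: low uncertainty
print(chance_constraint_ok(-0.5, 0.5))  # False: uncertainty too large
```

The hard part, which the tightening alone doesn't solve, is propagating the predictive uncertainty (e.g., from a GP dynamics model) through the nonlinear dynamics to get valid mu and sigma at each planning step.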
2. Deep Model Predictive Optimization (Deployable Workshop Mon 10:40). DMPO learns the inner-loop optimizer of sampling-based MPC directly via experience, outperforming MPC and end-to-end RL baselines.
3. DATT: Deep Adaptive Trajectory Tracking for Quadrotor Control (oral at 8:30pm on Wed, poster 12:00-12:45 on Wed). DATT can precisely track arbitrary, potentially infeasible trajectories in the presence of large disturbances.
A nice blog about optimality/robustness in control. Takeaways: (1) continuous and discrete systems are inherently different in terms of robustness; (2) in LQR, the CE controller enjoys large margins but also has natural fragility; (3) (not in this blog) in LQG, CE is much more fragile.
Agree! We can’t say it is “human-like” since they use a powerful sensor to estimate the cube state. BTW, the key assumption in DR is that the true dynamics lie in the simulator's parameter space, which is hard to verify. We need to quantify uncertainties and robustness.
Since
@OpenAI
still has not changed its misleading blog post about "solving the Rubik's cube," I attach a detailed analysis comparing what they say and imply with what they actually did. IMHO most of it would not be obvious to nonexperts.
Please zoom in to read & judge for yourself.
@beenwrekt
Hi Prof. Recht, thanks for this great blog! Really inspiring. In the last blog, you mentioned that in LQR, discrete and continuous systems are inherently different (in terms of robustness). I am wondering whether this difference also holds in LQG. Is discrete LQG more fragile?
Our first talk:
09/30: Elad Hazan (Princeton)
@HazanPrinceton
"The Non-Stochastic Control Problem"
Please visit for more information (subscribe to the Google group to receive Zoom link and future announcements).
@beenwrekt
Hi Prof. Recht, thanks for the great blog! I feel our recent work () might be interesting to you. We explain why MPC is near-optimal in LQR even with adversarial noise. MPC only needs O(log T) predictions to reach O(1) dynamic regret.