How can robot hands learn to do kitchen tasks from safe, continual real-world interaction?
In our new CoRL 2023 paper DEFT, we learn policy priors from human videos and use our soft robot hand to safely adapt the policy with real-world practice.
1/
Want to get into dexterous manipulation but frustrated by the lack of good hardware options? You should try LEAP Hand, our dexterous, durable, $2k robot hand that can be assembled in 3 hours. We even open-sourced our sim2real code!
Visit us demoing at
#CoRL2023
Poster 5 and afternoon demos today!
@anag004
uses LEAP Hand for sim2real dexterous, functional grasping
@aditya__kannan
uses DASH Hand to learn from internet videos and real-world fine-tuning
Happy to share that our paper, Videodex, was featured on the front page of IJRR for April 2024. 😀 I believe learning from human videos for robot hands will be very important, especially in the era of foundation models that need a lot of data.
Great investigative work by Toru and co! Gloves are definitely a great way to collect robot hand teleop data. Our lab has also been building a big setup, which we will demo and open-source at RSS 2024 :)
A common question we get for HATO is: can it be more dexterous?
Yes!
The first iteration of our system actually achieves this -- by capturing finger poses with mocap gloves and remapping them to robot hands.
[video taken in late 2023 (with
@yuzhang
)]
Autonomously opening ANY door in ANY environment is a very hard problem. We
1) Initialize our policy with VLM + priors and adapt continually to the real world using RL.
2) Develop robust, repeatable, low-cost hardware to unlock new robot performance
Introducing Open-World Mobile Manipulation 🦾🌍
– A full-stack approach for operating articulated objects in open-ended unstructured environments:
Unlocking doors with lever handles/ round knobs/ spring-loaded hinges 🔓🚪
Opening cabinets, drawers, and refrigerators 🗄️
👇
Robotic intelligence requires dexterous tool use, but generalizing across tools is hard.
Our CoRL23 paper combines semantics (affordances) with low-level control (sim2real) to show functional grasping that generalizes to hammers, drills and more!
1/n
Great project using Mocap gloves to collect accurate human hand data that can transfer to our LEAP Hand. The autonomous results are great! Best of all, everything is open source and very inexpensive to replicate. Congrats
@chenwang_j
and SAIL team!!!
Can we use wearable devices to collect robot data without actual robots?
Yes! With a pair of gloves🧤!
Introducing DexCap, a portable hand motion capture system that collects 3D data (point cloud + finger motion) for training robots with dexterous hands
Everything open-sourced
We introduce D-Cubed, a novel trajectory optimisation method that uses a latent diffusion model, trained on a task-agnostic play dataset containing only representative hand motions, to solve dexterous deformable object manipulation tasks!
(1/N)
Dynamixels are truly great. They're to hardware what Python is to software: a great abstraction layer over actuation. They allow us to focus on the robot design and learning itself.
Open-source robot arm for about $200. It uses five Dynamixel servo motors and weighs slightly over 100g (without the base). The design for a leader arm is included as well so that you can teleoperate it effectively. (Video is at 1x speed)
This has to be peak AI frenzy.
Cognition Labs potentially raising at a $2B valuation, just weeks after announcing their $21M raise at $350M.
Do they even have $1M in revenue?
This also applies to robot hardware open-sourcing. Creating LEAP Hand for a demo was one thing but creating an open-source guide and community was a whole new ballgame for me.
This resonates with my own experience:
robot demo --> INSANE PAIN --> robot creating value in production
luckily I am hopeful we have by now paid most of our dues at Covariant :)
Open-source, reproducible hardware should be the future of robotics, especially as manufacturing techniques such as 3D printing and CNC machining continue to improve.
The power of open source hardware!
I met
@chichengcc
and his new UMI grippers at Stanford last week. Turns out they've already been adopted on a humanoid by
@benjamin_bolte
and
@_mattfreed
🙌
Do you really need legs? We don't think so. As much as we love anthropomorphic humanoids (our co-founder built one in 9th grade), we believe virtually all menial tasks can be done with two robot arms mounted on wheels. In our view,
@1x_tech
's Eve robot is the optimal form factor
How do we practice safely in the environment?
We use DASH hand, our soft, dexterous end-effector. It has a small, human-like kinematic structure and is strong yet resilient to collisions.
This enables continual policy learning on 9 tasks over several hours of evaluation.
5/
How can we enable robots to perform diverse tasks? Designing rewards or demos for each task is not scalable.
We propose WHIRL which learns by watching a single human video followed by autonomous exploration *directly* in the real world (no simulation)!
I’m really excited to be starting a new adventure with multiple amazing friends & colleagues.
Our company is called Physical Intelligence (Pi or π, like the policy).
A short thread 🧵
Not to pick on Agility here, because I love their robot and everything they've shown, but watch the feet in this video, between 1 and 2 seconds, to get an idea of why getting sim-to-real for manipulation to work well is so hard
I had the pleasure of visiting
@CMU_Robotics
over the past two days to give a VASC seminar talk and a guest lecture. Thanks
@GuanyaShi
for being an amazing host! 🙌
The seminar talk was about our recent work on "Foundation Models for Robotic Manipulation": 🤖
🥽 Want to use your new Apple Vision Pro to control your robot? Want to record how you navigate / manipulate the world to train a policy?
I developed an app for visionOS that streams your head / wrist / finger movements over WiFi, which you can subscribe to from any machine using
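On the receiving side, such a stream just needs to be decoded into transforms. Here is a purely hypothetical sketch of a subscriber-side parser — the app's actual wire format and client library are not specified in the post, so the JSON layout, field names, and 25-joint finger vector below are all assumptions for illustration:

```python
import json

import numpy as np


def parse_pose_packet(payload: bytes) -> dict:
    """Parse one hypothetical tracking packet: JSON carrying 4x4 homogeneous
    transforms for the head and both wrists, plus a flat vector of finger
    joint values. The real app's format may differ."""
    msg = json.loads(payload)
    return {
        "head": np.array(msg["head"]).reshape(4, 4),
        "wrists": {k: np.array(v).reshape(4, 4) for k, v in msg["wrists"].items()},
        "fingers": np.array(msg["fingers"]),  # e.g. per-joint angles
    }


# Build a fake packet the way a sender might, then round-trip it.
packet = json.dumps({
    "head": np.eye(4).tolist(),
    "wrists": {"left": np.eye(4).tolist(), "right": np.eye(4).tolist()},
    "fingers": [0.0] * 25,
}).encode()
pose = parse_pose_packet(packet)
```

In practice this parser would sit inside whatever transport loop (WebSocket, UDP, gRPC stream) the app actually uses.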
Interesting (and sad) result here; I really had hoped more people would be able to just run with ROS2. But it seems like it's not quite there, if this is in any way worth doing for a small company/fast-moving startup that should be the target audience.
We improve beyond the prior with real world fine-tuning. Grasp parameters are sampled from a normal distribution initialized around the prior’s outputs.
These parameters are tested in the real world and assigned rewards. We iterate upon the distribution to maximize reward.
3/
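This sample-score-refit loop is essentially the cross-entropy method. A minimal sketch, with a toy stand-in reward function (in the paper, rewards come from real-world trials, and the parameters are grasp poses rather than an abstract vector):

```python
import numpy as np

rng = np.random.default_rng(0)


def refine_grasp(prior_mean, prior_std, reward_fn, iters=10, samples=32, n_elite=8):
    """Sample grasp parameters from a Gaussian initialized around the prior's
    outputs, score each sample, and refit the Gaussian to the top-scoring
    samples -- a cross-entropy-method-style improvement loop."""
    mean = np.asarray(prior_mean, dtype=float)
    std = np.asarray(prior_std, dtype=float)
    for _ in range(iters):
        params = rng.normal(mean, std, size=(samples, mean.size))
        rewards = np.array([reward_fn(p) for p in params])
        elites = params[np.argsort(rewards)[-n_elite:]]  # best samples
        mean, std = elites.mean(axis=0), elites.std(axis=0) + 1e-3
    return mean


# Toy stand-in: reward is higher the closer we get to a "true" parameter vector.
target = np.array([0.5, -0.2, 0.3])
best = refine_grasp([0.0, 0.0, 0.0], [1.0, 1.0, 1.0],
                    lambda p: -np.linalg.norm(p - target))
```

With a handful of iterations the distribution concentrates near the high-reward region, which is the behavior the thread describes.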
Manufacturing update: taking delivery of a 5-axis CNC at the new NEO factory. The first of many machines to come. Prototype parts in hours instead of waiting weeks for a supplier 🚀
1X is a vertically integrated humanoid company. We make every part in-house, from winding our own
@sirwart
@Yannick46062890
One other thing: these Dynamixel U2D2s have 16 ms of latency by default on most Linux OSes. On Mac you can't even get rid of it easily anymore. There's a troubleshooting guide on our LEAP Hand API GitHub; changing one driver setting helps a lot with that latency.
One thing's for sure: Tesla has solved the deployment problem already. Once the AI models work, they can be adopted very fast. Many startups right now have demos but are far behind on the deployment curve.
The data inefficiency of our current ML/NN pipelines is a huge problem. They need far more data to learn than a human does. If that could somehow magically improve, it would have enormous impact.
No.
If that were the case, we would have AI systems that could teach themselves to drive a car in 20 hours of practice, like any 17-year-old.
But we still don't have fully autonomous, reliable self-driving, even though we (you) have millions of hours of *labeled* training data.
The shoulder singularity is one I see a lot on 6-DOF arms during teleop, which can be frustrating. Keeping this in mind, raising or lowering the table can actually help prevent these two axes from reaching an aligned pose, in my experience.
🤖 What are 𝗥𝗢𝗕𝗢𝗧 𝗦𝗜𝗡𝗚𝗨𝗟𝗔𝗥𝗜𝗧𝗜𝗘𝗦?
A six-axis industrial robot arm has 3 types:
💡 Wrist singularities, elbow singularities, and shoulder singularities.
1️⃣ A wrist singularity occurs when the axes of joints 4 and 6 are coincident.
2️⃣ An elbow singularity occurs when the arm is fully stretched out, so the wrist center lies in the plane of the joint 2 and joint 3 axes.
3️⃣ A shoulder singularity occurs when the wrist center lies on the axis of joint 1.
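What all these cases share is that the manipulator Jacobian loses rank, so some end-effector motions become unreachable. A minimal illustration on a 2-link planar arm (an elbow-type case; link lengths are arbitrary example values), where det(J) = l1·l2·sin(θ2) vanishes as the arm straightens:

```python
import numpy as np


def planar_jacobian(theta1: float, theta2: float, l1: float = 0.3, l2: float = 0.25):
    """Jacobian of the end-effector position of a 2-link planar arm.
    Rows are (dx, dy); columns are the two joint velocities."""
    s1, c1 = np.sin(theta1), np.cos(theta1)
    s12, c12 = np.sin(theta1 + theta2), np.cos(theta1 + theta2)
    return np.array([[-l1 * s1 - l2 * s12, -l2 * s12],
                     [ l1 * c1 + l2 * c12,  l2 * c12]])


# |det(J)| = l1 * l2 * |sin(theta2)|: zero when fully extended (theta2 = 0),
# maximal when the elbow is bent 90 degrees.
extended = abs(np.linalg.det(planar_jacobian(0.3, 0.0)))
bent = abs(np.linalg.det(planar_jacobian(0.3, np.pi / 2)))
```

Monitoring this determinant (or the manipulability index) is a common way to detect and steer away from singular configurations during teleop.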
Tesla may become the biggest robot learning company in the world: Tesla cars with cameras and GPUs, and Optimus to bridge from navigation to manipulation. The car company thing was just to get started. 😅
@xiaolonw
100% agree. Otherwise hardware people do not know the correct thing to design/what learning people really need. And sometimes learning people don't know what hardware they want without trial and error.
To practice efficiently in the real world, we learn a grasp affordance prior from internet-scale human video datasets.
This model predicts where (wrist pose) and how (hand joint angles) to grasp objects, initializing our policy to perform reasonable, human-like behavior.
2/
A residual policy, conditioned on the image, is then learned to fit the grasp and trajectory parameters of the top rollouts. At test time, we use the internet video prior and this learned residual policy together to roll out robot hand behavior.
4/
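The test-time composition above — prior output plus an image-conditioned correction — can be sketched as follows; the linear lambdas here are hypothetical stand-ins for the actual prior and residual networks:

```python
import numpy as np


def rollout_action(image_feat, prior, residual, scale=0.1):
    """Compose a grasp prior with a learned residual correction.
    `prior` and `residual` stand in for networks mapping image features
    to action parameters (e.g. wrist pose and joint angles)."""
    base = prior(image_feat)              # nominal action from the video prior
    delta = scale * residual(image_feat)  # small learned correction
    return base + delta


feat = np.ones(4)                                   # dummy image features
prior = lambda f: np.array([0.5, -0.2, 0.3])        # stand-in prior network
residual = lambda f: np.array([0.1, 0.0, -0.1])     # stand-in residual network
action = rollout_action(feat, prior, residual)
```

Keeping the residual small (here via `scale`) keeps the rollout close to the human-like behavior the prior provides while still allowing real-world correction.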
My first GPU was the GT 240 from back in 2009 for mostly CAD. I remember being so excited about it. Now we buy 8x A6000 ($30k+) and I don't even bat an eye. 😅. Crazy how much GPUs have changed.
@mark_riedl
"In particular, the word “foundation” specifies the role these models play: a foundation model is itself incomplete but serves as the common basis from which many task-specific models are built via adaptation."
@wang_jianren
LOL. Those stickers were funny. They stopped doing that around the 600 series, when we started doing blower coolers and multi-fan designs. 😂
@sobieski902
Pretty much.
It has been staggeringly difficult to make generalized self-driving work, requiring all that you describe above and more.
The investment in training compute, gigantic data pipelines and vast video storage will be well over $10B cumulatively this year.
But that
@sirwart
Lighter is almost impossible without sacrificing strength or switching away from electric motors. For reference, the Shadow Hand is close to 5 kg, and that's the only way they can get that strength and DoF count.
Jim Cramer says the AI buzz is far from a bubble — the game has yet to start - CNBC
“In terms of innings, I don’t think the AI game has even started yet” - Jim Cramer
Did you know? The
@USNavy
determined a need for knowledge maintenance between sailors' school learning & assignments, so they enlisted computer scientists, educators & artists at ICT to collaborate w/ leading universities to create PAL3. Learn more
@chris_j_paxton
Yes.....these simplifications can require a lot of effort to fine-tune in sim. And then real world transfer from there can be even more difficult or impossible.
@richardkelley
There is nothing explicit in our door opening policy for transparent doors/walls, it is all either learned from priors or adapted to in the real-world using VLM rewards. The navigation to be clear is hard-coded and not part of our policy.
@bit_cosby
Very true. It was looking like it was going to run away from me; truth be told, it was a couple dollars from my stop loss. Got a little lucky here, but sometimes when it looks shitty you can turn it around.