🔊 Audio signals contain rich information about daily interactions. Can our robots learn from videos with sound?
Introducing ManiWAV, a robotic system that learns contact-rich manipulation skills from in-the-wild audio-visual data. See thread for more details (1/4) 👇
🤖 Can robots reason about their mistakes by reflecting on past experiences?
(1/n) We introduce REFLECT, a framework that leverages Large Language Models for robot failure explanation and correction, based on a summary of multi-sensory data. See below for details and links👇
Try out
#LumaDreamMachine
for robotics action generation. Even though there are artifacts in the generated objects, I would say the kinematics of the robot motion is pretty good. Can we use it for robotics data?
Thanks
@_akhaliq
for covering our work! We find that an LLM can reliably identify and explain robot failures given a textual summary of the robot's past experiences, generated from raw sensory inputs.
More results on
Please stay tuned for code release!
REFLECT: Summarizing Robot Experiences for Failure Explanation and Correction
paper page:
The ability to detect and analyze failed executions automatically is crucial for an explainable and robust robotic system. Recently, Large Language Models (LLMs)
Excited to see such a neat and ready-to-use data collection system added to the robotics community! Looking forward to all the cool things our robots can learn😎
Can we collect robot data without any robots?
Introducing Universal Manipulation Interface (UMI)
An open-source $400 system from
@Stanford
designed to democratize robot data collection
0 teleop -> autonomously wash dishes (precise), toss (dynamic), and fold clothes (bimanual)
🫳The hand-held data collection device synchronously records images from a GoPro camera with a fisheye lens and audio from a contact microphone embedded in the gripper finger.
🧠With the collected demonstrations, we train an end-to-end sensorimotor learning model. (2/4)
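The tweet only describes the policy at a high level, but the fusion idea can be illustrated with a toy sketch: encode each modality (vision and audio) separately, concatenate the embeddings, and map the fused features to an action. Everything below is illustrative — the shapes, random linear "encoders," and 7-DoF action head are assumptions, not the actual ManiWAV architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, w):
    """Toy encoder: flatten the input and apply a linear projection + ReLU."""
    h = x.reshape(-1) @ w
    return np.maximum(h, 0.0)

# Hypothetical inputs: a 32x32 grayscale camera frame and a 64-bin
# audio-spectrogram slice from the contact microphone.
img = rng.standard_normal((32, 32))
audio = rng.standard_normal(64)

# Random weights stand in for trained encoder/policy parameters.
w_img = rng.standard_normal((32 * 32, 16)) * 0.05
w_audio = rng.standard_normal((64, 16)) * 0.05
w_policy = rng.standard_normal((32, 7)) * 0.05  # assumed 7-DoF action head

# Fuse the two modalities by concatenating their embeddings,
# then predict an action with a linear policy head.
fused = np.concatenate([encode(img, w_img), encode(audio, w_audio)])
action = fused @ w_policy

print(action.shape)  # (7,)
```

The point of the sketch is only the data flow: synchronized image and audio streams enter as separate tensors and meet in a shared feature space before the action prediction.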
Cool work on generative gripper design! Impressive that all designs are generated with the same model by just taking as input a 2D/3D shape and a manipulation goal (e.g. shift up, rotate counterclockwise).
Can we automate task-specific mechanical design without task-specific training?
Introducing Dynamics-Guided Diffusion Model for Robot Manipulator Design, a data-driven framework for generating manipulator geometry designs for given manipulation tasks.
w. Huy Ha,
@SongShuran
🥯 By collecting in-the-wild demonstrations in diverse environments, our policy generalizes directly to unseen in-the-wild environments across several different test-time scenarios. (3/4)
(3/n) We systematically query the LLM with a progressive failure explanation algorithm that handles both execution-level and planning-level failures.
Conditioned on the explanation, the LLM then generates a correction plan for the robot to complete the task.
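The progressive query structure described above can be sketched as a small control loop: first check task success, then localize the failure, then ask for a correction plan conditioned on the explanation. This is a minimal illustration, not the paper's actual prompts or pipeline; `query_llm` is a canned stand-in for a real LLM call, and all strings here are made up.

```python
def query_llm(prompt):
    # Stand-in for a real LLM call; returns canned answers for the demo.
    if "correction" in prompt:
        return "re-approach the mug, then close the gripper"
    if "succeed" in prompt:
        return "no"
    return "execution-level: gripper closed before reaching the mug"

def explain_and_correct(summary):
    # Step 1: verify task success against the summary's final state.
    if query_llm(f"Given this summary, did the task succeed?\n{summary}") == "yes":
        return None, None
    # Step 2: progressively localize the failure (execution level first;
    # the full algorithm falls back to planning-level checks).
    explanation = query_llm(f"Identify any execution-level failure:\n{summary}")
    # Step 3: condition the correction plan on the explanation.
    plan = query_llm(f"Given failure '{explanation}', propose a correction plan.")
    return explanation, plan

explanation, plan = explain_and_correct(
    "event 1: approach mug\nevent 2: close gripper (empty)"
)
```

The key design point mirrored here is that each query is scoped narrowly and conditioned on the previous answer, rather than asking the LLM one open-ended question about the whole experience.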
@vrushankdes
Yes, it’s mostly because of the motor vibration. I agree it could be a good idea to add some noise-absorbing material between the gripper and the robot!
(2/n) By leveraging foundation models, REFLECT converts unstructured, multi-sensory robot data into a hierarchical textual summary of robot sensory inputs, key events, and subgoals. The summary facilitates quick failure localization and in-context explanation.
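As a rough illustration of what a "hierarchical textual summary" might look like, the sketch below groups timestamped key events under their subgoals to form a two-level text outline. The event records and field names are invented for the example; in REFLECT these come from foundation-model processing of raw multi-sensory data.

```python
# Hypothetical per-event records (subgoal + captioned observation).
events = [
    {"t": 1.0, "subgoal": "pick up mug", "observation": "gripper open, mug on table"},
    {"t": 3.5, "subgoal": "pick up mug", "observation": "gripper closed, mug on table"},
    {"t": 6.0, "subgoal": "place mug",   "observation": "gripper empty above sink"},
]

def summarize(events):
    # Group key events under their subgoal to form a two-level summary,
    # which makes it easy to localize where a failure occurred.
    lines = []
    current = None
    for e in events:
        if e["subgoal"] != current:
            current = e["subgoal"]
            lines.append(f"Subgoal: {current}")
        lines.append(f"  [{e['t']:.1f}s] {e['observation']}")
    return "\n".join(lines)

print(summarize(events))
```

A summary structured this way lets a reader (or an LLM) scan subgoals first and drill into events only where something looks wrong, which is the failure-localization behavior the thread describes.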