Introducing the RoboTurk Real Robot Dataset - one of the largest, richest, and most diverse robot manipulation datasets ever collected using human creativity and dexterity!
111 hours
54 non-expert demonstrators
2144 demonstrations
Download:
[1/2] Our lab has 3 papers accepted to NeurIPS 2019:
1. HYPE: Human Eye Perceptual Evaluation of Generative Models. Zhou and Gordon et al. (Oral)
2. SOCIAL-BIGAT: Multimodal Trajectory Forecasting using Bicycle-GAN and Graph Attention Networks. Kosaraju et al.
Stanford Vision and Learning Lab: Performing Research at the Forefront of Computer Vision, Machine Learning, and Robotics -
@drfeifei
@silviocinguetta
@jcniebles
We are hosting one of the 3 challenges at CVPR20. Train your navigating agent in our simulator Gibson () and we will test it in the real world! The best solutions will be showcased live during CVPR. More info:
Learning from hints (not demonstrations): A new paper on an important direction of RL for control where expert intuition can be used to guide learning without the need to provide optimal or even complete solutions.
Excited to be at
#ICRA2019
Best Paper Award talk
Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations for Contact-Rich Tasks
Paper:
Video:
We are happy to announce our ICCV19 Workshop on Visual Perception for Navigation in Human Environments: The JackRabbot Social Robotics Dataset and Benchmark. Submission deadline August 20. For more info, contact
@SHamidRezatofig
and Roberto Martin-Martin
Are you a passionate and experienced researcher in robotics with knowledge in computer vision? Do you want to build impactful robotic systems? The Stanford Vision and Learning Lab (SVL) is searching for a Postdoctoral Fellow with your skills.
Our focus on robot learning from a single video example of a task has resulted in a line of work that combines symbolic systems with neural networks.
We have just released our new work on 6D pose estimation from RGB-D data -- real-time inference with end-to-end deep models for real-world robot grasping and manipulation! Paper: Code: w/
@danfei_xu
@drfeifei
@silviocinguetta
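For readers new to the output format: a 6D pose is a rotation plus a translation. Below is a minimal sketch (plain NumPy, not the released code) of turning a predicted quaternion and translation into the homogeneous transform a grasping pipeline would consume; the example pose values are made up.

```python
import numpy as np

def pose_to_matrix(quat_xyzw, translation):
    """Convert a predicted 6D pose (unit quaternion + translation, meters)
    into a 4x4 homogeneous transform from object frame to camera frame."""
    x, y, z, w = quat_xyzw
    R = np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - z*w),     2*(x*z + y*w)],
        [2*(x*y + z*w),     1 - 2*(x*x + z*z), 2*(y*z - x*w)],
        [2*(x*z - y*w),     2*(y*z + x*w),     1 - 2*(x*x + y*y)],
    ])
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = translation
    return T

# Hypothetical usage: express a grasp point defined in the object frame
# in the camera frame, where the robot plans its motion.
T_cam_obj = pose_to_matrix([0.0, 0.0, 0.0, 1.0], [0.1, -0.05, 0.6])
grasp_in_obj = np.array([0.0, 0.0, 0.02, 1.0])   # homogeneous point on the object
grasp_in_cam = T_cam_obj @ grasp_in_obj
```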
New work on structuring diverse semantics in 3D space yielded the 3D Scene Graph! It's showcased on the Gibson database by annotating the models with diverse semantics using a semi-automated method.
In what space should diverse semantics be grounded, and what structure should they take? The 3D Scene Graph is a 4-layer structure unifying semantics, 3D space & camera. We demonstrate it on Gibson models with an automated labeling method. Data available to download!
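To make the 4-layer structure concrete, here is a minimal sketch of the hierarchy described above (building, room, object, camera); the field names are illustrative assumptions, not the released data format.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class CameraNode:            # camera layer: viewpoints that observed the scene
    camera_id: int
    pose: List[float]        # e.g. flattened 4x4 world-from-camera transform

@dataclass
class ObjectNode:            # object layer: instances with semantic labels
    object_id: int
    category: str            # e.g. "chair"
    bbox: List[float]        # 3D bounding box in world coordinates
    visible_from: List[int] = field(default_factory=list)  # camera_ids

@dataclass
class RoomNode:              # room layer: rooms grouping objects
    room_id: int
    room_type: str           # e.g. "kitchen"
    objects: List[ObjectNode] = field(default_factory=list)

@dataclass
class BuildingNode:          # building layer: one whole Gibson model
    name: str
    rooms: List[RoomNode] = field(default_factory=list)
    cameras: List[CameraNode] = field(default_factory=list)
```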
This continues an important line of work in policy learning with large datasets.
More importantly, this is the question to answer if we want to create an analog of "ImageNet" for robotics.
We need to both collect large datasets and have algorithms to leverage this data!
New work exploring whether a policy can be learned purely from an offline, off-policy dataset.
IRIS: Implicit Reinforcement without Interaction at Scale
Video:
Seattle Robotics
@NvidiaAI
@AjayMandlekar
@drfeifei
B. Boots, F. Ramos, D. Fox
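IRIS targets the setting where the policy must be learned from a fixed dataset with no further environment interaction. Below is a minimal sketch of that setting only (plain behavior cloning on logged state-action pairs, not the IRIS algorithm itself); the tensor shapes are placeholders.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Offline dataset: logged (state, action) pairs, e.g. from crowdsourced demonstrations.
states = torch.randn(10_000, 32)     # placeholder tensors standing in for real logs
actions = torch.randn(10_000, 7)
loader = DataLoader(TensorDataset(states, actions), batch_size=256, shuffle=True)

policy = nn.Sequential(nn.Linear(32, 128), nn.ReLU(), nn.Linear(128, 7))
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

# Note: no environment steps anywhere in this loop; learning is purely from the batch.
for epoch in range(10):
    for s, a in loader:
        loss = nn.functional.mse_loss(policy(s), a)
        opt.zero_grad()
        loss.backward()
        opt.step()
```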
Continued efforts in larger-scale crowdsourcing for robotics dataset creation, in setups where engineered solutions are hard, simulation is tricky, and pure compute has low success. The diversity of human cognitive reasoning and dexterity provides so many ways to do the same task!
A thorough evaluation of possible action spaces and their efficiency with RL for manipulation tasks. The answer is the best of both worlds: operational space control in end-effector space with learnable gains fares better than end-to-end image-to-torque.
We created 4D ConvNets for 3D video perception :D
Please check out our paper: 4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks,
@cvpr2019
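A dense convolution over x-y-z-t is intractable for point-cloud video, which is why the paper operates on sparse tensors. A minimal sketch, assuming the MinkowskiEngine library's SparseTensor / MinkowskiConvolution interface (exact constructor signatures differ across versions), with random data standing in for a real point-cloud sequence:

```python
import torch
import MinkowskiEngine as ME

# Occupied points of a 3D video: each row is (batch_index, x, y, z, t), integer coords.
coords = torch.randint(0, 64, (1000, 5), dtype=torch.int32)
coords[:, 0] = 0                                  # single example in the batch
coords = torch.unique(coords, dim=0)              # sparse sites should be unique
feats = torch.randn(coords.shape[0], 3)           # e.g. RGB features per occupied site

x = ME.SparseTensor(features=feats, coordinates=coords)

# A 4D (spatio-temporal) sparse convolution: dimension=4 covers x, y, z and time.
conv = ME.MinkowskiConvolution(in_channels=3, out_channels=32,
                               kernel_size=3, stride=1, dimension=4)
y = conv(x)                                       # output lives only on occupied sites
```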
Be sure to check out
#TWiMLTalk
#123
! Joined by
@lynetcha
, we discuss her work on SEGCloud, an end-to-end framework that performs 3D point-level segmentation. Head over to to listen!
Very excited to share our new paper on Making Sense of Vision and Touch: Self-Supervised Learning of Multimodal Representations. Featuring RL on real robots from scratch in a matter of hours, without any simulation!
Video:
arXiv:
Happy day for students, and proud day for advisor (i.e. me) and families on
@Stanford
Computer Science Dept Graduation Ceremony day :) Last picture is for CS231n instructors/TAs.
@jcjohnss
@syeung10
@cs231n
Humans have the remarkable ability to make and use tools to help them solve tasks. We introduce a framework for robots to likewise jointly learn to design and use tools via reinforcement learning.
Poster: Thu 9th 2:45-3:30 pm
Website:
A robot may be unable to complete a task when limited by its morphology. Remarkably, people and some animals can get around this by not only using but also *designing* tools. We explore whether robots can also do this in our latest work!
🌐
🧵👇
Ranjay Krishna and Apoorva Doradula use conversations as a strategy for training AI systems. They call it engagement learning – an AI “learns what kinds of concepts people like to discuss and how to ask questions to get an informative response.”
NOIR is a brain-robot interface that enables humans to use their brain signals to command robots to perform 20 challenging everyday activities, such as cooking, cleaning, & playing games.
Poster: Wed 8th 5:15pm
Website:
Excited about the lineup of speakers to explore causality in robotics.
RSS Workshop on Causal-Imitation. CfP is out and deadline for posters is Jun 3. Consider submitting!
@yukez
@ermonste
@jiajunwu_cs
and Michael Laskey
VoxPoser uses an LLM+VLM to create 3D value maps in the robot's workspace, which guide a motion planner to synthesize behaviors for everyday manipulation tasks without requiring robot data.
Oral: Wed 8th 11:50 am
Poster: Wed 8th 5:15-6:00 pm
How to harness foundation models for *generalization in the wild* in robot manipulation?
Introducing VoxPoser: use LLM+VLM to label affordances and constraints directly in 3D perceptual space for zero-shot robot manipulation in the real world!
🌐
🧵👇
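The core idea is that language-grounded maps over the workspace can be composed into a single value map that a planner then follows. A toy NumPy illustration of that composition (the grid size, the labeled regions, and the greedy step are all made-up stand-ins, not the VoxPoser implementation):

```python
import numpy as np

GRID = (50, 50, 50)                      # illustrative workspace voxelization

# Regions an LLM+VLM pipeline might label: a target voxel to reach
# (e.g. "the drawer handle") and a region to avoid (e.g. "the vase").
target = np.array([40, 25, 10])
constraint = np.zeros(GRID)
constraint[20:30, 20:30, :15] = -1.0

# Value map: smooth attraction toward the target plus a hard avoidance penalty.
xx, yy, zz = np.meshgrid(*[np.arange(s) for s in GRID], indexing="ij")
dist = np.sqrt((xx - target[0])**2 + (yy - target[1])**2 + (zz - target[2])**2)
value_map = -dist / dist.max() + 10.0 * constraint

def greedy_next_voxel(value_map, current):
    """Toy 'planner': step to the neighboring voxel with the highest value."""
    best, best_val = tuple(current), -np.inf
    for d in np.ndindex(3, 3, 3):
        nxt = tuple(np.clip(np.array(current) + np.array(d) - 1, 0,
                            np.array(value_map.shape) - 1))
        if value_map[nxt] > best_val:
            best, best_val = nxt, value_map[nxt]
    return best

waypoint = greedy_next_voxel(value_map, (25, 25, 25))   # first step toward the target
```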
MimicPlay is an imitation learning algorithm that uses cheap human play data to unlock real-time planning for long-horizon manipulation.
Oral: Thu 9th 8:30 am
Poster: Thu 9th 2:45-3:30 pm
Best paper/Best student paper finalist
Best system paper finalist
How to teach robots to perform long-horizon tasks efficiently and robustly🦾?
Introducing MimicPlay - an imitation learning algorithm that uses "cheap human play data". Our approach unlocks both real-time planning through raw perception and strong robustness to disturbances!🧵👇
Check out our first blog post about SAIL research, on how computer vision can be used to enable smart hospitals.
By Albert Haque & Michelle Guo (
@mshlguo
), led by professors Terry Platchek (
@TerryPlatchek
), Arnold Milstein, & Fei-Fei Li (
@drfeifei
)
Sequential Dexterity is a system that learns to chain multiple dexterous manipulation policies to tackle long-horizon manipulation tasks in both simulation and the real world.
Poster: Tue 7th 2:45-3:30 pm
How to chain multiple dexterous skills to tackle complex long-horizon manipulation tasks?
Imagine retrieving a LEGO block from a pile, rotating it in-hand, and inserting it at the desired location to build a structure.
Introducing our new work - Sequential Dexterity 🧵👇
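To see what "chaining" means operationally, here is a toy sketch of the execution loop; the env/skill interface, success checks, and timeout are illustrative assumptions rather than the paper's system.

```python
def run_skill_chain(env, skills, max_steps_per_skill=200):
    """Execute dexterous sub-policies in sequence, e.g. grasp -> reorient -> insert.

    Each skill exposes act(obs) and succeeded(obs); a skill hands off to the next
    one only once its own success condition holds.
    """
    obs = env.reset()
    for skill in skills:                       # e.g. [grasp, reorient, insert]
        for _ in range(max_steps_per_skill):
            obs, _, done, _ = env.step(skill.act(obs))
            if skill.succeeded(obs):
                break                          # hand off to the next skill
            if done:
                return False                   # episode ended before the chain finished
        else:
            return False                       # skill timed out
    return True                                # whole long-horizon task completed
```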
It was an extraordinary visit to Stanford University. Had a wonderful exchange with its brilliant faculty on the use of technology for human development, the application of AI, and the challenges it poses. The faculty was deeply impressed with India's story of digital inclusion.
A new effort from SVL and the JackRabbot team! New dataset and benchmark for robot perception in human environments. The winners of the first challenge on pedestrian detection and tracking will be presented at our workshop at
#ICCV19
!
New study that analyzes different action spaces for RL in robot manipulation, in the quest to find the best one. Guess what? The best method is a combination of operational space control (1986) with learned adaptive gain tuning. To appear at
#IROS2019
What action space should we use for contact-rich manipulation? We show that Variable Impedance Control in End-Effector Space outperforms most other choices.
Paper:
w/ R. Martin-Martin,
@michellearning
, R. Gardner,
@silviocinguetta
,
@leto__jean
@StanfordSVL
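For intuition, the winning action space is a task-space impedance law whose stiffness gains come from the learned policy rather than being fixed. A simplified NumPy sketch (no inertia shaping, null-space control, or gravity compensation), with made-up dimensions:

```python
import numpy as np

def impedance_step(x, xd, x_des, kp, J, kd_scale=2.0):
    """One control step of end-effector-space variable impedance control.

    x, xd   : current end-effector pose and velocity, shape (6,)
    x_des   : desired end-effector pose chosen by the policy, shape (6,)
    kp      : stiffness gains chosen by the policy at this step, shape (6,)
    J       : end-effector Jacobian, shape (6, n_joints)
    Returns joint torques, shape (n_joints,).
    """
    kd = kd_scale * np.sqrt(kp)              # one common choice: damping tied to stiffness
    wrench = kp * (x_des - x) - kd * xd      # task-space PD / impedance law
    tau = J.T @ wrench                       # map task-space wrench to joint torques
    return tau

# Hypothetical usage: the RL policy outputs (x_des, kp) at a low rate, and this
# low-level law runs at the robot's control rate in between.
J = np.random.randn(6, 7)
tau = impedance_step(x=np.zeros(6), xd=np.zeros(6),
                     x_des=np.array([0.05, 0, 0, 0, 0, 0]),
                     kp=np.full(6, 150.0), J=J)
```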
RL in multitask domains with shared underlying latent information can be made more effective through learning action space manifolds.
Check out this effort on Learning Latent Action Spaces at
#ICRA2021
When you try to open a new door, do you try to yank it up? Likely not.
Why should your robot continue to do so?
Check out our
#ICRA2021
paper on learning action spaces for efficient contact-rich manipulation.
Paper:
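In this line of work, the RL policy outputs actions in a small learned space and a decoder maps them back to full robot commands. A minimal PyTorch sketch with assumed dimensions (not the paper's architecture):

```python
import torch
import torch.nn as nn

LATENT_DIM, ACTION_DIM = 3, 7            # assumed sizes for illustration

# Decoder trained (e.g. as half of an autoencoder) on actions from related tasks.
decoder = nn.Sequential(nn.Linear(LATENT_DIM, 64), nn.ReLU(),
                        nn.Linear(64, ACTION_DIM), nn.Tanh())

class LatentPolicy(nn.Module):
    """RL policy that acts in the learned low-dimensional action space."""
    def __init__(self, obs_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                 nn.Linear(64, LATENT_DIM))

    def forward(self, obs):
        z = self.net(obs)                # small, task-relevant action manifold
        return decoder(z)                # expanded to the full robot command

policy = LatentPolicy(obs_dim=32)
action = policy(torch.randn(1, 32))      # shape (1, 7): command sent to the robot
```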
Do you know how to make a dumpling🥟? Our robot🤖does! RoboCook is a robot system designed for long-horizon manipulation of elasto-plastic objects with a variety of tools.
Oral: Tue 7th 8:50 am
Poster: Tue 7th 2:45-3:30 pm
Best system paper finalist
Do you know how to make a dumpling🥟? Our robot🤖does!
Introducing RoboCook: Long-Horizon Elasto-Plastic Object Manipulation with Diverse Tools.
Project website:
Here we show how RoboCook makes a dumpling under external human perturbation. Thread🧵👇
A very timely workshop on causality and robotics. Consider participating!
Also see a recent piece on Judea Pearl's perspective on causality models in AI
Excited about the lineup of speakers to explore causality in robotics.
RSS Workshop on Causal-Imitation. CfP is out and deadline for posters is Jun 3. Consider submitting!
@yukez
@ermonste
@jiajunwu_cs
and Michael Laskey
Stanford SVL is a vibrant community of faculty, postdocs, and students. Our alumni have landed in prestigious positions in academia and industry. We strive for an inclusive and diverse group, and we especially encourage applications from traditionally underrepresented groups in AI.