Sherry Yang Profile
Sherry Yang

@mengjiao_yang

3,118
Followers
361
Following
44
Media
160
Statuses

Research Scientist @GoogleDeepMind | PhD Student @UCBerkeley. Previously M.Eng. / B.S. @MIT.

Joined September 2015
Pinned Tweet
@mengjiao_yang
Sherry Yang
11 months
Introducing Universal Simulator (UniSim), an interactive simulator of the real world. Interactive website: Paper:
32
245
1K
@mengjiao_yang
Sherry Yang
2 years
Interested in foundation models + RL? Keep an eye out for the 1st "Foundation Models for Decision Making" workshop at NeurIPS 2022: . Call for submissions will soon follow. w. @du_yilun @jparkerholder @siddkaramcheti @IMordatch @shaneguML @ofirnachum
Tweet media one
5
101
573
@mengjiao_yang
Sherry Yang
1 year
Review paper on Foundation Models for Decision Making: Foundation models can characterize various components of decision making, such as states (S), behaviors (A), dynamics (T), task specifiers (R), through generative modeling or representation learning.
Tweet media one
4
110
442
@mengjiao_yang
Sherry Yang
9 months
Check out UniMat -- a unified representation of materials that enables scaling of diffusion models to millions of stable crystal structures. Website: Paper:
6
49
232
@mengjiao_yang
Sherry Yang
2 months
Consider joining our team at Google DeepMind to work on foundation models for decision making, e.g., foundation model alignment, reasoning, planning, simulation, and optimization with foundation models.
@hanjundai
Hanjun Dai
2 months
Our team (w/Dale, @daibond_alpha , @mengjiao_yang + others) at Google DeepMind is looking to hire. If you are interested in foundation models+decision making, and making real-world impact through Gemini and cloud solutions, please consider applying through
1
27
134
0
18
235
@mengjiao_yang
Sherry Yang
6 months
Video generation will revolutionize decision making in the physical world like how language models have changed the digital world. Interested in the implications of video generation models like UniSim and Sora? Check out our position paper:
3
48
207
@mengjiao_yang
Sherry Yang
4 months
Happy to share that UniSim was selected for an Outstanding Paper Award at #ICLR2024 . Check out the oral presentation today at 10:30am Oral 1B and poster on Wed at 4:30-6:30pm #87 . Thanks to the award committee @eunsolc , @katjahofmann , @liu_mingyu ,
@iclr_conf
ICLR 2025
4 months
Announcing the #ICLR2024 Outstanding Paper Awards: Shoutout to the awards committee: @eunsolc , @katjahofmann , @liu_mingyu , @nanjiang_cs , @guennemann , @optiML , @tkipf , @CevherLIONS
3
53
303
19
12
162
@mengjiao_yang
Sherry Yang
2 years
What does “Learn principles, not formulas. Understand, do not memorize” mean for autonomous agents? Chain of Thought Imitation with Procedure Cloning! ArXiv Code Site w/ Dale @pabbeel @ofirnachum
2
31
131
@mengjiao_yang
Sherry Yang
2 years
Text-conditioned video generation can serve as universal policies (UniPi) and learn from sim, real, and web-scale videos. w/ @du_yilun , @hanjundai , @daibond_alpha , @ofirnachum , Josh, Dale, @pabbeel Paper: Web:
Tweet media one
Tweet media two
3
30
126
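The video-as-policy idea above can be sketched in a few lines: a text-conditioned video model imagines future frames, and an inverse dynamics model decodes actions from consecutive frames. Everything below is a toy numpy stand-in (UniPi itself uses a text-conditioned video diffusion model and a learned inverse dynamics network), so the function names and shapes are illustrative assumptions only.

```python
import numpy as np

def generate_video(obs, text_goal, horizon=4):
    """Toy stand-in for a text-conditioned video model: 'imagines' future
    frames by interpolating from the current frame toward a goal frame
    (here just a fixed random frame standing in for the text conditioning)."""
    rng = np.random.default_rng(0)
    goal = rng.standard_normal(obs.shape)
    return np.stack([obs + (goal - obs) * t / horizon for t in range(horizon + 1)])

def inverse_dynamics(frame, next_frame):
    """Toy inverse-dynamics model: recover the 'action' as the frame delta."""
    return next_frame - frame

def unipi_policy(obs, text_goal):
    """Video-as-policy: plan in frame space, then decode actions frame by frame."""
    frames = generate_video(obs, text_goal)
    return [inverse_dynamics(frames[i], frames[i + 1]) for i in range(len(frames) - 1)]

obs = np.zeros(8)
actions = unipi_policy(obs, "pick up the red block")
# Executing the decoded actions step by step reproduces the imagined final frame.
```

Because the plan lives in frame space, environments with different action spaces can share the same video model; only the small inverse-dynamics decoder is environment-specific.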
@mengjiao_yang
Sherry Yang
1 year
As video foundation models reach billions of parameters, how to adapt them to task-specific settings (e.g., animation, robotics) without access to the model weights becomes a pressing issue. We introduce Video Adapter:
2
21
102
@mengjiao_yang
Sherry Yang
8 months
See you all at the 2nd Foundation Models for Decision Making #NeurIPS2023 workshop tomorrow (Friday) at Hall E2 starting 8:15am. Don't miss out on an exciting lineup of speakers! See schedule and zoom link at .
Tweet media one
1
23
99
@mengjiao_yang
Sherry Yang
2 years
See you all at the 1st Foundation Models for Decision Making workshop @NeurIPSConf (Room 391) on Sat, Dec 3 2022. See schedule and zoom link at .
Tweet media one
@mengjiao_yang
Sherry Yang
2 years
Interested in foundation models + RL? Keep an eye out for the 1st "Foundation Models for Decision Making" workshop at NeurIPS 2022: . Call for submissions will soon follow. w. @du_yilun @jparkerholder @siddkaramcheti @IMordatch @shaneguML @ofirnachum
Tweet media one
5
101
573
1
20
89
@mengjiao_yang
Sherry Yang
1 month
Looking forward to presenting the following papers @icmlconf : - Position paper on Video Generation for Decision Making (Tue 1:30 - 3pm #2613 ). - Code as Reward for real-world RL with VLMs (Thur 1:30-3pm #1115 ).
2
11
80
@mengjiao_yang
Sherry Yang
5 months
I talked about our position paper on "Video as the New Language for Real-World Decision Making" on the TWIML podcast @twimlai . Check out the conversation below:
@twimlai
The TWIML AI Podcast
5 months
Today we’re joined by @mengjiao_yang , a senior research scientist at Google DeepMind, to learn why video data offers a better foundation than natural language for teaching AI to understand the world. 🎧 / 🎥To listen to the audio version, visit .
2
10
50
2
13
80
@mengjiao_yang
Sherry Yang
3 years
Paper with @ofirnachum -- Representation Matters: Offline Pretraining for Sequential Decision Making -- is accepted to #ICML2021 ! Contrastive pretraining yields huge gains in low-data imitation learning, offline RL, and online RL.
Tweet media one
1
8
78
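The contrastive pretraining referenced above is typically an InfoNCE-style objective over state embeddings; a minimal numpy sketch (not the paper's exact objective -- the next-state positives and temperature are illustrative assumptions):

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE-style contrastive loss: each anchor embedding should score
    its own positive (e.g., a future state from the same trajectory)
    above every other positive in the batch."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                 # (B, B) cosine similarities
    logits -= logits.max(axis=1, keepdims=True)    # stabilize the softmax
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))            # matched pairs on the diagonal

# Random state embeddings: matched pairs should incur lower loss than mismatched.
rng = np.random.default_rng(0)
states = rng.standard_normal((4, 8))
matched = info_nce_loss(states, states)
```

Minimizing this loss pushes representations of related states together, which is what makes the pretrained encoder useful for downstream imitation and RL.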
@mengjiao_yang
Sherry Yang
10 months
Both RT-Sketch () and RT-trajectory () can be viewed as a form of "chain-of-thought" of agents (). Learning mappings from high-dim images to low-dim controls is difficult, and intermediate info could help.
1
15
57
@mengjiao_yang
Sherry Yang
9 months
Heading to #NeurIPS2023 with an exciting agenda: [1/5] I will showcase UniSim at the Google DeepMind booth (Hall C, 315) on Mon Dec 11 12-2pm CT. Please stop by to interact with the real-world simulator.
@mengjiao_yang
Sherry Yang
11 months
Introducing Universal Simulator (UniSim), an interactive simulator of the real world. Interactive website: Paper:
32
245
1K
3
5
54
@mengjiao_yang
Sherry Yang
3 years
TRAIL's latent action pretraining provably accelerates downstream imitation learning even when the offline dataset is highly suboptimal (e.g., collected from a random policy). Paper: Code: Website:
@ofirnachum
Ofir Nachum
3 years
How can we leverage existing behavior datasets to learn an "easier" action space for control? Many have thought about this question (eg see my OPAL paper w/ A. Ajay), but existing work relies on dataset to already contain good and temporally-extended behavior. And so... TRAIL!
Tweet media one
Tweet media two
1
2
27
1
8
51
@mengjiao_yang
Sherry Yang
4 months
Check out our ICML paper -- Code as Reward. VLMs can generate code to compute reward from images, enabling RL agents to learn in the real world without hand-designed simulators. This complements generative world models like UniSim.
@DavidAVenuto
David Venuto
4 months
We are excited to announce that our work "Code as Reward: Empowering Reinforcement Learning with VLMs" was accepted to ICML 2024. This work was done with Sami Nur Islam @MartinKlissarov Doina Precup @mengjiao_yang @ankit_s_anand .
Tweet media one
2
5
32
0
10
49
@mengjiao_yang
Sherry Yang
2 years
Dichotomy of Control accepted as "notable-top-5%" at #iclr2023 ! Only controlling what one can control seems to also be a useful research philosophy.
@ofirnachum
Ofir Nachum
2 years
Despite all the buzz about decision transformers, it's well-known they can be unboundedly suboptimal in stochastic envs. The issue is due to conditioning the policy on return, a highly stochastic quantity. We introduce "dichotomy of control" to solve this:
Tweet media one
5
38
200
0
4
47
@mengjiao_yang
Sherry Yang
2 years
CALM () and CHAI () are our first attempt at combining offline RL with language models to solve task-oriented dialogue. While dialogue is hard, our initial set of results is highly encouraging:
@svlevine
Sergey Levine
2 years
Offline RL is a natural fit for dialogue: RL with humans is hard, but data of humans talking to humans is plentiful. In two new papers, we explore offline RL and for end-to-end dialogue systems with Transformers! CALM: CHAI: 🧵->
1
42
244
2
7
46
@mengjiao_yang
Sherry Yang
3 months
I'll present Unified Materials (UniMat) today 10:45am - 12:45pm at poster #170 #ICLR2024 . Come and learn about the initial effort from @GoogleDeepMind on developing generative models for materials. Joint work with KwangHwan Cho, @amilmerchant , @pabbeel , Dale Schuurmans,
@mengjiao_yang
Sherry Yang
9 months
Check out UniMat -- a unified representation of materials that enables scaling of diffusion models to millions of stable crystal structures. Website: Paper:
6
49
232
1
6
44
@mengjiao_yang
Sherry Yang
10 months
Check out Video Language Planning! VLP uses vision-language models as policy and reward, and UniSim as dynamics. Foundation agent models can now conduct search and planning in foundation world models.
@du_yilun
Yilun Du
10 months
Introducing Video Language Planning! By planning across the space of generated videos/language, we can synthesize long-horizon video plans and solve much longer horizon tasks than existing baseline (such as RT-2 and PALM-E). (1/5)
7
48
290
0
13
44
@mengjiao_yang
Sherry Yang
3 months
I'll present Video Adapter today (Friday) 10:45am - 12:45pm at poster #200 #ICLR2024 . Come and learn about how we can adapt large video foundation models like UniSim to domain-specific settings (e.g., animation, robotics) without requiring access to the foundation model
@mengjiao_yang
Sherry Yang
1 year
As video foundation models reach billions of parameters, how to adapt them to task-specific settings (e.g., animation, robotics) without access to the model weights becomes a pressing issue. We introduce Video Adapter:
2
21
102
1
4
42
@mengjiao_yang
Sherry Yang
4 years
Off-policy evaluation is not the full story -- we introduce Bayesian decision making to select policies based on any criteria: Offline Policy Selection under Uncertainty: code: with @daibond_alpha , @ofirnachum , @georgejtucker , Dale
Tweet media one
1
6
38
@mengjiao_yang
Sherry Yang
4 years
New paper: Energy-Based Processes for Exchangeable Data. We combine EBMs with stochastic processes to model set structures with arbitrary cardinality. Paper: Code: Joint work with @daibond_alpha , @hanjundai , and Dale Schuurmans.
Tweet media one
0
9
32
@mengjiao_yang
Sherry Yang
11 months
UniSim is trained on broad data rich in different axes (internet text-image, human activity videos, simulated executions, real robot videos, panorama scans, etc).
Tweet media one
2
4
31
@mengjiao_yang
Sherry Yang
2 years
In our recent work ALPT, I was surprised that an inverse dynamics model pretrained on Freeway (an Atari game with only up-and-down movement) dramatically improves performance on Breakout (which only moves left and right). One step closer towards cross-domain generalization:
@ofirnachum
Ofir Nachum
2 years
I am amazed at the generalization performance of inverse dynamics models for ALPT, even winning out in situations where source/target tasks have completely disjoint action spaces! Awesome work with @DavidAVenuto @mengjiao_yang @pabbeel @IMordatch , Precup.
Tweet media one
0
1
14
1
2
29
@mengjiao_yang
Sherry Yang
11 months
UniSim supports training RL agents purely in simulation that transfer to real robots in zero-shot.
1
2
27
@mengjiao_yang
Sherry Yang
8 months
FMDM workshop happening now in Hall E2 with @percyliang giving the first talk and a packed audience.
Tweet media one
@mengjiao_yang
Sherry Yang
8 months
See you all at the 2nd Foundation Models for Decision Making #NeurIPS2023 workshop tomorrow (Friday) at Hall E2 starting 8:15am. Don't miss out on an exciting lineup of speakers! See schedule and zoom link at .
Tweet media one
1
23
99
1
1
25
@mengjiao_yang
Sherry Yang
3 months
I will dive into more detail on how we have been leveraging UniSim to solve embodied decision making tasks at the Generative AI for Decision Making workshop today 4pm-4:30pm (Lehar 3).
@rl_agent
Lisa Lee
3 months
Generative AI for Decision Making workshop at #ICLR2024 is taking place on Sat, May 11 @ 8:30 - 17:00 (Lehar 3). We have an exciting lineup of Invited Speakers including: Katja Hofmann @katjahofmann @MSFTResearch Igor Mordatch @Imordatch @GoogleDeepMind Yuandong Tian @tydsh
1
5
39
0
1
25
@mengjiao_yang
Sherry Yang
2 years
Now with the ICLR deadline behind us, please consider submitting to the FMDM workshop @NeurIPSConf by Oct 3! We have an exciting lineup of speakers: Leslie Kaelbling, @DorsaSadigh , Dale Schuurmans, @pathak2206 , @machelreid , @Thom_Wolf , and @DrJimFan . There will be paper & travel awards.
@mengjiao_yang
Sherry Yang
2 years
Interested in foundation models + RL? Keep an eye out for the 1st "Foundation Models for Decision Making" workshop at NeurIPS 2022: . Call for submissions will soon follow. w. @du_yilun @jparkerholder @siddkaramcheti @IMordatch @shaneguML @ofirnachum
Tweet media one
5
101
573
0
4
24
@mengjiao_yang
Sherry Yang
11 months
@du_yilun @coolboi95 @JonathanTompson @pabbeel Learning realistic world models is becoming a reality @ylecun
1
0
23
@mengjiao_yang
Sherry Yang
2 years
ChatGPT as a research advisor:
Tweet media one
2
2
24
@mengjiao_yang
Sherry Yang
11 months
UniSim supports both high-level language actions and low-level motor controls through conditional video generation.
Tweet media one
1
1
21
@mengjiao_yang
Sherry Yang
11 months
UniSim supports simulation of diverse manipulation actions such as various actions in cooking a meal.
2
2
19
@mengjiao_yang
Sherry Yang
11 months
UniSim supports diverse environment transitions such as different objects being uncovered.
1
2
18
@mengjiao_yang
Sherry Yang
11 months
UniSim also supports training embodied VLM planners purely in simulation and transfer to real robots in zero-shot.
1
1
16
@mengjiao_yang
Sherry Yang
2 years
We will be at J #107 on Thu 11am #NeurIPS2022 to chat about Multi-Game Decision Transformers!
Tweet media one
@IMordatch
Igor Mordatch
2 years
How can we effectively train generalist multi-environment agents? We trained a single Decision Transformer model to play many Atari games simultaneously and compared it to alternative approaches:
4
60
315
0
4
16
@mengjiao_yang
Sherry Yang
2 years
Curious about how to make OPE *practical* for selecting policies in practice? Check out our #AISTATS2022 poster on Wed 3/30 at 8:30am PT (session 5). Offline Policy Selection under Uncertainty w/ @daibond_alpha , @ofirnachum , @georgejtucker , Dale
Tweet media one
1
5
16
@mengjiao_yang
Sherry Yang
11 months
A reminder to submit your work to the Foundation Models for Decision Making NeurIPS workshop. The deadline is Oct 1 AoE.
1
3
16
@mengjiao_yang
Sherry Yang
11 months
UniSim also supports diverse navigation actions.
1
1
15
@mengjiao_yang
Sherry Yang
1 year
Come and talk to us about ALPT (action limited pretraining) this Thursday at 1:30pm (Hall 1 #104 )!
Tweet media one
@mengjiao_yang
Sherry Yang
2 years
In our recent work ALPT, I was surprised that an inverse dynamics model pretrained on Freeway (an Atari game with only up-and-down movement) dramatically improves performance on Breakout (which only moves left and right). One step closer towards cross-domain generalization:
1
2
29
0
3
15
@mengjiao_yang
Sherry Yang
11 months
Work done with @du_yilun , @coolboi95 , @JonathanTompson , Dale Schuurmans, @pabbeel .
1
1
12
@mengjiao_yang
Sherry Yang
1 year
I'll present Dichotomy of Control today at 11am in the auditorium (Oral 3 Track 1) and poster session 11:30am - 1:30pm in MH1-2-3-4, #119 . Hope to see you there. #ICLR23 Recording:
@ofirnachum
Ofir Nachum
2 years
Despite all the buzz about decision transformers, it's well-known they can be unboundedly suboptimal in stochastic envs. The issue is due to conditioning the policy on return, a highly stochastic quantity. We introduce "dichotomy of control" to solve this:
Tweet media one
5
38
200
0
1
12
@mengjiao_yang
Sherry Yang
11 months
UniSim also supports long-horizon interactions without compromising simulation quality.
1
0
10
@mengjiao_yang
Sherry Yang
2 years
Foundation Models for Decision Making (FMDM) workshop @NeurIPSConf is open for submissions! Remember to submit your work (4-9 pages) before Sept 22 at .
@ofirnachum
Ofir Nachum
2 years
We are open for submissions! I know there are lots of people working on large models, pretraining, cross-domain/agent generalization for RL. Please submit your papers to the 1st FMDM workshop at NeurIPS 2022!
Tweet media one
1
20
126
0
2
12
@mengjiao_yang
Sherry Yang
1 year
For instance, skill discovery and Decision Transformer are examples of generative models of behavior, whereas model-based offline RL and Trajectory Transformer are examples of generative models of the world dynamics.
1
0
10
@mengjiao_yang
Sherry Yang
2 years
Come and learn about TRAIL for action representation learning at #ICLR2022 Thur 4/28 10:30 am PT: (with @svlevine and @ofirnachum ).
@mengjiao_yang
Sherry Yang
3 years
TRAIL's latent action pretraining provably accelerates downstream imitation learning even when the offline dataset is highly suboptimal (e.g., collected from a random policy). Paper: Code: Website:
1
8
51
0
3
10
@mengjiao_yang
Sherry Yang
1 year
Large language models can serve as agents or environments, supporting interactions with humans, tools, and the real world, enabling new learning environments such as the internet and human knowledge.
1
2
9
@mengjiao_yang
Sherry Yang
1 year
Work with @ofirnachum , @du_yilun , @_jasonwei , @pabbeel , Dale Schuurmans. Please feel free to drop us a note if your work is relevant and should be included in this review.
0
0
8
@mengjiao_yang
Sherry Yang
6 months
Aside from directly solving tasks, video generation is also a realistic simulator for complex games (), which can be combined with model-based planning, or be used to create new games ().
@_rockt
Tim Rocktäschel
6 months
I am really excited to reveal what @GoogleDeepMind 's Open Endedness Team has been up to 🚀. We introduce Genie 🧞, a foundation world model trained exclusively from Internet videos that can generate an endless variety of action-controllable 2D worlds given image prompts.
145
572
3K
1
2
9
@mengjiao_yang
Sherry Yang
1 year
Large pretrained vision / language foundation models can characterize various perceptual components of decision making agents such as image observations, language actions, and language goals (e.g., SayCan, PALM-E) through plug-and-play.
1
0
8
@mengjiao_yang
Sherry Yang
3 years
Super exciting work that connects contrastive pretraining to learning representations of transitions/rewards in a dynamical system, which provably improves downstream behavioral cloning tasks!
@ofirnachum
Ofir Nachum
3 years
For those who want a distraction from their submissions... checkout our new paper! w @mengjiao_yang about *provable* bounds of representation learning for downstream imitation (ie behavioral cloning, ie max-likelihood). Lots results & insights in this paper I'm excited about...1/
Tweet media one
1
7
80
0
0
8
@mengjiao_yang
Sherry Yang
2 years
UniPi came about as a part of our broader effort on *Foundation Models for Decision Making*, where vision-language architectures and pretrained models are applied to decision making. Check out other exciting work in this area:
@mengjiao_yang
Sherry Yang
2 years
Interested in foundation models + RL? Keep an eye out for the 1st "Foundation Models for Decision Making" workshop at NeurIPS 2022: . Call for submissions will soon follow. w. @du_yilun @jparkerholder @siddkaramcheti @IMordatch @shaneguML @ofirnachum
Tweet media one
5
101
573
1
3
8
@mengjiao_yang
Sherry Yang
4 years
Our unified DICE framework reveals that, unlike Q-values, the dual estimates of policy value in terms of state-action distributions offer greater flexibility in regularization while being unbiased, and are robust to scaling and shifting of MDP rewards.
@ofirnachum
Ofir Nachum
4 years
Policy evaluation via duality/Lagrangian methods presents a lot of choices (how to setup the LPs, regularize them, etc). In we examine how these choices affect accuracy of final eval. Lots of insights in this paper, many of which I didn't expect....
Tweet media one
2
11
70
0
0
8
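The dual (state-action distribution) view mentioned above can be written down concisely. In standard DICE notation (a sketch of the common identity, not necessarily this paper's exact estimator), the policy value is re-expressed through a distribution-correction ratio over the offline data distribution $d^{\mathcal{D}}$:

```latex
\rho(\pi)
 = \mathbb{E}_{(s,a)\sim d^{\pi}}\!\big[r(s,a)\big]
 = \mathbb{E}_{(s,a)\sim d^{\mathcal{D}}}\!\big[\zeta(s,a)\, r(s,a)\big],
\qquad
\zeta(s,a) := \frac{d^{\pi}(s,a)}{d^{\mathcal{D}}(s,a)}.
```

Because the estimate is linear in $r$ and $\mathbb{E}_{d^{\mathcal{D}}}[\zeta] = 1$, replacing $r$ with $a\,r + b$ simply maps the estimate to $a\,\rho(\pi) + b$, which is the robustness to reward scaling and shifting mentioned in the tweet.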
@mengjiao_yang
Sherry Yang
2 years
Evaluations on simulated robotic navigation, manipulation, and game environments show that procedure cloning exhibits significant generalization to unseen environment configurations such as maze layouts, positions of objects, transition stochasticity, and game difficulties.
Tweet media one
Tweet media two
Tweet media three
2
0
7
@mengjiao_yang
Sherry Yang
1 year
Video Adapter works by composing scores of a pretrained text-to-video model with scores of a domain-specific small model (with 1% parameters) during sampling time, achieving high-quality yet flexible video synthesis without requiring gradient updates on the pretrained model.
Tweet media one
1
1
7
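The score composition described above can be sketched as a weighted sum of the two models' denoising scores at each sampling step; the weighting scheme and the toy "models" below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def adapted_score(score_pretrained, score_small, x, t, weight=0.5):
    """Compose denoising scores at sampling time: the large pretrained model
    supplies a broad video prior, the small domain model steers generation,
    and the pretrained weights are never updated."""
    return (1 - weight) * score_pretrained(x, t) + weight * score_small(x, t)

# Toy stand-ins: each "model" scores toward its own mean frame.
pre_mean, dom_mean = np.zeros(4), np.ones(4)
score_pre = lambda x, t: pre_mean - x
score_dom = lambda x, t: dom_mean - x

x = np.full(4, 0.5)
s = adapted_score(score_pre, score_dom, x, t=0, weight=0.5)
```

Since only scores are combined at sampling time, adaptation needs no gradient access to the pretrained model, which is the point of the tweet above.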
@mengjiao_yang
Sherry Yang
9 months
One key to the success of foundation models is a unified representation. Language uses tokens. Vision uses pixels. What about materials? We develop a unified way to represent materials by storing the locations of atoms at their respective entries in the periodic table.
Tweet media one
1
0
6
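The representation above can be sketched as a fixed-size array indexed by atomic number; the per-element atom cap and null-location convention below are illustrative assumptions (the paper's exact tensor layout may differ).

```python
import numpy as np

PERIODIC_TABLE_SIZE = 118
MAX_ATOMS_PER_ELEMENT = 4         # assumed cap for this toy sketch
NULL_LOCATION = np.full(3, -1.0)  # placeholder entry for absent atoms

def encode_material(atoms):
    """Unified materials representation (sketch): a fixed-size array indexed
    by atomic number, holding fractional coordinates of each atom; entries
    for elements not present stay at a special null location."""
    rep = np.tile(NULL_LOCATION, (PERIODIC_TABLE_SIZE, MAX_ATOMS_PER_ELEMENT, 1))
    counts = {}
    for atomic_number, xyz in atoms:
        i = counts.get(atomic_number, 0)
        rep[atomic_number - 1, i] = xyz
        counts[atomic_number] = i + 1
    return rep

# NaCl-like toy crystal: one Na (Z=11) and one Cl (Z=17) atom.
nacl = encode_material([(11, np.zeros(3)), (17, np.full(3, 0.5))])
```

Every material, whatever its composition, maps to the same fixed shape, which is what lets a single diffusion model train across millions of structures.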
@mengjiao_yang
Sherry Yang
2 years
Lastly, we note the connection between "chain of thought imitation" and "chain of thought prompting" by @_jasonwei et al. --- decomposing multi-step problems into intermediate steps and learning the intermediate steps using a sequence model is applicable to a variety of problems.
0
0
7
@mengjiao_yang
Sherry Yang
4 months
@du_yilun Poster session happening in 30 min at #87 with a DIY laptop stand!
Tweet media one
0
1
7
@mengjiao_yang
Sherry Yang
6 months
Visual and algorithmic reasoning can also be cast as next-frame / video generation tasks.
Tweet media one
Tweet media two
1
0
5
@mengjiao_yang
Sherry Yang
6 months
@billpeeb Would be cool to see reference to our "video generation as real-world simulator" work: . Ofc Sora is taking things to a whole new level.
@mengjiao_yang
Sherry Yang
11 months
Introducing Universal Simulator (UniSim), an interactive simulator of the real world. Interactive website: Paper:
32
245
1K
0
0
6
@mengjiao_yang
Sherry Yang
9 months
With this unified representation, we train a diffusion model to generate materials by moving atoms from random locations back to their original locations. Atoms that do not exist will be moved to a special null location. This allows generation of crystals with arbitrary elements.
Tweet media one
1
0
6
@mengjiao_yang
Sherry Yang
2 years
I found UniPi’s *video-as-policy* and *text-as-task* abstractions to be empowering. These abstractions unify environments with different state and action spaces and enable policy learning across broad datasets.
1
2
6
@mengjiao_yang
Sherry Yang
10 months
@xiao_ted An interesting next step would be to elicit this useful intermediate information in the agent automatically, so that humans don't have to draw goals / motion plans during inference, similar to how LLMs can automatically output intermediate reasoning steps.
1
1
4
@mengjiao_yang
Sherry Yang
6 months
For instance, classical computer vision tasks can be cast as a next-frame generation task ().
@YutongBAI1002
Yutong Bai
9 months
How far can we go with vision alone? Excited to reveal our Large Vision Model! Trained with 420B tokens, effective scalability, and enabling new avenues in vision tasks! (1/N) Kudos to @younggeng @Karttikeya_m @_amirbar , @YuilleAlan Trevor Darrell @JitendraMalikCV Alyosha Efros!
18
160
1K
1
0
5
@mengjiao_yang
Sherry Yang
6 months
Similar to text, video is a unified interface that can absorb internet knowledge and represent diverse tasks. This allows us to pour internet data into a single model that can solve many tasks through conditional generation.
Tweet media one
Tweet media two
1
0
6
@mengjiao_yang
Sherry Yang
10 months
@danijarh Thanks Danijar! Depending on the number of denoising steps, generating 16 frames could take anywhere between 1-20 seconds. We found for RL / planning on Language Table, having very few denoising steps (e.g., 8) is sufficient, which takes roughly a second.
0
0
5
@mengjiao_yang
Sherry Yang
6 months
A generative video simulator is also useful for optimizing control inputs in science and engineering domains where abundant video data can be collected but the underlying physical dynamics are hard to express explicitly (e.g., cloud movement, interaction with soft objects).
0
0
4
@mengjiao_yang
Sherry Yang
2 years
We propose procedure cloning (an alternative to BC), which applies supervised sequence prediction to imitate the series of expert computations. Procedure cloning learns not only what to do (i.e., the output action), but how and why to do it (i.e., the procedure).
1
0
5
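A minimal sketch of the idea, using BFS as the expert's procedure (the toy graph, trace format, and backtracking are illustrative assumptions; procedure cloning itself trains a sequence model on such traces rather than on the final action alone):

```python
from collections import deque

def bfs_with_trace(start, goal, neighbors):
    """Expert search procedure: BFS that records the sequence of node
    expansions (the 'chain of thought') leading to the first action."""
    trace, parent = [], {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        trace.append(node)           # intermediate computation, not just output
        if node == goal:
            # Backtrack to recover the first move out of the start state.
            while parent[node] != start and parent[node] is not None:
                node = parent[node]
            return trace, node       # (procedure, final action)
        for nxt in neighbors(node):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return trace, None

# Toy chain graph 0-1-2-3; the expert's first action from 0 toward goal 3 is 1.
neighbors = lambda n: [m for m in (n - 1, n + 1) if 0 <= m <= 3]
trace, action = bfs_with_trace(0, 3, neighbors)
# Procedure cloning trains a sequence model on trace + [action];
# plain behavioral cloning would train only on the final action.
```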
@mengjiao_yang
Sherry Yang
1 year
Glad to see our initial work, Attentive Contrastive Learning (BERT for RL) in 2021 , has been joined by much subsequent work applying BERT-style autoencoding objectives to sequential decision making (e.g., MaskDP @fangchenliu_ and MTM @philippswu ).
@ofirnachum
Ofir Nachum
1 year
@aravindr93 Nice work. MTM sounds conceptually similar to what we called ACL in our paper a few years ago:
0
0
4
0
1
5
@mengjiao_yang
Sherry Yang
2 years
@HappyyPablo @du_yilun @jparkerholder @siddkaramcheti @IMordatch @shaneguML @ofirnachum Yes, the talks will be pre-recorded. The workshop will be in person but will also be live streamed.
0
0
5
@mengjiao_yang
Sherry Yang
9 months
We scale UniMat to train on >2M low-energy materials, and show that conditional generation (conditioned on atom type) can generalize to generating more difficult structures, outperforming random structure search (the current leading method) in discovering new stable materials.
Tweet media one
1
0
4
@mengjiao_yang
Sherry Yang
6 months
@_tim_brooks Would be cool to see reference to our "video generation as real-world simulator" work: . Ofc Sora is taking things to a whole new level.
@mengjiao_yang
Sherry Yang
11 months
Introducing Universal Simulator (UniSim), an interactive simulator of the real world. Interactive website: Paper:
32
245
1K
1
0
5
@mengjiao_yang
Sherry Yang
3 years
Before developing better dialogue agents, we need reliable and scalable evaluation of dialogue systems. ENIGMA evaluates dialogue agents by estimating how a human user would rate an agent via off-policy evaluation with DICE (distribution correction estimation).
@tourzhao
Tuo Zhao
3 years
An ideal environment for evaluating dialog agents, i.e., the Turing test, needs to involve human interaction, which is not affordable for large-scale experiments. Our EMNLP 2021 paper proposes a new framework – ENIGMA for automating the Turing test. (1/4)
2
7
14
0
0
5
@mengjiao_yang
Sherry Yang
6 months
Work with Jacob Walker, @jparkerholder , @du_yilun , Jake Bruce, Andre Barreto, @pabbeel , Dale Schuurmans.
0
0
5
@mengjiao_yang
Sherry Yang
9 months
Under DFT evaluations, UniMat generates materials with significantly lower decomposition energy and many more new stable materials (with respect to Materials Project 2021) compared to previous state-of-the-art generative models.
Tweet media one
Tweet media two
2
0
4
@mengjiao_yang
Sherry Yang
1 year
Video Adapter has broad potential applications such as anime production and domain randomization for bridging the sim-to-real gap in robotics. Great collaboration with @du_yilun , @daibond_alpha , Dale, Josh, and @pabbeel .
0
0
4
@mengjiao_yang
Sherry Yang
6 months
Video is also a unified observation space across different embodiments, so we can generate visual execution plans for different robots using a single video generation model:
1
0
4
@mengjiao_yang
Sherry Yang
10 months
@jeasinema Hi Xiaojian, thanks for the interest! For 1), we didn't train separate models for each domain; we used a single conditional video diffusion model for all the data mixtures (the mixture ratio is in Appendix B of the paper).
1
0
4
@mengjiao_yang
Sherry Yang
2 years
Extremely useful research advice on how to turn partial results into impactful research contributions!
@ofirnachum
Ofir Nachum
2 years
My second post is up now (as promised, updates are very infrequent). "Paper Writing: A View from the Trenches"
0
6
29
0
0
4
@mengjiao_yang
Sherry Yang
9 months
Work with KwangHwan Cho, @amilmerchant , @pabbeel , Dale Schuurmans, @IMordatch , @ekindogus .
1
0
4
@mengjiao_yang
Sherry Yang
1 year
Video Adapter can effectively adapt pretrained video model to egocentric videos and robotic data.
1
0
4
@mengjiao_yang
Sherry Yang
2 years
Happening now in room 291!
@shaneguML
Shane Gu
2 years
Check out Foundation Models for Decision Making workshop in Room 291-292! #NeurIPS2022
Tweet media one
Tweet media two
1
6
42
0
0
4
@mengjiao_yang
Sherry Yang
2 years
We formulate the “chain of thought” imitation learning problem, where an agent also has access to the intermediate computations that generated the expert state-action pairs, such as planning, search, or some other multi-step algorithm (e.g., BFS, MCTS).
1
0
4
@mengjiao_yang
Sherry Yang
6 months
A model can answer people’s questions by generating how-to videos (e.g., “how to make a sushi roll”), which may be preferable to textual responses.
Tweet media one
1
0
3
@mengjiao_yang
Sherry Yang
2 years
Morning speakers: 10-10:30 Gato: A Generalist Agent @gbarthmaron 10:30-11 Open-Ended Embodied Agents with Internet-Scale Knowledge @DrJimFan 11-11:30 What does an intelligent robot need to know? (Leslie) 11:30-12 Learning and Leveraging Foundation Models in Robotics @DorsaSadigh
1
0
3
@mengjiao_yang
Sherry Yang
1 year
Video Adapter achieves much better FVD, FID, and Inception Scores than both the pretrained and the task-specific model, and outperforms finetuning under the same TPU hours.
Tweet media one
Tweet media two
1
0
3
@mengjiao_yang
Sherry Yang
2 years
Oral presentations: 13:30 - 13:45: In-context Reinforcement Learning with Algorithm Distillation 13:45 - 14:00: Large Language Models Are Human-Level Prompt Engineers
0
2
3
@mengjiao_yang
Sherry Yang
2 years
@svlevine @ofirnachum @kuanghueilee Also thanks to @IMordatch who contributed the majority of the initial MGDT infrastructure. Scalable RL is becoming reality.
0
0
3
@mengjiao_yang
Sherry Yang
2 years
I will be at J #929 on Thu 4pm to chat about chain-of-thought learning for embodied agents!
Tweet media one
@mengjiao_yang
Sherry Yang
2 years
What does “Learn principles, not formulas. Understand, do not memorize” mean for autonomous agents? Chain of Thought Imitation with Procedure Cloning! ArXiv Code Site w/ Dale @pabbeel @ofirnachum
2
31
131
0
1
3
@mengjiao_yang
Sherry Yang
3 years
Our breadth study shows that a class of contrastive self-prediction loss works particularly well while many other objectives perform poorly. Our depth study ablates over action/reward prediction/reconstruction, momentum network, tandem training, etc to illustrate their effects.
0
0
3
@mengjiao_yang
Sherry Yang
2 years
Code:
0
0
3
@mengjiao_yang
Sherry Yang
2 years
FMDM LOCATION CHANGE! Now in Room 291 - 292 on Saturday at 8:50 AM.
@mengjiao_yang
Sherry Yang
2 years
See you all at the 1st Foundation Models for Decision Making workshop @NeurIPSConf (Room 391) on Sat, Dec 3 2022. See schedule and zoom link at .
Tweet media one
1
20
89
0
0
3
@mengjiao_yang
Sherry Yang
9 months
@itif_workshop [5/5] We will be hosting the 2nd workshop on Foundation Models for Decision Making on Friday Dec 15 starting 8:15am CT in Hall E2. Don't miss out on this exciting workshop!
0
0
2
@mengjiao_yang
Sherry Yang
4 months
Unfortunately, I will have to miss the oral presentation. @du_yilun will talk about the work instead. But hope to see you at the poster session tomorrow! P.S. Lesson learned -- apply for Schengen visa early; appointments in certain regions fill up two months in advance!
1
0
2