Sherry Yang Profile
Sherry Yang

@mengjiao_yang

3,118
Followers
361
Following
44
Media
160
Statuses

Research Scientist @GoogleDeepMind | PhD Student @UCBerkeley. Previously M.Eng. / B.S. @MIT.

Joined September 2015
Pinned Tweet
@mengjiao_yang
Sherry Yang
11 months
Introducing Universal Simulator (UniSim), an interactive simulator of the real world. Interactive website: Paper:
32
245
1K
@mengjiao_yang
Sherry Yang
2 years
Interested in foundation models + RL? Keep an eye out for the 1st "Foundation Models for Decision Making" workshop at NeurIPS 2022: . Call for submissions will soon follow. w. @du_yilun @jparkerholder @siddkaramcheti @IMordatch @shaneguML @ofirnachum
Tweet media one
5
101
573
@mengjiao_yang
Sherry Yang
1 year
Review paper on Foundation Models for Decision Making: Foundation models can characterize various components of decision making, such as states (S), behaviors (A), dynamics (T), task specifiers (R), through generative modeling or representation learning.
Tweet media one
4
110
442
@mengjiao_yang
Sherry Yang
9 months
Check out UniMat -- a unified representation of materials that enables scaling of diffusion models to millions of stable crystal structures. Website: Paper:
6
49
232
@mengjiao_yang
Sherry Yang
2 months
Consider joining our team at Google DeepMind to work on foundation models for decision making, e.g., foundation model alignment, reasoning, planning, simulation, and optimization with foundation models.
@hanjundai
Hanjun Dai
2 months
Our team (w/Dale, @daibond_alpha , @mengjiao_yang + others) at Google DeepMind is looking to hire. If you are interested in foundation models+decision making, and making real-world impact through Gemini and cloud solutions, please consider applying through
1
27
134
0
18
235
@mengjiao_yang
Sherry Yang
6 months
Video generation will revolutionize decision making in the physical world like how language models have changed the digital world. Interested in the implications of video generation models like UniSim and Sora? Check out our position paper:
3
48
207
@mengjiao_yang
Sherry Yang
4 months
Happy to share that UniSim was selected for an Outstanding Paper Award at #ICLR2024 . Check out the oral presentation today at 10:30am Oral 1B and poster on Wed at 4:30-6:30pm #87 . Thanks to the award committee @eunsolc , @katjahofmann , @liu_mingyu ,
@iclr_conf
ICLR 2025
4 months
Announcing the #ICLR2024 Outstanding Paper Awards: Shoutout to the awards committee: @eunsolc , @katjahofmann , @liu_mingyu , @nanjiang_cs , @guennemann , @optiML , @tkipf , @CevherLIONS
3
53
303
19
12
162
@mengjiao_yang
Sherry Yang
2 years
What does “Learn principles, not formulas. Understand, do not memorize” mean for autonomous agents? Chain of Thought Imitation with Procedure Cloning! ArXiv Code Site w/ Dale @pabbeel @ofirnachum
2
31
131
@mengjiao_yang
Sherry Yang
2 years
Text-conditioned video generation can serve as universal policies (UniPi) and learn from sim, real, and web-scale videos. w/ @du_yilun , @hanjundai , @daibond_alpha , @ofirnachum , Josh, Dale, @pabbeel Paper: Web:
Tweet media one
Tweet media two
3
30
126
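The video-as-policy idea above can be sketched in a few lines: a text-conditioned video model imagines future frames, and an inverse dynamics model decodes actions from consecutive frames. Everything below is a toy numpy stand-in (UniPi itself uses a text-conditioned video diffusion model and a learned inverse dynamics network), so the function names and shapes are illustrative assumptions only.

```python
import numpy as np

def generate_video(obs, text_goal, horizon=4):
    """Toy stand-in for a text-conditioned video model: 'imagines' future
    frames by interpolating from the current frame toward a goal frame
    (here just a fixed random frame standing in for the text conditioning)."""
    rng = np.random.default_rng(0)
    goal = rng.standard_normal(obs.shape)
    return np.stack([obs + (goal - obs) * t / horizon for t in range(horizon + 1)])

def inverse_dynamics(frame, next_frame):
    """Toy inverse-dynamics model: recover the 'action' as the frame delta."""
    return next_frame - frame

def unipi_policy(obs, text_goal):
    """Video-as-policy: plan in frame space, then decode actions frame by frame."""
    frames = generate_video(obs, text_goal)
    return [inverse_dynamics(frames[i], frames[i + 1]) for i in range(len(frames) - 1)]

obs = np.zeros(8)
actions = unipi_policy(obs, "pick up the red block")
# Executing the decoded actions step by step reproduces the imagined final frame.
```

Because the plan lives in frame space, environments with different action spaces can share the same video model; only the small inverse-dynamics decoder is environment-specific.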
@mengjiao_yang
Sherry Yang
1 year
As video foundation models reach billions of parameters, how to adapt them to task-specific settings (e.g., animation, robotics) without access to the model weights becomes a pressing issue. We introduce Video Adapter:
2
21
102
@mengjiao_yang
Sherry Yang
8 months
See you all at the 2nd Foundation Models for Decision Making #NeurIPS2023 workshop tomorrow (Friday) at Hall E2 starting 8:15am. Don't miss out on an exciting lineup of speakers! See schedule and zoom link at .
Tweet media one
1
23
99
@mengjiao_yang
Sherry Yang
2 years
See you all at the 1st Foundation Models for Decision Making workshop @NeurIPSConf (Room 391) on Sat, Dec 3 2022. See schedule and zoom link at .
Tweet media one
@mengjiao_yang
Sherry Yang
2 years
Interested in foundation models + RL? Keep an eye out for the 1st "Foundation Models for Decision Making" workshop at NeurIPS 2022: . Call for submissions will soon follow. w. @du_yilun @jparkerholder @siddkaramcheti @IMordatch @shaneguML @ofirnachum
Tweet media one
5
101
573
1
20
89
@mengjiao_yang
Sherry Yang
1 month
Looking forward to presenting the following papers @icmlconf : - Position paper on Video Generation for Decision Making (Tue 1:30 - 3pm #2613 ). - Code as Reward for real-world RL with VLMs (Thur 1:30-3pm #1115 ).
2
11
80
@mengjiao_yang
Sherry Yang
5 months
I talked about our position paper on "Video as the New Language for Real-World Decision Making" on the TWIML podcast @twimlai . Check out the conversation below:
@twimlai
The TWIML AI Podcast
5 months
Today we’re joined by @mengjiao_yang , a senior research scientist at Google DeepMind, to learn why video data offers a better foundation than natural language for teaching AI to understand the world. 🎧 / 🎥To listen to the audio version, visit .
2
10
50
2
13
80
@mengjiao_yang
Sherry Yang
3 years
Paper with @ofirnachum -- Representation Matters: Offline Pretraining for Sequential Decision Making -- is accepted to #ICML2021 ! Contrastive pretraining yields huge gains in low-data imitation learning, offline RL, and online RL.
Tweet media one
1
8
78
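The contrastive pretraining referenced above is typically an InfoNCE-style objective over state embeddings; a minimal numpy sketch (not the paper's exact objective -- the next-state positives and temperature are illustrative assumptions):

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """InfoNCE-style contrastive loss: each anchor embedding should score
    its own positive (e.g., a future state from the same trajectory)
    above every other positive in the batch."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                 # (B, B) cosine similarities
    logits -= logits.max(axis=1, keepdims=True)    # stabilize the softmax
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))            # matched pairs on the diagonal

# Random state embeddings: matched pairs should incur lower loss than mismatched.
rng = np.random.default_rng(0)
states = rng.standard_normal((4, 8))
matched = info_nce_loss(states, states)
```

Minimizing this loss pushes representations of related states together, which is what makes the pretrained encoder useful for downstream imitation and RL.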
@mengjiao_yang
Sherry Yang
10 months
Both RT-Sketch () and RT-trajectory () can be viewed as a form of "chain-of-thought" of agents (). Learning mappings from high-dim images to low-dim controls is difficult, and intermediate info could help.
1
15
57
@mengjiao_yang
Sherry Yang
9 months
Heading to #NeurIPS2023 with an exciting agenda: [1/5] I will showcase UniSim at the Google DeepMind booth (Hall C, 315) on Mon Dec 11 12-2pm CT. Please stop by to interact with the real-world simulator.
@mengjiao_yang
Sherry Yang
11 months
Introducing Universal Simulator (UniSim), an interactive simulator of the real world. Interactive website: Paper:
32
245
1K
3
5
54
@mengjiao_yang
Sherry Yang
3 years
TRAIL's latent action pretraining provably accelerates downstream imitation learning even when the offline dataset is highly suboptimal (e.g., collected from a random policy). Paper: Code: Website:
@ofirnachum
Ofir Nachum
3 years
How can we leverage existing behavior datasets to learn an "easier" action space for control? Many have thought about this question (eg see my OPAL paper w/ A. Ajay), but existing work relies on dataset to already contain good and temporally-extended behavior. And so... TRAIL!
Tweet media one
Tweet media two
1
2
27
1
8
51
@mengjiao_yang
Sherry Yang
4 months
Check out our ICML paper -- Code as Reward. VLMs can generate code to compute reward from images, enabling RL agents to learn in the real world without hand-designed simulators. This complements generative world models like UniSim.
@DavidAVenuto
David Venuto
4 months
We are excited to announce that our work "Code as Reward: Empowering Reinforcement Learning with VLMs" was accepted to ICML 2024. This work was done with Sami Nur Islam @MartinKlissarov Doina Precup @mengjiao_yang @ankit_s_anand .
Tweet media one
2
5
32
0
10
49
@mengjiao_yang
Sherry Yang
2 years
Dichotomy of Control accepted as "notable-top-5%" at #iclr2023 ! Only controlling what one can control seems to also be a useful research philosophy.
@ofirnachum
Ofir Nachum
2 years
Despite all the buzz about decision transformers, it's well-known they can be unboundedly suboptimal in stochastic envs. The issue is due to conditioning the policy on return, a highly stochastic quantity. We introduce "dichotomy of control" to solve this:
Tweet media one
5
38
200
0
4
47
@mengjiao_yang
Sherry Yang
2 years
CALM () and CHAI () are our first attempt at combining offline RL with language models to solve task-oriented dialogue. While dialogue is hard, our initial set of results is highly encouraging:
@svlevine
Sergey Levine
2 years
Offline RL is a natural fit for dialogue: RL with humans is hard, but data of humans talking to humans is plentiful. In two new papers, we explore offline RL and for end-to-end dialogue systems with Transformers! CALM: CHAI: 🧵->
1
42
244
2
7
46
@mengjiao_yang
Sherry Yang
3 months
I'll present Unified Materials (UniMat) today 10:45am - 12:45pm at poster #170 #ICLR2024 . Come and learn about the initial effort from @GoogleDeepMind on developing generative models for materials. Joint work with KwangHwan Cho, @amilmerchant , @pabbeel , Dale Schuurmans,
@mengjiao_yang
Sherry Yang
9 months
Check out UniMat -- a unified representation of materials that enables scaling of diffusion models to millions of stable crystal structures. Website: Paper:
6
49
232
1
6
44
@mengjiao_yang
Sherry Yang
10 months
Check out Video Language Planning! VLP uses vision-language models as policy and reward, and UniSim as dynamics. Foundation agent models can now conduct search and planning in foundation world models.
@du_yilun
Yilun Du
10 months
Introducing Video Language Planning! By planning across the space of generated videos/language, we can synthesize long-horizon video plans and solve much longer horizon tasks than existing baseline (such as RT-2 and PALM-E). (1/5)
7
48
290
0
13
44
@mengjiao_yang
Sherry Yang
3 months
I'll present Video Adapter today (Friday) 10:45am - 12:45pm at poster #200 #ICLR2024 . Come and learn about how we can adapt large video foundation models like UniSim to domain-specific settings (e.g., animation, robotics) without requiring access to the foundation model
@mengjiao_yang
Sherry Yang
1 year
As video foundation models reach billions of parameters, how to adapt them to task-specific settings (e.g., animation, robotics) without access to the model weights becomes a pressing issue. We introduce Video Adapter:
2
21
102
1
4
42
@mengjiao_yang
Sherry Yang
4 years
Off-policy evaluation is not the full story -- we introduce Bayesian decision making to select policies based on any criteria: Offline Policy Selection under Uncertainty: code: with @daibond_alpha , @ofirnachum , @georgejtucker , Dale
Tweet media one
1
6
38
@mengjiao_yang
Sherry Yang
4 years
New paper: Energy-Based Processes for Exchangeable Data. We combine EBMs with stochastic processes to model set structures with arbitrary cardinality. Paper: Code: Joint work with @daibond_alpha , @hanjundai , and Dale Schuurmans.
Tweet media one
0
9
32
@mengjiao_yang
Sherry Yang
11 months
UniSim is trained on broad data rich in different axes (internet text-image, human activity videos, simulated executions, real robot videos, panorama scans, etc).
Tweet media one
2
4
31
@mengjiao_yang
Sherry Yang
2 years
In our recent work ALPT, I was surprised that an inverse dynamics model pretrained on Freeway (an Atari game with only up-and-down movement) dramatically improves performance on Breakout (which only moves left and right). One step closer towards cross-domain generalization:
@ofirnachum
Ofir Nachum
2 years
I am amazed at the generalization performance of inverse dynamics models for ALPT, even winning out in situations where source/target tasks have completely disjoint action spaces! Awesome work with @DavidAVenuto @mengjiao_yang @pabbeel @IMordatch , Precup.
Tweet media one
0
1
14
1
2
29
@mengjiao_yang
Sherry Yang
11 months
UniSim supports training RL agents purely in simulation that transfer to real robots in zero-shot.
1
2
27
@mengjiao_yang
Sherry Yang
8 months
FMDM workshop happening now in Hall E2 with @percyliang giving the first talk and a packed audience.
Tweet media one
@mengjiao_yang
Sherry Yang
8 months
See you all at the 2nd Foundation Models for Decision Making #NeurIPS2023 workshop tomorrow (Friday) at Hall E2 starting 8:15am. Don't miss out on an exciting lineup of speakers! See schedule and zoom link at .
Tweet media one
1
23
99
1
1
25
@mengjiao_yang
Sherry Yang
3 months
I will dive into more detail on how we have been leveraging UniSim to solve embodied decision making tasks at the Generative AI for Decision Making workshop today 4pm-4:30pm (Lehar 3).
@rl_agent
Lisa Lee
3 months
Generative AI for Decision Making workshop at #ICLR2024 is taking place on Sat, May 11 @ 8:30 - 17:00 (Lehar 3). We have an exciting lineup of Invited Speakers including: Katja Hofmann @katjahofmann @MSFTResearch Igor Mordatch @Imordatch @GoogleDeepMind Yuandong Tian @tydsh
1
5
39
0
1
25
@mengjiao_yang
Sherry Yang
2 years
Now with the ICLR deadline behind us, please consider submitting to the FMDM workshop @NeurIPSConf by Oct 3! We have an exciting lineup of speakers: Leslie Kaelbling, @DorsaSadigh , Dale Schuurmans, @pathak2206 , @machelreid , @Thom_Wolf , and @DrJimFan . There will be paper & travel awards.
@mengjiao_yang
Sherry Yang
2 years
Interested in foundation models + RL? Keep an eye out for the 1st "Foundation Models for Decision Making" workshop at NeurIPS 2022: . Call for submissions will soon follow. w. @du_yilun @jparkerholder @siddkaramcheti @IMordatch @shaneguML @ofirnachum
Tweet media one
5
101
573
0
4
24
@mengjiao_yang
Sherry Yang
11 months
@du_yilun @coolboi95 @JonathanTompson @pabbeel Learning realistic world models is becoming a reality @ylecun
1
0
23
@mengjiao_yang
Sherry Yang
2 years
ChatGPT as a research advisor:
Tweet media one
2
2
24
@mengjiao_yang
Sherry Yang
11 months
UniSim supports both high-level language actions and low-level motor controls through conditional video generation.
Tweet media one
1
1
21
@mengjiao_yang
Sherry Yang
11 months
UniSim supports simulation of diverse manipulation actions such as various actions in cooking a meal.
2
2
19
@mengjiao_yang
Sherry Yang
11 months
UniSim supports diverse environment transitions such as different objects being uncovered.
1
2
18
@mengjiao_yang
Sherry Yang
11 months
UniSim also supports training embodied VLM planners purely in simulation and transfer to real robots in zero-shot.
1
1
16
@mengjiao_yang
Sherry Yang
2 years
We will be at J #107 on Thu 11am #NeurIPS2022 to chat about Multi-Game Decision Transformers!
Tweet media one
@IMordatch
Igor Mordatch
2 years
How can we effectively train generalist multi-environment agents? We trained a single Decision Transformer model to play many Atari games simultaneously and compared it to alternative approaches:
4
60
315
0
4
16
@mengjiao_yang
Sherry Yang
2 years
Curious about how to make OPE *practical* for selecting policies in practice? Check out our #AISTATS2022 poster on Wed 3/30 at 8:30am PT (session 5). Offline Policy Selection under Uncertainty w/ @daibond_alpha , @ofirnachum , @georgejtucker , Dale
Tweet media one
1
5
16
@mengjiao_yang
Sherry Yang
11 months
A reminder to submit your work to the Foundation Models for Decision Making NeurIPS workshop. The deadline is Oct 1 AoE.
1
3
16
@mengjiao_yang
Sherry Yang
11 months
UniSim also supports diverse navigation actions.
1
1
15
@mengjiao_yang
Sherry Yang
1 year
Come and talk to us about ALPT (action limited pretraining) this Thursday at 1:30pm (Hall 1 #104 )!
Tweet media one
@mengjiao_yang
Sherry Yang
2 years
In our recent work ALPT, I was surprised that an inverse dynamics model pretrained on Freeway (an Atari game with only up-and-down movement) dramatically improves performance on Breakout (which only moves left and right). One step closer towards cross-domain generalization:
1
2
29
0
3
15
@mengjiao_yang
Sherry Yang
11 months
Work done with @du_yilun , @coolboi95 , @JonathanTompson , Dale Schuurmans, @pabbeel .
1
1
12
@mengjiao_yang
Sherry Yang
1 year
I'll present Dichotomy of Control today at 11am in the auditorium (Oral 3 Track 1) and poster session 11:30am - 1:30pm in MH1-2-3-4, #119 . Hope to see you there. #ICLR23 Recording:
@ofirnachum
Ofir Nachum
2 years
Despite all the buzz about decision transformers, it's well-known they can be unboundedly suboptimal in stochastic envs. The issue is due to conditioning the policy on return, a highly stochastic quantity. We introduce "dichotomy of control" to solve this:
Tweet media one
5
38
200
0
1
12
@mengjiao_yang
Sherry Yang
11 months
UniSim also supports long-horizon interactions without compromising simulation quality.
1
0
10
@mengjiao_yang
Sherry Yang
2 years
Foundation Models for Decision Making (FMDM) workshop @NeurIPSConf is open for submissions! Remember to submit your work (4-9 pages) before Sept 22 at .
@ofirnachum
Ofir Nachum
2 years
We are open for submissions! I know there are lots of people working on large models, pretraining, cross-domain/agent generalization for RL. Please submit your papers to the 1st FMDM workshop at NeurIPS 2022!
Tweet media one
1
20
126
0
2
12
@mengjiao_yang
Sherry Yang
1 year
For instance, skill discovery and Decision Transformer are examples of generative models of behavior, whereas model-based offline RL and Trajectory Transformer are examples of generative models of the world dynamics.
1
0
10
@mengjiao_yang
Sherry Yang
2 years
Come and learn about TRAIL for action representation learning at #ICLR2022 Thur 4/28 10:30 am PT: (with @svlevine and @ofirnachum ).
@mengjiao_yang
Sherry Yang
3 years
TRAIL's latent action pretraining provably accelerates downstream imitation learning even when the offline dataset is highly suboptimal (e.g., collected from a random policy). Paper: Code: Website:
1
8
51
0
3
10
@mengjiao_yang
Sherry Yang
1 year
Large language models can serve as agents or environments, supporting interactions with humans, tools, and the real world, enabling new learning environments such as the internet and human knowledge.
1
2
9
@mengjiao_yang
Sherry Yang
1 year
Work with @ofirnachum , @du_yilun , @_jasonwei , @pabbeel , Dale Schuurmans. Please feel free to drop us a note if your work is relevant and should be included in this review.
0
0
8
@mengjiao_yang
Sherry Yang
6 months
Aside from directly solving tasks, video generation is also a realistic simulator for complex games (), which can be combined with model-based planning, or be used to create new games ().
@_rockt
Tim Rocktäschel
6 months
I am really excited to reveal what @GoogleDeepMind 's Open Endedness Team has been up to 🚀. We introduce Genie 🧞, a foundation world model trained exclusively from Internet videos that can generate an endless variety of action-controllable 2D worlds given image prompts.
145
572
3K
1
2
9
@mengjiao_yang
Sherry Yang
1 year
Large pretrained vision / language foundation models can characterize various perceptual components of decision making agents such as image observations, language actions, and language goals (e.g., SayCan, PALM-E) through plug-and-play.
1
0
8
@mengjiao_yang
Sherry Yang
3 years
Super exciting work that connects contrastive pretraining to learning representations of transitions/rewards in a dynamical system, which provably improves downstream behavioral cloning tasks!
@ofirnachum
Ofir Nachum
3 years
For those who want a distraction from their submissions... checkout our new paper! w @mengjiao_yang about *provable* bounds of representation learning for downstream imitation (ie behavioral cloning, ie max-likelihood). Lots results & insights in this paper I'm excited about...1/
Tweet media one
1
7
80
0
0
8
@mengjiao_yang
Sherry Yang
2 years
UniPi came about as a part of our broader effort on *Foundation Models for Decision Making*, where vision-language architectures and pretrained models are applied to decision making. Check out other exciting work in this area:
@mengjiao_yang
Sherry Yang
2 years
Interested in foundation models + RL? Keep an eye out for the 1st "Foundation Models for Decision Making" workshop at NeurIPS 2022: . Call for submissions will soon follow. w. @du_yilun @jparkerholder @siddkaramcheti @IMordatch @shaneguML @ofirnachum
Tweet media one
5
101
573
1
3
8
@mengjiao_yang
Sherry Yang
4 years
Our unified DICE framework reveals that, unlike Q-values, the dual estimates of policy value in terms of state-action distributions offer greater flexibility in regularization while being unbiased, and are robust to scaling and shifting of MDP rewards.
@ofirnachum
Ofir Nachum
4 years
Policy evaluation via duality/Lagrangian methods presents a lot of choices (how to setup the LPs, regularize them, etc). In we examine how these choices affect accuracy of final eval. Lots of insights in this paper, many of which I didn't expect....
Tweet media one
2
11
70
0
0
8
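The dual (state-action distribution) view mentioned above can be written down concisely. In standard DICE notation (a sketch of the common identity, not necessarily this paper's exact estimator), the policy value is re-expressed through a distribution-correction ratio over the offline data distribution $d^{\mathcal{D}}$:

```latex
\rho(\pi)
 = \mathbb{E}_{(s,a)\sim d^{\pi}}\!\big[r(s,a)\big]
 = \mathbb{E}_{(s,a)\sim d^{\mathcal{D}}}\!\big[\zeta(s,a)\, r(s,a)\big],
\qquad
\zeta(s,a) := \frac{d^{\pi}(s,a)}{d^{\mathcal{D}}(s,a)}.
```

Because the estimate is linear in $r$ and $\mathbb{E}_{d^{\mathcal{D}}}[\zeta] = 1$, replacing $r$ with $a\,r + b$ simply maps the estimate to $a\,\rho(\pi) + b$, which is the robustness to reward scaling and shifting mentioned in the tweet.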
@mengjiao_yang
Sherry Yang
2 years
Evaluations on simulated robotic navigation, manipulation, and game environments show that procedure cloning exhibits significant generalization to unseen environment configurations such as maze layouts, positions of objects, transition stochasticity, and game difficulties.
Tweet media one
Tweet media two
Tweet media three
2
0
7
@mengjiao_yang
Sherry Yang
1 year
Video Adapter works by composing scores of a pretrained text-to-video model with scores of a domain-specific small model (with 1% parameters) during sampling time, achieving high-quality yet flexible video synthesis without requiring gradient updates on the pretrained model.
Tweet media one
1
1
7
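The score composition described above can be sketched as a weighted sum of the two models' denoising scores at each sampling step; the weighting scheme and the toy "models" below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def adapted_score(score_pretrained, score_small, x, t, weight=0.5):
    """Compose denoising scores at sampling time: the large pretrained model
    supplies a broad video prior, the small domain model steers generation,
    and the pretrained weights are never updated."""
    return (1 - weight) * score_pretrained(x, t) + weight * score_small(x, t)

# Toy stand-ins: each "model" scores toward its own mean frame.
pre_mean, dom_mean = np.zeros(4), np.ones(4)
score_pre = lambda x, t: pre_mean - x
score_dom = lambda x, t: dom_mean - x

x = np.full(4, 0.5)
s = adapted_score(score_pre, score_dom, x, t=0, weight=0.5)
```

Since only scores are combined at sampling time, adaptation needs no gradient access to the pretrained model, which is the point of the tweet above.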
@mengjiao_yang
Sherry Yang
9 months
One key to the success of foundation models is a unified representation. Language uses tokens. Vision uses pixels. What about materials? We develop a unified way to represent materials by storing the locations of atoms at their respective entries in the periodic table.
Tweet media one
1
0
6
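The representation above can be sketched as a fixed-size array indexed by atomic number; the per-element atom cap and null-location convention below are illustrative assumptions (the paper's exact tensor layout may differ).

```python
import numpy as np

PERIODIC_TABLE_SIZE = 118
MAX_ATOMS_PER_ELEMENT = 4         # assumed cap for this toy sketch
NULL_LOCATION = np.full(3, -1.0)  # placeholder entry for absent atoms

def encode_material(atoms):
    """Unified materials representation (sketch): a fixed-size array indexed
    by atomic number, holding fractional coordinates of each atom; entries
    for elements not present stay at a special null location."""
    rep = np.tile(NULL_LOCATION, (PERIODIC_TABLE_SIZE, MAX_ATOMS_PER_ELEMENT, 1))
    counts = {}
    for atomic_number, xyz in atoms:
        i = counts.get(atomic_number, 0)
        rep[atomic_number - 1, i] = xyz
        counts[atomic_number] = i + 1
    return rep

# NaCl-like toy crystal: one Na (Z=11) and one Cl (Z=17) atom.
nacl = encode_material([(11, np.zeros(3)), (17, np.full(3, 0.5))])
```

Every material, whatever its composition, maps to the same fixed shape, which is what lets a single diffusion model train across millions of structures.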
@mengjiao_yang
Sherry Yang
2 years
Lastly, we note the connection between "chain of thought imitation" and "chain of thought prompting" by @_jasonwei et al. --- decomposing multi-step problems into intermediate steps and learning the intermediate steps using a sequence model is applicable to a variety of problems.
0
0
7
@mengjiao_yang
Sherry Yang
4 months
@du_yilun Poster session happening in 30 min at #87 with a DIY laptop stand!
Tweet media one
0
1
7
@mengjiao_yang
Sherry Yang
6 months
Visual and algorithmic reasoning can also be cast as next-frame / video generation tasks.
Tweet media one
Tweet media two
1
0
5
@mengjiao_yang
Sherry Yang
6 months
@billpeeb Would be cool to see reference to our "video generation as real-world simulator" work: . Ofc Sora is taking things to a whole new level.
@mengjiao_yang
Sherry Yang
11 months
Introducing Universal Simulator (UniSim), an interactive simulator of the real world. Interactive website: Paper:
32
245
1K
0
0
6
@mengjiao_yang
Sherry Yang
9 months
With this unified representation, we train a diffusion model to generate materials by moving atoms from random locations back to their original locations. Atoms that do not exist will be moved to a special null location. This allows generation of crystals with arbitrary elements.
Tweet media one
1
0
6
@mengjiao_yang
Sherry Yang
2 years
I found UniPi’s *video-as-policy* and *text-as-task* abstractions to be empowering. These abstractions unify environments with different state and action spaces and enable policy learning across broad datasets.
1
2
6
@mengjiao_yang
Sherry Yang
10 months
@xiao_ted An interesting next step would be to elicit this useful intermediate information in the agent automatically, so that humans don't have to draw goals / motion plans during inference, similar to how LLMs can automatically output intermediate reasoning steps.
1
1
4
@mengjiao_yang
Sherry Yang
6 months
For instance, classical computer vision tasks can be cast as a next-frame generation task ().
@YutongBAI1002
Yutong Bai
9 months
How far can we go with vision alone? Excited to reveal our Large Vision Model! Trained with 420B tokens, effective scalability, and enabling new avenues in vision tasks! (1/N) Kudos to @younggeng @Karttikeya_m @_amirbar , @YuilleAlan Trevor Darrell @JitendraMalikCV Alyosha Efros!
18
160
1K
1
0
5
@mengjiao_yang
Sherry Yang
6 months
Similar to text, video is a unified interface that can absorb internet knowledge and represent diverse tasks. This allows us to pour internet data into a single model that can solve many tasks through conditional generation.
Tweet media one
Tweet media two
1
0
6
@mengjiao_yang
Sherry Yang
10 months
@danijarh Thanks Danijar! Depending on the number of denoising steps, generating 16 frames could take anywhere between 1-20 seconds. We found for RL / planning on Language Table, having very few denoising steps (e.g., 8) is sufficient, which takes roughly a second.
0
0
5
@mengjiao_yang
Sherry Yang
6 months
A generative video simulator is also useful for optimizing control inputs in science and engineering domains where abundant video data can be collected but the underlying physical dynamics are hard to express explicitly (e.g., cloud movement, interaction with soft objects).
0
0
4
@mengjiao_yang
Sherry Yang
2 years
We propose procedure cloning (an alternative to BC), which applies supervised sequence prediction to imitate the series of expert computations. Procedure cloning learns not only what to do (i.e., the output action), but how and why to do it (i.e., the procedure).
1
0
5
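A minimal sketch of the idea, using BFS as the expert's procedure (the toy graph, trace format, and backtracking are illustrative assumptions; procedure cloning itself trains a sequence model on such traces rather than on the final action alone):

```python
from collections import deque

def bfs_with_trace(start, goal, neighbors):
    """Expert search procedure: BFS that records the sequence of node
    expansions (the 'chain of thought') leading to the first action."""
    trace, parent = [], {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        trace.append(node)           # intermediate computation, not just output
        if node == goal:
            # Backtrack to recover the first move out of the start state.
            while parent[node] != start and parent[node] is not None:
                node = parent[node]
            return trace, node       # (procedure, final action)
        for nxt in neighbors(node):
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return trace, None

# Toy chain graph 0-1-2-3; the expert's first action from 0 toward goal 3 is 1.
neighbors = lambda n: [m for m in (n - 1, n + 1) if 0 <= m <= 3]
trace, action = bfs_with_trace(0, 3, neighbors)
# Procedure cloning trains a sequence model on trace + [action];
# plain behavioral cloning would train only on the final action.
```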
@mengjiao_yang
Sherry Yang
1 year
Glad to see our initial work, Attentive Contrastive Learning (BERT for RL) in 2021 , has been joined by much subsequent work applying BERT-style autoencoding objectives to sequential decision making (e.g., MaskDP @fangchenliu_ and MTM @philippswu ).
@ofirnachum
Ofir Nachum
1 year
@aravindr93 Nice work. MTM sounds conceptually similar to what we called ACL in our paper a few years ago:
0
0
4
0
1
5
@mengjiao_yang
Sherry Yang
2 years
@HappyyPablo @du_yilun @jparkerholder @siddkaramcheti @IMordatch @shaneguML @ofirnachum Yes, the talks will be pre-recorded. The workshop will be in person but will also be live streamed.
0
0
5
@mengjiao_yang
Sherry Yang
9 months
We scale UniMat to train on >2M low-energy materials, and show that conditional generation (conditioned on atom type) can generalize to generating more difficult structures, outperforming random structure search (the current leading method) in discovering new stable materials.
Tweet media one
1
0
4
@mengjiao_yang
Sherry Yang
6 months
@_tim_brooks Would be cool to see reference to our "video generation as real-world simulator" work: . Ofc Sora is taking things to a whole new level.
@mengjiao_yang
Sherry Yang
11 months
Introducing Universal Simulator (UniSim), an interactive simulator of the real world. Interactive website: Paper:
32
245
1K
1
0
5
@mengjiao_yang
Sherry Yang
3 years
Before developing better dialogue agents, we need reliable and scalable evaluation of dialogue systems. ENIGMA evaluates dialogue agents by estimating how a human user would rate an agent via off-policy evaluation with DICE (distribution correction estimation).
@tourzhao
Tuo Zhao
3 years
An ideal environment for evaluating dialog agents, i.e., the Turing test, needs to involve human interaction, which is not affordable for large-scale experiments. Our EMNLP 2021 paper proposes a new framework – ENIGMA for automating the Turing test. (1/4)
2
7
14
0
0
5
@mengjiao_yang
Sherry Yang
6 months
Work with Jacob Walker, @jparkerholder , @du_yilun , Jake Bruce, Andre Barreto, @pabbeel , Dale Schuurmans.
0
0
5
@mengjiao_yang
Sherry Yang
9 months
Under DFT evaluations, UniMat generates materials with significantly lower decomposition energy and many more new stable materials (with respect to Materials Project 2021) compared to previous state-of-the-art generative models.
Tweet media one
Tweet media two
2
0
4
@mengjiao_yang
Sherry Yang
1 year
Video Adapter has broad potential applications such as anime production and domain randomization for bridging the sim-to-real gap in robotics. Great collaboration with @du_yilun , @daibond_alpha , Dale, Josh, and @pabbeel .
0
0
4
@mengjiao_yang
Sherry Yang
6 months
Video is also a unified observation space across different embodiments, so we can generate visual execution plans for different robots using a single video generation model:
1
0
4
@mengjiao_yang
Sherry Yang
10 months
@jeasinema Hi Xiaojian, thanks for the interest! For 1), we didn't train separate models for each domain; we used a single conditional video diffusion model for all the data mixtures (the mixture ratio is in Appendix B of the paper).
1
0
4
@mengjiao_yang
Sherry Yang
2 years
Extremely useful research advice on how to turn partial results into impactful research contributions!
@ofirnachum
Ofir Nachum
2 years
My second post is up now (as promised, updates are very infrequent). "Paper Writing: A View from the Trenches"
0
6
29
0
0
4
@mengjiao_yang
Sherry Yang
9 months
Work with KwangHwan Cho, @amilmerchant , @pabbeel , Dale Schuurmans, @IMordatch , @ekindogus .
1
0
4
@mengjiao_yang
Sherry Yang
1 year
Video Adapter can effectively adapt pretrained video model to egocentric videos and robotic data.
1
0
4
@mengjiao_yang
Sherry Yang
2 years
Happening now in room 291!
@shaneguML
Shane Gu
2 years
Check out Foundation Models for Decision Making workshop in Room 291-292! #NeurIPS2022
Tweet media one
Tweet media two
1
6
42
0
0
4
@mengjiao_yang
Sherry Yang
2 years
We formulate the “chain of thought” imitation learning problem, where an agent also has access to the intermediate computations that generated the expert state-action pairs, such as planning, search, or some other multi-step algorithm (e.g., BFS, MCTS).
1
0
4
@mengjiao_yang
Sherry Yang
6 months
A model can answer people’s questions by generating how-to videos (e.g., “how to make a sushi roll”), which may be preferable to textual responses.
Tweet media one
1
0
3
@mengjiao_yang
Sherry Yang
2 years
Morning speakers: 10-10:30 Gato: A Generalist Agent @gbarthmaron 10:30-11 Open-Ended Embodied Agents with Internet-Scale Knowledge @DrJimFan 11-11:30 What does an intelligent robot need to know? (Leslie) 11:30-12 Learning and Leveraging Foundation Models in Robotics @DorsaSadigh
1
0
3
@mengjiao_yang
Sherry Yang
1 year
Video Adapter achieves much better FVD, FID, and Inception Scores than both the pretrained and the task-specific model, and outperforms finetuning under the same TPU hours.
Tweet media one
Tweet media two
1
0
3
@mengjiao_yang
Sherry Yang
2 years
Oral presentations: 13:30 - 13:45: In-context Reinforcement Learning with Algorithm Distillation 13:45 - 14:00: Large Language Models Are Human-Level Prompt Engineers
0
2
3
@mengjiao_yang
Sherry Yang
2 years
@svlevine @ofirnachum @kuanghueilee Also thanks to @IMordatch who contributed the majority of the initial MGDT infrastructure. Scalable RL is becoming reality.
0
0
3
@mengjiao_yang
Sherry Yang
2 years
I will be at J #929 on Thu 4pm to chat about chain-of-thought learning for embodied agents!
Tweet media one
@mengjiao_yang
Sherry Yang
2 years
What does “Learn principles, not formulas. Understand, do not memorize” mean for autonomous agents? Chain of Thought Imitation with Procedure Cloning! ArXiv Code Site w/ Dale @pabbeel @ofirnachum
2
31
131
0
1
3
@mengjiao_yang
Sherry Yang
3 years
Our breadth study shows that a class of contrastive self-prediction loss works particularly well while many other objectives perform poorly. Our depth study ablates over action/reward prediction/reconstruction, momentum network, tandem training, etc to illustrate their effects.
0
0
3
@mengjiao_yang
Sherry Yang
2 years
Code:
0
0
3
@mengjiao_yang
Sherry Yang
2 years
FMDM LOCATION CHANGE! Now in Room 291 - 292 on Saturday at 8:50 AM.
@mengjiao_yang
Sherry Yang
2 years
See you all at the 1st Foundation Models for Decision Making workshop @NeurIPSConf (Room 391) on Sat, Dec 3 2022. See schedule and zoom link at .
Tweet media one
1
20
89
0
0
3
@mengjiao_yang
Sherry Yang
9 months
@itif_workshop [5/5] We will be hosting the 2nd workshop on Foundation Models for Decision Making on Friday Dec 15 starting 8:15am CT in Hall E2. Don't miss out on this exciting workshop!
0
0
2
@mengjiao_yang
Sherry Yang
4 months
Unfortunately, I will have to miss the oral presentation. @du_yilun will talk about the work instead. But hope to see you at the poster session tomorrow! P.S. Lesson learned -- apply for Schengen visa early; appointments in certain regions fill up two months in advance!
1
0
2