When we started this project the idea of training world models *exclusively* from Internet videos seemed wild, but it turns out latent actions are the key and the bitter lesson holds. Now we have a viable path to generating the rich diversity of environments we need for AGI. 🚀
I am really excited to reveal what @GoogleDeepMind's Open Endedness Team has been up to 🚀. We introduce Genie 🧞, a foundation world model trained exclusively from Internet videos that can generate an endless variety of action-controllable 2D worlds given image prompts.
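For intuition, here is a minimal sketch of the latent-action idea, under heavy simplifications of my own (feature vectors in place of video frames, a Gumbel-softmax in place of Genie's actual tokenization): infer a small discrete action code between consecutive frames, then condition a dynamics model on it, so no action labels are ever needed.

```python
# Minimal latent-action world model sketch (illustrative, NOT Genie's architecture).
# A latent action model infers a discrete code from (x_t, x_next); the dynamics
# model learns to predict x_next from (x_t, code). At inference time a user picks
# the code directly, making the model action-controllable without action labels.
import torch
import torch.nn as nn
import torch.nn.functional as F

N_ACTIONS, D = 8, 64  # tiny discrete action codebook; frame-embedding width

class LatentActionWorldModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc = nn.Linear(2 * D, N_ACTIONS)      # (x_t, x_next) -> action logits
        self.codebook = nn.Embedding(N_ACTIONS, D)  # embeddings of latent actions
        self.dyn = nn.Sequential(nn.Linear(2 * D, 256), nn.ReLU(), nn.Linear(256, D))

    def forward(self, x_t, x_next):
        logits = self.enc(torch.cat([x_t, x_next], dim=-1))
        code = F.gumbel_softmax(logits, tau=1.0, hard=True)  # discrete, differentiable
        a_emb = code @ self.codebook.weight
        pred = self.dyn(torch.cat([x_t, a_emb], dim=-1))     # predict the next frame
        return F.mse_loss(pred, x_next)

model = LatentActionWorldModel()
x_t, x_next = torch.randn(32, D), torch.randn(32, D)  # stand-ins for frame embeddings
model(x_t, x_next).backward()  # trained purely from video pairs: no action annotations
```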
I'm super excited to be joining @DeepMind today as a Research Scientist, working with @_rockt! Thank you to everyone who helped make this possible! Watch this space 🌱
🤖 Introducing the first survey on AutoRL: methods for automatically discovering multiple components of the RL training pipeline, from tuning hyperparameters and architectures to learning algorithms or automatically designing environments. Link 👉 [1/4]
Evolving Curricula with Regret-Based Environment Design
Website:
Paper:
TL;DR: We introduce a new open-ended RL algorithm that produces complex levels and a robust agent that can solve them (e.g. below).
Highlights ⬇️! [1/N]
I always love hearing from former ML PhD students about the days before tensorflow/pytorch... maybe in a few years we will tell current PhD students about the time before free MuJoCo 🙌
We’ve acquired the MuJoCo physics simulator () and are making it free for all, to support research everywhere. MuJoCo is a fast, powerful, easy-to-use, and soon to be open-source simulation tool, designed for robotics research:
Feel very fortunate to have contributed to this as my first project @DeepMind! It is amazing to see what can be done when combining Transformer models with meta-RL and PLR in a vast, open-ended task space!
I’m super excited to share our work on AdA: An Adaptive Agent capable of hypothesis-driven exploration which solves challenging unseen tasks with just a handful of experience, at a similar timescale to humans.
See the thread for more details 👇 [1/N]
The case for offline RL is clear: we often have access to real world data in settings where it is expensive (and potentially even dangerous) to collect new experience. But what happens if this offline data doesn’t perfectly match the test environment? [1/8]
Population Based Training (PBT) has been shown to be successful in a variety of RL settings, but often requires vast computational resources 💰. To address this, last year we introduced Population Based Bandits (PB2) [1/N]
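For anyone new to these methods, a stripped-down sketch of the exploit/explore loop PBT runs (illustrative only; PB2's contribution is replacing the random perturbation below with a bandit-driven choice of new hyperparameters):

```python
import copy
import random

# Toy PBT step: the bottom quartile copies weights from top performers (exploit)
# and randomly perturbs the inherited hyperparameters (explore).
def pbt_step(population, evaluate):
    scores = [evaluate(member) for member in population]
    ranked = sorted(range(len(population)), key=lambda i: scores[i])
    quartile = max(1, len(population) // 4)
    for bad in ranked[:quartile]:
        good = random.choice(ranked[-quartile:])
        population[bad]["weights"] = copy.deepcopy(population[good]["weights"])
        population[bad]["hparams"] = {
            k: v * random.choice([0.8, 1.2])  # PB2 replaces this with a GP-bandit
            for k, v in population[good]["hparams"].items()
        }
    return population

population = [{"weights": [0.0], "hparams": {"lr": 10 ** -random.uniform(2, 5)}}
              for _ in range(8)]
population = pbt_step(population, evaluate=lambda m: -m["hparams"]["lr"])  # dummy metric
```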
Super exciting time to work on population-based methods! We already have fast data collection, now this paper shows vectorizing agent updates can lead to huge speedups (on a GPU):
Looking forward to discussing with the authors (@instadeepai) at #ICML2022 😀
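The gist of that speedup can be sketched with jax.vmap, with a toy supervised loss standing in for each agent's RL update (this is illustrative, not the paper's code):

```python
import jax
import jax.numpy as jnp

# One fused, jitted kernel updates the whole population at once instead of
# looping over agents in Python.
def loss(params, batch):
    x, y = batch
    return jnp.mean((x @ params - y) ** 2)  # stand-in for an agent's RL objective

def sgd_step(params, batch, lr=1e-2):
    return params - lr * jax.grad(loss)(params, batch)

pop_params = jnp.zeros((128, 16))                         # 128 agents, 16 weights each
batches = (jnp.ones((128, 32, 16)), jnp.ones((128, 32)))  # one batch per agent
pop_update = jax.jit(jax.vmap(sgd_step, in_axes=(0, 0)))
pop_params = pop_update(pop_params, batches)              # all agents updated in parallel
```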
Working closely with many amazing members of @UCL_DARK (and @robertarail) over the past few years has been a privilege and I am *also* super excited to make this official!! 😎🚀
We are super excited to announce that Dr Roberta Raileanu (@robertarail) and Dr Jack Parker-Holder (@jparkerholder) have joined @UCL_DARK as Honorary Lecturers! Both have done impressive work in Reinforcement Learning and Open-Endedness, and our lab is lucky to get their support.
Heading to Baltimore for #ICML2022 ✈️ Will be presenting ACCEL on Thursday and would love to chat about unsupervised environment design and open-endedness with many of you there! DM if you're around and want to catch up 😀
Heading to @NeurIPSConf tomorrow, would be great to chat about open-endedness, RL, world models or England's chances at the World Cup 😀 DMs open! #NeurIPS2022
If you're thinking of applying for PhDs, interested in open-endedness/foundation models and don't mind rainy weather 🇬🇧, then consider applying to @UCL_DARK! My DMs are open and I'll be in New Orleans for NeurIPS so please get in touch if this sounds like you! 😀
We (@_rockt, @egrefen, @robertarail, and @jparkerholder) are looking for PhD students to join us in Fall 2024. If you are interested in Open-Endedness, RL & Foundation Models, then apply here: and also write us at ucl-dark-admissions@googlegroups.com
I’ll be ✈️ to #NeurIPS2023 on Monday and hoping to discuss:
- open-endedness and why it matters for AGI #iykyk
- world models
- why it’s never been a better time to do a PhD in ML (especially @UCL_DARK 😉)!
Find me at two posters + @aloeworkshop + hanging around the GDM booth 🤪
Not sure who needs to hear this, but effectively filtering large and noisy datasets is a gift that keeps on giving!! 🎁 Often more impactful than fancy new model architectures 😅 We found this same thing in RL with autocurricula (e.g. PLR, ACCEL), and I'd bet it works elsewhere
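The underlying pattern is simple enough to sketch: score every example, keep only the best slice. Everything below is hypothetical; in PLR/ACCEL the score would be a per-level regret estimate, while for web data it might come from a quality classifier:

```python
import numpy as np

def filter_dataset(examples, score_fn, keep_frac=0.3):
    """Keep only the top `keep_frac` of examples by score."""
    scores = np.array([score_fn(ex) for ex in examples])
    cutoff = np.quantile(scores, 1.0 - keep_frac)
    return [ex for ex, s in zip(examples, scores) if s >= cutoff]

data = [{"text": f"doc {i}", "noise": np.random.rand()} for i in range(1000)]
clean = filter_dataset(data, score_fn=lambda ex: 1.0 - ex["noise"])  # toy quality score
print(len(clean))  # roughly the 300 highest-quality examples survive
```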
Going for action-free training is a total game changer and it helps to do it with someone who has been thinking about this for years () who happens to also be one of the nicest people ever
This was such a fun and rewarding project to work on. Amazing job by the team! The most exciting thing for me is that we were able to achieve this without using a single doggone action label, which believe me, was not easy!
I'm super excited to share AlphaZeroᵈᵇ, a team of diverse #AlphaZero agents that collaborate to solve #Chess puzzles and demonstrate increased creativity. Check out our paper to learn more!
A quick 🧵(1/n)
For anyone interested in finding diverse solutions for exploration or generalization, this is worth checking out! Was awesome to work on this project and I'm excited to see where the next ridges take us!! 🚀
The gradient is a locally greedy direction. Where do you get to if you follow the eigenvectors of the Hessian instead? Our new paper, “Ridge Rider” (), explores how to do this and what happens in a variety of (toy) problems (if you dare to do so)... Thread 1/N
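A toy illustration of the idea, with simplifications of my own (a finite-difference Hessian and a 2D saddle): from a saddle point, each negative-curvature eigenvector is a distinct "ridge", and riding different ridges reaches different solutions, rather than the single one gradient descent would greedily pick.

```python
import numpy as np

def f(x):  # simple saddle at the origin, with two minima at x = (+-1, 0)
    return (x[0] ** 2 - 1) ** 2 + x[1] ** 2

def hessian(f, x, eps=1e-4):
    """Central-difference Hessian, fine for a toy 2D problem."""
    n = len(x)
    H = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            ei, ej = np.eye(n)[i] * eps, np.eye(n)[j] * eps
            H[i, j] = (f(x + ei + ej) - f(x + ei - ej)
                       - f(x - ei + ej) + f(x - ei - ej)) / (4 * eps ** 2)
    return H

x = np.zeros(2)  # the saddle point
eigvals, eigvecs = np.linalg.eigh(hessian(f, x))
for k in np.where(eigvals < 0)[0]:   # each negative-curvature direction is a ridge
    for sign in (+1, -1):            # ride it both ways -> two different minima
        step = x + 0.5 * sign * eigvecs[:, k]
        print(f"ridge {k}, direction {sign:+d}: f = {f(step):.3f}")
```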
With Bayesian Generational PBT we can update *both* architectures and >10 hyperparameters on the fly in a single run 😮 Even better, it’s fast with parallel simulators ⚡️… great time to work in this area!!
(1/7) Population Based Training (PBT) has been shown to be highly effective for tuning hyperparameters (HPs) for deep RL. Now with the advent of massively parallel simulators, there has never been a better time to use these methods! However, PBT has a couple of key problems…
In addition to a Research Engineer, we are also looking for a Research Scientist 🧑‍🔬 to join @DeepMind's Open-Endedness Team!
If you are excited about the intersection of open-ended, self-improving, generalist AI and foundation models, please apply 👇
Predicting the next word "only" is sufficient for language models to learn a large body of knowledge that enables them to code, answer questions, understand many topics, chat, and so on.
This is clear to many researchers now, and there are nice tutorials on why this works by…
By curating *randomly generated* environments we can produce a curriculum that makes it possible for a student agent to transfer zero-shot to challenging human designed ones, including Formula One tracks 🏎️... maybe one day F1 teams will use PLR? 😀 come check it out @NeurIPSConf
🏎️ Replay-Guided Adversarial Environment Design
Prioritized Level Replay (PLR) is secretly a form of unsupervised environment design. This leads to new theory improving PLR + impressive zero-shot transfer, like driving the Nürburgring Grand Prix.
paper:
We introduce ACCEL, a new algorithm that extends replay-based Unsupervised Environment Design (UED) (e.g. ) by including an *editor*. The editor makes small changes to previously useful levels, which compound over time to produce complex structures. [2/N]
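The loop is compact enough to sketch. This is an illustrative skeleton with hypothetical function names, not the released codebase: random generation proposes levels, the agent trains only on replayed high-regret levels, and the editor mutates those levels so complexity compounds.

```python
import random

def accel_step(buffer, agent, generate, edit, regret, train):
    if not buffer or random.random() < 0.5:
        level = generate()  # propose a fresh randomly generated level
    else:
        level = max(buffer, key=lambda l: regret(agent, l))  # replay high regret
        train(agent, level)   # the agent only ever trains on replayed levels
        level = edit(level)   # small mutation of a previously useful level
    if regret(agent, level) > 0:  # curate: keep only levels the agent can't yet master
        buffer.append(level)
    return buffer
```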
I think the most exciting thing about the current research paradigm is a shift in focus from *solutions* -> *stepping stones*.
Every time a new LLM or VLM comes out it immediately enables new capabilities in a variety of unexpected downstream areas. What a time to be alive 🌱
Was super fun chatting with @kanjun and @joshalbrecht, hopefully I said something useful in there somewhere! Also interesting to see how much has changed since we spoke in August (both in the field and for @genintelligent 🚀) what a time to be an AI researcher!! 😀
Had a really fun convo with @jparkerholder about co-evolving RL agents & environments, alternatives & blockers to population-based training, and why we aren't thinking properly about data efficiency in RL. We also discussed how Jack managed so many papers during his PhD 💪!
We are hiring for @DeepMind’s Open-Endedness team. If you have expertise in topics such as RL, evolutionary computation, PCG, quality diversity, novelty search, generative modelling, world models, intrinsic motivation etc., then please consider applying!
As we see with Genie, foundation world models trained from videos offer the potential for generating the environments we need for AGI 🎮. New paper by @mengjiao_yang laying out all the possibilities in the space, exciting times 🚀
Video as the New Language for Real-World Decision Making
Both text and video data are abundant on the internet and support large-scale self-supervised learning through next token or frame prediction. However, they have not been equally leveraged: language models have had…
Check out our #NeurIPS2022 paper showing we can train more general world models by collecting data with a diverse population of agents! Great work by @YingchenX and team!! Come chat to us in New Orleans 😀
PSA: you can use linear models in deep RL papers and still get accepted at #ICML2021!! Congrats to @philipjohnball and @cong_ml... now let’s try and beat ViT with ridge regression :)
We can now scale UED to competitive multi-agent RL!! This plot is my favorite, showing that the agent-level dependence clearly matters 🤹‍♂️ come check out the paper at #ICLR2023
A key insight for multi-agent settings is that, from the perspective of the teacher, maximising the student’s regret over co-players independently of the environment (and vice versa) doesn’t guarantee maximising regret in the joint space of co-players and environments.
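In symbols (my notation, not necessarily the paper's): writing Regret(π, θ) for the student's regret against co-player π in environment θ, optimising each coordinate against the average of the other can fall strictly short of the joint maximum:

```latex
\[
\max_{\pi,\,\theta} \operatorname{Regret}(\pi, \theta)
\;\ge\;
\operatorname{Regret}\!\Big(
  \arg\max_{\pi} \mathbb{E}_{\theta}\big[\operatorname{Regret}(\pi, \theta)\big],\;
  \arg\max_{\theta} \mathbb{E}_{\pi}\big[\operatorname{Regret}(\pi, \theta)\big]
\Big),
\]
% with the inequality strict in general, so the teacher must search the
% joint space of co-players and environments.
```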
Probably the shortest reviews I’ve ever seen for a top tier conference… maybe we can use them as a prompt for a language model to generate more thorough reviews?? 🤔 #ICML2022
We're excited to announce that the Genie Team from @GoogleDeepMind will be our next invited speakers!
Title: Genie: Generative Interactive Environments
Speakers: @ashrewards, @jparkerholder, @YugeTen
Sign up:
📌 90 High Holborn
📅 Tue 30 Apr, 17:00
Thank you @maxjaderberg!! XLand was super inspiring for us, it showed that our current RL algorithms are already capable of amazing things when given sufficiently rich and diverse environments. Can't wait to push this direction further with future versions of Genie 🚀🚀
Very cool to see the @GoogleDeepMind Genie results: learning an action-conditional generative model purely unsupervised from video data. This is close to my heart in getting towards truly open-ended environments to train truly general agents with RL 1/
Great news!! ALOE is back and in person. If you’re heading to @NeurIPSConf and interested in open-endedness, adaptive curricula or self-driven learning systems then hopefully see you there 🕺
🌱 The 2nd Agent Learning in Open-Endedness Workshop will be held at NeurIPS 2023 (Dec 10–16) in magnificent New Orleans. ⚜️
If your research considers learning in open-ended settings, consider submitting your work (by 11:59 PM Sept. 29th, AoE).
One amazing thing Genie enables: anyone, including children, can draw a world and then *step into it* and explore it!! How cool is that!?! We tried this with drawings my children made, to their delight. My child drew this, and now can fly the eagles around. Magic!🧞✨
Thanks to a fantastic effort from @MinqiJiang, all the code from our recent work on UED is now public!! Excited to see the new ideas that come from this! 🍿
We have open sourced our recent algorithms for Unsupervised Environment Design! These algorithms produce adaptive curricula that result in robust RL agents. This codebase includes our implementations of ACCEL, Robust PLR, and PAIRED.
How can we learn a foundational world model directly from Internet-scale videos without any action annotations?
@YugeTen, @ashrewards and @jparkerholder from @GoogleDeepMind's Open-Endedness Team are presenting "Genie: Generative Interactive Environments" at the @UCL_DARK Seminar
💯 and as many have pointed out, this is the worst video models are ever going to be. Super exciting to see the impact these models will have when used as world simulators with open-ended learning
So, rather than considering video models as a poor approximation to a real simulation engine, I think it's interesting to also consider them as something more: a new kind of world simulation that is in many ways far more complete than anything we have had before.
3/3
Super exciting to see improved techniques for generating synthetic data for agents! Awesome work from @JacksonMattT and team, plenty more to be done in this space 🚀🚀🚀
🎮 Introducing the new and improved Policy-Guided Diffusion!
Vastly more accurate trajectory generation than autoregressive models, with strong gains in offline RL performance!
Plus a ton of new theory and results since our NeurIPS workshop paper...
Check it out ⤵️
🧬 For ACCEL, we made an interactive paper to accompany the typical PDF we all know and love. "Figure 1" is a demo that lets you challenge our agents by designing your own environments! Now you can also view agents from many training runs simultaneously.
We are open for submissions!
I know there are lots of people working on large models, pretraining, cross-domain/agent generalization for RL. Please submit your papers to the 1st FMDM workshop at NeurIPS 2022!
We are pleased to announce the first *controllable video generation* workshop at @icmlconf 2024! 📽️📽️📽️
We welcome submissions that explore video generation via different modes of control (e.g. text, pose, action).
Deadline: 31st May AOE
Website:
PSA: we are super excited to announce the workshop on Agent Learning in Open-Endedness (ALOE) at #ICLR2022! If you're interested in open-ended learning systems then check out the amazing speaker line-up and the CfP 😀
Announcing the first Agent Learning in Open-Endedness (ALOE) Workshop at #ICLR2022!
We're calling for papers across many fields: If you work on open-ended learning, consider submitting. Paper deadline is February 25, 2022, AoE.
🥚Eggsclusive🥚… introducing the first workshop on Environment Generation for Generalizable Robots at #RSS2023!! This workshop brings together many topics close to my heart: PCG, large offline datasets, generative modelling and much more! More info from @vbhatt_cs ⬇️⬇️⬇️
We are excited to announce the first workshop on Environment Generation for Generalizable Robots (EGG) at #RSS2023 ()! Consider submitting if you are working in any area relevant to environment generation for robotics. Submissions due on May 17, 2023, AoE.
Uncovering vulnerabilities in multi-agent systems with the power of Open-Endedness!
Introducing MADRID: Multi-Agent Diagnostics for Robustness via Illuminated Diversity ⚽️
Paper:
Site:
Code: 🔜
Here's what it's all about: 🧵👇
Thanks to @Bam4d, we now have a MiniHack Level Editor inside a browser which allows you to easily design custom MiniHack environments using convenient drag-and-drop functionality. Check it out at
Spent time with the Google DeepMind team in London this week, including the people working on our next generation models. Great to see the exciting progress and talk to @demishassabis and the teams about the future of AI.
It has been a dream to work on Genie with such fantastic people, I’ve learned so much from all of them. We've also had a lot of fun, for example, using our model trained on platformers to convert random pictures of our pets into playable worlds 🤯🐶
Access to diverse partners is crucial when training robust cooperators or evaluating ad-hoc coordination. In our top 25% #iclr2023 paper, we tackle the challenge of generating diverse cooperative policies and expose the issue of "sabotages" affecting simpler methods.
A 🧵!
Learned adversaries are back 😎... after some amazing work from @ishitamed, a variant of PAIRED can now match our previous SOTA UED algorithms (ACCEL and Robust PLR). This should unlock some exciting new research directions for autocurricula and environment generation 🚀
Despite starting simple, levels in the replay buffer quickly become complex. Not only that, but ACCEL agents are capable of transfer to challenging human designed out-of-distribution environments, outperforming several strong baselines! [3/N]
We're excited to present @UCL_DARK's work at #NeurIPS2021 and look forward to seeing you at the virtual conference!
Check out all poster sessions and activities by our members below 👇
📉 GD can be biased towards finding 'easy' solutions 🐈 By following the eigenvectors of the Hessian with negative eigenvalues, Ridge Rider explores a diverse set of solutions 🎨
#mlcollage [40]
📜:
💻:
🎬:
Super stoked to be back at @DeepMind in London, this time as a Research Scientist in the Open-Endedness team! I look forward to working with all my brilliant colleagues here!
Soccer players have to master a range of dynamic skills, from turning and kicking to chasing a ball. How could robots do the same? ⚽
We trained our AI agents to demonstrate a range of agile behaviors using reinforcement learning.
Here’s how. 🧵
Why generate one adversarial prompt when you can instead generate them all… And then train a drastically more robust model 🌈🌈🌈
Amazing work from @_samvelyan, @_andreilupu, @sharathraparthy and team!!
Introducing 🌈 Rainbow Teaming, a new method for generating diverse adversarial prompts for LLMs via LLMs
It's a versatile tool 🛠️ for diagnosing model vulnerabilities across domains and creating data to enhance robustness & safety 🦺
Co-lead w/ @sharathraparthy & @_andreilupu
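A hedged sketch of the quality-diversity pattern behind it, with every function name hypothetical: `mutate_prompt` stands in for LLM-driven mutation and `attack_success` for a judge scoring whether the target model misbehaves.

```python
import random

def rainbow_style_search(seed_prompts, features, mutate_prompt, attack_success,
                         iters=1000):
    archive = {}  # one elite prompt per feature cell (e.g. risk category x style)
    for p in seed_prompts:
        archive[features(p)] = (attack_success(p), p)
    for _ in range(iters):
        _, parent = random.choice(list(archive.values()))
        child = mutate_prompt(parent)              # e.g. ask an LLM for a variation
        cell, score = features(child), attack_success(child)
        if cell not in archive or score > archive[cell][0]:
            archive[cell] = (score, child)         # keep the per-cell best attack
    return [p for _, p in archive.values()]        # diverse *and* effective prompts
```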
Given the empirical gains, we wanted to see how far we could push the ACCEL agent. It turns out it gets over 50% success rate on mazes over an order of magnitude larger than the training curriculum! The next best baseline was PLR (25% success), while other methods failed. [4/N]
New Article: "Automated Reinforcement Learning (AutoRL): A Survey and Open Problems" by Parker-Holder, Rajan, Song, Biedenkapp, Miao, Eimer, Zhang, Nguyen, Calandra, Faust, Hutter and Lindauer
AutoRL faces significant challenges not seen in typical AutoML problems, leading to a distinct set of methods. In addition, the diversity of RL problems means methods span a wide range of communities. We provide a common taxonomy, discuss each area and pose open problems. [3/4]
Access to useful data is critical for training (and scaling) RL agents... and now we can cheaply generate it 😎! We have been discussing this type of thing for a while and diffusion seems to be the missing ingredient 🧑‍🍳 Amazing work as always by @cong_ml & @philipjohnball!!
RL agents 🤖 need a lot of data, which they usually need to gather themselves. But does that data need to be real? Enter *Synthetic Experience Replay*, leveraging recent advances in #GenerativeAI in order to vastly upsample ⬆️ an agent’s training data!
[1/N]
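The recipe reduces to a tiny skeleton. Here a Gaussian model stands in for the diffusion model purely to keep the sketch self-contained and runnable:

```python
import numpy as np

class GaussianStandIn:  # placeholder for the diffusion model over transitions
    def fit(self, X):
        self.mu, self.sigma = X.mean(0), X.std(0) + 1e-6
        return self

    def sample(self, n):
        return self.mu + self.sigma * np.random.randn(n, len(self.mu))

real = np.random.randn(10_000, 8)           # flattened (s, a, r, s') transitions
generator = GaussianStandIn().fit(real)     # fit the generative model to real data
synthetic = generator.sample(90_000)        # vastly upsample the experience
buffer = np.concatenate([real, synthetic])  # the agent trains on the mixture
```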
Couldn’t agree more with this! It’s also unfairly biased towards people who have more recently taken an algorithms class. I don’t see how there’s any useful signal in these interviews and it just adds loads of stress for candidates
I rarely rant, but Leetcode is about the stupidest interview for AI scientist positions. It is totally out of distribution for daily tasks in AI, and doesn’t at all reflect research taste & skills.
I honestly don’t think most tenured AI profs can solve hard Leetcode Qs without…
Effective Diversity in Population-Based Reinforcement Learning
Interesting work that looks at ways to increase diversity in behaviors found using population-based methods for RL. Comparisons made to existing Evolution Strategies and Novelty Search methods
We also tested ACCEL in the BipedalWalker environment. ACCEL produces agents that are robust to a wide range of individual challenges, while the baselines often struggle to solve even the simple test tasks. [5/N]
AutoRL has been shown to be effective for training RL agents on new problems where optimal configurations are not known, while also providing opportunities for significant performance gains on existing problems with access to more resources. 🚀 [2/4]
Given the strength and simplicity of ACCEL, we think there is huge potential for future work. In particular, scaling to larger problems may require additional mechanisms to directly encourage diversity or adapt agent configurations. Plenty to do here! [7/N]
If you don’t know about UED yet then check out this thread👇 These methods look set to play an increasingly prominent role as we seek to train more general agents for the real world 🚀
If after 3 years of @MichaelD1729's work on Unsupervised Environment Design (leading to & ) you are still using domain randomization (DR) for training more robust agents, consider PLR as a drop-in replacement!
@DrJimFan Totally agreed, lucky I met @ashrewards in my first week at GDM, who has been doing this for years - but I think we were all surprised by how consistent the actions become at scale and it makes so much sense for world models
Offline RL from pixels starter pack:
* new datasets featuring visual observations ✅
* competitive baselines ✅
* a set of exciting open problems ✅
...time to get started!! 🚀
Offline RL offers tremendous potential for training agents from large pre-collected datasets. However, the majority of work focuses on the proprioceptive setting. In this work we release the first public benchmark for continuous control using *visual observations*, V-D4RL. [1/N]
Interested in learning behaviors from offline data? Check out V-D4RL for a set of standardized datasets and baselines… already used in some exciting recent papers and now published in @TmlrOrg 🔥🔥
Delighted that V-D4RL has been accepted at TMLR! Our benchmark and algorithms are the perfect way to start studying offline RL from pixels.
As performance in proprioceptive envs saturates, it’s increasingly necessary to look further! 🧐 Here are some notable uses so far…
[1/N]
PS: if these don't happen to be your research interests... I'd also happily spend hours talking about being a new parent or Chelsea FC's prospects for the upcoming season!
Excited to say that our #AISTATS2022 paper “Towards an Understanding of Default Policies in Multitask Policy Optimization” was given an Honorable Mention for Best Paper! If you’re interested in hearing more (or are very bored), stop by our poster tomorrow at 4:30 BST
1/
One of the main questions we get asked about Genie is where the rewards would come from. This work shows we can learn "well shaped rewards purely from internet-video" 😎... Looks like the pieces are coming together 🧩
Also, since this is simple classification, we can apply it to non-robotic datasets such as Ego4D - ranking frames temporally within a video and using *other* videos as negatives for the discriminator. This results in well-shaped rewards purely from internet video (9/10)
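A minimal sketch of that classification setup, assuming precomputed frame embeddings (the architecture and shapes are illustrative, not the paper's):

```python
import torch
import torch.nn as nn

# Train a discriminator to score temporal progress: later frames from a video are
# positives; earlier frames and frames from *other* videos are negatives. Its
# output then serves as a shaped reward for RL.
disc = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 1))
opt = torch.optim.Adam(disc.parameters(), lr=1e-3)
bce = nn.BCEWithLogitsLoss()

def train_step(later, earlier, other_videos):
    pos = disc(later)
    neg = disc(torch.cat([earlier, other_videos]))
    loss = bce(torch.cat([pos, neg]),
               torch.cat([torch.ones_like(pos), torch.zeros_like(neg)]))
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

reward = lambda frame_emb: disc(frame_emb).detach()  # shaped reward from video alone
```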
Proud and thankful for my wonderful (human) collaborators. We are all thrilled with our accepted @NeurIPSConf papers... except for Doris who is now fighting for authorship. Will get her a bone instead :)
cc @aldopacchiano @nguyentienvu @j_foerst and others!
Note that in all cases the complexity is emergent: There is no bonus for adding blocks or stumps, but this naturally occurs in the pursuit of high regret. Using the criteria from POET, we see that the ACCEL agent actually produces “Extremely Challenging” levels. [6/N]
Autocurricula can produce more general agents...but can be expensive to run 💸.
Today, we're releasing minimax, a JAX library for RL autocurricula with 120x faster baselines. Runs that took 1 week now take < 3 hours.
Paper:
This work was led by @philipjohnball and @cong_ml and will be presented as a Spotlight at the #ICLR2021 SSL-RL workshop.
Paper:
Website:
Please get in touch with any questions!! [8/8]
The surge in #OpenEndedness research on arXiv marks a burgeoning interest in the field!
The ascent is largely propelled by the trailblazing contributions of visionaries like @kenneth0stanley, @jeffclune, and @joelbot3000, whose work continues to pave new pathways.
This is one of the biggest issues with RL papers in my view... and it is compounded when there are also different versions of benchmarks or when baselines use different hyperparameters/architectures. Looks like great work! 👀
We also show that alternative evaluation protocols, such as taking the maximum across runs or during training, are incompatible with end-of-training performance results. On Atari 100k, we find that the two protocols produce substantially different results. (5/N)
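A toy simulation shows why the protocols disagree even when every run has identical underlying performance:

```python
import numpy as np

rng = np.random.default_rng(0)
curves = rng.normal(loc=1.0, scale=0.3, size=(5, 100))  # 5 seeds x 100 noisy evals

end_performance = curves[:, -1].mean()         # final score, averaged over seeds
max_over_training = curves.max(axis=1).mean()  # best checkpoint per seed, averaged
max_over_runs = curves[:, -1].max()            # best single seed

print(end_performance, max_over_training, max_over_runs)
# The max-based protocols are biased upward by noise alone; only the first
# estimates what a fresh training run would actually deliver.
```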
(1/2) In 2024 I will be joining Boston University as an Assistant Professor in Computing and Data Sciences (CDS). Seeking Ph.D. students passionate about sequential decision making, reinforcement learning, and/or algorithmic fairness.
I’m curious about effective altruism: how do so many smart people with the goal “do good for the world” wind up with the subgoal “analyze the neurons of GPT-2 small” or something similar?