Jack Parker-Holder

@jparkerholder

2,560 Followers
697 Following
32 Media
775 Statuses

Research Scientist @GoogleDeepMind & Honorary Lecturer @UCL_DARK interested in generating worlds from internet data. Views are my own :)

London, England
Joined October 2018
Pinned Tweet
@jparkerholder
Jack Parker-Holder
5 months
When we started this project the idea of training world models *exclusively* from Internet videos seemed wild, but it turns out latent actions are the key and the bitter lesson holds. Now we have a viable path to generating the rich diversity of environments we need for AGI. 🚀
@_rockt
Tim Rocktäschel
5 months
I am really excited to reveal what @GoogleDeepMind's Open-Endedness Team has been up to 🚀. We introduce Genie 🧞, a foundation world model trained exclusively from Internet videos that can generate an endless variety of action-controllable 2D worlds given image prompts.
144
571
3K
8
23
166
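To make the "latent actions" idea above concrete, here is a minimal, hypothetical sketch of how discrete actions can be inferred from unlabeled video alone: embed consecutive frame pairs, vector-quantize the embedding against a small codebook, and train everything with next-frame prediction. All module names, sizes, and the flat frame encoding are illustrative assumptions, not the actual Genie architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentActionModel(nn.Module):
    def __init__(self, frame_dim=64, n_actions=8, code_dim=16):
        super().__init__()
        self.encoder = nn.Linear(2 * frame_dim, code_dim)          # embeds (x_t, x_{t+1})
        self.codebook = nn.Parameter(torch.randn(n_actions, code_dim))
        self.decoder = nn.Linear(frame_dim + code_dim, frame_dim)  # predicts x_{t+1}

    def forward(self, x_t, x_next):
        z = self.encoder(torch.cat([x_t, x_next], dim=-1))
        idx = torch.cdist(z, self.codebook).argmin(dim=-1)  # nearest code = latent action
        z_q = self.codebook[idx]
        z_q = z + (z_q - z).detach()                        # straight-through estimator
        pred = self.decoder(torch.cat([x_t, z_q], dim=-1))
        return pred, idx

model = LatentActionModel()
x_t, x_next = torch.randn(32, 64), torch.randn(32, 64)     # toy frame features
pred, actions = model(x_t, x_next)
loss = F.mse_loss(pred, x_next)   # next-frame prediction is the only supervision
loss.backward()
```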
@jparkerholder
Jack Parker-Holder
2 years
I'm super excited to be joining @DeepMind today as a Research Scientist, working with @_rockt! Thank you to everyone who helped make this possible! Watch this space 🌱
36
7
447
@jparkerholder
Jack Parker-Holder
3 years
🤖 Introducing the first survey on AutoRL: methods for automatically discovering multiple components of the RL training pipeline, from tuning hyperparameters and architectures to learning algorithms or automatically designing environments. Link 👉 [1/4]
Tweet media one
2
113
323
@jparkerholder
Jack Parker-Holder
2 years
Evolving Curricula with Regret-Based Environment Design
Website: Paper:
TL;DR: We introduce a new open-ended RL algorithm that produces complex levels and a robust agent that can solve them (e.g. below). Highlights ⬇️! [1/N]
3
47
226
@jparkerholder
Jack Parker-Holder
3 years
I always love hearing from former ML PhD students about the days before tensorflow/pytorch... maybe in a few years we will tell current PhD students about the time before free MuJoCo 🙌
@GoogleDeepMind
Google DeepMind
3 years
We’ve acquired the MuJoCo physics simulator () and are making it free for all, to support research everywhere. MuJoCo is a fast, powerful, easy-to-use, and soon to be open-source simulation tool, designed for robotics research:
85
2K
6K
4
11
136
@jparkerholder
Jack Parker-Holder
1 year
Interested in scaling open-ended learning systems? Check out the new Research Engineer posting in our team 🚀. Feel free to DM with any questions!
0
17
104
@jparkerholder
Jack Parker-Holder
1 year
Feel very fortunate to have contributed to this as my first project @DeepMind! It is amazing to see what can be done when combining Transformer models with meta-RL and PLR in a vast, open-ended task space!
@FeryalMP
Feryal
1 year
I’m super excited to share our work on AdA: An Adaptive Agent capable of hypothesis-driven exploration which solves challenging unseen tasks with just a handful of experience, at a similar timescale to humans. See the thread for more details 👇 [1/N]
25
266
1K
3
9
100
@jparkerholder
Jack Parker-Holder
3 years
The case for offline RL is clear: we often have access to real world data in settings where it is expensive (and potentially even dangerous) to collect new experience. But what happens if this offline data doesn’t perfectly match the test environment? [1/8]
1
14
85
@jparkerholder
Jack Parker-Holder
3 years
First day @facebookai working with @_rockt and @egrefen... should be a great summer! 😀
5
3
73
@jparkerholder
Jack Parker-Holder
3 years
Population Based Training (PBT) has been shown to be successful in a variety of RL settings, but often requires vast computational resources 💰. To address this, last year we introduced Population Based Bandits (PB2) [1/N]
2
16
60
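For readers new to PBT, here is a toy sketch of the exploit/explore loop the tweet alludes to; PB2 replaces the random hyperparameter perturbation with a bandit-based (GP) suggestion, which is not shown here. The structure, quartile cutoffs, and 0.8/1.2 perturbation factors are illustrative assumptions.

```python
import random

def pbt_step(population):
    """population: list of dicts with 'params', 'hparams', 'score'."""
    ranked = sorted(population, key=lambda m: m["score"])
    cutoff = max(1, len(ranked) // 4)
    for weak in ranked[:cutoff]:                  # bottom quartile of the population
        strong = random.choice(ranked[-cutoff:])  # copy from the top quartile
        weak["params"] = dict(strong["params"])   # exploit: inherit weights
        weak["hparams"] = {                       # explore: perturb hyperparameters
            k: v * random.choice([0.8, 1.2])
            for k, v in strong["hparams"].items()
        }
    return population

population = [
    {"params": {}, "hparams": {"lr": 3e-4}, "score": random.random()}
    for _ in range(8)
]
population = pbt_step(population)  # call between training intervals
```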
@jparkerholder
Jack Parker-Holder
2 years
Super exciting time to work on population-based methods! We already have fast data collection, now this paper shows vectorizing agent updates can lead to huge speedups (on a GPU): Looking forward to discussing with the authors (@instadeepai) at #ICML2022 😀
3
7
60
@jparkerholder
Jack Parker-Holder
10 months
Working closely with many amazing members of @UCL_DARK (and @robertarail) over the past few years has been a privilege and I am *also* super excited to make this official!! 😎🚀
@UCL_DARK
UCL DARK
10 months
We are super excited to announce that Dr Roberta Raileanu (@robertarail) and Dr Jack Parker-Holder (@jparkerholder) have joined @UCL_DARK as Honorary Lecturers! Both have done impressive work in Reinforcement Learning and Open-Endedness, and our lab is lucky to get their support.
Tweet media one
4
12
86
4
5
57
@jparkerholder
Jack Parker-Holder
2 years
Heading to Baltimore for #ICML2022 ✈️ Will be presenting ACCEL on Thursday and would love to chat about unsupervised environment design and open-endedness with many of you there! DM if you're around and want to catch up 😀
@jparkerholder
Jack Parker-Holder
2 years
Evolving Curricula with Regret-Based Environment Design
Website: Paper:
TL;DR: We introduce a new open-ended RL algorithm that produces complex levels and a robust agent that can solve them (e.g. below). Highlights ⬇️! [1/N]
3
47
226
1
3
53
@jparkerholder
Jack Parker-Holder
2 years
Heading to @NeurIPSConf tomorrow, would be great to chat about open-endedness, RL, world models or England’s chances at the World Cup 😀 DMs open! #NeurIPS2022
4
4
51
@jparkerholder
Jack Parker-Holder
8 months
If you're thinking of applying for PhDs, interested in open-endedness/foundation models and don't mind rainy weather 🇬🇧, then consider applying to @UCL_DARK ! My DMs are open and I'll be in New Orleans for NeurIPS so please get in touch if this sounds like you! 😀
@UCL_DARK
UCL DARK
8 months
We (@_rockt, @egrefen, @robertarail, and @jparkerholder) are looking for PhD students to join us in Fall 2024. If you are interested in Open-Endedness, RL & Foundation Models, then apply here: and also write us at ucl-dark-admissions@googlegroups.com
3
20
65
3
8
43
@jparkerholder
Jack Parker-Holder
7 months
I’ll be ✈️ to #NeurIPS2023 on Monday and hoping to discuss: - open-endedness and why it matters for AGI #iykyk - world models - why there’s never been a better time to do a PhD in ML (especially @UCL_DARK 😉)! Find me at two posters + @aloeworkshop + hanging around the GDM booth 🤪
@_rockt
Tim Rocktäschel
7 months
Everyone from @GoogleDeepMind's Open-Endedness Team and almost the entire @UCL_DARK Lab are going to be at @NeurIPSConf 2023 next week. You will find most of us at the @ALOEworkshop on Friday. Come and say hi!
0
6
59
0
3
41
@jparkerholder
Jack Parker-Holder
6 months
Not sure who needs to hear this, but, effectively filtering large and noisy datasets is a gift that keeps on giving!! 🎁 Often more impactful than fancy new model architectures 😅 We found this same thing in RL with autocurricula (e.g. PLR, ACCEL), and I'd bet it works elsewhere
@evgenia_rusak
Evgenia Rusak
6 months
In our new paper (oral, ICCV 2023), we develop a concept-specific pruning criterion (Density-Based-Pruning) which reduces the training cost by 72%. Joint work with @amrokamal1997 @kushal_tirumala @wielandbr @kamalikac @arimorcos (1/5)
Tweet media one
1
34
80
1
6
40
@jparkerholder
Jack Parker-Holder
5 months
Going for action-free training is a total game changer and it helps to do it with someone who has been thinking about this for years () who happens to also be one of the nicest people ever
@ashrewards
Ashley Edwards
5 months
This was such a fun and rewarding project to work on. Amazing job by the team! The most exciting thing for me is that we were able to achieve this without using a single doggone action label, which believe me, was not easy!
10
17
115
1
3
37
@jparkerholder
Jack Parker-Holder
2 years
Already the second day of the year and no huge breakthroughs in AI… what’s going on?
3
2
35
@jparkerholder
Jack Parker-Holder
11 months
Super cool work showing QD algorithms at scale 🚀 Congrats to the team!! May be of interest @CULLYAntoine @tehqin17 @jeffclune @MinqiJiang @_samvelyan
@TZahavy
Tom Zahavy
11 months
I'm super excited to share AlphaZeroᵈᵇ, a team of diverse #AlphaZero agents that collaborate to solve #Chess puzzles and demonstrate increased creativity. Check out our paper to learn more! A quick 🧵(1/n)
Tweet media one
5
71
335
1
6
35
@jparkerholder
Jack Parker-Holder
4 years
For anyone interested in finding diverse solutions for exploration or generalization, this is worth checking out! Was awesome to work on this project and I'm excited to see where the next ridges take us!! 🚀
@j_foerst
Jakob Foerster
4 years
The gradient is a locally greedy direction. Where do you get if you follow the eigenvectors of the Hessian instead? Our new paper, “Ridge Rider” (), explores how to do this and what happens in a variety of (toy) problems (if you dare to do so)... Thread 1/N
Tweet media one
4
71
585
1
5
33
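For the curious, a toy sketch of the branching idea Ridge Rider describes: at a saddle point, eigenvectors of the Hessian with negative eigenvalues ("ridges") give distinct descent directions, and following different branches reaches different solutions. The 2D objective and step sizes are made up for illustration; this is not the paper's algorithm, which maintains the ridge during descent.

```python
import numpy as np

def loss(x):
    return (x[0] ** 2 - 1) ** 2 + 0.5 * x[1] ** 2   # two minima, at x0 = +1 and x0 = -1

def grad(x, eps=1e-5):
    # central finite differences, good enough for this toy polynomial
    return np.array([(loss(x + eps * e) - loss(x - eps * e)) / (2 * eps)
                     for e in np.eye(2)])

def hessian(x, eps=1e-4):
    return np.array([(grad(x + eps * e) - grad(x - eps * e)) / (2 * eps)
                     for e in np.eye(2)])

saddle = np.array([0.0, 0.0])                  # gradient vanishes here
eigvals, eigvecs = np.linalg.eigh(hessian(saddle))
for sign in (+1, -1):                          # branch both ways along the ridge
    x = saddle + sign * 0.1 * eigvecs[:, 0]    # eigvecs[:, 0]: most negative curvature
    for _ in range(200):                       # then plain gradient descent
        x = x - 0.05 * grad(x)
    print(f"branch {sign:+d} found minimum at", np.round(x, 3))
```

Each branch lands in a different minimum, which is exactly the "diverse solutions" behaviour the tweet highlights.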
@jparkerholder
Jack Parker-Holder
2 years
👋 @MinqiJiang and I will be presenting ACCEL today @icmlconf, come by! Talk: Room 327 at 14:35 ET Poster: Hall E #919 Hopefully see you there 😀 #ICML2022
@jparkerholder
Jack Parker-Holder
2 years
Evolving Curricula with Regret-Based Environment Design
Website: Paper:
TL;DR: We introduce a new open-ended RL algorithm that produces complex levels and a robust agent that can solve them (e.g. below). Highlights ⬇️! [1/N]
3
47
226
0
9
32
@jparkerholder
Jack Parker-Holder
2 years
With Bayesian Generational PBT we can update *both* architectures and >10 hyperparameters on the fly in a single run 😮 even better it’s fast with parallel simulators ⚡️… great time to work in this area!!
@wanxingchen_
Xingchen Wan
2 years
(1/7) Population Based Training (PBT) has been shown to be highly effective for tuning hyperparameters (HPs) for deep RL. Now with the advent of massively parallel simulators, there has never been a better time to use these methods! However, PBT has a couple of key problems…
3
5
40
0
3
30
@jparkerholder
Jack Parker-Holder
1 year
The Open-Endedness team is growing 🌱 come and join us!! Exciting times 😀
@_rockt
Tim Rocktäschel
1 year
In addition to a Research Engineer, we are also looking for a Research Scientist 🧑‍🔬 to join @DeepMind's Open-Endedness Team! If you are excited about the intersection of open-ended, self-improving, generalist AI and foundation models, please apply 👇
4
18
141
1
2
30
@jparkerholder
Jack Parker-Holder
3 months
Looking forward to discussing Genie tomorrow!! 🧞‍♀️🧞🧞‍♂️
@Saptarashmi
Saptarashmi Bandyopadhyay
3 months
📢 Happy to share our UMD MARL talk @ Apr 16, 12:00 pm ET 📢 by @GoogleDeepMind Research Scientist @jparkerholder on "Generative Interactive Environments (GENIE)" in-person: IRB-5137 virtually: @johnpdickerson @umdcs @umiacs @ml_umd #RL #AI #ML
0
0
10
0
4
30
@jparkerholder
Jack Parker-Holder
4 months
This is also what we see with Genie, predicting the future is sufficient to learn parallax and consistent latent actions
@NandoDF
Nando de Freitas 🏳️‍🌈
4 months
Predicting the next word "only" is sufficient for language models to learn a large body of knowledge that enables them to code, answer questions, understand many topics, chat, and so on. This is clear to many researchers now, and there are nice tutorials on why this works by
11
125
649
1
4
28
@jparkerholder
Jack Parker-Holder
3 years
By curating *randomly generated* environments we can produce a curriculum that makes it possible for a student agent to transfer zero-shot to challenging human designed ones, including Formula One tracks 🏎️... maybe one day F1 teams will use PLR? 😀 come check it out @NeurIPSConf
@MinqiJiang
Minqi Jiang
3 years
🏎️ Replay-Guided Adversarial Environment Design
Prioritized Level Replay (PLR) is secretly a form of unsupervised environment design. This leads to new theory improving PLR + impressive zero-shot transfer, like driving the Nürburgring Grand Prix. paper:
5
23
131
1
5
28
@jparkerholder
Jack Parker-Holder
2 years
We introduce ACCEL, a new algorithm that extends replay-based Unsupervised Environment Design (UED) (e.g. ) by including an *editor*. The editor makes small changes to previously useful levels, which compound over time to produce complex structures. [2/N]
@MinqiJiang
Minqi Jiang
3 years
🏎️ Replay-Guided Adversarial Environment Design
Prioritized Level Replay (PLR) is secretly a form of unsupervised environment design. This leads to new theory improving PLR + impressive zero-shot transfer, like driving the Nürburgring Grand Prix. paper:
5
23
131
2
8
28
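A heavily simplified sketch of the replay-and-edit loop this thread describes: levels are toy bitmaps, and a random stand-in replaces the paper's regret estimate (e.g. positive value loss). The real method trains a student agent inside the loop; here that step is only a comment. Everything below is an illustrative assumption, not the paper's implementation.

```python
import random

def random_level():
    return [random.randint(0, 1) for _ in range(10)]   # toy level: a wall bitmap

def edit(level):
    level = list(level)
    level[random.randrange(len(level))] ^= 1           # small mutation: flip one tile
    return level

def estimate_regret(level):
    return random.random() * sum(level)                # stand-in for positive value loss

buffer = []
for _ in range(1000):
    if buffer and random.random() < 0.8:               # replay-and-edit branch
        entry = max(buffer, key=lambda e: e["score"])  # pick a high-regret level
        level = edit(entry["level"])
    else:
        level = random_level()                         # otherwise a fresh random level
    score = estimate_regret(level)
    # (train the student agent on `level` here)
    buffer.append({"level": level, "score": score})
    buffer = sorted(buffer, key=lambda e: e["score"])[-100:]  # keep the top levels

print("densest retained level:", max(buffer, key=lambda e: sum(e["level"]))["level"])
```

Because edits are applied to levels that were already worth replaying, small changes compound over iterations, which is the source of the emergent complexity mentioned later in the thread.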
@jparkerholder
Jack Parker-Holder
10 months
I think the most exciting thing about the current research paradigm is a shift in focus from *solutions* -> *stepping stones*. Every time a new LLM or VLM comes out it immediately enables new capabilities in a variety of unexpected downstream areas. What a time to be alive 🌱
0
0
28
@jparkerholder
Jack Parker-Holder
2 years
Was super fun chatting with @kanjun and @joshalbrecht , hopefully I said something useful in there somewhere! Also interesting to see how much has changed since we spoke in August (both in the field and for @genintelligent 🚀) what a time to be an AI researcher!!😀
@kanjun
Kanjun 🐙🏡
2 years
Had a really fun convo with @jparkerholder about co-evolving RL agents & environments, alternatives & blockers to population-based training, and why we aren't thinking properly about data efficiency in RL. We also discussed how Jack managed so many papers during his PhD 💪!
0
7
32
0
5
28
@jparkerholder
Jack Parker-Holder
2 years
Join us!! 😀
@_rockt
Tim Rocktäschel
2 years
We are hiring for @DeepMind's Open-Endedness team. If you have expertise in topics such as RL, evolutionary computation, PCG, quality diversity, novelty search, generative modelling, world models, intrinsic motivation etc., then please consider applying!
12
62
272
1
3
27
@jparkerholder
Jack Parker-Holder
5 months
As we see with Genie - foundation world models trained from videos offer the potential for generating the environments we need for AGI 🎮. New paper by @mengjiao_yang laying out all the possibilities in the space, exciting times 🚀
@_akhaliq
AK
5 months
Video as the New Language for Real-World Decision Making
Both text and video data are abundant on the internet and support large-scale self-supervised learning through next token or frame prediction. However, they have not been equally leveraged: language models have had
Tweet media one
4
56
271
1
2
27
@jparkerholder
Jack Parker-Holder
2 years
Check out our #NeurIPS2022 paper showing we can train more general world models by collecting data with a diverse population of agents! Great work by @YingchenX and team!! Come chat to us in New Orleans 😀
@YingchenX
Yingchen Xu
2 years
Interested in learning general world models at scale? 🌍 Check out our new #NeurIPS2022 paper to find out! Paper: Website: [1/N]
3
42
161
0
10
27
@jparkerholder
Jack Parker-Holder
3 years
PSA: you can use linear models in deep RL papers and still get accepted at #ICML2021 !! Congrats to @philipjohnball and @cong_ml ... now let’s try and beat ViT with ridge regression :)
@jparkerholder
Jack Parker-Holder
3 years
The case for offline RL is clear: we often have access to real world data in settings where it is expensive (and potentially even dangerous) to collect new experience. But what happens if this offline data doesn’t perfectly match the test environment? [1/8]
1
14
85
0
2
26
@jparkerholder
Jack Parker-Holder
1 year
We can now scale UED to competitive multi-agent RL!! This plot is my favorite, showing that the agent-level dependence clearly matters 🤹‍♂️ come check out the paper at #ICLR2023
@_samvelyan
Mikayel Samvelyan
1 year
A key insight for multi-agent settings is that, from the perspective of the teacher, maximising the student’s regret over co-players independently of the environment (and vice versa) doesn’t guarantee maximising regret in the joint space of co-players and environments.
Tweet media one
2
0
12
0
3
24
@jparkerholder
Jack Parker-Holder
2 years
Probably the shortest reviews I’ve ever seen for a top tier conference… maybe we can use them as a prompt for a language model to generate more thorough reviews?? 🤔 #ICML2022
0
1
24
@jparkerholder
Jack Parker-Holder
3 months
Genie + @UCL_DARK = 🫶🚀
@UCL_DARK
UCL DARK
3 months
We're excited to announce that the Genie Team from @GoogleDeepMind will be our next invited speakers! Title: Genie: Generative Interactive Environments Speakers: @ashrewards, @jparkerholder, @YugeTen Sign up: 📌 90 High Holborn 📅 Tue 30 Apr, 17:00
2
11
42
1
2
24
@jparkerholder
Jack Parker-Holder
5 months
Thank you @maxjaderberg !! XLand was super inspiring for us, it showed that our current RL algorithms are already capable of amazing things when given sufficiently rich and diverse environments. Can't wait to push this direction further with future versions of Genie 🚀🚀
@maxjaderberg
Max Jaderberg
5 months
Very cool to see the @GoogleDeepMind Genie results: learning an action-conditional generative model purely unsupervised from video data. This is close to my heart in getting towards truly open-ended environments to train truly general agents with RL 1/
2
18
153
0
0
23
@jparkerholder
Jack Parker-Holder
1 year
Great news!! ALOE is back and in person. If you’re heading to @NeurIPSConf and interested in open-endedness, adaptive curricula or self-driven learning systems then hopefully see you there 🕺
@aloeworkshop
ALOE Workshop
1 year
🌱 The 2nd Agent Learning in Open-Endedness Workshop will be held at NeurIPS 2023 (Dec 10–16) in magnificent New Orleans. ⚜️ If your research considers learning in open-ended settings, consider submitting your work (by 11:59 PM Sept. 29th, AoE).
2
14
52
0
1
23
@jparkerholder
Jack Parker-Holder
5 months
It turns out foundation world models are the stepping stone required for converting children's sketches into interactive experiences 🌱
@jeffclune
Jeff Clune
5 months
One amazing thing Genie enables: anyone, including children, can draw a world and then *step into it* and explore it!! How cool is that!?! We tried this with drawings my children made, to their delight. My child drew this, and now can fly the eagles around. Magic!🧞✨
Tweet media one
5
31
186
0
2
22
@jparkerholder
Jack Parker-Holder
7 months
Loved this part of the documentary and so glad it has become a meme... also totally true 😅
@PhD_Genie
PhD_Genie
8 months
The academic way...
Tweet media one
7
352
3K
0
1
21
@jparkerholder
Jack Parker-Holder
2 years
Thanks to a fantastic effort from @MinqiJiang all the code from our recent work on UED is now public!! Excited to see the new ideas that come from this! 🍿
@MinqiJiang
Minqi Jiang
2 years
We have open sourced our recent algorithms for Unsupervised Environment Design! These algorithms produce adaptive curricula that result in robust RL agents. This codebase includes our implementations of ACCEL, Robust PLR, and PAIRED.
2
42
217
0
1
21
@jparkerholder
Jack Parker-Holder
2 years
Super excited about this, more info to follow 😀. #NeurIPS2022
@mengjiao_yang
Sherry Yang
2 years
Interested in foundation models + RL? Keep an eye out for the 1st "Foundation Models for Decision Making" workshop at NeurIPS 2022: . Call for submissions will soon follow. w. @du_yilun @jparkerholder @siddkaramcheti @IMordatch @shaneguML @ofirnachum
Tweet media one
5
101
573
1
2
21
@jparkerholder
Jack Parker-Holder
3 months
🧞‍♀️🫶
@_rockt
Tim Rocktäschel
3 months
How can we learn a foundational world model directly from Internet-scale videos without any action annotations? @YugeTen, @ashrewards and @jparkerholder from @GoogleDeepMind's Open-Endedness Team are presenting "Genie: Generative Interactive Environments" at the @UCL_DARK Seminar
Tweet media one
2
17
117
1
0
20
@jparkerholder
Jack Parker-Holder
5 months
💯 and as many have pointed out, this is the worst video models are ever going to be. Super exciting to see the impact these models will have when used as world simulators with open-ended learning
@phillip_isola
Phillip Isola
5 months
So, rather than considering video models as a poor approximation to a real simulation engine, I think it's interesting to also consider them as something more: a new kind of world simulation that is in many ways far more complete than anything we have had before. 3/3
2
0
31
0
0
20
@jparkerholder
Jack Parker-Holder
4 years
@hardmaru Lol at everyone trying to explain monetary policy to a former rates trader
1
0
20
@jparkerholder
Jack Parker-Holder
3 months
Super exciting to see improved techniques for generating synthetic data for agents! Awesome work from @JacksonMattT and team, plenty more to be done in this space 🚀🚀🚀
@JacksonMattT
Matthew Jackson
3 months
🎮 Introducing the new and improved Policy-Guided Diffusion! Vastly more accurate trajectory generation than autoregressive models, with strong gains in offline RL performance! Plus a ton of new theory and results since our NeurIPS workshop paper... Check it out ⤵️
6
100
543
0
3
20
@jparkerholder
Jack Parker-Holder
2 years
Come to break the agents… stay to read about our new approach for unsupervised environment design 😀
@MinqiJiang
Minqi Jiang
2 years
🧬 For ACCEL, we made an interactive paper to accompany the typical PDF we all know and love. "Figure 1" is a demo that lets you challenge our agents by designing your own environments! Now you can also view agents from many training runs simultaneously.
2
52
245
0
4
20
@jparkerholder
Jack Parker-Holder
3 years
This was very much a collective effort from a great group of people! @RaghuSpaceRajan @XingyouSong @AndreBiedenkapp @yingjieMiao @The_Eimer @BaoheZhang1 @nguyentienvu @RCalandra @AleksandraFaust @FrankRHutter @LindauerMarius Look forward to seeing future progress here 📈! [4/4]
0
1
20
@jparkerholder
Jack Parker-Holder
2 years
Looking forward to seeing all the creative ideas submitted to this workshop! Submit by September 22nd 😀
@ofirnachum
Ofir Nachum
2 years
We are open for submissions! I know there are lots of people working on large models, pretraining, cross-domain/agent generalization for RL. Please submit your papers to the 1st FMDM workshop at NeurIPS 2022!
Tweet media one
1
20
126
0
1
20
@jparkerholder
Jack Parker-Holder
3 months
Super excited about this, we are only just beginning to see the potential for controllable video models!! #ICML2024
@cvgworkshop
Controllable Video Generation Workshop @ ICML2024
3 months
We are pleased to announce the first *controllable video generation* workshop at @icmlconf 2024! 📽️📽️📽️ We welcome submissions that explore video generation via different modes of control (e.g. text, pose, action). Deadline: 31st May AOE Website:
Tweet media one
1
9
47
0
2
19
@jparkerholder
Jack Parker-Holder
3 years
PSA: we are super excited to announce the workshop on Agent Learning in Open-Endedness (ALOE) at #ICLR2022 ! If you're interested in open-ended learning systems then check out the amazing speaker line-up and the CfP 😀
@aloeworkshop
ALOE Workshop
3 years
Announcing the first Agent Learning in Open-Endedness (ALOE) Workshop at #ICLR2022! We're calling for papers across many fields: If you work on open-ended learning, consider submitting. Paper deadline is February 25, 2022, AoE.
Tweet media one
1
18
72
0
1
19
@jparkerholder
Jack Parker-Holder
1 year
🥚Eggsclusive🥚… introducing the first workshop on Environment Generation for Generalizable robots at #RSS2023 !! This workshop brings together many topics close to my heart: PCG, large offline datasets, generative modelling and much more! More info from @vbhatt_cs ⬇️⬇️⬇️
@vbhatt_cs
Varun Bhatt
1 year
We are excited to announce the first workshop on Environment Generation for Generalizable Robots (EGG) at #RSS2023 ()! Consider submitting if you are working in any area relevant to environment generation for robotics. Submissions due on May 17, 2023, AoE.
1
7
17
0
3
18
@jparkerholder
Jack Parker-Holder
6 months
This is really great work from @_samvelyan & @PaglieriDavide … and… it’s applied to football 🫶😀 bucket list item ✅
@_samvelyan
Mikayel Samvelyan
6 months
Uncovering vulnerabilities in multi-agent systems with the power of Open-Endedness! Introducing MADRID: Multi-Agent Diagnostics for Robustness via Illuminated Diversity ⚽️ Paper: Site: Code: 🔜 Here's what it's all about: 🧵👇
Tweet media one
1
41
150
0
4
18
@jparkerholder
Jack Parker-Holder
2 years
Hate to steal your thunder @pcastr …but I got 8!! I genuinely enjoy reviewing but this makes it impossible to do a good job @iclr_conf #ICLR2023
@pcastr
Pablo Samuel Castro
2 years
seven #ICLR2023 papers to review in 12 days (8 business days) is too much, imho...
8
0
61
2
1
18
@jparkerholder
Jack Parker-Holder
2 months
Our recent talk on Genie is now on YouTube 📽️ check it out!!
@UCL_DARK
UCL DARK
2 months
We were honored to have @ashrewards, @jparkerholder and @YugeTen from @GoogleDeepMind's Open-Endedness Team presenting their foundation world model Genie at @ai_ucl. Video available on our YouTube channel:
0
18
74
0
6
17
@jparkerholder
Jack Parker-Holder
2 years
This looks like a great tool for RL research!!
@_samvelyan
Mikayel Samvelyan
2 years
Thanks to @Bam4d, we now have a MiniHack Level Editor inside a browser which allows you to easily design custom MiniHack environments using convenient drag-and-drop functionality. Check it out at
Tweet media one
3
20
78
0
1
17
@jparkerholder
Jack Parker-Holder
1 year
Didn't make it to Hawaii, but, *just* made it into the fireside chat photo... I guess this is my 15 seconds of fame 😎
@sundarpichai
Sundar Pichai
1 year
Spent time with the Google DeepMind team in London this week, including the people working on our next generation models. Great to see the exciting progress and talk to @demishassabis and the teams about the future of AI.
Tweet media one
Tweet media two
Tweet media three
197
219
3K
0
0
16
@jparkerholder
Jack Parker-Holder
5 months
It has been a dream to work on Genie with such fantastic people, I’ve learned so much from all of them. We've also had a lot of fun, for example, using our model trained on platformers to convert random pictures of our pets into playable worlds 🤯🐶
0
2
16
@jparkerholder
Jack Parker-Holder
1 year
Exciting new approach for generating diverse co-players in cooperative games! This is a super hard problem and the solution required some flair 😀
@_andreilupu
Andrei Lupu
1 year
Access to diverse partners is crucial when training robust cooperators or evaluating ad-hoc coordination. In our top 25% #iclr2023 paper, we tackle the challenge of generating diverse cooperative policies and expose the issue of "sabotages" affecting simpler methods. A 🧵!
Tweet media one
2
20
70
0
0
16
@jparkerholder
Jack Parker-Holder
11 months
Learned adversaries are back 😎... after some amazing work from @ishitamed a variant of PAIRED can now match our previous sota UED algorithms (ACCEL and Robust PLR). This should unlock some exciting new research directions for autocurricula and environment generation 🚀
@ishitamed
Ishita Mediratta
11 months
📢 Exciting News! We're thrilled to announce our latest paper: "Stabilizing Unsupervised Environment Design with a Learned Adversary” 📚🤖 accepted at #CoLLAs 2023 @CoLLAs_Conf as an Oral presentation! 📄Paper: 💻Code: 1/🧵 👇
2
10
60
0
1
16
@jparkerholder
Jack Parker-Holder
2 years
Despite starting simple, levels in the replay buffer quickly become complex. Not only that, but ACCEL agents are capable of transfer to challenging human designed out-of-distribution environments, outperforming several strong baselines! [3/N]
Tweet media one
1
2
16
@jparkerholder
Jack Parker-Holder
3 years
It has been a pleasure to collaborate with @ucl_dark on so many exciting projects… come say hi at the conference!! #NeurIPS2021
@UCL_DARK
UCL DARK
3 years
We're excited to present @UCL_DARK's work at #NeurIPS2021 and look forward to seeing you at the virtual conference! Check out all poster sessions and activities by our members below 👇
Tweet media one
Tweet media two
1
11
35
0
1
16
@jparkerholder
Jack Parker-Holder
3 years
Great summary of Ridge Rider!!
@RobertTLange
Robert Lange
3 years
📉 GD can be biased towards finding 'easy' solutions 🐈 By following the eigenvectors of the Hessian with negative eigenvalues, Ridge Rider explores a diverse set of solutions 🎨 #mlcollage [40] 📜: 💻: 🎬:
Tweet media one
1
32
148
0
0
16
@jparkerholder
Jack Parker-Holder
2 years
Welcome @aditimavalankar ! Exciting times for the Open-Endedness team 🙌
@aditimavalankar
Aditi Mavalankar
2 years
Super stoked to be back at @DeepMind in London this time as a Research Scientist in the Open-Endedness team! I look forward to working with all my brilliant colleagues here!
16
5
493
1
0
16
@jparkerholder
Jack Parker-Holder
3 months
If only real footballers got up that quickly… 😅
@GoogleDeepMind
Google DeepMind
3 months
Soccer players have to master a range of dynamic skills, from turning and kicking to chasing a ball. How could robots do the same? ⚽ We trained our AI agents to demonstrate a range of agile behaviors using reinforcement learning. Here’s how. 🧵
132
529
3K
1
1
16
@jparkerholder
Jack Parker-Holder
5 months
Why generate one adversarial prompt when you can instead generate them all…. And then train a drastically more robust model 🌈🌈🌈 Amazing work from @_samvelyan @_andreilupu @sharathraparthy and team!!
@_samvelyan
Mikayel Samvelyan
5 months
Introducing 🌈 Rainbow Teaming, a new method for generating diverse adversarial prompts for LLMs via LLMs It's a versatile tool 🛠️ for diagnosing model vulnerabilities across domains and creating data to enhance robustness & safety 🦺 Co-lead w/ @sharathraparthy & @_andreilupu
6
44
175
1
6
15
@jparkerholder
Jack Parker-Holder
2 years
Given the empirical gains, we wanted to see how far we could push the ACCEL agent. It turns out it gets over 50% success rate on mazes over an order of magnitude larger than the training curriculum! The next best baseline was PLR (25% success), while other methods failed. [4/N]
1
1
15
@jparkerholder
Jack Parker-Holder
2 years
Check out the updated AutoRL survey! 🦾🤖
@JAIR_Editor
J. AI Research-JAIR
2 years
New Article: "Automated Reinforcement Learning (AutoRL): A Survey and Open Problems" by Parker-Holder, Rajan, Song, Biedenkapp, Miao, Eimer, Zhang, Nguyen, Calandra, Faust, Hutter and Lindauer
0
5
16
0
2
15
@jparkerholder
Jack Parker-Holder
3 years
AutoRL faces significant challenges not seen in typical AutoML problems, leading to a distinct set of methods. In addition, the diversity of RL problems means methods span a wide range of communities. We provide a common taxonomy, discuss each area and pose open problems. [3/4]
Tweet media one
1
1
15
@jparkerholder
Jack Parker-Holder
1 year
Access to useful data is critical for training (and scaling) RL agents... and now we can cheaply generate it 😎! We have been discussing this type of thing for a while and diffusion seems to be the missing ingredient 🧑‍🍳 Amazing work as always by @cong_ml & @philipjohnball !!
@cong_ml
Cong Lu
1 year
RL agents🤖need a lot of data, which they usually need to gather themselves. But does that data need to be real? Enter *Synthetic Experience Replay*, leveraging recent advances in #GenerativeAI in order to vastly upsample⬆️ an agent’s training data! [1/N]
5
37
184
0
4
14
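To illustrate the recipe at a glance: fit a generative model to the offline transition dataset, then upsample the replay data with samples from it. In the sketch below a Gaussian fit stands in for the diffusion model purely to keep things self-contained; the dimensions and the 10x ratio are arbitrary assumptions, not the paper's setup.

```python
import numpy as np

real = np.random.randn(1000, 6)          # toy transitions: (s, a, r, s') flattened
mu, cov = real.mean(0), np.cov(real.T)   # stand-in "generative model" fit
synthetic = np.random.multivariate_normal(mu, cov, size=9000)  # upsample 10x
replay = np.concatenate([real, synthetic])   # train the agent on the mixture
np.random.shuffle(replay)
```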
@jparkerholder
Jack Parker-Holder
10 months
Couldn’t agree more with this! It’s also unfairly biased towards people who have more recently taken an algorithms class. I don’t see how there’s any useful signal in these interviews and it just adds loads of stress for candidates
@DrJimFan
Jim Fan
10 months
I rarely rant, but Leetcode is about the stupidest interview for AI scientist positions. It is totally out of distribution for daily tasks in AI, and doesn’t at all reflect research taste & skills. I honestly don’t think most tenured AI profs can solve hard Leetcode Qs without
84
81
1K
4
0
14
@jparkerholder
Jack Parker-Holder
4 years
The camera-ready version of this paper is now on arXiv: and here's the code for DvD-ES: ... time to start measuring diversity with determinants!!
@hardmaru
hardmaru
4 years
Effective Diversity in Population-Based Reinforcement Learning Interesting work that looks at ways to increase diversity in behaviors found using population-based methods for RL. Comparisons made to existing Evolution Strategies and Novelty Search methods
3
27
98
0
6
14
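A compact sketch of the determinant-based diversity measure DvD popularizes: embed each policy by its behavior, form an RBF kernel matrix over the population, and score diversity by its log-determinant (near-identical policies give a near-singular matrix and a very negative score). The embeddings and lengthscale here are illustrative assumptions.

```python
import numpy as np

def diversity(behaviors, lengthscale=1.0):
    """behaviors: (n_policies, d) behavioral embeddings, e.g. actions on probe states."""
    sq = ((behaviors[:, None, :] - behaviors[None, :, :]) ** 2).sum(-1)
    K = np.exp(-sq / (2 * lengthscale ** 2))     # RBF kernel matrix, K_ii = 1
    sign, logdet = np.linalg.slogdet(K)
    return logdet                                # larger = more diverse population

rng = np.random.default_rng(0)
clustered = rng.normal(0.0, 0.01, size=(5, 8))   # near-identical policies
spread = rng.normal(0.0, 1.0, size=(5, 8))       # behaviorally distinct policies
print(diversity(clustered), "<", diversity(spread))
```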
@jparkerholder
Jack Parker-Holder
2 years
Super excited about this!!
@mengjiao_yang
Sherry Yang
2 years
See you all at the 1st Foundation Models for Decision Making workshop @NeurIPSConf (Room 391) on Sat, Dec 3 2022. See schedule and zoom link at .
Tweet media one
1
20
89
0
1
14
@jparkerholder
Jack Parker-Holder
2 years
We also tested ACCEL in the BipedalWalker environment. ACCEL produces agents that are robust to a wide range of individual challenges, while the baselines often struggle to solve even the simple test tasks. [5/N]
Tweet media one
2
1
13
@jparkerholder
Jack Parker-Holder
3 years
AutoRL has been shown to be effective for training RL agents on new problems where optimal configurations are not known, while also providing opportunities for significant performance gains on existing problems with access to more resources. 🚀 [2/4]
1
1
13
@jparkerholder
Jack Parker-Holder
2 years
Given the strength and simplicity of ACCEL, we think there is huge potential for future work. In particular, scaling to larger problems may require additional mechanisms to directly encourage diversity or adapt agent configurations. Plenty to do here! [7/N]
1
1
13
@jparkerholder
Jack Parker-Holder
1 year
If you don’t know about UED yet then check out this thread👇 These methods look set to play an increasingly prominent role as we seek to train more general agents for the real world 🚀
@_rockt
Tim Rocktäschel
1 year
If after 3 years of @MichaelD1729 's work on Unsupervised Environment Design (leading to & ) you are still using domain randomization (DR) for training more robust agents, consider PLR as a drop-in replacement!
Tweet media one
1
17
79
0
1
13
@jparkerholder
Jack Parker-Holder
5 months
@DrJimFan Totally agreed, luckily I met @ashrewards in my first week at GDM, who has been doing this for years - but I think we were all surprised by how consistent the actions become at scale and it makes so much sense for world models
@jparkerholder
Jack Parker-Holder
5 months
Going for action-free training is a total game changer and it helps to do it with someone who has been thinking about this for years () who happens to also be one of the nicest people ever
1
3
37
1
1
11
@jparkerholder
Jack Parker-Holder
2 years
This project was co-led with @MinqiJiang alongside @MichaelD1729 @_samvelyan @j_foerst @egrefen & @_rockt. The code will be released soon, please get in touch if interested! [N/N, N=8]
6
2
13
@jparkerholder
Jack Parker-Holder
2 years
Offline RL from pixels starter pack: * new datasets featuring visual observations ✅ * competitive baselines ✅ * a set of exciting open problems ✅ ...time to get started!! 🚀
@cong_ml
Cong Lu
2 years
Offline RL offers tremendous potential for training agents from large pre-collected datasets. However, the majority of work focuses on the proprioceptive setting. In this work we release the first public benchmark for continuous control using *visual observations*, V-D4RL. [1/N]
Tweet media one
1
6
40
0
2
13
@jparkerholder
Jack Parker-Holder
1 year
Interested in learning behaviors from offline data? Check out V-D4RL for a set of standardized datasets and baselines… already used in some exciting recent papers and now published in @TmlrOrg 🔥🔥
@cong_ml
Cong Lu
1 year
Delighted that V-D4RL has been accepted at TMLR! Our benchmark and algorithms are the perfect way to start studying offline RL from pixels. As performance in proprioceptive envs saturate, it’s increasingly necessary to look further! 🧐 Here are some notable uses so far… [1/N]
Tweet media one
1
4
30
0
1
12
@jparkerholder
Jack Parker-Holder
1 year
What a time to be alive!!
@twominutepapers
Two Minute Papers
1 year
New Video - DeepMind’s New AI: Insanely Good At Games! #deepmind
3
7
36
0
1
12
@jparkerholder
Jack Parker-Holder
2 years
Ps. if these don't happen to be your research interests... I'd also happily spend hours talking about being a new parent or Chelsea FC's prospects for the upcoming season!
0
0
11
@jparkerholder
Jack Parker-Holder
2 years
Working with great people never gets old! Congrats to @ted_moskovitz @MichaelArbel @aldopacchiano on the best paper nomination!!🙌
@ted_moskovitz
Ted Moskovitz
2 years
Excited to say that our #AISTATS2022 paper “Towards an Understanding of Default Policies in Multitask Policy Optimization” was given an Honorable Mention for Best Paper! If you’re interested in hearing more (or are very bored), stop by our poster tomorrow at 4:30 BST 1/
2
8
34
0
0
11
@jparkerholder
Jack Parker-Holder
2 months
One of the main questions we get asked about Genie is where the rewards would come from. This work shows we can learn "well-shaped rewards purely from internet video" 😎... Looks like the pieces are coming together 🧩
@abhishekunique7
Abhishek Gupta
2 months
Also, since this is simple classification, we can also apply this to non-robotic datasets such as Ego4D - ranking frames temporally within a video and using *other* videos as negatives for the discriminator. This results in well-shaped rewards purely from internet video (9/10)
2
0
2
1
0
11
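A rough sketch of the reward-from-video idea in the quoted thread: train a scorer so that later frames of the same video rank above earlier frames and above frames from other videos, then read off its output as a shaped reward for new observations. The binary early/late formulation, toy features, and model below are simplifying assumptions, not the paper's setup.

```python
import torch
import torch.nn as nn

scorer = nn.Sequential(nn.Linear(32, 64), nn.ReLU(), nn.Linear(64, 1))
opt = torch.optim.Adam(scorer.parameters(), lr=1e-3)
loss_fn = nn.BCEWithLogitsLoss()

def train_step(early, late, other):
    # positives: later frames of the same video; negatives: earlier frames
    # and frames drawn from unrelated videos.
    pos, neg = scorer(late), scorer(torch.cat([early, other]))
    labels = torch.cat([torch.ones_like(pos), torch.zeros_like(neg)])
    loss = loss_fn(torch.cat([pos, neg]), labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

early, late, other = (torch.randn(16, 32) for _ in range(3))
train_step(early, late, other)
reward = scorer(torch.randn(1, 32)).item()   # shaped reward for a new frame
```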
@jparkerholder
Jack Parker-Holder
4 years
Proud and thankful for my wonderful (human) collaborators. We are all thrilled with our accepted @NeurIPSConf papers... except for Doris who is now fighting for authorship. Will get her a bone instead :) cc @aldopacchiano @nguyentienvu @j_foerst and others!
@jparkerholder
Jack Parker-Holder
4 years
Running her last minute experiments for @NeurIPSConf #neurips2020
Tweet media one
0
0
6
1
0
11
@jparkerholder
Jack Parker-Holder
2 years
Note that in all cases the complexity is emergent: There is no bonus for adding blocks or stumps, but this naturally occurs in the pursuit of high regret. Using the criteria from POET, we see that the ACCEL agent actually produces “Extremely Challenging” levels. [6/N]
Tweet media one
1
1
11
@jparkerholder
Jack Parker-Holder
8 months
This looks absolutely unbelievable, what a time to be working on UED!! Amazing work from @MinqiJiang and team 🚄🚀
@MinqiJiang
Minqi Jiang
8 months
Autocurricula can produce more general agents...but can be expensive to run 💸. Today, we're releasing minimax, a JAX library for RL autocurricula with 120x faster baselines. Runs that took 1 week now take < 3 hours. Paper:
6
86
463
0
0
11
@jparkerholder
Jack Parker-Holder
2 months
@_rockt If you want to go for high-waisted trousers in the office I’m here for it @_rockt 🕺
0
0
10
@jparkerholder
Jack Parker-Holder
4 months
Great video from @twominutepapers as usual, totally agree this is a "DALL-E 1 moment" for text-to-environment generation 🚀
@jeffclune
Jeff Clune
4 months
I love the enthusiasm in this video about Genie: Thanks @twominutepapers !
2
6
29
0
2
10
@jparkerholder
Jack Parker-Holder
6 months
Excited to see the awesome things that come out in 2024 🚀🚀🚀
@_samvelyan
Mikayel Samvelyan
6 months
The surge in #OpenEndedness research on arXiv marks a burgeoning interest in the field! The ascent is largely propelled by the trailblazing contributions of visionaries like @kenneth0stanley , @jeffclune , and @joelbot3000 , whose work continues to pave new pathways.
Tweet media one
3
19
121
0
0
10
@jparkerholder
Jack Parker-Holder
3 months
💯🫶
@_rockt
Tim Rocktäschel
3 months
Working on Genie with @GoogleDeepMind 's Open-Endedness Team and our colleagues has been a highlight of my career. Thank you Jake Bruce, @MichaelD1729 , @ashrewards , @jparkerholder , @YugeTen , @edwardfhughes , Matthew Lai, @aditimavalankar , @richie_internet , and many others.
1
4
82
0
0
9
@jparkerholder
Jack Parker-Holder
3 years
This is one of the biggest issues with RL papers in my view.. and it is compounded when there are also different versions of benchmarks or when baselines use different hyperparameters/architectures. Looks like great work! 👀
@agarwl_
Rishabh Agarwal
3 years
We also comment on the incompatibility of alternative evaluation protocols involving maximum across runs or during training to end-performance results. On Atari 100k, we find that the two protocols produce substantially different results. (5/N)
Tweet media one
1
1
15
0
0
9
@jparkerholder
Jack Parker-Holder
1 year
@phillip_isola This type of idea also works for RL, see ROSIE () from @hausman_k et al, GenAug () from @abhishekunique7 et al and SynthER from @cong_ml and @philipjohnball
@cong_ml
Cong Lu
1 year
RL agents🤖need a lot of data, which they usually need to gather themselves. But does that data need to be real? Enter *Synthetic Experience Replay*, leveraging recent advances in #GenerativeAI in order to vastly upsample⬆️ an agent’s training data! [1/N]
5
37
184
0
0
9
@jparkerholder
Jack Parker-Holder
9 months
@akbirkhan 💯! Hot takes like this make me feel confident there has never been a better time to do an ML PhD 😀
0
0
9
@jparkerholder
Jack Parker-Holder
8 months
Go and do a PhD with Aldo! He’s easily one of the best people I’ve ever worked with, and pretty smart too 😀
@aldopacchiano
Aldo Pacchiano
8 months
(1/2) In 2024 I will be joining Boston University as an Assistant Professor in Computing and Data Sciences (CDS). Seeking Ph.D. students passionate about sequential decision making, reinforcement learning, and/or algorithmic fairness.
10
33
272
0
0
9
@jparkerholder
Jack Parker-Holder
7 months
Feels a bit like writing papers about Atari/DM control suite in 2023 and claiming the resulting gains could lead to AGI
@jxmnop
jack morris
7 months
i’m curious about effective altruism: how do so many smart people with the goal “do good for the world” wind up with the subgoal “analyze the neurons of GPT-2 small” or something similar?
44
11
317
1
0
9