Roberta Raileanu Profile
Roberta Raileanu

@robertarail

4,071
Followers
1,306
Following
42
Media
1,159
Statuses

Research Scientist @Meta & Honorary Lecturer @UCL ex @DeepMind | @MSFTResearch | @NYU | @Princeton Llama-3 Tool Use Lead, Toolformer, Rainbow Teaming

London, UK
Joined April 2013
@robertarail
Roberta Raileanu
2 years
Jakob Foerster ( @j_foerst ) and I are hiring a PhD student for our FAIR-Oxford program to work at the intersection of language and RL. The student will spend 50% of their time @UniofOxford and 50% @MetaAI (FAIR), while completing a DPhil (Oxford PhD). Deadline: 1st of March
13
78
366
@robertarail
Roberta Raileanu
2 years
Our group has multiple openings for internships at FAIR London ( @MetaAI ). I’m looking for someone to work on language models + decision making e.g. augmenting LMs with actions / tools / goals, interactive / open-ended learning for LMs, or RLHF. Apply at
10
67
338
@robertarail
Roberta Raileanu
3 years
Super excited to start as a Research Scientist at FAIR London today! I’m looking for an intern for 2022, so if you’re interested in generalization and representation in RL, unsupervised or continual RL, consider applying via .
22
15
337
@robertarail
Roberta Raileanu
10 months
My team is hiring interns to work on LLMs (part of Meta’s GenAI research lab): If you’re interested in working together on augmenting LLMs with tools and decision making abilities, or open-ended and continual learning for LLMs, consider applying and don’t
6
43
278
@robertarail
Roberta Raileanu
4 years
Excited to share our new paper “Automatic Data Augmentation for Generalization in Deep Reinforcement Learning” w/ @maxagoldstein8 , @denisyarats , @ikostrikov , and @rob_fergus ! Paper: Code: Website:
2
51
224
@robertarail
Roberta Raileanu
5 years
Excited to share our @iclr_conf 2020 paper “RIDE: Rewarding Impact-Driven Exploration For Procedurally-Generated Environments”: w/ @_rockt at @facebookai ! Code available at . 1/5
5
38
183
@robertarail
Roberta Raileanu
1 year
Our FAIR London team is hiring multiple Research Scientists to work on tool-augmented LLMs! Apply here if interested:
4
47
175
@robertarail
Roberta Raileanu
2 years
Struggling to keep up with all the recent papers on Augmented Language Models? Check out our new survey on augmenting LLMs with reasoning, acting, and tool use.
@_akhaliq
AK
2 years
Augmented Language Models: a Survey abs:
7
101
410
2
25
146
@robertarail
Roberta Raileanu
2 years
CSP: our new algorithm for scalable continual reinforcement learning. CSP allows users to control the trade-off between performance and model size by adaptively learning a subspace of policies. SOTA on Continual World and Brax. Interactive Website:
@jb_gaya
Jean-Baptiste Gaya
2 years
An agent able to tackle a sequence of tasks, without forgetting, while keeping a reasonable model size: this is what we propose with Continual Subspace of Policies (CSP) 1/6
2
7
26
1
6
102
@robertarail
Roberta Raileanu
10 months
I’ll be at #NeurIPS2023 from Tues evening until Sat. Look forward to meeting old friends and making new ones. Let me know if you want to chat about: - augmenting LLMs with tools and decision-making - open-ended and continual learning with / for LLMs Here’s where you can find me:
1
7
100
@robertarail
Roberta Raileanu
7 months
🚨 New paper 🚨 on teaching LLMs to reason with RL! 🔎 We find that expert iteration (aka iterative rejection sampling) is a strong baseline, matching PPO's asymptotic performance and sample efficiency. 🙌 Kudos to @Dahoas1 who is a super productive researcher and produced not
@Dahoas1
Alex Havrilla
7 months
🚨🚨🚨Paper #2 from my time at Meta! In this work, we set out to understand how different algorithms fare at improving LLM reasoning from feedback. We compare expert iteration, PPO, and return-conditioned RL using Llama-2 as the base model.
4
39
157
0
21
99
@robertarail
Roberta Raileanu
2 years
I’ll be at #NeurIPS2022 next week and would love to chat about RL, generalization, exploration, or its interaction with language. I’ll be presenting the following papers together with my co-authors:
1
9
92
@robertarail
Roberta Raileanu
11 months
Can LLaMA-2 teach agents to play NetHack? 🤖 Our new method, Motif, helps agents explore vast open-ended environments by distilling an LLM’s prior knowledge of the world. 🕹️ Motif learns an intrinsic reward model using LLM feedback over caption pairs, which is then used to
@proceduralia
Pierluca D'Oro
11 months
Can reinforcement learning from AI feedback unlock new capabilities in AI agents? Introducing Motif, an LLM-powered method for intrinsic motivation from AI feedback. Motif extracts reward functions from Llama 2's preferences and uses them to train agents with reinforcement
15
164
739
2
14
88
@robertarail
Roberta Raileanu
4 years
Delighted to give a spotlight talk about our work at the Inductive Biases, Invariances, and Generalization in RL workshop #ICML2020 on Sat at 12:50pm UTC, followed by a poster session at 1:15pm UTC.
@robertarail
Roberta Raileanu
4 years
Excited to share our new paper “Automatic Data Augmentation for Generalization in Deep Reinforcement Learning” w/ @maxagoldstein8 , @denisyarats , @ikostrikov , and @rob_fergus ! Paper: Code: Website:
2
51
224
1
12
80
@robertarail
Roberta Raileanu
2 years
Meet Toolformer, a language model that learns to use tools via self-supervision. At test time, Toolformer decides what tool to use and how based on the context, without any additional prompting or finetuning. If you’re interested in this area, we’re hiring interns!
@timo_schick
Timo Schick
2 years
🎉 New paper 🎉 Introducing the Toolformer, a language model that teaches itself to use various tools in a self-supervised way. This significantly improves zero-shot performance and enables it to outperform much larger models. 🧰 🔗 Link:
42
297
1K
2
11
79
@robertarail
Roberta Raileanu
7 months
🚨 Introducing ToolVerifier, our new method for generalization to new tools. 🚀 ToolVerifier outperforms GPT-4 on some tasks despite being based on Llama-2 70b. 🤖 Key ideas: - fine-tune on synthetically generated data for tool use with reasoning traces - self-verification
@jaseweston
Jason Weston
7 months
🚨New paper!🚨 ToolVerifier. - Method to generalize to new tools - Self-asks contrastive questions to select between best tools and parameter choices - Fine-tuned on self-built synthetic data - 22% performance improvement over few-shot baseline 🧵(1/4)
1
70
426
0
13
76
@robertarail
Roberta Raileanu
9 months
🚨 We’re organizing a new #ICLR2024 workshop on Generative AI for Decision Making and submissions are open! 🤖 The combination of Generative AI and Decision Making is emerging as one of the most promising paths towards generally capable agents, allowing
@bogdan_mazoure
Bogdan Mazoure
9 months
🚨 The call for papers for our #ICLR2024 workshop on Generative AI for Decision Making is out! 🚨 We welcome both short (4 pages) and full (9 pages) technical or positional papers. Details: 1/3
1
23
81
0
16
72
@robertarail
Roberta Raileanu
1 year
Super excited to join @UCL_DARK and get the chance to work (even more) closely with this very talented group!! 🚀
@UCL_DARK
UCL DARK
1 year
We are super excited to announce that Dr Roberta Raileanu ( @robertarail ) and Dr Jack Parker-Holder ( @jparkerholder ) have joined @UCL_DARK as Honorary Lecturers! Both have done impressive work in Reinforcement Learning and Open-Endedness, and our lab is lucky to get their support.
4
12
86
5
2
71
@robertarail
Roberta Raileanu
10 months
🤖 Want an agent that can learn new tasks from only a handful of demonstrations and no weight updates? 🚀 Check out our new work on In-Context Learning for Sequential Decision-Making, where we show how we can use transformers to few-shot learn new Procgen and MiniHack tasks. 👋
@sharathraparthy
Sharath Raparthy
10 months
🚨 🚨 !!New Paper Alert!! 🚨 🚨 How can we train agents that learn new tasks (with different states, actions, dynamics and reward functions) from only a few demonstrations and no weight updates? In-context learning to the rescue! In our new paper, we show that by training
3
32
264
0
11
70
@robertarail
Roberta Raileanu
5 months
I’ll be at #ICLR2024 on Saturday for the LLM Agents Workshop Panel 🚀 Some of my collaborators will also be there throughout the week presenting our work on: - how RLHF affects LLM generalisation and diversity with @_robertkirk - training NetHack agents using LLM feedback with
3
9
70
@robertarail
Roberta Raileanu
3 years
Our work on “Decoupling Value and Policy for Generalization in Reinforcement Learning” w/ @rob_fergus will be presented as a long talk @icmlconf 2021. Come check it out in Poster Session 1 (Tuesday, 12 - 2pm ET / 5 - 7pm BST): .
1
16
66
@robertarail
Roberta Raileanu
4 years
Happy to finally release our work on “Fast Adaptation via Policy-Dynamics Value Functions” w/ @maxagoldstein8 , Arthur Szlam, and @rob_fergus , which will be presented at #icml2020 ! Paper: Code: Website:
1
18
66
@robertarail
Roberta Raileanu
2 months
Check out @RL_Conference 's Outstanding Paper Awards and the blog post on the process we used to decide. We awarded 7 papers that excel in one of the following aspects: - Applications of Reinforcement Learning - Empirical Reinforcement Learning Research - Empirical
@MarlosCMachado
Marlos C. Machado
2 months
In case you are not attending the @RL_Conference this year (I'm really sorry for you), @robertarail and I announced the list of RLC's Outstanding Paper Awards. If you want to know the awarded papers, or the process we followed, read this piece:
2
11
85
0
8
59
@robertarail
Roberta Raileanu
3 years
Very cool work on unsupervised environment design! I really like the idea of an interactive demo where you can test agents on different environments in real time! I hope more RL papers will adopt this in the future, particularly if the focus is on generalization or robustness.
@jparkerholder
Jack Parker-Holder
3 years
Evolving Curricula with Regret-Based Environment Design Website: Paper: TL;DR: We introduce a new open-ended RL algorithm that produces complex levels and a robust agent that can solve them (e.g. below). Highlights ⬇️! [1/N]
3
47
225
1
9
58
@robertarail
Roberta Raileanu
2 months
It really takes a village…of Llama herders 🦙🦙🦙 Working on this has been an absolute privilege. Special shoutout to @mialon_gregoire and @sharathraparthy who worked tirelessly to bring tool use capabilities to Llama-3.1!!
@AIatMeta
AI at Meta
2 months
Starting today, open source is leading the way. Introducing Llama 3.1: Our most capable models yet. Today we’re releasing a collection of new Llama 3.1 models including our long awaited 405B. These models deliver improved reasoning capabilities, a larger 128K token context
284
1K
6K
5
3
54
@robertarail
Roberta Raileanu
6 months
Can’t wait to see what you all build with Llama 3, the best openly available LLM! Enjoy and stay tuned for more🦙🦙🦙
@ml_perception
Mike Lewis
6 months
Excited to share a preview of Llama3, including the release of an 8B and 70B (82 MMLU, should be the best open weights model!), and preliminary results for a 405B model (still training, but already competitive with GPT4). Lots more still to come...
18
97
506
1
3
53
@robertarail
Roberta Raileanu
17 days
🏆 Synthetic data generation is the new holy grail of AI. But ensuring the data is high-quality, interesting, and useful is non-trivial. 🤖 Our new method, Source2Synth, provides a way of automatically generating and curating training examples (with and for LLMs 🦙) grounded in
@jaseweston
Jason Weston
18 days
🚨New paper: Source2Synth🚨 - Generates synthetic examples grounded in real data - Curation step makes data high quality based on answerability - Improves performance on two challenging domains: Multi-hop QA and using tools: SQL for tabular QA 🧵(1/4)
1
56
250
2
6
55
@robertarail
Roberta Raileanu
3 years
Very excited our workshop on Agent Learning in Open-Endedness (ALOE) was accepted at #ICLR2022 ! Look forward to bringing together a diverse community to discuss how we can create more open-ended learning systems.
@aloeworkshop
ALOE Workshop
3 years
Announcing the first Agent Learning in Open-Endedness (ALOE) Workshop at #ICLR2022 ! We're calling for papers across many fields: If you work on open-ended learning, consider submitting. Paper deadline is February 25, 2022, AoE. .
1
18
72
0
5
52
@robertarail
Roberta Raileanu
2 years
CSP accepted as spotlight at #ICLR2023 . Kudos to @jb_gaya who led this project and to all the other wonderful co-authors @doan_tl @LucasPCaccia @LaureSoulier @LudovicDenoyer
@robertarail
Roberta Raileanu
2 years
CSP: our new algorithm for scalable continual reinforcement learning. CSP allows users to control the trade-off between performance and model size by adaptively learning a subspace of policies. SOTA on Continual World and Brax. Interactive Website:
1
6
102
4
10
50
@robertarail
Roberta Raileanu
1 year
Check out our new #ICML paper on “Hyperparameters in RL and How to Tune Them”, led by @The_Eimer who did an amazing job! We recommend a set of best practices for reporting HP tuning in RL, which we hope will result in better reproducibility and more fair comparisons.
@The_Eimer
Theresa Eimer
1 year
No one likes fiddling with hyperparameters (HPs), especially in RL - but you don’t have to! Our #ICML paper shows that HP optimization methods outperform 10x more compute-expensive hand tuning - and that underreporting is a reproducibility hazard. 🧵:
2
10
51
1
6
50
@robertarail
Roberta Raileanu
4 years
New approach for exploration in procedurally generated environments with sparse reward! AMIGo learns to set and achieve increasingly more challenging goals, leading to SOTA performance on MiniGrid.
@egrefen
Edward Grefenstette
4 years
Got a complicated RL exploration problem? Sparse/no reward? It's dangerous to go alone: bring an AMIGo! This thread introduces work done by Andres Campero, with @robertarail , Josh B. Tenenbaum, @HeinrichKuttler , @_rockt and me during Andres' internship at FAIR London. [1/5]
4
57
282
0
8
48
@robertarail
Roberta Raileanu
3 years
A very thorough, insightful, and much needed survey on generalization in deep RL!
@_robertkirk
Robert Kirk
3 years
I'm very excited to share a Survey of Generalisation in Deep Reinforcement Learning (), written with @yayitsamyzhang , @egrefen and @_rockt . Curious about generalisation in RL? Then this is the survey for you! Here's a thread giving a quick overview...
7
134
545
0
3
48
@robertarail
Roberta Raileanu
1 year
🚨 New Paper 🚨 📜 We study the effects of RLHF on the generalization and diversity of LLMs - RLHF results in much better OOD generalization than SFT - RLHF leads to lower output diversity compared to SFT kudos to @_robertkirk who did an outstanding job leading this work!👇
@_robertkirk
Robert Kirk
1 year
Excited to share work from my FAIR internship on understanding the effects of RLHF on LLM generalisation and diversity: While RLHF outperforms SFT in-distribution and OOD, this comes at the cost of a big drop in output diversity! Read more below🧵
3
57
336
0
8
47
@robertarail
Roberta Raileanu
2 months
We started this work back in 2022 with @sharathraparthy and @HenaffMikael when transformers only had 4k context length 😱. This was one of the main bottlenecks in extending our method to longer-horizon tasks like NetHack 👾 or Minecraft 🤖 I’m excited to see what can be achieved
@_rockt
Tim Rocktäschel
2 months
Fantastic keynote talk by @robertarail at the AutoRL workshop on achieving in-context learning for complex environments like @NetHack_LE / MiniHack and ProcGen. At current rate, I believe we'll see an in-context NetHack agent soon that can in-context imitate from expert play.
2
7
46
1
8
47
@robertarail
Roberta Raileanu
7 months
🚨 Wondering when, where and how to improve your LLM’s reasoning? 🤖 Look no further! Our new method GLoRe shows how you can do just that. ⏲️ When? Use an outcome-based reward model (ORM) to decide which solutions need refinement. 🎯 Where? Use a step-wise ORM (SORM) to decide
@Dahoas1
Alex Havrilla
7 months
New paper alert🚨🚨🚨 How to bootstrap the reasoning refinement capabilities of LLMs using synthetic data? Introducing "GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements". Applied on GSM8K we can improve a strong RL finetuned LLama-2 13B by 12%
2
20
88
0
11
46
@robertarail
Roberta Raileanu
1 year
📢 Check out our new survey on current challenges and applications of LLMs! There is still plenty to understand, improve, and build, so we hope this survey will lower the barrier for newcomers, researchers, and practitioners to help advance the field.
@robert_mchardy
Robert McHardy
1 year
(1) LLMs are ubiquitous, but what are the challenges requiring further research and in where have LLMs successfully been applied? In our new paper, Challenges and Applications of Large Language Models, we answer these questions comprehensively. 📚:
4
86
396
1
7
46
@robertarail
Roberta Raileanu
4 years
NetHack is a great environment for driving research on exploration, planning, transfer, generalization, and many other key RL problems! Very curious to see what insights it will inspire.
@egrefen
Edward Grefenstette
4 years
Want to help push the boundaries of RL research? Need a rich, difficult, and procedurally-generated environment with loads of structure and intricacy? An astounding amount of human play data? Sophisticated strategies and documentation? We got you (and it's faster than ALE!) [1/6]
2
49
171
1
6
43
@robertarail
Roberta Raileanu
2 years
Want agents that effectively explore varying environments with high-dimensional observations or huge state spaces? Check out E3B: Exploration via Elliptical Episodic Bonuses, which achieves a new SOTA on Habitat and MiniHack. Fantastic work led by @MikaelHenaff !
@HenaffMikael
Mikael Henaff
2 years
Excited to share our @NeurIPSConf paper where we propose E3B--a new algorithm for exploration in varying environments. Paper: Website: E3B sets new SOTA for both MiniHack and reward-free exploration on Habitat. A thread [1/N]
3
24
109
0
7
44
@robertarail
Roberta Raileanu
8 months
In our paper led by @ishitamed (, to be presented at #ICLR2024 ), we found that online RL generalizes much better than offline learning methods (including decision transformers, behavioral cloning, and offline RL), on both Procgen and WebShop. But we were
@GhugareRaj
Raj Ghugare
8 months
What if I told you that there were a combinatorial number of tasks that Decision Transformer like methods overlook? Excited to share our paper which unwraps this by linking trajectory stitching to combinatorial generalization, leading to a practical data augmentation method.
2
10
58
1
6
43
@robertarail
Roberta Raileanu
3 years
We'll be presenting our work on solving hard exploration tasks via quasi-adversarial intrinsic goals at #ICLR2021 during poster session 2 on Monday. Led by Andres Campero, joint with @HeinrichKuttler , Josh Tenenbaum, @_rockt , and @egrefen .
@egrefen
Edward Grefenstette
3 years
“Going” to ICLR on Monday? Be sure to “stop by” our poster during session 2. We present a new near-adversarial metalearning-inspired method for structuring exploration in RL by modelling the boundary of the agent’s abilities. Easy to add to your agent!
1
9
55
1
8
43
@robertarail
Roberta Raileanu
2 months
Excited to talk about how we can build more general and robust agents at the @AutoRL_Workshop #ICML2024 . I'll be discussing the advantages and limitations of using offline methods for learning sequential decision making tasks, such as in-context learning with transformers or
@AutoRL_Workshop
AutoRL Workshop
2 months
Up next: Jack Parker-Holder @jparkerholder works on open-endedness as a research scientist at @GoogleDeepMind and honorary lecturer with @UCL_DARK . His focus is on unlimited training data for open-ended RL - being able to generate interactive tasks controllably and at will 🔮
0
5
16
2
4
40
@robertarail
Roberta Raileanu
10 months
📢 @UCL_DARK has a few open PhD positions starting next year. If you’re interested in working with us on agents capable of robust decision making and open-ended learning consider applying!
@UCL_DARK
UCL DARK
10 months
We ( @_rockt , @egrefen , @robertarail , and @jparkerholder ) are looking for PhD students to join us in Fall 2024. If you are interested in Open-Endedness, RL & Foundation Models, then apply here: and also write us at ucl-dark-admissions @googlegroups .com
3
20
66
1
8
40
@robertarail
Roberta Raileanu
2 months
Thanks @tafm_rlc for organizing an excellent workshop at the @RL_Conference ! I deeply enjoyed the diversity of the posters and talks, debating the future of foundation models and agency with @criticalneuro and @MichaelD1729 (where we all seemed to agree that as a community we
1
8
38
@robertarail
Roberta Raileanu
8 months
We’re trying something different for the paper awards at the @RL_Conference . Instead of awarding papers for being the overall “best”, we award them for making significant contributions to specific aspects of research, aiming to promote more diverse contributions.
@MarlosCMachado
Marlos C. Machado
8 months
Roberta ( @robertarail ) and I came up with a different proposal on how to do paper awards. This is what @RL_Conference will do this year. The idea is to award papers for excelling in what they aim to accomplish. If you want to know more, take a look at the blog post we wrote.
1
5
31
0
1
36
@robertarail
Roberta Raileanu
2 years
Had a great time speaking about learning behaviors from large-scale datasets #GameAISchool2022 . More details about this work coming out soon.
@togelius
Julian Togelius
2 years
The fourth day of #GameAISchool2022 kicks off with @robertarail giving a very interesting talk about learning behavior from large-scale datasets, and in particular the @NetHack_LE . I think large-scale imitation learning will be very important to the future of games.
1
3
27
0
4
37
@robertarail
Roberta Raileanu
1 year
If you work on generalization in sequential decision making, reinforcement learning, or planning, consider submitting your work at this #NeurIPS2023 workshop👇
@pulkit_verma
Pulkit Verma
1 year
10 days left to submit your work to #NeurIPS2023 Workshop on Generalization in Planning. We have an exciting speaker lineup: @FeryalMP , Giuseppe De Giacomo, Hector Geffner, @robertarail , @PeterStone_TX , @yayitsamyzhang CFP and more info available at
0
1
9
0
3
35
@robertarail
Roberta Raileanu
1 year
In our new paper led by @yidingjiang , we highlight the importance of exploration for generalization to new environments. Encouraging the agent to visit states with high epistemic uncertainty results in EDE, the first value-based method to achieve SOTA on Procgen and Crafter.
@yidingjiang
Yiding Jiang
1 year
An agent is unlikely to be tested in the exact same environment it’s trained in. In our new work with @zicokolter and @robertarail , we show that exploration during training plays a crucial role in zero-shot generalization to new environments. 🧵 1/
1
24
116
0
4
36
@robertarail
Roberta Raileanu
7 months
Thanks @arankomatsuzaki for highlighting our work! These insights inspired us to develop GLoRe, a new method for refining LLM reasoning using synthetic data: .
@arankomatsuzaki
Aran Komatsuzaki
7 months
Meta presents: Teaching Large Language Models to Reason with Reinforcement Learning Finds the sample complexity of Expert Iteration is similar to that of PPO and performs the best among all algorithms
3
56
294
0
6
34
@robertarail
Roberta Raileanu
2 months
Had a blast presenting at the @AutoRL_Workshop #ICML2024 , thanks to the organizers and attendees for all the insightful questions!
@_rockt
Tim Rocktäschel
2 months
Fantastic keynote talk by @robertarail at the AutoRL workshop on achieving in-context learning for complex environments like @NetHack_LE / MiniHack and ProcGen. At current rate, I believe we'll see an in-context NetHack agent soon that can in-context imitate from expert play.
2
7
46
2
2
34
@robertarail
Roberta Raileanu
3 years
Had a great time chatting about generalization in RL with Wendelin Böhmer, Cheng Zhang, Harm van Seijen, and Mingfei Sun at the Microsoft Research Summit! Thanks @MSFTResearch for organizing this! You can watch the discussion in the RL - Part 2 track:
0
5
33
@robertarail
Roberta Raileanu
2 years
Interested in AI and Games and wouldn’t mind spending a week in Greece learning about it? Come join us then!
0
4
34
@robertarail
Roberta Raileanu
2 years
Check out our new ICLR paper on MAESTRO led by @_samvelyan . MAESTRO trains robust agents for zero-sum two-player games by generating joint auto-curricula over both environments and co-players.
@_samvelyan
Mikayel Samvelyan
2 years
I’m excited to share our latest #ICLR2023 paper 🏎️ MAESTRO: Open-Ended Environment Design for Multi-Agent Reinforcement Learning 🏎 Paper: Website: Highlights: 👇
5
53
217
0
5
34
@robertarail
Roberta Raileanu
5 months
Had a blast debating the future of LLM agents with such legends in the field and hearing everyone’s insights on this timely topic! Thanks @llmagentsworkshop for the invite and for putting together an excellent workshop 🙏
@LLMAgents
LLM Agents Workshop@ICLR2024
5 months
Our panel session is starting!
0
2
13
1
1
33
@robertarail
Roberta Raileanu
7 months
Training action-conditioned video generation models is a great feat. Even more impressive to do this fully unsupervised. This paves the way for so many exciting research directions and applications. Can't wait to see such neural simulators being used to train robots or other
@_rockt
Tim Rocktäschel
7 months
I am really excited to reveal what @GoogleDeepMind 's Open Endedness Team has been up to 🚀. We introduce Genie 🧞, a foundation world model trained exclusively from Internet videos that can generate an endless variety of action-controllable 2D worlds given image prompts.
143
575
3K
1
4
31
@robertarail
Roberta Raileanu
2 years
Excited to talk about generalization in RL @ICComputing this week!
@ic_arl
ICARL
2 years
Next seminar coming! 🥳🥳 26th of May at 17:00 (UK) we are excited to have @robertarail 🤗 from @MetaAI as our next invited speaker to tell us about her latest research on generalisation in deep reinforcement learning @imperialcollege Huxley 308 @ICComputing
1
4
10
1
4
32
@robertarail
Roberta Raileanu
5 months
Inspiring talk from @katjahofmann on how we can empower game creators with learned action and world models! The video game generation results look very impressive! At the GenAI for Decision Making workshop #ICLR2024
0
3
32
@robertarail
Roberta Raileanu
10 months
Curious how RLHF and SFT compare in terms of OOD generalization and diversity? Check out @_robertkirk 's talk at @itif_workshop workshop at 4pm and our poster at 1pm - 2pm! Room 220 - 222 #NeurIPS2023
@_robertkirk
Robert Kirk
1 year
Excited to share work from my FAIR internship on understanding the effects of RLHF on LLM generalisation and diversity: While RLHF outperforms SFT in-distribution and OOD, this comes at the cost of a big drop in output diversity! Read more below🧵
3
57
336
1
5
31
@robertarail
Roberta Raileanu
2 years
Nice idea on using LLMs to inject human priors into the exploration of RL agents. This biases exploration towards behaviors and tasks that humans care about. Could be particularly useful for open-ended environments with large state spaces like NetHack or MineCraft.
@d_yuqing
Yuqing Du
2 years
How can we encourage RL agents to explore human-meaningful behaviors *without* a human in the loop? @OliviaGWatkins2 and I are excited to share “Guiding Pretraining in Reinforcement Learning with LLMs”! 📜 🧵1/
5
79
354
0
2
31
@robertarail
Roberta Raileanu
5 months
🚨 DreamCraft 🚨 Ever wanted to generate MineCraft ⛏️💎🏡 structures by simply describing them in natural language? Our new method DreamCraft does just this using quantized NeRFs 🔥 DreamCraft can also be combined with functional constraints to ensure the generated 3D
@Smearle_RH
smearle
5 months
We introduce DreamCraft 🗣️🧱⛏️, a NeRF-based method for generating MineCraft structures from free-form text prompts. Work with @filippos_kok , @NieYuhe , @togelius and @robertarail at FDG 2024. 📜 E.g., "large medieval ship" "old wizard's tree mansion"
1
16
83
0
4
30
@robertarail
Roberta Raileanu
5 months
Rainbow Teaming poster getting crowded. Come hear @sharathraparthy and @_andreilupu explaining how we can use open-ended learning methods like quality-diversity to generate endless adversarial prompts for LLMs and make them more robust in the process 🌈🌈🌈
@sharathraparthy
Sharath Raparthy
5 months
@_andreilupu @robertarail and I are presenting Rainbow Teaming 🌈 at Set LLM workshop. Location - Schubert 5. Come chat with us!
0
1
6
2
5
31
@robertarail
Roberta Raileanu
2 months
Open-endedness is all you need! Great read from @chalmermagne and @nathanbenaich at @airstreetpress outlining the challenges in building truly useful AI agents that can deal with the complexity and unpredictability of the real world. They highlight open-ended learning as an
@nathanbenaich
Nathan Benaich
2 months
New on @airstreetpress : Now that LLMs can convincingly automate much of a bored human’s tasks, attention is turning to “agentic AI”. In this piece, we evaluate how far advanced this work actually is, look at both promising research directions and the challenges ahead. Thread:
7
32
152
1
8
30
@robertarail
Roberta Raileanu
2 years
Exploration, and more broadly active data collection, is key for generalization, not only in RL but also SL and SSL. We are just scratching the surface on understanding how the two interact.
@MinqiJiang
Minqi Jiang
2 years
The popular focus on big models in deep learning ignores the core assumption of having the right data—a product of *exploration.* @egrefen , @_rockt , and I outline how + why exploration is key to creating more general AIs. Blog: 📜
9
56
285
0
3
29
@robertarail
Roberta Raileanu
2 months
Very inspiring work and super cool to see open-ended learning ideas being applied to scientific discovery, which is, of course, an open-ended process! Congrats to everyone involved!
@SakanaAILabs
Sakana AI
2 months
Introducing The AI Scientist: The world’s first AI system for automating scientific research and open-ended discovery! From ideation, writing code, running experiments and summarizing results, to writing entire papers and conducting peer-review, The AI
259
1K
6K
1
0
28
@robertarail
Roberta Raileanu
1 year
In our new #ICML oral, we find that episodic bonuses are best at exploring environments with little shared structure across episodes, while global bonuses are best for a lot of shared structure. Combining them achieves the best of both worlds and works well in multiple domains.
@HenaffMikael
Mikael Henaff
1 year
Exploration is well-studied for singleton MDPs, but many envs of interest change across episodes (i.e. procgen envs or embodied AI tasks). How should we explore in this case? In our upcoming @icmlconf oral, we study this question. A thread...1/N
1
11
51
0
4
28
@robertarail
Roberta Raileanu
1 year
If you are interested in open-ended learning or self-improving models, check out the ALOE workshop at NeurIPS 2023!
@aloeworkshop
ALOE Workshop
1 year
🌱 The 2nd Agent Learning in Open-Endedness Workshop will be held at NeurIPS 2023 (Dec 10–16) in magnificent New Orleans. ⚜️ If your research considers learning in open-ended settings, consider submitting your work (by 11:59 PM Sept. 29th, AoE).
2
14
52
0
3
27
@robertarail
Roberta Raileanu
8 months
Such a cool paper! Shows how to use open-ended search to find diverse adversarial scenarios for multi-agent systems. With applications to football!
@_samvelyan
Mikayel Samvelyan
8 months
Uncovering vulnerabilities in multi-agent systems with the power of Open-Endedness! Introducing MADRID: Multi-Agent Diagnostics for Robustness via Illuminated Diversity ⚽️ Paper: Site: Code: 🔜 Here's what it's all about: 🧵👇
Tweet media one
1
41
148
0
5
26
@robertarail
Roberta Raileanu
8 months
Stay tuned!
@MarlosCMachado
Marlos C. Machado
8 months
I'm very excited with what @robertarail and I are cooking up!
1
1
21
0
0
26
@robertarail
Roberta Raileanu
10 months
Come chat with us about the importance of exploration for generalization in RL this afternoon at #NeurIPS2023 , with @yidingjiang and @zicokolter !
@robertarail
Roberta Raileanu
10 months
2) "On the Importance of Exploration for Generalization in Reinforcement Learning" led by @yidingjiang
Tweet media one
1
0
10
0
1
25
@robertarail
Roberta Raileanu
11 months
Great to see this!
@yayitsamyzhang
Amy Zhang
11 months
Thrilled to announce the first annual Reinforcement Learning Conference @RL_Conference , which will be held at UMass Amherst August 9-12! RLC is the first strongly peer-reviewed RL venue with proceedings, and our call for papers is now available: .
Tweet media one
5
61
419
0
2
25
@robertarail
Roberta Raileanu
2 months
I won’t be attending #ICML2024 in person this year, but many @UCL_DARK members will be there so make sure to check out their excellent work, including a number of orals and spotlights 🙀, as well as some of my own contributions: - GLoRe: When, Where, and How to Improve LLM
@UCL_DARK
UCL DARK
3 months
DARK is going to Vienna🏰🎶! We are excited to present our work at #icml2024 , including 3 Orals on; Debate for Scalable Oversight (), Genie (), and a position paper on Open-Endedness (). Come chat with us ⬇️🚀
Tweet media one
6
5
37
0
4
25
@robertarail
Roberta Raileanu
2 years
2 new papers on using LLMs to generate game levels! Very cool work. Language seems like a powerful way of generating open-ended tasks for training increasingly capable and general agents. It’s also a neat way of incorporating human priors, which are currently lacking.
@togelius
Julian Togelius
2 years
Large language models can do... lots of things. But can they generate game levels? Playable game levels, where puzzles are solvable? Two papers announced today address this question. The first is by our team at @NYUGameLab - read on for more:
Tweet media one
7
108
426
0
1
24
@robertarail
Roberta Raileanu
2 months
Proud to work at a company which is such a great champion of open source!
@Ahmad_Al_Dahle
Ahmad Al-Dahle
2 months
With today’s launch of our Llama 3.1 collection of models we’re making history with the largest and most capable open source AI model ever released. 128K context length, multilingual support, and new safety tools. Download 405B and our improved 8B & 70B here.
63
227
1K
2
0
23
@robertarail
Roberta Raileanu
4 months
Great work on using LLMs + evolutionary methods to discover new and well-performing ML algorithms! Feels like we're just scratching the surface on what's possible 🚀
@_chris_lu_
Chris Lu
4 months
Excited to share my first work from my internship @SakanaAILabs ! We used LLMs to design and implement new preference optimization algorithms for training LLMs, discovering cutting-edge methods! Co-led with @samianholt and Claudio Fanconi. Details in thread 🧵 (1/N)
4
35
157
0
1
23
@robertarail
Roberta Raileanu
1 year
📢 Do you work on LLM safety, alignment, fairness, accountability, or transparency? 📜 Submit your work to the #NeurIPS2023 SoLaR workshop on Socially Responsible Language Modelling Research!
@solarneurips
SoLaR @ NeurIPS2024
1 year
We are announcing the NeurIPS 2023 workshop on Socially Responsible Language Modelling Research (SoLaR)! We welcome submissions from all disciplines promoting fairness, accountability, transparency, and safety in language modeling.
1
25
69
0
3
23
@robertarail
Roberta Raileanu
2 years
Great work on improving sample efficiency in RL via resets, which allows more updates and thus better scaling with compute. Love the simplicity of the method and the many insights in the paper.
@proceduralia
Pierluca D'Oro
2 years
Our paper "Sample-Efficient Reinforcement Learning by Breaking the Replay Ratio Barrier" has been accepted to ICLR as top 5%! We show that lots of updates and resets lead to great sample efficiency. Come for the scaling, stay for the insights. 1/N🧵
Tweet media one
6
33
272
0
3
23
@robertarail
Roberta Raileanu
2 months
Don’t miss @HenaffMikael presenting our work today on in-context learning of sequential decision making tasks! #ICML2024
@sharathraparthy
Sharath Raparthy
2 months
@HenaffMikael will be presenting our work “Generalization to New Sequential Decision Making Tasks with In-context Learning” Location - Hall C4-9 #2915 #ICML2024
0
1
10
1
2
23
@robertarail
Roberta Raileanu
1 year
📢Chain-of-Verification (CoVe) reduces hallucinations in LLMs by having the model verify its own answers via simple follow-up questions. Great work from @shehzaadzd during his FAIR internship!
@shehzaadzd
Shehzaad Dhuliawala
1 year
Chain-of-Verification Reduces Hallucination in LLMs (work done during my internship at FAIR) - LLMs more likely to hallucinate during longform generation - We show that generating verification questions for the LLM to self-answer allows the model to deliberate on its outputs 🧵
5
9
67
0
1
23
@robertarail
Roberta Raileanu
4 years
A new challenge for RL agents based on NetHack! Grateful to have been among the first few people to try it out for research.
@_rockt
Tim Rocktäschel
4 years
I am proud to announce the release of the NetHack Learning Environment (NLE)! NetHack is an extremely difficult procedurally-generated grid-world dungeon-crawl game that strikes a great balance between complexity and speed for single-agent reinforcement learning research. 1/
16
194
730
0
3
21
@robertarail
Roberta Raileanu
4 years
Check out our poster session at #ICML2020 if you want to learn more about a new framework for fast adaptation in RL using policy-dynamics value functions. Today (Wed) at 8pm UTC and tomorrow at 9am UTC. ICML talk: Paper:
0
4
22
@robertarail
Roberta Raileanu
7 months
🌈🌈🌈 Rainbow Teaming 🌈🌈🌈 Beyond excited to share Rainbow Teaming, our new method for open-ended generation of diverse adversarial prompts for LLMs. 🌟 Rainbow Teaming is a generic approach for finding prompts where the model fails according to some metric. For some domains
@_samvelyan
Mikayel Samvelyan
7 months
Introducing 🌈 Rainbow Teaming, a new method for generating diverse adversarial prompts for LLMs via LLMs It's a versatile tool 🛠️ for diagnosing model vulnerabilities across domains and creating data to enhance robustness & safety 🦺 Co-lead w/ @sharathraparthy & @_andreilupu
6
44
178
1
4
21
@robertarail
Roberta Raileanu
6 months
Great to see our survey on the Challenges and Applications of LLMs being cited by the White House.
@robert_mchardy
Robert McHardy
6 months
Very cool to see our work on LLMs being cited by the 2024 Economic Report of the President (p.256 & 371): @jeankaddour @maximilianmozes @herbiebradley @robertarail
1
1
11
0
2
21
@robertarail
Roberta Raileanu
1 year
Not at ICML this year but thankfully my collaborators are, so reach out :) @The_Eimer is presenting Hyperparameters in RL and How to Tune Them @HenaffMikael is presenting A Study of Global and Episodic Bonuses in Contextual MDPs
@The_Eimer
Theresa Eimer
1 year
The timeslot for the poster is Thursday 1:30pm in exhibit hall 1, poster 414 - see you then 👋
0
0
10
0
4
21
@robertarail
Roberta Raileanu
5 months
Can LLMs act as embodied agents 🤖 that learn from feedback and interaction with the world? Can diffusion models be used as world models to enhance planning, RL, or control 🕹️ algorithms? Can LLMs improve exploration in open-ended environments 🌎 by acting as human priors?
@bogdan_mazoure
Bogdan Mazoure
5 months
🚨 Pass by our #ICLR2024 workshop on Generative AI for Decision Making tomorrow, Saturday May 11! 🚨 We have a content-heavy day, including an exciting lineup of invited and contributed talks, as well as two poster sessions! Details:
1
7
32
0
4
21
@robertarail
Roberta Raileanu
1 year
🚀🚀🚀AstroLLaMA is out, check it out!! Fantastic work from the super talented folks @universe_tbd . Can’t wait to see what new insights and discoveries this will enable, right at the intersection of some of my favorite topics. 🔭🪐 🤖
@universe_tbd
UniverseTBD
1 year
We are thrilled to announce the release of AstroLLaMa, an LLM fine-tuned for astronomy ✨ it’s just the start of our journey towards harnessing the power of ML for the needs of researchers 🚀 #lettheastrollamaout
Tweet media one
0
6
20
0
0
20
@robertarail
Roberta Raileanu
5 months
Great to see the OpenDevin effort, presented by @gneubig at the @LLMAgents workshop at #ICLR2024
Tweet media one
0
0
20
@robertarail
Roberta Raileanu
8 months
Working at the intersection of GenAI and Decision Making? Submit your papers to our #ICLR2024 workshop! Deadline extended to February 9, AOE. Follow @genai4dm for official updates from the organizers.
@genai4dm
GenAI4DM Workshop
8 months
🚨 Missed the ICML deadline? Consider submitting a short (4 pages) or long (9 pages) paper to the GenAI+Decision Making workshop at ICLR 2024: . Deadline is extended to February 9, AOE.
1
1
8
0
5
20
@robertarail
Roberta Raileanu
4 years
Cool work on prioritized replay for procedurally generated environments!
@MinqiJiang
Minqi Jiang
4 years
Excited to share Prioritized Level Replay: a simple + effective way to improve generalization and sample efficiency of deep RL agents in procedurally-generated environments, where layout and dynamics can differ per episode. We call these variations "levels."
Tweet media one
1
40
164
1
9
20
@robertarail
Roberta Raileanu
2 months
A very special moment @RL_Conference .
@pcastr
Pablo Samuel Castro
2 months
Third keynote by Andy Barto @RL_Conference , arguing that it was always RL, with a standing ovation at the end!
Tweet media one
4
17
174
0
2
19
@robertarail
Roberta Raileanu
2 years
Very neat work on hierarchical RL that will be presented #NeurIPS2022 . Key idea: use compression to decompose demonstrations into generalizable skills.
@yidingjiang
Yiding Jiang
2 years
Hierarchical RL aims to break complex tasks into simpler subtasks, but how do we judge the quality of decomposition w/ minimal prior knowledge? In this new work at NeurIPS (), we show that good decompositions arise from compression. 🧵 1/
Tweet media one
2
51
276
0
0
19
@robertarail
Roberta Raileanu
10 months
Great to see this new benchmark for general assistants that requires multi-step reasoning, planning, and tool use, posing a challenge to current LLMs. Looking forward to seeing what solves it, whether it’s scaling, a new algorithm, or Q* (whatever Q* is).
@mialon_gregoire
Grégoire Mialon
10 months
Happy to share GAIA with the community! 1/4 Joint work with @clefourrier @TechySwift @Thom_Wolf @ylecun @ThomasScialom
5
34
138
0
1
19
@robertarail
Roberta Raileanu
5 years
RIDE achieves SOTA on challenging procedurally-generated MiniGrid tasks. In contrast to other exploration methods, RIDE does not suffer from diminishing intrinsic rewards during training, and it encourages agents much more strongly to interact with objects they can control. 5/5
Tweet media one
Tweet media two
1
2
16
@robertarail
Roberta Raileanu
10 months
Excited to give a talk later today at the GenPlan workshop #NeurIPS2023 on In-Context Learning for Sequential Decision-Making Tasks. Thanks to the organizers for putting together this exciting workshop! 📍Room 238-239 ⏰4:00 - 4:35 PM
1
5
18
@robertarail
Roberta Raileanu
2 years
Today we’re presenting NLD, a large-scale dataset of NetHack demonstrations. #NeurIPS2022
@UCL_DARK
UCL DARK
2 years
2⃣ Dungeons and Data: A Large-Scale NetHack Dataset 🧙‍♂️🧝‍♀️ ⏰: 4:00PM 🚩: Hall J #1028 Say 👋 to @erichammy @robertarail @_rockt @HeinrichKuttler
0
2
4
0
4
18
@robertarail
Roberta Raileanu
2 years
In need of a large-scale dataset of demonstrations coupled with a super fast simulator to train a foundation model of behaviors? Look no further than the NetHack Learning Dataset! 10B state transitions from 1.5M human demonstrations; 100K dataset FPS vs. 26K environment FPS
@erichammy
Eric Hambro
2 years
Delighted to present the NetHack Learning Dataset (NLD) at #NeurIPS2022 next week! NLD is a new large-scale dataset for NetHack and MiniHack, aimed at supercharging research in offline RL, learning from observations, and imitation learning. 1/
Tweet media one
3
20
137
0
3
16
@robertarail
Roberta Raileanu
4 months
Exciting to see more advances in open-ended learning!
@jeffclune
Jeff Clune
4 months
I am thrilled to introduce OMNI-EPIC: Open-endedness via Models of human Notions of Interestingness with Environments Programmed in Code. Led by @maxencefaldor and @jennyzhangzt , with @CULLYAntoine and myself. 🧵👇
10
43
217
0
5
17
@robertarail
Roberta Raileanu
11 months
Example of agents exhibiting reward hacking when trained with both in-game extrinsic reward and LLM intrinsic reward, but not when trained with each of these rewards alone. Would be curious to see if this observation transfers to chatbots or other language agents and what are
@MartinKlissarov
Martin Klissarov
11 months
While working on Motif, we discovered something unexpected: our NetHack-playing AI agent uses virtual drugs to get its rewards! We trained agents to find "the oracle", a game character that can be normally found in advanced game levels. We noticed surprisingly good performance
4
14
65
1
7
16
@robertarail
Roberta Raileanu
1 year
Nice work on scaling laws for behavioral cloning leading to significant improvements on NetHack.
@JensTuyls
Jens Tuyls
1 year
Imitation learning is one of the most widely used methods in ML, but how does compute affect its performance? We explore this question in the challenging game of NetHack and find our scaled-up agent to outperform prior SOTA by 2x! [1/6]
Tweet media one
2
19
107
0
1
16
@robertarail
Roberta Raileanu
7 months
Open-ended generation of synthetic data (both tasks and solutions!) is the real path to AGI.
@sharathraparthy
Sharath Raparthy
7 months
Introducing Rainbow Teaming - a new approach using open-ended search for effective, diverse prompt generation. Co-led with the amazing @_samvelyan and @_andreilupu . A detailed 🧵 on one of the most promising applications - High quality diverse synthetic data generation
1
8
33
1
4
15
@robertarail
Roberta Raileanu
8 months
Check out @sharathraparthy ’s @TalkRLPodcast interview about our recent work on in-context learning for sequential decision making. Thanks @robinc for the invite! Paper:
@TalkRLPodcast
TalkRL Podcast
8 months
Episode 48: Sharath Chandra Raparthy @sharathraparthy (AI Resident at @AIatMeta ) on In-Context Learning for Sequential Decision Tasks, GFlowNets, and more!
1
2
21
1
4
16
@robertarail
Roberta Raileanu
2 months
Great to see these two papers get the recognition they deserve, congrats to all co-authors!!
@_rockt
Tim Rocktäschel
2 months
We have been awarded two Best Paper Awards at @ICMLconf 2024 for "Genie: Generative Interactive Environments" and "Debating with More Persuasive LLMs Leads to More Truthful Answers"! Incredibly proud of @UCL_DARK and @GoogleDeepMind 's Open-Endedness Team. 🚀
Tweet media one
23
53
390
1
0
16