Professor, CS, U. British Columbia. CIFAR AI Chair, Vector Institute. Sr. Advisor, DeepMind | ML, AI, deep RL, deep learning, AI-Generating Algorithms (AI-GAs)
I am thrilled to introduce OMNI-EPIC: Open-endedness via Models of human Notions of Interestingness with Environments Programmed in Code. Led by
@maxencefaldor
and
@jennyzhangzt
, with
@CULLYAntoine
and myself. 🧵👇
I am extremely excited to announce (1) I've joined OpenAI to lead a large-scale effort into AI-generating Algorithms research, & (2) I'll be an Associate CS Professor at U. British Columbia in 2021, where I will continue to lead the OpenAI project. Both are dreams come true! 1/2
Excited to announce I have joined DeepMind as a Senior Research Advisor! I will be working with many fantastic people there on AI-generating algorithms & open-ended learning, one of many areas where
@DeepMind
is a leader. My roles at UBC & the Vector Institute will stay the same.
Introducing Video PreTraining (VPT): it learns complex behaviors by watching (pretraining on) vast amounts of online videos. On Minecraft, it produces the first AI capable of crafting diamond tools, which takes humans over 20 minutes (24,000 actions) 🧵👇
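The VPT recipe in this thread (train an inverse dynamics model on a small labeled set, pseudo-label the internet-scale videos, then behavior-clone) can be caricatured in a few lines. This is a toy sketch with lookup tables standing in for the deep nets; the function names and the toy data are illustrative, not the released code:

```python
from collections import defaultdict, Counter

def train_idm(labeled_transitions):
    """Fit a trivial inverse dynamics model: a lookup from (s, s') -> action."""
    idm = {}
    for s, s_next, a in labeled_transitions:
        idm[(s, s_next)] = a
    return idm

def vpt(labeled_transitions, unlabeled_videos):
    """Toy VPT pipeline:
    1) train an inverse dynamics model (IDM) on a small labeled set,
    2) pseudo-label large unlabeled videos with the IDM,
    3) 'behavior-clone' a policy: state -> most common pseudo-labeled action.
    """
    idm = train_idm(labeled_transitions)
    counts = defaultdict(Counter)
    for video in unlabeled_videos:
        for s, s_next in zip(video, video[1:]):
            a = idm.get((s, s_next))
            if a is not None:
                counts[s][a] += 1
    return {s: c.most_common(1)[0][0] for s, c in counts.items()}

# Tiny usage example: states are positions, actions move left/right.
labeled = [(0, 1, 'right'), (1, 0, 'left'), (1, 2, 'right'), (2, 1, 'left')]
videos = [[0, 1, 2, 1], [1, 2, 1, 0]]
policy = vpt(labeled, videos)  # {0: 'right', 1: 'right', 2: 'left'}
```

The key design point survives even in this caricature: action labels are expensive, but unlabeled video is abundant, so a small IDM unlocks the large corpus.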
My favorite example of when AI surprised us, the scientists wielding it. A corner case in our algo challenged AI to learn to walk without touching its feet to the ground. Many more from
@joelbot3000
with
@jb_mouret
@CULLYAntoine
Cully et al., Nature 2015
Thrilled to share that "First return, then explore" appears today in Nature! Go-Explore solves all unsolved Atari games*, ending a long quest by the field that began in Nature. Led by
@AdrienLE
&
@Joost_Huizinga
w
@joelbot3000
@kenneth0stanley
& myself.
1/
Introducing POET: it generates its own increasingly complex, diverse training environments & solves them. It automatically creates learning curricula & training data, & potentially innovates endlessly! By
@ruiwang2uiuc
w/
@joelbot3000
&
@kenneth0stanley
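POET's outer loop of paired environments and agents can be sketched in miniature. Below is a deliberately tiny caricature where an "environment" is just a difficulty scalar and an "agent" a skill scalar; the constants, the minimal-criterion check, and all names are illustrative assumptions, not the paper's implementation:

```python
import random

def poet(n_generations=30, pop_size=4, seed=0):
    """Toy POET-style loop: optimize each agent in its paired environment,
    mutate environments to propose harder children, and admit a child only if
    it passes a minimal criterion (not trivially easy, not hopelessly hard)."""
    rng = random.Random(seed)
    pairs = [(1.0, 0.0)]  # list of (env_difficulty, agent_skill)
    for _ in range(n_generations):
        # Optimize: each agent's skill moves toward its environment's difficulty.
        pairs = [(env, skill + 0.25 * (env - skill)) for env, skill in pairs]
        # Mutate: propose a harder child of a randomly chosen environment.
        parent_env, _ = rng.choice(pairs)
        child_env = parent_env * (1.0 + rng.uniform(0.1, 0.5))
        # Minimal criterion: the best current agent must be able to make
        # progress in the child (close enough) without already solving it.
        best = max(skill for _, skill in pairs)
        if 0.5 * child_env < best < child_env and len(pairs) < pop_size:
            pairs.append((child_env, best))  # transfer the best agent in
    return pairs
```

Even this caricature shows the signature behavior: environments only get admitted once agents are skilled enough to learn from them, so difficulty ratchets up in step with capability.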
Can AI agents design better AI agents?
We describe a newly forming research area "Automated Design of Agentic Systems" (ADAS) that aims to automatically design powerful agents & a new method where a meta agent invents new agents by programming them in code. Led by
@shengranhu
🧵
Introducing Thought Cloning: AI agents learn to *think* & act like humans by imitating the thoughts & actions of humans thinking out loud while acting, enhancing performance, efficiency, generalization, AI Safety & Interpretability. Led by
@shengranhu
1/5
Today is my last day at Uber. It's been a dream to have our startup acquired by Uber &, with wonderful colleagues, create Uber AI Labs. I'm proud of our work, eg. Go-Explore, POET, AI-GAs, GTNs, ANML, Differentiable Plasticity, Backpropamine, Deep Neuroevolution, PPGNs, etc. 1/3
Introducing The AI Scientist! 🧪🔬🔭 It creates research ideas & experiments, writes any necessary code, runs experiments, plots & analyzes data, writes an ENTIRE science manuscript, & performs peer review! Then it builds on "published" discoveries. Fully automated. A new era in science? 🧵👇
Introducing The AI Scientist: The world’s first AI system for automating scientific research and open-ended discovery!
From ideation, writing code, running experiments and summarizing results, to writing entire papers and conducting peer-review, The AI
Introducing Generative Teaching Networks, which generate entirely synthetic data that is up to 9x faster to train on than real data, enabling state-of-the-art Neural Architecture Search. Led by
@felipesuch
w
@kenneth0stanley
,
@joelbot3000
, &
@AdityaRawaI
1/
Delighted to share that I've been promoted to Professor (aka “Full Professor”). A huge thanks to my wife, students, collaborators, colleagues, family, & friends for everything. It's been an exhilarating, wondrous, fascinating climb, and what a view! Now, which peak to climb next?
We're hiring software engineers, research engineers, & research scientists on the multi-agent team
@OpenAI
! Interested in AI-generating algorithms, many interacting agents, open-ended algorithms, automatically generating training environments, deep RL etc?
Reviewers in my experience almost always insist on SOTA for publication. This is the result. In doing so, we are asking to be lied to via p-hacking and brittle, non-reproducible results. We also choke off promising directions that are not immediately better, which is ridiculous!
Reviewing NeurIPS papers, one thing is clear: most of us think our work needs to be better than others' to get published, and we sell (modify + highlight + hide) it to convince the reader, putting science second.
We have open sourced the code behind the neuroevolution papers described in our blog post. This includes the code for the deep genetic algorithm that is competitive for Deep RL. We hope this helps the community. Please let us know if you try them.
I am recruiting PhD students at UBC. I especially encourage applications from those underrepresented in our field. Interested in AI-GAs, meta-learning, generating learning challenges, deep RL, open-endedness, QD, exploration, generalization, continual learning, etc? Please apply!
Learning to Continuously Learn. ANML meta-learns to reduce catastrophic forgetting, and can learn at least 600 tasks (Omniglot classes) sequentially, performing well on most afterwards. arXiv / NeurIPS talk. Led by Shawn Beaulieu 1/2
I agree. "I'm amazed that people confidently pronounce these things are not sentient, and when you ask them what they mean by sentient they say well they don't really know. So how can you be confident they're not sentient if you don't know what sentient means? "-Geoff Hinton 1/2
🤯 Full body tracking now possible using only WiFi signals
A deep neural network maps the phase and amplitude of WiFi signals to UV coordinates within 24 human regions
The model can estimate the dense pose of multiple subjects by utilizing WiFi signals as the only input
🧵
AI-GAs: AI-generating algorithms, an alternate paradigm for producing general artificial intelligence
Historically, hand-designed pipelines are ultimately outperformed by entirely learned ones. Will that be true of creating general AI itself? 1/6
Introducing the Synthetic Petri Dish for architecture search: Creates tiny NN variants in a "dish" & optimizes synthetic data to make variants perform like their full NN counterparts. Then rapid NAS search in the dish. Generalizes better than NN models! 1/
Here is the final link to my talk at the ICLR BeTR-RL workshop on AI-GAs and Learning to Continually Learn (with ANML). Thanks to the workshop organizers and my collaborators!
#ICLR2020
#iclr
Excited to introduce Intelligent Go-Explore: Foundation model (FM) agents have the potential to be invaluable, but struggle to learn hard-exploration tasks!
Our new algorithm drastically improves their exploration abilities via Go-Explore + FM intelligence. Led by
@cong_ml
🧵1/
I think biological development is the most impressive technology in the known universe. Can you think of anything more impressive? A seed (or egg) uses nanotech to gather all the resources needed to self-assemble molecules into a jaguar, hawk, oak tree, whale or human. Amazing!
A human body is so wonderfully nested. Its ~40T cells descend from individual eukaryotic cells before multi-cellularity. And each has ~1000 mitochondria, which were free-living bacteria before endosymbiosis. And all of it is home to 1-3X as many bacteria in the nooks and crannies
Go-Explore now solves all unsolved Atari games*, handles stochastic training throughout via goal-conditioned policies, reuses skills to intelligently explore after returning, and solves hard-exploration simulated robotics tasks! New paper led by
@AdrienLE
&
@Joost_Huizinga
1/6
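The "first return, then explore" loop behind Go-Explore can be sketched in a few lines. This is a toy sketch on a deterministic, resettable environment; the real system adds goal-conditioned return policies, robustification, and learned cell representations, and every name below is illustrative:

```python
import random

def go_explore(env_reset, env_step, cell_fn, n_iterations=200, explore_len=20, seed=0):
    """Minimal Go-Explore-style archive loop.
    Archive maps cell -> (action trajectory reaching it, score along the way).
    Each iteration: pick an archived cell, *return* to it by replaying its
    trajectory, then *explore* randomly from there, archiving new cells."""
    rng = random.Random(seed)
    archive = {cell_fn(env_reset()): ([], 0.0)}
    for _ in range(n_iterations):
        cell = rng.choice(list(archive))
        actions, score = archive[cell]
        actions = list(actions)
        state = env_reset()
        for a in actions:                 # first return...
            state, _ = env_step(state, a)
        for _ in range(explore_len):      # ...then explore
            a = rng.choice([0, 1])
            state, r = env_step(state, a)
            actions.append(a)
            score += r
            c = cell_fn(state)
            if c not in archive or archive[c][1] < score:
                archive[c] = (list(actions), score)
    return archive

# Toy environment: a 1-D corridor; action 1 steps right, 0 steps left,
# with a reward at position 10.
def reset():
    return 0

def step(pos, action):
    new_pos = pos + (1 if action == 1 else -1)
    return new_pos, (1.0 if new_pos == 10 else 0.0)

archive = go_explore(reset, step, cell_fn=lambda s: s)
```

The archive is the crucial design choice: remembering and returning to promising states prevents the "detachment" and "derailment" failures that plague naive intrinsic-motivation exploration.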
My picks are included. What are yours?
Mine: LRL by
@janexwang
et al. (see also RL^2) and Reversible Learning by
@DougalMaclaurin
@ryan_p_adams
@DavidDuvenaud
. Both papers helped inspire AI-Generating Algorithms & its bet on meta-learning, GTNs, & ANML!
Delighted to announce my lab
@UBC_CS
was awarded a substantial grant from Open Philanthropy for work on AI alignment, safety, & existential risk. I want to make the development of powerful AI & AGI go as well as possible for humanity, & will allocate more time to these key topics
Introducing Enhanced POET: Open-Ended Reinforcement Learning through Unbounded Invention of Learning Challenges and their Solutions. Adds domain-general metrics of environment novelty and progress in open-ended algorithms + a new environment search space 1/
Generating multi-modal robot behavior (neural networks that perform many different tasks well) via the new Combinatorial Multi-Objective Evolutionary Algorithm (CMOEA). Very proud of this work. Led by Joost Huizinga
@joosthuizinga
Key idea: the more stepping stones the better!
Excited to share OMNI: Open-endedness via Models of human Notions of Interestingness. Lead:
@jennyzhangzt
Open-ended learning requires a vast space of possible tasks, but search can thus get lost. Agents must focus only on *interesting* tasks, but how? 🧵1/
Very cool to see a replication of our paper "Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning"! Includes step-by-step instructions for AWS. Great work
@TDataScience
Here is a new talk entirely on ANML (a Neuromodulated Meta-Learning Algorithm)
@reworkAI
ANML meta-learns to reduce catastrophic forgetting, and can learn at least 600 tasks sequentially!
paper: Learning to Continually Learn ()
#AI
Yann LeCun
@ylecun
et al. publishing evolutionary algorithm tools. Welcoming the era of deep neuroevolution indeed! () Great to see the traditional ML community adopt these tools in the cases when they are useful.
This library from
@ylecun
’s lab implements benchmarked versions of evolutionary computation algorithms such as Differential Evolution, Fast Genetic Algorithm, CMA-ES, and Particle Swarm Optimization. 🦎🐞🐜
My AI-Generating Algorithms paper described "Darwin-complete" environment search spaces: those that can represent any environment. I suggested one then: neural network world models. Another is code (e.g. LM-generated code for environments/simulators/games)
One amazing thing Genie enables: anyone, including children, can draw a world and then *step into it* and explore it!! How cool is that!?! We tried this with drawings my children made, to their delight. My child drew this, and now can fly the eagles around. Magic!🧞✨
Excited to share a project I've been working on at DeepMind. It's very cool & futuristic, with great potential for generating interactive worlds & agent training environments. Congrats to the team for the amazing work, especially leads
@ashrewards
@jparkerholder
&
@_rockt
🧵🚀👇🏽
I am extremely honored to receive the Presidential Early Career Award for Scientists and Engineers (PECASE) from the White House: Thanks to all of my wonderful collaborators and mentors throughout my career, without whom this would not be possible.
Join us! Our team at OpenAI is now hiring research scientists and engineers. Interested in AI-generating algorithms, multiple interacting agents, open-ended algorithms, automatically generating training environments, deep RL, and more? Please apply!
In the ICLR Debate I introduced an alternate path to general AI: AI-generating algorithms (AI-GAs), and the Three Pillars required to make AI-GAs. Full debate (on how much we should learn vs. build in): w Tenenbaum, Precup, Kaelbling,
@suchisaria
Thoughts?
Our paper was accepted at ICLR. Backpropamine: training self-modifying neural networks with differentiable neuromodulated plasticity. Very exciting work led by
@ThomasMiconi
, a pioneer in this area. With Aditya Rawal,
@kenneth0stanley
, and myself
@iclr2019
I, with 3 Turing Award winners (including
@geoffreyhinton
, Bengio, Yao), Nobel laureate
@kahneman_daniel
, & other leaders in AI and AI governance, release a paper today recommending specific government & company actions on AI development to minimize risk
Agreed. I remember when I told journalists in 2016 AI would be able to generate any image you (or bad actors) want from a text description and they thought I was nuts. I also predicted videos and entire virtual worlds would come later, which they will! Strap in, society.
I greatly enjoyed giving a guest lecture in
@chelseabfinn
's meta-learning class at Stanford. Thanks to Chelsea for the invitation & for making all of the lectures available for everyone! Mine is a deep dive into AI-GAs, GTNs, differentiable plasticity/backpropamine, ANML, & POET.
Want to learn about meta-learning? Lecture videos for CS330 are now online!
Topics incl. MTL, few-shot learning, Bayesian meta-learning, lifelong learning, meta-RL & more:
+ 3 guest lectures from Kate Rakelly,
@svlevine
,
@jeffclune
Today we are open sourcing a new tool called VINE that visualizes deep neuroevolution populations learning over time. Project led by Rui Wang and Ken Stanley. We hope the community benefits from it! via
@ubereng
Introducing a new, totally different type of meta-learning: gradient-based Hebbian learning (stores info in weights, not activations). Congrats to
@ThomasMiconi
! With
@kenneth0stanley
Differentiable Plasticity: A New Method for Learning to Learn
@ubereng
Slides now available for our ICML Tutorial on Population-Based Methods for Training Deep Neural Networks: Novelty Search, Quality Diversity, Open-Ended Search Algorithms, & Indirect Encoding. w
@kenneth0stanley
&
@joelbot3000
#icml2019
Video soon.
Introducing: First-Explore, then Exploit: Meta-Learning Intelligent Exploration. Led by
@BenNorman451
Humans are masters at exploring. Unlike RL, we do not explore by trying to maximize reward (with noise), but instead explore to gain information! 1/8
Excited to share a project I have been working on at DeepMind. Congrats to the entire, wonderful SIMA team! I look forward to hearing what the community thinks of this advance.
Introducing SIMA: the first generalist AI agent to follow natural-language instructions in a broad range of 3D virtual environments and video games. 🕹️
It can complete tasks similar to a human, and outperforms an agent trained in just one setting. 🧵
Deep Curiosity Search introduces a new type of intrinsic motivation for deep RL: intralife exploration. Just rewarding agents to 'go somewhere new' can dramatically increase performance, tying state of the art on Montezuma’s Revenge Congrats Chris Stanton!
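The core intralife-novelty idea in that tweet fits in a few lines: within one lifetime (episode), reward the agent only the first time it reaches each place, so "going somewhere new" is directly rewarded. This is a toy sketch; the paper tracks coverage over a coarse grid of agent positions with deep RL on top, and the helper name here is made up:

```python
def curiosity_rewards(visited_positions):
    """Intralife curiosity bonus: +1 for each position the first time it is
    visited within this lifetime, 0 for revisits."""
    seen = set()
    bonuses = []
    for pos in visited_positions:
        bonuses.append(1.0 if pos not in seen else 0.0)
        seen.add(pos)
    return bonuses

# Usage: an agent that loops gets no further bonus, nudging it elsewhere.
curiosity_rewards([0, 1, 1, 2, 0])  # -> [1.0, 1.0, 0.0, 1.0, 0.0]
```

Note the contrast with across-lifetime novelty: here the novelty counter resets every episode, so each new agent is re-incentivized to cover the whole state space itself.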
I am really excited to reveal what
@GoogleDeepMind
's Open Endedness Team has been up to 🚀. We introduce Genie 🧞, a foundation world model trained exclusively from Internet videos that can generate an endless variety of action-controllable 2D worlds given image prompts.
"API for accessing new AI models developed by OpenAI. Unlike most AI systems which are designed for one use-case, the API today provides a general-purpose “text in, text out” interface, allowing users to try it on virtually any English language task."
Update: Go-Explore remains state of the art on Montezuma’s & Pitfall when *tested* with sticky actions. Should we *require* stochastic training? It depends on the research objectives. Please read the newly added section. A healthy debate created by Go-Explore.
Introducing Fiber, a new open source platform for distributed machine learning, especially population-based methods like Enhanced POET. Program locally with a standard multiprocessing API then deploy to thousands of workers on any cluster. Led by
@_calio
I often have ideas for startups, research papers, short stories, etc. I am going to start posting them here to share them, see what people think, and in the hopes of maybe inspiring someone to do something related (or even implement the idea). I'll tag things with
#someBodyDoThis
Momentum continues to build in Deep Neuroevolution. Another five-paper blog post, this time by Sentient. Congrats to Risto and all on the team!
Evolution is the New Deep Learning
Creating a Zoo of Atari-Playing Agents to Catalyze the Understanding of Deep Reinforcement Learning. Great work led by Joel Lehman
@joelbot3000
, w/ many excellent collaborators from Uber AI Labs, OpenAI, & Google Brain. Blog: Code:
I am delighted to report that Thought Cloning was not only accepted to NeurIPS
@NeurIPSConf
, but awarded a spotlight. Thanks to the AC, reviewers, and community. A huge congrats to
@shengranhu
!
#NeurIPS
#neurips2023
I often have (what I think are) great ideas for startups (especially AI startups). Has someone created a way to suggest ideas to would-be entrepreneurs and then get a small bit of equity if they work on your idea? If not, is that itself a good idea (a site that does that)?
How can I get rid of all the AI clickbait in my feed? It's mostly from blue checkmark folks. Is there a way to ban/ignore/filter anyone who has one? Are the rest of you inundated with annoying click-baity AI Tweets?
Here's the video of my ICML Continual Learning Workshop talk: "Learning to Continually Learn" introduces AI-GAs as a path to AGI and focuses on ANML as an example of that research paradigm that works well to minimize catastrophic forgetting.
#ICML2020
Just realized our Deep Neuroevolution paper is on the front page of Hacker News! That's fun. Also the most talked about ML paper on Twitter this week according to
@karpathy
's arXiv-sanity. Thanks all for your interest in our work!
I will be hiring at both institutions, so please watch this space for details on those opportunities if interested. A huge thanks to everyone at both OpenAI and UBC for these wonderful opportunities! I could not be more excited about both! 2/2
Don't aim for success if you want it; just do what you love and believe in, and it will come naturally. - David Frost
(via the excellent book Why Greatness Cannot be Planned by
@kenneth0stanley
and
@joelbot3000
).
Accelerating Deep Neuroevolution: Train Atari in Hours on a Single Personal Computer! What took ~1 hour on 720 CPUs now takes only ~4 hours on a *single* modern desktop. Code is open source. Awesome work by
@felipesuch
with
@kenneth0stanley
via
@ubereng
In honor of
@UberAILabs
we wanted to share this quote from Jeff Clune (
@jeffclune
). We hope all researchers & developers can find a good place to work & continue their research in this pandemic.
Your unique contributions to our field will always be remembered.
#uberlabs
We are beginning to see how powerful AI can catalyze science. In a future coming to you soon: ask the computer to conduct an experiment, interactively probe the results, and then ask for the next experiment, iterating quickly without writing a single line of code or pipetting!
GPT-3 Does The Work™️ on generating SVG charts, with a quick web app I built with
@billyjeanbillyj
. With a short sentence describing what you want to plot, it's able to generate charts with titles, labels, and legends from about a dozen primed examples.
cc
@gdb
Fiber, an open source platform for distributed machine learning, esp population-based methods like Enhanced POET. Program locally with a standard multiprocessing API then deploy to thousands of workers on any cluster.
@_calio
@ruiwang2uiuc
@kenneth0stanley
Nice new article on novelty search, QD, open-ended algos, and AI-generating algorithms, including connections to Go-Explore, AlphaStar, IT&E, & more. Thanks
@SilverJacket
for writing it and to
@RaiaHadsell
,
@maxjaderberg
, &
@togelius
for nice comments.
Another great example of AI outsmarting us. Here scientists took a robot that could solve tasks by picking up a box and then disabled its gripper to see if it could adapt to push the box around. It instead figured out how to pick the box up anyway!
Great article about AI-generating Algorithms ("getting AI to make itself"), including POET, GTNs, & ANML. Also includes work & quotes from
@ruiwang2uiuc
(POET), Esteban Real (AutoML), &
@janexwang
(LRL), plus great work by
@OpenAI
Thanks for the fun interview & article
@strwbilly
!
Introducing a new algorithm that automatically, intelligently adjusts exploration vs. exploitation for
#DeepRL
. NSRA-ES exploits until stuck, then increasingly explores. It's newly added to 'Improving Exploration in ES'. Great work on it, Vashisht Madhavan!
This sheep escaped a farm and spent 6 years in the mountains, during which time he grew 60 pounds of wool. Wolves tried to eat him, but their teeth could not penetrate the floof. You don't have to turn hard to survive the wolves, just be really, really soft and fluffy.
We are deeply honored to receive the Outstanding Paper of the Decade award from ISAL
@alifeofficial
for The Evolutionary Origins of Modularity (Proceedings of the Royal Society B, 2013). The work was co-led by
@jb_mouret
& with
@hodlipson
. Thanks Alife community!
Very interesting new work on open-endedness and AI-Generating Algorithms from
@maxjaderberg
et al.
@DeepMind
, with special emphasis on Pillar 3 (automatically generating environments, like in POET). Congrats. I look forward to reading it in detail!
Very excited to release our new work: Open-Ended Learning Leads to Generally Capable Agents. tldr; algorithm that dynamically shapes task distributions to train agents on huge task space, resulting in surprisingly general behaviour
Thread: (1/n)
Open source code now available for ANML, from Learning to Continually Learn (Beaulieu et al., ECAI 2020). ANML meta-learns how to learn without catastrophic forgetting, across up to 600 sequential tasks. Talk:
Excited to give a talk at Stanford today to
@chelseabfinn
's meta-learning and multi-task learning class (& open to all at Stanford). Thanks Chelsea for the invitation. "How Meta-Learning Could Help Us Accomplish Our Grandest AI Ambitions, and Early, Exotic Steps in that Direction"
A friend of mine has cancer. I generated AI images of her as a superhero fighting cancer and she loved them. She asked if I could produce a 3D-printed version. Anyone know a talented 3D-blender-type artist that also knows how to 3D print their creations that I could hire?
Being an AI researcher as adult means reliving the video game part of my childhood in order. First Atari, then Super Mario, now streetfighter. Looking forward to Goldeneye next!
The MAME RL Algorithm Training Toolkit: A Python library that can provide a Gym-like API around almost any old arcade game. They show how to set up new ROMs, and provide an RL example for “Street Fighter III Third Strike: Fight for the Future (Japan version)”
Secretly, our Enhanced POET project was an attempt to move closer to making this picture from
@hardmaru
's ES blog post a reality. We use the same algorithm, and now have the cliff jumping! Next up: automating the generation of environments with parachutes!
My MIT talk is online. Thanks to Josh Tenenbaum for the nice introduction. "Improving Deep RL via Quality Diversity, Open-Ended, and AI-Generating Algorithms" (also covers VPT and how that fits into the AI-GA paradigm). Thanks
@cedcolas
for organizing!
Seeking a postdoc to join my lab at UBC! Interested in combining deep RL & large language models to advance open-endedness? Candidates should have pubs in one of: RL, large models, open-endedness, or related areas. 1-2+-year positions in paradise! Apply:
I am delighted to announce that I have been appointed a Canada CIFAR AI Chair. Thanks to everyone at CIFAR and the Vector Institute. Congrats to the other appointees, including
@VeredShwartz