Robert Lange

@RobertTLange

8,206 Followers · 558 Following · 263 Media · 490 Statuses

Founding Research Scientist @SakanaAILabs 🔬AI Scientist 🧬gymnax 🏋️ evosax 🦎 MLE-Infra 🤹 Ex: SR @Google DM. Legacy DeepMind Intern.

TKY/BLN
Joined April 2017
Pinned Tweet
@RobertTLange
Robert Lange
2 months
🎉 Stoked to share The AI-Scientist 🧑‍🔬 - our end-to-end approach for conducting research with LLMs including ideation, coding, experiment execution, paper write-up & reviewing. Blog 📰: Paper 📜: Code 💻:
@SakanaAILabs
Sakana AI
2 months
Introducing The AI Scientist: The world’s first AI system for automating scientific research and open-ended discovery! From ideation, writing code, running experiments and summarizing results, to writing entire papers and conducting peer-review, The AI
@RobertTLange
Robert Lange
4 years
🎄I am a big fan of @ylecun's & @alfcnz's Deep Learning course. The attention to detail is incredible and you can feel the love and passion that go into every single course week (my favorites: 7+8 on EBMs)🤗 #feelthelearn 📜: 📽️:
@RobertTLange
Robert Lange
3 years
It's the beginning of a new month - so let's reflect on the core ideas of statistics in the last 50 years ⏳ Great weekend read by @StatModeling & @avehtari covering the core developments, their commonalities & future directions 🧑‍🚀 #mlcollage [17/52] 📜:
@RobertTLange
Robert Lange
4 years
Beautiful overview of Bayesian Methods in ML by @shakir_za at #MLSS2020. Left me pondering many things beyond Bayesian Inference. Thank you Shakir🙏 Quote of the day: “The cyclist, not the cycle, steers.”🚴‍♀️ 🎤 P-I: 🎤 P-II:
@RobertTLange
Robert Lange
4 years
Really happy to share #visualmlnotes ✍️ a virtual gallery of sketchnotes taken at Machine Learning talks 🧠🤓🤖 which includes last week's #ICLR2020. Explore, exploit & feel free to share: 💻 website: 📝 repository:
@RobertTLange
Robert Lange
4 years
how it started / how it's going
@RobertTLange
Robert Lange
4 years
🤖JAX is more than just the 'next cool autodiff library'. The primitives allow us to flexibly leverage XLA and to speed up + vectorize neuroevolution methods 🦎 with minimal engineering overhead. Find out more in my new blog post 📝:
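For context, a minimal sketch (not from the linked post) of the pattern described above: jax.vmap turns a single fitness evaluation into a population-level one, and jax.jit compiles the whole thing via XLA. The toy objective is purely illustrative.

```python
import jax
import jax.numpy as jnp

def fitness(params):
    # Toy objective (illustrative only): distance to an all-ones target.
    return -jnp.sum((params - 1.0) ** 2)

# vmap vectorizes over the population axis; jit compiles for XLA.
population = jax.random.normal(jax.random.PRNGKey(0), (128, 10))
scores = jax.jit(jax.vmap(fitness))(population)  # shape (128,)
```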
@RobertTLange
Robert Lange
4 years
Great tutorial on Meta-Learning by @yeewhye covering optimisation-based, black-box & probabilistic perspectives on learning task invariances at #MLSS2020. Re-watch the videos here: 📺(Part I): 📺(Part II):
@RobertTLange
Robert Lange
2 years
🚀 I am very excited to share gymnax 🏋️ — a JAX-based library of RL environments with >20 different classic environments 🌎, which are all easily parallelizable and run on CPU/GPU/TPU. 💻[repo]: 📜[colab]:
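A minimal usage sketch in the style of the gymnax README (the exact signatures are an assumption and may differ across versions):

```python
import jax
import gymnax

rng = jax.random.PRNGKey(0)
rng, key_reset, key_act, key_step = jax.random.split(rng, 4)

# Instantiate one of the classic environments.
env, env_params = gymnax.make("CartPole-v1")

obs, state = env.reset(key_reset, env_params)
action = env.action_space(env_params).sample(key_act)
obs, state, reward, done, info = env.step(key_step, state, action, env_params)

# reset/step are pure functions, so they vectorize directly:
vmap_reset = jax.vmap(env.reset, in_axes=(0, None))
batch_obs, batch_state = vmap_reset(jax.random.split(rng, 8), env_params)
```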
@RobertTLange
Robert Lange
4 years
There is a lot to wrap your head around in LSTMs🤯. One way of thinking that helped me a lot is the 'conveyor belt' metaphor for the cell state 🧑‍🏭 by @ch402. I put together a little animation 🖼️ Check out the amazing blog post by Chris Olah here✍️:
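The metaphor in one equation (standard LSTM notation): the cell state is only ever gated and added to, so it behaves like a conveyor belt carrying information across timesteps.

```latex
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t,
\qquad \tilde{c}_t = \tanh(W_c x_t + U_c h_{t-1} + b_c)
```

When the forget gate saturates near 1 and the input gate near 0, c_t ≈ c_{t-1}: the belt moves information forward untouched, which is what lets gradients survive over long horizons.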
@RobertTLange
Robert Lange
5 years
What a week 🧠🤓💻! I loved meeting so many of you at #NeurIPS2019 - the ML community is truly wonderful. Check out all my collected visual notes ✍️ & feel free to share:
@RobertTLange
Robert Lange
4 years
The lottery ticket hypothesis 🎲 states that sparse nets can be trained given the right initialisation 🧬. Since the original paper (@jefrankle & @mcarbin) a lot has happened. Check out my blog post for an overview of recent developments & open Qs. ✍️:
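A minimal sketch of the core recipe (magnitude pruning plus rewinding the surviving weights to their initialization); all shapes and numbers are illustrative, not from the post:

```python
import jax
import jax.numpy as jnp

def magnitude_mask(w, sparsity=0.8):
    # Keep the top (1 - sparsity) fraction of weights by magnitude.
    k = max(1, int(w.size * (1.0 - sparsity)))
    thresh = jnp.sort(jnp.abs(w).ravel())[-k]
    return (jnp.abs(w) >= thresh).astype(w.dtype)

key = jax.random.PRNGKey(0)
w_init = jax.random.normal(key, (256, 256))       # weights saved at initialization
w_trained = w_init + 0.1 * jax.random.normal(jax.random.PRNGKey(1), (256, 256))
mask = magnitude_mask(w_trained)                  # prune after training
w_ticket = mask * w_init                          # rewind survivors to init & retrain
```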
@RobertTLange
Robert Lange
2 years
🚀 How can meta-learning, self-attention & JAX power the next generation of Evolutionary Optimizers 🦎? Excited to share my @DeepMind internship project and our #ICLR2023 paper ‘Discovering Evolution Strategies via Meta-Black-Box Optimization’ 🎉 📜:
@RobertTLange
Robert Lange
4 years
Want to learn more about the power of the implicit function theorem, DEQs, Neural ODEs & Diff. Optim.? Check out the outstanding #NeurIPS2020 tutorial by @DavidDuvenaud & @SingularMattrix, including the docs📜, recording📽️ & JAX code👩‍💻:
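The workhorse result behind DEQs in one line: for a fixed point z* = f(z*, x), implicit differentiation yields the Jacobian without backpropagating through the solver iterations.

```latex
% Differentiate z^* = f(z^*, x) on both sides and solve:
\frac{\partial z^*}{\partial x}
  = \left(I - \partial_z f(z^*, x)\right)^{-1} \partial_x f(z^*, x)
```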
@RobertTLange
Robert Lange
4 years
JAX sometimes has me feeling like a kid in a candy store 🍭 Here is a small example of how to sample batches of Ornstein-Uhlenbeck process realisations combining lax.fori_loop, jit & vmap 🚀 Auto-vectorisation made intuitive and scalable 🤗
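The original snippet was an image and did not survive scraping, but a sketch in the same spirit (Euler-Maruyama discretization of the OU SDE; all parameters illustrative) might look like:

```python
from functools import partial

import jax
import jax.numpy as jnp

@partial(jax.jit, static_argnames="num_steps")
def ou_path(key, num_steps=100, x0=0.0, theta=1.0, mu=0.0, sigma=0.5, dt=0.01):
    # Euler-Maruyama discretization of dX = theta * (mu - X) dt + sigma dW.
    noise = jax.random.normal(key, (num_steps,))

    def body(t, x):
        return x.at[t + 1].set(
            x[t] + theta * (mu - x[t]) * dt + sigma * jnp.sqrt(dt) * noise[t]
        )

    x = jnp.zeros(num_steps + 1).at[0].set(x0)
    return jax.lax.fori_loop(0, num_steps, body, x)

# vmap over keys -> a batch of independent OU realisations.
keys = jax.random.split(jax.random.PRNGKey(0), 32)
paths = jax.vmap(ou_path)(keys)  # shape (32, 101)
```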
@RobertTLange
Robert Lange
5 years
Great #NeurIPS2019 tutorial kick-off by @EmtiyazKhan! Showing the unifying Bayesian Principle bridging Human & Deep Learning. Variational Online Gauss-Newton (VOGN; Osawa et al., ’19) = A Bayesian Love Story ❤️
@RobertTLange
Robert Lange
3 years
🎉 Excited to share `mle-monitor` - a lightweight ML experiment protocol and tool for monitoring resource utilization 📝 It covers local machines/servers and Slurm/Grid engine clusters 📉 💻 [repo]: 📜 [colab]:
@RobertTLange
Robert Lange
4 years
📈 What functions do ReLU nets 'like' to learn? 🌈 Using Fourier analysis Rahaman et al. (19') reveal their bias to learn low frequency modes first. Insights for implicit regularization & adv. robustness. #mlcollage [3/52] 📝: 💻:
@RobertTLange
Robert Lange
4 years
🥳Really excited to be attending #MLSS2020 . Great set of talks by @bschoelkopf & Stefan Bauer starting from 101 causality to Representation Learning for Disentanglement 💯! Re-watch them here: 📺 (Part I): 📺 (Part II):
@RobertTLange
Robert Lange
3 years
How to train your d̶r̶a̶g̶o̶n̶ ViT? 🐉 Steiner et al. demonstrate that augmentation & regularization yield model performance comparable to training on 10x data. Many 💵-insights for practitioners. 🎨 #mlcollage [30/52] 📜: 💻:
@RobertTLange
Robert Lange
3 years
🚀 Happy to share my hyperparameter search tool: `mle-hyperopt` - a lightweight API covering many strategies with search space refinement 🪓, configuration export 📥 & storage/reloading of previous logs 🔄 💻[repo]: 📜[colab]:
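A hypothetical ask/tell loop in the style of the repo's README (the constructor arguments and search-space spec are assumptions; check the docs for the exact interface):

```python
from mle_hyperopt import RandomSearch

def train(lrate, arch):
    # Stand-in for a real training run returning a validation score.
    return lrate if arch == "mlp" else 2 * lrate

# Assumed search-space spec: one real and one categorical variable.
strategy = RandomSearch(
    real={"lrate": {"begin": 1e-4, "end": 1e-1, "prior": "log-uniform"}},
    categorical={"arch": ["mlp", "cnn"]},
)

configs = strategy.ask(5)               # propose 5 configurations
scores = [train(**c) for c in configs]  # evaluate each configuration
strategy.tell(configs, scores)          # log results for refinement/export
```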
@RobertTLange
Robert Lange
4 years
Friday optimization revelations📉: My life needs more theoretical guarantees & convex + linear =❤️. Enlightening set of talks by @BachFrancis at #MLSS2020. Recordings can be found here: 📽️(Part I): 📽️(Part II):
@RobertTLange
Robert Lange
3 years
🎉 Happy to share a mini-tool that I have been using on a daily basis: `mle-logging` - a lightweight logger 📉 for ML experiments, which makes it easy to aggregate logs across configurations & random seeds 🌱 💻 [repo]: 📜 [colab]:
@RobertTLange
Robert Lange
5 years
A concise & detailed intro to Reinforcement Learning! Thank you @katjahofmann for this great #NeurIPS2019 tutorial!
@RobertTLange
Robert Lange
4 years
🥳 New tooling blog post coming your way 🚆 'A Machine Learning Workflow for the iPad Pro' - including my favourite apps, routines and pipelines for working with remote machines and @Raspberry_Pi 💽👨‍💻. ✍️: 🤗: Thanks @tech_crafted for the inspiration!
@RobertTLange
Robert Lange
5 years
Puuuh. What are you up to these days? 💭 I try to stay sane, clean my place 🧹& write✍️. Today's edition - 'Getting started with #JAX'. Learn how to embrace the 'jit-grad-vmap' powers 💻 and code your own GRU-RNN in JAX. Stay safe & home. 🤗
@RobertTLange
Robert Lange
3 years
💓 N-Beats is a pure Deep Learning architecture for 1D time series forecasting 📈 It achieves M3/M4/tourism SOTA by combining learned/interpretable basis functions 🧑‍🔬 w. residual stacking & ensembling 🎨 #mlcollage [38/52] 📜: 💻:
@RobertTLange
Robert Lange
5 years
Looking to get started with the @kaggle ARC challenge & want to learn about psychometric/ability-based assessment of intelligent systems? Check out my blog post which provides an intro to "On the measure of intelligence" & the corpus by @fchollet 🤖🧠🎉 👉
@RobertTLange
Robert Lange
5 years
🎉 2019 🎉 was quite the year for Deep Reinforcement Learning. In today's blog post I list my top 10 papers 🦄💻🧠 What was your favourite paper? Let me know!
@RobertTLange
Robert Lange
4 years
Great start to an all-virtual #ICLR2020 & the ‘Causal Learning for Decision Making’ workshop including talks by @bschoelkopf & Lars Buesing 🧠📉👨‍💻. Looking forward to more smooth Q&As and exploring the awesome web interface!
@RobertTLange
Robert Lange
9 months
🎉 Stoked to share that I joined @SakanaAILabs as a Research Scientist & founding member. @yujin_tang & @hardmaru 's work has been very inspirational for my meta-evolution endeavors🤗 Exciting times ahead: I will be working on nature-inspired foundation models & evolution 🐠/🧬.
@SakanaAILabs
Sakana AI
9 months
Excited to announce our seed round! We raised $30M to develop nature-inspired AI in Japan.
@RobertTLange
Robert Lange
3 years
🚀 Happy to share evosax - a JAX-based library of Evolution Strategies (ES) featuring >10 different ES ranging from classics (e.g. CMA-ES, PSO) 🦎 to modern neuroevolution methods (e.g. ARS, OpenES, ClipUp)🤖 💻[repo]: 📜[colab]:
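A minimal ask/tell loop in the style of the evosax README (API details assumed from the v0.x interface; they may have changed since):

```python
import jax
import jax.numpy as jnp
from evosax import CMA_ES

rng = jax.random.PRNGKey(0)
strategy = CMA_ES(popsize=32, num_dims=10)
es_params = strategy.default_params
state = strategy.initialize(rng, es_params)

for _ in range(50):
    rng, rng_ask = jax.random.split(rng)
    # ask: sample a population of candidate solutions.
    x, state = strategy.ask(rng_ask, state, es_params)        # (32, 10)
    # Evaluate in parallel on a toy quadratic objective.
    fitness = jax.vmap(lambda w: jnp.sum((w - 1.0) ** 2))(x)
    # tell: update the search distribution with the fitness scores.
    state = strategy.tell(x, fitness, state, es_params)
```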
@RobertTLange
Robert Lange
7 months
🎉 Happy to share my internship project @GoogleDeepMind 🗼 – purely text-trained LLMs can act as evolutionary recombination operators 🦎 🧬 Our EvoLLM uses LLM backends to outperform competitive baselines. Work done w. @alanyttian & @yujin_tang 🤗 📜:
@RobertTLange
Robert Lange
4 years
Awesome new JAX tutorial by DeepMind 🥳 Covering the philosophy of stateful programs 💭, JAX primitives and more advanced topics such as TPU parallelism, higher-order & per-example gradients ∇. All in all a great resource for every level of expertise🚀 👉
@matteohessel
matteo hessel
4 years
Check out our new JAX101 tutorial to learn about the fundamentals of JAX!
@RobertTLange
Robert Lange
3 years
How well do scalable Bayesian methods 🚀 approximate the true model average? @Pavel_Izmailov et al. (21') provide insights into performance, generalization, mixing & tempering 🌡️ of Bayesian Nets! Hamiltonian MC + 512 TPU-v3 = 💘 #mlcollage [18/52] 📜:
@RobertTLange
Robert Lange
4 years
#MLSS2020 was full of wonderful experiences 🦋 I hope to meet many of you soon & in person. Here are all #visualmlnotes, videos & slides: ✍️: 📼&📚: Thank you 🙏 to all hard working volunteers & organizers - you did awesome 🤗
@RobertTLange
Robert Lange
4 years
Thinking 💭about biological & artificial learning with the help of Marr's 3 levels of analysis. Here are the #visualmlnotes ✍️from Peter Dayan's talk at #MLSS2020 & a little pointer to a nice complementary paper by @jhamrick & @shakir_za: 👉
@RobertTLange
Robert Lange
3 years
🚀 How similar are network representations across layers & architectures? And how do they emerge through training?🤸New blog on Centered Kernel Alignment (@skornblith et al., 2019) & training All-CNN-C in JAX/flax 🤖 📝: 💻:
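For reference, the linear variant of CKA (Kornblith et al., 2019) reduces to a few lines; a self-contained sketch with random activations standing in for real ones:

```python
import jax
import jax.numpy as jnp

def linear_cka(x, y):
    # x: (n, d1), y: (n, d2) activations for the same n examples.
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    hsic = jnp.linalg.norm(y.T @ x, "fro") ** 2
    return hsic / (
        jnp.linalg.norm(x.T @ x, "fro") * jnp.linalg.norm(y.T @ y, "fro")
    )

key1, key2 = jax.random.split(jax.random.PRNGKey(0))
acts_a = jax.random.normal(key1, (512, 64))   # e.g. layer A activations
acts_b = jax.random.normal(key2, (512, 128))  # e.g. layer B activations
print(linear_cka(acts_a, acts_b))             # 1.0 means identical (up to rotation)
```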
@RobertTLange
Robert Lange
2 years
Excited to share that I got to join DeepMind as a research intern ☀️ This has been a dream 💭 which felt out of reach for a long time. Super grateful to the many people that supported me along the way 🤗 Time to do awesome work with @flennerhag , @TZahavy & the discovery team🚀
@RobertTLange
Robert Lange
3 years
📉 GD can be biased towards finding 'easy' solutions 🐈 By following the eigenvectors of the Hessian with negative eigenvalues, Ridge Rider explores a diverse set of solutions 🎨 #mlcollage [40] 📜: 💻: 🎬:
@RobertTLange
Robert Lange
3 years
🗡️Sharpness-Aware Minimization (SAM) jointly optimizes loss value & sharpness, seeking neighborhoods w. uniformly low loss🔍 Generalization & label-noise robustness↑ 🎨 #mlcollage [36/52] 📜: 💻 [JAX]: 💻 [PyTorch]:
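The two-step update at the heart of SAM, as a hedged single-tensor sketch (real implementations normalize per parameter group and wrap an existing optimizer):

```python
import jax
import jax.numpy as jnp

def sam_step(w, loss_fn, lr=0.1, rho=0.05):
    # 1) Ascend to the approximate worst-case point in a rho-ball.
    g = jax.grad(loss_fn)(w)
    eps = rho * g / (jnp.linalg.norm(g) + 1e-12)
    # 2) Descend using the gradient evaluated at the perturbed weights.
    return w - lr * jax.grad(loss_fn)(w + eps)

loss_fn = lambda w: jnp.sum((w ** 2 - 1.0) ** 2)  # toy non-convex loss
w = jnp.array([0.5, -2.0])
for _ in range(100):
    w = sam_step(w, loss_fn)
```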
@RobertTLange
Robert Lange
3 years
SSL joint-embedding training 🧑‍🤝‍🧑 w/o asymmetry shenanigans? 🤯 Zbontar, Jing et al. propose a simple info bottleneck objective avoiding trivial solutions. Robust to small batches + scales w. dimensionality #mlcollage [19/52] 📜: 💻:
@RobertTLange
Robert Lange
3 years
Can artificial agents learn rapid sensory substitution? 👁️🔁👅 Tang* & Ha* introduce a Set Transformer-inspired agent which processes arbitrarily ordered, variable-length sensory inputs 🎨 #mlcollage [33/52] 📜: 🌐: 📺:
@RobertTLange
Robert Lange
6 months
🦎Can we teach Transformers to perform in-context Evolutionary Optimization? Surely! We propose Evolutionary Algorithm Distillation for pre-training Transformers to mimic teachers 🧑‍🏫 🎉 Work done @GoogleDeepMind 🗼with @alanyttian & @yujin_tang 🤗 📜:
@RobertTLange
Robert Lange
3 years
Can NNs only learn to interpolate? @randall_balestr et al. argue that NNs have to extrapolate to solve high-dimensional tasks🔶 Questioning the relation of extrapolation & generalization 🎨 #mlcollage [39/52] 📜: 🎙️ [@MLStreetTalk]:
@MLStreetTalk
Machine Learning Street Talk
3 years
Epic new show out with @ylecun and @randall_balestr where we discuss their recent everything is extrapolation paper, interpolation and the curse of dimensionality, and also dig deep into Randall's work on the spline theory of deep learning. @DoctorDuggar @ecsquendor @ykilcher
@RobertTLange
Robert Lange
5 years
‘Innate everything‘ 🧠🧐🐊 - @hardmaru argues for the importance of finding the right inductive biases in bodies/architectures (WANNs) & prediction/world models (Observational Dropout) - Transferable Skills Workshop #NeurIPS2019
@RobertTLange
Robert Lange
10 months
🎉 Stoked to share NeuroEvoBench – a JAX-based Evolutionary Optimizer benchmark for Deep Learning 🦎/🧬 🌎 To be presented at #NeurIPS2023 Datasets & Benchmarks with @yujin_tang & @alanyttian 🌐: 📜: 🧑‍💻:
@RobertTLange
Robert Lange
4 years
✍️Want to learn more about RL, generalization within & across tasks as well as the 'reward is enough' hypothesis 🌍🔄🤖? Check out a set of thought-provoking talks by @matteohessel, @aharutyu and David Silver at the @M2lSchool ✌️
@RobertTLange
Robert Lange
1 year
🎉 I transitioned from Berlin to the Tokyo 🗼 office for the 2nd half of my @GoogleDeepMind student researcher time! 🤗Deeply thankful to @yujin_tang for all the support leading up to & during my first days in Japan 🇯🇵Everything still feels pretty surreal & I am super grateful!
@RobertTLange
Robert Lange
5 years
People of the world - I just posted a new blog post covering my #CCN2019 experience & many keynote talks. It is fair to say - I had a truly fulfilling time 💻❤️🧠. Thank you to all organizers, volunteers & speakers ( @CogCompNeuro ). [1/2]
@RobertTLange
Robert Lange
4 years
This is a live dashboard 💻 monitoring my compute resources & the status/database of ML experiments 🚀 [more about this at a later point 🤗]. It is built with rich in ca. 10 hours of procrastinative work.
@RobertTLange
Robert Lange
5 years
Yoshua Bengio #NeurIPS2019 - “It is Time for ML to explore Consciousness” 🧠👌🧐
@RobertTLange
Robert Lange
1 year
👋 Come by poster 93 in this morning's #ICLR2023 poster session to chat about our work on Learned Evolution Strategies (LES) 🦎 📝:
@RobertTLange
Robert Lange
5 years
Many gems in @OriolVinyalsML's Deep RL workshop talk at #NeurIPS2019 on AlphaStar. Including scatter connections, imitation-based regularization, the league & the unique problem decomposition.
@RobertTLange
Robert Lange
4 years
Workshop talks by Rich Sutton never fail to inspire 💭. Today’s #ICML2020 Life-Long Learning workshop talk was no different. Exciting ideas about RL agents that learn their own questions & answers in a virtuous cycle 🔴🔄🔵 - all within the General Value Function framework.
@RobertTLange
Robert Lange
1 month
The DLCT @ml_collective talk on The AI Scientist is now available online! Check out the recording 📺 & slides 🧑‍🎨 📺: 📜: Thanks @savvyRL for having us and everyone who attended & asked Qs!
@RobertTLange
Robert Lange
4 months
🎉 Stoked to share our latest work @SakanaAILabs - DiscoPOP 🪩 We leverage LLMs as code-level mutation operators, which improve their own training algorithms. Thereby, we discover various performant preference optimization algorithms using LLM-driven meta-evolution (LLM²) 🔁
@SakanaAILabs
Sakana AI
4 months
Can LLMs invent better ways to train LLMs? At Sakana AI, we’re pioneering AI-driven methods to automate AI research and discovery. We’re excited to release DiscoPOP: a new SOTA preference optimization algorithm that was discovered and written by an LLM!
@RobertTLange
Robert Lange
3 years
Very happy to present our work "On Lottery Tickets and Minimal Task Representations in Deep Reinforcement Learning" today at the #ICLR2021 @neverendingrl workshop. 🎲 + 🤖🔁🌎 Paper 📜: Poster Session 📢 [3 & 10pm CET]: Summary 👇
@RobertTLange
Robert Lange
4 years
Neural net symmetries induce geometric constraints 🔷 which imply conservation laws under ∇-flow 🧑‍🔬 This allows for exact prediction of training dynamics. A Noether’s theorem for NNs — great theoretical work by Kunin et al. (2020) #mlcollage [7/52] 📝:
@RobertTLange
Robert Lange
3 years
✂️Why can we train sparse/subspace-constrained NNs? Larsen et al. derive a theory based on Gordon's Escape Theorem 🧑 → 🌔 & investigate optimized (lottery) subspaces using train data/trajectory info🎲 🎨 #mlcollage [28/52] 📜: 💻:
@RobertTLange
Robert Lange
3 years
⛩️ Gated Linear Networks (Veness et al., 19') are backprop-free & trained online + local via convex programming 🧮 GLNs combat catastrophic forgetting & the linearity allows for interpretable predictions. #mlcollage [15/52] 📜: 💻:
@RobertTLange
Robert Lange
3 years
🎨 Beautiful comic summaries of David Silver's classic Reinforcement Learning course 🤖 by @d4phn3c!
@RobertTLange
Robert Lange
1 month
📢 Two weeks since we released The AI Scientist 🧑‍🔬! We want to take the time to summarize a lot of the discussions we’ve been having with the community, and give some hints about what we are working on! 🫶 We are beyond grateful for all your feedback and the community debate
@RobertTLange
Robert Lange
4 years
4 challenges in lifelong learning 👶-🧑-👵: Formalism, evaluation, exploration & representation. Great start to the Lifelong ML workshop at #ICML2020 by @katjahofmann, @luisa_zintgraf & @contactrika. P.S.: I have never seen such smooth multi-speaker transitions 😎
@RobertTLange
Robert Lange
3 years
🔎 How can one measure the emergence of interpretable concept units in CNNs? @davidbau et al. propose network dissection 💉 based on the agreement of filter activations and segmentation models 🎨 #mlcollage [26/52] 📜: 💻:
@RobertTLange
Robert Lange
4 months
🎉 Do you love JAX-based RL as much as I do? We just published rejax ⚡️ a lightning-fast library of pure JAX RL algos - all jit-, vector- & parallelizable! Enabling high-throughput applications such as meta-evolution 🧬 Work done with @_chris_lu_ & led by @JarekLiesen 🤗
@JarekLiesen
Jarek Liesen
4 months
🥳 I'm releasing Rejax, a lightweight library of fully vectorizable RL algorithms! ⚡ Enjoy lightning-fast speed using jax.jit on the training function 🧬Use vmap and pmap on hyperparameters 🔙 Log using flexible callbacks 🌐 Available @ 📸 Take a tour!
@RobertTLange
Robert Lange
4 years
Nothing better than starting your day with some invertible models 🤠 Great historic review & explanations by @laurent_dinh at #ICLR2020 ! 🤖 Biggest personal takeaway: The power of sparse/triangular Jacobians in determinant computation 📐
@RobertTLange
Robert Lange
1 year
🦎/🧬Learned Evolutionary Optimization (& Rob 😋) are going on tour! Super excited to be giving talks about our recent work on meta-discovering attention-based ES/GA & JAX during the coming days 🎙️ @AutomlSeminar: Today 4pm CET @ml_collective: Tomorrow 7pm CET Come & say hi 🤗
@RobertTLange
Robert Lange
4 months
📺 Exciting talk on the xLSTM architecture and the challenges of questioning the first-mover advantage of the Transformer 🤖 by @HochreiterSepp @scioi_cluster 📜: 💻:
@RobertTLange
Robert Lange
3 years
🥱Tired of tuning On-Policy DRL agents? Andrychowicz et al. trained 250k agents & evaluated hyperparams for >50 choices to make our lives easier 🚀 providing evidence for common DRL wisdom & beyond 🪙 #mlcollage [21/52] 📜: 💻:
@RobertTLange
Robert Lange
5 years
Powerful opening #NeurIPS2019 keynote by @celestekidd ! Many inspirational thoughts from developmental psychology. Curiosity and intrinsic motivation in RL have a lot of work to do.
@RobertTLange
Robert Lange
2 months
🤖 Drop by the AutoRL workshop [Stolz 0 at #ICML2024 ] if you are interested in how LLMs can shape the future of LLM research 🤯 @_chris_lu_ and I are happy to answer any questions!
@SakanaAILabs
Sakana AI
4 months
Can LLMs invent better ways to train LLMs? At Sakana AI, we’re pioneering AI-driven methods to automate AI research and discovery. We’re excited to release DiscoPOP: a new SOTA preference optimization algorithm that was discovered and written by an LLM!
@RobertTLange
Robert Lange
3 years
Can we go beyond backprop + SGD? BLUR (Sandler et al., 21') meta-learns a shared low-dimensional genome 🦎 which modulates bi-directional updates 🔁 It generalizes across tasks + FFW architectures & allows NNs to have many states 🧠 #mlcollage [16/52] 📜:
@RobertTLange
Robert Lange
4 years
A global workspace theory for coordination among neural modules in deep learning🧠🔄 🤖 Goyal et al. (21') propose a low-dim. bottleneck to facilitate synchronisation of specialists & replace costly pairwise attention interactions 🚀 #mlcollage [11/52] 📜:
@RobertTLange
Robert Lange
2 years
🤸Very excited to share evosax 🦎 release v.0.10.0 and a small paper, which covers all features and summarizes recent progress in hardware accelerated & JAX-powered evolutionary optimization! 🧑‍💻: 📜: Many new features... 🧵
@RobertTLange
Robert Lange
4 years
🦋 Meta-Policy Gradients ∇∇ have the power to change how we think about algorithm design 🧠. Learn more about automated online hyperparameter tuning and end-to-end RL objective discovery 🤖 in my new blog post! 📝:
@RobertTLange
Robert Lange
5 years
Workshop talks should push conceptual limits. Fascinating talk by Rich Sutton at the Bio&Artificial RL workshop #NeurIPS2019 #SuperDyna P.S.: I will do my best 🧠🧐✍️
@RobertTLange
Robert Lange
4 years
⏰ Clockwork VAEs by Saxena et al. (21') scale temporally abstract latent dynamics models by imposing fixed clock speeds for different levels 📐 Very cool ablations that extract the level-info content and frequency adaptation 🧠 #mlcollage [10/52] 📜:
@RobertTLange
Robert Lange
3 years
Tuning optimizers is a fundamental part of any DL pipeline 🚂 @robinschmidt_*, @frankstefansch1* & @PhilippHennig5 provide an empirical analysis of 1st-order optimizers across tasks, budgets & schedules 🚀 #mlcollage [25/52] 📜: 💻:
@RobertTLange
Robert Lange
4 years
Thought-provoking talk by @white_martha on the ingredients for BeTR-RL at the #ICLR2020 workshop🌏! Many interesting ideas for generalization in Meta-RL, learning objectives, restricting complex MDPs & auxiliary tasks 🚀🧐
@RobertTLange
Robert Lange
1 month
🎉 Excited to present our work on The AI Scientist later today at DLCT @ml_collective . Will talk about the power & limitations of foundation models in scientific idea creation💡coding 🧑‍💻 writing ✍️ & reviewing 🧑‍⚖️ Drop by and ask all your pressing Qs 🤗 Of course, I am (only)
@ml_collective
ML Collective
1 month
🚨 Don't miss out! Join us tomorrow at 10 AM PDT for DLCT with @RobertTLange as he dives into "The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery." Step into the future of AI-driven research! #AI #DLCT
@RobertTLange
Robert Lange
3 years
How does the RL problem affect the lottery ticket phenomenon 🤖🔁🎲? In our #ICLR2022 spotlight we contrast RL & behavioral cloning tickets, disentangle mask/initialization ticket contributions & analyse the resulting sparse task representations. 🧵👇 📝:
@RobertTLange
Robert Lange
4 years
🎲Randomized autodiff reduces memory requirements of backprop by 'sketching' a sparse linearised computation graph. Check out cool work by @denizzokt , @NMcgreivy , @jaduol1 , @AlexBeatson & @ryan_p_adams #mlcollage [2/52] 📝: 📽️:
@RobertTLange
Robert Lange
7 months
🥱 Training foundation models is so 2023 😋 🚀 Super stoked for @SakanaAILabs first release showing how to combine large open-source models in weight and data flow space! All powered by evolutionary optimization 🦎
@SakanaAILabs
Sakana AI
7 months
Introducing Evolutionary Model Merge: A new approach bringing us closer to automating foundation model development. We use evolution to find great ways of combining open-source models, building new powerful foundation models with user-specified abilities!
@RobertTLange
Robert Lange
10 months
For anyone who didn't catch our (w. @yujin_tang & @alanyttian) poster presentation on the coolest neuroevolution benchmark out there -- feel free to reach out & chat 📩 Would love to discuss evosax, gymnax and the future of evolutionary methods in the LLM era 🤗 #NeurIPS23
@RobertTLange
Robert Lange
10 months
🎉 Stoked to share NeuroEvoBench – a JAX-based Evolutionary Optimizer benchmark for Deep Learning 🦎/🧬 🌎 To be presented at #NeurIPS2023 Datasets & Benchmarks with @yujin_tang & @alanyttian 🌐: 📜: 🧑‍💻:
@RobertTLange
Robert Lange
3 years
Distilling teacher predictions 🧑‍🏫 from unlabelled examples provides an elegant approach to transfer task-specific knowledge 🧠 SimCLR-v2 effectively combines unsupervised pretraining, tuning & distillation #mlcollage [23/52] 📜: 💻:
@RobertTLange
Robert Lange
3 years
❓How to efficiently estimate unbiased ∇ in unrolled optimization problems (e.g. hyperparameter tuning, learned optimizers)?🦎 Persistent ES does so by accumulating & applying correction terms for a series of truncated unrolls. 🎨 #mlcollage [35/52] 📜:
@RobertTLange
Robert Lange
4 years
Trying something new 🎉 - One slide mini-collage of my personal 'paper of the week' 📜 1/52: VQ-VAEs had quite the week in ML 🥑+🪑=🦋 But how do β-VAEs relate to the visual ventral stream? Checkout Higgins et al. (2020) to find out 👉
@RobertTLange
Robert Lange
4 years
🥳I had a great day at the #NeurIPS2020 Meta-Learning workshop 🎤 which included listening to @FrankRHutter, @luisa_zintgraf & @LouisKirschAI. Check out many more fantastic talks & the panel! Thanks to the organizers 🤗 🖥: 🎥:
@RobertTLange
Robert Lange
3 years
🧙 What are representational differences between Vision Transformers & CNNs? @maithra_raghu et al. investigate the role of self-attention & skip connections in aggregation & propagation of global info 🔎 🎨 #mlcollage [32/52] 📜:
@RobertTLange
Robert Lange
3 years
🎉 Excited to share `mle-hyperopt` v0.0.5 - a lightweight hyperparameter optimization tool, which now also features implementations of Successive Halving 🪓, Hyperband 🎸 & Population-Based Training 🦎 📂 Repo: 📜 Colab:
@RobertTLange
Robert Lange
6 months
🧬 Evolution is the ultimate discovery process & its biological instantiation is the only proof of an open-ended process that has led to diverse intelligence! One of my deepest beliefs: A scalable evolutionary computation analogue will open up many new powerful perspectives 🧑‍🔬
@_rockt
Tim Rocktäschel
6 months
@sarahcat21 When evolutionary computation and population based training didn't take off as much as they should have.
@RobertTLange
Robert Lange
3 years
Had a great time at last week's @sparsenn workshop ✂️ Absolutely loved @thoefler's tutorial covering many considerations (what, when, how). Beautiful distillation 🎨 Check out the accompanying survey paper & recording 🤗 📜: 📺:
@RobertTLange
Robert Lange
4 years
What is the right framework to study generalization in neural nets? 🧠🔄🤖 @PreetumNakkiran et al. (21') study the gap between models trained to minimize the empirical & population loss 📉 Providing a new 🔍 for studying DL phenomena #mlcollage [13/52] 📜:
@RobertTLange
Robert Lange
4 years
Synthetic ∇s hold the promise of decoupling neural modules 🔵🔄🔴 for large-scale distributed training based on local info. But what are underlying mechanisms & theoretical guarantees? Check out Czarnecki et al. (2017) to find out. #mlcollage [5/52] 📝:
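The decoupling idea in symbols: a local module M predicts the loss gradient with respect to its own output h, so the upstream module can update immediately instead of waiting for the full backward pass; M itself is later regressed onto the true gradient once it arrives.

```latex
\hat{g} = M(h) \approx \frac{\partial L}{\partial h},
\qquad
\mathcal{L}_M = \Big\| M(h) - \frac{\partial L}{\partial h} \Big\|^2
```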
@RobertTLange
Robert Lange
1 year
🎙️Stoked to present evosax tomorrow at @PyConDE It has been quite the journey since my 1st blog on CMA-ES 🦎 and I have never been as stoked about the future of evo optim. 🚀 Slides 📜: Code 🤖: Event 📅:
@RobertTLange
Robert Lange
3 years
Can memory-based meta-learning not only learn adaptive strategies 💭 but also hard-code innate behavior🦎? In our #AAAI2022 paper @sprekeler & I investigate how lifetime, task complexity & uncertainty shape meta-learned amortized Bayesian inference. 📝:
@RobertTLange
Robert Lange
3 years
Can generative trajectory models replace offline RL 🦾algorithms? Decision Transformers autoregressively generate actions based on trajectory context & a desired return-to-go 🎨 #mlcollage [34/52] 📜: 📺: 🤖:
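Concretely, the Decision Transformer is trained on sequences of return-to-go, state, and action tokens, and at test time is conditioned on the return you want it to achieve:

```latex
\tau = \big(\hat{R}_1, s_1, a_1,\; \hat{R}_2, s_2, a_2,\; \dots\big),
\qquad
\hat{R}_t = \sum_{t'=t}^{T} r_{t'}
```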
@RobertTLange
Robert Lange
5 years
What drives hippocampus-neocortical interactions in memory consolidation? @SaxeLab argues for a top-down perspective & the predictability of the environment. 🧠🤓🌎
@RobertTLange
Robert Lange
3 years
How can we create training distributions rich enough to yield powerful policies for 🦾 manipulation? OpenAI et al. (21') scale asymmetric self-play to achieve 0-shot generalisation to unseen objects 🧊🍴. #mlcollage [14/52] 📜: 💻: