🎉 Stoked to share The AI-Scientist 🧑‍🔬 - our end-to-end approach for conducting research with LLMs including ideation, coding, experiment execution, paper write-up & reviewing.
Blog 📰:
Paper 📜:
Code 💻:
Introducing The AI Scientist: The world’s first AI system for automating scientific research and open-ended discovery!
From ideation, writing code, running experiments and summarizing results, to writing entire papers and conducting peer-review, The AI
🎄I am a big fan of @ylecun's & @alfcnz's Deep Learning course. The attention to detail is incredible and one feels the love and passion that go into every single course week (my favorites: 7+8 on EBMs)🤗 #feelthelearn
📜:
📽️:
It's the beginning of a new month - so let's reflect on the core ideas of statistics in the last 50 years ⏳ Great weekend read by @StatModeling & @avehtari covering the core developments, their commonalities & future directions 🧑‍🚀
#mlcollage [17/52]
📜:
Beautiful overview of Bayesian Methods in ML by @shakir_za at #MLSS2020. Left me pondering about many things beyond Bayesian Inference. Thank you Shakir🙏
Quote of the day: “The cyclist, not the cycle, steers.”🚴‍♀️
🎤 P-I:
🎤 P-II:
Really happy to share #visualmlnotes ✍️ a virtual gallery of sketchnotes taken at Machine Learning talks 🧠🤓🤖 which includes last week's #ICLR2020. Explore, exploit & feel free to share:
💻 website:
📝 repository:
🤖JAX is more than just the 'next cool autodiff library'. The primitives allow us to flexibly leverage XLA and to speed up + vectorize neuroevolution methods 🦎 with minimal engineering overhead. Find out more in my new blog post 📝:
Great tutorial on Meta-Learning by @yeewhye covering optimisation-based, black-box & probabilistic perspectives on learning task invariances at #MLSS2020. Re-watch the videos here:
📺(Part I):
📺(Part II):
🚀 I am very excited to share gymnax 🏋️ — a JAX-based library of RL environments with >20 different classic environments 🌎, which are all easily parallelizable and run on CPU/GPU/TPU.
💻[repo]:
📜[colab]:
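A minimal rollout sketch, assuming the published gymnax API (make/reset/step with explicit env_params; names recalled from the README and possibly version-dependent):

```python
import jax
import gymnax

rng = jax.random.PRNGKey(0)
rng, key_reset, key_act, key_step = jax.random.split(rng, 4)

# Create the environment & its default parameters.
env, env_params = gymnax.make("CartPole-v1")

# Purely functional reset/step: all state is passed around explicitly,
# which is exactly what makes jit/vmap-based parallelization easy.
obs, state = env.reset(key_reset, env_params)
action = env.action_space(env_params).sample(key_act)
obs, state, reward, done, info = env.step(key_step, state, action, env_params)
```

Wrapping a full episode in jax.vmap over a batch of PRNG keys then yields thousands of parallel rollouts on a single accelerator.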
There is a lot to wrap your head around in LSTMs🤯. One way of thinking that helped me a lot is the 'conveyor belt' metaphor of the cell state 🧑‍🏭 by @ch402. I put together a little animation 🖼️ Check out the amazing blog post by Chris Olah here✍️:
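For reference, the conveyor belt in one equation (standard LSTM notation, not tied to the animation):

```latex
c_t = f_t \odot c_{t-1} + i_t \odot \tilde{c}_t
```

The cell state is only rescaled by the forget gate f_t and updated additively, so information (and gradients) can ride along largely untouched.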
What a week 🧠🤓💻! I loved meeting so many of you at #NeurIPS2019 - the ML community is truly wonderful. Check out all my collected visual notes ✍️ & feel free to share:
The lottery ticket hypothesis 🎲 states that sparse nets can be trained given the right initialisation 🧬. Since the original paper (@jefrankle & @mcarbin) a lot has happened. Check out my blog post for an overview of recent developments & open Qs.
✍️:
🚀 How can meta-learning, self-attention & JAX power the next generation of Evolutionary Optimizers 🦎?
Excited to share my @DeepMind internship project and our #ICLR2023 paper ‘Discovering Evolution Strategies via Meta-Black-Box Optimization’ 🎉
📜:
JAX sometimes has me feeling like a kid in a candy store 🍭 Here is a small example of how to sample batches of Ornstein-Uhlenbeck process realisations combining lax.fori_loop, jit & vmap 🚀 Auto-vectorisation made intuitive and scalable 🤗
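Roughly what such a snippet can look like (a sketch with illustrative parameter values, not the exact code from the tweet):

```python
import jax
import jax.numpy as jnp

def ou_path(key, num_steps=1000, theta=1.0, mu=0.0, sigma=0.5, dt=0.01):
    """Simulate one Ornstein-Uhlenbeck path via Euler-Maruyama."""
    noise = jax.random.normal(key, (num_steps,)) * jnp.sqrt(dt)

    def step(t, carry):
        x, path = carry
        x = x + theta * (mu - x) * dt + sigma * noise[t]
        return x, path.at[t].set(x)

    init = (jnp.zeros(()), jnp.zeros(num_steps))
    _, path = jax.lax.fori_loop(0, num_steps, step, init)
    return path

# jit + vmap: compile once, sample a whole batch of realisations.
keys = jax.random.split(jax.random.PRNGKey(0), 32)
batch = jax.jit(jax.vmap(ou_path))(keys)  # shape (32, 1000)
```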
Great #NeurIPS2019 tutorial kick-off by @EmtiyazKhan! Showing the unifying Bayesian Principle bridging Human & Deep Learning. Variational Online Gauss-Newton (VOGN; Osawa et al., '19) = A Bayesian Love Story ❤️
🎉 Excited to share `mle-monitor` - a lightweight ML experiment protocol and tool for monitoring resource utilization 📝 It covers local machines/servers and Slurm/Grid engine clusters 📉
💻 [repo]:
📜 [colab]:
📈 What functions do ReLU nets 'like' to learn? 🌈 Using Fourier analysis, Rahaman et al. ('19) reveal their bias to learn low-frequency modes first. Insights for implicit regularization & adv. robustness.
#mlcollage [3/52]
📝:
💻:
🥳Really excited to be attending #MLSS2020. Great set of talks by @bschoelkopf & Stefan Bauer spanning causality 101 to Representation Learning for Disentanglement 💯! Re-watch them here:
📺 (Part I):
📺 (Part II):
How to train your d̶r̶a̶g̶o̶n̶ ViT? 🐉 Steiner et al. demonstrate that augmentation & regularization yield model performance comparable to training on 10x data. Many 💵-insights for practitioners.
🎨 #mlcollage [30/52]
📜:
💻:
🚀 Happy to share my hyperparameter search tool: `mle-hyperopt` - a lightweight API covering many strategies with search space refinement 🪓, configuration export 📥 & storage/reloading of previous logs 🔄
💻[repo]:
📜[colab]:
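The interface is ask/tell style; a hypothetical mini-example (argument names recalled from the README, so treat them as assumptions; `train_and_eval` is your own function):

```python
from mle_hyperopt import RandomSearch

# Search space over real, integer & categorical variables.
strategy = RandomSearch(
    real={"lrate": {"begin": 1e-4, "end": 1e-1, "prior": "log-uniform"}},
    integer={"batch_size": {"begin": 16, "end": 128, "prior": "uniform"}},
    categorical={"arch": ["mlp", "cnn"]},
)

configs = strategy.ask(5)                        # propose 5 configurations
scores = [train_and_eval(**c) for c in configs]  # evaluate them yourself
strategy.tell(configs, scores)                   # store results for refinement
```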
Friday optimization revelations📉: My life needs more theoretical guarantees & convex + linear = ❤️. Enlightening set of talks by @BachFrancis at #MLSS2020. Recordings can be found here:
📽️(Part I):
📽️(Part II):
🎉 Happy to share a mini-tool that I have been using on a daily basis: `mle-logging` - a lightweight logger 📉 for ML experiments, which makes it easy to aggregate logs across configurations & random seeds 🌱
💻 [repo]:
📜 [colab]:
🥳 New tooling blog post coming your way 🚆 'A Machine Learning Workflow for the iPad Pro' - including my favourite apps, routines and pipelines for working with remote machines and @Raspberry_Pi 💽👨‍💻.
✍️:
🤗 Thanks @tech_crafted for the inspiration!
Puuuh. What are you up to these days? 💭 I try to stay sane, clean my place 🧹& write✍️. Today's edition - 'Getting started with #JAX'. Learn how to embrace the 'jit-grad-vmap' powers 💻 and code your own GRU-RNN in JAX. Stay safe & home. 🤗
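In the spirit of that post, a minimal GRU step one might write (shapes/inits are illustrative, not the blog's exact code; biases omitted for brevity):

```python
import jax
import jax.numpy as jnp

def gru_step(params, h, x):
    """One GRU update: gates decide how much of h gets overwritten."""
    z = jax.nn.sigmoid(x @ params["Wz"] + h @ params["Uz"])        # update gate
    r = jax.nn.sigmoid(x @ params["Wr"] + h @ params["Ur"])        # reset gate
    h_tilde = jnp.tanh(x @ params["Wh"] + (r * h) @ params["Uh"])  # candidate state
    return (1.0 - z) * h + z * h_tilde

def gru_forward(params, h0, xs):
    """Unroll the step over a sequence with lax.scan."""
    def body(h, x):
        h = gru_step(params, h, x)
        return h, h
    _, hs = jax.lax.scan(body, h0, xs)
    return hs

# Init params & vectorize over a batch - the 'jit-grad-vmap' powers at work.
D, H = 8, 16
ks = jax.random.split(jax.random.PRNGKey(0), 6)
shapes = [("Wz", (D, H)), ("Uz", (H, H)), ("Wr", (D, H)),
          ("Ur", (H, H)), ("Wh", (D, H)), ("Uh", (H, H))]
params = {n: 0.1 * jax.random.normal(k, s) for (n, s), k in zip(shapes, ks)}
batched_forward = jax.jit(jax.vmap(gru_forward, in_axes=(None, 0, 0)))
```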
💓 N-Beats is a pure Deep Learning architecture for 1D time series forecasting 📈 It provides M3/M4/tourism SOTA by combining learned/interpretable basis functions 🧑‍🔬 w. residual stacking & ensembling 🎨
#mlcollage [38/52]
📜:
💻:
Looking to get started with the @kaggle ARC challenge & want to learn about psychometric/ability-based assessment of intelligent systems? Check out my blog post, which provides an intro to "On the Measure of Intelligence" & the corpus by @fchollet 🤖🧠🎉 👉
🎉 2019 🎉 was quite the year for Deep Reinforcement Learning. In today's blog post I list my top 10 papers 🦄💻🧠 What was your favourite paper? Let me know!
Great start to an all-virtual #ICLR2020 & the ‘Causal Learning for Decision Making’ workshop including talks by @bschoelkopf & Lars Buesing 🧠📉👨‍💻. Looking forward to more smooth Q&As and exploring the awesome web interface!
🎉 Stoked to share that I joined @SakanaAILabs as a Research Scientist & founding member. @yujin_tang & @hardmaru's work has been very inspirational for my meta-evolution endeavors🤗
Exciting times ahead: I will be working on nature-inspired foundation models & evolution 🐠/🧬.
🚀 Happy to share evosax - a JAX-based library of Evolution Strategies (ES) featuring >10 different ES ranging from classics (e.g. CMA-ES, PSO) 🦎 to modern neuroevolution methods (e.g. ARS, OpenES, ClipUp)🤖
💻[repo]:
📜[colab]:
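The classic ask/tell loop, roughly as in the evosax README (v0.x API from memory, so double-check the signatures):

```python
import jax
import jax.numpy as jnp
from evosax import CMA_ES

rng = jax.random.PRNGKey(0)
strategy = CMA_ES(popsize=32, num_dims=10)
es_params = strategy.default_params
state = strategy.initialize(rng, es_params)

for _ in range(100):
    rng, rng_ask = jax.random.split(rng)
    x, state = strategy.ask(rng_ask, state, es_params)  # (32, 10) candidates
    fitness = jnp.sum(x ** 2, axis=1)                   # toy sphere objective
    state = strategy.tell(x, fitness, state, es_params)
```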
Awesome new JAX tutorial by DeepMind 🥳 Covering the philosophy of stateful programs 💭, JAX primitives and more advanced topics such as TPU parallelism, higher-order & per-example gradients ∇. All in all a great resource for every level of expertise🚀
👉
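One of those gems in a nutshell: per-example gradients by composing grad with vmap (toy loss for illustration):

```python
import jax
import jax.numpy as jnp

def loss(params, x, y):
    return (jnp.dot(x, params) - y) ** 2  # scalar loss for a single example

# Differentiate w.r.t. params, map over the batch axes of (x, y) only.
per_example_grads = jax.vmap(jax.grad(loss), in_axes=(None, 0, 0))

params = jnp.ones(3)
xs, ys = jnp.ones((8, 3)), jnp.zeros(8)
g = per_example_grads(params, xs, ys)  # shape (8, 3): one gradient per example
```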
How well do scalable Bayesian methods 🚀 approximate the true model average? @Pavel_Izmailov et al. ('21) provide insights into performance, generalization, mixing & tempering 🌡️ of Bayesian Nets! Hamiltonian MC + 512 TPU-v3 = 💘
#mlcollage [18/52]
📜:
#MLSS2020 was full of wonderful experiences 🦋 I hope to meet many of you soon & in person. Here are all #visualmlnotes, videos & slides:
✍️:
📼&📚:
Thank you 🙏 to all hard-working volunteers & organizers - you did awesome 🤗
Thinking 💭 about biological & artificial learning with the help of Marr's 3 levels of analysis. Here are the #visualmlnotes ✍️ from Peter Dayan's talk at #MLSS2020 & a little pointer to a nice complementary paper by @jhamrick & @shakir_za:
👉
🚀 How similar are network representations across layers & architectures? And how do they emerge through training? 🤸 New blog on Centered Kernel Alignment (@skornblith et al., 2019) & training All-CNN-C in JAX/flax 🤖
📝:
💻:
Excited to share that I got to join DeepMind as a research intern ☀️
This has been a dream 💭 which felt out of reach for a long time. Super grateful to the many people that supported me along the way 🤗
Time to do awesome work with @flennerhag, @TZahavy & the discovery team🚀
📉 GD can be biased towards finding 'easy' solutions 🐈 By following the eigenvectors of the Hessian with negative eigenvalues, Ridge Rider explores a diverse set of solutions 🎨
#mlcollage [40/52]
📜:
💻:
🎬:
SSL joint-embedding training 🧑‍🤝‍🧑 w/o asymmetry shenanigans? 🤯 Zbontar, Jing et al. propose a simple info bottleneck objective avoiding trivial solutions. Robust to small batches + scales w. dimensionality
#mlcollage [19/52]
📜:
💻:
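The objective in question, for reference (the Barlow Twins loss over the cross-correlation matrix C of the two views' embeddings):

```latex
\mathcal{L}_{BT} = \sum_i \left(1 - C_{ii}\right)^2 + \lambda \sum_i \sum_{j \neq i} C_{ij}^2
```

Driving the diagonal to 1 enforces invariance across augmentations, while driving off-diagonals to 0 decorrelates embedding dimensions - which is what rules out the trivial collapsed solution without any stop-gradient/momentum asymmetry.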
Can NNs only learn to interpolate? @randall_balestr et al. argue that NNs have to extrapolate to solve high-dimensional tasks🔶 Questioning the relation of extrapolation & generalization 🎨
#mlcollage [39/52]
📜:
🎙️ [@MLStreetTalk]:
Epic new show out with @ylecun and @randall_balestr where we discuss their recent ‘everything is extrapolation’ paper, interpolation and the curse of dimensionality, and also dig deep into Randall's work on the spline theory of deep learning. @DoctorDuggar @ecsquendor @ykilcher
‘Innate everything’ 🧠🧐🐊 - @hardmaru argues for the importance of finding the right inductive biases in bodies/architectures (WANNs) & prediction/world models (Observational Dropout) - Transferable Skills Workshop #NeurIPS2019
🎉 Stoked to share NeuroEvoBench – a JAX-based Evolutionary Optimizer benchmark for Deep Learning 🦎/🧬
🌎 To be presented at #NeurIPS2023 Datasets & Benchmarks with @yujin_tang & @alanyttian
🌐:
📜:
🧑‍💻:
✍️Want to learn more about RL, generalization within & across tasks as well as the ‘reward is enough’ hypothesis 🌍🔄🤖? Check out a set of thought-provoking talks by @matteohessel, @aharutyu and David Silver at the @M2lSchool ✌️
🎉 I transitioned from Berlin to the Tokyo 🗼 office for the 2nd half of my @GoogleDeepMind student researcher time!
🤗 Deeply thankful to @yujin_tang for all the support leading up to & during my first days in Japan 🇯🇵 Everything still feels pretty surreal & I am super grateful!
People of the world - I just posted a new blog post covering my #CCN2019 experience & many keynote talks. It is fair to say - I had a truly fulfilling time 💻❤️🧠. Thank you to all organizers, volunteers & speakers (@CogCompNeuro). [1/2]
This is a live dashboard 💻 monitoring my compute resources & the status/database of ML experiments 🚀 [more about this at a later point 🤗]. It is built with the Python library rich in ca. 10 hours of productive work.
Many gems in @OriolVinyalsML's Deep RL workshop talk at #NeurIPS2019 on AlphaStar. Including scatter connections, imitation-based regularization, the league & the unique problem decomposition.
Workshop talks by Rich Sutton never fail to inspire 💭. Today's #ICML2020 Life-Long Learning workshop talk was no different. Exciting ideas about RL agents that learn their own questions & answers in a virtuous cycle 🔴🔄🔵 - all within the General Value Function framework.
The DLCT @ml_collective talk on The AI Scientist is now available online! Check out the recording 📺 & slides 🧑‍🎨
📺:
📜:
Thanks @savvyRL for having us and everyone who attended & asked Qs!
🎉 Stoked to share our latest work @SakanaAILabs - DiscoPOP 🪩 We leverage LLMs as code-level mutation operators, which improve their own training algorithms.
Thereby, we discover various performant preference optimization algorithms using LLM-driven meta-evolution (LLM²) 🔁
Can LLMs invent better ways to train LLMs?
At Sakana AI, we’re pioneering AI-driven methods to automate AI research and discovery. We’re excited to release DiscoPOP: a new SOTA preference optimization algorithm that was discovered and written by an LLM!
Very happy to present our work “On Lottery Tickets and Minimal Task Representations in Deep Reinforcement Learning” today at the #ICLR2021 @neverendingrl workshop. 🎲 + 🤖🔁🌎
Paper 📜:
Poster Session 📢 [3 & 10pm CET]:
Summary 👇
Neural net symmetries induce geometric constraints 🔷 which imply conservation laws under ∇-flow 🧑‍🔬 This allows for exact prediction of training dynamics. A Noether's theorem for NNs - great theoretical work by Kunin et al. (2020)
#mlcollage [7/52]
📝:
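One concrete instance, as a sketch rather than the paper's general derivation: if a parameter group θ is scale-invariant (e.g. weights feeding into a normalization layer), Euler's theorem gives ⟨θ, ∇L⟩ = 0, so gradient flow conserves its norm:

```latex
\frac{d}{dt}\,\lVert\theta\rVert^2 \;=\; 2\,\langle \theta, \dot{\theta} \rangle \;=\; -2\,\langle \theta, \nabla_\theta \mathcal{L} \rangle \;=\; 0
```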
✂️Why can we train sparse/subspace-constrained NNs? Larsen et al. derive a theory based on Gordon's Escape Theorem 🧑‍🚀 → 🌔 & investigate optimized (lottery) subspaces using train data/trajectory info🎲
🎨 #mlcollage [28/52]
📜:
💻:
⛩️ Gated Linear Networks (Veness et al., '19) are backprop-free & trained online + locally via convex programming 🧮 GLNs combat catastrophic forgetting & the linearity allows for interpretable predictions.
#mlcollage [15/52]
📜:
💻:
📢 Two weeks since we released The AI Scientist 🧑‍🔬!
We want to take the time to summarize a lot of the discussions we’ve been having with the community, and give some hints about what we are working on! 🫶
We are beyond grateful for all your feedback and the community debate
4 challenges in lifelong learning 👶-🧑-👵: Formalism, evaluation, exploration & representation. Great start to the Lifelong ML workshop at #ICML2020 by @katjahofmann, @luisa_zintgraf & @contactrika. P.S.: I have never seen such smooth multi-speaker transitions 😎
🔎 How can one measure the emergence of interpretable concept units in CNNs? @davidbau et al. propose network dissection 💉 based on the agreement of filter activations and segmentation models 🎨
#mlcollage [26/52]
📜:
💻:
🎉 Do you love JAX-based RL as much as I do?
We just published rejax ⚡️ a lightning-fast library of pure JAX RL algos - all jit-, vector- & parallelizable!
Enabling high-throughput applications such as meta-evolution 🧬
Work done with @_chris_lu_ & led by @JarekLiesen 🤗
🥳 I'm releasing Rejax, a lightweight library of fully vectorizable RL algorithms!
⚡ Enjoy lightning-fast speed using jax.jit on the training function
🧬Use vmap and pmap on hyperparameters
🔙 Log using flexible callbacks
🌐 Available @
📸 Take a tour!
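The 'vmap over hyperparameters' idea in generic JAX (a toy quadratic stands in for Rejax's actual train function, whose exact signature is not assumed here):

```python
import jax
import jax.numpy as jnp

def train(rng, lr, num_steps=100):
    """Toy 'training run': gradient descent on ||x||^2."""
    x = jax.random.normal(rng, (10,))
    x = jax.lax.fori_loop(0, num_steps, lambda i, x: x - lr * 2.0 * x, x)
    return jnp.sum(x ** 2)  # final loss

rngs = jax.random.split(jax.random.PRNGKey(0), 4)
lrs = jnp.array([1e-3, 1e-2, 1e-1, 3e-1])
losses = jax.jit(jax.vmap(train))(rngs, lrs)  # 4 hyperparameter settings in one pass
```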
Nothing better than starting your day with some invertible models 🤠 Great historic review & explanations by @laurent_dinh at #ICLR2020! 🤖 Biggest personal takeaway: The power of sparse/triangular Jacobians in determinant computation 📐
🦎/🧬Learned Evolutionary Optimization (& Rob 😋) are going on tour! Super excited to be giving talks about our recent work on meta-discovering attention-based ES/GA & JAX during the coming days 🎙️
@AutomlSeminar: Today 4pm CET
@ml_collective: Tomorrow 7pm CET
Come & say hi 🤗
📺 Exciting talk on the xLSTM architecture and the challenges of questioning the first-mover advantage of the Transformer 🤖 by @HochreiterSepp @scioi_cluster
📜:
💻:
Powerful opening #NeurIPS2019 keynote by @celestekidd! Many inspirational thoughts from developmental psychology. Curiosity and intrinsic motivation in RL have a lot of work to do.
🤖 Drop by the AutoRL workshop [Stolz 0 at #ICML2024] if you are interested in how LLMs can shape the future of LLM research 🤯 @_chris_lu_ and I are happy to answer any questions!
Can we go beyond backprop + SGD? BLUR (Sandler et al., '21) meta-learns a shared low-dimensional genome 🦎 which modulates bi-directional updates 🔁 It generalizes across tasks + FFW architectures & allows NNs to have many states 🧠
#mlcollage [16/52]
📜:
A global workspace theory for coordination among neural modules in deep learning🧠🔄🤖 Goyal et al. ('21) propose a low-dim. bottleneck to facilitate synchronisation of specialists & replace costly pairwise attention interactions 🚀
#mlcollage [11/52]
📜:
🤸Very excited to share evosax 🦎 release v.0.10.0 and a small paper, which covers all features and summarizes recent progress in hardware accelerated & JAX-powered evolutionary optimization!
🧑‍💻:
📜:
Many new features... 🧵
🦋 Meta-Policy Gradients ∇∇ have the power to change how we think about algorithm design 🧠. Learn more about automated online hyperparameter tuning and end-to-end RL objective discovery 🤖 in my new blog post!
📝:
Workshop talks should push conceptual limits. Fascinating talk by Rich Sutton at the Bio&Artificial RL workshop #NeurIPS2019 #SuperDyna
P.S.: I will do my best 🧠🧐✍️
⏰ Clockwork VAEs by Saxena et al. ('21) scale temporally abstract latent dynamics models by imposing fixed clock speeds for different levels 📐 Very cool ablations that probe each level's information content and frequency adaptation 🧠
#mlcollage [10/52]
📜:
Thought-provoking talk by @white_martha on the ingredients for BeTR-RL at the #ICLR2020 workshop🌏! Many interesting ideas for generalization in Meta-RL, learning objectives, restricting complex MDPs & auxiliary tasks 🚀🧐
🎉 Excited to present our work on The AI Scientist later today at DLCT @ml_collective. Will talk about the power & limitations of foundation models in scientific idea creation💡 coding 🧑‍💻 writing ✍️ & reviewing 🧑‍⚖️
Drop by and ask all your pressing Qs 🤗
Of course, I am (only)
🚨 Don't miss out!
Join us tomorrow at 10 AM PDT for DLCT with @RobertTLange as he dives into "The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery." Step into the future of AI-driven research!
#AI #DLCT
How does the RL problem affect the lottery ticket phenomenon 🤖🔁🎲? In our #ICLR2022 spotlight we contrast RL & behavioral cloning tickets, disentangle mask/initialization ticket contributions & analyse the resulting sparse task representations. 🧵👇
📝:
🥱 Training foundation models is so 2023 😋
🚀 Super stoked for @SakanaAILabs' first release showing how to combine large open-source models in weight and data flow space!
All powered by evolutionary optimization 🦎
Introducing Evolutionary Model Merge: A new approach bringing us closer to automating foundation model development. We use evolution to find great ways of combining open-source models, building new powerful foundation models with user-specified abilities!
For anyone who didn't catch our (w. @yujin_tang & @alanyttian) poster presentation on the coolest neuroevolution benchmark out there -- feel free to reach out & chat 📩
Would love to discuss evosax, gymnax and the future of evolutionary methods in the LLM era 🤗 #NeurIPS23
❓How to efficiently estimate unbiased ∇ in unrolled optimization problems (e.g. hyperparameter tuning, learned optimizers)?🦎 Persistent ES does so by accumulating & applying correction terms for a series of truncated unrolls. 🎨
#mlcollage [35/52]
📜:
Trying something new 🎉 - One-slide mini-collage of my personal 'paper of the week' 📜
1/52: VQ-VAEs had quite the week in ML 🥑+🪑=🦋 But how do β-VAEs relate to the visual ventral stream?
Check out Higgins et al. (2020) to find out 👉
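For reference, the β-VAE objective being related to the ventral stream (standard notation; β > 1 trades reconstruction fidelity for disentanglement):

```latex
\mathcal{L}(\theta, \phi; x) \;=\; \mathbb{E}_{q_\phi(z \mid x)}\left[\log p_\theta(x \mid z)\right] \;-\; \beta \, D_{\mathrm{KL}}\!\left(q_\phi(z \mid x) \,\Vert\, p(z)\right)
```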
🧙 What are representational differences between Vision Transformers & CNNs? @maithra_raghu et al. investigate the role of self-attention & skip connections in aggregation & propagation of global info 🔎
🎨 #mlcollage [32/52]
📜:
🎉 Excited to share `mle-hyperopt` v0.0.5 - a lightweight hyperparameter optimization tool, which now also features implementations of Successive Halving 🪓, Hyperband 🎸 & Population-Based Training 🦎
📂 Repo:
📜 Colab:
🧬 Evolution is the ultimate discovery process & its biological instantiation is the only proof of an open-ended process that has led to diverse intelligence!
One of my deepest beliefs: A scalable evolutionary computation analogue will open up many new powerful perspectives 🧑🔬
Had a great time at last week's @sparsenn workshop ✂️ Absolutely loved @thoefler's tutorial covering many considerations (what, when, how). Beautiful distillation 🎨 Check out the accompanying survey paper & recording 🤗
📜:
📺:
What is the right framework to study generalization in neural nets? 🧠🔄🤖 @PreetumNakkiran et al. ('21) study the gap between models trained to minimize the empirical & population loss 📉 Providing a new 🔍 for studying DL phenomena
#mlcollage [13/52]
📜:
Synthetic ∇s hold the promise of decoupling neural modules 🔵🔄🔴 for large-scale distributed training based on local info. But what are the underlying mechanisms & theoretical guarantees? Check out Czarnecki et al. (2017) to find out.
#mlcollage [5/52]
📝:
🎙️Stoked to present evosax tomorrow at @PyConDE. It has been quite the journey since my 1st blog post on CMA-ES 🦎 and I have never been as stoked about the future of evo optim. 🚀
Slides 📜:
Code 🤖:
Event 📅:
Can memory-based meta-learning not only learn adaptive strategies 💭 but also hard-code innate behavior🦎? In our #AAAI2022 paper @sprekeler & I investigate how lifetime, task complexity & uncertainty shape meta-learned amortized Bayesian inference.
📝:
What drives hippocampus-neocortical interactions in memory consolidation? @SaxeLab argues for a top-down perspective & the predictability of the environment. 🧠🤓🌎
How can we create training distributions rich enough to yield powerful policies for 🦾 manipulation? OpenAI et al. ('21) scale asymmetric self-play to achieve 0-shot generalisation to unseen objects 🧊🍴.
#mlcollage [14/52]
📜:
💻: