Arthur Douillard

@Ar_Douillard

3,382 Followers · 1,872 Following · 393 Media · 3,181 Statuses

Modular & Distributed Learning for LLMs @ DeepMind, Continual Learning PhD @ Sorbonne

London, England
Joined January 2016
Pinned Tweet
@Ar_Douillard
Arthur Douillard
4 months
I'm super excited to release DiPaCo, a new kind of mixture of experts, that can scale engineering-wise to data centers across the entire world! A few words about it in this thread 🧵
@_akhaliq
AK
4 months
Google presents DiPaCo: Distributed Path Composition. Progress in machine learning (ML) has been fueled by scaling neural network models. This scaling has been enabled by ever more heroic feats of engineering, necessary for accommodating ML approaches that require high
Tweet media one
2
26
161
9
37
187
@Ar_Douillard
Arthur Douillard
3 years
I've released my course on deep learning for computer vision! It includes slides, google colab, and Anki decks for all 6 topics I'm covering. We code from the basics (backprop from scratch) to the SotA (transformers & MLP-mixer). Feedback appreciated!
Tweet media one
14
166
684
@Ar_Douillard
Arthur Douillard
3 years
Vision transformers are more biased towards shapes (as humans are) than Convolutional Networks:
Tweet media one
8
142
666
@Ar_Douillard
Arthur Douillard
2 years
I am excited to share that, after my PhD 👨‍🎓, I will join @DeepMind this summer as a Research Scientist in the Continual Learning team led by Marc'Aurelio Ranzato! 🎉
28
7
473
@Ar_Douillard
Arthur Douillard
8 months
🚨 We released our work on data parallelism for language models *distributed* across the entire world! 🧵Thread below 👇
@_akhaliq
AK
8 months
DiLoCo: Distributed Low-Communication Training of Language Models paper page: Large language models (LLM) have become a critical component in many applications of machine learning. However, standard approaches to training LLM require a large number of
Tweet media one
11
54
316
17
69
380
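The core DiLoCo recipe (many local steps, rare synchronization) can be sketched in a few lines. This toy scalar version is an illustration only: unlike the real method, which applies an outer Nesterov-momentum optimizer to the parameter deltas, it simply averages the replicas at each sync.

```python
def local_sgd(num_workers, grad_fn, lr=0.1, local_steps=10, rounds=3):
    """Toy sketch of DiLoCo-style training: each worker takes many local
    SGD steps; parameters are synchronized only once per round (here by
    plain averaging), drastically reducing communication."""
    theta = 0.0  # the shared scalar "model"
    for _ in range(rounds):
        replicas = []
        for w in range(num_workers):
            local = theta
            for _ in range(local_steps):  # inner steps: no communication
                local -= lr * grad_fn(w, local)
            replicas.append(local)
        theta = sum(replicas) / len(replicas)  # one sync per round
    return theta

# Toy per-worker objective (theta - target_w)^2, gradient 2*(theta - target_w):
targets = [1.0, 3.0]
final = local_sgd(2, lambda w, t: 2 * (t - targets[w]))
assert abs(final - 2.0) < 0.1  # replicas converge near the average target
```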
@Ar_Douillard
Arthur Douillard
3 years
Github + VSCode on your browser = 🤯 Just add "1s" before the ".com", and tada! Here is an example with our Continual Learning library Continuum:
Tweet media one
9
85
382
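The URL trick is just a string substitution; a tiny helper (the repo URL below is illustrative, not the actual Continuum repo address):

```python
def to_github1s(url: str) -> str:
    """Rewrite a github.com URL so the repo opens in the browser-based
    VS Code viewer at github1s.com."""
    return url.replace("github.com", "github1s.com", 1)

print(to_github1s("https://github.com/someuser/continuum"))
# → https://github1s.com/someuser/continuum
```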
@Ar_Douillard
Arthur Douillard
2 years
Main topic in NeurIPS parties is GPT-4. The rumors are wild
11
19
273
@Ar_Douillard
Arthur Douillard
1 year
🚨 My team at @DeepMind is looking for a Research Engineer in Efficient Large-Scale Learning! 👉 ❓ Unprecedented scale + efficient adaptation to new tasks 📚 Distributed large-scale learning and continual learning!
5
38
254
@Ar_Douillard
Arthur Douillard
2 years
Google released Minerva () a few days ago, a language model (based on PaLM) that solves high-school math problems. Funny part: the prompt includes "I hope it is correct"!
Tweet media one
10
34
251
@Ar_Douillard
Arthur Douillard
3 months
Something is cooking 🕵️‍♂️
Tweet media one
9
8
220
@Ar_Douillard
Arthur Douillard
2 years
From being a computer lover, to being a Doctor in computer science
Tweet media one
Tweet media two
9
2
162
@Ar_Douillard
Arthur Douillard
3 years
Ok, I've learned today that there is an 'inference_mode' context manager that does the 'no_grad' job, but with added speed. Seen in Grokking PyTorch:
Tweet media one
6
26
151
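For reference, the two context managers compare as follows (requires PyTorch ≥ 1.9, where `torch.inference_mode` was introduced):

```python
import torch

x = torch.ones(3, requires_grad=True)

# no_grad: disables gradient tracking inside the block
with torch.no_grad():
    y = x * 2

# inference_mode: same effect, but additionally skips autograd's
# version-counter and view-tracking bookkeeping, so it's a bit faster;
# tensors created here can never be used in autograd later
with torch.inference_mode():
    z = x * 2

assert not y.requires_grad and not z.requires_grad
```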
@Ar_Douillard
Arthur Douillard
3 years
🎄It's Christmas time, so we recently added plenty of datasets for continual learning! +50 datasets for classification & segmentation +7 different continual scenarios 👉 And surprise, we now support HuggingFace's NLP datasets! 👇🧵
Tweet media one
1
21
147
@Ar_Douillard
Arthur Douillard
10 days
No code too
@LeopolisDream
Alex Yanko 🇺🇦
12 days
Welcome the new architecture: Terminator. No residuals, no dot-product attention, no normalization...
Tweet media one
16
135
815
5
3
130
@Ar_Douillard
Arthur Douillard
3 years
PixMix: merging images with fractals. Leads to models that are more robust to corruption and adversarial attacks, with better calibration, etc., than the other baselines (MixUp, CutMix, CutOut).
Tweet media one
2
25
102
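The mixing loop can be sketched roughly like this: a simplified toy version on 1-D "pixels" only; the actual PixMix pipeline also mixes with augmented copies of the image and uses specific conic combinations.

```python
import random

def pixmix(image, fractals, rounds=3, beta=0.5):
    """Toy sketch of PixMix: repeatedly blend the image with a random
    fractal, either additively or multiplicatively."""
    mixed = image[:]
    for _ in range(rounds):
        fractal = random.choice(fractals)
        if random.random() < 0.5:  # additive (convex) blend
            mixed = [(1 - beta) * m + beta * f for m, f in zip(mixed, fractal)]
        else:                      # multiplicative (geometric) blend
            mixed = [(m ** (1 - beta)) * (f ** beta) for m, f in zip(mixed, fractal)]
    return mixed

img = [0.2, 0.5, 0.8]    # a tiny "image" of 3 pixels in [0, 1]
frs = [[0.9, 0.1, 0.4]]  # one toy "fractal"
out = pixmix(img, frs)
assert all(0.0 <= p <= 1.0 for p in out)  # both blends stay in [0, 1]
```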
@Ar_Douillard
Arthur Douillard
3 years
PoolFormer: replacing self-attention / spatial MLP / Fourier transform with a simple average pooling. - Fewer operations (each pooling reduces the number of tokens by 2x) - As good as other "meta-formers" Are we going to reinvent convnets?
Tweet media one
1
27
99
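The token mixer being described is essentially a sliding average. A minimal sketch on 1-D tokens (no stride here, so unlike the strided pooling in the tweet it doesn't reduce the token count):

```python
def pool_mix(tokens, window=3):
    """Token-mixer sketch à la PoolFormer: replace each token with the
    average of its neighborhood, instead of computing self-attention."""
    half = window // 2
    mixed = []
    for i in range(len(tokens)):
        neigh = tokens[max(0, i - half): i + half + 1]
        mixed.append(sum(neigh) / len(neigh))
    return mixed

print(pool_mix([1.0, 2.0, 3.0, 4.0]))
# → [1.5, 2.0, 3.0, 3.5]
```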
@Ar_Douillard
Arthur Douillard
3 years
I love when fixing a bug in my neural network degrades its performance...
1
8
82
@Ar_Douillard
Arthur Douillard
3 years
I've got my christmas present early: More than 11k unique visitors on my deep learning for computer vision course! 🤗 I'm so happy! 👉 2022's goal: recording a video for each lesson
Tweet media one
0
14
82
@Ar_Douillard
Arthur Douillard
3 years
I submitted my first paper ever to CVPR2020 and it got rejected; that was hard. But I'm happy to announce that my third paper, PLOP, has been accepted to #CVPR2021 ! Code will be released soon!
@Ar_Douillard
Arthur Douillard
4 years
New work from Y.Chen, A.Dapogny, @quobbe , and myself. We tackle Continual Semantic Segmentation by introducing a novel distillation loss exploiting local & global details, and an uncertainty-based pseudo-labeling handling background shift (We are PLOP)
Tweet media one
1
13
35
4
7
76
@Ar_Douillard
Arthur Douillard
1 month
My team is looking for a research engineer in New York! Our recent efforts include DiLoCo (distributed learning) and DiPaCo (distributed mixture of experts). Those projects, which I've co-led, were the most exciting projects I've contributed to, and I can tell you one thing: there
4
5
72
@Ar_Douillard
Arthur Douillard
2 years
The first transformer designed for Continual Learning in Computer Vision has been accepted to #CVPR2022 ! 🥳 Using a dynamic approach, it forgets less than previous ensembling methods while using fewer parameters. 💻: 📕: 🧵👇
Tweet media one
4
16
72
@Ar_Douillard
Arthur Douillard
6 years
@ncremins GDPR is coming.
0
26
69
@Ar_Douillard
Arthur Douillard
4 years
Tired of implementing the many data settings of Continual Learning? @TLesort & I present Continuum! A Pytorch library that enables you in a few lines to have a Continual dataset: MNIST, PermutedMNIST, CIFAR10/CIFAR100, ImageNet, CORe50, and many more!
Tweet media one
1
27
68
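Conceptually, what such a library automates is splitting one dataset into a stream of tasks. A stdlib-only sketch of a class-incremental split (this is not the actual Continuum API, just the underlying idea):

```python
def class_incremental_tasks(samples, labels, increment):
    """Split a labeled dataset into continual-learning tasks, each
    introducing `increment` new classes (class-incremental scenario)."""
    classes = sorted(set(labels))
    tasks = []
    for start in range(0, len(classes), increment):
        allowed = set(classes[start:start + increment])
        tasks.append([(x, y) for x, y in zip(samples, labels) if y in allowed])
    return tasks

data = ["a", "b", "c", "d"]
labels = [0, 1, 2, 3]
tasks = class_incremental_tasks(data, labels, increment=2)
assert len(tasks) == 2                          # two tasks of 2 classes each
assert [y for _, y in tasks[0]] == [0, 1]       # task 0 sees classes 0 and 1
```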
@Ar_Douillard
Arthur Douillard
2 years
Great! I'm finishing my PhD in June, and CVPR 2022 will be my only opportunity to attend an in-person conference in my whooole PhD
@CVPR
#CVPR2024
2 years
Message from our #CVPR2022 Program Chairs: Unless the epidemiological situation changes drastically, CVPR 2022 will be in person, with an online option for those who cannot travel. Information on visa letters will be sent to authors in the next few days.
0
22
227
3
1
66
@Ar_Douillard
Arthur Douillard
4 years
I'm proud to present my first ever paper, uploaded recently on arXiv: "Small Task Incremental Learning" We design a novel distillation loss that outperforms previous SotA by a large margin, especially on 50 tasks of only 1 class!
Tweet media one
7
23
67
@Ar_Douillard
Arthur Douillard
1 year
My last paper as a phd student 🤩
@mlia_isir
MLIA
1 year
Pending #CVPR2023 in June, we are pleased to share our 4 accepted papers (3/4) "CoMFormer: Continual Learning in Semantic and Panoptic Segmentation" by @fcdl94 , @quobbe , @Ar_Douillard preprint: Collab w/ Politecnico di Torino and @heuritechlab
Tweet media one
1
2
20
0
2
67
@Ar_Douillard
Arthur Douillard
2 months
Something I didn't fully realize during my PhD but now see: the extended bitter lesson is that hyperparameters are sometimes more important than a new architecture. So many research papers proposing new archi/losses/optimizers would get crushed by a well-tuned baseline.
@andrew_n_carr
Andrew Carr (e/🤸)
2 months
The DeepSeek-V2 paper was full of pretty amazing nuggets of wisdom. I spent the afternoon copying lots of their training setup into our model. Orange is previous and Blue is new with DeepSeek hyper parameters. Things that mattered most: 1. Warm up LR ratio 2. Batch ramp
Tweet media one
14
54
585
3
5
65
@Ar_Douillard
Arthur Douillard
2 years
Transformers for Small-Scale Datasets: - Tokenization with overlap between patches - Add pooling to reduce the nb of tokens - Mask the diagonal attention logits with -∞ to avoid tokens attending to themselves - add a learned temperature - improve all archis
Tweet media one
Tweet media two
1
6
63
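The diagonal-masking trick in the list can be sketched as follows: setting a logit to -∞ makes its softmax weight exactly zero, so no token attends to itself.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def masked_attention_weights(logits):
    """Set the diagonal attention logits to -inf so that, after the
    softmax, each token puts zero weight on itself."""
    n = len(logits)
    masked = [[(-math.inf if i == j else logits[i][j]) for j in range(n)]
              for i in range(n)]
    return [softmax(row) for row in masked]

w = masked_attention_weights([[1.0, 2.0], [3.0, 4.0]])
assert w[0][0] == 0.0 and w[1][1] == 0.0  # no self-attention
```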
@Ar_Douillard
Arthur Douillard
4 months
I think people don’t realize the progress in AI that happened in the last 5 years. Being close to the level of a junior dev isn’t impressive anymore?
@gonza_nardini
Gonza Nardini
4 months
@itsandrewgao Sounds a bit disappointing honestly. The requests were a bit hard, but a good AI should be able to solve these, they aren't exactly rocket science. I think most jr devs would be able to solve them One request it couldn't even complete and the other just deployed a buggy solution
5
0
19
7
1
59
@Ar_Douillard
Arthur Douillard
3 months
TPUs are pretty great tbh. One of the best moves Google ever made
@a__tomala
Alex Tomala
3 months
Just use TPUs
14
4
125
3
2
59
@Ar_Douillard
Arthur Douillard
5 years
@AndrewYNg Another reason, even more crucial, is that AI researchers open-source their methods a lot
1
4
54
@Ar_Douillard
Arthur Douillard
6 years
Venn diagram of the various subfields of #AI Source: The #DeepLearning book of Goodfellow, Bengio, & Courville. #DataScience #MachineLearning #ArtificialIntelligence
Tweet media one
0
26
47
@Ar_Douillard
Arthur Douillard
3 years
I’ve finished my 24h course today with my students. The latest chapter, about computer vision’s future, has been updated: transformer, mlp-mixer, pool-former, sam, etc. And there are tutos to code them from scratch + anki cards
@Ar_Douillard
Arthur Douillard
3 years
I've released my course on deep learning for computer vision! It includes slides, google colab, and Anki decks for all 6 topics I'm covering. We code from the basics (backprop from scratch) to the SotA (transformers & MLP-mixer). Feedback appreciated!
Tweet media one
14
166
684
1
2
49
@Ar_Douillard
Arthur Douillard
2 years
After 1.5 years we finally got our paper on object rehearsal for continual semantic segmentation accepted at TPAMI!
Tweet media one
1
9
49
@Ar_Douillard
Arthur Douillard
2 years
I’ve done 2 conferences in person, both in New Orleans, both in 2022. So far NeurIPS is soooo much more interesting than CVPR.
4
2
46
@Ar_Douillard
Arthur Douillard
3 years
#CVPR is in the top-5, per citation, of all venues 🤯 It says a lot about the rapid growth of the field; I can barely keep up with reading papers published at top conferences in my niche domain
Tweet media one
3
13
48
@Ar_Douillard
Arthur Douillard
2 years
Come see our poster presentation of DyTox, the Continual transformer, this afternoon! Poster 131b. #CVPR2022
Tweet media one
0
3
46
@Ar_Douillard
Arthur Douillard
2 years
That’s where Continual Learning could really shine: 1. Keep this model 2. Add continually new tasks 3. … 4. AGI?
@GoogleDeepMind
Google DeepMind
2 years
Gato🐈a scalable generalist agent that uses a single transformer with exactly the same weights to play Atari, follow text instructions, caption images, chat with people, control a real robot arm, and more: Paper: 1/
95
1K
5K
4
7
44
@Ar_Douillard
Arthur Douillard
2 years
1 paper accepted at #CVPR2022 on continual transformer :) With @ramealexandre , Guillaume Couairon, and @quobbe . More details & code in the coming weeks.
2
1
45
@Ar_Douillard
Arthur Douillard
3 years
Amazing tutorial on reinforcement learning by @DeepMind at the @EEMLcommunity :
Tweet media one
1
7
44
@Ar_Douillard
Arthur Douillard
2 years
A concern I have with Continual Learning models is that it's often hard to tell from the pdf alone: - does it use rehearsal, and if yes how much? - how many params does the model use vs baselines? - is the task id given at test-time? - how many tasks? - pretrained? It's getting hard to compare models
9
1
44
@Ar_Douillard
Arthur Douillard
2 years
Eh, I guess my idea from two months ago was right. -->
@arankomatsuzaki
Aran Komatsuzaki
2 years
Corrupted Image Modeling for Self-Supervised Visual Pre-Training ELECTRA-version of BeiT/MAE with CNN/ViT performs competitively with SotA on vision self-supervised learning.
Tweet media one
2
42
186
0
5
44
@Ar_Douillard
Arthur Douillard
7 months
I’m quite impressed by the number of people on this platform making threads to explain what is OpenAI’s Q*. I guess their time working on the blockchain made them prescient about AI.
5
3
42
@Ar_Douillard
Arthur Douillard
4 months
It's a bit of a vanity metric, but I'm super proud to have reached the 1,000-citation mark 😀
Tweet media one
2
0
42
@Ar_Douillard
Arthur Douillard
2 years
👨‍🎓 My PhD thesis on Continual Learning for Computer Vision is now online! 📚 👉 I cover continual learning across img classification w/ metric learning & growing transformers, segmentation w/ distillation & efficient replay, and even zero-shot learning.
Tweet media one
0
3
41
@Ar_Douillard
Arthur Douillard
4 years
We have been accepted at ECCV2020 ! Thanks to my awesome coauthors @quobbe @DrEAVJr @CharlesOllion @ThomasR_Fr @heuritechlab @mlia_lip6
@Ar_Douillard
Arthur Douillard
4 years
I'm proud to present my first ever paper, uploaded recently on arXiv: "Small Task Incremental Learning" We design a novel distillation loss that outperforms previous SotA by a large margin, especially on 50 tasks of only 1 class!
Tweet media one
7
23
67
5
16
40
@Ar_Douillard
Arthur Douillard
3 years
I'm presenting Continuum tomorrow, a light-weight library to do continual learning! Come watch Friday 3 April, 5.30PM CEST :) 📌 Eventbrite event: 📌 Microsoft Teams:
Tweet media one
1
7
38
@Ar_Douillard
Arthur Douillard
2 months
DiLoCo was rejected from ICML 😢 On the one hand, I'm annoyed at one of the reviewers asking for a proof of convergence for our distributed training scheme at LLM scale. On the other hand, the program chair wrote a very well-balanced conclusion, thanks for that!
@Ar_Douillard
Arthur Douillard
8 months
🚨 We released our work on data parallelism for language models *distributed* across the entire world! 🧵Thread below 👇
17
69
380
5
2
39
@Ar_Douillard
Arthur Douillard
2 years
I’m defending my PhD next Monday! It’ll be live-streamed on YouTube: 13 June, 2PM CEST, 8AM New York time
@mlia_isir
MLIA
2 years
📢Thesis defense Happy to announce Arthur Douillard's @Ar_Douillard thesis defense next week! It will take place on Monday, June 13th at 2 p.m. Title: "Continual Learning for Computer Vision" Supervisors: Matthieu Cord @quobbe & Thomas Robert @ThomasR_Fr
Tweet media one
1
4
14
4
1
38
@Ar_Douillard
Arthur Douillard
3 years
It may sound vain, but I reached the 20-citation mark today (not a lot compared to most of my Twitter feed, but a lot for me), and it makes me very happy. I'm glad my work is deemed useful by others; it gives purpose to all my failed experiments I guess 🙃
1
0
38
@Ar_Douillard
Arthur Douillard
5 years
I'm glad to announce that I'll start this July a PhD in #DeepLearning for computer vision at @LIP6_lab / @ScienceSorbonne under the supervision of @quobbe ! And I'll still work part-time with the great french startup @heuritechdata A little boy's dream of AI becomes reality!
2
5
36
@Ar_Douillard
Arthur Douillard
4 years
New work from Y.Chen, A.Dapogny, @quobbe , and myself. We tackle Continual Semantic Segmentation by introducing a novel distillation loss exploiting local & global details, and an uncertainty-based pseudo-labeling handling background shift (We are PLOP)
Tweet media one
1
13
35
@Ar_Douillard
Arthur Douillard
3 years
My cat ε and I are honored to be featured in today's @CVPR daily. Despite it being virtual, I'm enjoying this conf' a lot so far, I've learned a lot! #CVPR2021
Tweet media one
0
6
35
@Ar_Douillard
Arthur Douillard
2 years
@carrigmat And that you used a few hundred TPUs on JFT300M
0
2
34
@Ar_Douillard
Arthur Douillard
4 months
Language models require half as much compute every 8 months to reach the same performance, better than Moore's Law.
Tweet media one
1
4
32
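That cadence compounds quickly; a quick back-of-the-envelope comparison, where the 8-month halving is the figure quoted in the tweet and a 24-month cadence stands in for Moore's law:

```python
def compute_factor(months, halving_months):
    """Compute needed to reach a fixed performance halves every
    `halving_months`; return the total reduction factor after `months`."""
    return 2 ** (months / halving_months)

# Over 4 years (48 months):
algo = compute_factor(48, 8)    # algorithmic efficiency: 2**6 = 64x
moore = compute_factor(48, 24)  # Moore's-law cadence:    2**2 = 4x
assert algo == 64 and moore == 4
```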
@Ar_Douillard
Arthur Douillard
4 months
Everything everywhere all at once. Our long-term goal is to train a network across the entire world, using all the compute. Thus, we need to revisit existing architectures to limit the communication overhead and memory footprint while keeping inference fast. Current methods aren't enough!
Tweet media one
2
5
30
@Ar_Douillard
Arthur Douillard
3 years
Continuum now supports the Continual Learning CTRL benchmark of @TomVeniat @LudovicDenoyer @MarcRanzato @facebookai ! * 5 predefined CTRL datasets * easy to custom your own CTRL Code: Colab: Paper:
Tweet media one
2
11
31
@Ar_Douillard
Arthur Douillard
2 years
Excellent literature review on the loss landscape of neural networks by @dam_nlp : -> 1. Wide Basins and Generalization 2. Intrinsic Dimensionality 3. Mode Connectivity 4. SGD Training Dynamics
Tweet media one
1
5
30
@Ar_Douillard
Arthur Douillard
4 years
I just saw a code base with a hyperparameter of 0.968. 0.968 Who does grid search that fine-grained?
9
2
30
@Ar_Douillard
Arthur Douillard
2 years
🚨 I'm excited about the release of 🏝️ NEVIS'22, a benchmark where we collected 106 datasets from the last 30 years of CV research! 🤖 Can you design a model to efficiently learn them all using forward transfer? 📜 My first paper while at @DeepMind 🥰
@GoogleDeepMind
Google DeepMind
2 years
Introducing NEVIS’22, a new benchmark developed using 30 years of computer vision research. This provides an opportunity to explore how AI models can continually build on their knowledge to learn future tasks more efficiently. ➡️
3
48
214
0
12
30
@Ar_Douillard
Arthur Douillard
2 years
I’m going to NeurIPS! Fellow continual learners, let’s have a chat 😃
2
0
30
@Ar_Douillard
Arthur Douillard
4 months
Oh this is really cool! They train an encoder-decoder transformer to predict the search dynamics of A*, which results in a search with fewer steps than the classical A* algorithm.
Tweet media one
3
3
30
@Ar_Douillard
Arthur Douillard
5 years
@heuritechlab & I have just published a technical introduction to Incremental Learning with #DeepLearning ! Being able to learn continuously is an important feature of any intelligent system; see what the current strategies are. @ContinualAI
1
12
30
@Ar_Douillard
Arthur Douillard
2 years
Poster 2.0 @ NeurIPS
Tweet media one
0
0
30
@Ar_Douillard
Arthur Douillard
2 years
I’m presenting today at the Continual Learning workshop of #CVPR2022 both Dytox (dynamic transformer) and Saporta et al.’s MuHDI (continual adaptation). Come chat with me, poster #207a and #208a
Tweet media one
Tweet media two
1
1
29
@Ar_Douillard
Arthur Douillard
5 years
I've just published in @TDataScience a small #DeepLearning article about a research paper I liked!
@TDataScience
Towards Data Science
5 years
How To Be Confident In Your Neural Network Confidence
Tweet media one
Tweet media two
0
10
21
0
7
28
@Ar_Douillard
Arthur Douillard
2 years
I've been reading this weekend on @OReillyMedia the early draft version of the book from @huggingface on NLP. Super interesting, and I've learned tons about NLP (QA, BigBird...) and made Anki cards about that. I'm eager to see the final book version once it's published!
Tweet media one
Tweet media two
0
3
29
@Ar_Douillard
Arthur Douillard
2 months
I’m honored to see our work on distributed training (DiLoCo) and distributed mixture of experts (DiPaCo) highlighted during ICLR’s keynote by @RaiaHadsell !
Tweet media one
Tweet media two
2
1
28
@Ar_Douillard
Arthur Douillard
3 years
I've been learning Chinese for a year (at a very slow pace) out of boredom during the lockdown. And today it has been useful: I can now understand issues raised on my GitHub repo in Chinese, 我很高兴 (I'm very happy)!
Tweet media one
3
0
28
@Ar_Douillard
Arthur Douillard
2 years
It was super exciting (and tiring) to finally present my work on Continual Transformers at today's #CVPR2022 session!
Tweet media one
1
3
28
@Ar_Douillard
Arthur Douillard
6 months
We release the async extension of DiLoCo shared in November, led by our amazing intern @cranialxix ! 👀 TL;DR: we do distributed data-parallelism of a language model across the world, synchronized every 10-100 steps, AND using heterogeneous devices 🧵 below
@_akhaliq
AK
6 months
Google Deepmind present Asynchronous Local-SGD Training for Language Modeling paper page: Local stochastic gradient descent (Local-SGD), also referred to as federated averaging, is an approach to distributed optimization where each device performs more
Tweet media one
2
29
160
3
7
28
@Ar_Douillard
Arthur Douillard
6 years
Inspired by #themorningpaper of @adriancolyer : A reading of "A Few Useful Things to Know about Machine Learning" by Prof Domingos: #MachineLearning
2
6
26
@Ar_Douillard
Arthur Douillard
2 years
I haven't even gotten to CVPR yet and I've already met many researchers at the airport! So cool!
1
0
27
@Ar_Douillard
Arthur Douillard
5 months
I’ll be talking about our recent DiLoCo: how to train your network distributed across the world!
@CohereForAI
Cohere For AI
5 months
Join our community-led Geo Regional Asia on Wednesday, February 21st as they welcome @Ar_Douillard , Sr. Researcher at @GoogleDeepMind to discuss "DiLoCo: Distributed Low-Communication Training of Language Models." Learn more:
Tweet media one
1
1
7
0
3
27
@Ar_Douillard
Arthur Douillard
3 years
ICCV 2023 will be in Paris 🇫🇷 !
@CSProfKGD
Kosta Derpanis
3 years
Start making your travel plans. Upcoming #ComputerVision conferences (subject to change)
Tweet media one
Tweet media two
5
7
75
1
0
25
@Ar_Douillard
Arthur Douillard
3 months
Given how hard it is to get a visa in the US, Europe should really become the next AI powerhouse
1
0
25
@Ar_Douillard
Arthur Douillard
3 years
Our paper "Insights from the Future for Continual Learning" has been published at the CLVISION Workshop of #CVPR2021 ! We exploit zero-shot learning to incorporate future concepts into the current embeddings and minimize interference Paper: Code:
Tweet media one
2
5
24
@Ar_Douillard
Arthur Douillard
6 years
I'm very happy to announce that I've won with @RemiMeunier , Antoine Naulet, and @dataiku 2 of the 3 prizes offered for the @NATO Innovation Challenge! We pitched a solution using Dataiku's DSS and #DeepLearning for satellite imagery! #NATOiChall #DataScience #DeepLearning
Tweet media one
0
8
22
@Ar_Douillard
Arthur Douillard
3 months
I have a solution for them:
@corbtt
Kyle Corbitt
3 months
Spoke to a Microsoft engineer on the GPT-6 training cluster project. He kvetched about the pain they're having provisioning infiniband-class links between GPUs in different regions. Me: "why not just colocate the cluster in one region?" Him: "Oh yeah we tried that first. We
227
785
6K
1
3
23
@Ar_Douillard
Arthur Douillard
2 years
I'll be giving a talk about our recent CVPR 2022 work on Continual Transformers on Thursday 7th at 5:30 PM CET. Come join me to hear more about it! Thanks @v_lomonaco @jamessealesmith @ContinualAI for the invite :) Stream link:
1
6
22
@Ar_Douillard
Arthur Douillard
3 months
Cool & hard benchmark: OSWorld. You have to complete tasks on Ubuntu that require multi-step planning, and potentially searching the internet, to solve them.
3
3
22
@Ar_Douillard
Arthur Douillard
2 months
I really like the device-level balancing loss: LLMs are now more about engineering than some abstract architecture. Balancing across experts makes sense w.r.t. the ML performance, but at that scale the communication bottleneck is critical too
Tweet media one
0
3
22
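A balancing loss of this flavor can be sketched as penalizing deviation from uniform load across devices. This is a toy stand-in (squared deviation from the mean load fraction), not DeepSeek-V2's exact formulation:

```python
def balance_loss(tokens_per_device):
    """Toy load-balancing auxiliary loss: penalize uneven routing of
    tokens across devices, so no single device becomes a comm bottleneck."""
    total = sum(tokens_per_device)
    fracs = [t / total for t in tokens_per_device]
    uniform = 1 / len(fracs)
    return sum((f - uniform) ** 2 for f in fracs)

assert balance_loss([100, 100]) == 0.0  # perfectly balanced: no penalty
assert balance_loss([150, 50]) > 0.0    # skewed load: positive penalty
```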
@Ar_Douillard
Arthur Douillard
6 years
I'm glad to have been selected to compete for the #NATOiChall with @dataiku , some serious #DataScience innovation is being prepared to impress @NATO !
Tweet media one
0
3
20
@Ar_Douillard
Arthur Douillard
3 years
I'm presenting our work on using prior from the future in Continual Learning to make networks more selfless at this afternoon @CVPR @ContinualAI Workshop. "Insights from the Future for Continual Learning" Paper: #CVPR2021
Tweet media one
1
11
22
@Ar_Douillard
Arthur Douillard
1 year
I love Continual Learning, but what excites me the most about this conference is the pre-registration process. So many times in Deep Learning, "hypotheses" arrive after the experiments. What if we do it the proper way instead? More details soon 👀
@ContinualAI
ContinualAI
1 year
🚨Are you ready!?🚨 Today we're announcing the Continual AI Unconference (CLAI Unconf) to be held in October 2023! CLAI Unconf is virtual & multi-timezone, covering diverse CL topics with pre-registered papers & contributed talks! ➡️ Please share!
Tweet media one
3
31
67
0
2
22
@Ar_Douillard
Arthur Douillard
2 years
With GELU being the cool kid now, ReLU had to find a new job
Tweet media one
1
0
21
@Ar_Douillard
Arthur Douillard
6 years
Facebook just released DensePose, a #DeepLearning model and an associated dataset to map a 3D representation of humans onto RGB images!
0
13
18
@Ar_Douillard
Arthur Douillard
3 years
I've released the code of our #CVPR2021 's PLOP on Continual Segmentation! Code: Camera-ready pdf:
Tweet media one
2
5
20
@Ar_Douillard
Arthur Douillard
10 days
Tweet media one
0
0
20
@Ar_Douillard
Arthur Douillard
1 year
Research Scientists are just a wrapper over Nvidia GPUs
@ericjang11
Eric Jang
1 year
Venture Capital is just a wrapper over Nvidia GPUs
3
4
81
0
0
20
@Ar_Douillard
Arthur Douillard
4 months
TL;DR: an experimental mixture of experts that can be trained across the world, with no engineering limit on its size, while remaining light-weight and fast at test-time. arXiv link:
3
5
18
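The "light-weight and fast at test-time" property comes from sparse routing: only the selected experts run per input. A toy sketch of that idea (the real DiPaCo routes through composed paths of modules, not single experts):

```python
def moe_forward(x, experts, router):
    """Sparse-routing sketch: the router picks one expert per input,
    so only a fraction of the full model is executed."""
    return experts[router(x)](x)

experts = [lambda x: x + 1, lambda x: x * 10]
router = lambda x: 0 if x < 5 else 1  # toy router: threshold on the input
assert moe_forward(2, experts, router) == 3   # routed to expert 0
assert moe_forward(7, experts, router) == 70  # routed to expert 1
```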
@Ar_Douillard
Arthur Douillard
10 months
So I just saw an in-person talk by Schmidhuber. No slides, just him talking and waving his hands. Well, I have to say he's a really great speaker
0
0
20
@Ar_Douillard
Arthur Douillard
2 years
I’d say the same for Continual Learning, we should stop working on {Split, Permuted, Rotated}-MNIST. Bigger datasets like ImageNet (or mini/tiny/100-subset) and Core50 are more representative.
@aaron_defazio
Aaron Defazio
2 years
Optimization improvements on MNIST and CIFAR-10 rarely transfer to larger problems, the sooner we stop testing on them the better.
10
9
138
1
0
20
@Ar_Douillard
Arthur Douillard
5 months
This please. I'm very surprised how much this paper was retweeted, while clearly there is a strong confounding factor: the paper quality itself.
@DamienTeney
Damien Teney
5 months
⚠️Correlation ≠ causation. The controls in this study only account for topic and publication venue. These "influencers" don't tweet random papers, and any skill they have in picking promising papers can explain why this study finds that they eventually accrue more citations.
5
6
55
2
4
20