Think Transformers are terrible at logical reasoning? Think again 💥
In this collaboration with Samy Bengio,
@jsusskin
(Apple) & Emmanuel Abbé (EPFL), we show that when trained with Boolean inputs and symbolic outputs, they become very powerful 🧠
🧵⤵️
Thrilled to announce that I will be joining
@MetaAI
next month as a Research Scientist 😍
I will be working in the Brain & AI team on decoding language from neural activity, to hopefully help those who have difficulty speaking or typing. Learn more here:
🚨 ODEFormer is on Arxiv!
We show that Transformers can recover the differential equations governing dynamical systems from noisy & irregularly sampled trajectories.
Very fun collaboration with
@SorenBecker
,
@TrackingPlumes
,
@pschwllr
&
@k__niki
!
🧵⤵️
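A toy illustration (not the authors' pipeline) of the setup described above: an irregularly sampled, noisy trajectory of the simple ODE dx/dt = -x, the kind of input ODEFormer is meant to handle.

```python
import math
import random

random.seed(0)
# Toy data: the ODE dx/dt = -x has solution x(t) = exp(-t).
ts = sorted(random.uniform(0, 5) for _ in range(50))      # irregular time points
xs = [math.exp(-t) + random.gauss(0, 0.01) for t in ts]   # noisy observations
# A model like ODEFormer takes (ts, xs) as input and outputs a symbolic
# candidate for the governing equation, here ideally dx/dt = -x.
```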
After a couple of great years at
@MetaAI
and
@ENS_ULM
, I will be starting as an
@AI4ScienceEPFL
fellow next month 😍
Can’t wait to leverage modern AI tools with biologists, neuroscientists, chemists and physicists 🧬🧠🧪🔭
If you work at
@EPFL
and want to meet up, please reach out!
1, 2, 3, 5, 8, 13… What is the next term? This kind of question is typical of IQ tests, but has received little attention in AI.
We had great fun training Transformers to tackle this problem. Check out our paper and our online demo:
Deep Symbolic Regression for Recurrent Sequences -- We show that transformers are great at predicting symbolic functions from values, and can predict the recurrence relation of sequences better than Mathematica. You can try it here:
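A toy sketch of the underlying task (not the paper's method, which uses a Transformer): recovering the coefficients of an order-2 linear recurrence u_n = a·u_{n-1} + b·u_{n-2} directly from the first terms of the sequence.

```python
def fit_order2_recurrence(u):
    """Solve the 2x2 linear system given by terms u[2] and u[3]
    to recover (a, b) in u[n] = a*u[n-1] + b*u[n-2]."""
    det = u[1] * u[1] - u[0] * u[2]
    a = (u[2] * u[1] - u[0] * u[3]) / det
    b = (u[1] * u[3] - u[2] * u[2]) / det
    return a, b

a, b = fit_order2_recurrence([1, 2, 3, 5, 8, 13])
# (a, b) = (1.0, 1.0): the Fibonacci rule, so the next term is a*13 + b*8 = 21.
```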
New preprint: .
When and how should you decay your learning rate? We give some theoretical insights on this crucial question in our latest work with
@MariaRefinetti
and
@GiulioBiroli
. (1/3)
We hope this work can be applied to other fields in science and spark more research on symbolic reasoning in LLMs.
We release our code & models publicly and provide a pip package & interactive Colab demo!
A few attention maps, for your viewing pleasure:
The so-called "Boolformer" takes as input a set of N (x, y) pairs in {0,1}^D × {0,1} and predicts a Boolean formula that approximates these observations.
Here are two very simple examples: addition and multiplication of 2-bit numbers.
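A minimal sketch (not the authors' code) of what such an input set looks like, for the lowest output bit of a 2-bit adder:

```python
from itertools import product

# Inputs x = (a1, a0, b1, b0) in {0,1}^4 encode two 2-bit numbers a and b;
# the target y is the lowest bit of their sum a + b.
pairs = []
for a1, a0, b1, b0 in product([0, 1], repeat=4):
    a, b = 2 * a1 + a0, 2 * b1 + b0
    pairs.append(((a1, a0, b1, b0), (a + b) & 1))

# A Boolean formula fitting all 16 observations: y = a0 XOR b0
# (the carry never reaches bit 0).
assert all(y == (x[1] ^ x[3]) for x, y in pairs)
```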
📜Paper Video Time!📜Today I'm talking to Stéphane d'Ascoli (
@stephanedascoli
) about Deep Symbolic Regression for Recurrent Sequences. This model is given a sequence of numbers, like 1, 2, 3, 5, 8 and it figures out the *rule behind* the sequence. Insane🤯
This afternoon, my friend Arthur and I had the honor of being invited by Étienne Klein on France Culture for La Conversation Scientifique. An episode devoted entirely to our favorite subject, spacetime and its curvature: (re)listen here!
We apply the Boolformer to a set of classification tasks from PMLB, ranging from predicting chess moves to diagnosing horse colic.
Our model achieves similar performance to classic ML methods, while outputting interpretable Boolean formulas!
Your overloaded weeks and tight end-of-month budgets are no longer an excuse not to take an interest in AI! With this new book, as concise as it is affordable, discover a new AI concept every day between two metro stops!
We also applied the Boolformer to the task of gene regulatory network inference, which is central in biology.
On a recent benchmark, our model is competitive with state-of-the-art genetic algorithms for Boolean modelling, while running several orders of magnitude faster!
Double descent has recently become popular in deep learning, but a similar curve was observed in the 1990s for least squares. Are these two kinds of overfitting the same? Come and see our Spotlight at
#NeurIPS2020
and chat with us in the poster session!
🧑🔬 We hope our method can guide the intuition of domain experts in many fields of the natural sciences.
To facilitate this, we released ODEFormer & ODEBench publicly and built a pip package & interactive demo to help get started:
Yeah, James Webb is nice, but did you know that you can produce these kinds of pictures using just… an iPhone (with a perfect night, long exposure and a bit of post-processing)?!
Taken in Pumalín National Park, southern Chile.
📈 Given the limitations of the "Strogatz" benchmark for this task, we introduce ODEBench, a more extensive collection of dynamical systems curated from the literature.
On both benchmarks, ODEFormer achieves SOTA, with fast inference and impressive robustness to noise!
Our new paper on Symbolic Regression with
@pa_kamienny
@stephanedascoli
@GuillaumeLample
is now on Arxiv !
We achieve performance comparable to SOTA genetic algorithms on SRBench with Transformers, whose inference time is orders of magnitude lower!
1/4
The book on relativity that Arthur Touati and I wrote is finally in our hands!
If you liked Interstellar and want to dive head-first back into the stars, don't hesitate to pre-order it here:
Official release on March 25 🚀🛰🧑🚀
In convex problems, the best strategy is to decay the learning rate as 1/time. What about non-convex problems? For random Gaussian losses on the sphere, we show that the optimal decay exponent is smaller than one (0.5 in the plot below). This could explain why the inverse square root schedule is so popular! (2/3)
🚀The ConViT benefits from vastly increased sample efficiency, without any sacrifice in maximal performance. We hope this model will spark more exploration of "soft" inductive biases, which make learning easier but vanish when not needed!
We then study inference problems, where two phases emerge: a search phase, followed by a convergence phase once the signal is detected. Here, the optimal schedule is to keep a large constant learning rate to speed up the search, then decay as 1/time once in a convex basin. (3/3)
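The search-then-converge schedule described above can be sketched as follows (`switch_step` and `lr0` are hypothetical hyperparameters for illustration, not values from the paper):

```python
def search_then_converge_lr(step, switch_step=100, lr0=0.1):
    """Keep a large constant learning rate during the search phase,
    then decay as 1/time once in the convergence phase."""
    if step < switch_step:
        return lr0
    return lr0 * switch_step / step  # continuous at the switch, ~1/t after
```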
Looking back at the Prix Roberval ceremony: congratulations to Aline Richard Zivohlava, winner of the Grand Public category for her work "La Saga CRISPR", and to "Voyage au cœur de l'atome" by Adrien Bouscal and Stéphane d'Ascoli, the media favorite in the Grand Public category!
💡The ConViT uses Gated Positional Self-Attention (GPSA) layers, which are initialized to mimic convolutions, then let each attention head learn more complex relationships through a learnable gating parameter.
🧑🔬Hybrid models are a good compromise, but the optimal architecture is very task-dependent.
What if we let each layer decide whether to perform convolutions or self-attention? This is the idea behind the ConViT, an “adaptive” hybrid model!
@sirbayes
@jsusskin
Not directly with this model (it doesn’t have numbers in its vocabulary), but we considered real-valued inputs in previous work on SR, both for 1D recurrent sequences () and multidimensional point clouds () 🙂
The source code for our ICML 2022 paper Deep Symbolic Regression for Recurrent Sequences () is now available on .
Spotlight: Wednesday 20, 16:50 ET
Poster session: Wednesday 20, 18:30 ET
@stephanedascoli
@pa_kamienny
@GuillaumeLample
@francoisfleuret
Stochastic method: pick a learning rate eps and initialise m = x_0. Then, for each new sample x_i: if x_i > m, set m += eps; otherwise m -= eps. You can also decay the learning rate over time, etc.
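A minimal Python sketch of this stochastic update for tracking a running median (the decay schedule and test data are illustrative choices, not from the thread):

```python
import random

def stochastic_median(samples, eps=1.0, decay=0.9995):
    """Nudge the estimate m up by eps when a sample lands above it,
    down by eps otherwise, optionally shrinking eps over time."""
    m = samples[0]
    for x in samples[1:]:
        m += eps if x > m else -eps
        eps *= decay  # learning-rate decay
    return m

random.seed(0)
data = [random.uniform(0, 100) for _ in range(20000)]
est = stochastic_median(data)  # should land near the true median, 50
```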