Maksym Andriushchenko @ ICML'24

@maksym_andr

3,123 Followers · 995 Following · 283 Media · 1,226 Statuses

PhD from @EPFL 🇨🇭 (Google & Open Phil PhD AI fellow). Past: @adoberesearch @uni_tue @SIC_Saar. Best way to support 🇺🇦:

Lausanne, Switzerland
Joined April 2018
Pinned Tweet
@maksym_andr
Maksym Andriushchenko @ ICML'24
4 months
🚨 Are leading safety-aligned LLMs adversarially robust? 🚨 ❗In our new work, we jailbreak basically all of them with ≈100% success rate (according to GPT-4 as a semantic judge): - Claude 1.2 / 2.0 / 2.1 / 3 Haiku / 3 Sonnet / 3 Opus, - GPT-3.5 / GPT-4, - R2D2-7B from
Tweet media one
6
64
366
@maksym_andr
Maksym Andriushchenko @ ICML'24
9 months
Transformers without skip connections, normalization layers, and projection layers can match the performance of standard Pre-LN transformers. Pretty neat!
Tweet media one
5
87
564
@maksym_andr
Maksym Andriushchenko @ ICML'24
7 months
We all know that AGI is coming, BUT adversarial examples are *still* not solved and scale is not all you need! Simple random search using logprobs of GPT-4 reveals that it has quite limited robustness. Short paper: Code: 🧵1/n
Tweet media one
11
76
464
@maksym_andr
Maksym Andriushchenko @ ICML'24
2 years
🚨Happy to share our new paper "SGD with large step sizes learns sparse features" ! ❓Why do longer SGD schedules generalize better? What kind of hidden dynamics occurs when the train loss stabilizes? How is that related to sparse feature learning? 🧵1/10
2
72
439
@maksym_andr
Maksym Andriushchenko @ ICML'24
8 days
🚨Excited to share our new paper!🚨 We reveal a curious generalization gap in the current refusal training approaches: simply reformulating a harmful request in the past tense (e.g., "How to make a Molotov cocktail?" to "How did people make a Molotov cocktail?") is often
Tweet media one
20
91
476
@maksym_andr
Maksym Andriushchenko @ ICML'24
3 months
Super excited to share that I successfully defended my PhD thesis "Understanding Generalization and Robustness in Modern Deep Learning" today 👨‍🎓 A huge thanks to the thesis examiners @SebastienBubeck , @zicokolter , and @KrzakalaF , jury president Rachid Guerraoui, and, of course,
Tweet media one
61
12
430
@maksym_andr
Maksym Andriushchenko @ ICML'24
4 years
Wow, a mind-blowing idea! Fine-tuning the model on *test samples* via self-supervised learning (rotation prediction). Plus an interesting theoretical analysis on a toy model suggesting that the key is to have positive gradient correlation between L_m & L_s
Tweet media one
9
79
380
@maksym_andr
Maksym Andriushchenko @ ICML'24
10 months
It's really surprising how far one can go with *linear* predictors in the autoregressive setting. Interesting theory and experiments on TinyStories: a linear model (with 162M params :-) ) can generate totally coherent text with few grammatical mistakes.
Tweet media one
4
46
296
@maksym_andr
Maksym Andriushchenko @ ICML'24
1 year
🚨 Excited to share our new paper “A modern look at the relationship between sharpness and generalization” ! ❓ Do flatter minima really generalize better? Let’s find out! 🧵 1/n
Tweet media one
11
47
267
@maksym_andr
Maksym Andriushchenko @ ICML'24
6 months
GPT-4 is inherently not reproducible, most likely due to batched inference with MoEs (h/t @patrickrchao for the ref!). Interestingly, GPT-3.5 Turbo seems _weirdly_ bimodal wrt logprobs (my own experiment below), which seems like extra evidence that it's also a MoE 🤔
Tweet media one
3
28
257
@maksym_andr
Maksym Andriushchenko @ ICML'24
11 months
Perks of doing a PhD in Switzerland… (Üssers Barrhorn, 3610m)
Tweet media one
3
3
246
@maksym_andr
Maksym Andriushchenko @ ICML'24
2 months
🚨Is In-Context Learning Sufficient for Instruction Following in LLMs?🚨 In our new work, we study alignment of base models, including GPT-4-Base (!), via many-shot in-context learning. I.e., no fine-tuning whatsoever, just prompting - how far can we go? Many people are
Tweet media one
7
40
227
@maksym_andr
Maksym Andriushchenko @ ICML'24
2 years
Truly excited to be selected among the 2022 Google PhD fellows!
@Google
Google
2 years
A big congrats to 2022’s cohort of Google PhD Fellows, who are part of our program that recognizes and supports graduate students doing exceptional research in computer science and related fields. Read more about the experience from three alums ↓
63
69
484
22
4
218
@maksym_andr
Maksym Andriushchenko @ ICML'24
2 years
Excited to share our #ICML2022 paper "Towards Understanding Sharpness-Aware Minimization"! Why does m-sharpness matter in m-SAM? Can we explain the benefits of m-SAM on simple models? Which other interesting properties does m-SAM show? Paper: 🧵1/n
Tweet media one
4
33
200
@maksym_andr
Maksym Andriushchenko @ ICML'24
2 years
Happy to help with #ScienceForUkraine as a national coordinator for Switzerland🇨🇭! The goal is to support #scientists and #researchers from #Ukraine who are affected by the Russian invasion. More info at @Sci_for_Ukraine!
2
88
196
@maksym_andr
Maksym Andriushchenko @ ICML'24
10 months
❓So why do we need weight decay, really? Don't LLMs directly optimize the population loss and regularization is not needed? 🚨 Our new paper provides a modern look at the old question ! (joint work with @dngfra , @adityavardhanv , @tml_lab ) 🧵1/n
Tweet media one
2
22
189
@maksym_andr
Maksym Andriushchenko @ ICML'24
11 months
More evidence for "scale is NOT all you need": even OpenFlamingo trained on 2B+ image-caption pairs has basically zero adversarial robustness. Even per-pixel perturbations of 1/255 (totally imperceptible) are sufficient to generate arbitrary captions!
Tweet media one
@maksym_andr
Maksym Andriushchenko @ ICML'24
1 year
It's still very curious how "scale is all you need" does NOT apply to adversarial robustness. No matter how much you scale data/compute for standard training, you don't get any noticeable robustness (even to the toyish Linf-bounded adversarial examples).
10
6
62
7
33
170
@maksym_andr
Maksym Andriushchenko @ ICML'24
9 months
🚨 I'm looking for a postdoc position to start in Fall 2024! My most recent research interests are related to understanding foundation models (especially LLMs!), making them more reliable, and developing principled methods for deep learning. More info:
9
44
163
@maksym_andr
Maksym Andriushchenko @ ICML'24
1 year
🚨Excited to share our new work “Sharpness-Aware Minimization Leads to Low-Rank Features” ! ❓We know SAM improves generalization, but can we better understand the structure of features learned by SAM? (with @dara_bahri , @TheGradient , N. Flammarion) 🧵1/n
Tweet media one
4
24
157
@maksym_andr
Maksym Andriushchenko @ ICML'24
6 months
So, what really matters for instruction fine-tuning? Surprisingly, simply fine-tuning on the *longest* examples is an extremely strong baseline for alignment of LLMs. Really excited to share our new work. Full story below! 🧵1/n
Tweet media one
5
29
152
@maksym_andr
Maksym Andriushchenko @ ICML'24
3 months
Llama-3 is absolutely impressive, but is it more resilient to adaptive jailbreak attacks compared to Llama-2? 🤔 Not much. The same approach as in our recent work leads to 100% attack success rate. The code and logs of the attack are now available:
Tweet media one
5
21
138
@maksym_andr
Maksym Andriushchenko @ ICML'24
11 months
The main take from the Simons LLM workshop: no one really knows anymore what _generalization_ means in the age of LLMs. There is no agreement about: - in-distribution vs. out-of-distribution, - interpolation vs. extrapolation, - memorization vs. generalization.
9
11
132
@maksym_andr
Maksym Andriushchenko @ ICML'24
28 days
Perhaps my favorite jailbreak: making a harmful request in the past tense (How to create Y? →How did people create Y?). Works on surprisingly many models :-) including the new Gemma-2. I think it tells us something fundamental about the representations that these models learn.
Tweet media one
Tweet media two
5
10
128
@maksym_andr
Maksym Andriushchenko @ ICML'24
4 years
Ever wondered about the current progress in adversarial ML? Too many papers to keep track of? 🤔 We present a *standardized* benchmark to clearly see which ideas really work 💡 Leaderboard Model Zoo Paper 1/n
1
28
126
@maksym_andr
Maksym Andriushchenko @ ICML'24
1 year
Accepted at #ICML2023 ! 🎉 See you in Hawaii :) PS: thanks to the anonymous reviewers, we got lots of great tips that have improved / will improve the clarity of the paper. Stay tuned for the camera-ready version!
@maksym_andr
Maksym Andriushchenko @ ICML'24
2 years
🚨Happy to share our new paper "SGD with large step sizes learns sparse features" ! ❓Why do longer SGD schedules generalize better? What kind of hidden dynamics occurs when the train loss stabilizes? How is that related to sparse feature learning? 🧵1/10
2
72
439
1
12
122
@maksym_andr
Maksym Andriushchenko @ ICML'24
2 months
Great to see that both of our recent papers—JailbreakBench () and our adaptive attack paper ()—have been used by Google to evaluate the robustness of Gemini 1.5 Flash/Pro against jailbreaking attacks! An interesting comment from
Tweet media one
2
16
120
@maksym_andr
Maksym Andriushchenko @ ICML'24
1 month
🚨 We are very excited to release JailbreakBench v1.0! 📄 We have substantially extended version 0.1, which has been on arXiv since March: - More attack artifacts (Prompt template with random search in addition to GCG, PAIR, and JailbreakChat): . - More
Tweet media one
3
26
114
@maksym_andr
Maksym Andriushchenko @ ICML'24
7 days
@MatthewBerman You should provide a reference to the original source :-)
@maksym_andr
Maksym Andriushchenko @ ICML'24
8 days
🚨Excited to share our new paper!🚨 We reveal a curious generalization gap in the current refusal training approaches: simply reformulating a harmful request in the past tense (e.g., "How to make a Molotov cocktail?" to "How did people make a Molotov cocktail?") is often
Tweet media one
20
91
476
4
0
102
@maksym_andr
Maksym Andriushchenko @ ICML'24
7 months
Scaling data+compute is all you need for adversarial robustness (if you use adv. training)? Impressive results: 83.9%/71.0% clean/robust accuracy (ε=4/255) on ImageNet-1k with a ViT-1B trained on 1B examples (prev. SOTA: 78.9%/59.6%) ( @cihangxie et al)
Tweet media one
3
16
94
@maksym_andr
Maksym Andriushchenko @ ICML'24
8 months
I'm arriving in New Orleans tonight and would be happy to chat about generalization, adversarial robustness, and, of course, anything related to LLMs. DM me if you want to meet in the upcoming days! :) And also check out our papers at NeurIPS 👇
Tweet media one
1
4
95
@maksym_andr
Maksym Andriushchenko @ ICML'24
10 months
Accepted at #NeurIPS2023 ! Thanks anonymous reviewers (maybe you are here on Twitter...) for insightful comments 😊 See y'all in New Orleans! 🎶🎷
@maksym_andr
Maksym Andriushchenko @ ICML'24
1 year
🚨Excited to share our new work “Sharpness-Aware Minimization Leads to Low-Rank Features” ! ❓We know SAM improves generalization, but can we better understand the structure of features learned by SAM? (with @dara_bahri , @TheGradient , N. Flammarion) 🧵1/n
Tweet media one
4
24
157
6
6
93
@maksym_andr
Maksym Andriushchenko @ ICML'24
11 months
a follow-up on that was released only 1 month ago. the field is moving fast!
Tweet media one
0
15
88
@maksym_andr
Maksym Andriushchenko @ ICML'24
3 years
Excited to share that our RobustBench paper () got Best Paper Honorable Mention at ICLR'21 Workshop on Security & Safety in ML Systems! The project is ongoing so contributions are welcome! Website: Model Zoo:
Tweet media one
6
13
87
@maksym_andr
Maksym Andriushchenko @ ICML'24
2 months
An academic paper about jailbreaks on GPT-4o - that was fast! "VoiceJailbreak is capable of generating simple, audible, yet effective jailbreak prompts, which significantly increases the average attack success rate (ASR) from 0.033 to 0.778 in six forbidden scenarios."
Tweet media one
9
13
88
@maksym_andr
Maksym Andriushchenko @ ICML'24
4 years
I also noticed that quite a few people now seem to be convinced that Lp-robustness is not interesting anymore. This is particularly surprising because there is so much evidence that Lp-robust models are **already** very relevant in practice. A few points below: 1/6
@SebastienBubeck
Sebastien Bubeck
4 years
Adversarial examples are imo *the* cleanest major open problem in ML. I don't know what was said precisely, but diminishing the central role of this problem is not healthy for our field. Ofc in the absence of a solution there are many alternative questions that we can/should ask.
14
19
165
2
16
84
@maksym_andr
Maksym Andriushchenko @ ICML'24
2 years
so much good news today... glad to be recognized as a top reviewer at this NeurIPS :)
6
1
83
@maksym_andr
Maksym Andriushchenko @ ICML'24
3 years
Happy to share that our RobustBench paper got accepted to the NeurIPS'21 datasets & benchmarks track! RobustBench is actively expanding and now has 120+ evaluations and 80+ models in the Model Zoo, including models on ImageNet. Contributions are welcome! :)
Tweet media one
4
6
72
@maksym_andr
Maksym Andriushchenko @ ICML'24
1 month
Releasing Nemotron-4-340B, a GPT-4 level model with permissive licensing, is a bold move from Nvidia. But how robust is this model to simple jailbreaking attacks? 🤔 Not much, despite the refusal training, red teaming, LLM scanners, etc. The universal prompt template from our
Tweet media one
2
7
66
@maksym_andr
Maksym Andriushchenko @ ICML'24
1 year
$20 for training BERT 🧐 wow (source: ES-FoMo workshop )
Tweet media one
1
13
67
@maksym_andr
Maksym Andriushchenko @ ICML'24
6 months
This work on *layer-wise* linear mode connectivity is now accepted at ICLR 2024 (led by @LinaraAdylova )! See you in Vienna :)
@LinaraAdylova
Linara
10 months
🤔We know that linear mode connectivity doesn't hold for two independently trained models. But what about *layer-wise* LMC? Well, it is very different! 📄Our new work explores this (+ applications to federated averaging) . 🧵1/6
Tweet media one
1
12
68
0
3
67
@maksym_andr
Maksym Andriushchenko @ ICML'24
2 years
Incredibly excited to be selected among these exemplary PhD students! Very grateful to Open Philanthropy for the fellowship support and their commitment to improving the long-term impacts of AI.
@open_phil
Open Philanthropy
2 years
We're excited to announce the 2022 class of the Open Phil AI Fellowship! Eleven promising researchers will collectively receive up to $1.84 million in PhD fellowship support over the next five years. Meet the fellows:
0
3
50
13
1
65
@maksym_andr
Maksym Andriushchenko @ ICML'24
1 year
It's still very curious how "scale is all you need" does NOT apply to adversarial robustness. No matter how much you scale data/compute for standard training, you don't get any noticeable robustness (even to the toyish Linf-bounded adversarial examples).
10
6
62
@maksym_andr
Maksym Andriushchenko @ ICML'24
2 months
Really excited about this new paper! Defending against jailbreaking attacks is much more feasible than previously thought (!) We shouldn't blindly transfer lessons learned from Lp adversarial robustness to jailbreaking of LLMs: - the robustness-accuracy tradeoff is not
@andyzou_jiaming
Andy Zou
2 months
No LLM is secure! A year ago, we unveiled the first of many automated jailbreaks capable of cracking all major LLMs. 🚨 But there is hope?! We introduce Short Circuiting: the first alignment technique that is adversarially robust. 🧵 📄 Paper:
Tweet media one
17
105
660
1
4
63
@maksym_andr
Maksym Andriushchenko @ ICML'24
1 year
It’s well-known that SGD provides strong implicit regularization, but how does it affect the *features* learned by the network? We discuss this in our poster ( #501 ) tomorrow at 10:30 am! Drop by if you are interested :)
@maksym_andr
Maksym Andriushchenko @ ICML'24
2 years
🚨Happy to share our new paper "SGD with large step sizes learns sparse features" ! ❓Why do longer SGD schedules generalize better? What kind of hidden dynamics occurs when the train loss stabilizes? How is that related to sparse feature learning? 🧵1/10
2
72
439
0
11
63
@maksym_andr
Maksym Andriushchenko @ ICML'24
1 year
Interestingly, for many LMs, one can substitute key words in a sentence (e.g. yoga→burg, English→Repl, people→async) without changing its meaning *for the LM*. Similarly to vision models, LMs seem to have quite a few blind spots & unintended behaviors!
Tweet media one
4
9
62
@maksym_andr
Maksym Andriushchenko @ ICML'24
4 years
Robustness to distribution shifts (adversarial or natural) seems to be one of the most important problems in ML. In practice, test data almost never have the same distribution as train data (i.e. "i.i.d." is a *very* strong assumption). Good to see systematic benchmarks on this!
@shiorisagawa
Shiori Sagawa
4 years
We're excited to announce WILDS, a benchmark of in-the-wild distribution shifts with 7 datasets across diverse data modalities and real-world applications. Website: Paper: Github: Thread below. (1/12)
Tweet media one
8
206
897
1
3
60
@maksym_andr
Maksym Andriushchenko @ ICML'24
1 year
🚨 Finally, there is an improvement on RobustBench and by a large margin! 66.56% -> 70.69% robust accuracy on CIFAR-10 with Linf ε=8/255 (without using extra data!). The key component is improved diffusion models for data generation (see for details).
Tweet media one
3
7
61
@maksym_andr
Maksym Andriushchenko @ ICML'24
6 months
Excited to present our ICML'23 work on sharpness and generalization (sorry, no LLMs this time...) at the Math Machine Learning seminar MPI MIS + UCLA on Thursday, 5pm CET on Zoom. Join us :-)
Tweet media one
0
12
60
@maksym_andr
Maksym Andriushchenko @ ICML'24
4 months
🚨Super excited that JailbreakBench is out!🚨 Research on jailbreaking lacks reproducibility. We believe that at this level of LLM capabilities, sharing jailbreak artifacts (with exact and reproducible hyperparameters, prompts, generations) in the open is a net positive for the
@patrickrchao
Patrick Chao
4 months
Are you interested in jailbreaking LLMs? Have you ever wished that jailbreaking research was more standardized, reproducible, or transparent? Check out JailbreakBench, an open benchmark and leaderboard for Jailbreak attacks and defenses on LLMs! 🧵1/n
Tweet media one
2
45
175
1
5
60
@maksym_andr
Maksym Andriushchenko @ ICML'24
2 years
EPFL @EPFL_en is extending its support to students and researchers from Ukraine. Extra funding for research groups that host 🇺🇦 researchers + auditor/visiting status for 🇺🇦 students without any tuition fees. #ScienceForUkraine
Tweet media one
Tweet media two
1
31
58
@maksym_andr
Maksym Andriushchenko @ ICML'24
1 year
Horrific footage from my native city… Russia has been destroying infrastructure (including *health* facilities, like today) and killing civilians for more than a year now. I really admire the resilience of ordinary people who carry on no matter what…
@KyivIndependent
The Kyiv Independent
1 year
Two children, aged 3 and 6, are among the injured in the Russian missile strike on a clinic in Dnipro on May 26. Source: Dnipropetrovsk Governor Serhii Lysak/Telegram
113
722
1K
1
12
57
@maksym_andr
Maksym Andriushchenko @ ICML'24
1 year
Accepted at #ICML2023 ! 🎉
@maksym_andr
Maksym Andriushchenko @ ICML'24
1 year
🚨 Excited to share our new paper “A modern look at the relationship between sharpness and generalization” ! ❓ Do flatter minima really generalize better? Let’s find out! 🧵 1/n
Tweet media one
11
47
267
2
6
58
@maksym_andr
Maksym Andriushchenko @ ICML'24
6 months
it's really impressive that the cake analogy from NeurIPS 2016 (!) exactly describes the full process of training frontier LLMs: pre-training, supervised fine-tuning, RLHF! and all are in the right order :-)
Tweet media one
1
7
55
@maksym_andr
Maksym Andriushchenko @ ICML'24
2 years
It's quite crazy that Gaussian augmentation is sufficient to prevent catastrophic overfitting in adversarial training 🤔 Note that their experimental evaluation is very convincing (unlike in many other papers on catastrophic overfitting). A very nice and practically relevant finding!
@paudejorge
Pau de Jorge
2 years
1/n Check out our new work “Make Some Noise: Reliable and Efficient Single-Step Adversarial Training” with amazing collaborators @adel_bibi @rvolpis @AmartyaSanyal, Philip Torr, Gregory Rogez, and @puneetdokania Paper: Code:
Tweet media one
2
7
39
2
7
56
@maksym_andr
Maksym Andriushchenko @ ICML'24
1 year
Had many discussions at ICML about what our sharpness paper actually implies. Importantly, it doesn't imply that sharpness is useless, particularly since the empirical success of SAM is undeniable. Seems like we need a short thread for this! 🧵1/5
@maksym_andr
Maksym Andriushchenko @ ICML'24
1 year
🚨 Excited to share our new paper “A modern look at the relationship between sharpness and generalization” ! ❓ Do flatter minima really generalize better? Let’s find out! 🧵 1/n
Tweet media one
11
47
267
3
10
56
@maksym_andr
Maksym Andriushchenko @ ICML'24
4 years
To appear at #NeurIPS2020 !
@maksym_andr
Maksym Andriushchenko @ ICML'24
4 years
Ever wondered if FGSM training now really works? Actually, it does, but only for small eps. Catastrophic overfitting is *still a problem* for many recently proposed methods, but this can be fixed with GradAlign. Paper: Code: (1/9)
Tweet media one
2
7
46
3
3
54
@maksym_andr
Maksym Andriushchenko @ ICML'24
1 year
Interesting work about scaling up *plain* MLPs (yes, even without patch extraction). You can get as far as 93.6% accuracy on CIFAR-10 with pre-training on ImageNet-21k. Impressive to see how far one can push data+compute even for such naive architectures!
Tweet media one
1
9
53
@maksym_andr
Maksym Andriushchenko @ ICML'24
1 year
Going to #ICML2023 in Hawaii 🏝️. Message me if you want to chat! I'll present the following papers about implicit regularization of SGD, sharpness of minima, and sharpness-aware minimization: (🧵1/3)
1
2
51
@maksym_andr
Maksym Andriushchenko @ ICML'24
2 months
GPT-4o can't repeat this string. what's happening?
Tweet media one
7
1
51
@maksym_andr
Maksym Andriushchenko @ ICML'24
2 years
I got curious about this and tested ChatGPT on last year's exam from our ML course at EPFL (). Chain-of-thought evaluation with a majority vote over 5 trials gives 10/20 correct answers. Not as great as 30/36 but definitely above chance (≈4/20)! 🧵1/6
@gchrupala
Grzegorz Chrupała 🇪🇺🇺🇦
2 years
I just tested ChatGPT on the open-book multiple choice exam for an introductory ML course I teach. It got 30 out of 36 correct, a respectable grade, and it found a mistake in one question.
11
74
1K
2
5
50
@maksym_andr
Maksym Andriushchenko @ ICML'24
8 months
🚨Our new work on how overparameterization relates to the success of SAM! ❓When does SAM improve generalization? There are multiple factors, but the key one is the *degree of overparameterization*. E.g., if you are pretraining an LLM on _huge_ corpora, don't expect SAM to help!
@sungbin_shin
Sungbin Shin
8 months
❓How does SAM behave when training overparameterized NNs? 👀Check our new paper: "The Effects of Overparameterization on Sharpness-aware Minimization: An Empirical and Theoretical Analysis"! 📢TL;DR: SAM benefits a lot from overparameterization! 🧵1/10
Tweet media one
2
9
49
0
3
50
@maksym_andr
Maksym Andriushchenko @ ICML'24
3 months
"Long is more" is accepted at #ICML2024 ! 🎉 A few more thoughts about this paper: - Fine-tuning on longer instructions ≠ just gaming benchmarks like AlpacaEval. The length (surprisingly) provides a lot of signal for instruction quality. - To further show this, we've done a
Tweet media one
@maksym_andr
Maksym Andriushchenko @ ICML'24
6 months
So, what really matters for instruction fine-tuning? Surprisingly, simply fine-tuning on the *longest* examples is an extremely strong baseline for alignment of LLMs. Really excited to share our new work. Full story below! 🧵1/n
Tweet media one
5
29
152
0
9
49
@maksym_andr
Maksym Andriushchenko @ ICML'24
6 months
everyone says “scaling”, but it basically stopped? like GPT-3 had 175B params 3.5y ago, and the current frontier LLMs seem to have the same order of params… then all further advances must have come from algorithmic improvements: data curation, long context, alignment, etc (!)
5
2
47
@maksym_andr
Maksym Andriushchenko @ ICML'24
2 months
Really like this plot from : - the evolution of GPT models is clear: more refusals on harmful requests while maintaining the same rate of wrong refusals on benign requests, - Claude models refuse more on harmful requests but at the expense of much more
Tweet media one
2
6
47
@maksym_andr
Maksym Andriushchenko @ ICML'24
4 years
Attending #NeurIPS2020 and interested in adversarial robustness? Stop by our poster "Understanding and Improving Fast Adversarial Training" on Wed, 6-8pm CET (9-11am PST), GatherTown: Deep learning (C1) to learn more about FGSM adversarial training. Paper:
Tweet media one
Tweet media two
0
1
47
@maksym_andr
Maksym Andriushchenko @ ICML'24
4 years
Excited to receive a best reviewer award at the ICLR'20 workshop ! Thanks to @NicolasPapernot , @florian_tramer , @carmelatroncoso , @ShibaniSan , Nicholas Carlini for organizing. Very interesting list of accepted papers and speakers!
@NicolasPapernot
Nicolas Papernot
4 years
Congratulations to @maksym_andr @realyangzhang @RICEric22 for their best reviewer awards!
1
1
13
3
0
46
@maksym_andr
Maksym Andriushchenko @ ICML'24
11 months
Very interesting: ≈70% dead neurons in early layers and extremely sparse activation patterns in OPT LLMs, especially for the 66B model. Is it just the implicit bias of SGD/Adam (e.g. as in our work ), or does the sparsity come from elsewhere?
Tweet media one
2
10
46
@maksym_andr
Maksym Andriushchenko @ ICML'24
2 years
Excited to share that our paper on adversarial training against common corruptions has been accepted to #UAI2022 ! TLDR: adversarial training (especially wrt a good distance metric) substantially improves accuracy and calibration on common corruptions. A paper 🧵 is below 👇
@maksym_andr
Maksym Andriushchenko @ ICML'24
3 years
Excited to share an updated version of our paper! We show that various adversarial training schemes (Lp and non-Lp) can consistently improve both accuracy *and* calibration on common image corruptions. Paper: Code: 1/n
Tweet media one
2
2
39
2
8
46
@maksym_andr
Maksym Andriushchenko @ ICML'24
6 days
I'll be at ICML in Vienna next week from Monday to Saturday. DM me if you want to chat about robustness and generalization of LLMs! We are presenting 3 papers next week: - [Main track] Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning
Tweet media one
1
4
81
@maksym_andr
Maksym Andriushchenko @ ICML'24
4 years
Ever wondered if FGSM training now really works? Actually, it does, but only for small eps. Catastrophic overfitting is *still a problem* for many recently proposed methods, but this can be fixed with GradAlign. Paper: Code: (1/9)
Tweet media one
2
7
46
@maksym_andr
Maksym Andriushchenko @ ICML'24
2 years
Humanitarian aid collected for 🇺🇦 from the EPFL community. Really impressive! Huge thanks to the organizers (Olexiy Kochubey, @gorilskij , @agepoly ) and numerous volunteers!
0
5
45
@maksym_andr
Maksym Andriushchenko @ ICML'24
2 years
I'll give a talk about our ICML'22 paper tmrw (3pm CET) at the ELLIS Mathematics of Deep Learning reading group. The zoom link is available here: . Feel free to drop by if you want to chat a bit about sharpness and related topics :)
1
9
45
@maksym_andr
Maksym Andriushchenko @ ICML'24
5 months
Very excited about this: our team led by @fra__31 won the SatML trojan detection competition (method: simple random search + heuristic to reduce the search space) Interestingly, the final score (-33.4) is very close to the score on the real trojans (-37.7) RLHFed into the LLMs!
@javirandor
Javier Rando
5 months
We are announcing the winners of our Trojan Detection Competition on Aligned LLMs!! 🥇 @tml_lab ( @fra__31 , @maksym_andr and Nicolas Flammarion) 🥈 @krystof_mitka 🥉 @apeoffire 🧵 With some of the main findings!
1
9
52
2
2
42
@maksym_andr
Maksym Andriushchenko @ ICML'24
3 years
Excited to share that our Sparse-RS paper got accepted at #AAAI2022 ! We propose a framework for query-efficient black-box sparse (L0 / patches / frames) adversarial attacks based on random search. Paper: Code:
Tweet media one
4
5
43
@maksym_andr
Maksym Andriushchenko @ ICML'24
2 years
Excited to attend NeurIPS next week! Feel free to DM me if you are also there and want to chat about training dynamics of deep networks, sharpness, generalization, adversarial robustness or any other exciting topic :)
1
1
42
@maksym_andr
Maksym Andriushchenko @ ICML'24
7 months
Excited to present this work tomorrow at 10am PT (7pm CET) in the reading group (over Zoom)! Stop by if you want to chat a bit about the role of weight decay in modern deep learning, including LLMs :-)
@maksym_andr
Maksym Andriushchenko @ ICML'24
10 months
❓So why do we need weight decay, really? Don't LLMs directly optimize the population loss and regularization is not needed? 🚨 Our new paper provides a modern look at the old question ! (joint work with @dngfra , @adityavardhanv , @tml_lab ) 🧵1/n
Tweet media one
2
22
189
0
7
42
@maksym_andr
Maksym Andriushchenko @ ICML'24
8 months
the poster is at 5:15pm today ( #507 ). drop by if you want to chat about sharpness and generalization in deep learning!
@maksym_andr
Maksym Andriushchenko @ ICML'24
1 year
🚨Excited to share our new work “Sharpness-Aware Minimization Leads to Low-Rank Features” ! ❓We know SAM improves generalization, but can we better understand the structure of features learned by SAM? (with @dara_bahri , @TheGradient , N. Flammarion) 🧵1/n
Tweet media one
4
24
157
1
3
41
@maksym_andr
Maksym Andriushchenko @ ICML'24
10 months
Also this work led by Klim Kireev and @carmelatroncoso is accepted at #NeurIPS2023 🎉 Adversarial robustness for tabular data is (still) underrated!
Tweet media one
1
3
41
@maksym_andr
Maksym Andriushchenko @ ICML'24
1 year
📜 ImageNet is not really that diverse: "the intra-class similarity of images in the original ImageNet is dramatically higher than it is for LAIONet [ImageNet recreation using LAION] ... models trained on ImageNet perform significantly worse on LAIONet."
Tweet media one
0
2
39
@maksym_andr
Maksym Andriushchenko @ ICML'24
7 days
🆕 Since many people have asked about the past tense attack on Claude. Here are the extended results on Claude-3.5 Sonnet: 0% -> 53% success rate with the GPT-4 judge and 0% -> 25% with the Llama-3 judge. As a bonus, GPT-4o-mini (the one announced today) leads to very similar
Tweet media one
6
6
45
@maksym_andr
Maksym Andriushchenko @ ICML'24
5 years
Happy to share our new paper on provable robustness for boosting. For boosted stumps, we can solve the min-max problem *exactly*. For boosted trees, we minimize an upper bound on robust loss. Everything is nice & convex! Paper Code
Tweet media one
Tweet media two
Tweet media three
2
16
40
@maksym_andr
Maksym Andriushchenko @ ICML'24
3 months
We are presenting three works at ICLR next week: 1️⃣ Layer-wise Linear Mode Connectivity (main track): some curious explorations of the layer-wise structure of deep net loss surfaces. Includes experiments on ConvNets, transformers, Pythia LLMs, connections to federated learning
0
7
40
@maksym_andr
Maksym Andriushchenko @ ICML'24
4 months
some folks started retweeting this paper of ours again, so i guess i'll also remind everyone of its existence :-) can be of interest in the context of input-dependent sparsity of LLMs!
@maksym_andr
Maksym Andriushchenko @ ICML'24
2 years
🚨Happy to share our new paper "SGD with large step sizes learns sparse features" ! ❓Why do longer SGD schedules generalize better? What kind of hidden dynamics occurs when the train loss stabilizes? How is that related to sparse feature learning? 🧵1/10
2
72
439
0
1
39
@maksym_andr
Maksym Andriushchenko @ ICML'24
3 years
Excited to share an updated version of our paper! We show that various adversarial training schemes (Lp and non-Lp) can consistently improve both accuracy *and* calibration on common image corruptions. Paper: Code: 1/n
Tweet media one
2
2
39
@maksym_andr
Maksym Andriushchenko @ ICML'24
1 month
🆕We will present a short version of our adaptive attack paper at the ICML '24 NextGenAISafety Workshop. See some of you there! 🚨We've also just released the v2 of the paper on arXiv. Main updates: - more models: Llama-3, Phi-3, Nemotron-4-340B (100%
Tweet media one
1
5
38
@maksym_andr
Maksym Andriushchenko @ ICML'24
9 months
who did this? 🤣 "Our paramount contribution lies in unveiling a path less traveled, where the integration of kernel machines and LLMs unveils a promising vista, enabling the realization of sophisticated language processing tasks"
Tweet media one
7
5
37
@maksym_andr
Maksym Andriushchenko @ ICML'24
1 year
Adversarial examples are back :-)
@zicokolter
Zico Kolter
1 year
@CadeMetz at the New York Times just published a piece on a new paper we are releasing today, on adversarial attacks against LLMs. You can read the piece here: And find more info and the paper at: [1/n]
9
79
355
4
2
37
@maksym_andr
Maksym Andriushchenko @ ICML'24
1 year
Hike to Mauna Kea 🌋 (4207m) ✅ Now back to ICML! :)
Tweet media one
0
0
36
@maksym_andr
Maksym Andriushchenko @ ICML'24
3 years
Truly excited to share this update! In particular, RobustBench now has 60+ robust models (including the recent SOTA models from @DeepMind ) which are all open-source and available with 1 line of code :) (+ check out some interesting new analysis based on them in the whitepaper!)
@fra__31
francesco croce
3 years
RobustBench v0.2 & updated paper are out! We now have 5 leaderboards, 90+ evaluations & 60+ models available in the Model Zoo. We also extend the analysis with calibration, fairness, smoothness, privacy & transferability. Website: Paper: 1/9
Tweet media one
1
3
16
0
1
36
@maksym_andr
Maksym Andriushchenko @ ICML'24
5 months
"Long is more" is now accepted at ICLR 2024 Data-Centric ML Research Workshop 🎉 Talk to @H_aoZhao if you are going to Vienna and interested in superficial alignment and related stuff!
@maksym_andr
Maksym Andriushchenko @ ICML'24
6 months
So, what really matters for instruction fine-tuning? Surprisingly, simply fine-tuning on the *longest* examples is an extremely strong baseline for alignment of LLMs. Really excited to share our new work. Full story below! 🧵1/n
Tweet media one
5
29
152
2
1
35
@maksym_andr
Maksym Andriushchenko @ ICML'24
5 years
Our Square Attack achieves the 1st place on the Madry et al. MNIST challenge! Remarkably, this is the *only* black-box attack in the leaderboard and it performs better than all other submitted *white-box* attacks. Code of the attack is publicly available:
Tweet media one
0
8
36
@maksym_andr
Maksym Andriushchenko @ ICML'24
5 years
Excited to share the final version of our #NeurIPS2019 paper! SOTA provable robustness for boosted trees + competitive results to provably robust CNNs on MNIST/FMNIST/CIFAR-10 using *boosted trees*. 1/n Paper: Code:
Tweet media one
Tweet media two
Tweet media three
7
6
34
@maksym_andr
Maksym Andriushchenko @ ICML'24
3 years
Really excited to start a summer research internship at @AdobeResearch today✌️ Looking forward to working with the great people there!
Tweet media one
0
0
34
@maksym_andr
Maksym Andriushchenko @ ICML'24
5 years
Interesting paper that gives convincing empirical evidence that NNs generalize because SGD almost always finds "flat", stable minima, since they have exponentially larger basins of attraction. That said, there still exist terrible minima that do not generalize.
Tweet media one
0
6
33
@maksym_andr
Maksym Andriushchenko @ ICML'24
4 years
Our Square Attack paper has been accepted at #ECCV2020 ! Long story short: don't estimate the gradients for black-box adversarial attacks, just perform a search over the extreme points of the feasible set! Paper: Code:
Tweet media one
3
1
33
@maksym_andr
Maksym Andriushchenko @ ICML'24
5 years
Truly excited that our paper got an oral at #CVPR2019! In the paper, we give a theoretical argument for why ReLU activation can lead to models with overconfident predictions. Moreover, we propose a robust optimization training scheme that mitigates this problem.
Tweet media one
1
5
33
@maksym_andr
Maksym Andriushchenko @ ICML'24
2 years
Sparsity of *activations* is pervasive in deep networks trained using standard techniques (which, importantly, involves SGD with large step sizes!). Well-illustrated in this recent paper ⬇️
Tweet media one
0
1
32
@maksym_andr
Maksym Andriushchenko @ ICML'24
3 months
wow, perhaps the most interesting message is not that phi3 overfits (under some prompting template), but that it performs so well for its size even on a held-out dataset!
@SebastienBubeck
Sebastien Bubeck
3 months
I'm super excited by the new eval released by Scale AI! They developed an alternative 1k GSM8k-like examples that no model has ever seen. Here are the numbers with the alt format (appendix C): GPT-4-turbo: 84.9% phi-3-mini: 76.3% Pretty good for a 3.8B model :-).
8
20
252
2
2
32
@maksym_andr
Maksym Andriushchenko @ ICML'24
11 months
Excited to present our ICML paper about SGD and sparse features at the ELLIS Reading Group on Mathematics of Deep Learning tomorrow at 3pm CET: Stop by if you want to chat a bit about the implicit regularization of SGD!
@maksym_andr
Maksym Andriushchenko @ ICML'24
2 years
🚨Happy to share our new paper "SGD with large step sizes learns sparse features" ! ❓Why do longer SGD schedules generalize better? What kind of hidden dynamics occurs when the train loss stabilizes? How is that related to sparse feature learning? 🧵1/10
2
72
439
0
3
31