So many new LLM architectures (Mambas🐍, Transformers🤖,🦙,🦔, Hyenas🐺,🦓…), so little GPU time to combine them into hybrid LLMs…
Good news! Today we release Manticore, a system for creating **pretrained hybrids** from pretrained models! 👨🌾🦁🦂
1/n
Tired of reading about superconductors?
Check out our new work, just out on arXiv, rethinking how we get predictions out of classifiers and how to incorporate the *geometry* of the labels, i.e., how labels relate to one another! ⚙️📐
[1/n]
ResNet not working? Use our #NeurIPS2021 paper to find what to use in place of convs by searching our space of “XD-operations” containing convs, Fourier neural operators, graph convs, SOTA ops for neural PDE solvers, and infinitely many more [1/n]
Excited to share our Automated Weak Supervision benchmark at @NeurIPSConf next week!
We’ll be in Hall J, #1029 at 11:30a on Thursday – drop by and chat with us! #NeurIPS2022
[1/n]
I’ll be in Vienna this week for ICML! I’ll be presenting Manticore, our exciting new method for creating pretrained hybrid LLMs, later in the week at the ES-FoMo, FM-Wild, NGSM, and LCFM workshops.
Come by to chat about pretrained hybrid models!
Stoked to be headed to @NeurIPSConf #NeurIPS2023 soon!
Come check out our papers this year!
(Thurs 10:45) Geometry Aware Adaptation for Pretrained Models
(Weds 10:45) Skill-it! A Data-Driven Skills Framework for Understanding and Training LMs
🎉
Can’t wait to share NAS-Bench-360, our new Neural Architecture Search benchmark for diverse tasks at @NeurIPSConf next week!
Come chat with us on Tuesday in Hall J, #1029 at 11:30! #NeurIPS2022
[1/n]
This work was just accepted to #NeurIPS2023! Unfortunately it’s not about superconductors (what a throwback, right?)
Check out our thread to learn how to improve your existing classifiers using the geometry of the label space!
We are pleased to present the inaugural MLCommons Rising Stars cohort. These talented PhD students are the future leaders of ML and Systems research.
I’m super hyped to finally spam Twitter about this!
The winning team will be getting a $15,000 top prize—BTW, if you’re at @UWMadison, you should be aware that our top prize is roughly equivalent to 100 years of @hoofersailing membership. #doubleAdvertisement
A little over a week ago, we launched the AutoML Decathlon, a #NeurIPS2022 competition to develop efficient AutoML methods that work on diverse machine learning tasks, for a chance to win a $15,000 top prize! [1/n]
I’ll be in Berlin for the @automl_conf next week, hit me up if you want to chat!
PS: “Why is he tweeting from Montana?”
I’m en route back to Madison from Seattle trying to catch my flight to Berlin on Wednesday 🤠. So I’m kind of already on my way to the conference!
Had such a fun time co-leading the organizing team for the AutoML Decathlon competition! We just had our virtual workshop @NeurIPSConf this morning, so in case you missed it, here’s the final leaderboard!
Really excited to share more details soon!
Thrilled to share the final *test* leaderboard rankings for the AutoML Decathlon 2022 competition!!!
Big congrats to Team TrueFit for winning AutoML Decathlon 2022 and the $15,000 grand prize!!! 🧵🧵🧵
Congrats to the winning team and to the runner-up team for the best presentations today at the #AutoMLFallSchool @AutoMLDecathlon Hackathon, and major props to everyone who participated!
Also a huge thanks and shoutout to the #AutoMLFallSchool organizers for having us!!!
Generative models are awesome at producing data, and weak supervision is great at efficient labeling. Can we combine them to get cheap datasets for training or fine-tuning?
Excited to present our #ICLR2023 paper "Generative Modeling Helps Weak Supervision (and Vice Versa)"
Super pumped to be presenting AutoWS-Bench-101 at NeurIPS next week!
I will be generating more spam about this and our other AutoML for diverse tasks work at NeurIPS in the coming days… So stay tuned 🎸
And thank you @SnorkelAI for the shoutout!
AutoWS-Bench-101 by @nick11roberts, @fredsala, et al. evaluates automated weak supervision (AutoWS) techniques on a set of diverse application domains where it has previously been difficult or impossible to apply traditional WS techniques. [4/7]
***Manticore addresses both of these!***
We automate the design of hybrids, while using existing pretrained models that you can just get from Hugging Face, to create PRETRAINED hybrids!
3/n
While hybrid models are quickly gaining traction, there are a bunch of challenges:
They require hardcore, manual, expert-driven design, and new hybrids must be trained completely from scratch.
Tough luck for us GPU-poor, right???
2/n
Stoked about this work! If you’re interested in LMs and/or LLMs, this paper is for you, so check out Mayee’s thread on this!!!
Side note: I guess there’s a band called “Skillet,” but this paper is very, very unrelated—actually not my first naming clarification today 🤠
Large language models (LMs) rely heavily on training data quality. How do we best select training data for good downstream model performance across tasks? Introducing 🍳Skill-It: a data-driven framework for understanding and training LMs!
Paper: 1/13
Stoked to announce the AutoML Cup 2023 at @automl_conf!!!
This competition is the direct follow-up to the AutoML Decathlon 2022 that we ran as part of the @NeurIPSConf competition track.
Stay tuned for updates! 🏆🤖🎸
Announcing the AutoML Cup 2023!!! 🤖🏆📊
The AutoML Cup is an automated machine learning competition with a focus on diverse machine learning tasks and data settings — which will be part of the @automl_conf 2023.
[LG] Pretrained Hybrids with MAD Skills
N Roberts, S Guo, Z Gao, S S S N GNVV… [University of Wisconsin-Madison] (2024)
- Proposes Manticore, a framework to automatically design hybrid architectures combining different pretrained models like Transformers
We will be releasing code for Manticore and models shortly, so stay tuned!
Had a blast creating this with Wisconsin friends: Samuel Guo, @Zhiqi_Gao_2001, Satya Sai Srinath Namburi GNVV, @SonNicCr, Chengjun Wu, Chengyu Duan, and @fredsala
12/12
Manticore uses ideas from Neural Architecture Search (NAS) and simple “projector” layers that can translate the features between pretrained blocks with different architectures.
Ok, time for an example…
4/n
By the way, the name “Manticore” comes from Persian mythology:
The Manticore is a hybrid creature with the head of a human, the body of a lion, and the tail of a scorpion.
👨🌾🦁🦂/12
Super excited about this line of work and even more excited to see what people come up with for diverse tasks!!!
Check out NAS-Bench-360:
and our brand new AutoML Decathlon competition at NeurIPS 2022:
Do state-of-the-art AutoML methods work on diverse tasks? @khodakmoments and @atalwalkar introduce a new benchmark and a NeurIPS 2022 competition with the goal of finding out:
Blog:
The search trajectory from MAD seems to follow the architecture gradient on the fine-tuning task!!! Isn’t that neat? This suggests that the pretrained hybrids that we search for on these tasks may have some form of universality.
10/n
@zacharylipton So long as search spaces are inspired by existing architectures, you’re going to get things that look like existing architectures.
True for vision/NLP/well-explored domains, which limits the gains there. Instead, NAS for adapting these architectures to diverse tasks is the way.
Remember how much you hated Joffrey in Game of Thrones when you still liked Game of Thrones? That’s XGBoost right now.
Submit your methods to AutoML Decathlon today to claim your place on the iron throne + $15,000!!!
Think you can develop a machine learning method that beats XGBoost and a linear model on diverse tasks for $15,000? We think you can too.
Right now is the perfect time to submit to the AutoML Decathlon 2022 competition at #NeurIPS2022!
[1/n]
Competitions and benchmarks have been one of the major accelerators in AutoML.
@nick11roberts, @williamcxxz, and Samuel Guo will present the results and insights of the #NeurIPS2022 AutoML Decathlon challenge on Thursday: .
Lots of cool things at the AutoML Fall School — including the AutoML Decathlon *HACKATHON*
Excited to help folks get a running start on their submissions!
We have finalized the schedule for the upcoming #AutoML Fall School in October. I believe this is an excellent mix of invited lectures and hands-on sessions, both for academic packages and industry software. I hope I will see you there!
Can we align pre-trained models quickly and at no cost? 🤔 Sounds challenging!
Our latest research tackles this question. Surprisingly, we found compelling evidence that it just might be possible! 🌟🔍
Preprint:
We do this by using these super helpful synthetics as a proxy for search, instead of just doing search on the downstream fine-tuning task…
The losses here are on the fine-tuning task, while the search trajectory is from the MAD tasks…
9/n
Throw in some linear layers (with skip connections and gating) before and after the blocks from each of the two models to **translate** the features between them so that they can ‘speak a common language.’
6/n
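To make the projector idea concrete, here’s a minimal PyTorch sketch (the module name, dimensions, and exact gating scheme are my own illustrative assumptions, not the actual Manticore implementation):

```python
import torch
import torch.nn as nn

class Projector(nn.Module):
    """Illustrative sketch: translate features from one model's width to
    another's, with a gated skip connection (not the exact Manticore layer)."""
    def __init__(self, d_in: int, d_out: int):
        super().__init__()
        self.proj = nn.Linear(d_in, d_out)
        self.skip = nn.Linear(d_in, d_out) if d_in != d_out else nn.Identity()
        self.gate = nn.Parameter(torch.zeros(1))  # learned scalar gate

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        g = torch.sigmoid(self.gate)  # convex mix of projected vs. skipped features
        return g * self.proj(x) + (1 - g) * self.skip(x)

# e.g., translate hypothetical 768-dim Mamba features into a 1024-dim Transformer block
x = torch.randn(2, 16, 768)           # (batch, seq, d_in)
print(Projector(768, 1024)(x).shape)  # torch.Size([2, 16, 1024])
```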
A short poem:
We can’t let XGBoost win $15K,
It’s such a simple baseline, no way!
Submit to AutoML Decathlon today~
And be sure to follow @AutoMLDecathlon for leaderboard updates! XGBoost won’t be in the lead for long. 🙂
Here’s another banger — when the Mamba and Transformer models have different “skills,” they can result in a hybrid that is better on fine-tuning tasks where both skills are required.
(shameless Skill-It! 🍳 plug @MayeeChen)
11/n
So that’s how it’s done, folks.
“But wait, is the search for mixture weights expensive?”
I knew you’d ask, and no, it’s not. You can actually just ‘program’ the mixture weights using the amazing synthetic Mechanistic Architecture Design (MAD) tasks
8/n
Next, we want to learn how much influence each model will have on the overall hybrid (because what if one of them doesn’t perform well?)
This is where the NAS stuff comes in. We search for mixture weights of a convex combination of their blocks:
7/n
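In code, the convex combination might look something like this (a toy sketch with invented names; in the real system the two blocks are pretrained Mamba/Transformer blocks wrapped in projectors, and the mixture weights are what we search over):

```python
import torch
import torch.nn as nn

class MixedBlock(nn.Module):
    """Toy sketch: convex combination of two blocks' outputs; the mixture
    weights are the searchable architecture parameters."""
    def __init__(self, block_a: nn.Module, block_b: nn.Module):
        super().__init__()
        self.block_a, self.block_b = block_a, block_b
        self.logits = nn.Parameter(torch.zeros(2))  # mixture weights to search over

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = torch.softmax(self.logits, dim=0)  # w >= 0 and sums to 1: convex
        return w[0] * self.block_a(x) + w[1] * self.block_b(x)

# stand-in blocks for illustration (imagine a Mamba block and a Transformer block)
mixed = MixedBlock(nn.Linear(64, 64), nn.Linear(64, 64))
print(mixed(torch.randn(2, 64)).shape)  # torch.Size([2, 64])
```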
You might have noticed that our method is called “Loki.” It’s not a Marvel reference; it comes from “loci,” the plural of “locus,” as in the locus of the Fréchet mean.
This is (kind of) a convex hull analogue for metric spaces — check out why we used this weird name HERE:
@zacharylipton Not saying that we are, but it seems like long-term human architecture search is better at finding reusable motifs than NAS algorithms in general. I suspect that this is why traditional NAS is often used to navigate the perf-efficiency curve rather than to actually do this.
More generally, if you want to meet up, I’ll be around all week. Feel free to reach out and we can find a slot!
Looking forward to seeing folks in New Orleans next week!!!
@zacharylipton In general, agreed!
Though it sounds expensive, and even then, gluing together operations in different combinations in hopes of finding new motifs is probably only useful in domains where humans haven’t basically done years of Human Architecture Search themselves.
ALERT!!! AutoML Decathlon @NeurIPSConf 2022 update:
The reign of XGBoost has, at long last, come to an end… And a new competitor enters the ring!
Reminder to submit your methods to AutoML Decathlon for a chance to win $15,000 and eternal glory!
NEW COMPETITOR ALERT!!!
XGBoost has been dethroned and a *NEW* competitor is leading in the AutoML Decathlon competition at @NeurIPSConf 2022!!!
Submit your method today for a chance to win the $15,000 top prize and stay tuned for more leaderboard updates!
The #AutoML Fall School 2023 joins forces with the AutoML Decathlon team. This means @atalwalkar, Samuel, and Nick will give a hands-on introduction to the Decathlon setup at the Fall School, and we will spend the hackathon coming up with good submissions for the Decathlon.
Super excited to share that our work with @nick11roberts and my advisor @fredsala, “Lifting Weak Supervision to Structured Prediction,” has been accepted at #NeurIPS2022. Preprint coming soon!
With only about a month left, submit today for eternal glory etc. + $15,000!
*Fun fact:* $15,000 can buy you the complete box set of all 6 seasons of Lost 53 times over!
“Kate! We have to go back!”
No Jack, you and Kate need to submit to the AutoML Decathlon!!!
LEADERBOARD UPDATE!!! Many new competitors have entered the ring, including the #Minions, who are now the third team to beat XGBoost!
***We are entering the final month of the AutoML Decathlon @NeurIPSConf 2022 competition, so submit your methods soon!!!***
I will get my #Prius #hitched as soon as I find someone foolish enough to do the job! After that, I will just tow a vintage Skeeter ice boat out to #LakeMonona a few weekends out of the year.
Excited to present our work on 💪🏋️ Lifting Weak Supervision to Structured Prediction 💪🏋️ @NeurIPSConf this week! We’ll be in Hall J, #334 at 4pm on Wednesday – drop by and chat with us! #NeurIPS2022
[1/n]
In this example, the grid can be arbitrarily large, but your pretrained classifier only needs to be trained on a constant number of classes (4).
“But don’t predictions suffer if the space is large?”
Yeah, yeah, we have learning theory results for the prediction error:
[11/n]
The goal of NAS is to automate the design of neural networks for a given task, which saves human effort. This process typically involves the following three components: a search space, a search method, and a way to estimate performance.
[2/n]
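As a toy illustration of how those three components fit together (pure random search, with an invented search space and a stand-in performance estimator; real NAS methods are much smarter about the search):

```python
import random

# Search space: invented knobs, purely for illustration
search_space = {"depth": [2, 4, 8], "width": [64, 128], "op": ["conv", "attention"]}

def estimate_performance(arch: dict) -> float:
    """Stand-in estimator; in practice, train/validate a model built from `arch`."""
    return random.random()

def random_search(n_trials: int = 10) -> dict:
    """Simplest possible search method: sample from the space, keep the best."""
    best_arch, best_score = None, float("-inf")
    for _ in range(n_trials):
        arch = {k: random.choice(v) for k, v in search_space.items()}
        score = estimate_performance(arch)
        if score > best_score:
            best_arch, best_score = arch, score
    return best_arch

print(random_search())
```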
This interface works by setting the weights of this Fréchet mean. There are a bunch of ways to set the weights for this interface, some covered by our prior work and
[8/n]
📢 A fun blog post 📢 with my advisor @fredsala
Check out our blog post on improving the reliability of LLMs via aggregation using super cool classical tools 🔧🔨.
Blog post:
Paper:
Code:
With this substitution, we slowly realized that our search space elegantly encodes many interesting neural operations. Most excitingly, graph convs and Fourier neural operators @ZongyiLiCaltech are XD-operations. [9/n]
Leaderboard update!!!
New competitors have entered the ring.
Submit your method today to earn your spot on the leaderboard and for a chance to win $15,000!
Also, today is the last day to get early-bird registration for the #AutoML Fall School and to get a *head start* on the @AutoMLDecathlon competition by participating in the Fall School hackathon!!!
We're very excited to be joining forces with the AutoML Fall School 2022!
Register for the #AutoML Fall School today and get a head start on the AutoML Decathlon!!!
Quite literally—*register today*, because today is the last day for early-bird registration for the Fall School!
This allows the pretrained model to 🗺️ navigate 🗺️ the metric space of labels just by using the softmax outputs.
E.g., if your metric space is a grid, but your classifier can only output probabilities for the classes in the corners, you can actually output any class!
[10/n]
Jointly led with @XintongLi0501 (who is actively applying to Ph.D. programs!), with help from @zihengh1, @dyhadila, Spencer Schoenberg, Cheng-Yu Liu, Lauren Pick, Haotian Ma, and with guidance from @fredsala and @awsTO!
[12/n], n=12
How the heck do we do this?
In short,
- we deal with the size of the space by using the metric geometry of the labels, and
- we use the Fréchet mean as a plug-in to ⚙️interface⚙️ a ***pretrained model*** with the metric space
[7/n]
No! In reality, we basically want outputs resembling data structures.
Examples:
- ASTs
- chains of thought
- folded proteins
Most ML builds up to these as best we can using our base primitives, but doesn’t use the native relationships between these objects.
[3/n]
We also have a bunch of other exciting theory results:
- characterizing what classes your pretrained model needs,
- how to optimally expand the set of classes you can predict,
- how to efficiently figure out what classes you can actually predict
[12/n]
Experimentally, we show consistent lifts over just using the standard argmax prediction rule (which we actually generalize, see paper).
We even show that in cases where you actually can predict everything (zero-shot CLIP), just using the metric geometry STILL helps!
[13/n]
Can’t wait to attend @NeurIPSConf tomorrow, my first in-person conference in way too long! And excited to share this experience with several students / collaborators who are finally getting to present their work in person... 1/N
A promising answer is to use Automated Weak Supervision (AutoWS), which replaces labeling functions with weak learners obtained using a small amount of labeled data. Here’s that visualization of the AutoWS pipeline again:
[4/n]
Weak Supervision is a super powerful framework for constructing labeled datasets – instead of actual labels, it relies on having access to several “labeling functions” that are able to produce noisy guesses about the true label Y, given some X.
[2/n]
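For intuition, here’s a tiny invented example of labeling functions for spam detection (real WS systems learn to model the accuracies and correlations of these noisy votes; plain majority vote is used below as a stand-in):

```python
# Invented labeling functions for spam detection; each votes or abstains.
SPAM, NOT_SPAM, ABSTAIN = 1, 0, -1

def lf_mentions_prize(x: str) -> int:
    return SPAM if "you've won" in x.lower() else ABSTAIN

def lf_has_unsubscribe(x: str) -> int:
    return NOT_SPAM if "unsubscribe" in x.lower() else ABSTAIN

def lf_many_exclamations(x: str) -> int:
    return SPAM if x.count("!") >= 3 else ABSTAIN

def majority_vote(x: str) -> int:
    """Aggregate the noisy votes; real label models weight LFs by estimated accuracy."""
    votes = [lf(x) for lf in (lf_mentions_prize, lf_has_unsubscribe, lf_many_exclamations)]
    votes = [v for v in votes if v != ABSTAIN]
    return max(set(votes), key=votes.count) if votes else ABSTAIN

print(majority_vote("You've won a prize!!! Click now!!!"))  # 1 (SPAM)
```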
Also no! We show how you can *reuse a pretrained classifier* that was trained only on a subset of the space to predict anything you want in your complicated label space of data structures.
[6/n]
So yeah, it seems like training models on such a huge label space, if you try to do it without using base primitives, might be pretty much impossible…
Wouldn’t you need training examples representing every possible AST?
[5/n]
Without primitives, these output spaces are huge…
Let’s consider the AST space — Cayley’s formula tells us that the size of the space is actually *worse* than exponential: there are n^(n-2) labeled trees on n vertices. For just 10 vertices, that’s already 10^8 trees.
Like, Zoinks Scoob…
[4/n]
@khoomeik Seems like there should be a way to generalize this to trade off perf and how much you can parallelize it, with this being one extreme and backprop being the other extreme. Neat project!
Let’s start off by unboxing how we make predictions using machine learning…
The majority of ML uses a really simple set of base primitives as labels — binary, multi-class, and regression labels. But is this actually what we want in practice?
[2/n]
But here, we want to somehow plug in a pretrained model to the Fréchet mean…
A natural choice is to directly use the per-class probability estimates from the softmax as the weights!
[9/n]
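Concretely, on a finite label space the weighted Fréchet mean is just an argmin of a weighted sum of squared distances. A minimal NumPy sketch of the grid example (all names and numbers here are mine, for illustration):

```python
import numpy as np

def weighted_frechet_mean(dists: np.ndarray, weights: np.ndarray) -> int:
    """argmin_z sum_k w_k * d(z, c_k)^2 over candidate labels z,
    where c_k are the classes the pretrained model actually knows."""
    return int(np.argmin((dists ** 2) @ weights))

# Toy setup: labels are the 9 cells of a 3x3 grid under Manhattan distance,
# but the classifier was only trained on the 4 corner classes.
cells = [(i, j) for i in range(3) for j in range(3)]
corners = [(0, 0), (0, 2), (2, 0), (2, 2)]
dists = np.array([[abs(a - c) + abs(b - d) for (c, d) in corners] for (a, b) in cells])

softmax_probs = np.array([0.25, 0.25, 0.25, 0.25])  # equal pull toward every corner
print(cells[weighted_frechet_mean(dists, softmax_probs)])  # (1, 1): the center cell!
```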
@BlancheMinerva @zacharylipton Can you elaborate on this? Typically when people say this, they really mean random search with weight sharing, which IS a (single-shot) NAS algorithm.
On the other hand, if you have unlimited time and compute, RS will outperform anything.
Prior work found that the same set of NAS operations was important across the vision tasks of NAS-Bench-201 – we found that this was not true for diverse tasks:
[11/n]
This pipeline works well for text data – it’s easy to write label functions for text. OTOH, it’s quite a bit harder to write these label functions for data with more complex features, such as images or the vast majority of other ML tasks.
[3/n]
@AlexHermstad I’m not aware of any current CS curriculum doing this, no. However, my intro-to-programming class in high school started off by teaching people using a flowchart-based UI algorithm builder that compiled into code if you wanted it to.