Nicholas Roberts

@nick11roberts

828 Followers · 1,560 Following · 57 Media · 443 Statuses

Ph.D. student @WisconsinCS. Working on data-centric automated machine learning. Previously at CMU @mldcmu, UCSD @ucsd_cse, FCC @fresnocity.

Madison, WI
Joined April 2012
Pinned Tweet
@nick11roberts
Nicholas Roberts
5 months
So many new LLM architectures (Mambas🐍, Transformers🤖,🦙,🦔, Hyenas🐺,🦓…), so little GPU time to combine them into hybrid LLMs… Good news! Today we release Manticore, a system for creating **pretrained hybrids** from pretrained models! 👨‍🌾🦁🦂 1/n
Tweet media one
9
52
185
@nick11roberts
Nicholas Roberts
4 years
Life update: I'm super excited to share that I'll be starting my Ph.D. at @WisconsinCS in the Fall!
5
1
92
@nick11roberts
Nicholas Roberts
1 year
Tired of reading about superconductors? Check out our new work that just hit arXiv: about rethinking how we get predictions out of classifiers, and how to incorporate the *geometry* of the labels —i.e., how labels relate to one another! ⚙️📐 [1/n]
4
15
82
@nick11roberts
Nicholas Roberts
3 years
ResNet not working? Use our #NeurIPS2021 paper to find what to use in place of convs by searching our space of “XD-operations” containing convs, Fourier neural operators, graph convs, SOTA ops for neural PDE solvers, and infinitely many more [1/n]
2
13
46
@nick11roberts
Nicholas Roberts
2 years
Excited to share our Automated Weak Supervision benchmark at @NeurIPSConf next week! We’ll be in Hall J, #1029 at 11:30a on Thursday – drop by and chat with us! #NeurIPS2022 [1/n]
Tweet media one
2
12
38
@nick11roberts
Nicholas Roberts
3 months
I’ll be in Vienna this week for ICML! I’ll be presenting Manticore — our exciting new method for creating pretrained hybrid LLMs later in the week at the ES-FoMo, FM-Wild, NGSM, and LCFM workshops. Come by to chat about pretrained hybrid models!
Tweet media one
6
8
34
@nick11roberts
Nicholas Roberts
11 months
Stoked to be headed to @NeurIPSConf #NeurIPS2023 soon! Come check out our papers this year!
(Thurs 10:45) Geometry Aware Adaptation for Pretrained Models
(Weds 10:45) Skill-it! A Data-Driven Skills Framework for Understanding and Training LMs 🎉
@nick11roberts
Nicholas Roberts
1 year
Tired of reading about superconductors? Check out our new work that just hit arXiv: about rethinking how we get predictions out of classifiers, and how to incorporate the *geometry* of the labels —i.e., how labels relate to one another! ⚙️📐 [1/n]
4
15
82
1
7
27
@nick11roberts
Nicholas Roberts
2 years
Can’t wait to share NAS-Bench-360, our new Neural Architecture Search benchmark for diverse tasks at @NeurIPSConf next week! Come chat with us on Tuesday in Hall J, #1029 at 11:30! #NeurIPS2022 [1/n]
Tweet media one
1
9
25
@nick11roberts
Nicholas Roberts
1 year
This work was just accepted to #NeurIPS2023 ! Unfortunately it’s not about superconductors (what a throwback right?) Check out our thread to learn about how to improve your existing classifiers using the geometry of the label space!
@nick11roberts
Nicholas Roberts
1 year
Tired of reading about superconductors? Check out our new work that just hit arXiv: about rethinking how we get predictions out of classifiers, and how to incorporate the *geometry* of the labels —i.e., how labels relate to one another! ⚙️📐 [1/n]
4
15
82
0
5
23
@nick11roberts
Nicholas Roberts
1 year
Super excited to have been selected as an MLCommons Rising Star! Thank you @MLCommons , and I look forward to the workshop!!!
@MLCommons
MLCommons
1 year
We are pleased to present the inaugural MLCommons Rising Stars cohort. This talented group of PhD students are the future leaders of ML and Systems research.
0
8
52
0
1
20
@nick11roberts
Nicholas Roberts
2 years
I’m super hyped to finally spam Twitter about this! The winning team will be getting a $15,000 top prize—BTW, if you’re at @UWMadison , you should be aware that our top prize is roughly equivalent to 100 years of @hoofersailing membership. #doubleAdvertisement
@AutoMLDecathlon
AutoML Decathlon
2 years
A little over a week ago, we launched the AutoML Decathlon #NeurIPS2022 competition—a competition to develop efficient AutoML methods that work on diverse machine learning tasks for a chance to win a $15,000 top prize! [1/n]
Tweet media one
1
8
19
0
3
17
@nick11roberts
Nicholas Roberts
1 year
I’ll be in Berlin for the @automl_conf next week, hit me up if you want to chat! PS: “Why is he tweeting from Montana?” I’m en route back to Madison from Seattle trying to catch my flight to Berlin on Wednesday 🤠. So I’m kind of already on my way to the conference!
1
2
16
@nick11roberts
Nicholas Roberts
2 years
Had such a fun time co-leading the organizing team for the AutoML Decathlon competition! We just had our virtual workshop @NeurIPSConf this morning, so in case you missed it, here’s the final leaderboard! Really excited to share more details soon!
@AutoMLDecathlon
AutoML Decathlon
2 years
Thrilled to share the final *test* leaderboard rankings for the AutoML Decathlon 2022 competition!!! Big congrats to Team TrueFit for winning AutoML Decathlon 2022 and the $15,000 grand prize!!! 🧵🧵🧵
Tweet media one
1
2
12
0
1
16
@nick11roberts
Nicholas Roberts
2 years
Congrats to the winning team and to the runner up team for best presentations today at the #AutoMLFallSchool @AutoMLDecathlon Hackathon, and major props to everyone who participated! Also a huge thanks and shoutout to the #AutoMLFallSchool organizers for having us!!!
@LindauerMarius
Marius Lindauer
2 years
Congratulations to the winners of our hackathon at our #AutoMLFallSchool -- winner: Team 42; runner up: MaybeWinning.
Tweet media one
Tweet media two
0
6
37
0
0
11
@nick11roberts
Nicholas Roberts
1 year
We’ll be at the 11:30a poster session today—come by and chat with us! Really stoked about this work!!!
@fredsala
Fred Sala
1 year
Generative models are awesome at producing data, and weak supervision is great at efficient labeling. Can we combine them to get cheap datasets for training or fine-tuning? Excited to present our #ICLR2023 paper "Generative Modeling Helps Weak Supervision (and Vice Versa)"
Tweet media one
2
18
71
0
0
10
@nick11roberts
Nicholas Roberts
2 years
Super pumped to be presenting AutoWS-Bench-101 at NeurIPS next week! I will be generating more spam about this and our other AutoML for diverse tasks work at NeurIPS in the coming days… So stay tuned 🎸 And thank you @SnorkelAI for the shoutout!
@SnorkelAI
Snorkel AI
2 years
AutoWS-Bench-101 by @nick11roberts , @fredsala , et al. for evaluating automated weak supervision (AutoWS) techniques on a set of diverse application domains on which it has been previously difficult or impossible to apply traditional WS techniques. [4/7]
1
0
7
0
2
10
@nick11roberts
Nicholas Roberts
5 months
***Manticore addresses both of these!*** We automate the design of hybrids, while using existing pretrained models that you can just get from Huggingface, to create PRETRAINED hybrids! 3/n
1
0
8
@nick11roberts
Nicholas Roberts
5 months
While hybrid models are quickly gaining traction, there are a bunch of challenges: They require hardcore manual expert-driven design, and new hybrids must be trained completely from scratch. Tough luck for us GPU-poor right??? 2/n
1
0
7
@nick11roberts
Nicholas Roberts
1 year
Stoked about this work! If you’re interested in LMs and/or LLMs, this paper is for you, so check out Mayee’s thread on this!!! Side note: I guess there’s a band called “Skillet,” but this paper is very, very unrelated—actually not my first naming clarification today 🤠
@MayeeChen
Mayee Chen
1 year
Large language models (LMs) rely heavily on training data quality. How do we best select training data for good downstream model performance across tasks? Introducing 🍳Skill-It: a data-driven framework for understanding and training LMs! Paper: 1/13
7
127
463
1
1
9
@nick11roberts
Nicholas Roberts
2 years
Stoked to announce the AutoML Cup 2023 at @automl_conf !!! This competition is the direct follow-up to the AutoML Decathlon 2022 that we ran as part of the @NeurIPSConf competition track. Stay tuned for updates! 🏆🤖🎸
@AutoML_Cup
AutoML Cup
2 years
Announcing the AutoML Cup 2023!!! 🤖🏆📊 The AutoML Cup is an automated machine learning competition with a focus on diverse machine learning tasks and data settings — which will be part of the @automl_conf 2023.
1
9
16
0
2
9
@nick11roberts
Nicholas Roberts
5 months
Thank you for featuring Manticore @fly51fly !
@fly51fly
fly51fly
5 months
[LG] Pretrained Hybrids with MAD Skills N Roberts, S Guo, Z Gao, S S S N GNVV… [University of Wisconsin-Madison] (2024) - Proposes Manticore, a framework to automatically design hybrid architectures combining different pretrained models like Transformers
Tweet media one
Tweet media two
Tweet media three
Tweet media four
0
5
10
1
1
9
@nick11roberts
Nicholas Roberts
5 months
We will be releasing code for Manticore and models shortly, so stay tuned! Had a blast creating this with Wisconsin friends: Samuel Guo, @Zhiqi_Gao_2001 , Satya Sai Srinath Namburi GNVV, @SonNicCr , Chengjun Wu, Chengyu Duan, and @fredsala 12/12
1
1
7
@nick11roberts
Nicholas Roberts
5 months
Manticore uses ideas from Neural Architecture Search (NAS) and simple “projector” layers that can translate the features between pretrained blocks with different architectures. Ok, time for an example… 4/n
1
0
8
@nick11roberts
Nicholas Roberts
3 years
Check out our paper on automated dataset construction for diverse label types tomorrow at @ICLR_conf ! We’ll be at poster session 12, Thurs. evening!
3
3
8
@nick11roberts
Nicholas Roberts
5 months
By the way, the name “Manticore” comes from Persian mythology: The Manticore is a hybrid creature with the head of a human, the body of a lion, and the tail of a scorpion. 👨‍🌾🦁🦂/12
Tweet media one
1
0
8
@nick11roberts
Nicholas Roberts
2 years
Super excited about this line of work and even more excited to see what people come up with for diverse tasks!!! Check out NAS-Bench-360: and our brand new AutoML Decathlon competition at NeurIPS 2022:
@mlcmublog
ML@CMU
2 years
Do state-of-the-art AutoML methods work on diverse tasks? @khodakmoments and @atalwalkar introduce a new benchmark and a NeurIPS 2022 competition with the goal of finding out: Blog:
Tweet media one
Tweet media two
Tweet media three
0
12
35
0
1
8
@nick11roberts
Nicholas Roberts
9 months
They finally made fetch happen…
@WisconsinCS
UW-Madison Computer Sciences
9 months
As the esteemed scholar Gretchen Wieners once said: That's so fetch. Well done and well deserved, @WesSchroll and Tyler!
0
0
3
2
0
7
@nick11roberts
Nicholas Roberts
5 months
The search trajectory from MAD seems to follow the architecture gradient on the fine-tuning task!!! Isn’t that neat? This suggests that the pretrained hybrids that we search for on these tasks may have some form of universality. 10/n
1
0
6
@nick11roberts
Nicholas Roberts
1 year
@zacharylipton So long as search spaces are inspired by existing architectures, you’re going to get things that look like existing architectures. True for vision/NLP/well-explored domains, which limits the gains there. Instead, NAS for adapting these architectures to diverse tasks is the way.
1
0
3
@nick11roberts
Nicholas Roberts
2 years
Remember how much you hated Joffrey in Game of Thrones when you still liked Game of Thrones? That’s XGBoost right now. Submit your methods to AutoML Decathlon today to claim your place on the iron throne + $15,000!!!
Tweet media one
@AutoMLDecathlon
AutoML Decathlon
2 years
Think you can develop a machine learning method that beats XGBoost and a linear model on diverse tasks for $15,000? We think you can too. Right now is the perfect time to submit to the AutoML Decathlon 2022 competition at #NeurIPS2022 ! [1/n]
Tweet media one
1
11
17
0
1
7
@nick11roberts
Nicholas Roberts
2 years
Excited to be giving a talk this Thursday at the AutoML Seminar with Sam Guo and @williamcxxz about the AutoML Decathlon competition!
@AutomlSeminar
AutoML Seminar
2 years
Competitions and benchmarks have been one of the major accelerators in AutoML. @nick11roberts , @williamcxxz and Samuel Guo will present the results and insights of the #Neurips2022 AutoML decathlon challenge on Thursday: .
0
0
7
0
5
7
@nick11roberts
Nicholas Roberts
2 years
Lots of cool things at the AutoML Fall School — including the AutoML Decathlon *HACKATHON* Excited to help folks to get a running start on their submissions!
@LindauerMarius
Marius Lindauer
2 years
We have finalized the schedule for the upcoming #AutoML Fall School in October. I believe that this is an excellent mix between invited lectures and hands-on sessions, both for academic packages and industry software. I hope I will see you there
Tweet media one
0
9
30
0
0
7
@nick11roberts
Nicholas Roberts
5 months
Free self-alignment just dropped — check it out and get yourself Aligned, it’s EZ!
@dyhadila
Dyah Adila🦄
5 months
Can we align pre-trained models quickly and at no cost? 🤔 Sounds challenging! Our latest research tackles this question. Surprisingly, we found compelling evidence that it just might be possible! 🌟🔍 Preprint:
3
16
87
1
0
7
@nick11roberts
Nicholas Roberts
5 months
We do this by using these super helpful synthetics as a proxy for search, instead of just doing search on the downstream fine-tuning task… The losses here are on the fine-tuning task, while the search trajectory is from the MAD tasks… 9/n
Tweet media one
1
0
6
@nick11roberts
Nicholas Roberts
2 years
There are also several other AutoWS techniques that we’re looking to add to our benchmark: AutoSWAP ( @tsengalb99 , @JenJSun , @yisongyue ), ASTRA ( @gkaraml , @AhmedHAwadallah ), Nemo (Cheng-Yu Hsieh, @JieyuZhang20 , @ajratner ), Label Prop. w/WS ( @rpukdeee , @dylanjsam ), & more! [10/n]
1
0
6
@nick11roberts
Nicholas Roberts
5 months
Throw in some linear layers (with skip connects and gating) before and after the blocks from each of the two models to **translate** the features between them so that they can ‘speak a common language’ 6/n
Tweet media one
1
0
6
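The “linear layers (with skip connects and gating)” above can be sketched roughly as follows. This is a hypothetical illustration of the idea, not the actual Manticore code; the class name, gating form, and dimensions are all made up for the sketch:

```python
import torch
import torch.nn as nn

class Projector(nn.Module):
    """Toy 'projector': maps features of width d_in to width d_out,
    with a learned sigmoid gate and a skip connection."""
    def __init__(self, d_in, d_out):
        super().__init__()
        self.linear = nn.Linear(d_in, d_out)
        self.gate = nn.Linear(d_in, d_out)
        # Skip path; a plain identity when the widths already match.
        self.skip = nn.Linear(d_in, d_out) if d_in != d_out else nn.Identity()

    def forward(self, x):
        return torch.sigmoid(self.gate(x)) * self.linear(x) + self.skip(x)

# Translate 64-dim features from one model's block into the 96-dim
# features the other model's block expects.
proj = Projector(64, 96)
out = proj(torch.randn(2, 8, 64))  # shape (batch=2, seq=8, 96)
```

A pair of these, one before and one after a pretrained block, lets blocks with different widths and architectures exchange features.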
@nick11roberts
Nicholas Roberts
5 months
So you want to build a hybrid from the pretrained EleutherAI/pythia-410m and state-spaces/mamba-370m models… Here’s what Manticore does… 5/n
Tweet media one
1
0
6
@nick11roberts
Nicholas Roberts
2 years
A short poem: We can’t let XGBoost win $15K, It’s such a simple baseline, no way! Submit to AutoML Decathlon today~ And be sure to follow @AutoMLDecathlon for leaderboard updates! XGBoost won’t be in the lead for long. 🙂
@AutoMLDecathlon
AutoML Decathlon
2 years
Think you can develop a machine learning method that beats XGBoost and a linear model on diverse tasks for $15,000? We think you can too. Right now is the perfect time to submit to the AutoML Decathlon 2022 competition at #NeurIPS2022 ! [1/n]
Tweet media one
1
11
17
0
0
6
@nick11roberts
Nicholas Roberts
5 months
Here’s another banger — when the Mamba and Transformer models have different “skills” they can result in a hybrid that is better on fine-tuning tasks where both skills are required. (shameless Skill-It! 🍳plug @MayeeChen ) 11/n
Tweet media one
1
0
6
@nick11roberts
Nicholas Roberts
5 months
So that’s how it’s done folks. “But wait, is the search for mixture weights expensive?” I knew you’d ask, and no it’s not. You can actually just ‘program’ the mixture weights using the amazing synthetic Mechanistic Architecture Design (MAD) tasks 8/n
1
0
6
@nick11roberts
Nicholas Roberts
5 months
Next, we want to learn how much influence each model will have on the overall hybrid (because what if one of them doesn’t perform well?) This is where the NAS stuff comes in. We search for mixture weights of a convex combination of their blocks: 7/n
Tweet media one
1
1
6
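The convex combination above can be sketched as a toy module in the style of differentiable NAS. This is an illustrative guess at the setup, not the real implementation; `MixedBlock` and the stand-in blocks are hypothetical:

```python
import torch
import torch.nn as nn

class MixedBlock(nn.Module):
    """Toy convex combination of two blocks, weighted by learned
    architecture logits (the differentiable-NAS trick)."""
    def __init__(self, block_a, block_b):
        super().__init__()
        self.block_a, self.block_b = block_a, block_b
        self.alpha = nn.Parameter(torch.zeros(2))  # mixture logits

    def forward(self, x):
        w = torch.softmax(self.alpha, dim=0)  # convex: w >= 0, sums to 1
        return w[0] * self.block_a(x) + w[1] * self.block_b(x)

# Stand-ins for, say, a Transformer block and a Mamba block of equal width.
mixed = MixedBlock(nn.Linear(32, 32), nn.Linear(32, 32))
out = mixed(torch.randn(4, 32))  # shape (4, 32)
```

Because the softmax keeps the weights on the simplex, searching over `alpha` by gradient descent searches over how much each pretrained model contributes.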
@nick11roberts
Nicholas Roberts
1 year
You might have noticed that our method is called “Loki.” It’s not a Marvel reference: it’s a play on “loci,” the plural of “locus,” as in the locus of the Fréchet mean. This is (kind of) a convex hull analogue for metric spaces — check out why we used this weird name HERE:
1
0
6
@nick11roberts
Nicholas Roberts
1 year
@zacharylipton Not saying that we are, but seems like long-term human architecture search is better at finding reusable motifs than NAS algorithms in general. I suspect that this is why traditional NAS is often used to navigate the perf-efficiency curve rather than to actually do this
1
0
1
@nick11roberts
Nicholas Roberts
9 years
Totally excited for @SDHacks and @CalHacks . This is going to be a good hackathon season.
0
0
5
@nick11roberts
Nicholas Roberts
11 months
More generally, if you want to meet up, I’ll be around all week. Feel free to reach out and we can find a slot. Looking forward to seeing folks in New Orleans next week!!!
0
0
2
@nick11roberts
Nicholas Roberts
1 year
@zacharylipton In general, agreed! Though sounds expensive and even then, gluing together operations in different combinations in hopes to find new motifs is probably only useful in domains where humans haven’t basically done years of Human Architecture Search themselves
1
0
0
@nick11roberts
Nicholas Roberts
4 months
At the iconic @splashcafe in #Pismo ! Had a wonderful time ❤️
@dyhadila
Dyah Adila🦄
4 months
Sprocket Lab Cali Central coast trip ❤️
Tweet media one
0
0
13
0
0
5
@nick11roberts
Nicholas Roberts
2 years
ALERT!!! AutoML Decathlon @NeurIPSConf 2022 update: The reign of XGBoost has, at long last, come to an end… And a new competitor enters the ring! Reminder to submit your methods to AutoML Decathlon for a chance to win $15,000 and eternal glory!
Tweet media one
@AutoMLDecathlon
AutoML Decathlon
2 years
NEW COMPETITOR ALERT!!! XGBoost has been dethroned and a *NEW* competitor is leading in the AutoML Decathlon competition at @NeurIPSConf 2022!!! Submit your method today for a chance to win the $15,000 top prize and stay tuned for more leaderboard updates!
Tweet media one
0
0
9
0
2
5
@nick11roberts
Nicholas Roberts
8 years
Somehow I ended up in the UCSD Gospel Choir and I'm performing tonight. Odd.
0
0
4
@nick11roberts
Nicholas Roberts
2 years
So excited to help participants get started on the AutoML Decathlon at the AutoML Fall School!!!
@LindauerMarius
Marius Lindauer
2 years
The #AutoML Fall school 2023 joins forces with the AutoML Decathlon team. This means, @atalwalkar , Samuel and Nick will give a hands-on introduction to the Decathlon setup at the fall school, and we will spend the hackathon on coming up with good submissions for the Decathlon.
Tweet media one
1
7
27
0
0
4
@nick11roberts
Nicholas Roberts
2 years
Stoked about this! Stay tuned for more deets…
@harit_v
Harit Vishwakarma
2 years
Super excited to share that our work with @nick11roberts and my advisor @fredsala , "Lifting Weak Supervision to Structured Prediction" has been accepted at #NeurIPS2022 . Preprint coming soon!
1
7
25
0
0
4
@nick11roberts
Nicholas Roberts
2 years
With only about a month left to submit, submit today for eternal glory etc. + $15,000! *Fun fact:* $15,000 can buy you the complete box set of all 6 seasons of Lost 53 times over! “Kate! We have to go back!” No Jack, you and Kate need to submit to the AutoML Decathlon!!!
@AutoMLDecathlon
AutoML Decathlon
2 years
LEADERBOARD UPDATE!!! Many new competitors have entered the ring, including the #Minions , who are now the third team to beat XGBoost! ***We are entering the final month of the AutoML Decathlon @NeurIPSConf 2022 competition, so submit your methods soon!!!***
Tweet media one
0
0
5
0
1
4
@nick11roberts
Nicholas Roberts
2 years
Relatedly:
Tweet media one
@AutoMLDecathlon
AutoML Decathlon
2 years
Think you can develop a machine learning method that beats XGBoost and a linear model on diverse tasks for $15,000? We think you can too. Right now is the perfect time to submit to the AutoML Decathlon 2022 competition at #NeurIPS2022 ! [1/n]
Tweet media one
1
11
17
0
0
4
@nick11roberts
Nicholas Roberts
2 years
I will get my #Prius #hitched as soon as I find someone foolish enough to do the job! After that, I will just tow a vintage Skeeter ice boat out to #LakeMonona a few weekends out of the year.
@nick11roberts
Nicholas Roberts
2 years
Should I get a trailer hitch installed on my Prius? I will abide by the results of this poll.
0
0
0
0
0
3
@nick11roberts
Nicholas Roberts
2 years
Very stoked about this work!!! Come chat with us on Wednesday! #NeurIPS2022 #NeurIPS22 #NeurIPS #lifting 💪
@harit_v
Harit Vishwakarma
2 years
Excited to present our work on 💪🏋️ Lifting Weak Supervision to Structured Prediction 💪🏋️ @NeurIPSConf this week! We’ll be in Hall J, #334 at 4pm on Wednesday– drop by and chat with us! #NeurIPS2022 [1/n]
Tweet media one
1
4
21
0
0
3
@nick11roberts
Nicholas Roberts
1 year
In this example, the grid can be arbitrarily large, but your pretrained classifier only needs to be trained on a constant number of classes (4) “but don’t predictions suffer if the space is large?” Yeah, yeah, we have learning theory results for the prediction error: [11/n]
Tweet media one
1
0
3
@nick11roberts
Nicholas Roberts
2 years
Finally, we found that zero-cost proxies performed inconsistently across diverse tasks – this corroborates prior findings by @crwhite_ml , @khodakmoments , @tu_renbo , @sytelus , @SebastienBubeck @debadeepta [12/n]
1
0
3
@nick11roberts
Nicholas Roberts
5 months
@MayeeChen Thank you @MayeeChen !!! 😁
0
0
2
@nick11roberts
Nicholas Roberts
2 years
The goal of NAS is to automate the design of neural networks for a given task, which saves human effort. This process typically involves the following three components: a search space, a search method, and a way to estimate performance. [2/n]
Tweet media one
1
0
3
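The three components above can be sketched end-to-end using the simplest possible search method, random search. Everything here — the search space, the scoring function, the names — is a made-up toy for illustration:

```python
import random

# Component 1: a search space of candidate architectures.
SEARCH_SPACE = {"op": ["conv", "fourier", "graph_conv"], "width": [32, 64, 128]}

def estimate_performance(arch):
    """Component 3: performance estimation. A stand-in for 'train
    briefly and measure validation accuracy' — here just a toy score."""
    return SEARCH_SPACE["op"].index(arch["op"]) + arch["width"] / 128

def random_search(n_trials, seed=0):
    """Component 2: the search method. Sample architectures at random
    and keep the best-scoring one seen so far."""
    rng = random.Random(seed)
    best, best_score = None, float("-inf")
    for _ in range(n_trials):
        arch = {k: rng.choice(v) for k, v in SEARCH_SPACE.items()}
        score = estimate_performance(arch)
        if score > best_score:
            best, best_score = arch, score
    return best

best = random_search(20)
```

Fancier NAS methods swap out the search method (evolution, gradients over a relaxed space) or cheapen the estimator, but the three-part decomposition stays the same.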
@nick11roberts
Nicholas Roberts
11 months
@em_dinan Messaged!
0
0
0
@nick11roberts
Nicholas Roberts
3 months
@MayeeChen @Changho_Shin_ “Wow sick hybrid!!! How did you manage to train that on a single GPU in your friend’s basement?”
0
0
3
@nick11roberts
Nicholas Roberts
1 year
This interface works by setting the weights of this Fréchet mean. There are a bunch of ways to set the weights for this interface, some covered by our prior work and [8/n]
@harit_v
Harit Vishwakarma
1 year
📢 A fun blog post 📢 with my advisor @fredsala Checkout our blog post on improving the reliability of LLMs via aggregation using super cool classical tools🔧🔨. BlogPost: Paper: Code:
Tweet media one
1
8
25
1
0
3
@nick11roberts
Nicholas Roberts
3 years
With this substitution, we slowly realized that our search space elegantly encodes many interesting neural operations. Most excitingly, graph convs and Fourier neural operators @ZongyiLiCaltech are XD-operations. [9/n]
Tweet media one
1
0
3
@nick11roberts
Nicholas Roberts
2 years
!!!
@AutoMLDecathlon
AutoML Decathlon
2 years
Leaderboard update!!! New competitors have entered the ring. Submit your method today to earn your spot on the leaderboard and for a chance to win $15,000!
Tweet media one
0
1
8
0
0
3
@nick11roberts
Nicholas Roberts
2 years
Also today is the last day to get early bird registration for the #AutoML Fall school and to get a *head start* on the @AutoMLDecathlon competition by participating in the Fall School hackathon!!!
@AutoMLDecathlon
AutoML Decathlon
2 years
We're very excited to be joining forces with the AutoML Fall School 2022! Register for the #AutoML Fall School today and get a head start on the AutoML Decathlon!!! Quite literally—*register today* because today is the last day for early bird registration for the Fall school!
0
0
11
0
1
3
@nick11roberts
Nicholas Roberts
11 years
@SwaggySpragg dat windows vista.
1
0
2
@nick11roberts
Nicholas Roberts
11 years
Chamomile/rooibos blend with pomegranate honey. Quite delicious. It ended up being an orangish color. http://t.co/wCLtYcSZyQ
Tweet media one
0
0
2
@nick11roberts
Nicholas Roberts
2 years
1
0
1
@nick11roberts
Nicholas Roberts
1 year
This allows the pretrained model to 🗺️ navigate🗺️ the metric space of labels just by using the softmax outputs E.g., if your metric space is a grid, but your classifier can only output probabilities for the classes in the corners, you can actually output any class! [10/n]
Tweet media one
1
0
2
@nick11roberts
Nicholas Roberts
11 years
Jasmine flower green tea with lemonade. Andnfnfiendkdkfnfikwndkskfnghtnaidnfueosnskaalqmsoqndjvjfkfmmmfndkndkdnffjmmmmmmd,dodmmmmmmmmmmmmmm!
1
0
2
@nick11roberts
Nicholas Roberts
2 years
Jointly led with @XintongLi0501 (who is actively applying to Ph.D. programs!), with help from @zihengh1 , @dyhadila , Spencer Schoenberg, Cheng-Yu Liu, Lauren Pick, Haotian Ma, and with guidance from @fredsala and @awsTO ! [12/n], n=12
0
0
2
@nick11roberts
Nicholas Roberts
2 years
There are several examples of AutoWS methods, including Snuba ( @paroma_varma ), Interactive Weak Supervision ( @BenBoecking , @willieneis ), and GOGGLES ( @nilakshdas ). The common thread between these methods? [5/n]
1
0
2
@nick11roberts
Nicholas Roberts
1 year
How the heck do we do this? In short,
- we deal with the size of the space by using the metric geometry of the labels, and
- we use the Fréchet mean as a plug-in to ⚙️interface⚙️ a ***pretrained model*** with the metric space [7/n]
Tweet media one
1
0
2
@nick11roberts
Nicholas Roberts
1 year
No! In reality, we basically want outputs resembling data structures. Examples:
- ASTs
- chains of thought
- folded proteins
Most ML builds up to these as best we can using our base primitives, but doesn’t use the native relationships between these objects. [3/n]
Tweet media one
1
0
2
@nick11roberts
Nicholas Roberts
3 years
Partner in crime @khodakmoments , joint work with @tri_dao , @liamcli , @hazyresearch , @atalwalkar [19/n]
1
0
2
@nick11roberts
Nicholas Roberts
1 year
We also have a bunch of other exciting theory results:
- characterizing what classes your pretrained model needs,
- how to optimally expand the set of classes you can predict,
- how to efficiently figure out what classes you can actually predict [12/n]
1
0
2
@nick11roberts
Nicholas Roberts
1 year
Experimentally, we show consistent lifts over just using the standard argmax prediction rule (which we actually generalize, see paper). We even show that in cases where you actually can predict everything (zero-shot CLIP), just using the metric geometry STILL helps! [13/n]
Tweet media one
1
0
2
@nick11roberts
Nicholas Roberts
2 years
Excited to chat about NAS-Bench-360 and the AutoML Decathlon competition today! More deets below #NeurIPS2022
@atalwalkar
Ameet Talwalkar
2 years
Can’t wait to attend @NeurIPSConf tomorrow, my first in-person conference in way too long! And excited to share this experience with several students / collaborators who are finally getting to present their work in person... 1/N
1
6
35
0
0
2
@nick11roberts
Nicholas Roberts
5 months
@AGIHouseSF @khoomeik Cool! Reminds me of Deeply Supervised Nets.
0
0
2
@nick11roberts
Nicholas Roberts
2 years
A promising answer is to use Automated Weak Supervision (AutoWS), which replaces label functions with weak learners obtained using a small amount of labeled data. Here’s that visualization of the AutoWS pipeline again: [4/n]
Tweet media one
1
0
2
@nick11roberts
Nicholas Roberts
2 years
Weak Supervision is a super powerful framework for constructing labeled datasets – instead of actual labels, it relies on having access to several “labeling functions” that are able to produce noisy guesses about the true label Y, given some X. [2/n]
Tweet media one
1
0
2
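The simplest label model for aggregating such noisy labeling-function outputs is a majority vote, sketched below. This is an illustrative toy, not the paper's actual label model; `-1` marks a labeling function that abstains:

```python
import numpy as np

def majority_vote(votes, abstain=-1):
    """Each row of `votes` holds one example's labeling-function
    outputs; take the most common non-abstaining vote as the
    estimated label (abstain if every function abstained)."""
    labels = []
    for row in votes:
        valid = row[row != abstain]
        if valid.size == 0:
            labels.append(abstain)
        else:
            vals, counts = np.unique(valid, return_counts=True)
            labels.append(int(vals[np.argmax(counts)]))
    return np.array(labels)

# Three examples, three labeling functions each.
votes = np.array([[1, 1, 0], [-1, 0, 0], [-1, -1, -1]])
majority_vote(votes)  # estimated labels: [1, 0, -1]
```

Real weak-supervision label models go further by estimating each labeling function's accuracy and correlations, but majority vote is the baseline they improve on.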
@nick11roberts
Nicholas Roberts
1 year
Also no! We show how you can *reuse a pretrained classifier* that was trained only on a subset of the space to predict anything you want in your complicated label space of data structures. [6/n]
Tweet media one
1
0
2
@nick11roberts
Nicholas Roberts
5 months
Found my coauthors’ twitter handles: @srguo24 @srinath_namburi @ChengyuD27
1
0
2
@nick11roberts
Nicholas Roberts
1 year
So yeah, it seems like training models on such a huge label space, if you try to do it without using base primitives, might be pretty much impossible… Wouldn’t you need training examples representing every possible AST? [5/n]
1
0
2
@nick11roberts
Nicholas Roberts
1 year
Without primitives, these output spaces are huge… Let’s consider the AST space — Cayley tells us that the size of the space is actually *worse* than exponential… It’s b^{b-2} in the # of vertices. Like, Zoinks Scoob… [4/n]
Tweet media one
1
0
2
@nick11roberts
Nicholas Roberts
5 months
@khoomeik Seems like there should be a way to generalize this to trade-off perf and how much you can parallelize it, with this being one extreme and backprop being the other extreme. Neat project!
0
0
2
@nick11roberts
Nicholas Roberts
1 year
Let’s start off by unboxing how we make predictions using machine learning… The majority of ML uses a really simple set of base primitives as labels — binary, multi-class, and regression labels. But is this actually what we want in practice? [2/n]
1
0
2
@nick11roberts
Nicholas Roberts
7 years
I just registered for the @EPFL_en 's Applied Machine Learning Days 2018 @appliedmldays . Come join Europe's most exciting AI event! #AMLD2018
0
1
1
@nick11roberts
Nicholas Roberts
3 years
@rajiinio 10000000000000% this
0
0
1
@nick11roberts
Nicholas Roberts
6 years
Look what we did!
0
0
2
@nick11roberts
Nicholas Roberts
1 year
But here, we want to somehow plug in a pretrained model to the Fréchet mean… A natural choice is to directly use the per-class probability estimates from the softmax as the weights! [9/n]
Tweet media one
1
0
2
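That plug-in can be sketched as a brute-force weighted Fréchet mean over a finite label space: pick the label minimizing the softmax-weighted sum of squared distances to the classes the classifier actually knows. A toy illustration of the idea, not the actual Loki code; all names here are made up:

```python
import numpy as np

def weighted_frechet_mean(candidates, anchors, weights, dist):
    """Return the candidate label minimizing the weighted sum of
    squared distances to the anchor classes, using the classifier's
    softmax probabilities as the weights."""
    costs = [sum(w * dist(y, a) ** 2 for w, a in zip(weights, anchors))
             for y in candidates]
    return candidates[int(np.argmin(costs))]

# Toy metric space: labels are integers on a line. The pretrained
# classifier only outputs probabilities for classes 0 and 10, yet we
# can predict any label in between.
candidates = list(range(11))
anchors = [0, 10]
softmax_probs = [0.3, 0.7]
dist = lambda a, b: abs(a - b)
weighted_frechet_mean(candidates, anchors, softmax_probs, dist)  # → 7
```

With weights (0.3, 0.7) the cost 0.3·y² + 0.7·(10−y)² is minimized at y = 7, a label the classifier itself could never have output directly.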
@nick11roberts
Nicholas Roberts
1 year
@BlancheMinerva @zacharylipton Can you elaborate on this? Typically when people say this, they really mean random search with weight sharing, which IS a (single-shot) NAS algorithm. On the other hand if you have unlimited time and compute, RS will outperform anything.
0
0
0
@nick11roberts
Nicholas Roberts
2 years
@ruthhook_ @typeofemale @goblinodds Overall, this is a side of science that I wasn’t aware of until exactly now. What
0
0
2
@nick11roberts
Nicholas Roberts
3 months
I’m also here all week so pls reach out if you want to chat, more generally 👨‍🌾 Stoked for the conference!
0
0
1
@nick11roberts
Nicholas Roberts
8 years
0
0
1
@nick11roberts
Nicholas Roberts
2 years
Prior work found that the same set of NAS operations were important across the vision tasks of NAS-Bench-201 – we found that this was not true for diverse tasks: [11/n]
Tweet media one
1
0
1
@nick11roberts
Nicholas Roberts
2 years
This pipeline works well for text data – it’s easy to write label functions for text. OTOH, it’s quite a bit harder to write these label functions for data with more complex features, such as images or the vast majority of other ML tasks. [3/n]
1
0
1
@nick11roberts
Nicholas Roberts
4 years
@AlexHermstad I’m not aware of any current CS curriculum doing this, no. However, my intro to programming class in high school started off by teaching people using a flowchart-based UI algorithm builder that compiled into code if you wanted it to.
0
0
1