Ali Sabet

@alisabets

813
Followers
2,012
Following
49
Media
2,646
Statuses

incoming pretrainer @ faang. prev: playground ai, cohere ai, vector institute. co-/creator: pgv2/2.5 | cohere command v 1 | BLoRA | urzas ai. @uwaterloo cs grad

Joined May 2020
Pinned Tweet
@alisabets
Ali Sabet
5 months
Releasing another one, you're welcome✌️.
@Suhail
Suhail
5 months
1/ We are releasing Playground v2.5, our latest foundation model to create images. We tested our model across 20K+ users in a rigorous benchmark that went beyond anything we've seen to date. This model is open weights. More information in the tweets below. 👇
75
155
1K
0
0
12
@alisabets
Ali Sabet
11 months
Introducing the BLoRA repo Hook several LoRAs into the same language model, and generate simultaneously in the same batch. Batch outputs can even be streamed.
14
69
505
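The idea in the BLoRA tweet above, several LoRAs hooked into one base model and applied within the same batch, can be illustrated with a toy sketch. This is not the BLoRA repo's code: the function names, the tiny matrices, and the per-row Python loop are all assumptions made for illustration, and a practical version would run as batched tensor operations on a GPU. Each row of the batch goes through the shared base weight and then adds the low-rank update `B @ (A @ x)` of the adapter assigned to that row.

```python
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply matrix m (rows x cols) by vector v (cols)."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def batched_lora_forward(
    base: Matrix,
    adapters: Dict[str, Tuple[Matrix, Matrix]],
    batch: List[Vector],
    adapter_ids: List[str],
) -> List[Vector]:
    """Apply one shared base weight plus a per-row LoRA update.

    adapters maps a (hypothetical) adapter name to (A, B), where
    A has shape r x d_in and B has shape d_out x r.
    """
    outputs = []
    for x, name in zip(batch, adapter_ids):
        y = matvec(base, x)              # shared base projection
        a, b = adapters[name]            # adapter chosen for this row
        delta = matvec(b, matvec(a, x))  # low-rank update B(Ax)
        outputs.append([yi + di for yi, di in zip(y, delta)])
    return outputs


# Toy example: identity base weight, two rank-1 adapters.
base = [[1.0, 0.0], [0.0, 1.0]]
adapters = {
    "double_first": ([[1.0, 0.0]], [[1.0], [0.0]]),  # adds x[0] to output[0]
    "none": ([[0.0, 0.0]], [[0.0], [0.0]]),          # zero update
}
batch = [[1.0, 2.0], [1.0, 2.0]]
outputs = batched_lora_forward(base, adapters, batch, ["double_first", "none"])
print(outputs)  # [[2.0, 2.0], [1.0, 2.0]]
```

The same input yields different outputs in one pass because only the small adapter matrices differ per row; the base weight is shared.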
@alisabets
Ali Sabet
6 months
@dieworkwear um you’re mixing up tim with tom.
3
0
429
@alisabets
Ali Sabet
8 months
@JSTOR Aaron Swartz says hello
0
2
301
@alisabets
Ali Sabet
7 months
1
0
202
@alisabets
Ali Sabet
4 months
@Rainmaker1973 laurel or yanny?
9
5
147
@alisabets
Ali Sabet
5 months
@JohnnySobczak arrival and dune 1 were the peak imo.
4
0
129
@alisabets
Ali Sabet
2 years
@dalmaer @Carnage4Life Just flooding AI with ex-crypto scammers.
1
0
77
@alisabets
Ali Sabet
7 months
@typedfemale if they CUDA they woulda
1
0
80
@alisabets
Ali Sabet
2 years
@_madFrog @culturaltutor I think you missed the point. The post was intended to lay out how stories evolve across cultures and time, and where Shakespeare drew his influences from. I found it very instructive.
2
1
72
@alisabets
Ali Sabet
2 years
@WriteArthur A bit overdramatic. If you include gender/race/etc in the text prompt it'll generate the image you want. In production, users would modify prompt or finetune to get expected behaviour. Should look up alignment research to learn more.
1
1
70
@alisabets
Ali Sabet
4 months
@FedeItaliano76 doubles as a spice harvester.
0
0
58
@alisabets
Ali Sabet
5 months
@minchoi most of the artifacts are forgivable imo, cinema and animation cut corners too usually. aesthetically pleasing enough for me to ignore minor mistakes.
5
0
49
@alisabets
Ali Sabet
11 months
Great explanation from my friend @yacineMTB
@yacineMTB
kache
11 months
Because the trainable parameters for low rank layer adapters are so small, you can hold them all simultaneously in memory. Meaning, you can have the same bag of beans, and change its behavior by swapping LoRA. Huggingface's PEFT allows swapping adapters over their API
2
4
88
1
2
45
@alisabets
Ali Sabet
6 months
@SteveStuWill Proxy for wealth.
4
1
45
@alisabets
Ali Sabet
5 months
@GillVerd wloo interns are the secret life force of the valley
1
1
42
@alisabets
Ali Sabet
4 months
@aitaikimochi they're murderberries now.
0
1
42
@alisabets
Ali Sabet
2 years
@blennon_ Not true, ie open-source stable diffusion quickly surpassed dall-e on performance and marketshare. ChatGPT is a great product out-of-the-box, but customization, control, margins, etc will always incentivize devs to churn to open-source.
3
2
39
@alisabets
Ali Sabet
5 months
@MegaBasedChad upgraded from Leggo to LFG
0
2
40
@alisabets
Ali Sabet
5 months
@spikedoanz yggdrasil diffusion.
2
2
39
@alisabets
Ali Sabet
2 years
@johnjnay @TheEconomist @stateofaireport OpenAI's is the most impactful. The rest are just paper mills
7
0
33
@alisabets
Ali Sabet
8 months
Payoff of a lot of hard work with the team @playground_ai 🚀. We're also open-sourcing the weights on huggingface 🤗! More improvements coming soon✌️.
@Suhail
Suhail
8 months
1/ Introducing Playground v2: A new commercially open model from our team that we trained from scratch. Most notably our model was preferred 2.5x more than the current leading open model (SDXL). We're excited to give back to the community as we're just getting started.
81
213
1K
3
7
33
@alisabets
Ali Sabet
3 months
@vikhyatk cool experiment! though the intuition doesn't necessarily extrapolate for higher frequency functions. ie for sin(500x), gelu is generally more stable and converges faster when scaling depth/width.
3
1
34
@alisabets
Ali Sabet
5 months
@Genki_JPN i played that for a recital once.
4
0
30
@alisabets
Ali Sabet
7 months
@AravSrinivas they need to nail basic search first
0
0
26
@alisabets
Ali Sabet
2 years
@tsrandall I've been doing this with davinci for 6 months. Didn't tell anyone so I could enjoy the productivity diff while it lasted.
2
2
26
@alisabets
Ali Sabet
2 years
Launched a magic card generator app with friends at Cohere AI. Shared to twitter, hackernews, and reddit and over 4 days, the results are🧵:
3
3
23
@alisabets
Ali Sabet
10 months
@yacineMTB is he drinking soy the other four days
1
0
22
@alisabets
Ali Sabet
6 months
@_tim_brooks @prafdhar @billpeeb @OpenAI Hmmm, looks like it was trained in a rendered environment from a game engine. Have complete frame control and can easily auto-caption or use other control inputs as conditioning.
1
0
16
@alisabets
Ali Sabet
3 months
@andrew_n_carr the hand is a 3d asset, and you trained a temporal text2pose model to puppeteer? seems very tractable since there’s lots of animation data available to be annotated with a pose estimator. surprised by how smooth the motion is though.
1
0
16
@alisabets
Ali Sabet
2 years
@nathanbenaich ChatGPT is simply GPT3.5 tuned with a conversational interface. OA users can finetune their own version that behaves similarly. What OA discovered is that UI+AI gets massive traction, like Copilot has.
2
0
18
@alisabets
Ali Sabet
7 months
@samfbiddle Umm what happened to “safety”
1
1
15
@alisabets
Ali Sabet
3 months
@martin_casado
martin_casado
3 months
Oh good god. DRM for models. It's like every bad, fucked up idea from the late 90's and early 2000's is being dredged up again for the AI safety wars.
25
42
234
0
2
15
@alisabets
Ali Sabet
7 months
@abacaj in diffusion models i find overfitting to the point of reconstructing the training set gives better fidelity and aesthetics
1
0
15
@alisabets
Ali Sabet
6 months
@TheMarieOakes Megan Botox.
1
0
14
@alisabets
Ali Sabet
6 months
@M_Vigik @dieworkwear i don’t accept.
1
0
14
@alisabets
Ali Sabet
5 months
@Scoopsahoy232 this better be sarcasm.
0
0
12
@alisabets
Ali Sabet
2 years
@Suhail Don't make the mistake of building from scratch. Start from open-source then iterate once you see what usage actually looks like, ie your users may want to generate marketing related assets, so you'll want to curate more of that kind of training data.
1
0
10
@alisabets
Ali Sabet
2 years
@EMostaque Short-sighted artist backlash to generative AI has echoes of Napster days. Image generators are free distribution for artists, but in the form of derivative work instead of direct copies as with peer2peer. Artists capitalizing on this will dominate the future art landscape.
4
2
11
@alisabets
Ali Sabet
6 months
@yacineMTB make software that’s fun to watch 👌
1
0
10
@alisabets
Ali Sabet
4 months
@LinusEkenstam i don’t find this very compelling. i think the ad market will be over-saturated with cheap ai generated content, which’ll drive consumers even more to human ads.
1
0
10
@alisabets
Ali Sabet
7 months
@weirddalle did taylor swift also surrender unconditionally to the us
1
0
11
@alisabets
Ali Sabet
10 months
@jon_victor_ it's called research
0
0
11
@alisabets
Ali Sabet
7 months
This is essentially LLM apis right now
@doctorow
Cory Doctorow NONCONSENSUAL BLUE TICK
7 months
First of all, there's the ways trusted computing is *designed* to hurt you. The most reliable path to enshittification is a computer that runs programs you can't alter, and rats you out to third parties if you run counter-programs that disenshittify the service you're using. 24/
1
11
147
1
3
10
@alisabets
Ali Sabet
10 months
@rasbt Attention is a type of convolutional operator and author wanted to port the parallelism of CNNs to sequential modelling.
1
0
10
@alisabets
Ali Sabet
6 months
@sevensixfive also doubles as a spice harvester.
0
0
10
@alisabets
Ali Sabet
5 months
@JezCorden is this the same person that did the moogle/chocobo redesigns in ff7 rebirth
0
0
9
@alisabets
Ali Sabet
2 years
@sterlingcrispin Saying that it's 3x more powerful is pure speculation as it's never been publicly released for testing. Benchmarks and cherry-picked results can be misleading.
1
0
9
@alisabets
Ali Sabet
5 months
Gotta pat myself on the back for this one. Not even our best model.
@ClementDelangue
clem 🤗
5 months
Playground v2.5 – 1024px Aesthetic Model is now #4 trending on . Great job @Suhail @playground_ai team!
1
11
78
1
0
9
@alisabets
Ali Sabet
2 years
@EMostaque @jackclarkSF @amasad @carperai Bloom pretraining data was poorly curated and filtered, especially for multi-lingual data. Surprised Bloom creators didn't follow Pile's methodology, which was more successful. Definitely a failure-case for open source collaboration.
2
0
9
@alisabets
Ali Sabet
9 months
@chamath This guy synthesizes
1
0
4
@alisabets
Ali Sabet
5 months
@jeremyphoward brute-force captioning all training images with a 17B vlm is all you need. though strong text synthesis causes regressions, ie text leaking into images unintentionally.
0
0
7
@alisabets
Ali Sabet
2 years
@marktenenholtz thank god crypto's dying.
1
0
9
@alisabets
Ali Sabet
5 months
@DiscussingFilm ryan gosling’s hair is having a renaissance right now.
1
0
9
@alisabets
Ali Sabet
4 months
@LandsknechtPike mercenary industrial complex.
0
0
6
@alisabets
Ali Sabet
3 months
watch my stanford phd friend's intro lecture on transformers, from pre-history to sota.
@stevenyfeng
Steven Feng
3 months
The first lecture of our @Stanford CS25 V4 Transformers course is now released! Check it out here: . We (the instructors) gave a brief intro and overview of the history of NLP, Transformers and how they work, and their impact. We
4
119
490
0
1
9
@alisabets
Ali Sabet
7 months
@growing_daniel it’s not Confusing, it’s Confucius
0
0
8
@alisabets
Ali Sabet
3 months
@Ethan_smith_20 literal tail lol
0
0
7
@alisabets
Ali Sabet
4 months
@nabeelqu competitive autism.
0
0
7
@alisabets
Ali Sabet
2 years
@krishnanrohit If it fits within gpt3 api's 4K context window then a simple prompt is sufficient. O/w chunk the page and summarize recursively. For q/a you can do chunk+retrieve+generate answer.
1
1
6
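The chunk-and-summarize-recursively approach from the tweet above can be sketched as follows. This is a minimal illustration, not a production recipe: `summarize` is a hypothetical callable standing in for whatever model call is used, and the limit is measured in characters as a rough proxy for a token-based context window, which is an assumption made here to keep the example dependency-free.

```python
from typing import Callable, List


def chunk_text(text: str, max_chars: int) -> List[str]:
    """Greedily pack whole words into chunks of at most max_chars
    (a single word longer than max_chars becomes its own chunk)."""
    chunks: List[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if len(candidate) > max_chars and current:
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def summarize_recursively(
    text: str, summarize: Callable[[str], str], max_chars: int
) -> str:
    """Summarize directly if the text fits, otherwise summarize each
    chunk and recurse on the concatenated partial summaries."""
    if len(text) <= max_chars:
        return summarize(text)
    partial = [summarize(chunk) for chunk in chunk_text(text, max_chars)]
    combined = " ".join(partial)
    if len(combined) >= len(text):
        # Guard: the summaries did not shrink the text, so stop recursing.
        return summarize(combined)
    return summarize_recursively(combined, summarize, max_chars)


# Stand-in "model" for demonstration only: keep the first word.
def first_word(text: str) -> str:
    return text.split()[0]


result = summarize_recursively("a b c d e f", first_word, max_chars=3)
print(result)  # a
```

For question answering, the tweet's chunk+retrieve+generate variant would reuse `chunk_text` but pick the most relevant chunks instead of summarizing all of them.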
@alisabets
Ali Sabet
4 months
@xuanalogue mfw yarin gal already said this in 2015
2
1
7
@alisabets
Ali Sabet
2 years
@russelljkaplan I use GPT3 to generate synthetic data for most of my projects these days. Synthetic + human edits is an even more powerful combination.
2
0
7
@alisabets
Ali Sabet
7 months
@weirddalle shrinkflation at work
0
0
6
@alisabets
Ali Sabet
11 months
@NPCollapse @jackclarkSF Mind keeping your attention-seeking crackpot doomerism to yourself for a change?
0
0
6
@alisabets
Ali Sabet
4 months
@danielhanchen @UnslothAI paged adam is genius, natural extension to paged attention
0
0
7
@alisabets
Ali Sabet
1 year
@UltraTerm @OpenAI By "emerging threats" they mean competition to OpenAI
1
0
7
@alisabets
Ali Sabet
2 years
@tomgoldsteincs Ineffective. I can bypass this by finetuning on samples of my own writing.
1
0
7
@alisabets
Ali Sabet
6 months
@martinmbauer set t=e^x, so = sum(t^n/n!) = e^t = e^e^x ???
0
0
7
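The substitution in the tweet above can be written out step by step. The original series is not shown in this feed, so the summand $e^{nx}/n!$ and the starting index $n = 0$ are assumptions inferred from the substitution $t = e^x$; the final step is the standard Taylor series of the exponential.

```latex
\[
\sum_{n=0}^{\infty} \frac{e^{nx}}{n!}
  = \sum_{n=0}^{\infty} \frac{\left(e^{x}\right)^{n}}{n!}
  = \sum_{n=0}^{\infty} \frac{t^{n}}{n!} \quad (t = e^{x})
  = e^{t}
  = e^{e^{x}}
\]
```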
@alisabets
Ali Sabet
7 months
@ShengwuLi More like the Epstein lab
0
0
6
@alisabets
Ali Sabet
2 years
@DrJimFan I'm doubtful about LaMDA and Sparrow. Google's subpar when it comes to dataset curation. Also, market feedback is essential for building useful AI
0
0
6
@alisabets
Ali Sabet
2 years
@alexandr_wang I think the contractors were for SFT, not RLHF. Not sure you'd need 1K labellers for ranking.
0
0
6
@alisabets
Ali Sabet
5 months
@haydenfield Sounds like a whistleblower reward grift.
0
0
5
@alisabets
Ali Sabet
7 months
@var_epsilon there needs to be a training data service to cold-start these new libraries for copilot
0
0
6
@alisabets
Ali Sabet
5 months
@wrathofgnon this is the premise for death stranding.
0
0
6
@alisabets
Ali Sabet
2 years
@typedfemale I stick to triton.
1
0
6
@alisabets
Ali Sabet
1 year
@gfodor @ESYudkowsky @AndrewYNg we need more AI to prevent Eliezer from shitposting
0
0
6
@alisabets
Ali Sabet
4 months
@ClementDelangue make an offer they can't refuse.
0
1
6
@alisabets
Ali Sabet
2 years
@Suhail The pretrain dev cycle is long and capital intensive. It'll also be difficult for you to set up proper evals without knowing where performance should be prioritized.
1
0
4
@alisabets
Ali Sabet
6 months
Our first repo citation in the wild 🤙. New model(s) coming soon 🚀⏰
0
0
5
@alisabets
Ali Sabet
1 year
@GaryMarcus Graduating from goal-post moving to road-blocking it seems. Trolling for relevance is a grind.
0
0
5
@alisabets
Ali Sabet
2 years
@abacusai Fails with finetuned models.
3
0
5
@alisabets
Ali Sabet
6 months
@creatine_cycle every time i scp from my macbook
0
0
3
@alisabets
Ali Sabet
2 years
@EMostaque shit's about to get stable 🤙
0
0
4
@alisabets
Ali Sabet
4 months
@vikhyatk the way the gpt3 paper implemented it was to train a classifier on curated text as the positive class, and common-crawl as the negative class. still a strong baseline imo.
1
0
5
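The quality-filter baseline described in the tweet above, a classifier with curated text as the positive class and raw crawl as the negative class, can be sketched on toy data. Everything here is an illustrative assumption: the four example sentences, the whitespace bag-of-words features, and the hand-rolled logistic regression trained by plain SGD are stand-ins, not the pipeline from the paper.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def features(text: str) -> Counter:
    """Bag-of-words counts over lowercase whitespace tokens."""
    return Counter(text.lower().split())


def score(weights: Dict[str, float], bias: float, text: str) -> float:
    """Probability that the text looks like the curated (positive) class."""
    z = bias + sum(weights.get(tok, 0.0) * n for tok, n in features(text).items())
    return 1.0 / (1.0 + math.exp(-z))


def train(
    positives: List[str], negatives: List[str], epochs: int = 200, lr: float = 0.5
) -> Tuple[Dict[str, float], float]:
    """Fit logistic regression with per-example gradient steps."""
    weights: Dict[str, float] = {}
    bias = 0.0
    data = [(t, 1.0) for t in positives] + [(t, 0.0) for t in negatives]
    for _ in range(epochs):
        for text, label in data:
            error = score(weights, bias, text) - label
            bias -= lr * error
            for tok, n in features(text).items():
                weights[tok] = weights.get(tok, 0.0) - lr * error * n
    return weights, bias


# Toy stand-ins for "curated text" and "common crawl".
curated = ["the study reports a clear result",
           "the authors describe the method in detail"]
crawl = ["click here buy now cheap", "free free free click now"]

weights, bias = train(curated, crawl)
good = score(weights, bias, "the method is clear")
bad = score(weights, bias, "buy cheap now")
print(good > bad)  # True
```

A filter built on this would keep documents whose score clears some threshold; how that threshold is chosen is left open here.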
@alisabets
Ali Sabet
6 months
@GaryMarcus Is gary marcus real
0
0
4
@alisabets
Ali Sabet
3 months
@andrew_n_carr i’m guessing there’s a lot of jitter from pose estimator error, which has to be smoothed somehow. probs need a custom model to label all those degrees of freedom. generating descriptive captions for the trajectories adds extra complexity. definitely sounds hard to get right.
1
0
5
@alisabets
Ali Sabet
4 months
@yacineMTB oo threshold on depthmap then segment?
1
0
5
@alisabets
Ali Sabet
2 years
@soumithchintala Most annotators would be for SFT, not RLHF. RLHF scales with fewer labellers, since it's just preference ranking.
1
0
5
@alisabets
Ali Sabet
1 year
@pcastr @Soccermatics One is a panel of scientists and the other is activists.
3
0
5
@alisabets
Ali Sabet
4 months
@tszzl “il y a” literally means there in french.
0
0
5
@alisabets
Ali Sabet
5 months
@OrdinaryGamers looks like my wallet’s bout to ghost.
0
0
5
@alisabets
Ali Sabet
2 years
@Suhail On the modelling side, Stability's latent diffusion models are state-of-the-art wrt performance and scalability, good stack to build from. Avoid PhDs who've only worked in research w/o touching product, they'll kill you with wasteful R&D and tech debt.
1
0
5
@alisabets
Ali Sabet
7 months
@goodside sleeper agent 🫵
0
0
4
@alisabets
Ali Sabet
1 year
@arankomatsuzaki @GoogleAI Never heard of XNOR-Nets?
0
0
4
@alisabets
Ali Sabet
5 months
@erikdunteman @morgallant lol remember morgan dropping by the cohere office all the time while he was building.
0
0
2
@alisabets
Ali Sabet
2 years
@tszzl @alexandr_wang Distillation invalidates pretraining moats as well.
1
0
4
@alisabets
Ali Sabet
8 months
We’re #2 on Huggingface Models now
0
0
4
@alisabets
Ali Sabet
2 years
@tunguz what happened to XGBoost
1
0
4