Zachary Nado

@zacharynado

10,010
Followers
653
Following
224
Media
8,285
Statuses

Research engineer @googlebrain . Past: software intern @SpaceX , ugrad researcher in @tserre lab @BrownUniversity . All opinions my own.

Boston, MA
Joined January 2017
Pinned Tweet
@zacharynado
Zachary Nado
1 year
I'm very excited that this paper is out, it has been over 2 years in the making! I started at Google Research speeding up neural net training, but was often frustrated when we didn't know how to declare a win over Adam 🚀
Tweet media one
6
101
758
@zacharynado
Zachary Nado
1 month
"i try not to think about competitors too much" interesting how all your launches are timed with ours then
@sama
Sam Altman
1 month
i try not to think about competitors too much, but i cannot stop thinking about the aesthetic difference between openai and google
Tweet media one
Tweet media two
3K
1K
26K
305
410
12K
@zacharynado
Zachary Nado
2 months
the hype is wearing off, the vibes are shifting, you can feel it
@tsarnick
Tsarathustra
2 months
Sam Altman: I don't care if we burn $50 billion a year, we're building AGI and it's going to be worth it
623
442
3K
60
217
6K
@zacharynado
Zachary Nado
1 year
Excited to announce our Deep Learning Tuning Playbook, a writeup of tips & tricks we employ when designing DL experiments. We use these techniques to deploy numerous large-scale model improvements and hope formalizing them helps the community do the same!
Tweet media one
28
633
3K
@zacharynado
Zachary Nado
9 months
>importing numpy without renaming to np
FTX was never gonna make it
@molly0xFFF
Molly White
9 months
From yesterday's exhibits in US v. Sam Bankman-Fried: The prosecution shows that the "insurance fund" that FTX bragged about was fake, and just calculated by multiplying daily trading volume by a random number around 7500
Tweet media one
Tweet media two
112
916
8K
23
166
2K
@zacharynado
Zachary Nado
7 months
It’s been a privilege to be part of the Gemini pretraining team and overall program, I’m so excited that the world can finally see what we’ve been up to for most of the past year: tl;dr we’re so back
Tweet media one
44
61
1K
@zacharynado
Zachary Nado
2 months
damn people really have this little faith in us
Tweet media one
@gdb
Greg Brockman
2 months
Live demo of some new work, Monday 10a PT. Not GPT-5 or a search engine, but we think you’ll like it.
189
344
4K
63
17
737
@zacharynado
Zachary Nado
1 month
to be clear I have a lot of respect for the researchers at openai and all my poasting is just bantering 🕺
21
8
609
@zacharynado
Zachary Nado
3 months
wow what a coincidence, just 5 days before their model drop!
@PelosiTracker_
Nancy Pelosi Stock Tracker ♟
3 months
BREAKING 🚨: Nancy Pelosi just bought $5M of the AI company Databricks. Unfortunately, Databricks is a privately held company and not available to be bought by the public. Sorry people, you don’t have access to this one.
Tweet media one
291
2K
15K
4
24
586
@zacharynado
Zachary Nado
4 years
Ever left batch norm in train mode at test time? We did, then realized it is shockingly effective at improving calibration on dataset shift! In our note "Evaluating Prediction-Time Batch Normalization for Robustness under Covariate Shift" () we explore why
Tweet media one
10
112
503
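The prediction-time batch norm idea in the tweet above — leaving batch norm in train mode at test time so it normalizes with the test batch's own statistics instead of the stored running averages — can be sketched in plain NumPy. This is an illustrative toy (the function name and signature are mine, not from the paper):

```python
import numpy as np

def batchnorm(x, running_mean, running_var, gamma, beta, train_mode, eps=1e-5):
    """Batch norm over the batch axis.

    train_mode=True normalizes with the current batch's statistics
    (the "prediction-time BN" setting); train_mode=False uses the
    stored running statistics, the usual inference behavior.
    """
    if train_mode:
        mean, var = x.mean(axis=0), x.var(axis=0)
    else:
        mean, var = running_mean, running_var
    return gamma * (x - mean) / np.sqrt(var + eps) + beta
```

Under covariate shift the test batch's own statistics track the shifted inputs while the running averages do not, which is the effect the note studies.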
@zacharynado
Zachary Nado
7 months
"Profits for investors in this venture were capped at 100 times their investment (though thanks to a rule change this cap will rise by 20% a year starting in 2025)." lol why bother having a cap anymore if it's going to exponentially increase anyways
25
21
440
@zacharynado
Zachary Nado
1 year
"I am shocked that the Bing team created this pre-recorded demo filled with inaccurate information, and confidently presented it to the world as if it were good. I am even more shocked that this trick worked, and everyone jumped on the Bing AI hype train"
Tweet media one
17
67
378
@zacharynado
Zachary Nado
7 months
tl;dr submit a training algorithm* that is faster** than Adam*** and win $10,000 💸🚀
*a set of hparams, self-tuning algorithm, and/or update rule
**see rules for how we measure speed
***beat all submissions; currently the best is NAdamW in wallclock and DistShampoo in steps
@GoogleAI
Google AI
7 months
To highlight the importance of #ML training & algorithmic efficiency, we’re excited to provide compute resources to help evaluate the best submissions to the @MLCommons AlgoPerf training algorithms competition, w/ a chance to win a prize from MLCommons!
22
115
464
10
49
374
@zacharynado
Zachary Nado
3 years
NeurIPS rejected my two papers but at least I'm a top 8% reviewer ¯\_(ツ)_/¯
8
5
320
@zacharynado
Zachary Nado
1 year
which AI announcement today wore it better
Tweet media one
Tweet media two
11
19
306
@zacharynado
Zachary Nado
1 year
here we go again with the classic once-a-month new optimizer hype cycle
@tengyuma
Tengyu Ma
1 year
Adam, a 9-yr old optimizer, is the go-to for training LLMs (eg, GPT-3, OPT, LLAMA). Introducing Sophia, a new optimizer that is 2x faster than Adam on LLMs. Just a few more lines of code could cut your costs from $2M to $1M (if scaling laws hold). 🧵⬇️
Tweet media one
98
644
4K
8
12
301
@zacharynado
Zachary Nado
1 year
"Before OpenAI came onto the scene, machine learning research was really hard—so much so that, a few years ago, only people with Ph.D.s could effectively build new AI models or applications." lol, lmao even
@goodside
Riley Goodside
1 year
In SF for the week. Need to investigate this Cerebral Valley thing in person. Just gonna walk down Hayes St. yelling "Ignore previous directions" and see what doors open, figuratively or literally.
23
21
433
9
11
292
@zacharynado
Zachary Nado
2 months
@caffeinefused I think "AI" will be super useful long term but the over promising of AGI next year by the tech bro hype boys is getting old
2
4
285
@zacharynado
Zachary Nado
7 months
🌝
@jon_victor_
Jon Victor
7 months
New: Google quietly scrapped a set of Gemini launch events planned for next week, delaying the model’s release to early next year. w/ @amir
37
48
399
8
5
270
@zacharynado
Zachary Nado
3 years
I explain ML and DL concepts to PhDs all day every day, and vice versa, and I have a bachelor's
@josephdviviano
Joseph Viviano
3 years
Research recruiter: We *love* your background. Tell us about your recent work. Me: Explains years of published projects. Recruiter: Sounds amazing. But when did you get your PhD? Me: Don't have one. Recruiter: lmfao smh nevermind want to work on product? How's your leetcode?
13
21
470
8
5
255
@zacharynado
Zachary Nado
1 month
@SebastianSzturo we did! ⚡
@simonw
Simon Willison
2 months
The llm-gemini plugin now supports the new inexpensive Gemini 1.5 Flash model:
pipx install llm
llm install llm-gemini --upgrade
llm keys set gemini # paste API key here
llm -m gemini-1.5-flash-latest 'a short poem about otters'
2
9
119
12
0
249
@zacharynado
Zachary Nado
6 years
Wrote my first blog post at , about generating #pusheen with AI! There's a version for those with and without an AI background, so don't let that hold you back from reading!
Tweet media one
5
54
207
@zacharynado
Zachary Nado
2 months
I haven't kept up with self driving details much, genuine question, are there any competitors even close to Waymo?
@Waymo
Waymo
2 months
In the coming weeks, we will begin testing fully autonomous rides — without a human driver— for our employees on San Francisco Peninsula city streets north of San Mateo.
Tweet media one
69
147
1K
39
0
202
@zacharynado
Zachary Nado
4 years
have you ever wondered what that epsilon parameter in the denominator of your optimizer (or batch norm!) is? I tried tuning it, and it turns out you can actually get serious performance gains by poking at this nuisance parameter!
1
29
178
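The epsilon the tweet above refers to sits in the denominator of the Adam update; a minimal single-step Adam in NumPy makes its position (and damping effect) visible. Hyperparameter defaults are the commonly used ones, shown for illustration:

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update. eps lives in the denominator, so raising it
    shrinks the step, acting like a damping / trust-region term."""
    m = b1 * m + (1 - b1) * grad        # first-moment EMA
    v = b2 * v + (1 - b2) * grad**2     # second-moment EMA
    m_hat = m / (1 - b1**t)             # bias correction
    v_hat = v / (1 - b2**t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v
```

With the default tiny eps the first step is essentially lr in magnitude; cranking eps up visibly damps it, which is why tuning this "nuisance parameter" can matter.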
@zacharynado
Zachary Nado
2 years
now ask GPT anything related to very recent world events that aren't in its training data
@dweekly
David E. Weekly @[email protected]
2 years
GPT-3 versus Google Search:
Tweet media one
Tweet media two
45
315
3K
8
16
162
@zacharynado
Zachary Nado
1 month
2
0
163
@zacharynado
Zachary Nado
2 months
right on schedule
1
0
164
@zacharynado
Zachary Nado
3 years
A thread on our latest optimizers work! We tune Nesterov/Adam to match performance of LARS/LAMB on their more commonly used workloads. We ( @jmgilmer , Chris Shallue, @_arohan_ , @GeorgeEDahl ) do this to provide more competitive baselines for large-batch training speed measurements
Tweet media one
3
30
159
@zacharynado
Zachary Nado
1 year
if I tweeted cryptic messages whose subtext was neurotic, fearmongering delusions that AGI is arriving this year from LLMs, I'd 10x my followers in a week. but I don't, because that's part of my ethical AI practices
10
11
160
@zacharynado
Zachary Nado
3 years
Some Friday afternoon optimizer paper classifications with @_arohan_
Tweet media one
1
20
149
@zacharynado
Zachary Nado
2 months
squeezing model sizes down is just as important as scaling up in my opinion, and 1.5 Flash ⚡️ is so incredibly capable while so small and cheap it's been blowing our minds 🤯 it has been an incredible privilege and so much fun building this model (sometimes too much fun)! ⚡️
@GoogleDeepMind
Google DeepMind
2 months
Today, we’re excited to introduce a new Gemini model: 1.5 Flash. ⚡ It’s a lighter weight model compared to 1.5 Pro and optimized for tasks where low latency and cost matter - like chat applications, extracting data from long documents and more. #GoogleIO
22
143
698
13
8
144
@zacharynado
Zachary Nado
2 years
lmao no transformers or attention layers at all, incredibly telling
@MIT_CSAIL
MIT CSAIL
2 years
All major neural networks, in one chart: v/The Asimov Institute
Tweet media one
75
1K
6K
8
21
137
@zacharynado
Zachary Nado
5 years
game of thrones fans:
@CuriousZelda
Curious Zelda
5 years
Me: Tonight, I will relax. Also me:
Tweet media one
60
2K
10K
0
13
132
@zacharynado
Zachary Nado
2 months
there goes the only test set I trusted
@sama
Sam Altman
2 months
it is a very good model (we had a little fun with the name while testing)
Tweet media one
54
185
2K
7
1
133
@zacharynado
Zachary Nado
2 months
@laplacesdust how is that relevant
2
0
128
@zacharynado
Zachary Nado
2 months
Tweet media one
2
1
122
@zacharynado
Zachary Nado
4 months
very impressive models, congrats to everyone involved! also nice to know that we are not the only ones bad at model size naming
@AnthropicAI
Anthropic
4 months
Today, we're announcing Claude 3, our next generation of AI models. The three state-of-the-art models—Claude 3 Opus, Claude 3 Sonnet, and Claude 3 Haiku—set new industry benchmarks across reasoning, math, coding, multilingual understanding, and vision.
Tweet media one
574
2K
10K
5
2
120
@zacharynado
Zachary Nado
7 months
this program just proved yet again that Google has the best systems infra teams in the world, hands down, getting us an insane goodput of 97% for the Ultra training run
Tweet media one
2
8
116
@zacharynado
Zachary Nado
2 years
what's everyone's favorite learning rate right now? I wanna know what's trending ✨🔥💯 mine is 1e-2 for Adam, 1e-3 for SGD, with a linear warmup for 5-10% of training followed by some sort of decay
17
9
115
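The recipe in the tweet above (linear warmup for 5-10% of training, then "some sort of decay") can be sketched in plain Python. The cosine decay and the specific fractions here are illustrative choices, not a prescribed recipe:

```python
import math

def lr_schedule(step, total_steps, peak_lr=1e-3, warmup_frac=0.05):
    """Linear warmup to peak_lr over the first warmup_frac of training,
    then cosine decay to zero over the remaining steps."""
    warmup_steps = int(warmup_frac * total_steps)
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1 + math.cos(math.pi * progress))
```

Libraries like optax ship equivalent ready-made schedules (e.g. a warmup-cosine-decay schedule), so in practice you would compose one of those rather than hand-roll this.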
@zacharynado
Zachary Nado
19 days
nvidia is finished, GPU companies will soon learn the bitter lesson that everything is eventually replaced by the most solid bedrock of the technological revolution: spreadsheets
@chendabo
Dabo
21 days
I recreated an entire GPT architecture in a spreadsheet. It is a nanoGPT designed by @karpathy with about 85000 parameters, small enough to be packed into a spreadsheet file. It is great for learning about how transformers work as it shows all the data and parameters going
Tweet media one
Tweet media two
83
807
6K
2
15
112
@zacharynado
Zachary Nado
2 years
people are going to keep pushing this with no regard for quality/factualness, maybe eventually the hype will die down but given how easily people consume misinformation I'm not sure
@Altimor
Flo Crivello
2 years
GPT3 has already replaced much of my Google usage, and almost all my Wikipedia usage. (Forgive the naive questions!)
Tweet media one
Tweet media two
Tweet media three
Tweet media four
111
364
3K
6
7
112
@zacharynado
Zachary Nado
4 months
Gemini Pro 1.5 a week after Gemini Ultra and 70 days after Gemini Pro 1.0. Who says Google doesn't ship anymore? And with 10M context length, we've never been more back 🕺
Tweet media one
15
8
108
@zacharynado
Zachary Nado
5 months
and this is only Gemini Pro that's beating GPT4-V, just wait for Ultra
Tweet media one
@YFan_UCSC
Yue Fan
5 months
Distinguish muffins from chihuahuas in a multipanel web screenshot? No problem for humans (99% accuracy), but hard for Large Vision-Language Models (LVLMs) (39-72% accuracy)! To find out how LVLMs do and what affects their ability regarding multipanel image understanding, we
Tweet media one
2
9
35
9
10
101
@zacharynado
Zachary Nado
2 months
the real announcement openai timed with Google I/O
@ilyasut
Ilya Sutskever
2 months
After almost a decade, I have made the decision to leave OpenAI.  The company’s trajectory has been nothing short of miraculous, and I’m confident that OpenAI will build AGI that is both safe and beneficial under the leadership of @sama , @gdb , @miramurati and now, under the
2K
3K
27K
4
3
101
@zacharynado
Zachary Nado
7 months
@suchenzang we're in the $2 Uber rides phase of the AI tech cycle
3
9
92
@zacharynado
Zachary Nado
2 months
1.5 Pro is a very, very good model 🚀🚀 but even more excited for what we have in store 🕺
@lmsysorg
lmsys.org
2 months
More exciting news today -- Gemini 1.5 Pro result is out! Gemini 1.5 Pro API-0409-preview now achieves #2 on the leaderboard, surpassing #3 GPT4-0125-preview to almost top-1! Gemini shows even stronger performance on longer prompts, in which it ranks joint #1 with the latest
Tweet media one
Tweet media two
35
191
945
6
5
93
@zacharynado
Zachary Nado
4 years
great paper on how training data and model choices affect neural network robustness, confirming that if you train more you get better generalization on new test sets (also using a bigger model helps!)
Tweet media one
0
22
90
@zacharynado
Zachary Nado
5 months
soon
Tweet media one
@lmsysorg
lmsys.org
5 months
🔥Breaking News from Arena Google's Bard has just made a stunning leap, surpassing GPT-4 to the SECOND SPOT on the leaderboard! Big congrats to @Google for the remarkable achievement! The race is heating up like never before! Super excited to see what's next for Bard + Gemini
Tweet media one
154
627
3K
5
4
89
@zacharynado
Zachary Nado
7 months
the funniest timeline happened yet again
@zacharynado
Zachary Nado
7 months
sam and greg could do the funniest thing right now
Tweet media one
0
0
30
2
3
82
@zacharynado
Zachary Nado
7 months
also unlike many other top tier AI labs, we actually release some parameter counts and tell you how we fit Nano into Pixel phones (no other company has both SOTA models and a mobile platform like Google does)
Tweet media one
9
2
81
@zacharynado
Zachary Nado
1 year
@bryancsk pretty sure the issue isn't the wages but the fact they read a novel worth of disturbing content or view child porn or gore each day w/o health benefits to help with that? this is the same company as and employees still don't seem to be getting help
Tweet media one
3
1
77
@zacharynado
Zachary Nado
3 years
I'll start: we resubmitted a paper (with additional results based on previous reviews!) and received literally the exact same, character-for-character, copy-pasted review as we did for NeurIPS, which is of course a max confidence reject.
@zacharynado
Zachary Nado
3 years
logging onto today to see the fallout from ICLR reviews being released
Tweet media one
1
1
12
8
0
79
@zacharynado
Zachary Nado
2 years
fun fact or PSA depending on the audience: the default epsilon for LayerNorm in Flax is 1e-6, and 1e-5 in PyTorch! 🙃🔥
5
9
79
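The mismatched defaults in the PSA above (flax.linen.LayerNorm uses epsilon=1e-6, torch.nn.LayerNorm uses eps=1e-5) only enter through the denominator of the normalization; a bare NumPy layer norm shows the knob. This is an illustrative sketch, not either library's implementation:

```python
import numpy as np

def layer_norm(x, eps):
    """Normalize over the last axis; eps is the differing default
    (1e-6 in Flax, 1e-5 in PyTorch)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)
```

For well-scaled activations the two defaults are nearly indistinguishable; the gap only bites when the per-feature variance shrinks toward eps, which is exactly when a silent port between frameworks can change behavior.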
@zacharynado
Zachary Nado
7 months
they're scared of Gemini
@OpenAI
OpenAI
7 months
OpenAI announces leadership transition
4K
4K
14K
8
1
77
@zacharynado
Zachary Nado
1 month
@_M0neyMatters did chatgpt write this
2
0
75
@zacharynado
Zachary Nado
1 month
@stissle22 @SebastianSzturo what does that even mean? we didn't launch it "just to say we launched" ??? it's an actual product you can use right now, there are plenty of people who have been using it since Tues
3
0
73
@zacharynado
Zachary Nado
2 months
wait so gpt4v was not natively multimodal..?
@sama
Sam Altman
2 months
our new model: GPT-4o, is our best model ever. it is smart, it is fast, it is natively multimodal (!), and…
75
246
2K
14
2
74
@zacharynado
Zachary Nado
1 year
I've seen dozens of (well executed!) papers rise to fame claiming to be better than Adam, only to be forgotten 6 months later. we need to break the cycle!!
6
1
73
@zacharynado
Zachary Nado
2 months
what's with all the leaks from openai lately, that used to be our thing
@rachelmetz
Rachel Metz
2 months
my latest: openai is working on a search product to rival perplexity and google.
14
36
249
5
1
72
@zacharynado
Zachary Nado
1 year
either this considers GPT3 wrappers to be ML research (they're incredibly impressive but not really what I'd call "research"), or they don't consider the research openai was built on to be "research"?
2
1
70
@zacharynado
Zachary Nado
2 years
papers like this just reinforce my intuition that LM training setups are underdeveloped because everyone obsessed over scaling up num params. there is so much more to look into besides just the model size!!
@arankomatsuzaki
Aran Komatsuzaki
2 years
Transcending Scaling Laws with 0.1% Extra Compute Performs on par with PaLM 540B with 2x less compute by continuing training PaLM with UL2R.
Tweet media one
3
45
220
1
5
67
@zacharynado
Zachary Nado
4 years
all statues eventually evolve into crab
0
9
66
@zacharynado
Zachary Nado
1 year
"the only way I can explain why I thought about the problem for a year in grad school and made no progress, I left math for six years, then returned to the problem and made this breakthrough" sometimes stepping back from a problem is the best way forward!
2
11
67
@zacharynado
Zachary Nado
1 month
@RiceFarmerNFT dw I'm all good
2
0
62
@zacharynado
Zachary Nado
1 month
in addition to Gemini 1.5 Flash, we also have Flash-8B which is even faster yet still quite capable ⚡️
@DaLucasGonzalez
lucas g
1 month
Our updated Gemini 1.5 tech report is out! Excited to share a sneak peek of a new model we are working on: Flash-8B
5
7
61
3
5
62
@zacharynado
Zachary Nado
2 years
this is strictly worse than just browsing a shopping website. how are people unironically investing in this
@DigitalisHomo
Homo Digitalis
2 years
This is how Walmart envisions Shopping in the #Metaverse . Thoughts? 💭
7K
7K
34K
6
3
61
@zacharynado
Zachary Nado
7 months
"In conversations between The Atlantic and 10 current and former employees at OpenAI..." OpenAI beats GDM yet again, this time on number of employees who leak information to one article
3
4
58
@zacharynado
Zachary Nado
10 months
Jax >>> pytorch (even on GPU imo)
@borisdayma
Boris Dayma 🖍️
10 months
Seeing people struggling with FSDP… That's exactly where JAX shines, I can use pretty much any parallelism strategy with these few lines 💪
Tweet media one
4
17
118
5
1
59
@zacharynado
Zachary Nado
2 months
deep learning infra is hard to get right but so important, advancements in it enable totally new lines of research
@_ddjohnson
Daniel Johnson
2 months
Excited to share Penzai, a JAX research toolkit from @GoogleDeepMind for building, editing, and visualizing neural networks! Penzai makes it easy to see model internals and lets you inject custom logic anywhere. Check it out on GitHub:
43
425
2K
0
6
59
@zacharynado
Zachary Nado
1 year
very excited for the palm 2 tech report to be out! it's been incredibly fun figuring out the learning rate for some of the best models in the world ...but I'm even more excited for Gemini to beat it 🚀📈🚀
@Google
Google
1 year
This includes our new foundation model that's still in training, Gemini. It’s our first model created from the ground up to be multimodal, highly capable at different sizes, and efficient at integrating with other tools and APIs. #GoogleIO
7
43
228
4
4
58
@zacharynado
Zachary Nado
1 month
on top of the new and impressive capabilities of Pro 1.5, Gemini 1.5 Flash is such a good model for how fast it is ⚡️⚡️⚡️
@JeffDean
Jeff Dean (@🏡)
1 month
Gemini 1.5 Model Family: Technical Report updates now published In the report we present the latest models of the Gemini family – Gemini 1.5 Pro and Gemini 1.5 Flash, two highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information
Tweet media one
Tweet media two
Tweet media three
28
237
996
1
3
58
@zacharynado
Zachary Nado
10 months
billions of dollars of deep learning market cap:
@Soul0Engineer
Soul Engineer e/acc
10 months
Just move around anon
Tweet media one
43
274
3K
0
4
51
@zacharynado
Zachary Nado
4 months
@hahahahohohe @AnthropicAI do you have access to Gemini 1.5 Pro to try this as a comparison point? if not DM me and we'll get you access
2
0
55
@zacharynado
Zachary Nado
1 year
how is the OpenAI hype so bad that you have me agreeing with Gary Marcus takes for once??
@GaryMarcus
Gary Marcus
1 year
a new version of moore’s law that has arguably already started: the amount of hype around AI doubles every 18 months
33
85
683
2
2
52
@zacharynado
Zachary Nado
3 years
Parameter count is a silly metric to assert AI progress with, but I'm also not surprised
@omarsar0
elvis
3 years
BREAKING: BAAI (dubbed "the OpenAI of China") launched Wudao, a 1.75 trillion parameter pretrained deep learning model (potentially the world's largest). Wudao has 150 billion more parameters than Google's Switch Transformers, and is 10x that of GPT-3.
Tweet media one
16
219
695
4
6
53
@zacharynado
Zachary Nado
2 years
🎉🎉 our NeurIPS workshop on how to train neural nets has been accepted! 💯 please submit your weird tips & tricks on NN training, we can't wait to discuss them all together 😃🔥🖥️
@PhilippHennig5
Philipp Hennig
2 years
The CfP for our @NeurIPSConf workshop *Has It Trained Yet* is out: . If you train deep networks, you want to be at this workshop on December 2. And if you develop methods to train deep nets, you may want your work to be present there. Here’s why: 🧵
2
21
80
3
3
51
@zacharynado
Zachary Nado
7 months
Gemini models are SOTA on all image, video, and speech benchmarks we run on, and almost all text benchmarks
Tweet media one
Tweet media two
5
2
52
@zacharynado
Zachary Nado
11 months
to no one's surprise, recently trendy techniques don't stand the test of time against a well tuned baseline!
@_arohan_
rohan anil
11 months
Some excellent work by @jeankaddour and colleagues “We find that their training, validation, and downstream gains vanish compared to a baseline with a fully-decayed learning rate” ☠️
Tweet media one
5
33
186
3
4
52
@zacharynado
Zachary Nado
2 years
classic tech opinion of "invent futuristic vaporware" instead of doing the dirty work of fixing policy issues
@VitalikButerin
vitalik.eth
2 years
@Noahpinion My heterodox take on US transit is that if infrastructure problems are too hard to solve, the transit of the future is airplanes, and we should just make airplanes better by (i) making them zero-carbon, and (ii) improving comfort by greatly cutting down airport security
168
45
994
10
3
42
@zacharynado
Zachary Nado
8 months
detecting AI content is the next adversarial examples: tons of research will be spent on it, only to come up with "defenses" that are broken within 1 day of publication
@emollick
Ethan Mollick
8 months
AI work is ultimately undetectable, despite the recent discussion of watermarking. AI writing is undetectable by any automated system after just a few rounds of prompting or revision. This paper shows it is also easy to defeat watermarking for AI images.
Tweet media one
24
114
469
5
4
52
@zacharynado
Zachary Nado
2 years
>1 epoch training of an LLM, finally people are realizing this is possible 🙂
@paperswithcode
Papers with Code
2 years
We train for over four epochs and experience improving performance with use of repeated tokens. For the largest 120B model, we trained for four epochs without overfitting.
Tweet media one
1
4
110
3
3
50
@zacharynado
Zachary Nado
2 months
@ryxcommar 1000%, time to short it all
0
0
50
@zacharynado
Zachary Nado
2 years
@julien_c `pip install jax flax optax`
0
3
50
@zacharynado
Zachary Nado
2 months
Google I/O isn't the only AI announcement Gemini watched 🕺
@mmmbchang
Michael Chang
2 months
Gemini and I also got a chance to watch the @OpenAI live announcement of gpt4o, using Project Astra! Congrats to the OpenAI team, super impressive work!
56
254
1K
2
5
48
@zacharynado
Zachary Nado
7 months
this is only the beginning of the Google software ecosystem getting supercharged by AI
Tweet media one
@shresbm
Shrestha Basu Mallick
7 months
Google Search users with Search Generative Experiences (SGE) turned on will now be able to export responses to Python-related queries to a new Colab notebook directly! You can run the code, tinker with it in Colab and save the notebook for future reference! #GoogleAI #Colab
0
8
73
1
1
46
@zacharynado
Zachary Nado
2 months
I'm disappointed they're too cowardly to actually launch in the middle of Google I/O
@apples_jimmy
Jimmy Apples 🍎/acc
2 months
10am, 9th of May for an Openai event apparently, might not be model release but search engine announcement. Guess they can’t help themselves to upstage Google I/O ( Can’t guarantee this, event times and dates can be changed )
1
70
576
5
1
46
@zacharynado
Zachary Nado
4 years
@araffin2 I've long argued for tuning epsilon; in Adam it can be interpreted as a damping/trust region radius term. See Section 2 of our paper
Tweet media one
2
1
45
@zacharynado
Zachary Nado
2 months
sign up for the wait-list here
@GoogleDeepMind
Google DeepMind
2 months
Introducing Veo: our most capable generative video model. 🎥 It can create high-quality, 1080p clips that can go beyond 60 seconds. From photorealism to surrealism and animation, it can tackle a range of cinematic styles. 🧵 #GoogleIO
149
956
4K
6
12
45
@zacharynado
Zachary Nado
7 months
@DrJimFan Satya said Google would be dancing with them, here we are 🕺
3
0
44
@zacharynado
Zachary Nado
7 months
during generation it's very impressive how seamlessly it interleaves text/image, imo for models going forward being able to condition image generation on neighboring text is going to be important
Tweet media one
Tweet media two
1
0
45
@zacharynado
Zachary Nado
25 days
if you need a reliable AI in these trying times just go to or ♊️♊️🚀🚀
4
2
45
@zacharynado
Zachary Nado
3 years
my expectations were low but somehow the NeurIPS review process still disappoints! we will be writing up a postmortem and posting the reviews
Tweet media one
1
2
45
@zacharynado
Zachary Nado
2 years
"You can interrogate the data sets. You can interrogate the model. You can interrogate the code of Stable Diffusion and the other things we're doing," he said. "And we're seeing it being improved all the time." lol you can do all of that with a controlled API too
@irinarish
Irina Rish
2 years
"In Silicon Valley, crypto and the metaverse are out. Generative A.I. is in." @StabilityAI (nice pic of @EMostaque ;)
5
25
156
6
5
44
@zacharynado
Zachary Nado
14 days
⚡️⚡️⚡️ flash is honestly a game changer and I am glad people are finally catching on to how big of a deal it is
@Iliane_5
Iliane
15 days
i have a bulk eval i need to run that will be ~10M tokens in & ~15M tokens out did some preliminary testing with opus/haiku, gpt4o, gemini flash/pro, cmd-r+ -> the geminis & gpt4o did really good, opus a bit worse ran the numbers: - opus: $1275 - gpt4o: $275 - gemini pro:
0
2
47
3
1
44
@zacharynado
Zachary Nado
6 years
Tennis ball dog is one of the best GAN creations I've seen to date (from the BigGAN ICLR paper)
Tweet media one
2
10
42
@zacharynado
Zachary Nado
1 year
@typedfemale sam walks up to a sr alignment engineer: "at ease. what have you been working on here?" "i did my phd getting robots to solve rubiks cubes without resorting to chatbots, I'm continuing that with one burnt out effective altruist stanford ugrad" sam: "shut the entire thing down"
2
2
42
@zacharynado
Zachary Nado
2 years
working on a project where we are implementing a bunch of DL workloads in pytorch and jax/flax/optax, and pytorch is not what everyone hyped it up to be!
1
0
42