What is always impressive to me about Gemini is that many of our rising stars don’t come from the traditional PhD process. Often people without grad degrees or even undergrad degrees are making substantial contributions.
Ever since I started working in ML, much of what I did at various places never reached the light of day. Whether it was due to the company going under or research threads not working out, it felt fairly demoralizing to know the work I did was meaningless at the end of the day. (1/2)
Exciting times, welcome Gemini (and MMLU>90)! State-of-the-art on 30 out of 32 benchmarks across text, coding, audio, images, and video, with a single model 🤯
Co-leading Gemini has been my most exciting endeavor, fueled by a very ambitious goal. And that is just the beginning!
When I was talking about people who don’t come from a traditional PhD process, Sholto is one of those examples.
He recently talked about his career path here.
How @_sholtodouglas got scouted by Google DeepMind:
“Every night from 10 PM till 2 AM, I would do my own research. @jekbradbury saw some of my questions online and was like, ‘I thought I knew all the people in the world who were asking these questions. Who on Earth are you?’”
Working at DeepMind on Gemini has been the highlight of my career and I’m excited knowing that I have contributed a small but important part to something that millions of people will use 💙 (2/2)
The most consistent factor I find in all of our high performers is a strong motivational drive to tackle problems and to learn new concepts. That drive is correlated with degrees, but not always.
I’m very excited to share our work on Gemini today! Gemini is a family of multimodal models that demonstrate really strong capabilities across the image, audio, video, and text domains. Our most-capable model, Gemini Ultra, advances the state of the art in 30 of 32 benchmarks.
@inerati
The first time a guy who went to the gym pinned me down (it was a friend I was play-wrestling with) was a frightening experience. I couldn’t move at all, and I realized how weak I was (and this is with me strength training multiple times a week).
I see this in particular with the Jax engagements team, who managed to attract a ton of extremely good talent by specifically recruiting people who were highly motivated to build with Jax, regardless of their background.
Today we have published our updated Gemini 1.5 Model Technical Report. As @JeffDean highlights, we have made significant progress in Gemini 1.5 Pro across all key benchmarks; TL;DR: 1.5 Pro > 1.0 Ultra, 1.5 Flash (our fastest model) ~= 1.0 Ultra.
As a math undergrad, our drastic
Some people I know managed to do it through grad degrees, others do it by making blog posts on the side. There isn’t one specific approach but a variety of different paths.
After almost a decade, I have made the decision to leave OpenAI. The company’s trajectory has been nothing short of miraculous, and I’m confident that OpenAI will build AGI that is both safe and beneficial under the leadership of @sama, @gdb, @miramurati and now, under the
@kyritzb
@cgarciae88
I don’t even have a masters degree. The reality is that a ton of people we hire don’t have fancy degrees. I believe at least a third of people I typically work with don’t have PhDs.
Pretty disappointed that the Claude 3 release is not at the top of Hacker News.
Their results are really impressive, with SOTA numbers on many benchmarks.
I won’t be surprised if evaluation methodology is the #1 factor determining SOTA model performance within a few years.
Getting manual human evals can be prone to bias and is costly, while “auto human” evals are hard to design.
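One common “auto human” setup is pairwise preference judging, where a judge (human or model) picks a winner between two outputs and win rates are aggregated per model. A minimal sketch of that aggregation step, with hypothetical model names, not any specific eval harness:

```python
from collections import Counter

def win_rates(pairwise_judgments):
    """Aggregate pairwise preference judgments into per-model win rates.

    pairwise_judgments: list of (model_a, model_b, winner) tuples,
    where winner is one of the two model names.
    Returns {model: wins / comparisons_it_appeared_in}.
    """
    wins, games = Counter(), Counter()
    for a, b, winner in pairwise_judgments:
        games[a] += 1
        games[b] += 1
        wins[winner] += 1
    return {m: wins[m] / games[m] for m in games}

# Hypothetical judgments: model_x preferred in 2 of 3 head-to-heads.
judgments = [
    ("model_x", "model_y", "model_x"),
    ("model_x", "model_y", "model_x"),
    ("model_x", "model_y", "model_y"),
]
rates = win_rates(judgments)
print(rates)  # model_x wins 2/3 of its comparisons, model_y 1/3
```

The hard part the tweet alludes to isn’t this bookkeeping but the judging itself: human judges introduce bias and cost, while automated judges are hard to design and validate.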
At the end of the day, it won’t matter how well these models do on paper. People (users) will ultimately decide which one is best; it will sort itself out with time if Gemini is actually better.
@deedydas
There are still alternative pathways to green cards that don’t require PERM like EB-2 NIW, which a bunch of companies seem to still offer sponsorship for.
Karthik Srinivasan is an applied microeconomist studying topics in media economics and political economy. His JMP studies the desire for attention in the context of social media, using data from Reddit and TikTok.
@dacapo_go
It really depends on what excites you. If you want to study something like Statistical Learning Theory, you don’t get many opportunities in industry to do that.
If you are doing a PhD mainly to do applied work, it can be hit or miss.
@dacapo_go
Feels similar to strength training with and without a personal trainer. You lose some money by hiring a personal trainer, but it makes it easier to force yourself to work out, while you can save money and get similar results by not having a personal trainer if you have the drive.
@semiDL
And that’s really disappointing to see. To me, the residency programs that many companies did seem like a really good way to attract upcoming talent.
@dacapo_go
One nice thing about PhDs is that you are often put in an environment where you learn about a ton of different ideas, so it’s easier to succeed if you have lower motivation.
In industry, you can expose yourself to a ton of ideas but it requires more effort.
@mihaitensor
My take is that most PhDs that optimize against benchmarks are not in fact theoretical.
Of course there are certain aspects like inference that are not prioritized as much, but you still see it when people talk about things like efficiently sampling generative models.
Idk what levels even really mean? Maybe it means you get paid a bit more, but I’m not even sure about that. It seems like a fun little number that may go up once in a while.
Okay so I only discuss hot takes with Jason privately but this one I feel obligated to disagree publicly: Almost no one cares about levels/seniority in Gemini. Much of the real work and real decisions are made by ICs with experiment results & TensorBoards, not levels.
@inerati
Opioids don’t actually bind to any receptors in the body. We just hyped them up so much (thanks DARE) that everyone experiences an intense placebo effect.
@fchollet
Personally I find that something like Project Astra or GPT4o is really nice to talk to when I’m doing chores or other activities around the house. Useful way to learn new things without much work on my end.
Unironically think that drinking more caffeine has improved my ability to fall asleep at night.
Guessing the withdrawal is reducing the amount of norepinephrine in my system at night which makes sleeping easier.
@inerati
I feel like SFO is a pretty small airport with a ton of terminals, which makes walking to the gate fast and requires them to have a ton of security checkpoints (making it harder for them to understaff security).
We have reached an agreement in principle for Sam Altman to return to OpenAI as CEO with a new initial board of Bret Taylor (Chair), Larry Summers, and Adam D'Angelo.
We are collaborating to figure out the details. Thank you so much for your patience through this.
@DZhang50
@itsclivetime
Something I’m confused about is why the BTB needs a cache. Based on your description, it sounds like it just scans a cache line and determines which instructions contain a branch, which doesn’t seem to require caching.
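For context on the question above: a branch target buffer (BTB) is typically described as a small cache mapping a branch’s fetch PC to its predicted target, so the front end can redirect fetch before the instruction bytes are even decoded; scanning the fetched line for branches only works after decode, which is too late for a zero-bubble redirect. A toy direct-mapped sketch (the structure and sizes are illustrative, not any real microarchitecture):

```python
class ToyBTB:
    """Toy direct-mapped branch target buffer: maps branch PC -> predicted target."""

    def __init__(self, entries=16):
        self.entries = entries
        self.table = [None] * entries  # each slot: (tag, target) or None

    def _index_tag(self, pc):
        # Low bits of the PC index the table; remaining bits form the tag.
        return pc % self.entries, pc // self.entries

    def lookup(self, pc):
        """Predicted target for a fetch PC, or None on a BTB miss (predict fall-through)."""
        idx, tag = self._index_tag(pc)
        slot = self.table[idx]
        if slot is not None and slot[0] == tag:
            return slot[1]
        return None

    def update(self, pc, target):
        """Install/refresh the taken-branch target once the branch resolves."""
        idx, tag = self._index_tag(pc)
        self.table[idx] = (tag, target)


btb = ToyBTB()
btb.update(0x400A, 0x4100)   # branch at 0x400A resolved taken to 0x4100
print(btb.lookup(0x400A))    # hit: redirect fetch to the cached target
print(btb.lookup(0x400B))    # miss: no entry, fetch continues sequentially
```

The caching is what buys the speed: the predicted target is available in the fetch stage from the PC alone, without waiting to decode the line.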
@yacineMTB
Honestly don’t really see that many Waterloo grads at the jobs I work at. In my career, I think I’ve worked with a single person who was a Waterloo grad.
Largely agree with this, though I do believe that there are some instances of capabilities and safety work that don’t overlap (e.g. adding in multimodality capabilities and aligning it are two separate tasks)
How many times do I have to keep saying this.
Safety work and capabilities work are THE SAME THING. All the great capabilities advances come out of safety work! The people who actually believe in superintelligent capabilities are WORRIED ABOUT SAFETY
AND THEY ARE RIGHT