Alex Tomala Profile
Alex Tomala

@a__tomala

1,426 Followers
141 Following
7 Media
173 Statuses

Research Engineer @GoogleDeepMind. It’s time to think 🫡

San Francisco, CA
Joined January 2023
@a__tomala
Alex Tomala
6 months
What is always impressive to me about Gemini is that many of our rising stars don’t come from the traditional PhD process. Often people without grad degrees or even undergrad degrees are providing substantial contributions.
24
34
510
@a__tomala
Alex Tomala
10 months
Ever since I started working in ML, much of what I did at various places never saw the light of day. Whether it was due to the company going under or research threads not working out, it felt fairly demoralizing to know the work I did was meaningless at the end of the day. (1/2)
@OriolVinyalsML
Oriol Vinyals
10 months
Exciting times, welcome Gemini (and MMLU>90)! State-of-the-art on 30 out of 32 benchmarks across text, coding, audio, images, and video, with a single model 🤯 Co-leading Gemini has been my most exciting endeavor, fueled by a very ambitious goal. And that is just the beginning!
65
273
2K
7
8
186
@a__tomala
Alex Tomala
6 months
When I was talking about people who don’t come from a traditional PhD process, Sholto is one of those examples. He recently talked about his career path here.
@dwarkesh_sp
Dwarkesh Patel
6 months
How @_sholtodouglas got scouted by Google DeepMind: “Every night from 10 PM till 2 AM, I would do my own research. @jekbradbury saw some of my questions online and was like, ‘I thought I knew all the people in the world who were asking these questions. Who on Earth are you?’”
24
109
1K
6
12
179
@a__tomala
Alex Tomala
6 months
Just use TPUs
14
4
125
@a__tomala
Alex Tomala
10 months
Working at DeepMind on Gemini has been the highlight of my career and I’m excited knowing that I have contributed a small but important part to something that millions of people will use 💙 (2/2)
1
1
110
@a__tomala
Alex Tomala
2 months
Everyone eventually returns home to Google
6
3
117
@a__tomala
Alex Tomala
6 months
If you look closely, you can see the entire Jax team in this photo
@epiqueras1
Enrique Piqueras
6 months
For those that dare.
Tweet media one
3
4
53
2
5
90
@a__tomala
Alex Tomala
6 months
The main consistent factor that I find in all of our high performers is a strong motivational drive to tackle problems and to learn about new concepts. That drive is correlated with having a degree, but the two don’t always go together.
1
3
86
@a__tomala
Alex Tomala
10 months
I’m super excited to finally share what I have been working on!!!!
@JeffDean
Jeff Dean (@🏡)
10 months
I’m very excited to share our work on Gemini today! Gemini is a family of multimodal models that demonstrate really strong capabilities across the image, audio, video, and text domains. Our most-capable model, Gemini Ultra, advances the state of the art in 30 of 32 benchmarks,
Tweet media one
Tweet media two
273
3K
13K
3
4
83
@a__tomala
Alex Tomala
9 months
@inerati The first time a guy who went to the gym pinned me down (it was a friend I was play-wrestling with) was a frightening experience. I couldn’t move at all and I realized how weak I was (this was with me strength training multiple times a week).
2
0
69
@a__tomala
Alex Tomala
6 months
I see this in particular with the Jax engagements team, who managed to attract a ton of extremely good talent by specifically recruiting people who were highly motivated to build with Jax, regardless of their background.
1
0
66
@a__tomala
Alex Tomala
5 months
Never bet against Gemini 💙
@OriolVinyalsML
Oriol Vinyals
5 months
Today we have published our updated Gemini 1.5 Model Technical Report. As @JeffDean highlights, we have made significant progress in Gemini 1.5 Pro across all key benchmarks; TL;DR: 1.5 Pro > 1.0 Ultra, 1.5 Flash (our fastest model) ~= 1.0 Ultra. As a math undergrad, our drastic
Tweet media one
42
209
1K
1
0
62
@a__tomala
Alex Tomala
6 months
Never bet against Jax
0
3
55
@a__tomala
Alex Tomala
6 months
Some people I know managed to do it through grad degrees, others do it by making blog posts on the side. There isn’t one specific approach but a variety of different paths.
1
0
49
@a__tomala
Alex Tomala
5 months
It took less than 1 day for Project Astra to find Ilya 🫡
@ilyasut
Ilya Sutskever
5 months
After almost a decade, I have made the decision to leave OpenAI. The company’s trajectory has been nothing short of miraculous, and I’m confident that OpenAI will build AGI that is both safe and beneficial under the leadership of @sama, @gdb, @miramurati and now, under the
2K
3K
26K
0
1
40
@a__tomala
Alex Tomala
6 months
I used Claude to help me with my tax return and I got like 11 million dollars in tax credits. Would recommend.
@nearcyan
near
6 months
I LOVE YOU CLAUDE OMG
Tweet media one
52
122
2K
4
1
35
@a__tomala
Alex Tomala
10 months
Have a Shoggoth sticker on my work laptop so I can pet it every day 😊
@satyanutella_
Satya Nutella
10 months
gm
Tweet media one
15
51
651
0
4
35
@a__tomala
Alex Tomala
7 months
Can someone please help me, my multi-TPU sharded Jax code is easy to understand and runs fast
0
1
24
@a__tomala
Alex Tomala
4 months
Finally discovered the source of where the cracked interns come from
Tweet media one
2
0
23
@a__tomala
Alex Tomala
6 months
@kyritzb @cgarciae88 I don’t even have a masters degree. The reality is that a ton of people we hire don’t have fancy degrees. I believe at least a third of people I typically work with don’t have PhDs.
3
0
20
@a__tomala
Alex Tomala
5 months
Now do it with a real unit of currency (H100s)
@0xgaut
gaut
5 months
do not show this to me ever again
Tweet media one
49
422
6K
1
0
15
@a__tomala
Alex Tomala
7 months
Pretty disappointed that the Claude 3 release is not at the top of Hacker News. Their results are really impressive and are SOTA on many benchmarks.
1
0
14
@a__tomala
Alex Tomala
9 months
Donating an extra $500 to the Against Malaria Foundation to get back at the mosquito that just flew into my room and is hiding somewhere.
1
0
13
@a__tomala
Alex Tomala
10 months
@typedfemale @cis_female Always fun at work when people are like “I know typed” and it turns out it’s one of your decoys.
0
0
10
@a__tomala
Alex Tomala
10 months
I won’t be surprised if evaluation methodology becomes the #1 factor determining SOTA model performance within a few years. Manual human evals can be prone to bias and are costly, while “auto human” evals are hard to design.
@abacaj
anton
10 months
At the end of the day, it won’t matter how good these models do on paper. People (users) will ultimately decide which one is the best, it will sort itself out with time if Gemini is actually better
13
10
217
1
0
10
@a__tomala
Alex Tomala
7 months
@typedfemale @cgarciae88 Everything is going according to plan
0
0
9
@a__tomala
Alex Tomala
10 months
❌ Improving MMLU score to beat GPT-4 ✅ Improving MMLU score to win a Manifold Market
0
0
9
@a__tomala
Alex Tomala
6 months
@thomasahle @cgarciae88 I find that the best people at Gemini are capable of doing this and these are also skills that you can learn without a PhD.
0
0
9
@a__tomala
Alex Tomala
8 months
Still waiting for the true successor to Shampoo (Clarifying Shampoo)
2
0
8
@a__tomala
Alex Tomala
10 months
Solving AGI so we can improve our evaluation suite 🥺
0
0
8
@a__tomala
Alex Tomala
11 months
@iScienceLuvr One character stroke off from being “feed the AGI” 😳
0
0
8
@a__tomala
Alex Tomala
5 months
@deedydas There are still alternative pathways to green cards that don’t require PERM like EB-2 NIW, which a bunch of companies seem to still offer sponsorship for.
0
0
8
@a__tomala
Alex Tomala
11 months
Attention Is All You Need is spreading to other disciplines now 😳
@ProfNoto
Matt Notowidigdo
11 months
KARTHIK SRINIVASAN is an applied microeconomist studying topics in media economics and political economy. His JMP studies the desire for attention in the context of social media using data from Reddit and TikTok
Tweet media one
2
34
378
2
0
7
@a__tomala
Alex Tomala
9 months
@typedfemale For a true challenge, try to make gluten free cookies
0
0
6
@a__tomala
Alex Tomala
6 months
@dacapo_go It really depends on what excites you. If you want to study something like Statistical Learning Theory, you don’t get many opportunities in industry to do that. If you are doing a PhD mainly to do applied stuff, it can be hit or miss.
1
0
7
@a__tomala
Alex Tomala
8 months
1
0
7
@a__tomala
Alex Tomala
7 months
Containment breach
Tweet media one
0
0
7
@a__tomala
Alex Tomala
10 months
@typedfemale Just wait until I do all 3 at once 😈
0
0
5
@a__tomala
Alex Tomala
6 months
@spectate_or Definitely a bit sad to hear that. I’m a strong advocate of opening up more research positions through non-traditional pathways.
0
0
6
@a__tomala
Alex Tomala
11 months
@typedfemale Ur literally in front of me how did u tweet this
0
0
6
@a__tomala
Alex Tomala
6 months
@dacapo_go Feels similar to strength training with and without a personal trainer. You lose some money by hiring a personal trainer, but it makes it easier to force yourself to work out, while you can save money and get similar results by not having a personal trainer if you have the drive.
1
0
6
@a__tomala
Alex Tomala
6 months
@semiDL And that’s really disappointing to see. To me, the residency programs that many companies did seem like a really good way to attract upcoming talent.
1
0
6
@a__tomala
Alex Tomala
6 months
@dacapo_go One nice thing about PhDs is that you are often put in an environment where you learn about a ton of different ideas, so it’s easier to succeed if you have lower motivation. In industry, you can expose yourself to a ton of ideas but it requires more effort.
2
0
5
@a__tomala
Alex Tomala
11 months
@typedfemale @SingularMattrix @cHHillee @apaszke @PiotrPadlewski @giffmana @TimDarcet @jekbradbury It’s really useful for reducing compilation time since you can compile the block once and reuse the results instead of compiling a model that has the same block unrolled n times.
0
0
4
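The compilation-time point in the reply above can be illustrated with a small sketch. The thread being replied to is not shown here, so it is an assumption that the technique under discussion resembles `jax.lax.scan`, which traces a loop body once instead of once per layer. The `block`, `unrolled`, and `scanned` functions and the toy shapes are hypothetical names and values chosen purely for illustration.

```python
import jax
import jax.numpy as jnp


def block(x, w):
    # Hypothetical repeated "layer": a matmul followed by a nonlinearity.
    return jnp.tanh(x @ w)


def unrolled(x, ws):
    # A Python loop is unrolled at trace time, so the traced program
    # contains one copy of the block per layer.
    for i in range(ws.shape[0]):
        x = block(x, ws[i])
    return x


def scanned(x, ws):
    # lax.scan traces the body once and iterates over the leading
    # axis of ws, so the traced program holds a single copy of the block.
    def body(carry, w):
        return block(carry, w), None

    out, _ = jax.lax.scan(body, x, ws)
    return out


n, d = 8, 16  # toy sizes: 8 identical blocks of width 16
ws = jax.random.normal(jax.random.PRNGKey(0), (n, d, d)) * 0.1
x = jnp.ones((4, d))

y_unrolled = jax.jit(unrolled)(x, ws)
y_scanned = jax.jit(scanned)(x, ws)
```

Both versions compute the same result. The difference is in what the compiler sees: the unrolled trace grows with `n`, while the scanned trace stays the same size, which is where a compile-time saving would come from.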
@a__tomala
Alex Tomala
6 months
@mihaitensor My take is that most PhDs that optimize against benchmarks are not in fact theoretical. Of course there are certain aspects like inference that are not prioritized as much, but you still see it when people talk about things like efficiently sampling generative models.
1
0
5
@a__tomala
Alex Tomala
5 months
Idk what levels even really mean? Maybe it means you get paid a bit more, but I’m not even sure about that. It seems like a fun little number that may go up once in a while.
@albertwebson
Albert Webson
5 months
Okay so I only discuss hot takes with Jason privately but this one I feel obligated to disagree publicly: Almost no one cares about levels/seniority in Gemini. Much of the real work and real decisions are made by ICs with experiment results & TensorBoards, not levels.
13
14
289
1
0
5
@a__tomala
Alex Tomala
7 months
@typedfemale @PandaAshwinee Hard to tell who you can trust these days
0
0
3
@a__tomala
Alex Tomala
3 months
@inerati I thought Anthropic stopped forcing people to take vacations
1
0
4
@a__tomala
Alex Tomala
10 months
@inerati Opioids don’t actually bind to any receptors in the body. We just hyped them up so much (thanks DARE) that everyone experiences an intense placebo effect.
0
0
4
@a__tomala
Alex Tomala
5 months
@albertwebson The real trick is framing the microkitchen as an all day long meeting :>
1
0
4
@a__tomala
Alex Tomala
7 months
You can tell it’s tax season because everyone is talking about AGI
0
0
4
@a__tomala
Alex Tomala
8 months
@ElytraMithra Ely as a Service 🥺🥺🥺
1
0
4
@a__tomala
Alex Tomala
10 months
Alternative theory is that my SF apartment is no longer 25C+ at night which was insane.
0
0
4
@a__tomala
Alex Tomala
5 months
@fchollet Personally I find that something like Project Astra or GPT4o is really nice to talk to when I’m doing chores or other activities around the house. Useful way to learn new things without much work on my end.
1
0
3
@a__tomala
Alex Tomala
9 months
Thinking about how Castro is the only Muni subway station that I know of that isn’t straight
Tweet media one
0
0
4
@a__tomala
Alex Tomala
7 months
If Clong is so good, why is there not a Clonger?
1
0
4
@a__tomala
Alex Tomala
6 months
@cgarciae88 This is literally where I work
0
0
4
@a__tomala
Alex Tomala
10 months
Insane how @ManifoldMarkets causes results to manifest in real life
Tweet media one
1
0
3
@a__tomala
Alex Tomala
6 months
@typedfemale I’ve been saying this for a while
0
0
3
@a__tomala
Alex Tomala
10 months
Unironically think that drinking more caffeine has improved my ability to fall asleep at night. Guessing the withdrawal is reducing the amount of norepinephrine in my system at night which makes sleeping easier.
1
0
3
@a__tomala
Alex Tomala
5 months
@inerati I feel like SFO is a pretty small airport with a ton of terminals which makes walking to the gate fast and requires them to have a ton of security checkpoints (making it harder for them to understaff on security)
0
0
3
@a__tomala
Alex Tomala
8 months
Bearish on AI agent founders that aren’t agentic
0
0
3
@a__tomala
Alex Tomala
11 months
@nearcyan Elon’s safety plan is suddenly making so much more sense
1
0
2
@a__tomala
Alex Tomala
7 months
@circularliz Thank you for your service 🫡
0
0
3
@a__tomala
Alex Tomala
11 months
How is the guy who is a partial competitor to ChatGPT (Adam D’Angelo with Poe) still on the board?
@OpenAI
OpenAI
11 months
We have reached an agreement in principle for Sam Altman to return to OpenAI as CEO with a new initial board of Bret Taylor (Chair), Larry Summers, and Adam D'Angelo. We are collaborating to figure out the details. Thank you so much for your patience through this.
6K
13K
66K
0
0
1
@a__tomala
Alex Tomala
8 months
@typedfemale Thank you for keeping up the good fight 🫡
0
0
1
@a__tomala
Alex Tomala
9 months
@DZhang50 @itsclivetime Something I’m confused about is why does the BTB need a cache. Based on your description it sounds like it just goes over a cache line and determines which instructions contain a branch, which doesn’t seem to need caching.
1
0
2
@a__tomala
Alex Tomala
8 months
@_arohan_ What did he know…
0
0
2
@a__tomala
Alex Tomala
11 months
@inerati Biden in SF as well though 👀
1
0
2
@a__tomala
Alex Tomala
10 months
0
0
2
@a__tomala
Alex Tomala
11 months
@cis_female @typedfemale We can fix them…
1
0
2
@a__tomala
Alex Tomala
9 months
@cgarciae88 @giffmana Isn’t Jax team supposed to be driving engagement though :)
1
0
2
@a__tomala
Alex Tomala
8 months
@typedfemale Thank you for existing Typed
1
0
2
@a__tomala
Alex Tomala
5 months
@KevinKiao We are going to miss you 😢
0
0
2
@a__tomala
Alex Tomala
9 months
@nearcyan Near noooooo
0
0
2
@a__tomala
Alex Tomala
11 months
This is a @typedfemale subtweet
0
0
2
@a__tomala
Alex Tomala
9 months
Been noticing a certain trend that I want to verify. Do you live in SF? Is your primary phone an iPhone?
In SF/iPhone: 9
In SF/Not iPhone: 1
Not In SF/iPhone: 19
Not In SF/Not iPhone: 22
1
0
2
@a__tomala
Alex Tomala
11 months
Determining if this paper is valid by increasing the amount of attention other people give me (by posting)
0
0
2
@a__tomala
Alex Tomala
2 months
@typedfemale Would you still love me if I was a 🧵? 🥺🥺🥺
0
0
2
@a__tomala
Alex Tomala
11 months
It’s time to sleep 🫡 (and not spend endless hours around others posting about OpenAI)
1
0
2
@a__tomala
Alex Tomala
11 months
@inerati no upper cases lets u increase information content per token
0
0
2
@a__tomala
Alex Tomala
11 months
@inerati Liz CEO arc!?
0
0
2
@a__tomala
Alex Tomala
9 months
Increasing the EV (Expected Vengeance) per dollar 🙂
0
0
2
@a__tomala
Alex Tomala
8 months
1
0
2
@a__tomala
Alex Tomala
9 months
@inerati It was nice knowing u Liz
0
0
2
@a__tomala
Alex Tomala
11 months
@inerati Lilac 🥺
0
0
1
@a__tomala
Alex Tomala
6 months
@typedfemale ?????????????????
0
0
2
@a__tomala
Alex Tomala
8 months
0
0
2
@a__tomala
Alex Tomala
9 months
@inerati @kev95570657 Owning a car can be pretty annoying as well. Insurance, gas, and maintenance all add up.
0
0
1
@a__tomala
Alex Tomala
10 months
@inerati Thinking about it but idk if anyone I know is going this year
0
0
1
@a__tomala
Alex Tomala
11 months
@inerati In practice I see people using hard attention (drugs) less than soft attention (drugs) like caffeine
0
0
1
@a__tomala
Alex Tomala
8 months
@ElytraMithra @microsoft_worm @0xellipse I believe in giving them love 🥺
0
0
1
@a__tomala
Alex Tomala
9 months
@inerati Are you going though?
0
0
1
@a__tomala
Alex Tomala
10 months
0
0
1
@a__tomala
Alex Tomala
9 months
0
0
1
@a__tomala
Alex Tomala
10 months
@bilaltwovec come to your other home bilal
0
0
1
@a__tomala
Alex Tomala
7 months
@yacineMTB Honestly don’t really see that many Waterloo grads at the jobs I work at. In my career, I think I’ve worked with a single person who was a Waterloo grad.
0
0
1
@a__tomala
Alex Tomala
5 months
Largely agree with this, though I do believe that there are some instances of capabilities and safety work that don’t overlap (e.g. adding multimodal capabilities and aligning them are two separate tasks).
@deepfates
deepfates
5 months
How many times do i have to keep saying this. Safety work and capabilities work are THE SAME THING. All the great capabilities advances come out of safety work! The people who actually believe in superintelligent capabilities are WORRIED ABOUT SAFETY AND THEY ARE RIGHT
25
10
202
1
0
1