Nat McAleese Profile Banner
Nat McAleese Profile
Nat McAleese

@__nmca__

7,185 Followers · 319 Following · 44 Media · 383 Statuses

Research @OpenAI. Previously @DeepMind. Views my own.

Joined September 2021
Pinned Tweet
@__nmca__
Nat McAleese
5 months
LLMs are underrated.
4
9
91
@__nmca__
Nat McAleese
23 days
OpenAI works miracles, but we do also wrap a lot of things in bash while loops to work around periodic crashes.
56
49
1K
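(The pattern above is essentially `while true; do ./job.sh; sleep 5; done`. A minimal Python sketch of the same restart-on-crash wrapper; the command and backoff below are illustrative, not anything the tweet names.)

```python
import subprocess
import time

def run_until_interrupted(cmd, backoff_s=5):
    """Re-run cmd whenever it dies -- the same pattern as
    `while true; do cmd; sleep 5; done` in bash.
    The command and backoff are illustrative assumptions."""
    while True:
        result = subprocess.run(cmd)
        print(f"process exited with code {result.returncode}; restarting")
        time.sleep(backoff_s)  # avoid a tight crash loop

# Example (hypothetical command):
# run_until_interrupted(["python", "train.py"])
```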
@__nmca__
Nat McAleese
5 months
you guys are going to get like 8x more than you expect.
77
70
746
@__nmca__
Nat McAleese
2 years
I can finally talk about Sparrow! My team used RL to train a 70 billion parameter model to be simultaneously safer and more helpful… (1/5)
Tweet media one
5
41
400
@__nmca__
Nat McAleese
1 year
Made the move from DeepMind + London to OpenAI + SF. Going to miss the Sparrow team a lot, but very excited to see the sparks of AGI in person!
10
11
353
@__nmca__
Nat McAleese
3 months
As AI improves humans will need more and more help to monitor and control it. So my team at OpenAI have trained an AI that helps humans to evaluate AI! (1/5)
16
36
352
@__nmca__
Nat McAleese
18 days
if you haven’t noticed yet, o1 is wild but o1-mini is ludicrous
9
14
321
@__nmca__
Nat McAleese
2 months
Final superalignment paper! We define a multi-agent game, get LLMs to play it, and show that it makes their reasoning more “legible” to humans. 1/n
Tweet media one
3
34
273
@__nmca__
Nat McAleese
18 days
It's a good model.
Tweet media one
8
17
220
@__nmca__
Nat McAleese
18 days
go on, someone say it can’t reason
13
13
209
@__nmca__
Nat McAleese
2 years
Learn your classification task with 2x less data & better final accuracy via active learning in our new paper: . How does it work? (1/n)
Tweet media one
7
32
206
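(For readers unfamiliar with active learning: the idea is to spend the labelling budget on the points the model is least sure about. A minimal uncertainty-sampling sketch; this is the generic technique, not necessarily the paper's exact method, and the data and hyperparameters are invented for illustration.)

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = (X[:, 0] + 0.5 * rng.normal(size=1000) > 0).astype(int)

labelled = list(range(10))                       # small seed set
pool = [i for i in range(1000) if i not in labelled]

model = LogisticRegression()
for _ in range(20):                              # 20 acquisition rounds
    model.fit(X[labelled], y[labelled])
    probs = model.predict_proba(X[pool])[:, 1]
    # label the pool point the model is least certain about
    idx = int(np.argmin(np.abs(probs - 0.5)))
    labelled.append(pool.pop(idx))

print("final accuracy:", model.score(X, y))
```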
@__nmca__
Nat McAleese
1 year
Excited that the cat is finally out of the bag! A few key things I’d like to highlight about OpenAI’s new super-alignment team... (1/n)
@OpenAI
OpenAI
1 year
We need new technical breakthroughs to steer and control AI systems much smarter than us. Our new Superalignment team aims to solve this problem within 4 years, and we’re dedicating 20% of the compute we've secured to date towards this problem. Join us!
516
737
4K
12
10
205
@__nmca__
Nat McAleese
3 years
Working on a 280 billion parameter language model has greatly reduced how long I think it will take to build AGI. Very excited we finally released the details of Gopher - awesome work from the team! @drjwrae @geoffreyirving
Tweet media one
3
29
149
@__nmca__
Nat McAleese
3 years
We trained a huge language model to back up its claims with evidence! Super excited to be sharing GopherCite after a great collaboration with @jacobmenick @geoffreyirving @majatrebacz + many other awesome folk @DeepMind.
Tweet media one
4
21
137
@__nmca__
Nat McAleese
5 months
academics: deep learning is hitting a wall
the wall:
Tweet media one
8
5
127
@__nmca__
Nat McAleese
6 months
Tweet media one
@aaron_defazio
Aaron Defazio
6 months
Cooking up something special! Can't wait to get a paper out so everyone can try it out. An optimizer with no extra overhead, no additional parameters. Stay tuned!
Tweet media one
14
30
398
2
3
121
@__nmca__
Nat McAleese
2 years
Nonsense QA with no prompting is not an interesting failure of large language models. Any vaguely sensible prompt (like the one from the Gopher paper) greatly reduces it, indicating it will not be a hard problem to completely solve with RL etc.
Tweet media one
4
15
113
@__nmca__
Nat McAleese
17 days
this, but unironically
@satyanutella_
Satya Nutella
18 days
Tweet media one
27
19
897
3
3
96
@__nmca__
Nat McAleese
1 month
Even on high school exams, no language model has ever scored more than 100%. Coincidence? I think not.
8
3
91
@__nmca__
Nat McAleese
10 months
Want to work on applied research for near-term safety of AI? It's pretty obvious you should do it for @lilianweng on OpenAI Safety Systems. They ship.
3
5
90
@__nmca__
Nat McAleese
13 days
seeing o1 do this well on completely fresh international-competition-level coding problems was an amazing moment. If you don’t agree that this is novel reasoning, then your definition of novel reasoning is broken 😅
@alexwei_
Alexander Wei
13 days
Evaluating o1 on the International Olympiad in Informatics was very personally meaningful to me. When I competed nine years ago, I never thought I'd be back—so soon—competing with an AI. To highlight how amazing this model is, we shared its best IOI submissions on Codeforces ⬇️
10
25
342
1
3
90
@__nmca__
Nat McAleese
9 months
@robertskmiles suppose the scores are [100, 100, 1, 2, 3]: indices 0 and 1 are each “by far one of the best”, i.e. in the closely tied top group that is far better than the rest of the distribution
4
0
81
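(The reply is about the semantics of “one of the best” when scores are closely tied at the top. A tiny sketch of one way to formalize it; the 5% tie threshold is an arbitrary illustrative choice.)

```python
def closely_tied_top(scores, rel_gap=0.05):
    """Indices whose score is within rel_gap of the maximum --
    the 'closely tied group that is far better than the rest'.
    The 5% threshold is an illustrative assumption."""
    top = max(scores)
    return [i for i, s in enumerate(scores) if s >= top * (1 - rel_gap)]

print(closely_tied_top([100, 100, 1, 2, 3]))  # -> [0, 1]
```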
@__nmca__
Nat McAleese
10 months
The Community Notes algorithm is quite simple, yet it seems to work fairly well. This is a reason to be optimistic about coordination technology.
3
8
76
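(The algorithm referred to is, at its core, a regularized matrix factorization over ratings: a note is surfaced as helpful when its intercept, the part of its rating not explained by rater-note viewpoint alignment, is high. A toy sketch under that reading; the data, learning rate, and regularization are invented, and the production system adds many refinements.)

```python
import numpy as np

# Toy ratings: rows = raters, cols = notes; 1 = helpful, 0 = not, nan = unrated.
R = np.array([[1, 0, np.nan],
              [1, np.nan, 0],
              [np.nan, 1, 0],
              [0, 1, 1]], dtype=float)
n_users, n_notes = R.shape
k = 1                                   # 1-D viewpoint factor
rng = np.random.default_rng(0)
mu = 0.0
b_u, b_n = np.zeros(n_users), np.zeros(n_notes)
f_u = rng.normal(0, 0.1, (n_users, k))
f_n = rng.normal(0, 0.1, (n_notes, k))

lr, lam = 0.05, 0.03
obs = [(u, n) for u in range(n_users) for n in range(n_notes)
       if not np.isnan(R[u, n])]
for _ in range(2000):                   # plain SGD on squared error
    for u, n in obs:
        pred = mu + b_u[u] + b_n[n] + f_u[u] @ f_n[n]
        e = R[u, n] - pred
        mu += lr * e
        b_u[u] += lr * (e - lam * b_u[u])
        b_n[n] += lr * (e - lam * b_n[n])
        f_u[u] += lr * (e * f_n[n] - lam * f_u[u])
        f_n[n] += lr * (e * f_u[u] - lam * f_n[n])

# Notes with a high intercept b_n are helpful across viewpoints.
print(b_n)
```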
@__nmca__
Nat McAleese
2 years
The new Bing safety-pilled an NYT journalist. This seems... maybe good on balance? I'm unsure.
Tweet media one
2
1
72
@__nmca__
Nat McAleese
1 year
1) Yes, this is the notkilleveryoneism team. (@AISafetyMemes ...)
2
3
74
@__nmca__
Nat McAleese
5 months
💔
@janleike
Jan Leike
5 months
I resigned
1K
902
10K
1
4
67
@__nmca__
Nat McAleese
10 days
o1 at work:
Tweet media one
Tweet media two
5
3
71
@__nmca__
Nat McAleese
4 months
the mandate of haveJan
@janleike
Jan Leike
4 months
I'm excited to join @AnthropicAI to continue the superalignment mission! My new team will work on scalable oversight, weak-to-strong generalization, and automated alignment research. If you're interested in joining, my dms are open.
371
525
9K
1
0
62
@__nmca__
Nat McAleese
9 days
people keep saying that the rollout and adoption of AGI is gonna take a lot of thought, software engineering and intelligence. oh boy do I have the technology for you!
3
3
68
@__nmca__
Nat McAleese
2 months
I wish the times were a bit less interesting.
1
3
60
@__nmca__
Nat McAleese
5 months
welcome to the world, gpt-4o
2
0
57
@__nmca__
Nat McAleese
2 years
Minerva and DeepNash are both surprising progress, even against my short timelines. Much more so than Dalle2 was (having seen GLIDE), but around that GLIDE level of omg. Imagining 2030 is getting really hard.
2
6
63
@__nmca__
Nat McAleese
17 days
@DaveShapi run evals! AIME 2024 should be easy to replicate.
4
2
60
@__nmca__
Nat McAleese
1 year
4) Yes, superintelligence is a real danger.
2
1
52
@__nmca__
Nat McAleese
6 months
1
0
53
@__nmca__
Nat McAleese
1 year
2) Yes, 20% of all of OpenAI’s compute is a metric shit-ton of GPUs per person.
2
1
46
@__nmca__
Nat McAleese
1 year
1) Ben Hilton’s great summary, “Preventing AI-related catastrophe”
1
2
43
@__nmca__
Nat McAleese
2 years
@percyliang @NPCollapse the area of "scalable oversight" focuses on precisely this; see the work of myself & Geoffrey Irving & Sam Bowman & Ethan Perez & Jeffrey Wu & Jan Leike & many others. Sam's latest paper is excellent:
0
0
42
@__nmca__
Nat McAleese
1 year
3) Yes, we’re hiring for both scientists and engineers right now (although at OAI, they're much the same!)
2
0
40
@__nmca__
Nat McAleese
3 months
We looked specifically at model-written code for almost all our evaluations, and we already see huge potential for GPT-4-class models to assist humans in RLHF labelling (2/5)
1
2
39
@__nmca__
Nat McAleese
2 years
BuT iT's jUsT SuRfAcE LeVEl sTaTiStIcS!
Tweet media one
2
1
39
@__nmca__
Nat McAleese
2 months
the one thing we all agree on is that Noam Shazeer could definitely invert a binary tree on a whiteboard.
0
0
38
@__nmca__
Nat McAleese
2 years
by majority vote, the fathers of deep learning now declare agi safety to be a real issue:
@soundboy
Ian Hogarth
2 years
1/ Notable how three pioneers of deep learning (recognised in their shared 2018 Turing award) have substantially diverged on how they assess risk from superintelligence:
Tweet media one
36
277
2K
1
1
36
@__nmca__
Nat McAleese
1 year
3) Dan Hendrycks’s “An Overview of Catastrophic AI Risks”
2
0
37
@__nmca__
Nat McAleese
1 year
Now is the time for progress on superintelligence alignment; this is why @ilyasut and @janleike are joining forces to lead the new super-effort. Join us!
4
0
36
@__nmca__
Nat McAleese
2 months
we forgot how to count that low
@moyix
Brendan Dolan-Gavitt
2 months
Sorry OpenAI is doing WHAT now?! Fine-tuning gpt-4o-mini is *free* for up to 2M tok/day??
Tweet media one
65
170
3K
2
1
37
@__nmca__
Nat McAleese
1 year
Alternatively, if you have no idea why folks are talking so seriously about risks from rogue AI (but you have a science or engineering background) here’s a super-alignment reading list…
2
2
36
@__nmca__
Nat McAleese
1 year
Believing that cars could change Earth's climate requires imagining hundreds of millions in circulation, an unrealistic scenario benefiting the auto industry's narrative. They're just pushing car hype!
2
3
36
@__nmca__
Nat McAleese
2 years
Overall it was an exciting and huge collaboration, the results of which you can now read: . Huge thanks to everyone involved, but particularly to the rest of the joint-first authors who made the thing work! @mia_glaese @majatrebacz @john_aslanides (5/5)
2
0
34
@__nmca__
Nat McAleese
3 months
I’m tremendously excited for the future of human-machine teams in evaluation and training. If you want to work on this technology one of the best ways to do it is for @mia_glaese who runs human data here at OpenAI. They’re the best in the business! (4/5)
2
4
31
@__nmca__
Nat McAleese
9 months
2024 prediction: more AI progress than in 2023
0
3
27
@__nmca__
Nat McAleese
18 days
this, but unironically
@karpathy
Andrej Karpathy
18 days
o1-mini keeps refusing to try to solve the Riemann Hypothesis on my behalf. Model laziness continues to be a major issue sad ;p
321
502
10K
1
1
28
@__nmca__
Nat McAleese
5 months
academics: language models just memorize!
language models:
Tweet media one
1
1
27
@__nmca__
Nat McAleese
4 months
the anthropic principle makes this an especially solid bet
@mpopv
Matt Popovich
5 months
twenty years from now, i would bet ai x-risk will look a lot like y2k does now - nothing cataclysmic happened, so the common view is that it was a fake concern all along - the counterfactual risk was actually real though - it was only averted through a lot of human effort
38
32
444
1
0
25
@__nmca__
Nat McAleese
3 months
Apply to their team here: (5/5)
1
3
23
@__nmca__
Nat McAleese
1 year
Demis, Dario, Sam, Hinton, Bengio, Russell all signed the following statement:
Tweet media one
3
2
21
@__nmca__
Nat McAleese
16 days
@DaveShapi you really need to run some evals dude
1
0
23
@__nmca__
Nat McAleese
1 year
manifold on whether superalignment will succeed:
1
1
21
@__nmca__
Nat McAleese
2 years
We started by RL fine-tuning models to be more helpful, but that made the resulting policies much more exploitable when you try and trick them into bad behaviour. We had to jointly train for usefulness and safety to get better at both! (2/5)
Tweet media one
1
2
22
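(One way to picture “jointly train for usefulness and safety”: the RL reward must combine a helpfulness preference model with a rule-violation signal, so a policy cannot score well by being helpful but unsafe. A deliberately simplified sketch; the linear form and the names below are assumptions, not Sparrow's actual reward.)

```python
def combined_reward(helpful_score: float, rule_violation_prob: float,
                    penalty: float = 1.0) -> float:
    """Illustrative only: helpfulness preference-model score minus a
    penalty scaled by a rule-violation classifier's probability.
    Sparrow's actual reward combination is more involved."""
    return helpful_score - penalty * rule_violation_prob

print(combined_reward(0.8, 0.10))  # helpful and safe -> high reward
print(combined_reward(0.9, 0.95))  # helpful but rule-breaking -> low reward
```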
@__nmca__
Nat McAleese
2 years
well I did not expect my day to include furiously checking ...
2
0
22
@__nmca__
Nat McAleese
2 months
The mandate of have(John)
0
0
21
@__nmca__
Nat McAleese
2 years
The description length of all models in the same transformer family is the same at initialisation, regardless of param count; it is this description length (& not after training) that is relevant to optimal compression. (1/n)
@michael_nielsen
Michael Nielsen
2 years
The "scale is all you need" hype around ever-larger language models is a striking inversion of our (usual) preference for making models simpler. Occam's Big Ball of Mud
20
11
137
2
1
21
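(One way to make the description-length claim precise is the prequential, or online, coding view, in the spirit of Blier & Ollivier (2018): the bits needed to specify an untrained transformer are essentially the architecture, seed, and training recipe, which is tiny and nearly independent of parameter count; the rest of the codelength is the model's online log-loss on the data. A sketch under that reading, not necessarily the formalization the tweet intends:)

```latex
L(x_{1:T}) \;=\;
\underbrace{L(\mathrm{arch},\ \mathrm{seed},\ \mathrm{optimizer})}_{\text{small; nearly independent of parameter count}}
\;+\; \sum_{t=1}^{T} -\log p_{\theta_{t-1}}\!\left(x_t \mid x_{<t}\right)
```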
@__nmca__
Nat McAleese
3 years
So happy to finally talk about Red Teaming! TL;DR: even seemingly well-behaved dialogue models fall down completely if you search hard enough for adversarial questions...
@GoogleDeepMind
Google DeepMind
3 years
Language models (LMs) can generate harmful text. New research shows that generating test cases ("red teaming") using another LM can help find and fix undesirable behaviour before impacting users. Read more: 1/
Tweet media one
16
89
553
1
2
21
@__nmca__
Nat McAleese
1 year
And, for examples of solutions instead of problems:
1
0
20
@__nmca__
Nat McAleese
2 years
And the harm mitigations we used (rule classifiers and preference models) *don't* solve distributional bias problems - they just remove bad behaviours that you can see in a single sample (lots of detail in paper, 4/5)
Tweet media one
2
1
19
@__nmca__
Nat McAleese
10 months
Collin is better at ML research than at speedcubing.
@sama
Sam Altman
10 months
i'd particularly like to recognize @CollinBurns4 for today's generalization result, who came to openai excited to pursue this vision and helped get the rest of the team excited about it!
178
157
3K
2
0
18
@__nmca__
Nat McAleese
2 years
if it actually is clippy that gets us all, the irony will confirm the simulation hypothesis
1
1
18
@__nmca__
Nat McAleese
2 months
As LLMs become capable of superhuman reasoning we need methods that let us understand why and how they reached their conclusions. Unfortunately the chain of thought that gets the best performance might not be the easiest for humans to understand — the “legibility gap”. 2/n
Tweet media one
1
0
17
@__nmca__
Nat McAleese
2 years
Including the ability to search the internet really helps with factuality (although it's not a panacea)... (3/5)
Tweet media one
1
0
17
@__nmca__
Nat McAleese
2 months
I’m thrilled the team were able to formally define a notion of legibility that behaves sensibly when approximately optimized and that the results generalize to humans. Amazing work by the two lead authors @cynnjjs and @janhkirchner and the rest of the team! 3/n
1
0
16
@__nmca__
Nat McAleese
4 months
> "Our generation too easily takes for granted that we live in peace and freedom. And those who herald the age of AGI in SF too often ignore the elephant in the room: superintelligence is a matter of national security, and the United States must win." (1/2)
1
4
16
@__nmca__
Nat McAleese
1 year
I'm so happy we got this timeline:
Tweet media one
0
1
16
@__nmca__
Nat McAleese
18 days
medium confidence: if you wouldn't qualify for USAMO, then o1 can probably accelerate your research, whatever the domain
1
0
15
@__nmca__
Nat McAleese
4 months
this is a much better joke than xkcd extrapolation
@prerationalist
prerat
4 months
speaking of drawing lines on a chart....... by 2050 the olympic 100m dash will be won by a human running on all fours
Tweet media one
Tweet media two
13
39
869
3
1
16
@__nmca__
Nat McAleese
10 months
agi is gonna be wild
@AAAzzam
Adam Azzam
10 months
fun story: terry tao was on both my and my brother's committee. he solved both our dissertation problems before we were done talking, each of us got "wouldn't it have been easier to...outline of entire proof" 🫠
70
339
6K
0
0
14
@__nmca__
Nat McAleese
8 months
the best python plotting library is very clearly @HKibirige's plotnine. little-known gem.
2
2
15
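(plotnine ports ggplot2's grammar of graphics to Python. A minimal example of the API; the data is made up.)

```python
import pandas as pd
from plotnine import ggplot, aes, geom_point, labs

df = pd.DataFrame({"x": [1, 2, 3, 4], "y": [1, 4, 9, 16]})

# Build a plot by composing layers, ggplot2-style.
plot = (
    ggplot(df, aes(x="x", y="y"))
    + geom_point()
    + labs(title="plotnine in four lines")
)
plot.save("example.png")  # or just display `plot` in a notebook
```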
@__nmca__
Nat McAleese
15 days
moderate confidence: if you wouldn't qualify for USAMO, then o1 can probably accelerate your research, whatever the domain
@emollick
Ethan Mollick
15 days
The first author of this astrophysics paper found that if he gave o1-preview the methods section, it was able to reproduce 10 months of coding work he did as a PhD in 5 prompts (a few caveats in the video) Side note: all of your methods sections are becoming instruction manuals.
Tweet media one
Tweet media two
55
385
3K
0
0
15
@__nmca__
Nat McAleese
8 months
@sama "deep learning hitting a wall"
1
0
13
@__nmca__
Nat McAleese
4 months
Moravec’s Opportunity
1
1
14
@__nmca__
Nat McAleese
1 year
🤷
Tweet media one
Tweet media two
0
0
14
@__nmca__
Nat McAleese
8 months
Yann LeCun proposes a disdainful "DeTuring" award for those concerned with AI risk, forgetting that Turing himself invented AI risk.
@NPCollapse
Connor Leahy
8 months
I nominate Alan Turing for the first DeTuring Award.
Tweet media one
61
156
1K
2
0
13
@__nmca__
Nat McAleese
7 months
the canonical eval setting should be one-(shot-without-CoT)-CoT pass@1; everything else has been a mistake.
2
1
12
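(For context on pass@1: pass@k is the probability that at least one of k sampled solutions is correct. The standard unbiased estimator, from Chen et al. (2021), the Codex paper, computes it from n samples of which c pass.)

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (Chen et al., 2021): with n samples
    of which c pass, the chance that at least one of k randomly drawn
    samples passes is 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

print(pass_at_k(n=100, c=1, k=1))   # -> 0.01
print(pass_at_k(n=100, c=1, k=10))  # -> 0.10
```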
@__nmca__
Nat McAleese
2 years
I feel like the image understanding capabilities of GPT-4 are currently underrated. API access or the evals paper are going to blow minds. (based only on assuming the paper examples are not cherry-picked; which was true of GPT-3)
0
0
13
@__nmca__
Nat McAleese
10 months
did anyone see this future coming? have read a fair bit of sci-fi but this was absent!
0
0
13
@__nmca__
Nat McAleese
1 year
@slatestarcodex has a cute eval for image generation. How does dall-e-3 do? (1/n)
Tweet media one
Tweet media two
Tweet media three
Tweet media four
1
1
5
@__nmca__
Nat McAleese
8 months
positive sum status games are the ultimate social technology
@RichardMCNgo
Richard Ngo
8 months
Catchy quote, but Bernard Arnault just became the richest man on earth; are you sure you want to short fashion? Like it or not, when people gain material abundance, they mostly spend it on status. The real question is whether we can design status games that are positive-sum.
28
15
273
0
0
11
@__nmca__
Nat McAleese
2 years
we laughed at this originally but LLM @ int4 + inference hardware and batteries would fit, either now or in one generation's time. so we have evolved to the point of the internet in a box / hitchhiker's guide. exciting times.
Tweet media one
1
0
12
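(The back-of-envelope arithmetic behind "LLM @ int4 ... would fit": at 4 bits per weight, a 70B-parameter model, an illustrative size not named in the tweet, needs about 35 GB for its weights.)

```python
# Weight memory for a quantized LLM (illustrative sizes).
params = 70e9            # e.g. a 70B-parameter model (assumption)
bits_per_param = 4       # int4 quantization
weight_bytes = params * bits_per_param / 8
print(f"{weight_bytes / 1e9:.0f} GB of weights")  # -> 35 GB
```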
@__nmca__
Nat McAleese
2 years
A fun collaboration with @IanOsband, @john_aslanides, @geoffreyirving and several more great but non-tweeting authors. Looking forward to more uncertainty estimates in future!
0
0
12
@__nmca__
Nat McAleese
11 months
#devday all tools all the time:
Tweet media one
0
2
12
@__nmca__
Nat McAleese
2 years
How do we learn what will be informative? It helps to separate aleatoric & epistemic uncertainty. Ian argues you can do this with the joint distribution of your labels - and has a key paper on it, introducing EpiNets (3/n)
Tweet media one
2
0
12
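(The point about joint distributions: epistemic uncertainty shows up as disagreement between samples of the predictive function, while aleatoric uncertainty is noise every sample agrees on. A sketch of the standard entropy decomposition, using a two-member ensemble as a stand-in for an EpiNet's samples; the numbers are invented.)

```python
import numpy as np

def entropy(p, axis=-1):
    return -(p * np.log(np.clip(p, 1e-12, 1))).sum(axis=axis)

# member_probs[m, x, c]: class probabilities from sample m on input x.
# An ensemble stands in for an EpiNet here; the decomposition is the point.
member_probs = np.array([
    [[0.9, 0.1], [0.5, 0.5]],   # sample 0
    [[0.1, 0.9], [0.5, 0.5]],   # sample 1
])

mean_probs = member_probs.mean(axis=0)
total = entropy(mean_probs)                      # predictive entropy
aleatoric = entropy(member_probs).mean(axis=0)   # average per-sample entropy
epistemic = total - aleatoric                    # disagreement (mutual information)

# input 0: samples disagree -> epistemic; input 1: samples agree -> aleatoric
print(epistemic)
```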