I have very exciting news to share.
@henshall_will
from TIME magazine wrote an amazing profile featuring yours truly!! 🤩 The link is in the next tweet.
I very very much encourage people to not publicly associate political positions with postures about AI.
We are possibly in the critical juncture where we decide whether this is going to be a problem we all face together or divided.
Do not let AI become party coded.
.@TrentonBricken
explains how we know LLMs are actually generalizing - aka they're not just stochastic parrots:
- Training models on code makes them better at reasoning in language.
- Models fine tuned on math problems become better at entity detection.
- We can just
We are excited to announce our new research organization: Epoch!
We are working on investigating AI developments and forecasting the development of Transformative AI. You can learn more in our announcement:
Summary below 🧵⬇️
My sense from talking to researchers doing AI safety-related work is that in the last two years there has been an update towards:
1. Shorter timelines
2. Slow takeoff
3. Less worrying about extinction and more about other catastrophic outcomes
Big personal announcement: I am taking a break from my PhD to work as a contractor for
@open_phil
, to research trends in Artificial Intelligence.
You can read more about what I have been up to lately in a post I've written:
At Epoch we have been publicly releasing compute estimates of major models such as GPT-4 and Claude 2.
Do you think we should keep doing this, even in cases where companies keep the compute deliberately secret?
Ok, I have changed my mind on moving compute thresholds. The EU AI Office does not have and does not plan for the capacity to update compute thresholds every six months. A dynamically moving threshold is a no-go.
I wrote an opinionated list of open research questions in AI forecasting, with some input from
@tamaybes
.
This will be useful if you are considering applying for a job at
@EpochAIResearch
, or want to build a portfolio to break into the field.
What a year. Epoch has gone from a small research group to a major research institute that governments are relying on. And it still is the best workplace in the world, thanks to my awesome colleagues!
2023 was a great year for Epoch!
We just published our annual impact report, listing our achievements in the past year and our plans for the coming year.
Here’s a summary 🧵:
AlphaGo Master and AlphaGo Zero were such massive outliers in scale. They single-handedly warp trends.
Analyses at Epoch need to be very deliberate on whether to include them!
A training run that cost $100,000 in early 2019 now costs about $700, a ~140x improvement.
@EpochAIResearch
's paper on algorithmic efficiency estimated a 3x/year improvement in efficiency, which would imply an expected 240x improvement over 5 years.
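The two multipliers quoted here can be checked against each other. A quick sketch using only the figures from the tweets:

```python
# Rough check of the cost-decline figures, assuming a 3x/year
# algorithmic efficiency improvement sustained over 5 years.
yearly_gain = 3.0
years = 5
implied_multiplier = yearly_gain ** years  # 3^5 = 243, i.e. ~240x

# The quoted per-run prices give a somewhat smaller multiplier.
cost_2019 = 100_000  # USD, from the tweet
cost_now = 700       # USD, from the tweet
observed_multiplier = cost_2019 / cost_now  # ~143x

print(f"implied: ~{implied_multiplier:.0f}x, observed: ~{observed_multiplier:.0f}x")
```

The observed ~140x sits below the central 3x/year estimate's ~240x, but comfortably within the same order of magnitude.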
@maurosicard
This information was never released but I'd expect it was a lot more. In terms of multipliers, let's say 3X from data and 2X from hardware utilization; in 2019 this was probably a V100 cluster (~100 fp16 TFLOPS) versus an H100 today (~1,000 TFLOPS), so that's ~10X. Very roughly, let's say ~100X.
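Combining the rough multipliers stated above (all of which the tweet flags as guesses, not released figures):

```python
# Product of the guessed multipliers: data, hardware utilization,
# and the V100 -> H100 fp16 throughput gap.
data_gain = 3             # ~3x from more data (assumption)
util_gain = 2             # ~2x from better hardware utilization (assumption)
hw_gain = 1000 / 100      # V100 ~100 fp16 TFLOPS vs H100 ~1,000 -> ~10x

combined = data_gain * util_gain * hw_gain  # = 60
print(f"~{combined:.0f}x, same order of magnitude as the quoted ~100X")
```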
**ML training compute has been doubling every 6 months since 2010!**
Our preprint "Compute Trends Across Three Eras of Machine Learning" is out.
🧵 Thread below ↓
1/
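A 6-month doubling time can be translated into more familiar growth rates; a minimal sketch:

```python
# A 6-month doubling time implies 4x/year growth,
# i.e. roughly six orders of magnitude per decade.
doubling_time_years = 0.5
per_year = 2 ** (1 / doubling_time_years)   # 4.0
per_decade = per_year ** 10                 # ~1.05e6
print(per_year, f"{per_decade:.2e}")
```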
Underappreciated fact: OpenAI is investing more compute in training than in inference.
GPT-4 has ~240B active parameters and was trained on a 25,000 A100 cluster. At 20% utilisation, this cluster serves 260B tokens/day. In Feb, OpenAI was serving 100B tokens/day.
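The serving number can be sanity-checked with a back-of-the-envelope calculation. The A100 peak throughput and the 2-FLOP-per-parameter-per-token inference cost below are my assumptions, not figures from the tweet:

```python
# Back-of-the-envelope check of the cluster's serving capacity.
n_gpus = 25_000
peak_flops = 312e12                 # A100 fp16 Tensor Core peak (assumption)
utilization = 0.20
active_params = 240e9
flop_per_token = 2 * active_params  # forward pass only (assumption)

cluster_flops = n_gpus * peak_flops * utilization
tokens_per_day = cluster_flops * 86_400 / flop_per_token
print(f"~{tokens_per_day / 1e9:.0f}B tokens/day")  # roughly the quoted 260B
```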
Anthropic cofounder
@samsamoa
states they will discontinue non-disparagement agreements and promises not to enforce existing agreements.
Is there confirmation from
@samsamoa
that this is indeed their account and Anthropic's official position?
After writing this article, we were invited to contribute to the national emergency plan of Argentina, which will make it the first country in the world with a national emergency plan for nuclear winter.
Also, check out a summary here!
New report from
@RiesgosGlobales
and
@ALLFEDALLIANCE
on "Food Security in Argentina in the event of an Abrupt Sunlight Reduction Scenario: A Strategic Proposal". Lots of useful ideas; counterpart reports for other countries would be great.
About two-thirds of performance improvements in language models can be attributed to scaling. The remaining one-third corresponds to innovations in model architecture and training.
This has profound implications.
Language models have come a long way since 2012, when recurrent networks struggled to form coherent sentences. Our new paper finds that the compute needed to achieve a set performance level has been halving every 5 to 14 months on average. (1/10)
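The 5-14 month halving times can be converted into effective annual efficiency multipliers; a quick sketch:

```python
# Effective annual efficiency gain implied by a given
# compute-halving time: 2 ** (12 / halving_months).
for halving_months in (5, 8, 14):
    annual_gain = 2 ** (12 / halving_months)
    print(f"halving every {halving_months} mo -> ~{annual_gain:.1f}x/year")
```

So the paper's range corresponds to roughly 1.8x-5.3x/year gains in effective compute from algorithms alone.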
A recent NYT article showcased
@EpochAIResearch
's data to push a China vs US narrative. Let’s set the record straight - the graph they made (reproduced below) is misleading. I explain why below 🧵
Short report by
@EpochAIResearch
!
We argue that we won’t see ML training runs over 1.22 years - longer runs will be outcompeted by runs that start later and use better hardware and algorithms.
Presenting a new Epoch double feature! Today we release an interactive model of AI timelines and an opinion piece by researcher
@MatthewJBar
explaining our approach to modeling the future of AI. 🧵
A recent paper assesses whether AI could cause explosive growth and suggests it won't.
It's good to have other economists seriously engage with the arguments that suggest that AI that substitutes for humans could accelerate growth, right?
Do you want to be paid for reading ML papers? At Epoch we are looking for contractors who can help annotate information from notable ML papers to inform our research and visualizations.
@EgeErdil2
I think people usually refer to an economic arrangement in which basic goods like food and clothing have a negligible cost compared to everyone's wealth, such that lowering their price would not increase demand.
What are the limits to the energy efficiency of CMOS microprocessors? In our new paper, published in the International Conference on Rebooting Computing, we propose a simple model to shed light on this question:
My experience with LW people is that they consistently underestimate how seriously other people will take the issue and overestimate how sudden AI developments will be
@StefanFSchubert
I believe that a share of why technical people are very pessimistic is the experience of banging their head against the problem with potential solutions and not succeeding.
I also believe that the underlying threat models for why an intelligent thing may be dangerous are more
How many large AI models are out there, who developed them, and for what applications? To answer this question, we present a new dataset tracking every AI model we could find trained with over 10^23 FLOP.
Highlights in thread 🧵
- It's expensive. Backprop costs twice as much compute as inference, so you would be tripling costs.
- You want to choose your target model size in advance depending on the training target, due to scaling laws.
- It's an attack vector. Remember Tai?
Why have continually learning agents not become a big thing yet?
It seems like it wouldn't be hard for OpenAI to build one from GPT-4, and it would massively change the pace of capabilities progress.
Over 2023,
@RiesgosGlobales
has become a fully-fledged science-policy organization.
I am incredibly proud of their work. It includes some major successes. I cover some highlights on thread 🧵
One of the most impactful pieces of work that can be done in the next couple of months on AI governance is developing frameworks for assessing risks from AI that governments could readily incorporate into their workflows.
Arrived in Bahamas! The place is absolutely amazing, and the people mind-blowing. I am very grateful to the FTX Foundation for organizing the fellowship!
This paper is the first comprehensive analysis of how the efficiency of language models has been improving over time. Its importance cannot be overstated!
1/ How quickly are state-of-the-art AI models growing?
The amount of compute used in AI training is a critical driver of progress in AI. Our analysis of over 300 machine learning systems reveals that the amount of compute used in training is consistently being scaled up at
I do predict it, because as a matter of fact this is something we have (mounting) evidence on.
2024 is not the year when AI hardware scaling will hit a wall -- both algorithms and compute will continue being important facets of AI development.
I don't, particularly, predict it, because the future is rarely that predictable -- but if 2024 is the year when AI hardware scaling seems to hit a temporary wall, and further progress past GPT-4 seems to be all about algorithms, this won't surprise me.
I can already guess that,
Some key tips if you want to talk about trends in compute:
1. Use logarithmic axes.
2. Do not fit your trends to only outliers.
3. Do not confuse FLOP and FLOPS.
I currently think this open letter is quite bad, and possibly net harmful. The proposed policy appears vague and misguided. I want to explain some of my thoughts. 🧵
@ATabarrok
Note that currently I would not trust Manifold Markets much more than a Twitter poll.
Metaculus has a track record, so I would put more trust there.
This old report gives 0.35% chance of full scale nuclear war.
Very thoughtful piece on the future of AI. I think the basic picture that we are going to be rushing through many OOMs of compute soon and that will unlock drastic capability increases is basically right.
Virtually nobody is pricing in what's coming in AI.
I wrote an essay series on the AGI strategic picture: from the trendlines in deep learning and counting the OOMs, to the international situation and The Project.
SITUATIONAL AWARENESS: The Decade Ahead
@alexandr_wang
25% chance before when?
This sentence is vacuous otherwise.
Anyway, if it's before 2050, Metaculus agrees with you that 25% is in the right ballpark. But for boring baseline reasons rather than any recent events.
Did you know that there is already a system falling within the purview of the recent AI Executive Order?
Learn more about this and biological ML models on
@EpochAIResearch
's new report!
The recently issued Executive Order requests regulatory oversight of AI models trained on primarily biological sequence data whose training compute exceeds 1e23 operations. Our report examines trends in training compute, data availability and points to potential regulatory gaps🧵
@3blue1brown
Disclaimer: I am a Bayesian
Having said that:
1) maliciously choosing a prior can allow you to infer whatever conclusions you want
2) Bayesian approaches are often computationally intractable
1/ This was an exciting article to write! We establish that compute growth is blazingly fast, doubling twice per year. I am particularly proud of how we expanded on previous work. I explain how below 🧵
@SashaMTL
FWIW this seems to me like a case of "you used an outdated model and so you got outdated results". Here is Midjourney v6 on the prompt "Mother Teresa fighting against poverty"
If you have recently received an email inviting you to the "First Latin American Conference on AI Safety" that claims that I am a confirmed participant, please be aware that this is false. I did not confirm attendance, nor do I endorse the organizing team.
This is such a clever short argument, with important implications for the AI progress to come.
I only recently learned of
@EgeErdil2
. And already I have learned a lot from his work.
Our paper "Power laws in Speedrunning and Machine Learning" is out now!
@EgeErdil2
and I develop a model for predicting record improvements in video game speedrunning 🎮 and apply it to predicting Machine Learning benchmarks 🤖. (1/6)
Epoch was born out of a project to systematically collect data about ML systems. I am elated to announce that the database keeps growing and becoming more useful by the moment!
Monkeypox in this week's Sentinel minutes: ~60% chance of a Public Health Emergency of International Concern (PHEIC) in the next 12 months; case fatality rate currently 3-5.5%, but probably extrapolates to 0.2% if it goes global; probably 1-5x as bad as seasonal flu if so.
After six months of working and teasing results on Twitter, our report on scaling constraints is finally out. One of the most ambitious
@EpochAIResearch
pieces to date.
1/ Can AI scaling continue through 2030?
We examine whether constraints on power, chip manufacturing, training data, or data center latencies might hinder AI growth. Our analysis suggests that AI scaling can likely continue its current trend through 2030.
The paper I wrote with
@Jess_Riedel
about forecasting timelines for quantum computing is now available on arXiv!
I also wrote a short explainer on Jess' blog if you want an overview of the results
I am coordinating a research effort to collate the biggest ever public dataset on parameters, compute and dataset size for landmark AI models. And we are looking for collaborators! (details in thread)
I am constantly moving countries and changing phone numbers. It is very tiresome that many of my apps are tied to mobile numbers, which I subsequently get locked out of.
What is a good solution to this?
This is a somewhat misleading picture.
AlphaZero and AlphaGoZero are outliers in terms of compute, and with more data the trend appears substantially slower, doubling every ~6 months.
Compute caps, if imperfectly enforced, can lead to a large compute overhang, plus have a large cost in preventing the development of useful AI.
I'd much rather we focused on improving auditing and threat detection, and addressing vulnerabilities as we scale AI systems.
I have written about a new forecasting aggregation method suggested by
@ericneyman
in a recent paper.
It is still too early to say with confidence, but I am moderately excited about their method. It performs well on
@metaculus
binary questions too!
I have inaugurated a new AI art exhibition — Spellbound.
Today I will reveal the first six exhibits.
Every day through November, I will show additional pieces from the collection.
See the gallery with the paintings released so far at
They seem somewhat uncalibrated on how much AI can grow in the coming years. Energy use for training has been going up 3.2x/year for the last few years. That's 1000x in six more years.
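The arithmetic behind the "1000x in six more years" figure:

```python
# Cumulative growth implied by a 3.2x/year increase
# in training energy use, compounded over six years.
annual_growth = 3.2
years = 6
total = annual_growth ** years
print(f"~{total:.0f}x")  # 3.2^6 ≈ 1074, i.e. ~1000x
```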
NEW SHIFT KEY:
We talked to Jonathan Koomey, one of the top researchers on the internet’s energy and environmental impact, about whether the AI boom will break the US electricity system.
His verdict: “Everyone needs to calm the heck down.”
.@EpochAIResearch
has released an interactive website as a supplement to the recent report from Tom Davidson about AI Takeoff Speeds.
We hope you will find it useful!
1/7 Is Claude 3.5 Sonnet actually better than GPT-4o on GPQA?
Benchmark results can be noisy due to randomness in model outputs, so we put Claude 3.5 Sonnet to a more rigorous test.
Here's what we found. 🧵
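Benchmark noise of this kind can be eyeballed with a simple binomial confidence interval. The question count and score below are illustrative assumptions, not Epoch's actual numbers:

```python
import math

# Treat each benchmark question as a Bernoulli trial and compute
# a normal-approximation 95% interval on the observed accuracy.
n_questions = 198   # GPQA Diamond size (assumption)
accuracy = 0.59     # hypothetical observed score

se = math.sqrt(accuracy * (1 - accuracy) / n_questions)
low, high = accuracy - 1.96 * se, accuracy + 1.96 * se
print(f"95% CI: [{low:.2f}, {high:.2f}]")
```

With only a couple hundred questions, the interval spans several percentage points, so small gaps between models can easily be noise.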
The majority of my followers think that inference compute will exceed training compute.
Interestingly, my colleague
@EgeErdil2
has a compelling argument that they will be roughly similar. Follow
@EpochAIResearch
to learn about it as soon as it comes out!
The Chinchilla scaling paper by Hoffmann et al. has been highly influential in the language modeling community. We tried to replicate a key part of their work and discovered discrepancies. Here's what we found. (1/9)
@DavidSKrueger
We shouldn't take people's stances from 2016 as overwhelming evidence of what they think now. The field of AI has changed enough that it would be strange if experts hadn't changed their minds on several key issues since.
Excited to release this new
@GovAI
report outlining the risks and benefits of open-sourcing highly capable AI systems and alternative methods for pursuing some open-source goals. (1/10)
Summary thread below 🧵
First, Riesgos Globales has advised the Spanish presidency of the EU Council on the regulation of foundation models. It's hard to know the counterfactual impact, but all our major recommendations were adopted in the EU AI Act.
I've read
@random_walker
's article three times by now and I just found it thought-provoking and a good summary of the current epistemic status of AI risk -- uncertain.
We received a paper review that points out we are missing an important reference to a suspiciously similar previous work - which happens to be the preprint version of our own paper.
How are we supposed to address that without breaking blind review?
#AcademicTwitter
I have been asked whether this overturns our previous result that training runs should not take longer than 14-15 months. The TL;DR is that I still think > 15-month training runs are unlikely.
New data insight: The training time for notable AI models is growing steadily.
Since 2010, we've seen a 1.2x increase per year in training duration for notable models (excluding those fine-tuned from base models). This trend has significant implications for AI development. 1/4
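Assuming the 1.2x/year rate held steadily over the roughly 14 years since 2010 (my assumption about the window), the cumulative growth works out as:

```python
# Cumulative increase in training duration implied by a
# steady 1.2x/year trend sustained for ~14 years.
annual = 1.2
years = 14
total = annual ** years
print(f"~{total:.1f}x longer training runs")  # 1.2^14 ≈ 12.8x
```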
@eigenrobot
@sama
can you confirm if the quote is correct or misleading?
Was it >$100M including salaries of devs, or just the cost of the compute?
And is it the cost of operating the cluster, or the cost of buying the hardware?
Does this factor in that the cluster can be reused?
Over 2022 and 2023, OpenPhil has pulled $350m in planned funding from GiveWell. This money could save about 70,000 lives today. That's the price of longtermism.