Every time a new LLM comes out, I ask it one question:
What is the smallest integer whose square is between 15 and 30?
So far, no LLM has gotten this right.
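For anyone who wants to check the puzzle by brute force, here is a minimal Python sketch. It assumes "between" means strictly between and that "integer" includes negative numbers; the function name and the search range are illustrative choices, not part of the original question.

```python
# Brute-force check of the puzzle, assuming "between" is strict
# and that negative integers count as integers.
def smallest_integer_with_square_between(low: int, high: int) -> int:
    # Any integer n with n*n < high satisfies |n| < high (for high >= 1),
    # so searching -high..high covers every candidate.
    candidates = [n for n in range(-high, high + 1) if low < n * n < high]
    return min(candidates)

answer = smallest_integer_with_square_between(15, 30)
print(answer)  # -5
```

The qualifying squares are 16 and 25, so the candidates are -5, -4, 4, and 5, and the smallest of those is -5 rather than 4.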
@TutorVals
Yeah, I'm definitely not claiming that my question is evidence of LLMs being worse than humans. Humans also get this question wrong, as you point out!
Just wrote a new blog post, about a sport called marble racing! In particular, I found that Jelle's Marble Runs (@Jellesmarbles) isn't entirely random -- marbles have a level of "skill"!
It's not that I don't support meritocratic hiring (I do), but this name is clearly riffing on DEI, which is the sort of thing you do if you're trying to own the libs (an unvirtuous motivation).
Today we’ve formalized an important hiring policy at Scale. We hire for MEI: merit, excellence, and intelligence.
This is the email I’ve shared with our @scale_AI team.
———————————————————
MERITOCRACY AT SCALE
In the wake of our fundraise, I’ve been getting a lot of questions
On Wednesday, I defended my PhD thesis. For me, this was a major milestone: more significant than, say, college graduation. My thesis represents a culmination of research that I had done in and around grad school -- research that I had poured my heart and soul into.
Dividing two 8-digit numbers (if one is a multiple of the other) is easy! The answer is always between 1 and 9, and you can literally ignore everything besides the first two digits of each number.
Like, what's 40762047/13587349? Obviously 3, since 40 is roughly 3*13.
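The leading-digits trick can be sketched in Python as follows. This is a rough illustration, assuming both inputs are positive 8-digit integers and the first is an exact multiple of the second; the function name is hypothetical.

```python
def quotient_from_leading_digits(a: int, b: int) -> int:
    """Recover a / b using only the first two digits of each number.

    Assumes a and b are 8-digit positive integers and a is an exact multiple of b.
    """
    lead_a = int(str(a)[:2])
    lead_b = int(str(b)[:2])
    # If a = k * b, then k * lead_b <= lead_a < k * (lead_b + 1).
    # For k <= 9 and lead_b >= 10 these ranges never overlap, so at most one k fits.
    matches = [k for k in range(1, 10) if k * lead_b <= lead_a < k * (lead_b + 1)]
    if len(matches) != 1:
        raise ValueError("expected 8-digit inputs with a an exact multiple of b")
    return matches[0]

print(quotient_from_leading_digits(40762047, 13587349))  # 3
```

For the example in the tweet, the leading digits are 40 and 13, and 3 is the only multiplier with 3 * 13 <= 40 < 3 * 14.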
From athletics to intellectualism to just sheer resilience, I love seeing the peaks of human ability. That said, what would be the athletic equivalent of a 6yr old dividing 2 8-digit numbers in their head? My first thought is like a 14-second 100m dash. No idea though
It's tempting to reference xkcd comics to explain things, but you need to remember that most people have only memorized the feldspar comic (and a few others, of course).
My (not very informed) guess is something like: Harris 50%, Newsom 20%, other random people (like Buttigieg and Whitmer) totaling 30% (but no one besides Harris and Newsom above 5%).
I just finished two papers that I'd been working on for a while! I'll preview the other one tomorrow, but today:
"Are You Smarter Than a Random Expert? The Robust Aggregation of Substitutable Signals", with my advisor
@algoclass
(Tim Roughgarden) 🧵/20
Just finished a paper with @algo_class!
Proper scoring rules, like log or Brier, incentivize an expert to report their true belief. But what if there are *multiple* experts? In that case they can collude to guarantee themselves a larger reward: (1/8)
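To make the single-expert half of that claim concrete, here is a minimal Python sketch showing that a lone expert's expected reward under the Brier and log rules peaks at their true belief. It does not reproduce the collusion result from the paper; the function names, the reward form `1 - (outcome - report)^2`, and the 1% reporting grid are assumptions chosen for illustration.

```python
import math

def brier_score(report: float, outcome: int) -> float:
    # Quadratic (Brier-style) reward, written so that higher is better.
    return 1.0 - (outcome - report) ** 2

def log_score(report: float, outcome: int) -> float:
    # Log of the probability assigned to what actually happened.
    return math.log(report if outcome == 1 else 1.0 - report)

def expected_score(rule, report: float, belief: float) -> float:
    # Expected reward when the expert truly believes the event has probability `belief`.
    return belief * rule(report, 1) + (1.0 - belief) * rule(report, 0)

def best_report(rule, belief: float) -> float:
    # Search reports from 1% to 99% for the one with the highest expected reward.
    grid = [i / 100 for i in range(1, 100)]
    return max(grid, key=lambda r: expected_score(rule, r, belief))

belief = 0.7
print(best_report(brier_score, belief), best_report(log_score, belief))  # 0.7 0.7
```

Under both rules, shading the report away from 70% in either direction lowers the expert's expected reward.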
I have a new blog post called "Alike minds think great". It's about the bias of overestimating the competence of people who think like you (and other things).
I think Nate Silver is unusually likely to understand EA and have good critiques. I don't anticipate agreeing with all or most of the critiques, but if I had to task someone with thinking carefully about EA and critiquing it, Nate would be near the top of my list.
I'm going to have quite a lot to say about effective altruism in the forthcoming book, much of it critical, but yeah it's extremely frustrating that the DC/NYC press tends to pick the worst possible vibes-based criticisms.
I'm interning at Redwood right now. I think the work we're doing is *really cool*, I really like the team, and am generally having a great time. I think REMIX is gonna be great. Consider applying!
I'm helping Redwood Research run REMIX, a 1 month mechanistic interpretability sprint where 25+ people will reverse engineer circuits in GPT-2 Small. This seems like a great way to get experience exploring @ch402's transformer circuits work.
Apply by 13th Nov!
Lastly, I'm excited to say that I will be joining the Alignment Research Center full-time! The last chapter of my thesis -- Deductive Circuit Estimation -- is on some of the research I did while at ARC. I'm really looking forward to continuing this research.
Last night at dinner, my housemates and I shared what we appreciated about America. Despite all its flaws, I'm a big America fan, and I love that I'm part of a community that also loves America.
And I *am* proud of my thesis. It's called Algorithmic Bayesian Epistemology, and while I can't share it until next month, I'll say a few words about it.
"Bayesian epistemology" refers to the study of knowledge and uncertainty from a Bayesian standpoint: probabilities, etc.
1 like = 1 unpopular opinion (none of these will be political). Capping this at, uh, 100, in case this goes viral or something, so I can keep my promise.
I did finish quickly (3 1/2 years), but I put a lot of effort into writing and polishing my thesis -- more than I needed to in order to graduate. I was proud of the work that I had done, and wanted to be proud of my thesis.
Last year I wrote a blog post called "Meat is the new guns". I never published it because it wasn't that good. But the thesis was: there will be a culture war over meat in the coming decades, with conservatives being pro-meat and liberals being anti-meat.
I want to thank my advisor @Tim_Roughgarden for guiding me through this journey. He's been incredibly supportive, and being advised by him has been a wonderful experience.
In January, I defended my thesis, supervised by @Tim_Roughgarden. Here's my thread from shortly after the defense.
Now I have a blog post explaining my research in more depth! (1/5)
On Wednesday, I defended my PhD thesis. For me, this was a major milestone: more significant than, say, college graduation. My thesis represents a culmination of research that I had done in and around grad school -- research that I had poured my heart and soul into.
@ne0liberal
Have you considered having Bryan Caplan (of Open Borders) on to discuss his book The Case Against Education? Haven't read it but I hear he makes a compelling case against having lots and lots of people go to college.
Again, I'm really surprised about how neglected some of the topics of my thesis are. They're super fascinating, and in some cases really important. And there's a bunch of low-hanging fruit, too! Important, tractable, and neglected, as the cool kids say :P
Just like algorithmic game theory is the application of the algorithmic lens to game theory, algorithmic Bayesian epistemology is the application of the algorithmic lens to Bayesian epistemology. And so my thesis is about how to be an *okay* Bayesian under real-world constraints.
Interesting to see Anthropic joining TechNet, the trade group opposing SB 1047.
That means OpenAI, Anthropic, Google, Meta, Amazon, Apple, IBM, and Andreessen Horowitz all now belong to orgs opposing the bill.
Hardly looking like regulatory capture!
@Scott_Wiener
The GOP platform is really disconcerting. On the other hand, SB 1047 does a great job of prioritizing safety without unnecessarily inhibiting AI innovation, and should be a model for the country. Thank you for your work, Senator.
Once I decided that I wouldn't go into academia (more below), I told some people that I wanted to just get my thesis done and finish quickly so I could have the PhD credential. But at some point I realized that this wasn't actually true. I actually cared more than that.
@alyssamvance
Regression to the mean, maybe? Like, maybe to get hired by the IAS most people need to get lucky (i.e. produce better research than you would on average given your skill level). And then they produce research equal to their skill level, which is worse.
What attempts have been made at money-free prediction markets? Why haven't they become popular?
[Asking because I'm considering working on making one. I have some reasons to think it might be preferable to Metaculus, which I'll write a blog post about soon.]
I super highly recommend Scott Garrabrant's ongoing "geometric rationality" sequence. I've been having lots of related thoughts, and it's wonderful to see a solid theoretical grounding for it.
There are several other topics, including forecast elicitation, Aumann agreement, and deductive estimation. I'll have much more to say about my thesis in the coming months. But what I'll say for now is: my main hope for my thesis is that it'll inspire more work on this stuff.
My crazy theory (which I think is only 5-10% likely) is that Biden is waiting until after the RNC to step aside, so that he takes the hit for Harris. RNC works less well if they don't know who they're running against.
I seem to have done well in the ACX forecasting contest!
I was unusually close to the average of all superforecasters -- indeed, closer to said average than 90% of superforecasters!
I don't focus much on computational constraints (there's been an *enormous* amount of literature on that already). On the other hand, Bayesian epistemology under informational, communication, and strategic constraints is surprisingly neglected relative to its importance.
You'd think that we'd have a really well-developed understanding of how to aggregate forecasts under incomplete information -- after all forecast aggregation is a truly ubiquitous problem! But... we kinda don't. Understanding this problem better is one topic of my thesis.
ARC's research is aimed at figuring out how to build advanced AI as safely as possible. I think this is a really important problem -- the *most* important problem -- and I'm really excited to try to contribute to this effort as best I can.
So it’s no surprise that, after a year of trying some other things, I decided to write my thesis about predicting the future.
I'm really proud of the work I did and wanted to make sure that it was accessible to a (more) lay audience. Hence this post!
But being a perfect Bayesian is often computationally intractable. And that's just one problem: there are many obstacles to completely assimilating all available information. Imagine that Alice tells you that there's a 60% chance that it will rain tomorrow, while Bob says 70%...
I've been looking forward to this!
I haven't had the chance to dig into the details yet, but so far there's been one number that has jumped out as kinda crazy to me: the model gives Trump a 43.5% chance of winning Maine. That seems *way* too high to me.
I thought about this in the context of China's one child policy, and the answer is 50%!
Every family will have exactly one boy. But what is the expected number of girls per family? With 50% chance, they have a girl. Conditioned on that, with 50% chance, they have another girl...
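The argument can be checked with a quick simulation. This is a minimal sketch, assuming every family keeps having children until its first boy and that each birth is an independent 50/50 draw; the function name, family count, and seed are arbitrary illustrative choices.

```python
import random

def simulate_girl_fraction(num_families: int, seed: int = 0) -> float:
    rng = random.Random(seed)
    boys = 0
    girls = 0
    for _ in range(num_families):
        # Each loop iteration is one girl born before the family's first boy.
        while rng.random() < 0.5:
            girls += 1
        # The family stops at its first boy, so it has exactly one.
        boys += 1
    return girls / (boys + girls)

print(simulate_girl_fraction(100_000))
```

The printed fraction should land close to 0.5. This matches the series in the tweet: the expected number of girls per family is 1/2 + 1/4 + 1/8 + ... = 1, the same as the one guaranteed boy.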
@ForecasterEnten
I think it isn't inconsistent to support a meeting when you like the president and oppose it when you don't. You just have to think that a meeting is worthwhile if and only if the president is competent.
@Scott_Wiener
Thank you for your work on this! I think this bill is great and I appreciate that you're striving to make it even better. The derivatives model clarification is particularly important IMO.
Why does no one in my circles ever argue that the tail risk of nuclear energy isn't worth the benefits? This is exactly the sort of contrarian argument I'd expect to hear. And I'm not sure it's wrong either! Has anyone seen an analysis that carefully deals with tail risk?
For personal reasons it's relevant to me what percent of Londoners will have covid on Thursday, the 16th. I did some math, and the number I got was really high: 10%!
I'll spell out my math in this thread. Let me know if you think I'm doing anything wrong!
Manifold has donated $245,120.79 since we've started (plus a whole bunch of @manifund regrants)!
Let's try to hit $300k or more before the rate change on May 15th 😤
Happy to add any charity you care about and want to donate to
Earlier I said that @slatestarcodex won a bet with @GaryMarcus on image generation. That was incorrect -- the bet was with a commenter named Vitor -- and I apologize to Gary Marcus.
@NateSilver538
But like, when polling issues, a 4-point difference in margin doesn't matter that much. 52-48 and 48-52 both mean "basically half of Americans like this". So even if polls are regularly off by 4 points, we can use them to get a pretty good handle on public opinion!
How should you aggregate their forecasts? If you know exactly how Alice and Bob's information overlaps, then maybe you can compute the perfect aggregate. But in practice, you almost never do. This is an *informational* barrier to being a perfect Bayesian.
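To see why the overlap matters, here is a small Python sketch comparing three textbook pooling rules on the Alice (60%) and Bob (70%) example. None of these is presented as the thesis's method; the function names and the shared 50% prior in the last rule are assumptions made purely for illustration.

```python
import math

def logit(p: float) -> float:
    return math.log(p / (1.0 - p))

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def linear_pool(forecasts: list[float]) -> float:
    # Plain average of the probabilities.
    return sum(forecasts) / len(forecasts)

def log_odds_pool(forecasts: list[float]) -> float:
    # Average the forecasts in log-odds space, then map back to a probability.
    return sigmoid(sum(logit(p) for p in forecasts) / len(forecasts))

def independent_evidence_pool(forecasts: list[float], prior: float = 0.5) -> float:
    # If every forecaster started from the same prior and saw evidence that is
    # conditionally independent given the outcome, their log-odds shifts add up.
    total = logit(prior) + sum(logit(p) - logit(prior) for p in forecasts)
    return sigmoid(total)

forecasts = [0.6, 0.7]
print(round(linear_pool(forecasts), 3))                # 0.65
print(round(log_odds_pool(forecasts), 3))              # 0.652
print(round(independent_evidence_pool(forecasts), 3))  # 0.778
```

The two averaging rules land near 65%, while treating the evidence as fully independent pushes the aggregate to about 78%. Which answer is right depends on how much of Alice's and Bob's information is shared, which is exactly the thing you usually don't know.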
There are a few people I follow who very consistently say really interesting, reasonable things. I recommend @KelseyTuoc and @juliagalef to everyone and @davidshor and @Nate_Cohn if you're interested in politics and elections.
Confidence in probabilities: a thread.
I sometimes say things like "I'm 50% sure of X but I'm not confident about that". What does "not confident" mean; what am I saying that's not expressed in the 50%?
This is importantly misleading, because lesser-known candidates do worse in head-to-head polls.
In 2020 polling, I found that for every 10-point increase in candidate recognition, a candidate gained 2 points in polls vs. Trump.
NEW POST-DEBATE POLL: In a new survey, 45% of likely voters choose Biden and 48% choose Trump in a head-to-head matchup.
However, there is no clear advantage among the alternative candidates who could replace Biden as the Democratic nominee.
@jeb_2020
@HarrisonScheer
@GRMagalha
@ne0liberal
Thanks for pointing this out. For anyone interested, here's the full table. First column is probability of Trump winning the electoral college conditioned on winning the state, second column is same but for Biden.
It refers to the "algorithmic lens" of theoretical CS on other disciplines, a major component of which is seeking *satisfactory* solutions in light of real-world constraints that prevent you from getting a *perfect* solution to your problem.
Writing is hard because an essay is linear, whereas your own understanding of the material is a DAG. You have to figure out a nice way to order your thoughts and convey them so the reader builds their own DAG.
The @nytimes is planning to publish an article containing Scott Alexander's real name, and this is *very* not okay. It serves no positive purpose, but does serve to threaten Scott's job, and maybe even his life and family. (1/3)
There are other constraints as well: maybe you're limited by communication, or by the fact that the information you need is held by experts who behave strategically instead of honestly.
And that's where the word *Algorithmic* in my thesis title comes in...
I love the feeling of unexpectedly using a piece of math from a pretty unrelated field in my research. I just used Shamir secret sharing (cryptography) to prove a result in my work on aggregating forecasts!
Introspecting about how I feel patriotism, I think that the action that the federal government could take to most increase my sense of national pride would be to substantially increase the amount of immigration into the US, let in lots of refugees, lift visa restrictions, etc.
@LinchZhang
One important piece of word choice they made is saying that "the executive branch" issued an executive order, rather than saying "Biden". I'm guessing way more Republicans would have said they disapprove if they had said "Biden".
Watch this space. These questions just opened, so there will probably be visible community predictions soon (maybe tomorrow?). I haven't seen many people willing to make probabilistic predictions about Omicron, so this will be valuable.
Yesterday I gave a quick summary of my paper with @algo_class on aggregating forecasts. Today I'll describe my other paper, with Raf Frongillo and Bo Waggoner of CU Boulder. This one's called "Agreement Implies Accuracy for Substitutable Signals". 🧵/16
I'm finally thinking *on my own* about AI alignment and writing down my thoughts (might share at some point, but probably not soon). A couple big uncertainties stand out: questions that I think really matter in terms of figuring out which approaches might work. Here are two:
@devonzuegel
That was the one I reacted most negatively to! (Though I don't think it's the worst one on the list -- it's just a pet peeve.)
It sets up a huge prisoner's dilemma: maybe it's correct for the nation to go to war, but it's in everyone's interest to vote no anyway.
@TheOnion
The first quote (translating via ASCII) is "I'm honored to meet with such hard-working, true Americans." The second one is "as president, I will never stop fighting for you. Despite our differences we all want freedom, democracy, and electricity."