Anthropologist
@MPI_EVA_Leipzig
- telling anyone who will listen that, if we are very careful and try very hard, we might not completely mislead ourselves
The folks at
@ISBA_events
tell me my book Statistical Rethinking has won the 2024 DeGroot Prize for its contributions to "statistical inference, decision theory and statistical applications". This is a huge honor, especially given the previous winners, who have influenced me so much
I used to teach game theory, both undergrad & phd levels. One game I would do at start is version of Keynesian beauty contest: everyone picks a number 0-100, person closest to 2/3 of average wins. Nash is 0. But anyone choosing 0 loses, bc the class aren't (yet) game theorists. >
Saying "there is one way to arrange zero objects" is the perfect example of sentence that makes sense to a mathematician but not to a person. I like it
Ppl are on twitter for many reasons. I'm here to chew bubblegum & teach scientific inference. And I'm all out of gum. Engage yourself with 20+ hours of free lectures on causal inference and Bayesian data analysis. This isn't an ordinary statistics course.
Students who had previous exposure to game theory would get mad at the class. "You have to choose zero!" They were robbed of victory by the dumb majority. But this is how society works. Gotta have the right model of the distribution of strategies. >
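A minimal sketch of the guess-2/3-of-the-average game described in this thread. The class composition here (29 students guessing uniformly at random plus one student playing the Nash strategy of 0) is a made-up assumption for illustration, not data from the actual course.

```python
import random


def beauty_contest_winner(guesses, fraction=2 / 3):
    """Return (target, winning_guess) for a guess-a-fraction-of-the-average game."""
    target = fraction * sum(guesses) / len(guesses)
    # The winner is whoever guessed closest to the target.
    winner = min(guesses, key=lambda g: abs(g - target))
    return target, winner


random.seed(1)
# Hypothetical class: 29 naive guessers plus one game theorist playing Nash (0).
naive_guesses = [random.randint(0, 100) for _ in range(29)]
guesses = naive_guesses + [0]

target, winner = beauty_contest_winner(guesses)
print(f"target = {target:.1f}, winning guess = {winner}")
```

Zero only wins when nearly everyone else also plays close to zero, which is the point about needing the right model of the distribution of strategies.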
The 2023 edition of my long-running anti-establishment art-science-fusion code-therapy smooth-baritone causal inference & Bayesian data analysis course is complete. 20 lectures, from the basics of causal inference & Bayesian updating to mixed models & Gaussian processes. 1/2
Forgive me, for I am about to Bayes. Lesson: Don't trust intuition, for even simple prior+likelihood scenarios defy it. Four examples below, each producing radically different posteriors. Can you guess what each does? Revealed in next tweet >>
Honestly publishing journal articles is the lowest impact activity I could do right now. I know not everyone finishes every video. But the sustained viewings are good this year. I remain at your service.
It has been 951 days since Bill Gates gifted every stats teacher with this finely distilled tweet. It's so good, because Gates is not dumb. There is nothing dumb about not understanding conditional probability. It's only human.
Want to learn a little game theory? Lectures from my course "Very Little Evolutionary Game Theory". Topics:
1. Evolution of Conflict
2. Evol of Cooperation
3. Evol of Relationships
4. Evol of Families
5. Evol of Societies
Watch at x2 speed for full effect:
These "how are Sardinians living so long" takes are still popping up. My favorite hypothesis: They lie about (or don't know) how old they are. Few of the 100+ folks have birth certificates. Same phenomenon elsewhere: official birth certs appear -> supercents decline.
The
#1
spot in the entire world where people live to 100 years old:
Sardinia, Italy.
I wanted to find out what Sardinians are doing that other places aren't…
Here are the 5 key things I found:
(Actions you can steal)
I am giving an internal talk to the Max Planck IT community next month. They asked for something about the role of software in science. Here's what I'm giving them. I will try to record and upload afterwards...sure to be spicy.
Statistical Rethinking 2nd edition page now lists code conversions for:
* raw Stan+tidyverse
* brms+tidyverse
* PyMC3
* Tensorflow Probability
* Julia & Turing
I know other conversions in the works. If I have missed something, please let me know.
Thinking again this morning that so much scientific writing is bad because it is written for hostile reviewers rather than for interested readers. Not sure there is a way out of this trap.
It is yet again a good time to post this great paper from Xiao-Li Meng on how data quality influences effective sample size. You really musk read it. [pdf: ]
Opening up Statistical Rethinking 2022 to external registration. All the materials will be public, so registration only necessary if you want to join weekly online discussion. Starts in January.
Details:
Register:
I spend a lot of time complaining about statistical practice, so recent followers might appreciate that I also spend a lot of time trying to present reasoned solutions. Like 20+h of free lectures on computational Bayes & causal inference. Vibe to trailer.
Stats is broken. Causal inference is broken. Together they are less broken. I will teach a 3h online course in September, aimed at ppl who have tried to understand connections btw causal inference & statistical inference but were discouraged by notation & puffery.
@StatHorizons
72% of corresponding authors (CA) responded positively to a raw data request when the CA was an early career researcher. Same figure for senior researchers was only 11%.
#openscience
#hope
#paywall
10% of your genome is composed of traces of Alu, a mobile element that has jumped around our DNA for millions of years doing nothing important. But one of those jumps may have caused our distant ancestors to lose their tails, about 15 million years ago!
My Stat Rethinking course this winter has filled up. 160 people joining me each week through the magic of Zoom to work the problem sets and address conceptual questions. Everyone else is welcome to the materials, including lectures & problem set solutions:
As summer gets started, here is a reminder that all 20 lectures of my 2019 applied bayesian stats course are online (with all notes and exercises/solutions):
hey
@rlmcelreath
-
just finished your 2019 rethinking course, all with exercises and owls done :)
it was truly amazing and can't stop recommending it to people!
thank you!
Norway's $1.5 trillion sovereign wealth fund lost $92 million because of an excel spreadsheet error - "the most consequential misdated cell in history"?
I teach the Kalman filter as a special implementation of a class of Gaussian Processes (GPs). Much of the modern world runs on these algorithms so a shame they are not more central to training, if that's the case.
The Kalman Filter was once a core topic in EECS curricula. Given its relevance to ML, RL, Ctrl/Robotics, I'm surprised that most researchers don't know much about it, and many papers just rediscover it. KF seems messy & complicated, but the intuition behind it is invaluable
1/4
Every month, I send someone the "Table 2 Fallacy" paper. Let's stop bike-shedding over p-values and face the fact that most scientists have no idea what a regression coefficient means in the first place. Link:
This debunked finding keeps coming back. Humans are at least as good at this task as the chimpanzees. It's a practice effect. See replies in the thread.
Good television makes for bad theorizing.
One of the hard things about immigrating to Germany is the general lack of positive feedback. I was born here, but I'm still very American, so I give positive feedback & it makes ppl uncomfortable sometimes.
This from the Max Planck Society's guide for foreign scientists:
Open registration begins for the 2023 session of my annual anti-statistics course focusing on causal inference & Bayesian data analysis & fully coded examples. Lectures will be free online, so no need to register unless you want to join discussions.
You may know that I wrote a code-heavy, jokey, entropy-loving applied Bayes stat book.
But you may not know that altruistic colleagues have translated the examples into:
(1) tidyverse + brms
(2) Python
(3) raw Stan
(4) Julia
Everything linked at top:
Causal salad, causal design, causal inference. I did a 3 hour workshop in Leipzig yesterday on causal inference, aka why your regressions are garbage lolsob. Here's a recording of me covering the same content, I promise it's not boring:
I'm taking my steps to becoming a bayesian (but I'm gonna bitch and complain my whole way about it)
I'm gonna say this is the book that literally converted me. It's beautifully written in a way that makes me think of reading a novel almost.
Likert scores are not integers and they cannot be subdued by pretense. Stop pretending and meet me in the warm 3rd circle of stats hell and learn about ordered categorical models. Lecture:
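One common form of ordered categorical model is the cumulative-logit (ordered logit) model. The sketch below turns a set of cutpoints and a linear predictor into category probabilities. The function names and the cutpoint values are illustrative assumptions, not taken from the lecture.

```python
from math import exp


def logistic(x):
    """Inverse-logit function."""
    return 1.0 / (1.0 + exp(-x))


def ordered_logit_probs(cutpoints, phi=0.0):
    """Category probabilities for an ordered-logit model.

    cutpoints: increasing thresholds on the log-cumulative-odds scale
               (K - 1 cutpoints for K ordered categories).
    phi:       linear predictor; larger values push mass to higher categories.
    """
    # Cumulative probabilities P(y <= k); the last category always reaches 1.
    cumulative = [logistic(c - phi) for c in cutpoints] + [1.0]
    # Each category probability is the difference of adjacent cumulative ones.
    probs = [cumulative[0]]
    for k in range(1, len(cumulative)):
        probs.append(cumulative[k] - cumulative[k - 1])
    return probs


# Hypothetical 4-point Likert item with three cutpoints.
cutpoints = [-1.0, 0.0, 1.0]
print(ordered_logit_probs(cutpoints, phi=0.0))
print(ordered_logit_probs(cutpoints, phi=1.0))
```

The spacing between cutpoints is estimated rather than assumed equal, which is what treating the responses as integers cannot do.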
Looking for a distraction? Because reasons.
How about 20 hours of causal inference and Bayesian statistics? Ranging from the foundations of inference to high-dimensional machine learning? Yeah that's the stuff. First sample is free. Okay it's all free.
Things I do not do often enough:
1. inflate my bicycle tires
2. descale my de'longhi
3. call my mom
4. remind you that I made 20 hours of free bayes stats (really anti-stats) lectures because i love you
Trying to improve my upcoming lecture and I know I will spend 15 minutes looking for a cat pic that more closely matches the tiger's pose. But these are almost perfect.
Okay I give up: A p-value really is the probability the null is true. We lost this game, statisticians. Every one of you gave it your best, and I will always be proud of you. But the scientists cannot be defeated by conventional means. GG
For all the people who followed me recently because I posted some weird posterior distributions, I have made more than 20 hours of free lectures on Bayesian stats and causal inference just for you
If I am not answering your email, it's because I'm working on new (free) lectures to begin in January 2022. Fewer examples, but more workflow details and lots of new animation. I'll update this repo as the schedule and materials assemble:
Writing a review and I just wrote: "People used to do theory. Now they just do regressions."
I will delete my snark before submitting. But damn I am tired of vapid empirical papers that argue some neglected thing will transform our understanding and then ***regressions***
OUTAGES
- Biggest IT outage ever according to experts
- Major banks, media, airports and airlines affected by major IT outage
- Rail services disrupted in parts of US and UK
- Payment systems impacted in different parts of the world, including Australia and the UK.
Readers of my book will know the globe tossing example in the early chapters that I use to introduce bayesian updating. I have now fully virtualized it.
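For anyone who has not seen the globe tossing example, here is a rough grid-approximation sketch of that kind of Bayesian updating: toss a globe, record water or land, and update beliefs about the proportion of water. The flat prior, the grid size, and the counts (6 water in 9 tosses) are illustrative assumptions, and this is not the virtualized version mentioned above.

```python
from math import comb


def grid_posterior(water, tosses, grid_size=101):
    """Grid-approximate posterior for the proportion of water on the globe."""
    # Candidate proportions from 0 to 1.
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    # Assumption: flat prior over the proportion of water.
    prior = [1.0] * grid_size
    # Binomial likelihood of the observed count at each candidate proportion.
    likelihood = [
        comb(tosses, water) * p**water * (1 - p) ** (tosses - water) for p in grid
    ]
    unnormalized = [lik * pr for lik, pr in zip(likelihood, prior)]
    total = sum(unnormalized)
    posterior = [u / total for u in unnormalized]
    return grid, posterior


grid, posterior = grid_posterior(water=6, tosses=9)
mode = grid[posterior.index(max(posterior))]
print(f"posterior mode on the grid: {mode:.2f}")
```

Swapping the flat prior for any other list of non-negative weights changes the result without touching the rest of the code.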
"We show that published papers in top journals that fail to replicate are cited more than those that replicate. This difference doesn't change after publication of failure to replicate. 12% of postreplication citations acknowledge the replication failure."
So suppose I were thinking of leaving academia. Would anyone be interested in hiring me? DM and let's chat. I haven't made up my mind, but just trying to assess the landscape, eventually find someplace where I can make a difference
"The general rule is that people .... do not read." Hilarious, true, sad and reminds me of this bit from 2nd ed of Collins & Pinch where they quote a physicist talking about the fact that physicists don't read:
Scientist, post-doc, and PhD positions open in my department in Leipzig:
Things we value in candidates: Open science, scholarship, mad skills
Things we do not value: Number of pubs, journal impact factor, h-index
Lecture recordings, slides, homework sets and solutions are all listed here. Take it at your own pace, as you like it. First half is a solid course in regression and causal inference. Second half turns it up to 11.
Looking for a mind-growing distraction? How about my 3 part intro to Bayesian causal inference. It's like a condensed version of my book, 10 weeks of causal computation in 3 short blog posts. Take with plenty of water.
I was asked by a journalist how Bayesian stats is relevant to the epidemic. I said some deflationary things. I don't care about 19th century academic debates.
I worry more about narrative that we need to get the models "right". We don't buy insurance bc we know what will happen.
This is the 2 page template my PhD students and I use to draft their project proposals. This came up this morning, as I met with all the PhD students to review a bunch of committee procedures. Students seem to like this template.
Just finished 2nd week of my bayes & causal inference course. Prerecording for internet audience is satisfying. Fewer spontaneous jokes but better content. First 4 lectures as alternative to doomscroll, my gift to you.
Yes I will offer my online science-focused Statistical Rethinking course again starting in January 2024. Registration going up at the end of this month. All the course materials are already online though, so why wait? Update your posterior today
So much Machine Learning snark in my timeline. But I honestly wouldn't be surprised if logistic regression would be a moon shot level of improvement in many industries. In some industries, coin flipping might increase accuracy.
That's my optimistic cynicism for the week.
Science as Amateur Software Development
50min talk, webcam edition
As frustrating as software engineering can be, it is still more professional than normative scientific research. Lots of shade thrown at academia, some hopefully useful suggestions.
Golems, Owls & DAGs: Lecture 1 of Statistical Rethinking 2022. No hard work yet in this lecture. Just setting the stage. Lecture 2 dropping soon with Bayesian updating.
Alphabetical order mismatch and 52 of 78 neighborhoods had wrongly merged data. I spend a lot of time teaching advanced inference methods, but boring research data management remains the most essential skill. And that includes auditing for merge mistakes.
The findings in the published paper, however, are completely untrue, and can only be replicated using an improperly merged dataset. The findings are an artifact of failed data management.
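A small sketch of what auditing for merge mistakes can look like: join on an explicit key instead of row position, and report every record that fails to match one-to-one. The function name, field names, and neighborhood records below are hypothetical, not from the paper in question.

```python
def audit_merge(left, right, key):
    """Join two lists of dict records on `key` and report match problems."""
    left_keys = [row[key] for row in left]
    right_keys = [row[key] for row in right]
    report = {
        # Keys appearing more than once would make the join ambiguous.
        "duplicate_left": sorted({k for k in left_keys if left_keys.count(k) > 1}),
        "duplicate_right": sorted({k for k in right_keys if right_keys.count(k) > 1}),
        # Keys present on one side only are silently lost by a naive join.
        "only_left": sorted(set(left_keys) - set(right_keys)),
        "only_right": sorted(set(right_keys) - set(left_keys)),
    }
    right_by_key = {row[key]: row for row in right}
    merged = [
        {**row, **right_by_key[row[key]]} for row in left if row[key] in right_by_key
    ]
    return merged, report


# Hypothetical records: the second neighborhood is spelled differently in each
# file, so sorting both files alphabetically and pasting columns side by side
# could misalign rows without raising any error.
crime = [{"neighborhood": "Ashby", "crime": 3}, {"neighborhood": "De Soto", "crime": 5}]
income = [{"neighborhood": "Ashby", "income": 40}, {"neighborhood": "DeSoto", "income": 55}]

merged, report = audit_merge(crime, income, key="neighborhood")
print(merged)
print(report)
```

Any non-empty entry in the report is a reason to stop and inspect the data before fitting anything.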
This short comment from Andrew Gelman on designing experiments is pointed and useful. Love the de-emphasis on power analysis but emphasis on simulating and making hard choices. PDF:
Take heed, Statistical Rethinkers - there are many code translations of the book examples and the new lecture examples into your favorite dialect of SnakeScript or CleanSpace or whatever hacker nonsense you like. I try to maintain an updated list:
Oops, I accidentally stayed up until 1am reading Statistical Rethinking by
@rlmcelreath
.
I really love the way in which he couches the use of statistical models in the history of the philosophy of science.
Slides + audio: A gentle 2 hour introduction to Bayesian data analysis & causal inference. This is "gentle" because it ignores computation & focuses instead on motivation, basics of Bayesian updating, simple confounds & colliders.
I complain a lot about academia. But I also try to help. eg here are 20h of patient lectures on scientific inference taught at an algorithmic level. No one should have to learn this stuff the way that I did.
Local registration for the 2024 round of my Statistical Rethinking course has begun. I'll open up registration on Sunday 3 December. Registration link will appear on the course github page:
Since I am talking about p-values today (only day this year I promise), the common claim that p-vals are uniform under the null is not in general true. Even in theory. Here is the dist of p-values under null for logistic regression, two groups:
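One way to see the non-uniformity without simulation is to enumerate every possible outcome for two small groups with the same true success probability and compute the Wald p-value for the group coefficient. This is a sketch under assumptions, not a reproduction of the figure: the group size (10 each), the true probability (0.5), the use of the Wald test, and the choice to assign p = 1 when a cell count is zero (where the maximum likelihood estimate does not exist) are all illustrative.

```python
from math import comb, erfc, log, sqrt


def wald_p_value(a, b, n):
    """Two-sided Wald p-value for the group effect in a two-group logistic
    regression, with a and b successes out of n trials in each group."""
    if a in (0, n) or b in (0, n):
        # Assumption: with an empty cell the MLE diverges; treat as p = 1.
        return 1.0
    log_odds_ratio = log(a / (n - a)) - log(b / (n - b))
    std_error = sqrt(1 / a + 1 / (n - a) + 1 / b + 1 / (n - b))
    z = abs(log_odds_ratio) / std_error
    return erfc(z / sqrt(2))  # equals 2 * (1 - Phi(z))


def null_p_value_distribution(n=10, p=0.5):
    """Exact distribution of the p-value when both groups share probability p."""
    dist = {}
    for a in range(n + 1):
        for b in range(n + 1):
            prob_a = comb(n, a) * p**a * (1 - p) ** (n - a)
            prob_b = comb(n, b) * p**b * (1 - p) ** (n - b)
            p_value = round(wald_p_value(a, b, n), 12)
            dist[p_value] = dist.get(p_value, 0.0) + prob_a * prob_b
    return dist


dist = null_p_value_distribution()
rejection_rate = sum(prob for pv, prob in dist.items() if pv <= 0.05)
print(f"distinct p-values: {len(dist)}")
print(f"P(p <= 0.05) under the null: {rejection_rate:.4f}")
```

Because the data are discrete, the p-value can only take a finite set of values, so its distribution under the null cannot be uniform in this setup.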
My editor is turning up the heat & the 2nd edition of Statistical Rethinking will be out next year. Below is 1st page of last chapter, summarizing my general attitude to stats. More here:
Okay, I added vertical axes, as is my duty. Trying to put some version of this on cover of 2nd edition of Statistical Rethinking. It's an example in Chapter 4.
Multiple regression is not an oracle that spits out the total causal effects of each explanatory variable. Adding variables can hurt as much as it can help, whether a study is observational or experimental.
Brand new chapter in 2nd edition of my stats book will be about using real scientific models to build statistical analyses. Need to edit it down now to only 3 distinct examples. Feel like I could do an entire book of examples like these.
Many performers of music cannot read it. Okay. There are other, often more intuitive, ways to learn music.
Scientists perform stat models. Most scientists cannot read them. This is less OK, but there are other ways to learn models.
Short thread in which I strain this comparison
In stats consultations lately, common problem across domains has been that scientists want to start with what they have measured (or can download) instead of what they would ideally want to measure. Gotta back them up and get them to science before stats. Hard.
I hope everyone is having a relaxing holiday season. I am still making my gift to you, a bunch of new lectures for January. This is a lot of work! Teaching continues to be the hardest and most impactful part of my job. Below: Drawing the Bayesian owl.