Christoph Molnar

@ChristophMolnar

31,766
Followers
1,047
Following
751
Media
4,976
Statuses

Author of Interpretable Machine Learning | Newsletter:

Munich
Joined July 2012
Pinned Tweet
@ChristophMolnar
Christoph Molnar
7 months
Happy to publish our book project “Supervised Machine Learning for Science” today 🥳 Read it for free here: The book is about the philosophical and pragmatic justification for using supervised machine learning in science.
10
80
381
@ChristophMolnar
Christoph Molnar
2 years
BREAKING: IBM makes game-changing move, invests $42 in ChatGPT Pro to revolutionize IBM Watson capabilities
143
564
9K
@ChristophMolnar
Christoph Molnar
6 months
I have the solution for detecting AI-generated text.
Tweet media one
232
268
5K
@ChristophMolnar
Christoph Molnar
6 years
2 years, 250 pages, 1,219 commits, and 78,480 words: I am very proud to say that today I published the 1st edition of "Interpretable Machine Learning". 🎉🎉🎉 Web: Leanpub:
Tweet media one
71
1K
4K
@ChristophMolnar
Christoph Molnar
4 years
When your machine learning model is confronted with data it has not been trained on.
63
723
4K
@ChristophMolnar
Christoph Molnar
2 years
You can't "train" a model. The model always exists. It existed before you were born and it exists after your death. You can only find the model. "Training" is just your way of looking for the model's location in the infinite hypothesis space and binding its essence to silicon
199
319
4K
@ChristophMolnar
Christoph Molnar
2 years
I write my books with Vim. Why though? While there are many good reasons to use Vim for book writing, one stands out: I don't know how to exit Vim.
81
238
4K
@ChristophMolnar
Christoph Molnar
4 years
Machine Learning concepts as animal GIFs A thread 🧵
53
1K
4K
@ChristophMolnar
Christoph Molnar
2 years
Watching the k-means cluster slowly converge
35
393
3K
@ChristophMolnar
Christoph Molnar
11 months
Ready to nerd out on random forests? Someone wrote an entire Ph.D. thesis on random forests ❤️ 200 pages of gems. Highlights: Why ensembles work, interpretability of forests, and making random forests work on huge datasets.
Tweet media one
15
487
3K
@ChristophMolnar
Christoph Molnar
3 years
A lot of machine learning research has detached itself from solving real problems and created its own "benchmark islands". How does this happen? And why are researchers not escaping this pattern? A thread 🧵
Tweet media one
54
719
3K
@ChristophMolnar
Christoph Molnar
1 year
Tweet media one
22
199
2K
@ChristophMolnar
Christoph Molnar
8 months
Evaluating a model on training data is like asking your mom if you look good.
41
289
2K
@ChristophMolnar
Christoph Molnar
11 months
The mean is the best prediction model • Needs no features • Easy to compute • No overfitting • Interpretable • Optimizes L2 • Analytical solution
Tweet media one
118
139
2K
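A minimal sketch of the "Optimizes L2" and "Analytical solution" points in the tweet above: among all constant predictions, the mean has the smallest sum of squared errors. The target values and the helper `sse` are made up for illustration.

```python
from statistics import mean


def sse(y, c):
    """Sum of squared errors when predicting the constant c for every target."""
    return sum((yi - c) ** 2 for yi in y)


y = [2.0, 3.0, 5.0, 10.0]  # made-up targets
best = mean(y)             # closed-form solution, no features needed

# Compare against a few other constant predictions around the mean.
candidates = [best + d for d in (-1.0, -0.1, 0.1, 1.0)]
for c in candidates:
    print(f"c={c:.1f}  sse={sse(y, c):.2f}  (mean: {sse(y, best):.2f})")
```

Every candidate that differs from the mean yields a larger squared error, which is what "optimizes L2" means for a featureless model.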
@ChristophMolnar
Christoph Molnar
1 year
Not many people know this, but one milestone in becoming an ML researcher is getting blocked by Lex Fridman on Twitter.
85
101
2K
@ChristophMolnar
Christoph Molnar
3 years
Tweet media one
12
342
2K
@ChristophMolnar
Christoph Molnar
1 year
Neural networks with only linear layers would result in linear models. Because a linear combination of linear functions is again linear. Nothing would be gained by adding more layers.
Tweet media one
@mrdbourke
Daniel Bourke
1 year
ML folks, does anyone have a good resource/explanation for *why* the use of non-linear functions (e.g. ReLU) in neural networks and why they work so well? My simplified version is: Combine enough straight (linear) and non-straight (non-linear) lines and you can draw a
Tweet media one
103
58
593
33
183
2K
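The claim in the tweet above can be checked by hand: two stacked linear layers, W2(W1 x + b1) + b2, equal one linear layer with weights W2 W1 and bias W2 b1 + b2. The sketch below uses made-up weights and small hypothetical helper functions instead of a deep learning library.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def matmul(A, B):
    """Multiply matrix A by matrix B (both lists of rows)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def add(u, v):
    """Element-wise sum of two vectors."""
    return [a + b for a, b in zip(u, v)]


# Made-up weights: layer 1 maps 2 -> 3 units, layer 2 maps 3 -> 1 unit.
W1 = [[1.0, 2.0], [0.0, -1.0], [3.0, 1.0]]
b1 = [0.5, 0.0, -1.0]
W2 = [[2.0, 0.0, 1.0]]
b2 = [1.0]


def two_layers(x):
    """Two linear layers with no activation function in between."""
    return add(matvec(W2, add(matvec(W1, x), b1)), b2)


# The same function written as a single linear layer.
W = matmul(W2, W1)
b = add(matvec(W2, b1), b2)


def one_layer(x):
    return add(matvec(W, x), b)


print(two_layers([1.0, 2.0]), one_layer([1.0, 2.0]))
```

Inserting a non-linear activation such as ReLU between the two layers breaks this equivalence, which is why depth only adds expressive power once non-linearities are present.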
@ChristophMolnar
Christoph Molnar
10 months
"Machine Learning is just statistics." I thought as I entered my first Kaggle competition. Turned out to be a harsh reality check: ML and statistical modeling require different mindsets. This haunted me for many years, so I wrote it all down in my book Modeling Mindsets.
Tweet media one
12
227
2K
@ChristophMolnar
Christoph Molnar
3 years
Where are you on your machine learning journey?
Tweet media one
17
190
2K
@ChristophMolnar
Christoph Molnar
8 months
I just found a great introduction to embedding. The book is comprehensive yet short. Historical encoding tools, neural nets, and production - all covered. Fantastic job by @vboykis . Thanks for making it free to read! Looking forward to diving in.
Tweet media one
Tweet media two
5
276
1K
@ChristophMolnar
Christoph Molnar
6 months
Will this now happen with every new LLM update?
Tweet media one
19
193
1K
@ChristophMolnar
Christoph Molnar
2 years
Some dude at a BBQ once asked me what I do. I said research in ML / AI. Dude went off into an angry rant about surveillance, killer robots, and self-driving cars. No chance to tell him my work is most useful for data stored in Excel for some boring process in an insurance corp.
29
54
1K
@ChristophMolnar
Christoph Molnar
1 month
"Machine Learning is just statistics." I thought as I entered my first Kaggle machine learning competition. Turned out to be a harsh reality check: ML and statistical modeling require different mindsets. This haunted me for many years, so I wrote the book Modeling Mindsets.
Tweet media one
10
161
1K
@ChristophMolnar
Christoph Molnar
4 years
What do you call a machine learning model that perfectly predicts the training data, but does not work for unseen data? A database
40
131
1K
@ChristophMolnar
Christoph Molnar
1 year
How it started: f(X) = X How it's going: • is f AGI? • f might be conscious • f will conquer the world • You can use premium f for $20 a month • f is only 14 weeks old and here is what you can do. • Here are 10 products of f(X)=X you haven't heard of
30
173
1K
@ChristophMolnar
Christoph Molnar
2 years
I "grew up" as a statistician. When I later learned about machine learning, I found it a mind-blowing new perspective on data modeling. The best data modelers don't ideologically follow one mindset but have many at their disposal. But which modeling mindsets are there? 🧵
15
200
1K
@ChristophMolnar
Christoph Molnar
2 years
Progress in generative AI Past: Cherry-pick good results Now: Cherry-pick bad results
15
75
1K
@ChristophMolnar
Christoph Molnar
2 years
Novice data scientist: Data is data. Expert data scientist: Each data column has a difficult past. A past full of life-altering events (distribution shift), loss (missing data), misunderstandings (wrong encodings), and regrets (no documentation of how data was collected).
11
187
1K
@ChristophMolnar
Christoph Molnar
2 months
My book Interpretable Machine Learning has been career-defining. Even though the book is available on the web for free, many have supported me by buying the ebook or paperback, which ultimately helped me become a full-time writer. I'm grateful for all your support 🙏
Tweet media one
13
163
1K
@ChristophMolnar
Christoph Molnar
7 years
The first version of my online book on Interpretable Machine Learning is out! I am very excited to release it. It's a guide for making machine learning models explainable. #interpretableML #iml #ExplainableAI #xai #MachineLearning #DataScience
Tweet media one
16
463
1K
@ChristophMolnar
Christoph Molnar
2 years
Machine learning sucks at uncertainty quantification. But there is a solution that almost sounds too good to be true: conformal prediction • works for any black box model • requires few lines of code • is fast • comes with statistical guarantees A thread 🧵
20
140
976
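As a rough illustration of the "few lines of code" claim above, here is a sketch of split conformal prediction for regression, one common variant of the method. Everything concrete is an assumption for illustration: `black_box` stands in for any fitted model, the data are simulated, and the nonconformity score is the absolute residual. The coverage guarantee is marginal and relies on calibration and new data being exchangeable.

```python
import math
import random


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:  # too few calibration points for this alpha
        return math.inf
    return sorted(scores)[k - 1]


def black_box(x):
    """Hypothetical stand-in for any fitted regression model."""
    return 2.0 * x


def simulate(n, rng):
    """Made-up data: y = 2x + standard normal noise."""
    data = []
    for _ in range(n):
        x = rng.uniform(0.0, 10.0)
        data.append((x, 2.0 * x + rng.gauss(0.0, 1.0)))
    return data


rng = random.Random(0)
calibration = simulate(500, rng)  # held out, not used to fit the model

# Nonconformity score: absolute residual on the calibration set.
scores = [abs(y - black_box(x)) for x, y in calibration]
q = conformal_quantile(scores, alpha=0.1)


def predict_interval(x):
    """Prediction interval targeting 90% coverage."""
    pred = black_box(x)
    return pred - q, pred + q


# Empirical coverage on fresh simulated data.
test_data = simulate(2000, rng)
covered = 0
for x, y in test_data:
    lower, upper = predict_interval(x)
    covered += lower <= y <= upper
coverage = covered / len(test_data)
print(f"q = {q:.2f}, empirical coverage = {coverage:.3f}")
```

The model is only ever called through `black_box(x)`, which is why the approach works for any black box model: swapping in a different predictor changes the scores but not the procedure.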
@ChristophMolnar
Christoph Molnar
3 years
LASSO is the Marie Kondo of linear regression models
Tweet media one
6
169
916
@ChristophMolnar
Christoph Molnar
1 year
This is used as an example of a Gaussian distribution. But it's not a pure Gaussian. We must increase the nerd level: It's a mixture distribution of a discretized and truncated Gaussian with 0 and 135 (?) cut-offs and a Dirac delta function at 100.
@memecrashes
A meme page to check every time MatLab crashes
1 year
gauss 🤝🏽 gymbros
Tweet media one
127
5K
63K
16
98
896
@ChristophMolnar
Christoph Molnar
2 years
Studying statistics was one of the best decisions I made. After finishing school, I had no idea what industry I wanted to work in. I liked math. So I chose statistics. It's like having a janitor's key ring that gives me access to any industry. I never looked back.
22
93
889
@ChristophMolnar
Christoph Molnar
3 years
No amount of deep learning can fix bad data. Modeling is fun, good data is hard work. Improving data can mean - going through hand-written records - building a labeling interface yourself - finding out why data is missing - investigating the "trail" of the data - weeks of curation
13
139
864
@ChristophMolnar
Christoph Molnar
3 years
compute resources
Tweet media one
7
92
874
@ChristophMolnar
Christoph Molnar
4 years
We have a new paper on arxiv 🎉🎉 Interpretable Machine Learning - A Brief History, State-of-the-Art and Challenges. Best to read in a comfortable chair with a cup of coffee/tea. It's an extended abstract to a keynote I gave at the ECML XKDD workshop.
18
217
842
@ChristophMolnar
Christoph Molnar
11 months
I'm calling for a more minimalistic approach to machine learning: Regression? Predict the mean Robust regression? Median Classification? Always predict majority class Time series? Same as yesterday: Y(t) = Y(t-1) Text generation? Lorem ipsum
27
71
838
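Joke aside, the baselines listed above are standard sanity checks, and each fits in one line. A minimal sketch with made-up data; the function names are hypothetical.

```python
from collections import Counter
from statistics import mean, median


def mean_baseline(y_train):
    """Regression: always predict the training mean."""
    return mean(y_train)


def median_baseline(y_train):
    """Robust regression: always predict the training median."""
    return median(y_train)


def majority_baseline(labels):
    """Classification: always predict the most frequent class."""
    return Counter(labels).most_common(1)[0][0]


def persistence_baseline(series):
    """Time series: predict Y(t) = Y(t-1), i.e. the last observed value."""
    return series[-1]


y_train = [1.0, 2.0, 3.0, 100.0]  # made-up targets with one outlier
print(mean_baseline(y_train))     # pulled up by the outlier
print(median_baseline(y_train))   # barely affected by the outlier
print(majority_baseline(["spam", "ham", "ham"]))
print(persistence_baseline([3, 5, 4]))
```

Any model worth deploying should beat these reference points on held-out data.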
@ChristophMolnar
Christoph Molnar
2 years
Bayesians: "Join us! Updating prior beliefs is exactly how humans learn." Causal inference: "Join us! Humans think in causal relationships." Machine learning: "Join us! Models are black boxes and so are humans." 1/2
6
119
822
@ChristophMolnar
Christoph Molnar
3 years
Reinventing the wheel
Tweet media one
7
117
814
@ChristophMolnar
Christoph Molnar
2 years
Most ML interpretation methods have a common enemy: Correlated features. They ruin interpretation both on a technical and a philosophical level. Why correlation is problematic, how to patch it, and why we have no cure. A thread 🧵
17
171
809
@ChristophMolnar
Christoph Molnar
1 year
Machine learning interpretability from first principles: • A model is just a mathematical function • The function can be broken down into simpler parts • Interpretation methods address the behavior of these parts Let's dive in.
Tweet media one
3
126
796
@ChristophMolnar
Christoph Molnar
1 year
It took me a long time to understand Bayesian statistics. So many angles from which to approach it: Bayes' theorem, probability as a degree of belief, Bayesian updating, priors and posteriors, ... But my favorite angle is the following first principle:
8
116
779
@ChristophMolnar
Christoph Molnar
2 years
Regression models usually just output a point prediction. That's a problem. Because the prediction could be spot on or it could be a wild guess. To distinguish these two scenarios, we need uncertainty quantification. A solution: Conformal Prediction 1/n
17
118
769
@ChristophMolnar
Christoph Molnar
3 years
Statisticians and Data Scientists
Tweet media one
7
60
751
@ChristophMolnar
Christoph Molnar
1 year
Bayesian modeling from first principles and memes. Let's go.
15
124
750
@ChristophMolnar
Christoph Molnar
2 years
SHAP, LIME, PFI, ... you can interpret ML models with many different methods. It's all fun and games until two methods disagree. What if LIME says X1 has a positive contribution, SHAP says negative? A thread about the disagreement problem, and how to approach it:
19
139
745
@ChristophMolnar
Christoph Molnar
4 years
@MMsina @marielli In Germany we call this "Waschmaschinenfremdbenutzungsabsicherung" and you can buy it at the Baumarkt. No, just kidding. What's wrong with the neighbour?
7
8
728
@ChristophMolnar
Christoph Molnar
2 years
Simply apply SHAP and other methods to interpret your machine learning model, paste those charts into a report and you're done? Unfortunately, if you do, you're likely to make a mistake. Avoid these pitfalls when interpreting your machine learning model.👇
12
167
744
@ChristophMolnar
Christoph Molnar
11 months
Been doing ML for some years now. Here's my ranking: Supervised ⭐️⭐️ ⭐️ ⭐️ ⭐️ Self-supervised ⭐️ ⭐️ ⭐️ ⭐️ Reinforcement Learning ⭐️ Unsupervised 🚮
52
45
721
@ChristophMolnar
Christoph Molnar
2 years
Unpopular (?) opinion: Scientific papers are designed to be defensive which comes at the expense of honesty and readability. I blame the peer review system.
33
42
724
@ChristophMolnar
Christoph Molnar
3 years
Finally, the second edition of Interpretable Machine Learning is here! 🥳🥳🥳 So far only as e-book. Paperback should be available shortly (and in color!). Leanpub: Amazon:
11
119
687
@ChristophMolnar
Christoph Molnar
2 years
Data science cannot be fully automated. No amount of parameter tuning, benchmarking, automation, model comparison, automated feature engineering, etc., can automatically figure out what data column location_123_old contains and whether it should be a feature or not.
26
80
645
@ChristophMolnar
Christoph Molnar
2 years
Supervised learning "only" gives you a prediction function. But with the right tools, you'll get a lot more: • Uncertainty quantification • Causality • Interpretability • Analysis of variance • ... And the best news: tools in this thread work for any black box model 👇
8
102
618
@ChristophMolnar
Christoph Molnar
2 years
It's overwhelming to keep up with research on interpretable machine learning. I say that as the author of the Interpretable Machine Learning book. 😅 I use these 3 questions to quickly understand new interpretation methods:
11
99
618
@ChristophMolnar
Christoph Molnar
2 years
Super excited to release my new book Modeling Mindsets tomorrow 🥳 This is my first book since I decided to become a full-time writer. And it's the book I wish I had read a few years ago to save myself a lot of time. 1/n
Tweet media one
10
80
615
@ChristophMolnar
Christoph Molnar
6 months
I'm hoping developers take the AI dev thing seriously. Self-driving cars were introduced in the mid-2010s. Look at all the fools who got their driver's license nonetheless and now have a worthless document. Or think of all the jobless radiologists trying to make ends meet.
31
68
615
@ChristophMolnar
Christoph Molnar
2 years
I found it unexpectedly difficult to get into causal inference. (still a beginner, I guess) Here are a few insights that helped me in understanding causal inference. 🧵
16
64
574
@ChristophMolnar
Christoph Molnar
4 years
Logistic Regression sold as AI
2
48
586
@ChristophMolnar
Christoph Molnar
3 years
If you were to learn only 1 method for explaining machine learning models, it should be Shapley values (SHAP) - Model-agnostic: Use with any model - Theoretic foundation: Game theory - Good software ecosystem - Local and global explanations More here:
16
83
581
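To make the "model-agnostic" and "game theory" points above concrete, here is a sketch that computes exact Shapley values for a tiny model by averaging marginal contributions over all feature orderings. It is not the SHAP library's API. As a simplifying assumption, absent features are replaced by a single baseline value rather than averaged over background data, and the toy model `f` is made up.

```python
from itertools import permutations


def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x, relative to a fixed baseline point."""
    n = len(x)

    def value(coalition):
        # Features in the coalition take their real value, the rest the baseline.
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return f(z)

    phi = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        present = set()
        for i in order:
            before = value(present)
            present.add(i)
            phi[i] += value(present) - before  # marginal contribution of i
    return [p / len(orderings) for p in phi]


def f(z):
    """Made-up model: one additive term plus one interaction."""
    return 2.0 * z[0] + z[1] * z[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(f, x, baseline)
print(phi)
```

The interaction term is split equally between the two features involved, and the values sum to `f(x) - f(baseline)`. Enumerating all orderings is only feasible for a handful of features, which is why practical implementations rely on approximations.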
@ChristophMolnar
Christoph Molnar
2 years
One of the best arguments for supervised learning was made by a statistician. Statistical Modeling: The Two Cultures Every modeler should read it. The paper is written by Leo Breiman, the inventor of Random Forests.
11
106
580
@ChristophMolnar
Christoph Molnar
3 years
Early on, I had made "Interpretable Machine Learning" (web + e-book) available for 0$. I never regretted the decision, and I happily share my knowledge. And yet so many of you opted to pay for the e-book, encouraging me on my path of free knowledge. I am grateful 🙏
7
30
561
@ChristophMolnar
Christoph Molnar
3 years
How COVID ML papers are created
Tweet media one
9
101
558
@ChristophMolnar
Christoph Molnar
2 years
The modeling mindset of statisticians in one tweet 1) Measure random variables, e.g. water temperature 2) Assume distribution, e.g. Normal distribution 3) Goal: Find optimal distribution parameters, e.g. mean 4) Solution: Maximize the likelihood Parameters = Insight about world
Tweet media one
10
80
548
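The four steps above can be sketched for the Normal case, where maximizing the likelihood has a closed-form solution: the sample mean and the standard deviation with divisor n. The water temperatures are made up for illustration.

```python
import math


def normal_log_likelihood(data, mu, sigma):
    """Log-likelihood of i.i.d. Normal(mu, sigma^2) observations."""
    n = len(data)
    squared_error = sum((x - mu) ** 2 for x in data)
    return (-n * math.log(sigma)
            - 0.5 * n * math.log(2 * math.pi)
            - squared_error / (2 * sigma ** 2))


# 1) Measure: made-up water temperatures in degrees Celsius.
temps = [18.2, 19.1, 17.8, 20.3, 18.9]

# 2) + 3) Assume a Normal distribution and look for its parameters.
# 4) Maximizing the likelihood gives these closed-form estimates.
mu_hat = sum(temps) / len(temps)
sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in temps) / len(temps))

best = normal_log_likelihood(temps, mu_hat, sigma_hat)
print(f"mu_hat={mu_hat:.2f}, sigma_hat={sigma_hat:.2f}, log-lik={best:.3f}")
```

Moving either parameter away from its estimate lowers the log-likelihood, which is the sense in which the fitted parameters are the insight about the world in this mindset.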
@ChristophMolnar
Christoph Molnar
3 years
Statistics, Machine Learning, Causal Inference, ... People coming from these different modeling cultures often have misunderstandings, and fight over whether logistic regression is "AI" or statistics. Why the big differences? 1/n 🧵
8
133
549
@ChristophMolnar
Christoph Molnar
2 years
This tweet is 20% food for thought 80% shitposting Thank you for your attention
17
6
542
@ChristophMolnar
Christoph Molnar
3 years
Missing data in a random forest.
@AKBrews
The Spooky Lawyer | QENNY
3 years
What is this?
Tweet media one
4K
17K
160K
10
58
522
@ChristophMolnar
Christoph Molnar
5 years
What are some limitations of interpretable machine learning methods? This summer, our students worked on this question. We compiled the results in a free online book, which we release today. 🎉🎉🎉 Find out more:
Tweet media one
12
168
510
@ChristophMolnar
Christoph Molnar
7 months
To become a statistician, you must study the divine numbers: 30 represents sufficiency 0.049 is a sign of good fortune 1.96 protects you from uncertainty 0.051 says that your career is in danger
8
48
502
@ChristophMolnar
Christoph Molnar
6 months
I don't get why people still use SQL???? • Rejects most prompts. Requires specific prompting. • 100% overfitting to training data • No generalization to new data • Non-stochastic parrots
26
40
502
@ChristophMolnar
Christoph Molnar
2 years
Supervised machine learning models are deployed everywhere. It's an open secret that all models have a huge problem: Performative prediction - when predictions change future outcomes. How to spot and handle this problem: A thread 🧵
19
85
492
@ChristophMolnar
Christoph Molnar
3 years
When machine learning researchers reinvent methods from statistics.
8
73
485
@ChristophMolnar
Christoph Molnar
2 months
I see what you did there. Love the title.
Tweet media one
18
33
495
@ChristophMolnar
Christoph Molnar
1 year
Statistical modeling from first principles. But with memes and GIFs. Let's start.
Tweet media one
4
118
481
@ChristophMolnar
Christoph Molnar
3 years
We assume linearity
Tweet media one
6
75
471
@ChristophMolnar
Christoph Molnar
7 months
3 days left. Looking forward to the release of Supervised Machine Learning for Science.
Tweet media one
9
50
471
@ChristophMolnar
Christoph Molnar
2 years
Most machine learning applications fail. Google Flu, Amazon hiring, ML in healthcare, ... Why? A wide range of reasons. But most are due to the same flaw: not thinking through the problem as a statistician would. Adopt a statistician's mindset. Become a better machine learner
17
67
459
@ChristophMolnar
Christoph Molnar
2 years
tabular -> import xgboost text -> from transformers import pipeline image -> import torch, torchvision chat -> import langchain thanks for coming to my TED talk
10
35
461
@ChristophMolnar
Christoph Molnar
2 years
Modeling Mindsets is published 🥳 From Bayesian inference to unsupervised learning: Modeling Mindsets provides an intuitive introduction to different modeling approaches and allows you to choose the right one for your problem. Get your copy at:
21
86
459
@ChristophMolnar
Christoph Molnar
2 years
Is linear regression machine learning or stats? is like asking Is an apple a fruit or is it something to eat?
35
30
452
@ChristophMolnar
Christoph Molnar
4 years
Some examples of how interpretable machine learning (IML) is used for scientific discovery when the underlying model is a black box. Thread 1/n
8
110
441
@ChristophMolnar
Christoph Molnar
2 years
I once met a professor who was OBSESSED with h-index and citations. - Advised me to stop my book, turn it into a paper to get cited - Output not published at icml/ neurips? Worthless - Gave a talk - it was only about flashing status symbols Absolutely corrupted by THE GAME.
22
23
437
@ChristophMolnar
Christoph Molnar
3 years
The more I dive into machine learning targeting COVID, the more I believe that the research impact is not zero, but negative. Negative, bc. many prediction models are low quality, unusable. Waste of time + resources. Adding noise to an already crazy publishing environment.
17
63
429
@ChristophMolnar
Christoph Molnar
3 years
Welcome to academia
Tweet media one
4
27
431
@ChristophMolnar
Christoph Molnar
2 years
Deep learning crowd getting novel models.
5
53
422
@ChristophMolnar
Christoph Molnar
1 year
My new book, Interpreting Machine Learning Models With SHAP is now also available as a paperback (in color)! 🎉 SHAP is the Swiss army knife of ML interpretability and this book is a comprehensive guide to the theory and application of SHAP.
Tweet media one
5
41
420
@ChristophMolnar
Christoph Molnar
1 year
Looking for a career in AI? I have you covered.
Tweet media one
8
41
403
@ChristophMolnar
Christoph Molnar
10 months
Follow me for more tips on how to sound smart
Tweet media one
8
19
406
@ChristophMolnar
Christoph Molnar
7 months
7 years ago, I wrote the first chapters of Interpretable Machine Learning. I didn't know back then, but this was the start of my writing career. In 4 days, I'll publish a new book project. Online. Free access. In many ways, it weaves together many threads of my career.
18
27
396
@ChristophMolnar
Christoph Molnar
7 months
Bias-variance tradeoff, generalization error, no free lunch theorem -> These concepts come from statistical learning theory which views supervised machine learning through a statistics lens. It's heavy on math and theory. Bookmark the best entry point:
1
57
394
@ChristophMolnar
Christoph Molnar
3 years
Interpretable Machine Learning has >2000 citations 🥳 This reminds me of the professor who laughed at me when I told him that people were citing the book (8 citations at the time) and told me, "You should turn the book into a review paper so you can get citations." 🤷‍♂️
8
21
396
@ChristophMolnar
Christoph Molnar
2 years
I have started to create cheat sheets for model interpretation. So far: • Logistic regression: • SHAP plots for tabular data: They are free (pay what you want) Do you have any topic wishes for the next cheat sheets?
16
60
390
@ChristophMolnar
Christoph Molnar
4 years
A lot of interpretation methods for machine learning are connected to other fields Game Theory -> Shapley Values Philosophy -> Counterfactuals Sensitivity Analysis -> model-agnostic methods Statistical Modeling -> linear model, GAMs Rule-Based ML -> Decision trees What else?
16
56
388
@ChristophMolnar
Christoph Molnar
6 months
@gunesevitan AI-generated text was detected.
1
2
379
@ChristophMolnar
Christoph Molnar
6 months
@wolftreeMtg which is the truth and proof that this solution is premium
0
1
377
@ChristophMolnar
Christoph Molnar
10 months
Two types of Bayesians 1) Beer pub Bayesian: People who want to be seen as intelligent by using words like "Bayesian", "prior", "posterior", "updating" to refer to the fact that people have opinions and sometimes change their minds. 2) Actual Bayesian who analyzes data
22
30
365
@ChristophMolnar
Christoph Molnar
2 years
Are researchers putting too much burden on this smol boy?
Tweet media one
2
38
365
@ChristophMolnar
Christoph Molnar
4 years
"You should turn your book into a paper so that it gets cited." "You have to publish at ICML/ECML to have any impact." It can be difficult to decide when to follow conventional advice and when not. I am happy I did the unconventional with Interpretable Machine Learning.
Tweet media one
12
37
364
@ChristophMolnar
Christoph Molnar
2 years
@nntaleb
Bayes' theorem: P(H | D) = P(D | H) P(H) / P(D)
Conspiracy theorem: P(H | D) = P(H)
where
H: Hypothesis
D: Data
P(H): prior belief
P(D): evidence
P(D | H): likelihood
P(H | D): posterior belief
9
77
354
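A small worked example of the two formulas above, with made-up numbers. The evidence P(D) is expanded with the law of total probability over H and not-H, which assumes a binary hypothesis; the function names are hypothetical.

```python
def bayes_posterior(prior, likelihood, likelihood_if_false):
    """P(H | D) via Bayes' theorem, with P(D) = P(D|H)P(H) + P(D|not H)P(not H)."""
    evidence = likelihood * prior + likelihood_if_false * (1.0 - prior)
    return likelihood * prior / evidence


def conspiracy_posterior(prior, likelihood, likelihood_if_false):
    """The 'conspiracy theorem': the data are ignored, so P(H | D) = P(H)."""
    return prior


# Made-up numbers: the data are 9 times more likely if the hypothesis is false.
prior = 0.5
p_data_given_h = 0.1
p_data_given_not_h = 0.9

print(bayes_posterior(prior, p_data_given_h, p_data_given_not_h))
print(conspiracy_posterior(prior, p_data_given_h, p_data_given_not_h))
```

With these numbers Bayes' theorem lowers the belief in H from 0.5 to about 0.1, while the conspiracy version stays at 0.5 no matter what the data say.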
@ChristophMolnar
Christoph Molnar
7 years
Most common arguments against interpretable ML: - Humans can't explain their actions either - Performance > Interpretability - Slows down ML adoption - Linear model also not interpretable - Might create illusion of understanding the model Thread:
10
155
353