eugeneyan

@eugeneyalt

652
Followers
72
Following
195
Media
1,408
Statuses

Care a lot, try hard, have fun. @eugeneyan's inner Id.

Joined March 2020
Pinned Tweet
@eugeneyalt
eugeneyan
11 months
Note to self: • Work hard • Keep learning • Cherish loved ones • Find people who inspire you • Be kind & egoless • Eat healthy, exercise, sleep well • Read & write • Practice gratitude & meditate • Be present • Enjoy food & nature • Don’t sweat the small stuff • Smile
1
0
24
@eugeneyalt
eugeneyan
12 days
applies to most technical roles and prompted me to write this:
Tweet media one
@_xjdr
xjdr
12 days
AI is full of people that think what is difficult is valuable, lol
19
19
420
10
76
571
@eugeneyalt
eugeneyan
3 months
luck surface area > fly to SF to speak at conf > talk to 20+ pple > attend evals meetup > talk to 10+ pple > chief-of-staff hears me talk, texts friend to uber down asap > friend turns out to be research dir > we had highest alpha chat (1hr) on my entire trip > 3% hit rate
6
1
117
@eugeneyalt
eugeneyan
2 months
tbh it’s hard to articulate what i do to look around corners, connect the dots, force multiply, deliver through others. it’s especially challenging to explain how i do it or mentor others to do it too
@rakyll
Jaana Dogan ヤナ ドガン
2 months
As a senior individual contributor, part of how you get things done is always going to be black magic. In companies where there is an established senior IC population, this is a pretty well understood reality. In others, management will try to micromanage you or count some
11
53
644
4
5
60
@eugeneyalt
eugeneyan
3 months
look mum, i'm an "experienced LLM hacker"
Tweet media one
4
4
51
@eugeneyalt
eugeneyan
3 months
look who i bumped into at the airport
Tweet media one
3
0
46
@eugeneyalt
eugeneyan
4 months
tech folks hired during peacetime ain’t ready for wartime
3
5
46
@eugeneyalt
eugeneyan
3 months
behind the scenes on how it started
Tweet media one
@eugeneyan
Eugene Yan
3 months
All three parts of "What We Learned from a Year of Building with LLMs" are now live on O'Reilly! • Tactics: • Ops: • Strategy: Read all 42 lessons here: A sample of what each
19
170
771
1
3
45
@eugeneyalt
eugeneyan
22 days
a lot of ml engineers and scientists have the misconception that “doing science” is about inventing or applying the sota, lots of math/code, etc. no, that’s not it. doing science is about valid measurements and experiments, understanding why it works or doesn’t, creativity, etc
0
7
46
@eugeneyalt
eugeneyan
4 months
look what Hamel did
Tweet media one
@HamelHusain
Hamel Husain
4 months
Love this essay from @eugeneyan. This is especially acute for tools and infra around AI
Tweet media one
30
73
585
3
4
29
@eugeneyalt
eugeneyan
25 days
wow my paper on detecting hallucinations got accepted for oral presentation at Amazon ML Conf 🎉
3
0
28
@eugeneyalt
eugeneyan
3 months
them: "why don't you publish in a journal or conference? it'll help with your reputation" me: "do i really need to? feels like more people can benefit by sharing openly?" • • •
@_xjdr
xjdr
3 months
man, i want to be like Noam when i grow up. This guy is out here doing native int8 training, local attention and radix tree kv_caching and then just posts about it casually on a blog. some of y'all would have milked 8 arxiv papers out of what was contained in that post.
11
15
437
1
5
22
@eugeneyalt
eugeneyan
3 months
Tweet media one
1
0
22
@eugeneyalt
eugeneyan
1 month
drafting my 188th writeup (on llm-based evals) now and it's still incredibly hard to put thoughts on paper and organize it for my intended reader
Tweet media one
@eugeneyalt
eugeneyan
1 month
writing is HARD, and i've not been as productive this year as i'd like to be—only 8 write-ups & 2 talks😔 some of these took lots of effort: synthetic data, task-specific evals, what we learned, prompting, ml interviews. maybe I should write shorter pieces in larger quantity
Tweet media one
2
0
12
0
1
22
@eugeneyalt
eugeneyan
2 months
@sh_reya wait seriously? that's 30m input tokens?
Tweet media one
1
0
21
@eugeneyalt
eugeneyan
21 days
epic sf day trip catching up with old friends and making new ones. and amazing picnics, noodles, and japanese dinner. 🤤
Tweet media one
Tweet media two
Tweet media three
Tweet media four
1
3
21
@eugeneyalt
eugeneyan
28 days
at a bbq someone asked why I "worked" every weekend. i said that for as long as i remember i always felt the need to catch up, from my psych degree to investment negotiations to data analyst to cs masters to ml eng to now llms. do it for a decade and it becomes habit
4
0
19
@eugeneyalt
eugeneyan
3 months
the more you put out into the ether, the more comes back at you. applies to both good and bad vibes. few truly understand and believe this.
1
3
19
@eugeneyalt
eugeneyan
3 months
before this weekend i knew evals were important; after the weekend i realized they were a bottleneck for almost all other teams and probably the most pressing issue for building reliable and usable ai products.
2
0
19
@eugeneyalt
eugeneyan
4 months
Top qns people get from L6 Applied Scientists (me): • What's the customer problem we're solving? • Do we have data/prototype that confirms this works? • Could we write it down so it's clear for everyone? • Have we considered latency, throughput, security, monitoring, cost?
1
2
19
@eugeneyalt
eugeneyan
1 month
working on my o1 and it's such a slog
7
0
18
@eugeneyalt
eugeneyan
4 months
@ekdnam hungry to learn & ship, ready to 996 for a deadline every now & then, pushing like runway's running out every day, when leader says jump they ask how high instead of bargaining to squat or jump later, willing to do whatever it takes ethically to serve users, and actually deliver
1
2
15
@eugeneyalt
eugeneyan
2 months
the inverse applies too: anyone who doesn’t have their own evals is just playing around
@yacineMTB
kache
2 months
@_xjdr anyone serious has their own private test bench that they hand wrote themselves
1
0
16
1
1
17
@eugeneyalt
eugeneyan
5 months
Just finished my draft of "A Builder's Guide to Evals for LLM-based Applications". If you have nothing better to do on this beautiful sunday and want to provide feedback on 5,000 words about evals for classification, summarization, translation, copyright, and toxicity, dm me.
2
0
16
@eugeneyalt
eugeneyan
19 days
heartbreaking to see how poor leadership & management of a team / program leads to its obsolescence or hollowing out over years. it's not noticeable when you look at it month on month or quarter on quarter, but suddenly after a few years, its collapse is clear.
1
1
16
@eugeneyalt
eugeneyan
2 months
the last mile matters more than the first. starting well but fumbling the last mile = poor outcomes and lost trust. starting poorly yet finishing well = solid outcomes and turning things around.
1
3
14
@eugeneyalt
eugeneyan
3 months
dinner with some of my fav folks in SF
Tweet media one
4
0
15
@eugeneyalt
eugeneyan
3 months
overall, gained alpha from 6 - 8 folks i spoke with, so ~20% hit rate, with one outlier (3%)
0
0
14
@eugeneyalt
eugeneyan
2 months
@HamelHusain buying $AMD based on this Hamel
3
0
14
@eugeneyalt
eugeneyan
3 months
level 1: sign up for credits • level 2: sign up for learning • level 3: sign up for community • level 4: sign up to forge friendships
@TheZachMueller
Zach Mueller
3 months
*Signs up for a course* *Ends up making great friends out of @eugeneyan, @charles_irl, etc* I’d say it’s a win win all-around
0
0
18
2
2
13
@eugeneyalt
eugeneyan
3 months
eval notes • a few practitioners have methods similar to mine (simplify task, non-llm model-based evals, etc) and found it to perform well too • two experts: while they agree llm-eval not reliable enough now, confident can get there in 12 months and have invested heavily in it
2
0
13
@eugeneyalt
eugeneyan
12 days
the quality of your writing/coding tells me the quality of your thinking. if it's bloated, disorganized, or has redundant parts or large gaps, it suggests the same of your thinking. and if it's because there were no edits or rewrites, then it suggests lack of effort.
2
2
14
@eugeneyalt
eugeneyan
26 days
if you adopt a beginner's mind and accept you don't know everything and learn how others have succeeded; if you let go of ego and look beyond how things currently are, even if you contributed to it; you'll see a lot of bullshit that we should remove and lots of room to improve
1
1
13
@eugeneyalt
eugeneyan
3 months
senpai karpathy noticed ❤️
Tweet media one
@charles_irl
Charles 🎉 Frye
3 months
In this post, @eugeneyan, @BEBischof, @HamelHusain, @jxnlco, @sh_reya & I share our tactical tips for working with LLMs, from structured outputs to caching. Stay tuned for two more posts covering the operational (hiring, product) & strategic (durability, competition) perspectives
Tweet media one
2
23
208
0
0
14
@eugeneyalt
eugeneyan
4 months
@HamelHusain wait a minute, it’s jAson and Hyaml 🤯
2
1
11
@eugeneyalt
eugeneyan
1 month
some non-ml folks ask me for a list of papers to get into ml, and when I point them to applied-ml, they ask if i can suggest a handful to focus on. i don't think they understand: the list keeps growing; you need to keep reading and learning. I read 3 - 5 a week and barely keep up.
1
2
12
@eugeneyalt
eugeneyan
3 months
possibly useful resources • evals as system design pattern: • evals & hallucination for summarization: • practical evals for common tasks: • building a hallucination classifier:
1
0
11
@eugeneyalt
eugeneyan
4 months
company event; my table had 9 ladies + me. brought cheesecake to table; she ate most of it♥️ she kept playing mobile games💕 she played scramble with friends; i started dueling her. after 2 weeks, got my chance: i formed RAMEN & asked if she liked it. first date. 12 yrs ago.
@coastalmama_
Mrs. G 🕊️
4 months
Please rt with the story of how you met your s/o. I want to know. Even if it’s “not interesting” 🫶🏼
Tweet media one
648
446
4K
1
0
12
@eugeneyalt
eugeneyan
16 days
wow when did i get a knowledge panel on google search
Tweet media one
1
0
12
@eugeneyalt
eugeneyan
2 months
it's tough competing with someone who's having fun
1
0
12
@eugeneyalt
eugeneyan
2 months
meta = zerg, openai = terran, anthropic = protoss
1
0
12
@eugeneyalt
eugeneyan
4 months
embarrassed to say i barely read books now. mostly been reading papers, writing notes, and sharing what i learn online (to reinforce my learning)
@rakyll
Jaana Dogan ヤナ ドガン
4 months
Industry wide AI burnout is real. People don't talk enough about it because they are too burned out to talk about it.
11
32
303
2
0
12
@eugeneyalt
eugeneyan
2 months
taking part in an ai hackathon, and wow, compared to this time last year, we've come a loooong way.
3
0
12
@eugeneyalt
eugeneyan
3 months
embrace the beginner's mind
Tweet media one
@AndrewArruda
andrew arruda 🎸
3 months
"your ability to keep doing interesting things is your willingness to be embarrassed again and go back to step one and start as a beginner and get your ass kicked and look stupid doing things." - @finkd
24
369
3K
1
2
10
@eugeneyalt
eugeneyan
2 months
skyline trail at mount rainier today 🥾🏔️
Tweet media one
Tweet media two
Tweet media three
Tweet media four
2
0
11
@eugeneyalt
eugeneyan
2 months
@HamelHusain saw that tweet; should have ignored and moved on. but i didn't, and now it's half hr past my usual gym time lol
2
0
11
@eugeneyalt
eugeneyan
4 months
this video made my day 🤣 punchline is pure alpha
Tweet media one
@granawkins
Grant♟️
4 months
sota RAG in 2024
64
233
1K
1
0
11
@eugeneyalt
eugeneyan
1 month
💯 working with LLMs over prolonged periods can make building / debugging data pipelines, conventional ml, recsys, and even doing on call, fun and something to look forward to lol
@_xjdr
xjdr
1 month
had a whole day of just normal ass API programming and slinging around protos like the old days. Turns out when you spend most of your time working with and programming for AI, regular old "deterministic programming" can be fun and kind of easy by comparison
5
1
112
1
0
11
@eugeneyalt
eugeneyan
2 months
> Further, it feels like base models are getting good enough that improvements to finetuning methodology don't matter much. never understood finetuning llamas and mistrals to improve on general benchmarks like mmlu or gsm8k. feels a bit like kaggle?
@Euclaise_
Jade
2 months
I'm getting bored. Nous is amazing, but it's getting harder to find areas of overlap between my interests and theirs. Further, it feels like base models are getting good enough that improvements to finetuning methodology don't matter much. I'm not sure what to do
29
5
207
2
0
11
@eugeneyalt
eugeneyan
5 months
read this book years ago and figured out this is what I'll do
Tweet media one
@NirantK
Nirant
5 months
Reminder that the best thing you can do for your career is get so good that the best people find you and want to work with you Everything else: offer shopping, leetcode grindmax is less efficient
7
38
422
1
1
11
@eugeneyalt
eugeneyan
3 months
help i’d like to report a murder
Tweet media one
0
0
11
@eugeneyalt
eugeneyan
4 months
@jxnlco what are the requirements and why wasn’t this PM consulted? have you spoken to these 9 teams on how to reuse / integrate their systems? can you share the results of the user study research? did you get the approval of these VPs? if you want rate limit increase i need CEO approval
0
0
11
@eugeneyalt
eugeneyan
4 months
recently, i asked a mentor for advice, expecting them to tell me “North”. to my surprise, they said the opposite, “South”, which was the harder, more uncomfortable path. i acted on it; in hindsight it was right. lesson: find mentors who shake you out of your local optima/comfort zone
2
2
10
@eugeneyalt
eugeneyan
4 months
working with people who have different values is tiring. in the end there'll be people who get taken advantage of.
1
0
10
@eugeneyalt
eugeneyan
1 month
i may have rediscovered the joy of lego
Tweet media one
Tweet media two
Tweet media three
Tweet media four
@eugeneyalt
eugeneyan
1 month
My Lego Star Wars X-Wing is here! Which would you do over the weekend? • Build the X-Wing • Work through backlog of papers on Model-as-Judge
4
0
1
0
1
10
@eugeneyalt
eugeneyan
5 months
@jxnlco > how do you encourage scientists or engineers to do the analysis/evals/prototype before implementation • make evals part of the product spec • make a baseline mandatory • share how simple always does better
2
0
6
@eugeneyalt
eugeneyan
3 months
@sh_reya Shreya, you gonna put me out of a job
0
0
9
@eugeneyalt
eugeneyan
2 months
if you say “let me know how i can contribute” while in a hackathon team, ngmi ☠️
2
0
10
@eugeneyalt
eugeneyan
6 months
@HanchungLee @lucy3_li 💯 prototype and iterate with LLM APIs till fit, then (maybe) scale with finetuning and self-hosting. the reverse often leads down a rabbit hole.
2
1
4
@eugeneyalt
eugeneyan
3 months
now i gotta rerun all my evals before the big gathering next week so i can seek advice more effectively lol
@AnthropicAI
Anthropic
3 months
Introducing Claude 3.5 Sonnet—our most intelligent model yet. This is the first release in our 3.5 model family. Sonnet now outperforms competitor models on key evaluations, at twice the speed of Claude 3 Opus and one-fifth the cost. Try it for free:
Tweet media one
448
2K
7K
0
0
10
@eugeneyalt
eugeneyan
5 months
• psych degree • internship: opened a patisserie, popular with 🌈 • govt: negotiated FTAs • ibm: only analyst without tech degree • e-com startup: no. 1 in southeast asia; acquired by alibaba • healthtech: predict & prevent chronic diseases • now: selling 📚 at a bookstore
@jxnlco
jason liu
5 months
Maybe I’ll write about my unconventional time at Waterloo. - physics lab - NYU epidemiology - action iq doing enterprise sales and forward deployed - stitchfix multimodal ai and search - facebook public safety and risk All by the time I was 22. - went to hackathons every
12
35
697
0
0
10
@eugeneyalt
eugeneyan
2 months
lol maybe we overdid it for evals
Tweet media one
2
0
9
@eugeneyalt
eugeneyan
3 months
@eugeneyalt
eugeneyan
3 months
just do your best and have fun; disregard critics and cheerleaders alike.
2
1
9
1
0
9
@eugeneyalt
eugeneyan
6 months
Call me thickheaded but I don't understand why a deeply researched, 40-min writeup on synthetic data gets less engagement than a 5-min hastily written post on unit testing 🫠
Tweet media one
5
0
8
@eugeneyalt
eugeneyan
1 month
every now and then i pause and reflect on my work and wow it’s crazy hard. the level of ambiguity, and how it extends beyond science and engineering into business and product and organization is 🫣 i guess best not to think too much about it lest i get paralyzed with fear 🤣🫠
1
0
8
@eugeneyalt
eugeneyan
2 months
crap, is this style of video gonna do to ai engineers what it did to PMs?
@cto_junior
TDM (e/λ)
2 months
We're definitely in a bubble
Tweet media one
41
65
3K
2
0
9
@eugeneyalt
eugeneyan
3 months
what an exhilarating first half of the year. learned a lot, built a lot, shared a lot. gained several new friends, a few which will likely be 20-year friendships. kinda exhausted for now and can’t wait for my recharge next week 🪫
0
0
9
@eugeneyalt
eugeneyan
6 months
just built my own, stitching together opus, vapi, and a vonage virtual number. took under an hour. it's a golden age for tinkerers and builders.
@swyx
smol swyx
6 months
I've now had multiple >20min phone calls with AI therapists and it feels completely natural. Every AI Engineer should be building their own therapist rn, and voice is the right medium. forget typing. go on a long walk and talk thru your day, your childhood, your dreams,
Tweet media one
65
49
771
3
2
8
@eugeneyalt
eugeneyan
3 months
@jxnlco the answer is in the map: "Toyota"
1
0
9
@eugeneyalt
eugeneyan
2 months
the difference between academic papers and industry write-ups is that the former’s finish line is reporting (loose) metrics like “54.9 spearman p” on static, overfitted data while the latter’s starting line is actually shipping systems for customer-facing experiences at scale
1
0
9
@eugeneyalt
eugeneyan
4 months
garlic butter with a side of shahi paneer
@gordic_aleksa
Aleksa Gordić 🍿🤖
4 months
ML interview question: what's your favorite nan?
12
0
13
3
0
8
@eugeneyalt
eugeneyan
3 months
i was hoping to learn from others how they solved evals. now i realized that everyone’s struggling too and i’m probably ahead of the curve. crap it’s gonna be tough.
@eugeneyalt
eugeneyan
3 months
before this weekend i knew evals were important; after the weekend i realized they were a bottleneck for almost all other teams and probably the most pressing issue for building reliable and usable ai products.
2
0
19
0
0
9
@eugeneyalt
eugeneyan
3 months
And this is my favorite sentence in the writeup 🤣 Read it here:
Tweet media one
@charles_irl
Charles 🎉 Frye
3 months
My personal favorite in this section: "the rumors of RAG's demise are greatly exaggerated."
Tweet media one
4
9
100
1
1
9
@eugeneyalt
eugeneyan
3 months
looking at my nvda friends who tried to hire me and i declined 😅
@benln
Ben Lang
3 months
If you joined Nvidia 5 years ago as a mid-level product manager with an annual $70K stock grant over 4 years, just that initial grant would be worth ~$10.6M today.
133
1K
16K
2
0
8
@eugeneyalt
eugeneyan
16 days
for the first weekend in years, i don't have a topic i'm dying to write about, or an idea i really want to build. so, at risk of coming across as arrogant, what's something you want me to write on or share? (alternatively, i'll revisit senpai k's class or build a labeling tool)
7
0
8
@eugeneyalt
eugeneyan
4 months
like how 10 harkonnen troopers don’t stand a chance against a single Fremen
1
0
8
@eugeneyalt
eugeneyan
3 months
there are people who know how to talk about the thing, and there are people who know the thing—be the latter.
1
1
9
@eugeneyalt
eugeneyan
25 days
lgtm
Tweet media one
2
0
8
@eugeneyalt
eugeneyan
2 months
okay a third of y’all burnt out
@eugeneyalt
eugeneyan
2 months
i know several folks in my ai practitioner circle who are slightly burned out from learning, building, shipping ai products. quick poll: are you feeling burned out for AI?
1
0
0
2
0
8
@eugeneyalt
eugeneyan
3 months
there are people who're mostly talk and there are people who actually deliver—be the latter
2
0
8
@eugeneyalt
eugeneyan
4 months
llms raise the ceiling of expectations more than the floor of capability
3
0
8
@eugeneyalt
eugeneyan
4 months
💯 tech is the easy part; herding humans is much, much harder.
@shreyas
Shreyas Doshi
4 months
It is easy to make fun of PMs & engineers in big companies who are just launching features, moving pixels around, and are focused on “outputs over outcomes”, but you must understand that some environments are so messed up that a launch (really _any_ launch) is by itself a huge
27
96
716
0
0
8
@eugeneyalt
eugeneyan
3 months
interesting comparison between alt and main
Tweet media one
Tweet media two
4
0
8
@eugeneyalt
eugeneyan
2 months
you can’t change the people around you but you can change the people you’re around
1
0
8
@eugeneyalt
eugeneyan
4 months
interesting to learn where in the embedding space i’m in. more influencers than i’d like though 🤣
Tweet media one
0
0
8
@eugeneyalt
eugeneyan
2 months
if agi was imminent, would K senpai be building a startup that teaches people how to build towards agi?
1
1
8
@eugeneyalt
eugeneyan
3 months
the size of a person is revealed by the size of what they consider problems
0
0
8
@eugeneyalt
eugeneyan
2 months
care for those who care a lot, for love and camaraderie. work with people who work hard, to do your best work. surround yourself with folks who have fun, to enjoy life. quarantine yourself from haters, karens, grifters, status seekers, freeloaders, drama mamas, negative nancies, etc
0
0
8
@eugeneyalt
eugeneyan
5 months
Me: <to my wife> "Wow look at this feedback" Wife: "Yea, I'm surprised people actually listen to three old men rambling on for hours with no agenda" 🫠
@eugeneyan
Eugene Yan
5 months
Did not expect this OH experiment to resonate so much. Thanks to everyone who's provided feedback, either via DMs or during the stream ♥️
Tweet media one
2
2
17
1
0
8
@eugeneyalt
eugeneyan
2 months
home sweet home. best sleep i’ve had all week, though still lots of sleep debt to catch up on
Tweet media one
1
0
8
@eugeneyalt
eugeneyan
4 months
chatbots are such a pain to build esp if they have to respect the existing knowledge and act on it beyond just returning text
3
0
8
@eugeneyalt
eugeneyan
6 months
@HanchungLee @lucy3_li finetuning cost is usually a small fraction of inference cost, and inference of small classifiers is orders of magnitude cheaper and lower latency than medium sized LLMs
2
0
3
@eugeneyalt
eugeneyan
6 months
n years of experience may not mean the same thing for different people. for some it’s a solid foundation and virtuous cycle that accelerates their learning and future work. for others it’s outdated knowledge, bad habits, and hubris that holds them back and must be unlearned.
1
0
8
@eugeneyalt
eugeneyan
4 months
@jxnlco the strongest are the kindest
2
1
8
@eugeneyalt
eugeneyan
3 months
i just read the most amazing doc by an L7 that succeeded in saying a lot without saying anything. full of jargon and obviously written in a vacuum, divorced from data, analysis, or prototyping. note to self: don’t write like that
1
0
8
@eugeneyalt
eugeneyan
4 months
Folks ask me why I don't just read a summary. > "Reading and writing notes on papers is how I learn."
@chrisalbon
Chris Albon
4 months
I don't want an AI to create notes on the new AI papers comings out. Reading and writing notes on papers is how I learn. I want an AI to find all the most important papers coming out and automatically add them to Zotero.
9
8
75
1
0
8