V

@_VatsaDev_

432
Followers
236
Following
96
Media
1,808
Statuses

CS Undergrad, head @lucky86877 🎋🎍 prev intern @moondreamai @nousresearch @so_claros

The residual stream
Joined April 2021
Pinned Tweet
My work helped in this!!! never even thought about non-text data until a few months ago, and never did I think it would be a part of a model this good!
@vikhyatk
vik
15 days
Genuinely shocked at how much better moondream has gotten at understanding text... it could barely read anything back in March.
Tweet media one
9
7
134
1
0
15
@_VatsaDev_
V
2 months
@mcneilly_alex Unspoken MIT rizz moment confirmed
0
0
82
@_VatsaDev_
V
2 months
My dad absolutely KO'ed my programming as usual: python DSA -> nah boi he used pascal three.js -> nah he made a torus in c++ use nvim -> oh yeah I remember using that when I debugged errors in C arch btw -> oh you mean formatting linux at 3am and rawdogging the servers at 6 ...
4
0
68
@_VatsaDev_
V
3 months
ML grind day 3 - presented some RNA research work at prof group meet - kept going through nnfs - on jjk ch 190 Need to get into more projects
Tweet media one
1
1
53
@_VatsaDev_
V
2 months
Looking for an internship, contact me if you have a spot. Open to general cs, ml or robotics roles. I've worked on projects at nous research and optimized product search for I also have research lab experience in bio AI
4
6
51
@_VatsaDev_
V
3 months
growth by getting all the highbie chads
Tweet media one
1
0
30
If asked why bullish on @cohere :
Tweet media one
3
0
24
@vikhyatk You've literally announced m87's existence please give the company a linkedin profile so my best internship isnt a default building 😭😭😭
1
0
22
@_VatsaDev_
V
3 months
I've Finally got the blog post up! take a look at some experiments on layers and layer width at thanks to @cto_junior @notyamisukehiro @twofifteenam for reviewing it over and making it much better than I could have alone.
2
3
20
Just turned 18 yesterday
7
0
17
Everybody else: Moondream is wayyy too good for its size how tf did they do this meanwhile the channels:
Tweet media one
3
0
16
@_VatsaDev_
V
2 months
Finally out of Vacay mode, locked in again: skill maxxing day 12: - learning Windows x86 Assembly and Ocaml: (legit asm is a footgun)
Tweet media one
3
0
16
Google deepmind screenAI best case (left) v. my stuff at moondreams worst case (right)
Tweet media one
Tweet media two
0
0
15
me looking at myself compared to boon, damn i'm dying in the lowbie trenches out here:
Tweet media one
@iamyourboon
boon
11 days
anyway, curious what the payout is, finally eligible
Tweet media one
3
0
14
2
0
15
speaking of this @iamyourboon you should call up google deepmind and see what they do
CMU+quant really hits different huh
1
0
10
1
0
15
I've hit 300, ama for my followers, my chronic followers: @Kshitijjkapoor @iliekcomputers @angkul07 @snowclipsed @wateriscoding
5
0
15
@_VatsaDev_
V
7 months
Hmm @Euclaise_ @4evaBehindSOTA @Teknium1 @NousResearch Lilith looks promising at scale, maybe useable for future training runs?
Tweet media one
1
2
14
@_VatsaDev_
V
2 months
Jane street! Ocaml! -> nah mf we used fortran, that stuff was fun bit manipulation -> that stuff great in cobol I'm starting an AI research lab -> haha nice me and the bois used to research compilers at 2am too state school CS :( -> (NIT undergrad->IIT masters) I cannot winnn
1
0
14
Bro wants to work at moondream maybe??? 🥹🥹🥹
@snowclipsed
snow
1 month
went through the entire moondream repo today, finally got a plan down on how to go all about it in pic: struct for weights
Tweet media one
8
2
69
1
0
13
Mid level in three months, the mans a programming beast
@iamyourboon
boon
14 days
first, i am a new grad (i'm 3 months out of college), i want to maximise learning and becoming a subject matter expert in something i enjoy, swe is more versatile, engaging, and something i am more personally passionate about
3
0
102
1
0
13
I have questions. 1 yr and L5? 320k tc is a downgrade after 1yr nahhh im so dead 😭
@iamyourboon
boon
20 days
one downside is i actually downgrade my pay and leave a life of trading behind 🫡 (which is better for wlb, worse for being unable to retire in ten years)
4
0
124
1
0
12
@_xjdr That has got to be the stupidest take I've ever seen, guarantee this guy thinks OAI=GPU, google=TPU, and nothing else. If I train a transformer with the same settings on a t4 or an h100, it's going to be 99% the same
1
0
13
@_VatsaDev_
V
2 months
Jeremy howard follows my lab wtf got to ship faster now
2
0
13
@_VatsaDev_
V
2 months
Skillmaxxing day 15: - mostly a planner day, reflections on the AIMO - converting pythonic OOP to Ocaml is kind of weird - is the most satisfying file name ever new nvim theme fits though, monokai to chadracula is nice
Tweet media one
1
0
13
is there like less competition for these or something? Is this a target-uni only thing/special stuff they notice you by? @iamyourboon @anpaure I need to know 🥹
@iamyourboon
boon
1 month
also ppl be looking down on citadel fall/spring internships, it’s just 85% waterloo kids
0
0
22
2
0
12
@_VatsaDev_
V
3 months
ML grind day 2 - 40 problems solved at project euler - making my first neuron all over again - started reading jjk, im 100 chapters in
Tweet media one
1
0
13
@_VatsaDev_
V
3 months
omg thanks for the god raichu follow @twofifteenam
Tweet media one
2
0
12
@_VatsaDev_
V
3 months
Finally have a follower/following ratio > 1, AMA my 3 viewers?
3
0
11
Bullish on Noam arch coming to Gemini @_xjdr thoughts?
@cto_junior
TDM (e/λ)
1 month
Suddenly bullish on Gemini Most likely they'll hire Ilya as well once Sam takes away the Figure AI bots keeping tabs on him
Tweet media one
2
0
47
1
0
11
nah my employer got noted its over, his reps ruined get the allegations out there now
@vikhyatk
vik
1 month
gotta admit he has a point
Tweet media one
261
745
20K
1
0
11
CMU+quant really hits different huh
@iamyourboon
boon
20 days
good morning, twitch called, got the job, do I be a twitch engineer??
77
1
1K
1
0
10
@_VatsaDev_
V
3 months
ML grind day 3 - more in NNFS, stuck building simple nn's for now, must accelerate - Did some computational biology research work - JJK ch 155!
Tweet media one
1
0
10
22yrs at Amazon ... damn bro probably looks at the juniors like "I can smell the burnout in you"
@GrantSlatton
Grant Slatton
18 days
LinkedIn resume of the most talented engineer I ever worked with at AWS Confirms all my priors
Tweet media one
65
508
18K
2
1
9
@_VatsaDev_
V
2 months
I'm safe, I'm not unpaid 🥹🥹🥹🥹🥹
@vikhyatk
vik
2 months
@adamcohenhillel unpaid interns are the backbone of the economy and I will not be taking further questions at this time
1
0
14
0
0
9
@_VatsaDev_
V
2 months
Introducing @lucky86877 , or Lucky Bamboo research labs. The names from lots of bamboo metaphors, like NNs being grown and a model picking responses is like one path of many in the bamboo maze. We are looking at model evals, lora, model interfaces, and maybe some mech interp ...
3
0
9
The duality of man
Tweet media one
0
0
9
@_VatsaDev_
V
2 months
Skillmaxxing day 11: - mostly on 1 ML experiment, blog post is now up: - more fiddling with nvim, the lower ram usage is nice.
Tweet media one
1
0
9
@_VatsaDev_
V
3 months
@anpaure economists liked this tweet
0
0
9
guy in my lab running randomized hyperparameter search and its just kept improving for 3 wks now (improving at least 1k times in a row) is he the hakari of neural nets???
3
0
8
Legendary shitposting out here @cto_junior
Tweet media one
1
0
8
@alexthegrreat Goddamm I have never seen such a blanket "sucks will never work" and then anyone who sounds like they know what they're doing straight up agree with you
1
0
8
@_VatsaDev_
V
3 months
@iamgingertrash @egrefen @ssi @nearcyan NFGD has some great ones? Figma/Stripe/11labs/wnb/suno/perplexity/pika?
0
0
8
@_VatsaDev_
V
3 months
@LatentPepe @teortaxesTex @AkshGarg03 @siddrrsh @OpenBMB bruh this wrong on so many levels. how does one flirt with llama-3v?
0
0
8
@_VatsaDev_
V
2 months
The guy asking the question died when he realized stable people on sf are real
@twofifteenam
Telt 🍕
2 months
0
0
8
0
0
8
@_VatsaDev_
V
2 months
I was looking for pokemon red "source code" as an 8 yr old and found a github repo full of asm and c. 8 yr old me ran away, only to use python 2 days later I do ml now ig
@github
GitHub
2 months
How did you get into coding?
927
131
2K
0
0
8
MIT students realizing they have free riches waiting for them
@mcneilly_alex
mcneilly ☃️
1 month
holy moly
Tweet media one
18
4
685
0
1
7
@_VatsaDev_
V
3 months
@yacineMTB I've been bullish on them after the command R releases, been really strong
0
0
7
@_VatsaDev_
V
3 months
skillmaxxing Day 5 - more NNFS+nvim, testing claude 3.5 - made a dataset of challenging math problems, around like 5k QA pairs of AMC->AIME->IMO level stuff, way too much webscraping - Up to date with JJK now, I see why all the stans hype this everything cooks peak
Tweet media one
0
0
7
We mogging bois
@vikhyatk
vik
1 month
Tweet media one
1
0
5
0
0
7
@_VatsaDev_
V
2 months
Average 1AM+claude+neovim
Tweet media one
0
0
7
@_VatsaDev_
V
7 months
These are Lilith runs at bs(batch_size) 48, versus an AdamW run at bs 180, the Lilith step takes 70 ms vs adam taking 300 ms per step. 4x less memory for batches while 5x faster.
Tweet media one
0
0
6
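The figures in the Lilith tweet above can be put on a common footing by converting each run to samples per second. This is a minimal sketch that uses only the numbers quoted there (batch size 48 at 70 ms per step, batch size 180 at 300 ms per step); the helper name `samples_per_second` is hypothetical, and nothing here measures memory or model quality.

```python
def samples_per_second(batch_size: int, step_ms: float) -> float:
    """Throughput in samples per second, given batch size and step time in ms."""
    return batch_size * 1000.0 / step_ms


# Figures quoted in the tweet; no other measurements are assumed.
lilith = samples_per_second(batch_size=48, step_ms=70)
adamw = samples_per_second(batch_size=180, step_ms=300)

batch_ratio = 180 / 48  # how much smaller the Lilith batch is
step_ratio = 300 / 70   # how much shorter the Lilith step is

print(f"Lilith: {lilith:.1f} samples/s, AdamW: {adamw:.1f} samples/s")
print(f"batch ratio {batch_ratio:.2f}x, step-time ratio {step_ratio:.2f}x")
```

On these numbers the batch is 3.75x smaller and the step is about 4.29x shorter, so the per-sample throughput of the two runs is much closer (roughly 686 vs 600 samples per second) than the per-step timing alone suggests.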
@_VatsaDev_
V
2 months
The influence grows
Tweet media one
0
0
7
HRT seems to grab every non jane street
@mcneilly_alex
mcneilly ☃️
13 days
in other news...
Tweet media one
8
5
131
1
0
7
It'd be hilarious if @vikhyatk wrote a "what the interns have wrought" for m87, but it's just me pretending to be worth 3 jane street interns
@yminsky
Yaron (Ron) Minsky
1 year
Once again, it's time for Jane Street's "what the interns have wrought" post.
4
18
149
0
0
7
@_VatsaDev_
V
3 months
This is a real question guys. @dejavucoder @cto_junior @justalexoki @untitled01ipynb you're the best shitpoasters I know please help me.
@_VatsaDev_
V
3 months
@LatentPepe @teortaxesTex @AkshGarg03 @siddrrsh @OpenBMB bruh this wrong on so many levels. how does one flirt with llama-3v?
0
0
8
4
0
7
@_VatsaDev_
V
8 months
@burkov No 16k and 32k are just as possible, just more compute hungry, which means people like Google/OAI do it. Now when you see 128k or 200K, thats prob. rope/alibi, etc.
0
0
7
@_VatsaDev_
V
2 months
personally I'm really worried over the cycle that is: -> dpo the models making you look as good as possible -> it actually sucks on you irl -> consumer frustration -> make the model realistic -> "why does the model make ugly now ?!" ->
@cto_junior
TDM (e/λ)
2 months
Shopping at Zara online is going to be so much fun in a few years
Tweet media one
Tweet media two
20
80
1K
1
0
7
chaddream not moondream
@vikhyatk
vik
1 month
i have trained the same model 300 times so far this year
19
1
145
0
0
6
ML bros and Algo bros need to learn from minecraft bros @snowclipsed @iamyourboon probably get this
@yungnickyoung
YUNG
1 month
first day implementing some of the pieces for the Better Mineshafts redesign. Got a basic version of the surface entrance + main shaft working. Integration with surroundings in the main shaft is already looking wayyy better imo. Thoughts?
36
94
1K
2
0
7
@giffmana creating the real startup reef here, from doubtsuu-VLM to @moondreamai
0
1
7
@_VatsaDev_
V
3 months
@lelouchdaily @wateriscoding @gi0nyx @angkul07 I'm finally starting my own daily progress log too. Trying to get cracked by end of the summer and college Day 1 - Started reading through/working on NNFS ch 1 - Started on the Euler project, want to finish by the end of summer
4
0
6
imagine the waterloo intern at citadel on his way to invent the world's fastest rejection setup, parses resume in 1 nanosec, rejects you the other
@anpaure
anpaure
5 days
job application rejection world record
Tweet media one
74
150
4K
2
1
6
@_VatsaDev_
V
6 months
Got transformers, nothing bigger than 20m parameters, doing math
0
2
6
@_VatsaDev_
V
3 months
Skillmaxxing Day 8: - the blog post is out! - Almost done with the paper for the neurips highschool track - AIMO looks like i get a bronze medal ig? (23) - Still working on the sass idea, describe it when its more fleshed out
0
0
6
If I have to measure up with this guy im: freshman, noname uni, 2x ML startup internships, 1x ML+product+search internship, offer from stanford startup I'm not "him", but I'm getting there
@anpaure
anpaure
1 month
he is why you don't have a job
Tweet media one
65
106
6K
0
0
6
@_xjdr look id just like to know what 100 mil ctx effectiveness even is here, theres no evals for anything that big, and I cant even think of what 100m ctx len data even looks like.
3
0
6
@_VatsaDev_
V
2 months
great things happening out here
@vikhyatk
vik
2 months
> train the same model 3 times with slightly different data mixes > average the checkpoints > better performance than any individual checkpoint what is this black magic?
97
28
1K
0
0
6
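The checkpoint averaging described in the quoted tweet can be illustrated with a small sketch. It assumes each checkpoint is a plain mapping from parameter name to a flat list of floats and that all runs share the same architecture; the names `Checkpoint` and `average_checkpoints` are hypothetical and this is not taken from the moondream codebase.

```python
from typing import Dict, List

# Hypothetical checkpoint format: parameter name -> flat list of weights.
Checkpoint = Dict[str, List[float]]


def average_checkpoints(checkpoints: List[Checkpoint]) -> Checkpoint:
    """Return the element-wise mean of several checkpoints with identical shapes."""
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    reference = checkpoints[0]
    for ckpt in checkpoints[1:]:
        if ckpt.keys() != reference.keys():
            raise ValueError("checkpoints must share parameter names")
        for name in reference:
            if len(ckpt[name]) != len(reference[name]):
                raise ValueError(f"shape mismatch for parameter {name!r}")
    n = len(checkpoints)
    averaged: Checkpoint = {}
    for name in reference:
        # Group the i-th weight of every checkpoint together, then average.
        columns = zip(*(ckpt[name] for ckpt in checkpoints))
        averaged[name] = [sum(column) / n for column in columns]
    return averaged


# Three toy "runs" of the same model, standing in for different data mixes.
runs = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [2.0, 4.0], "b": [3.0]},
    {"w": [3.0, 6.0], "b": [6.0]},
]
merged = average_checkpoints(runs)
print(merged)
```

With real models the same idea would apply per tensor rather than per list, and whether the averaged weights outperform the individual runs is an empirical question the sketch does not answer.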
@_VatsaDev_
V
3 months
googling @_sholtodouglas just brought up the other dude this was unexpected:
Tweet media one
Tweet media two
1
0
6
@_VatsaDev_
V
2 months
Everyday I yearn to learn Jax and fill out the TPU RC g-form and use the so-called legendary stability in these TPUs
@_xjdr
xjdr
2 months
i had to venture back in to GPU cluster training again these past few weeks. I figured "Lets do what the people do, pytorch, triton, FSDP, Ray" ... on the rare occasion that all my accelerators were available for training, it worked ok, i guess? HOW DO Y'ALL LIVE LIKE THIS?!
5
0
83
0
0
6
@agihippo google engineers otw to shard their kid across 1024 TPU slices and pool him back together again (child speedruner)
0
2
6
@_VatsaDev_
V
3 months
@quantymacro damn @quantymacro if a quants TC cant afford this...
1
0
6
@mcneilly_alex another mit student caught by the quant bait
0
0
6
@_VatsaDev_
V
3 months
I'M NOT WORKING ENOUGH
@snowclipsed
snow
3 months
I'M NOT WORKING ENOUGH
1
0
11
0
0
6
@_VatsaDev_
V
2 months
skillmaxxing day 14: - More Ocaml, check out: - Reflecting and regearing around AIMO, hope to improve on our score from here on with our own thing.
0
0
6
@_VatsaDev_
V
2 months
Malloc fails here
@nahrzf
nahr (yapping)
2 months
building ur mom in c from scratch
34
89
1K
0
0
6
bros pfp change was so big I thought a new anon happened
@iamyourboon
boon
5 days
new killua pfp vibe
Tweet media one
31
1
263
0
0
6
Literally this but my dms are full of images with all 8 transformer paper authors and the rest have no rizz while aidan gomez has rizz
@1vnzh
Ivan Zhang
22 days
If Aidan Gomez has a million fans, then I am one of them. If Aidan Gomez has ten fans, then I am one of them. If Aidan Gomez has only one fan then that is me. If Aidan Gomez has no fans, then that means I am no longer on earth. If the world is against Aidan Gomez, then I am
6
4
103
1
0
6
wait damn hes actually former citadel @graffioh @Aryvyo @arpitingle this is peak alpha knowledge, hack your nearest google datacenter today, intercept that request, get that job people
@binalkp91
binal
5 days
@_VatsaDev_ Colocate your rejection server at Google’s data center to reduce latency
1
0
3
2
0
6
@_VatsaDev_
V
2 months
@dejavucoder Bro finishes projects instead of feature creep wtf
1
0
6
literally me rn
@jrysana
John
1 month
crazy how people will publish a research paper and its just blatantly false
15
5
103
2
0
6
This is what Non-MIT kids write on their cover letters to get the OAs
@nearcyan
near
1 month
when i worked at jane street i earned the nickname jane for my proficiency in predicting the street
4
2
187
0
0
6
@_VatsaDev_
V
3 months
@yacineMTB Yess yacine join the fellow desis man
0
0
6
@_VatsaDev_
V
2 months
My top website visitors are split between America and China (both countries need my ml alpha) but there's 31 ugandans? what are you doing?
0
0
5
@_VatsaDev_
V
2 months
Ocaml is kind of quirky, but nice gist with Intro code to like learn the very basics of Ocaml, getting used to syntax and stuff. Also Ocaml needs to become the machine learning language, it's literally got the mandate of heaven with the .ml extension
1
0
5
@_VatsaDev_
V
2 months
The avg @kaiokendev1 or @qtnx_ tweet is either peak esoteric experiment content or good theorizing. then half the replies are just "quantum blockchain tokenizer incoming" Its saddening
0
0
4
@_VatsaDev_
V
3 months
What if claude 3.5 is just claude 3 with all the logic/reasoning/math neuron activations boosted?
0
0
5
@_VatsaDev_
V
3 months
@mcneilly_alex Dude most of them are crypto shills. You can tell when the pfp is also just boredape/pepe/pixelboy with the .eth
0
0
5
This is what creator of llama file thought lmao
@vikhyatk
vik
1 month
gotta admit he has a point
Tweet media one
261
745
20K
0
0
5
@_VatsaDev_
V
3 months
Thread thoughts: Pytorch visualizer - like penzai? whats it monitor? Pytorch to Manim - ooh thats epic if it works, though you might need paper for eqns Kino Compass - yess if only this existed the main time was called the grandline so I could just have one piece tpot ...
@snowclipsed
snow
3 months
some ideas i am working on this summer
Tweet media one
37
16
927
3
0
5
There is no way the rich dude who can summon most waterloo with a tweet can complain about unfinished project ideas, 1 day spent articulating and posting them, and it's done
@natfriedman
Nat Friedman
19 days
I have too many project ideas
160
41
1K
0
0
5
@_VatsaDev_
V
2 months
Thundermoon release is out!!!!
@vikhyatk
vik
2 months
Released a new version of moondream today, with significant improvements in OCR and document understanding! You can actually run this model locally, unlike the other one that dropped today. 😅
Tweet media one
24
46
453
0
0
5
@_VatsaDev_
V
2 months
The world if pytorch had q1/q2/q4 datatypes
@vikhyatk
vik
2 months
the world if ML libraries preserved backward compatibility over minor version updates
Tweet media one
1
1
12
0
0
5
@_VatsaDev_
V
5 months
All right its official I need to build and run the pipelines, weekend lock in.
Tweet media one
1
0
5
there's two wolves that you choose from, @yacineMTB with 2070s mem constrained, or @_xjdr where you start doing the scaling and end at "our run will fail every 8 minutes"
@yacineMTB
kache
18 days
@_xjdr once a shill always a shill (they do just work)
1
0
3
0
0
4
@_VatsaDev_
V
3 months
Late day 4 post, didn't get to do much - cont. on NNFS - started using neovim, my ram is thanking me after running vs bloat (I don't even know why its slow I have a new laptop and 5 extensions) - working on Mech Interp, have my own lab when the paper happens - jjk ch 200
Tweet media one
0
0
4
@_VatsaDev_
V
3 months
@snowclipsed @teortaxesTex Ilya and the cult was the magic? Sama is money man - eliminated Murati management - gone GDB wonders - well they happened after the model was made, so also gone
0
0
5