Diego

@diegoasua

439
Followers
870
Following
222
Media
3,137
Statuses

💻 make neural nets go brrr

SF
Joined January 2019
@diegoasua
Diego
2 months
ChatGPT can't count "r"s @karpathy
Tweet media one
Tweet media two
20
6
174
@diegoasua
Diego
3 months
@AravSrinivas AFAIK Tesla was using convolutional nets (in particular RegNets) at least in 2021. I am not familiar with FPNs but I think they are based on convolutional math operations too. These are slides from @karpathy. I highly doubt they switched to something else
Tweet media one
11
5
114
@diegoasua
Diego
5 months
@karpathy To me this just goes to show how incredibly efficient PyTorch is even though it’s a general framework
1
0
50
@diegoasua
Diego
4 months
@soumithchintala @SkyLi0n This is a very McKinsey move
0
2
47
@diegoasua
Diego
2 months
@Teknium1 Forget about 405B if this is real 8B smokes everything else big time
5
0
42
@diegoasua
Diego
3 months
@NickADobos @10x_er No, he said science needs to be published. Publishing has nothing to do with shipping products or features. Eng and science are not the same dude
1
0
39
@diegoasua
Diego
4 months
@Wlo_oWl @felix_red_panda @Leoagua1 this. for starters when you analyze that data you do it at 1kHz and high- and low-pass filtered. That alone would be a huge compression (~50x) in almost real time. or Kalman filters. sending it unprocessed is plainly stupid. source: I used to do this for a living
1
0
34
@diegoasua
Diego
3 years
In the past weeks @facebookai has announced and open-sourced a number of models trained via self-supervision in a number of different tasks. And they are just AWESOME. e.g. - -
0
7
29
@diegoasua
Diego
6 months
Nvidia's new 950 GB/s is cool. Apple gives you 800GB/s with the M2 Ultra chip. Why is memory bandwidth so important in machine learning? 🧵
1
4
28
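One common back-of-the-envelope answer to the question in the tweet above: when a model generates text one token at a time at batch size 1, every weight has to be read from memory once per token, so memory bandwidth caps tokens per second regardless of available compute. The sketch below uses the two bandwidth figures from the tweet; the 8B-parameter model and 2 bytes per parameter (fp16/bf16) are assumptions chosen for illustration, and the function name is hypothetical.

```python
def max_decode_tokens_per_second(
    bandwidth_gb_s: float,
    params_billion: float,
    bytes_per_param: float = 2.0,  # assumption: fp16/bf16 weights
) -> float:
    """Upper bound on batch-size-1 decode speed when memory bound.

    Each generated token requires streaming all weights once, so the
    bound is simply bandwidth divided by the model size in bytes.
    """
    model_size_gb = params_billion * bytes_per_param
    return bandwidth_gb_s / model_size_gb


if __name__ == "__main__":
    # Bandwidth figures are the ones quoted in the tweet; the 8B model
    # size is a hypothetical example.
    for label, bandwidth in [("950 GB/s part", 950.0), ("M2 Ultra, 800 GB/s", 800.0)]:
        bound = max_decode_tokens_per_second(bandwidth, params_billion=8.0)
        print(f"{label}: at most ~{bound:.1f} tokens/s")
```

This ignores KV-cache reads, activations, and batching, all of which change the real number; it only shows why bandwidth, not FLOPs, is often the first limit hit.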
@diegoasua
Diego
3 months
@NickADobos @10x_er scientists don't ship products. Engineers do. Yann is a scientist.
1
1
26
@diegoasua
Diego
4 months
@levelsio why are Europeans so triggered by this tweet just face the truth and leave ego aside
4
0
23
@diegoasua
Diego
4 years
@3blue1brown @numberphile 's Provides a very neat visualization of the size of Graham's number
0
2
22
@diegoasua
Diego
4 months
@ikristoph @jakubtomsu_ i don't know you tell me
1
0
18
@diegoasua
Diego
5 months
@ekzhang1 two day projects are the best Linus wrote Git in C in a weekend
1
0
16
@diegoasua
Diego
4 years
I think @napari_imaging is going to be a real game changer
1
3
16
@diegoasua
Diego
1 year
@maggie_1i As someone who has done this before IMO limited audience + academic budget = no moneys
2
0
14
@diegoasua
Diego
10 months
@0interestrates Text is still there if u look into HTML
Tweet media one
1
0
14
@diegoasua
Diego
2 months
@karpathy @maurosicard I think V100s don’t even have bfloat16 support!
1
0
12
@diegoasua
Diego
4 months
@ikristoph @jakubtomsu_ could do early return pattern or error monad
0
0
11
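The code being discussed in this reply is not part of the thread as captured here, so the following is only a minimal sketch of the two options named, using a hypothetical port-parsing task. The first function uses early returns to avoid nested conditionals; the second chains the same checks through a small Result-style type (`Ok`, `Err`, and `bind` are illustrative names, not from any particular library).

```python
from dataclasses import dataclass
from typing import Callable, Optional, Union


# Option 1: early return pattern. Each failure exits immediately,
# so the happy path stays unindented.
def parse_port_early_return(text: str) -> Optional[int]:
    if not text.isdecimal():
        return None
    port = int(text)
    if not 1 <= port <= 65535:
        return None
    return port


# Option 2: a tiny error monad. Ok carries a value, Err carries a message;
# bind runs the next step only when the previous one succeeded.
@dataclass(frozen=True)
class Ok:
    value: int

    def bind(self, fn: Callable[[int], "Result"]) -> "Result":
        return fn(self.value)


@dataclass(frozen=True)
class Err:
    message: str

    def bind(self, fn: Callable[[int], "Result"]) -> "Result":
        return self  # short-circuit: skip all later steps


Result = Union[Ok, Err]


def parse_int(text: str) -> Result:
    return Ok(int(text)) if text.isdecimal() else Err("not a number")


def check_range(port: int) -> Result:
    return Ok(port) if 1 <= port <= 65535 else Err("out of range")


def parse_port_monadic(text: str) -> Result:
    return parse_int(text).bind(check_range)
```

The early-return version is shorter; the monadic version keeps the reason for failure and composes when more steps are added.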
@diegoasua
Diego
27 days
@alxfazio @GoogleDeepMind @Android Just to be clear this is not an end to end voice system. It’s a traditional VAD-STT-LLM-TTS system with an added layer for interruption.
3
0
11
@diegoasua
Diego
4 months
@Nick_Davidov @MartinShkreli i don’t know anyone who thinks they are easy
2
0
10
@diegoasua
Diego
3 months
@samschmitz Same with solar panels. This is the 2nd largest Chinese solar panel manufacturer and 2nd largest in the world.
1
2
11
@diegoasua
Diego
7 months
@Suhail What income bracket is this
1
0
10
@diegoasua
Diego
1 year
@AviSchiffmann Same experience, from 0 to hero in a week. GPT-4 is decent on Swift
1
1
10
@diegoasua
Diego
1 year
0
0
9
@diegoasua
Diego
6 months
@tsarnick Bureaucrats, unfortunately
1
0
9
@diegoasua
Diego
4 years
Interesting talk from @CellTypist @worldwideneuro . Combine biophysics and statistics to boost your models
Tweet media one
0
3
9
@diegoasua
Diego
4 years
I am so much in love with this 😍
@napari_imaging
napari🔬
4 years
We're also very excited about a new companion package `magicgui` that @TalleyJLambert has led the development of, which lets people create their own dock widgets without writing any GUI code
2
5
19
0
4
8
@diegoasua
Diego
6 months
@realSharonZhou @LaminiAI @AMD yay twice as much memory bandwidth than H100 at 5.3 TB/s with 30% more flops and so cheap? bring it on!
0
0
8
@diegoasua
Diego
11 months
@inflectionAI You guys rocking it. Improvements in transcription + new call mode even if locked + internet browsing. Neat.
0
0
1
@diegoasua
Diego
5 years
Recently I had a really stimulating discussion around the Claustrum on my MSc Viva. I appreciate the interest the examiners put into understanding my work and not only the limitations and failures of the project! This is how science should always work #ThankYou
1
0
6
@diegoasua
Diego
27 days
@alxfazio @GoogleDeepMind @Android It does not understand the emotional content of your voice.
1
1
7
@diegoasua
Diego
4 years
Life update: I moved to NY
0
0
7
@diegoasua
Diego
4 years
@dlevenstein May I answer with a question? In which species? Sleep has different functions in different species, slide from a talk by @vanswinderenlab
Tweet media one
1
2
7
@diegoasua
Diego
5 months
@amdradeon Great to hear this @__tinygrad__
0
0
7
@diegoasua
Diego
1 year
@gdb Mmm wdym Greg?
0
0
0
@diegoasua
Diego
4 years
Started using for taking notes. It is just beautiful! @excalidraw #opensource
0
1
7
@diegoasua
Diego
29 days
@getjonwithit I mean this is just rephrasing diagonalization with a more modern "Wolfram-sounding" framework. You used a self-referential statement that leads to a contradiction within the limits of what a formal system can prove.
0
0
7
@diegoasua
Diego
1 year
@wcathcart @FT Hey Will why did FT write this? TBF Meta should consider suing them
0
0
1
@diegoasua
Diego
4 years
@phant0msp1k3 @AcademicChatter They were in science books since before memes were popular
Tweet media one
0
0
6
@diegoasua
Diego
4 months
@andrew_n_carr Soooo Mistral true multimodal by Christmas?
1
0
6
@diegoasua
Diego
3 months
@Karmedge @__tinygrad__ California avg: $0.3/kWh Germany avg: $0.43/kWh Double those numbers
2
0
6
@diegoasua
Diego
2 months
@karpathy I'm willing to bet it's the tokenizer
2
0
6
@diegoasua
Diego
1 year
@AiBreakfast Uhm web pages are text
0
0
0
@diegoasua
Diego
5 years
Starting #cosyne2020
Tweet media one
0
0
6
@diegoasua
Diego
4 months
@sterlingcrispin @karpathy @natfriedman technically at a high level neurons operate in asynchronous time-domain as analog devices so comparing flops is not as straightforward as adding binary (clocked) synapses
0
1
6
@diegoasua
Diego
3 months
@garybasin @jtaylor_tweets You posted a result based on grokking. They don’t do that on SOTA LLMs
0
0
6
@diegoasua
Diego
5 years
⚠️ job offer @TU_Muenchen (Munich, Germany)⚠️ Full-time technician in systems neuro lab working with zebrafish. Fish facility work, lab management, support for experiments (e.g. cloning, genotyping, behavioural assays). See + info below. Please RT!
0
8
6
@diegoasua
Diego
1 year
@DeveloperHarris Your bill is going to be epic
1
0
6
@diegoasua
Diego
1 month
@_xjdr Hurts. But hey it works
0
0
5
@diegoasua
Diego
4 years
Thanks so much @MarcelStimberg @neuralreckoning for an amazing tutorial on @briansimulator ! Only 1 Q was left unresolved: Is the name an allegory to Monty Python?
1
0
5
@diegoasua
Diego
6 months
You can overcome this with a super awesome scheduler. Once you overcome memory bottlenecks you are purely bound by compute. A successful example of high utilization while partially memory bound is @LaminiAI, which gets close to 50% utilization on AMD MI300X GPUs
1
0
4
@diegoasua
Diego
2 months
@Yuchenj_UW @karpathy One interesting comparison would be hellaswag score vs compute ~ training time if training setup is approx the same. For each model
2
0
5
@diegoasua
Diego
1 month
It's wild to think that the new Llama 3.1 8B scores higher than the OG ChatGPT (GPT-3.5) which was a 175B network. Over the course of 2 years we have compressed skill and knowledge by a factor of > 20x. Bananas.
1
0
4
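The "> 20x" figure follows directly from the two parameter counts quoted in the tweet. A minimal check, taking the tweet's 175B and 8B figures at face value (the function name is hypothetical):

```python
def compression_factor(old_params: float, new_params: float) -> float:
    """How many times smaller the new model is, by parameter count."""
    return old_params / new_params


if __name__ == "__main__":
    # Both parameter counts are the ones stated in the tweet.
    factor = compression_factor(old_params=175e9, new_params=8e9)
    print(f"{factor:.1f}x fewer parameters")
```

This compares parameter counts only; it says nothing about training data or compute, which also differ between the two models.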
@diegoasua
Diego
3 months
@swyx @elevenlabsio @cartesia_ai @krandiash @_albertgu Cartesia vs ElevenLabs > 3x cheaper 180 ms vs 400 ms
Tweet media one
Tweet media two
0
0
4
@diegoasua
Diego
5 months
@karpathy And it’s very impressive!
0
0
5
@diegoasua
Diego
4 months
@__tinygrad__ these guys ship
0
0
5
@diegoasua
Diego
3 months
@garybasin you extrapolate that from GPT-2 small, a 117M language model? That’s not a large language model. So your premise is that a model 10,000x smaller than SOTA LLMs will behave equally.
Tweet media one
2
0
5
@diegoasua
Diego
1 year
0
0
2
@diegoasua
Diego
1 year
@natfriedman @aidan_mclau And it has a phone number!
0
0
3
@diegoasua
Diego
6 months
The M2 Ultra with unified memory is also interesting but even though it has crazy memory bandwidth, the integrated GPU is very much lacking compared with any high end AMD or Nvidia card. If only Apple made HPC GPUs, they could become a powerful player in this space
2
0
4
@diegoasua
Diego
1 year
@madhavsinghal_ 20x cheaper. And prompts are way more steerable.
0
0
5
@diegoasua
Diego
5 years
Thank you Windows
Tweet media one
0
0
4
@diegoasua
Diego
6 months
Tweet media one
1
0
4
@diegoasua
Diego
2 months
@karpathy Sonnet also can't do it
Tweet media one
Tweet media two
1
0
4
@diegoasua
Diego
4 years
Just watched @RetoPaul's recorded talk about sweet, single-objective, aberration-free remote-focusing oblique lightsheet microscopy #ImagingONEWORLD . Very charming!
0
1
4
@diegoasua
Diego
3 months
@karpathy @natfriedman Apple’s many years investment in super dope low energy inference chips in every Apple device will pay out big time
0
0
4
@diegoasua
Diego
6 months
@tomchapin A mixture of a mixture of experts
0
0
4
@diegoasua
Diego
5 months
@andersonbcdefg I have discovered an awesome solution to this which this tweet is too narrow to contain. The proof is left as an exercise to the reader
0
0
4
@diegoasua
Diego
4 months
0
0
4
@diegoasua
Diego
9 months
0
0
1
@diegoasua
Diego
6 months
Modern 10x ML engineer stack: CUDA, MLX, JAX, Julia, Zig
0
0
4
@diegoasua
Diego
5 months
@Teknium1 zuck: {...} we've also got more releases coming soon to bring multimodality and bigger context windows {...} be patient padawan
0
0
4
@diegoasua
Diego
2 months
Compare this to American tech:
- Google: $305B
- Microsoft: $200B
- Apple: $383B
- Meta: $134B
- Nvidia: $80B
- Tesla: $96B
- Qualcomm: $36B
- Broadcom: $36B
- Airbnb: $10B
- AMD: $23B
- Netflix: $33B
- Micron: $16B
1
0
4
@diegoasua
Diego
2 months
@julien_c so they haven't built it yet. demos @ WWDC were fake?
0
0
4
@diegoasua
Diego
4 months
@Teknium1 @ylecun so V-JEPA looks like this. You have a pair of encoders and one of them is masked (BERT style but images). It's still self-supervised but the architecture is completely different from GPTs. V-JEPA learns masked chunks of video in a latent space
Tweet media one
3
0
4
@diegoasua
Diego
5 years
Spotted at Baier lab in MPI neuro — I am a great fan of whoever made this
Tweet media one
0
0
4
@diegoasua
Diego
5 years
This thread is music to my ears. @eLife is starting a big change here, looking forward to more of this
@mbeisen
Michael Eisen
5 years
Since there's been a lot of discussion today about @eLife prompted by @TanentzapfLab , I thought it would be a good time to discuss several initiatives we're taking to reshape peer review.
32
414
703
0
1
4
@diegoasua
Diego
4 years
Today someone asked me if I wear glasses. I sometimes do!
Tweet media one
1
0
4
@diegoasua
Diego
4 years
@cziscience is awarding to the right projects, like @napari_imaging @DeepLabCut @zarr_dev . Good job everyone!
@DeepLabCut
DeepLabCut 🦄
4 years
🥂MAJOR news!! We received funding from @cziscience #EOSS to support #DeepLabCut in 2021!!! We can't even put into words how happy & grateful we are. Want to see where we are going in 2021? #teamDLC @EPFL @CIS_EPFL @Brainmind_EPFL
Tweet media one
3
18
140
0
2
4
@diegoasua
Diego
5 months
@mollycantillon source code looks nothing like I expected
0
0
4
@diegoasua
Diego
2 months
Hope y'all enjoyed your s'mores and fireworks. Happy 4th of July
0
0
4
@diegoasua
Diego
7 months
@amasad (Claimed). We’ll see if it’s actually the case. Google has been claiming a lot that did not materialize
0
0
3
@diegoasua
Diego
3 months
@BrandonLive @AravSrinivas @karpathy So when Tesla says E2E neural nets what they mean is they replaced the decision component with neural nets. There is no longer fixed logic, it’s all trained. This diagram is about the segmentation network that produces the segmented 3D point cloud that is fed into the decision module
2
0
3
@diegoasua
Diego
4 years
There should be at least one of these copies in every academic lab that does some programming in Python. PIs start pre-ordering
@aquicarattino
Aquiles Carattino
4 years
Getting ready!! Final checks done, last compilation done. Final tweaks to the cover, and then sending it to the printer!
Tweet media one
Tweet media two
0
0
15
0
1
4
@diegoasua
Diego
4 months
@natfriedman
Nat Friedman
4 months
"People hire a janitor service to clean their office. They don't hire a generic labor service, even though it's basically the same thing." – advice for AI startups.
35
86
1K
0
0
4
@diegoasua
Diego
11 months
@alexandr_wang I would say this moment in time is the closest we have been to WW3 since the start of the millennium
0
0
0
@diegoasua
Diego
4 years
Academic metabolism
Tweet media one
0
0
3
@diegoasua
Diego
2 months
@braddwyer @vikhyatk Or memory. H100: 80GB HBM3 @ 3.9 TB/s vs RTX4080: 24GB GDDR6X @ 1 TB/s. Or interconnects. NVLink @ 900 GB/s vs PCIe @ 32 GB/s. H100s are not meant to be used alone but in clusters. The win is when you need 1,000 of these not when you need 1.
1
0
3
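To make the interconnect point concrete, here is a rough sketch of how long it takes to move the same payload between devices over each link, using the two link speeds quoted in the tweet. The 16 GB payload is an assumption for illustration (roughly the fp16 weights of an 8B-parameter model), and the calculation ignores latency and protocol overhead.

```python
def transfer_seconds(payload_gb: float, link_gb_per_s: float) -> float:
    """Ideal time to move a payload over a link, ignoring latency and overhead."""
    return payload_gb / link_gb_per_s


if __name__ == "__main__":
    PAYLOAD_GB = 16.0  # hypothetical payload size
    # Link speeds are the figures stated in the tweet.
    links = {"NVLink": 900.0, "PCIe": 32.0}
    for name, speed in links.items():
        millis = transfer_seconds(PAYLOAD_GB, speed) * 1000
        print(f"{name}: {millis:.1f} ms per {PAYLOAD_GB:.0f} GB exchange")
```

In a cluster this exchange happens on every training step, which is why the gap matters far more at 1,000 GPUs than at 1.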
@diegoasua
Diego
5 months
1
0
2
@diegoasua
Diego
2 years
@karpathy @a_meta4 End-to-end GPU-powered IDE with copilot in the browser sounds fire
0
0
3