Austin Huang @austinvhuang Twitter profile

Pinned Tweet

Austin Huang

15 days

Announcing: the initial release of my 1st project since joining the amazing team here at @answerdotai: gpu.cpp, portable C++ GPU compute using WebGPU. Links + info + a few demos below 👇

22

183

1K

Austin Huang

@austinvhuang

6 months

I'm happy to share the release of gemma.cpp - a lightweight, standalone C++ inference engine for Google's Gemma models: Have to say, it’s one of the best project experiences of my career.

23

199

1K

Austin Huang

@austinvhuang

7 months

I think people are bad at understanding just how *small* LLMs are. LLMs will fit on a thumb drive you can buy for cheap on Amazon and stick in your pocket, and that’s before quantization. It’s compression so strong that we’re traveling back in time to when you could get the

James Campbell

@jam3scampbell

7 months

People are really bad at understanding just how big LLM's actually are. I think this is partly why they belittle them as 'just' next-word predictors

98

439

3K

20

65

620
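
The size claim in the tweet above can be checked with back-of-the-envelope arithmetic. This is a minimal sketch, not something from the thread: the parameter counts (2B and 7B) and the bytes-per-parameter figures (2 bytes for 16-bit weights, 0.5 bytes for 4-bit quantization) are assumptions chosen for illustration.

```python
def model_size_gb(num_params, bytes_per_param):
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Hypothetical model sizes, picked only to illustrate the arithmetic.
for name, params in [("2B", 2e9), ("7B", 7e9)]:
    fp16 = model_size_gb(params, 2.0)  # 16-bit weights, before quantization
    int4 = model_size_gb(params, 0.5)  # 4-bit quantized weights
    print(f"{name} params: {fp16:.1f} GB at 16-bit, {int4:.1f} GB at 4-bit")
```

Under these assumptions even the unquantized weights are on the order of ten gigabytes, which is within the capacity of an inexpensive thumb drive.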

Austin Huang

@austinvhuang

2 years

9 out of 10 ML researchers are working on RLHF in 2023.

19

50

563

Austin Huang

@austinvhuang

9 months

There’s not enough creative exploration of the KV cache.. researchers seem to ignore it as just an engineering optimization. The KV cache is the global state of the transformer model. Saving, copying, reusing kv cache state is like playing with snapshots of your brain.

Sasha Rush

@srush_nlp

9 months

@haozhangml The use case I am interested in is that I want to generate from 1000 different short suffixes that all use the same long prefix. (I can do this in transformers by setting the KV cache.)

5

1

24

4

22

195
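
The "snapshot" idea and the shared-prefix use case in the two tweets above can be sketched with a toy single-layer, single-head attention model. Everything here is an assumption made for illustration (the `embed` function, the key/value rule, the token ids); it is not the API of any real library. The point is only that a cache built once for a long prefix can be copied and extended for each suffix, with the same result as recomputing from scratch.

```python
import math
from copy import deepcopy


def embed(token):
    """Hypothetical toy embedding: a fixed 2-d vector per token id."""
    return [math.sin(token), math.cos(token)]


def attend(query, keys, values):
    """Softmax attention of one query over all cached keys and values."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    peak = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scores]
    total = sum(weights)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) / total
            for i in range(dim)]


def step(cache, token):
    """Append this token's key/value to the cache, then attend over it."""
    x = embed(token)
    cache["k"].append(x)                     # toy rule: key = embedding
    cache["v"].append([2.0 * c for c in x])  # toy rule: value = 2 * embedding
    return attend(x, cache["k"], cache["v"])


def run(tokens, cache=None):
    """Process tokens in order; return the last output and the cache."""
    cache = {"k": [], "v": []} if cache is None else cache
    out = None
    for t in tokens:
        out = step(cache, t)
    return out, cache


prefix = [3, 1, 4, 1, 5, 9, 2, 6]      # stands in for the long shared prefix
suffixes = [[7], [8, 2], [1, 1, 1]]    # stands in for the short suffixes

_, prefix_cache = run(prefix)          # pay for the prefix once
for suffix in suffixes:
    reused, _ = run(suffix, deepcopy(prefix_cache))  # copy of the snapshot
    fresh, _ = run(prefix + suffix)                  # recompute from scratch
    assert all(math.isclose(a, b) for a, b in zip(reused, fresh))
```

The `deepcopy` keeps the saved snapshot untouched, so it can be reused for every suffix.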

Austin Huang

@austinvhuang

2 years

A personal update - I recently joined Google Brain. I feel fortunate to join such a wonderful and collaborative research community during such an interesting time. Excited for what's next in this journey of accelerating AI progress. The coming years are going to be wild.

9

3

182

Austin Huang

@austinvhuang

2 months

Async* life update - I have joined Answer.AI, the "new old kind of R&D lab" founded by @jeremyhoward and @ericries. I feel lucky to embark on this journey, with this group of people. The old R&D labs are something I've had a relationship to my entire life... there's more to

8

173

Austin Huang

@austinvhuang

3 years

machine learning needs its own demoscene with constraints analogous to 64K and 4K compos: "4 MB of parameters max" "trained using only simd parallelism" "trained on 1 GPU in 15 minutes" ...

Tim Dettmers

@Tim_Dettmers

3 years

I am excited to share my latest work: 8-bit optimizers – a replacement for regular optimizers. Faster 🚀, 75% less memory 🪶, same performance📈, no hyperparam tuning needed 🔢. 🧵/n Paper: Library: Video:

18

283

1K

3

27

168

Austin Huang

@austinvhuang

7 months

Relatedly, most people have no idea just how powerful our own personal computers are. I’d argue that discovering just what regular computers are capable of is one of the most impactful open research questions right now.

4

6

164

Austin Huang

@austinvhuang

1 year

If you've ever said "it's *just* matrix multiplication!", then you should probably read this article. Few people appreciate the insane amount of sophistication that goes into a matrix multiply.

How to Optimize a CUDA Matmul Kernel for cuBLAS-like Performance: a Worklog

In this post, I’ll iteratively optimize an implementation of matrix multiplication written in CUDA. My goal is not to build a cuBLAS replacement, but to deepl...

siboehm.com

1

23

157

Austin Huang

@austinvhuang

3 years

10 years. Respect to anyone who embarks on the lonely journey of deep work to create something like this.

Ryan Challinor

@awwbees

4 years

hello! for the past decade I've been building bespoke, a free modular synth environment with python livecoding support for mac/windows/linux you can find the code and get builds at and if you scroll through my feed, you'll find a bunch of videos of it

39

364

2K

2

9

141

Austin Huang

@austinvhuang

3 years

My @PyTorch Developer Day talk on Real-world Research to Production is here:
- How ML projects are changing
- Building models when labeled data is not available
- End-to-end considerations, neural network + user experience codesign
#FidelityAssociate #PTD2

REAL-WORLD RESEARCH TO PRODUCTION AT FIDELITY | AUSTIN HUANG

Learn first-hand about real-world approaches to taking machine learning from research to production at Fidelity. In this talk, Austin Huang (Vice President, ...

www.youtube.com

1

22

138

Austin Huang

@austinvhuang

30 days

For a while I've had some vague sense that tinygrad's frontend was pytorch-like and internals were an optimizing compiler that spits out kernel code, but I didn't have much of an idea about the implementation. This writeup looks good and might even be a beginner-friendly entry

1

11

100

Austin Huang

@austinvhuang

4 years

@julien_c @huggingface @marksaroufim This was hilarious and has a lot of truth to it. @ykilcher

2

11

85

Austin Huang

@austinvhuang

3 years

I wonder if applied ML practitioners appreciate that "training a model" is going to become a drastically different process within the next 2-4 years.

7

2

79

Austin Huang

@austinvhuang

6 months

is on HN 🤖🙈

Hacker News 20

@betterhn20

6 months

Gemma.cpp: lightweight, standalone C++ inference engine for Gemma models ()

0

1

6

5

8

79

Austin Huang

@austinvhuang

5 years

Finally, there's a paper to cite for @PyTorch ! From the neurips 2019 pre-proceedings - h/t @junjihashimoto3

1

25

76

Austin Huang

@austinvhuang

1 year

Amazing that this works with a matmul implementation like this. omp can be unreasonably effective sometimes.

Andrej Karpathy

@karpathy

1 year

Yay, llama2.c can now load and inference the Meta released models! :) E.g. here inferencing the smallest 7B model at ~3 tokens/s on 96 OMP threads on a cloud Linux box. Still just CPU, fp32, one single .c file of 500 lines: expecting ~300 tok/s tomorrow :)

62

343

3K

3

5

74

Austin Huang

@austinvhuang

2 years

Insanely great to watch new generations of artists come into their own. I have fond memories of seeing Brandon do the goofy-yet-inspired weekly YT grind on the freddiew channel. @StressLevelZero is probably at the forefront of VR user interaction now.

Brandon J Laatsch

@BrandonJLa

2 years

The teaser for our 4th project, BONELAB.

217

417

2K

1

2

67

Austin Huang

@austinvhuang

3 years

"Any prior knowledge can be encoded as a data augmentation or data generation scheme." Yes! This is the right way to incorporate prior knowledge into machine learning models. The key factor is that data is inherently composable.

Pachyderm

@pachyderminc

3 years

Reinforcement Learning for Industrial AI with Pieter Abbeel - #476 by @twimlai

0

4

18

5

8

53

Austin Huang

@austinvhuang

15 days

Non-AI general purpose GPU compute can be fun too. Here's shadertui - a terminal-based shadertoy clone that live loads WebGPU compute shaders. shadertui is only ~ 150 lines of code and compiles in a second.

2

54

Austin Huang

@austinvhuang

15 days

gpu.cpp helps open up this portable + hackable + ease-of-use design space for GPU programming. Here's a "hello world" program implementing a GELU activation. We use WebGPU as a portable native GPU API (no browser or web needed). Edit/compile/run cycles are 1-2 seconds.

2

7

51
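
The GELU program from the tweet is not reproduced in this text. As a point of reference, here is a minimal CPU sketch of the function such a kernel computes per element. It is plain Python, not gpu.cpp code, and the tweet does not say which variant the demo uses, so both the exact erf form and the common tanh approximation are shown as assumptions.

```python
import math


def gelu_exact(x):
    """GELU(x) = x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x):
    """Widely used tanh approximation of GELU."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


# Compare the two forms on a few sample inputs.
for x in [-3.0, -1.0, 0.0, 1.0, 3.0]:
    print(f"x={x:+.1f}  exact={gelu_exact(x):+.6f}  tanh={gelu_tanh(x):+.6f}")
```

A GPU version would apply the same arithmetic to one array element per thread.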

Austin Huang

@austinvhuang

15 days

GPUs are the most empowering technology in the world today. Currently programming GPUs is usually:
- Low-level platform-specific stacks - CUDA, ROCm
- High-level, portable frameworks (PyTorch/Jax) + ML compilers.
There are good, practical reasons for this combo. Nonetheless..

1

51

Austin Huang

@austinvhuang

7 months

@typedfemale Nix: featuring 100% reproducibility on demand yet somehow requires constant maintenance.

2

1

50

Austin Huang

@austinvhuang

5 years

Students looking for a summer of code project in Haskell, consider contributing to @hasktorch -

Add hasktorch to ideas for 2020 by austinvhuang · Pull Request #119 · haskell-org/summer-of-haskell

Hello, I've added a markdown description for potential GSoC 2020 Hasktorch ideas. Does this look alright? Let me know if any changes are needed. Thanks!

github.com

2

10

48

Austin Huang

@austinvhuang

3 years

We had the privilege of presenting our Sim2Real Docs project at @NeurIPSConf 2021 DCAI. TLDR: Python library for synthetic images of documents in natural scenes using @Blender . It's open source: It's @nmaddikunta21 's first paper:

Andrew Ng

@AndrewYNg

3 years

Thanks to everyone that participated in the Data-centric AI NeurIPS workshop! I was surprised and delighted at the sheer amount of innovation in the field. I also share some of my reflections from the workshop in The Batch.

5

22

192

3

8

49

Austin Huang

@austinvhuang

3 years

@abhi1thakur A while back I took a stab at unrolling the call stack into one big figure (based on Sasha Rush's Annotated Transformer post).

2

3

46

Austin Huang

@austinvhuang

15 days

I've always wished we could just do this with low-level GPU code though: #include "gpu.h" // do gpu stuff No custom vendor tooling, massive build system, or fiddling w/ details like descriptor set layouts. Just a C++ compiler + instant edit/compile/run.

2

47

Austin Huang

@austinvhuang

6 years

@fchollet The issue is "knowing the math" in the wrong sense - CS education fails by emphasizing "algorithms as sequences of operations" instead of computational models as mathematical representations. The math of PCA as a generative latent model elucidates what it does and why it works.

0

5

45

Austin Huang

@austinvhuang

15 days

Many thanks to @jeremyphoward, Sarah Pan, and fellow @answerdotai colleagues for supporting this project and kicking the tires. Also thanks to early contributors like @junjihashimoto, Trevor and Micheal at the gpu.cpp discord. **Join us** at the gpu.cpp channel in the @fastdotai

2

5

45

Austin Huang

@austinvhuang

4 years

Nice example of why one should never mistake implementation effort for a moat.

Sabrina J. Mielke

@sjmielke

4 years

Another old bookmark from 2014 where some guy writes a fast and simple syntactic parser: ...wait ... @honnibal ? Omg, is this the spaCy origin story 😱

1

13

89

1

6

44

Austin Huang

@austinvhuang

3 months

Remember base rates - you can easily have 99% accuracy on a balanced test set and then be wrong 99% of the time when you deploy to the real world. OpenAI’s own AI text classifier was in this regime before they shut it down.

Peter Yang

@petergyang

3 months

AI detectors feel like total scams - sad that students have to deal with this

762

3K

35K

1

6

42

Austin Huang

@austinvhuang

4 months

@KLdivergence I read it as: - the prediction target of the neural network is the latent variable of interest. - the neural network itself is the estimator function.

5

1

41

Austin Huang

@austinvhuang

5 years

@jmhessel @murraygabriel Point taken ... but as a parent I'm definitely observing some form of sequential pre-training + latent dimension expansion with the baby.

1

0

39

Austin Huang

@austinvhuang

4 years

Finally gave HLS+ @code a spin after holding out with a syntax-highlighting neovim workflow. Haskell Language Server is fantastic. Developer ergonomics have really improved in this dimension.

3

6

40

Austin Huang

@austinvhuang

4 years

@RadekPaszkowski @bookofshaders 🤎 @code as a GLSL playground running locally- Shader hacking really does put all other programming feedback loops to shame. There's nothing quite like it for a flow junkie.

0

9

38

Austin Huang

@austinvhuang

2 years

Let's find out just how far text-as-vision can go. #SimTex - I coded this up over the holiday as a data generator + perception challenge for language models. Have a look👇

Riley Goodside

@goodside

2 years

There's a thread going around claiming ChatGPT / GPT‑3 can recognize subjects (e.g. "a bird", "a person") in images rasterized as ASCII art. I'm skeptical. E.g., GPT‑3 identifies this ASCII-art MNIST digit as "8" 57% of the time and "4" doesn't appear in its top-5 choices:

22

10

166

4

1

37

Austin Huang

@austinvhuang

15 days

We want to broaden the availability of GPU compute and just drop-in custom GPU algorithms inside applications, simulations, runtimes, etc and get broad portability + ease of use. Here's a little physics sim - ensemble of double pendulums doing their thing ~ 100 LoC + compiles

1

2

37

Austin Huang

@austinvhuang

2 years

@typedfemale Aphex Twin was the original productivity grifter for this one.

1

36

Austin Huang

@austinvhuang

3 years

@iamtrask The DPR paper opened my eyes to the central importance of neural retrieval. It was clear there's so many possibilities if you embed retrieval mechanisms w/ data as part of the model. I think we'll see more variations on fusing retrieval and inference in the next few years.

0

37

Austin Huang

@austinvhuang

6 months

What's next? There’s a lot of low-hanging fruit - we welcome external collaborators . I'm most excited to enable new research on co-design between models + inference engines. Stay tuned. “Now that things are so simple, there's so much to do.” - M. Feldman

GitHub - google/gemma.cpp: lightweight, standalone C++ inference engine for Google's Gemma models.

lightweight, standalone C++ inference engine for Google's Gemma models. - google/gemma.cpp

github.com

0

2

36

Austin Huang

@austinvhuang

15 days

There are tradeoffs for portability but early experiments are promising. w/ @junjihashimoto a naive-ish baseline matmul is around 2.5 TFLOPS on my M1 max laptop + and there's plenty of room for further optimization. We'll be following the path paved by llm.c and plan to port

2

1

36
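
For context on how a throughput figure like the one in the tweet above is usually derived: a dense matmul of an M×K matrix by a K×N matrix is conventionally counted as 2·M·N·K floating-point operations (one multiply and one add per inner-product term). The sketch below uses hypothetical matrix sizes and timings that are not from the tweet.

```python
def matmul_tflops(m, n, k, seconds):
    """Achieved TFLOPS for an (m x k) @ (k x n) matmul that took `seconds`."""
    flops = 2.0 * m * n * k  # one multiply + one add per inner-product term
    return flops / seconds / 1e12


def matmul_seconds(m, n, k, tflops):
    """Time a matmul of this shape would take at a given sustained TFLOPS."""
    return 2.0 * m * n * k / (tflops * 1e12)


# Hypothetical example: a 4096 x 4096 x 4096 matmul at a sustained 2.5 TFLOPS.
t = matmul_seconds(4096, 4096, 4096, 2.5)
print(f"{t * 1e3:.1f} ms per matmul")
```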

Austin Huang

@austinvhuang

4 months

@tobyshooters Nice, sort of maps to:
Abductive - artist
Inductive - scientist
Deductive - engineer
You can live a more expansive life journey if you don’t pigeonhole yourself as an artist, scientist, or engineer.

4

2

34

Austin Huang

@austinvhuang

2 months

Such a good talk, provides much-needed clarity on conceptualizing encoder vs decoder models.

Hyung Won Chung

@hwchung27

2 months

I gave a lecture at @Stanford CS 25. Lecture video: AI is moving so fast that it's hard to keep up. Instead of spending all our energy catching up with the latest development, we should study the change itself. First step is to identify and understand

25

204

1K

0

4

34

Austin Huang

@austinvhuang

3 years

@thesephist Agreed- let's move away from high maintenance tools for thought. Instead of hand curated knowledge graphs scattered over markdown docs, #OpenMemex focuses on enabling automation with a SQLite event stream + neural network integrations. It’s open source -

GitHub - austinvhuang/openmemex: Open source, local-first knowledge platform.

Open source, local-first knowledge platform. Contribute to austinvhuang/openmemex development by creating an account on GitHub.

github.com

4

1

33

Austin Huang

@austinvhuang

4 years

Weekend hack - proof-of-concept Haskell binding to @huggingface fast tokenizers. Anyone interested in seeing this expanded? Does @huggingface accept PRs for contributed language bindings, if this were further developed @ClementDelangue @srush_nlp ?

2

4

32

Austin Huang

@austinvhuang

6 months

gemma.cpp is a minimalist implementation of Gemma 2B and 7B models: focusing on simplicity and directness rather than full generality, it takes inspiration from ggml, llama.c, and other "integrated" model implementations.

Gemma: Introducing new state-of-the-art open models

Gemma is a family of lightweight, state-of-the art open models built from the same research and technology used to create the Gemini models.

blog.google

1

3

30

Austin Huang

@austinvhuang

3 years

@cmuratori Reminds me of @Jonathan_Blow 's comments on the ethics of wasting people's time at scale Should have the engine team run everybody's code reviews. For that matter, non-gamedev industries would benefit from @mike_acton parachuting in as a drill sergeant.

Jonathan Blow on Success and Ethics in Software Development

Here's a short clip of game designer and programmer Jonathan Blow, discussing his ideas of success - and quickly turning towards ethics and moral standards i...

www.youtube.com

0

1

29

Austin Huang

@austinvhuang

5 months

Efron and Morris’s non-technical intro to Stein’s paradox is a delight to read. Possibly even worldview-altering if you haven’t encountered the topic before.

Peyman Milanfar

@docmilanfar

5 months

One of the lesser known ways to compare estimators is "admissibility". An estimator θ* = g(θ,y) of θ from data y is called *in*admissible if g is uniformly dominated by another estimator g(θ,y) for all values of g(θ,y), say in the MSE sense. 1/6

8

32

291

0

6

29

Austin Huang

@austinvhuang

6 months

The goal of the project is to have a small experimental inference engine for experimentation and research. The codebase has minimal dependencies and is portable pure C++ (taking advantage of for portable SIMD).

GitHub - google/highway: Performance-portable, length-agnostic SIMD with runtime dispatch

Performance-portable, length-agnostic SIMD with runtime dispatch - google/highway

github.com

1

30

Austin Huang

@austinvhuang

2 years

@0interestrates “This is because Amazon will penalize you…”? After fine tuning a language model to exhibit a quasi ego it somehow doesn’t feel right to put threats in the prompt.

0

28

Austin Huang

@austinvhuang

3 years

Last book on the summer reading list arrived a month late but it’s finally here. H/t @marksaroufim

1

0

28

Austin Huang

@austinvhuang

5 years

Concepts, ranges, coroutines, and modules in C++20. Pretty big changes coming, perhaps comparable in scope to C++11, which kicked off the whole "modern" C++ thing.

0

7

28

Austin Huang

@austinvhuang

5 years

Really enjoyed the recent @ylecun lecture series discussing neural networks and physics at the @harvardphysics dept Loeb lectures. Looks like videos have been posted now -

0

14

28

Austin Huang

@austinvhuang

9 months

@ericjang11 Went from light hearted research speculation to pure memes way too fast.

0

28

Austin Huang

@austinvhuang

4 years

pretty-simple 4.0 release - a simple Haskell pretty printing library. For me pretty-simple "just works" and I often find myself wishing ghci defaulted to pretty printing for the repl.

From the haskell community on Reddit

Explore this post and more from the haskell community

www.reddit.com

2

9

28

Austin Huang

@austinvhuang

6 years

@er_crema @rlmcelreath Bob Carpenter's tutorial on the beta binomial

3

5

28

Austin Huang

@austinvhuang

3 years

@jeremyphoward Did you ever see @gabeeegoooh 's SEIR(+) demo? It's beautifully done:

1

2

27

Austin Huang

@austinvhuang

6 years

Nice writeup by @tarantulae with an under-the-hood look at @PyTorch 1.0 internals + a js integration example

0

13

27

Austin Huang

@austinvhuang

3 years

@jeremyphoward @github @OpenAI This makes me wonder what would happen if there was a "conditioned" generation mode. Instead of "average all of github", you'd change a setting to bias the model towards writing code in the style of @jeremyphoward's repos, for example.

3

2

27

Austin Huang

@austinvhuang

4 years

@twiecki @tiangolo makes excellent use of types for automation in FastAPI: For me types in python might not be a huge productivity boost, but they do help reduce the cognitive overhead of returning to a piece of code after a context switch.

1

0

27

Austin Huang

@austinvhuang

9 months

@Suhail Time to publish that paper on 64k bit integer quantization.

1

0

25

Austin Huang

@austinvhuang

5 years

@larsrosenquist I sympathize, but being locked into a complicated bespoke configuration language would’ve been worse. At least a simple format serves as an attractive compilation target for DSLs to explore and compete on. @dhall_lang + @nixos_org offer hope for a better future on this front.

0

25

Austin Huang

@austinvhuang

3 years

@wooldridgemike I was at a party having a conversation with an MIT CS professor around 2009. Hearing that I was working with modeling, simulations, and data, the professor just said point blank, "I don't see how this is computer science."

1

0

24

Austin Huang

@austinvhuang

1 year

Seeing LLMs / generative ai being casually deployed local-first to browsers thanks to two mostly unsung heroes:
- @ApacheTVM being ahead of the curve w/ WebGPU and WASM targets
- @googlechrome shipping WebGPU in Chrome 113 this month

1

6

25

Austin Huang

@austinvhuang

5 years

Look forward to meeting everyone at the @PyTorch conference #PTDC19 . Come chat with @apaszke and me about differentiable functional programming with @hasktorch :)

hasktorch-pytorchdev2019-poster

docs.google.com

3

7

24

Austin Huang

@austinvhuang

4 years

The strategy is not "more refined", it's wrong. ~ 80% of the population is so many orders of magnitude larger than capacity that even shutting down now, in the *best* case, will barely be enough to not completely overwhelm the healthcare system. (1/2)

Professor Ian Donald

@iandonald_psych

4 years

1. The govt strategy on #Coronavirus is more refined than those used in other countries and potentially very effective. But it is also riskier and based on a number of assumptions. They need to be correct, and the measures they introduce need to work when they are supposed to.

3K

18K

41K

1

3

23

Austin Huang

@austinvhuang

4 months

When there's an MoE launch, MegaBlocks is there behind the scenes. Amazing project @Tgale96

MegaBlocks: Efficient Sparse Training with Mixture-of-Experts

We present MegaBlocks, a system for efficient Mixture-of-Experts (MoE) training on GPUs. Our system is motivated by the limitations of current frameworks, which restrict the dynamic routing in MoE...

arxiv.org

Jonathan Frankle

@jefrankle

4 months

And we stood on the shoulders of giants in the community: * @TGale96 , creator of MegaBlocks * The @PyTorch team and FSDP * @nvidia and TensorRT-LLM * The vLLM project * @AiEleuther and their evaluation tools * @dsmilkov + @nsthorat of @lilac_ai * Our amazing friends at @allen_ai

1

6

66

1

4

23

Austin Huang

@austinvhuang

6 months

Beyond the interactive terminal ui for playing with the model, with near-instant model loading we can use gemma as a local-first command line LLM tool.

1

2

23

Austin Huang

@austinvhuang

5 years

Congrats @apaszke & #S4TF on releasing "Tensors Fitting Perfectly" - static analyzers are an exciting middle path b/w dependently typed and untyped tensor dimensions. @srush_nlp - this assert approach may be interesting given prior conversations :)

GitHub - google-research/swift-tfp: Find shape errors before you run your code!

Find shape errors before you run your code! Contribute to google-research/swift-tfp development by creating an account on GitHub.

github.com

Swift for Tensorflow newsletter

@s4tfnews

5 years

ITI: Tensors Fitting Perfectly library by @apaszke has been open sourced. @DynamicWebPaige and @bsaeta visited @swiftbysundell podcast to talk about #MachineLearning and #S4TF . @gsoc '19 students presented projects they hacked on during the summer at the last Swift Design meeting.

0

8

22

2

5

23

Austin Huang

@austinvhuang

3 years

Looking forward to participating in the @ml_collective research jam this Wednesday. I'll be presenting a lightning talk on @hasktorch . Additional details should be up soon. Thanks to @savvyRL for organizing!

ML Collective

@ml_collective

3 years

Working on an ML research project but don’t have labmates to show? Show a plot or three in our Research Jam in two weeks! Also open to folks that want to bounce an idea off others or who just want to hang out and talk shop. Details:

1

30

96

0

5

23

Austin Huang

@austinvhuang

5 years

@GabrielG439 I'm a believer in these sand mandala rituals of development 😀 Throwing away code is not a waste if the internal state of the developer has been mutated by the process.

0

2

21

Austin Huang

@austinvhuang

11 months

Researchers often dismiss NN inference runtimes as merely deployment infrastructure. We should embrace runtime implementations as a form of research and discovery. This helps us understand models as systems, discover new methods, and enable new capabilities.

Andrej Karpathy

@karpathy

11 months

Speculative execution for LLMs is an excellent inference-time optimization. It hinges on the following unintuitive observation: forwarding an LLM on a single input token takes about as much time as forwarding an LLM on K input tokens in a batch (for larger K than you might

111

604

4K

2

22

Austin Huang

@austinvhuang

3 years

@a_cowley There's things to learn from UX/gamedev folks about engagement loops. How long does it take for a beginner to get to their first success (e.g. a working useful app)? Haskellers are usually orders of magnitude off as to what constitutes "acceptable"/"good" along this axis.

2

0

22

Austin Huang

@austinvhuang

5 years

@Thom_Wolf An unexpected, forward-thinking idea in PyTorch's list of design principles is that "worse is better". From

0

1

21

Austin Huang

@austinvhuang

5 years

Nice writeup for those coming from python or other languages. People love to write about advanced topics in Haskell but this basic stuff is where a lot of the core value proposition is.

Oisín Kidney

@oisdk

5 years

New Post: What is good about #Haskell ?

5

33

80

1

6

20

Austin Huang

@austinvhuang

2 years

@BlackHC @rasbt @randal_olson If this were applied to a class where 4% of students were cheating, then 9 in 10 students flagged as "Likely AI-generated" would be false positives?
Total flagged as AI: .96*.09 + .04*.26
Human-written but wrongly flagged as AI: .96*.09
.96*.09 / ((.96*.09) + (.04*.26)) = .89

1

3

21
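
The arithmetic in the tweet above is a base-rate calculation and can be written as a small function. The inputs are the tweet's own numbers, read as a 4% cheating rate, a 26% true-positive rate, and a 9% false-positive rate; the function and variable names are made up for this sketch.

```python
def false_discovery_rate(prevalence, true_positive_rate, false_positive_rate):
    """Fraction of flagged items that are actually negatives."""
    flagged_wrongly = (1.0 - prevalence) * false_positive_rate
    flagged_correctly = prevalence * true_positive_rate
    return flagged_wrongly / (flagged_wrongly + flagged_correctly)


# Numbers from the tweet: 4% of essays AI-written, 26% of those caught,
# 9% of human-written essays wrongly flagged.
fdr = false_discovery_rate(prevalence=0.04,
                           true_positive_rate=0.26,
                           false_positive_rate=0.09)
print(round(fdr, 2))  # 0.89
```

Lowering `prevalence` pushes the result toward 1, which is why strong results on a balanced test set say little about deployment.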

Austin Huang

@austinvhuang

2 years

How has dwarf fortress not been turned into an RL environment?

1

20

Austin Huang

@austinvhuang

6 months

The core implementation is ~ 2K LOC, w/ ~ 4K LOC supporting code. It’s meant to be both hackable and also embeddable as a library w/ cmake. Prototype your apps with local LLM inference as a C++ function call. Add runtime support for your own research with a few lines of code.

1

20

Austin Huang

@austinvhuang

1 year

@yoavgo Fwiw I don’t get it either. Everyone is talking about boilerplate and abstractions and agents but: 1) LLMs have 1 endpoint 2) afaict it adds boilerplate + cognitive overhead vs just … doing the thing. 3) It’s perfectly doable to implement agents and API chaining as programs.

0

20

Austin Huang

@austinvhuang

3 months

@srush_nlp IMO being an incoherent discipline is a virtue. Once a field is coherent enough that there's a sharp delineation between in/out of scope, that adds an (imo unnecessary) constraint to search directions for progress.

1

0

18

Austin Huang

@austinvhuang

2 months

I've only been at Answer.AI a short while, but I can already say there are some amazing things coming. I'll be releasing something soon too. Stay tuned.

1

0

20

Austin Huang

@austinvhuang

6 months

Will post a real thread later today, but for now: 👋

GitHub - google/gemma.cpp: lightweight, standalone C++ inference engine for Google's Gemma models.

lightweight, standalone C++ inference engine for Google's Gemma models. - google/gemma.cpp

github.com

0

4

20

Austin Huang

@austinvhuang

6 years

@jeremyphoward @apaszke @BrabecJan91 @gregcons covers a beginner-safe modern-ish C++ subset (though not available as a book and the talk is aimed at convincing teachers) +1 @BrabecJan91 Effective [Modern] C++ comes closest (except for gamedev .. in which case @apaszke 's joke answer applies)

CppCon 2015: Kate Gregory “Stop Teaching C"

http://www.Cppcon.org—Presentation Slides, PDFs, Source Code and other presenter materials are available at: https://github.com/cppcon/cppcon2015—To this day...

www.youtube.com

1

5

19

Austin Huang

@austinvhuang

4 months

@davidad Audiomulch had matrix routing as modules embedded within a topological graph. Worked incredibly well for re-routing patches as part of a live performance. Still one of my favorite HCIs 25 years on.

0

1

19

Austin Huang

@austinvhuang

1 year

@dustinvtran Language is also much more forgiving than (1-e)^n. I get where the diffusion process view comes from but I don't think it provides the correct intuition.

0

18

Austin Huang

@austinvhuang

6 months

Jan Wassenberg (author of ) and I started gemma.cpp as a small project just a few months ago. We were lucky to find amazing collaborators from around Google - @PhilCulliton , @dancherp , Paul Chang, and of course, the GDM Gemma team.

1

19

Austin Huang

@austinvhuang

2 months

Much appreciation to all my colleagues at @GoogleDeepMind . My time there was a life-changing experience and I was especially proud to be a part of the Gemma and open sourcing effort. Despite the Google criticism that's common nowadays (sometimes warranted), there are

1

0

19

Austin Huang

@austinvhuang

3 years

A large part of my own team's successes solving everyday applied ML problems can be boiled down to starting with "what program can I write to produce the desired model behavior?" instead of "I need labeled training data".

Max Jaderberg

@maxjaderberg

3 years

Very excited to release our new work: Open-Ended Learning Leads to Generally Capable Agents. tldr; algorithm that dynamically shapes task distributions to train agents on huge task space, resulting in surprisingly general behaviour Thread: (1/n)

10

216

874

2

3

19

Austin Huang

@austinvhuang

5 years

Hire this guy. Also, congrats @apaszke !

Adam Paszke

@apaszke

5 years

My graduation is approaching rapidly, and that means that I’ll be able to work full-time soon. If you’re looking for someone excited about building next generation tools and infra for scientific computing (and ML) then let's talk! Note: remote and Europe only 🚀

19

316

0

1

19

Austin Huang

@austinvhuang

2 months

Next generation AI in Japan extends beyond particular products, it’s a geopolitical imperative. I’m really happy to see Sakana AI and Shane x GDM Japan leading things there. It’s a bullish sign for a country’s institutional capacity that it’s able to discern technical

Shane Gu

@shaneguML

2 months

Whether you agree or not with $1.1B valuation of Sakana AI based on their outputs, I argue it was easy to raise $155M. Japan is a fascinating country with untapped market and talent opportunities. VCs/AI engineers interested in investing/working in Japan, feel free to DM me. My

6

13

172

0

5

19

Austin Huang

@austinvhuang

4 years

I learn a lot from design notes of frameworks. These are great -

👩‍💻 Paige Bailey

@DynamicWebPaige

4 years

😁 Couldn't agree more! "Unlike the stateful pseudorandom number generators (PRNGs) that users of NumPy and SciPy may be accustomed to, JAX random functions all require an explicit PRNG state to be passed as a first argument." Learn about it here 👉

4

7

53

1

19

Austin Huang

@austinvhuang

3 years

Both can be simultaneously true: 1. Your model makes a mistake only 1 out of a trillion times in your test set. 2. The probability of your model exhibiting catastrophic failures shortly after prod deployment is nearly 1.

0

2

18

Austin Huang

@austinvhuang

3 years

Here's the final version, for now. High res svg/png/pdf:

Austin Huang

@austinvhuang

3 years

First attempt at a visual which flattens the call hierarchy of the classic "Annotated Transformer" by @srush_nlp . Suggestions welcome.

1

0

9

2

18

Austin Huang

@austinvhuang

1 year

@yoavgo @johnschulman2 It seems having an auxiliary model is useful to learn the knowledge boundary of the main model and the reward model can do that (among other things) w/ RL. Still not obvious that this "I-don't-know" problem couldn't be somehow addressed using an auxiliary model + SL though.

2

1

18

Austin Huang

@austinvhuang

9 months

@jeremyphoward Summary of the 90s for people who didn't get to experience the magic: Use a computer? Must be an antisocial nerd. Produce electronic music? Sounds like video game music to everyone because real music has guitars. Friends is the most popular TV show.

3

0

17

Austin Huang

@austinvhuang

3 years

@jlongster Types are intended to be broken as you evolve a program. @Jonathan_Blow calls this "one of the most powerful programming techniques" In Haskell I don't spend time to plan types - I just start YOLO coding and use this technique to refactor as needed.

Jonathan Blow on Refactoring

If you have questions, you can come to one of Jon's streams: https://www.twitch.tv/j_blow

www.youtube.com

1

0

18

Austin Huang

@austinvhuang

4 years

My @MuniHac talk introducing @hasktorch . Thanks to @kosmikus and the organizers for the invitation, really enjoyed participating.

MuniHac 2020: Austin Huang - Hasktorch: Differentiable Functional...

Title: Hasktorch: Differentiable Functional Programming in Haskell. Speaker: Austin Huang. Optimization over function composition is the unifying feature of mach...

www.youtube.com

2

1

17

Austin Huang

@austinvhuang

2 years

@lexi_lambda I'm working on this backed by sqlite. Not there yet with direct manipulation but I am working towards a very different client. It is sad that there are many ideas that are technically possible yet don't cross the threshold of financial sustainability.

1

17