Shreyas Kapur @shreyaskapur Twitter profile

Pinned Tweet

Shreyas Kapur

3 months

My first PhD paper!🎉We learn *diffusion* models for code generation that learn to directly *edit* syntax trees of programs. The result is a system that can incrementally write code, see the execution output, and debug it. 🧵1/n

120

627

6K

Last Seen Profiles

@mirshroom

@MrsBell9

@nikosia_

@gojoosaorru

@sherr_koulal

@pollitato

@EliasWenerFBK

@XantheFuller

@RandoGallery

@chinen__booo

@hoshitsuseyo

@mfybry99228690

@EinheitVAL

@cellcellist34

@il2525p

@kike_sola

@mytbrisgrowing

@markrobinsonNC

@_Amamiyayen_

@kissritani

@kape__1999

@muturkiyat

@penikma89063156

@LCSProtect

@Emanuel68587011

@elyannaalas

@kikakumennchi

@kodiveerankilma

@GordonCohagan

@PWAYNJ

@piotrbinkowski

@cryptobynight

@LucknamPark

@mark21425304

@coope38010

@positiv07652695

Shreyas Kapur

@shreyaskapur

3 months

We develop an analogous version of “noise” for syntax trees inspired by the computer security literature on fuzzing🎲. And we teach our model to reverse this noise⏪. 2/n

4

11

266

Shreyas Kapur

@shreyaskapur

3 months

We managed to get part of our project running in the browser, Website🌎: Paper📄: Code🖥️: Thanks for my wonderful collaborator @jenner_erik , and advisor Stuart Russell! n/n 🧵

6

13

252

Shreyas Kapur

@shreyaskapur

3 months

I had a lot of fun working on this. I didn't believe that a chess playing neural net could learn to do look-ahead just in its weights, so I was definitely the non-believer in this project.

Erik Jenner

@jenner_erik

3 months

♟️Do chess-playing neural nets rely purely on simple heuristics? Or do they implement algorithms involving *look-ahead* in a single forward pass? We find clear evidence of 2-turn look-ahead in a chess-playing network, using techniques from mechanistic interpretability! 🧵

16

137

895

3

15

222

Shreyas Kapur

@shreyaskapur

3 months

Our implementation works on a given context-free grammar. Here is an example of our model diffusing a smaller “SVG”-like language. 4/n

5

6

204

Shreyas Kapur

@shreyaskapur

3 months

A model that *edits* code makes it really easy to combine it with a search algorithm🔎. 3/n

1

5

189

Shreyas Kapur

@shreyaskapur

3 months

This is what that language looks like, 5/n

1

136

Shreyas Kapur

@shreyaskapur

3 months

We show how our approach outperforms previous methods, including rejection sampling a Vision-Language Transformer that is specifically trained on these tasks (CSGNet in this figure). 6/n

1

3

127

Shreyas Kapur

@shreyaskapur

3 months

These languages are small, and we only show this approach on a fairly narrow inverse-graphics task. In the future, we hope to show that this approach may potentially work more generally with languages with loops and variables. 8/n

1

126

Shreyas Kapur

@shreyaskapur

3 months

Of course, our architecture is also a Vision-Language Transformer that is trained to edit code via tree diffusion. 7/n

2

122

Shreyas Kapur

@shreyaskapur

3 months

@sdtoyer 😂I'm glad you asked Sam! We've been working on a modern, functional, and performant library for graphics and diagrams in Python called iceberg,

GitHub - revalo/iceberg: A compositional diagramming and animation library as an eDSL in Python

A compositional diagramming and animation library as an eDSL in Python - revalo/iceberg

github.com

2

60

Shreyas Kapur

@shreyaskapur

3 months

@anwesh_bh Yes absolutely that's a problem. The tree diffusion approach we propose allows us to collect a very rich dataset of "edits" to train a model. here you can click on "add noise"

2

0

27

Shreyas Kapur

@shreyaskapur

3 months

@gazorp5 this is a great point! I'll add explicit examples of failure modes.

1

0

26

Shreyas Kapur

@shreyaskapur

3 months

@realmrfakename Yes! It can start from a randomly initialized program and edit its way to a target image, guided by search.

1

0

14

Shreyas Kapur

@shreyaskapur

3 months

@chiaralalalah You should really check out @aaron_lou 's excellent work on language modeling,

Aaron Lou

@aaron_lou

6 months

Announcing Score Entropy Discrete Diffusion (SEDD) w/ @chenlin_meng @StefanoErmon . SEDD challenges the autoregressive language paradigm, beating GPT-2 on perplexity and quality! Arxiv: Code: Blog: 🧵1/n

20

133

677

1

7

Shreyas Kapur

@shreyaskapur

7 months

🐱🐱🐱✨✨✨

Chung Min Kim

@ChungMinKim

7 months

🪴Should the leaves of a plant be considered separate or part of the whole? Answer: it really depends! Points can, and should, belong to multiple groups. With GARField, points can belong to multiple groups, with physical scale 📏 as an extra dimension.

7

28

200

0

5

Shreyas Kapur

@shreyaskapur

8 years

Shots fired.

Andrew Ng

@AndrewYNg

8 years

DeepMind is investing heavily in learning to play video games. Congrats! Hope this transfers someday to non-simulated worlds.

4

121

244

0

3

Shreyas Kapur

@shreyaskapur

11 months

@dxwu_ More parameters is the new norm in DL, it's so cool to see theory that pins this down!!

0

2

Shreyas Kapur

@shreyaskapur

3 months

@EmilevanKrieken In our current mutation scheme, the expression can get longer or shorter at roughly the same probability, so not sure about the limiting distribution. Anecdotally we noticed that if we noise the program some number of times, the programs resemble just random programs.

0

1

Shreyas Kapur

@shreyaskapur

9 years

It's still impressive to see Facebook use a pure CNN approach against MCTS, if only they combined the two :P

0

2

Shreyas Kapur

@shreyaskapur

3 months

@InglfurAri As mentioned, the DSLs used are small. The x, y, w, h values snap to a pretty coarse grid. We also limit the max number of objects that can be placed.

1

0

1

Shreyas Kapur

@shreyaskapur

9 years

@hackpert Considering the studies done to compare convnets with brains, I think convnets do emulate a very small but crucial part of a brain

0

1

Shreyas Kapur

@shreyaskapur

3 months

@EmilevanKrieken I think it has a lot of synergies with GFlowNets (which we mention in the paper) and one of our baseline methods (REPL Flow) is a mix between Ellis et. al. reimagined as a GFlowNet.

1

0

1

Shreyas Kapur

@shreyaskapur

9 years

CS231n is like Sherlock, the wait for the next episode .. er .. lecture is killing me. @cs231n

0

1

Shreyas Kapur

@shreyaskapur

9 years