Phillip Isola Profile
Phillip Isola

@phillip_isola

15,238
Followers
157
Following
106
Media
605
Statuses

Associate Professor in EECS at MIT, trying to understand intelligence.

Joined December 2016
@phillip_isola
Phillip Isola
6 years
#BigGAN is so much fun. I stumbled upon a (circular) direction in latent space that makes party parrots, as well as other party animals:
32
692
3K
@phillip_isola
Phillip Isola
6 months
Our computer vision textbook is released! Foundations of Computer Vision, with Antonio Torralba and Bill Freeman. It’s been in the works for >10 years. Covers everything from linear filters and camera optics to diffusion models and radiance fields. 1/4
Tweet media one
40
408
2K
@phillip_isola
Phillip Isola
5 months
New paper: The Platonic Representation Hypothesis In which we posit that _different_ foundation models are converging to the _same_ representation of reality. paper: website: code: 1/8
34
257
1K
@phillip_isola
Phillip Isola
2 years
Language-conditional models can act a bit like decision transformers, in that you can prompt them with a desired level of "reward". E.g., want prettier #dalle creations? "Just ask" by adding "[very]^n beautiful": n=0: "A beautiful painting of a mountain next to a waterfall."
Tweet media one
35
215
1K
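The "[very]^n beautiful" prompt family in this thread (n = 0, 1, 6, 22 appear as separate tweets) is trivial to generate programmatically; a minimal sketch, with a helper name of my own choosing:

```python
def beauty_prompt(n):
    """Build the tweet's prompt with n copies of "very" as a crude reward knob."""
    return "A " + "very " * n + "beautiful painting of a mountain next to a waterfall."

print(beauty_prompt(0))  # the n=0 prompt from the tweet
print(beauty_prompt(1))  # the n=1 prompt
```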
@phillip_isola
Phillip Isola
1 year
A simple, fun example to refute the common story that ML can interpolate but not extrapolate: Black dots are training points. Green curve is true data generating function. Blue curve is best fit. Notice how it correctly predicts far outside the training distribution! 1/3
Tweet media one
67
138
1K
@phillip_isola
Phillip Isola
1 year
We’re releasing a new image similarity metric and dataset! --> DreamSim: a metric which outperforms LPIPS, CLIP, and DINO on similarity and retrieval tasks --> NIGHTS: a dataset of synthetic images with human similarity ratings paper+code+data: 1/n
Tweet media one
12
197
925
@phillip_isola
Phillip Isola
2 years
An interesting thing about ChatGPT is you can script in it a bit like you would in a programming language. You can define functions, compose them, etc. Except all in natural language! This means you can write out common tasks and attach them to command names. For example: 1/n
Tweet media one
12
120
809
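The "scripting in natural language" idea above can be sketched as: each command name maps to a plain-English definition, and "calling" commands just means prepending the definitions and composing the command names in the prompt. The *shorten*/*emojify* definitions below are hypothetical paraphrases; the Python scaffolding is mine, not anything ChatGPT-specific:

```python
# Hypothetical natural-language "function library"
DEFINITIONS = {
    "shorten": 'When I type "*shorten* X", rewrite X in half as many words.',
    "emojify": 'When I type "*emojify* X", rewrite X using only emojis.',
}

def build_prompt(commands, text):
    """Compose natural-language commands into one prompt for a chat model."""
    header = "\n".join(DEFINITIONS[c] for c in commands)
    call = " ".join(f"*{c}*" for c in commands) + " " + text
    return header + "\n\n" + call  # send this string to the model

print(build_prompt(["emojify", "shorten"], "[R&J prologue]"))
```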
@phillip_isola
Phillip Isola
7 years
This figure from group norm () is super useful for anyone trying to keep track of how all these things relate:
Tweet media one
2
232
706
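The relationship that figure captures can be stated compactly: the four normalization schemes differ only in which axes of an (N, C, H, W) activation tensor the statistics are computed over. A NumPy sketch (shapes and group count are arbitrary choices of mine):

```python
import numpy as np

def normalize(x, axes):
    """Standardize x over the given axes, as in the normalization-layer family."""
    mu = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    return (x - mu) / np.sqrt(var + 1e-5)

x = np.random.default_rng(0).normal(size=(8, 6, 4, 4))  # (N, C, H, W)

bn = normalize(x, (0, 2, 3))     # BatchNorm: per channel, across the batch
ln = normalize(x, (1, 2, 3))     # LayerNorm: per sample, across all channels
inorm = normalize(x, (2, 3))     # InstanceNorm: per sample, per channel
# GroupNorm: per sample, within groups of channels (here 2 groups of 3)
gn = normalize(x.reshape(8, 2, 3, 4, 4), (2, 3, 4)).reshape(x.shape)
```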
@phillip_isola
Phillip Isola
2 years
Back in 2018 at OpenAI, a few of us wrote a story with gpt as an AI "co-author". We didn't have an AI illustrator back then, but now we sort of do, so I tried plugging the text into #dalle. Here is the result! “The Bees”, a short story by humans & AIs:
Tweet media one
17
82
610
@phillip_isola
Phillip Isola
1 year
This looks like one of those results that marks a phase transition in science: for years people have anticipated that synthetic data would eventually outperform / boost real, but an imagenet scale result has been elusive. Finally models are good enough that it works!
@_akhaliq
AK
1 year
Synthetic Data from Diffusion Models Improves ImageNet Classification abs:
Tweet media one
18
200
950
13
108
611
@phillip_isola
Phillip Isola
2 years
n=22: "A very very very very very very very very very very very very very very very very very very very very very very beautiful painting of a mountain next to a waterfall."
Tweet media one
10
55
550
@phillip_isola
Phillip Isola
10 months
This paper is so cool: It shows several kinds of illusions that I had never seen before (e.g., color inversion illusion). It's exciting to see more and more cases like this, where AI opens up new kinds of art, rather than only imitating old forms.
10
140
561
@phillip_isola
Phillip Isola
2 years
I'm updating the "hacker's guide to deep learning" lecture for a course I'm teaching this semester () -- what are your favorite ~2022 DL tips and tricks that I should definitely include?
16
77
543
@phillip_isola
Phillip Isola
2 years
Wondering what to do in the era of GPT? One answer: do science! There is still so much to understand about _why_ these models work (or don’t). Here’s my group’s latest on the science of deep learning, newly accepted in TMLR:
6
77
538
@phillip_isola
Phillip Isola
2 years
I give a lot of talks but often only a few people see them. I’m going to try a new experiment where I write blog versions of some of the talks I give. Here’s the first one: A short, general-audience intro to "Generative Models of Images" -->
Tweet media one
5
52
389
@phillip_isola
Phillip Isola
1 year
This only worked because we had a good hypothesis space (few params, contains true fn). Point is: if you have the right hypothesis space, you can extrapolate, correctly, far outside the training distribution! 3/3
23
10
368
@phillip_isola
Phillip Isola
4 years
GANs get a lot of press for making photorealistic images, but to me the more impressive and useful feat is that they discover this organized "latent space" that underlies the visual world. This video is a really nice intro to that concept and why it is so cool:
4
65
372
@phillip_isola
Phillip Isola
3 months
I recently gave a talk on the platonic representation hypothesis at the Simons Institute, which is now online here: pdf of the slides for those interested, feel free to reuse:
4
54
343
@phillip_isola
Phillip Isola
3 years
More new work at ICCV: Training a GAN to explain a classifier Idea: visualize how an image would have to change in order for the predicted class to change. Can reveal attributes and biases underlying deep net decisions. w/ big team @GoogleAI -->
3
63
342
@phillip_isola
Phillip Isola
4 years
Surprising and fun result: Unpaired image translation without a deep net, just a _linear_ transformation: (and no GAN too!)
3
62
310
@phillip_isola
Phillip Isola
2 years
Now trying out parameterized natural language commands in ChatGPT. A bit like defining a function F(X;n) where n is a parameter. Here is the prompt: 1/n
Tweet media one
10
27
277
@phillip_isola
Phillip Isola
1 year
Should you train your vision system on real images or synthetic? In the era of stable diffusion, the answer seems to be: synthetic! One stable diffusion sample can be worth more than one real image. paper link:
@dilipkay
Dilip Krishnan
1 year
New paper!! We show that pre-training language-image models *solely* on synthetic images from Stable Diffusion can outperform training on real images!! Work done with @YonglongT (Google), Huiwen Chang (Google), @phillip_isola (MIT) and Lijie Fan (MIT)!!
Tweet media one
12
115
587
8
46
249
@phillip_isola
Phillip Isola
2 years
n=6: "A very very very very very very beautiful painting of a mountain next to a waterfall."
Tweet media one
1
14
217
@phillip_isola
Phillip Isola
4 years
Beautiful summary of different learning problems from a new paper by @__ishaan and David Lopez-Paz (rest of paper also looks interesting)
Tweet media one
0
34
217
@phillip_isola
Phillip Isola
1 year
How is that possible? Because we aren't fitting over all possible mappings x-->y, we are only considering fits of the form y = a*sin(x^2) + b*x. This "hypothesis space" extrapolates in the way shown above. Since the true fn is in this space, the extrapolation is ~correct. 2/3
21
8
205
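Since y = a*sin(x^2) + b*x is linear in the parameters (a, b), the "best fit" in this hypothesis space is just least squares over two basis functions. A sketch of the thread's example, with a made-up training range and made-up true parameters:

```python
import numpy as np

def features(x):
    # Basis functions of the hypothesis space y = a*sin(x^2) + b*x
    return np.stack([np.sin(x**2), x], axis=1)

# Hypothetical training data confined to a narrow range (true a=1, b=0.5)
a_true, b_true = 1.0, 0.5
x_train = np.linspace(0.0, 2.0, 50)
y_train = a_true * np.sin(x_train**2) + b_true * x_train

# Best fit within the hypothesis space: ordinary least squares
(a_hat, b_hat), *_ = np.linalg.lstsq(features(x_train), y_train, rcond=None)

# Predict far outside the training range: the fit extrapolates correctly
x_far = np.array([10.0, 20.0])
y_pred = features(x_far) @ np.array([a_hat, b_hat])
y_true = a_true * np.sin(x_far**2) + b_true * x_far
```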
@phillip_isola
Phillip Isola
11 months
New paper at #CoRL2023 ! "Distilled Feature Fields Enable Few-Shot Language-Guided Manipulation" How should robots represent the world around them? This paper's answer: as a field of foundation model features localized in 3D space. paper+code: 1/n
1
30
202
@phillip_isola
Phillip Isola
6 years
What's cool about this, to me, is that BigGAN learned something that looks like 3D rotation, but it did so just by modeling 2D images:
9
29
196
@phillip_isola
Phillip Isola
3 months
Very interesting work on a question I've been thinking a lot about: when can training a system on X' ~ G outperform training directly on X (where G is a gen model of X)? They find that retrieving task-relevant images from X outperforms sampling task-relevant images from G 1/n
@scottgeng00
Scott Geng
3 months
Will training on AI-generated synthetic data lead to the next frontier of vision models? 🤔 Our new paper suggests NO, for now. Synthetic data doesn't magically enable generalization beyond the generator's original training set. 📜: Details below 🧵 (1/n)
17
95
475
6
21
198
@phillip_isola
Phillip Isola
7 months
Distributed training using parallel LoRAs, infrequently synced. My fav part is the analogy to git, where lots of coders can work together on a project, coordinated by simple operators like pull, commit, merge. Potential implications toward community training of big models.
Tweet media one
@pulkitology
Pulkit Agrawal
7 months
Presenting a method for training models from SCRATCH using LoRA: 💡 20x reduction in communication 💡 3x savings in memory - Find out more: - Code available to try out - Scaling to larger models ongoing - led by Jacob Huh!
Tweet media one
6
57
391
6
23
188
@phillip_isola
Phillip Isola
3 years
Jumping on the CLIP+VQGAN bandwagon: "What is the answer to the ultimate question of life, the universe, and everything?" (using seed=42 of course)
4
24
186
@phillip_isola
Phillip Isola
2 years
Great to see GANs becoming competitive on text-to-image. GANs are still my favorite kind of generative model :)
@_akhaliq
AK
2 years
StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis significantly improves over previous GANs and outperforms distilled diffusion models in terms of sample quality and speed abs: project page:
21
184
902
3
18
164
@phillip_isola
Phillip Isola
1 year
@karpathy Neat! I think this is the same trick as studied in this great old paper?
3
13
165
@phillip_isola
Phillip Isola
5 years
If you are interested in GANs, and around Boston on 5/31, please come check out this workshop we are organizing: . Talks on theory and practice, arts and applications. Also accepting poster submissions here:
@MIT_Quest
MIT Quest for Intelligence
5 years
SAVE THE DATE! We’re co-hosting a GANs workshop @MIT on Friday 5/31 with @MITIBMLab . Tutorials, talks and posters on one of the hottest topics in AI.
0
11
22
6
38
165
@phillip_isola
Phillip Isola
2 years
n=1: "A very beautiful painting of a mountain next to a waterfall."
Tweet media one
1
9
140
@phillip_isola
Phillip Isola
1 year
This post prompted some interesting reactions :) so let me quickly respond to a few: 1. 'but you gave it the answer' -- yes, partially, and that's the point, the hypothesis space gives it the _form_ of the answer
@phillip_isola
Phillip Isola
1 year
A simple, fun example to refute the common story that ML can interpolate but not extrapolate: Black dots are training points. Green curve is true data generating function. Blue curve is best fit. Notice how it correctly predicts far outside the training distribution! 1/3
Tweet media one
67
138
1K
17
8
135
@phillip_isola
Phillip Isola
5 months
Why are reps converging? We suggest a few possibilities, including: As we train on more tasks, there are fewer reps that can satisfy all, leading to an Anna Karenina scenario (): all strong models are alike, each weak model is weak in its own way. 3/8
Tweet media one
2
9
135
@phillip_isola
Phillip Isola
4 months
On my way to CVPR: Antonio, Bill, and I will be at the MIT Press booth on Thursday, 4-4:30pm. We will be happy to sign books if you want to bring yours! We will also raffle away a few copies.
@phillip_isola
Phillip Isola
6 months
Our computer vision textbook is released! Foundations of Computer Vision, with Antonio Torralba and Bill Freeman. It’s been in the works for >10 years. Covers everything from linear filters and camera optics to diffusion models and radiance fields. 1/4
Tweet media one
40
408
2K
1
13
135
@phillip_isola
Phillip Isola
2 years
I like watching image models optimize because the optimization path reveals interesting visual connections. Here's an image being optimized toward “ocean”. First it makes clownfish, then seems to realize “oh those fish could be kayaks instead!” So clever and opportunistic.
Tweet media one
0
11
134
@phillip_isola
Phillip Isola
4 years
It's hard to be surprised these days, but these results surprised me; it really _seems_ to have captured the compositional nature of our world; amazing work!
@ilyasut
Ilya Sutskever
4 years
Synthetic capybaras in different styles
Tweet media one
12
91
667
0
7
131
@phillip_isola
Phillip Isola
9 months
One lens on synthetic data: Often you have a bunch of mappings X-->Y, Y-->Z, etc and you want other mappings implied by these. A simple approach is to use the given mappings to sample training data for the implied mappings. 1/3
3
9
130
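A toy sketch of that recipe (the function name and the toy mappings are mine; in practice f and g would be learned models):

```python
# Given mappings f: X->Y and g: Y->Z, synthesize training pairs
# for the implied mapping X->Z by composing the given mappings.
def make_synthetic_pairs(xs, f, g):
    return [(x, g(f(x))) for x in xs]

# Toy example: X->Y doubles, Y->Z adds one; the implied X->Z is 2x+1.
pairs = make_synthetic_pairs([1, 2, 3], lambda x: 2 * x, lambda y: y + 1)
# pairs is now training data for fitting an X->Z model directly
```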
@phillip_isola
Phillip Isola
3 years
This looks like a fascinating report: It gives a name to a major paradigm shift that has occurred over the last few years in AI: "foundation models" are the big pretrained nets on top of which almost everything else will be made. I love the term.
3
11
129
@phillip_isola
Phillip Isola
4 years
Ever wonder if all the fancy new contrastive objectives would be useful for regular old *supervised* learning? Turns out they can be!
@dilipkay
Dilip Krishnan
4 years
New paper on *Supervised Contrastive Learning*: A new loss function to train supervised deep networks, based on contrastive learning! Our new loss performs significantly better than cross-entropy across a range of architectures and data augmentations.
Tweet media one
11
84
363
5
16
125
@phillip_isola
Phillip Isola
6 months
@sirbayes @sindero We will release a free online version pretty soon! We agreed to release in stages. We definitely want there to be broad access.
7
3
118
@phillip_isola
Phillip Isola
6 months
One of the most fun parts for me has been making visualizations. To give a sample, here are a few, showing 1) embeddings layer by layer in an MLP, 2) weight sharing in a CNN, 3) a diffusion model, 4) an image captioning system 2/4
Tweet media one
Tweet media two
Tweet media three
Tweet media four
4
9
117
@phillip_isola
Phillip Isola
2 years
How much does #dalle know about 3D? Let's see by asking it to render stereo pairs. "An anaglyph photo of a cute lego elephant."
5
13
117
@phillip_isola
Phillip Isola
9 months
More of our work on learning vision from synthetic data from generative models. This time both the images and the text are synthetic!
@lijie_fan
Lijie Fan
9 months
🚀 Is the future of vision models Synthetic? Introducing SynCLR: our new pipeline leveraging LLMs & Text-to-image models to train vision models with only synthetic data! 🔥 Outperforming SOTAs like DinoV2 & CLIP on real images! SynCLR excels in fine-grained classification &
Tweet media one
3
42
197
1
12
114
@phillip_isola
Phillip Isola
5 months
We survey evidence from the literature, then provide several *new* results including: As LLMs get bigger and better, they learn representations that are more and more similar to those learned by vision models. And vice versa: strong visual reps are similar to LLM reps. 2/8
Tweet media one
3
9
113
@phillip_isola
Phillip Isola
6 years
A few more, this time between species. Dog-wolf, goldfinch-bunting, red panda-panda:
3
23
112
@phillip_isola
Phillip Isola
5 years
(1/2) New work! Contrastive Multiview Coding Paper+Code: Different views of the world capture different info, but important factors are shared. Learning to capture the shared info --> SOTA reps. Saturday @ ICML self-sup workshop w/ @YonglongT + @dilipkay
2
31
113
@phillip_isola
Phillip Isola
2 years
Tomorrow at ECCV, we are presenting “Totems: Physical Objects for Verifying Visual Integrity” Remember totems from Inception? We tried to make something *a bit* like that in reality. website: paper: 1/n
Tweet media one
3
18
110
@phillip_isola
Phillip Isola
4 years
Human imagination is compositional: e.g., you can picture the Notre Dame, on a grassy field, surrounded by oaks, ... Turns out GANs can too, in their latent space! We study to what extent GANs can compose parts, and provide some fun tools for doing so in the paper+code below:
@lucyrchai
Lucy Chai
4 years
Excited to share our ICLR 2021 paper on image composition in GAN latent space! joint with @jswulff @phillip_isola paper+code+colab: it's interactive and super fun to play with :)
10
123
570
1
18
104
@phillip_isola
Phillip Isola
4 years
Nice to see more theory on this. Paraphrasing: the only way to correctly colorize pikachu yellow is to first implicitly recognize that you are looking at a picture of pikachu!
@jasondeanlee
Jason Lee
4 years
Predicting What You Already Know Helps: Provable Self-Supervised Learning We analyze how predicting parts of the input from other parts (missing patch, missing word, etc.) helps to learn a representation that linearly separates the downstream task. 1/2
Tweet media one
2
105
521
2
11
99
@phillip_isola
Phillip Isola
3 years
How can we learn good visual reps from *environments*, rather than datasets? Requires exploring env to collect data to train rep We study this as adv game (curiosity) between an explorer and a contrastive learner At ICCV! w/ @du_yilun @gan_chuang -->
Tweet media one
0
8
93
@phillip_isola
Phillip Isola
2 years
"Google search" for generative models: with all the gazillions of models being trained now, I think tools like this will become more and more essential -- very exciting work!
@junyanz89
Jun-Yan Zhu
2 years
Introducing Modelverse (), an online model sharing and search platform, with the mission to help everyone share, discover, and study deep generative models more easily. Please share your models on Modelverse today. [1/4]
12
169
895
0
7
94
@phillip_isola
Phillip Isola
2 years
great! now to finish translating the bard into the twitter-verse, let's turn it into emojis: "*emojify* *shorten* *shorten* *shorten* *shorten* *shorten* *shorten* *simplify* [R&J prologue]" 5/n
Tweet media one
1
1
93
@phillip_isola
Phillip Isola
4 years
Different generative models (not just CNNs) tend to all make similar mistakes, especially at the patch level. This means you can train a fake detector on one kind of fake and it generalizes decently well to detecting fakes from held out models too! We analyze this ability here:
@lucyrchai
Lucy Chai
4 years
Just released our new project on using small patches to classify and visualize where artifacts occur in synthetic facial images, joint with @davidbau , Ser-Nam Lim, @phillip_isola code+paper available at:
1
12
62
1
13
92
@phillip_isola
Phillip Isola
2 years
Well-trained generative models are great but I also love the visual creativity of poorly optimized models. Sometimes the results are more interesting when the model is not quite doing what you told it to do:
Tweet media one
5
5
92
@phillip_isola
Phillip Isola
6 years
Nonparametric image synthesis has gone out of fashion, but this paper shows that it can still do amazing things when combined with deep nets to clean things up: It's amazing how much results have improved in just the last 1.5 years:
1
38
85
@phillip_isola
Phillip Isola
5 years
Work we did on visualizing memorability, using GANs!
@MIT_Quest
MIT Quest for Intelligence
5 years
What makes some images stick in the mind while others fade? Ask a GAN. via @MIT #iccv2019 @phillip_isola @AudeOliva @alexjandonian @L_Goetschalckx
0
5
34
1
9
81
@phillip_isola
Phillip Isola
3 years
My favorite thing here is we are not just supervising our way to this result by imitating artist examples. Rather, good sketches emerge, in part, as a consequence of what may be the _objective_ of line drawings: to communicate geometry and meaning. Demo:
@fredodurand
Fredo Durand
3 years
Our new work (with fun demo!) on making better line drawing by making them informative, as assessed by a neural network's ability to infer depth and semantics. With Caroline Chan and @phillip_isola
Tweet media one
7
37
200
3
10
81
@phillip_isola
Phillip Isola
5 years
Giving tutorial talk today on image-to-image translation. 9am at CVPR deep content creation tutorial: Slides here:
1
5
78
@phillip_isola
Phillip Isola
2 years
Imagine having a personalized library of functions like this written in natural language. I wonder how far you can take it. Can you parameterize the functions ("shorten by X%"), can you define loops ("repeat X times"), ... I guess people will find out! (or probably already have!)
5
1
78
@phillip_isola
Phillip Isola
5 months
What are we converging to? We hypothesize that there is an endpoint to all this convergence: a representation of the joint distribution over the underlying events that cause our observations. 4/8
3
4
74
@phillip_isola
Phillip Isola
7 months
Are game engines world simulators? Given a mesh+texture, a game can render a beautiful depiction of tree bark. But, typically, it doesn't model how the bark came to be in the first place, how the mesh+texture were created. Gen models, in a sense, do. 1/3
5
7
72
@phillip_isola
Phillip Isola
2 years
Revisiting this idea with GPT-4! Prompt: "When I type “eli[N,M] X”, please explain X like I’m age N, using M references to movies. Respond with one sentence." Now maybe it can help us all understand how GPT-4 works... 🧵
@phillip_isola
Phillip Isola
2 years
Now trying out parameterized natural language commands in ChatGPT. A bit like defining a function F(X;n) where n is a parameter. Here is the prompt: 1/n
Tweet media one
10
27
277
5
7
72
@phillip_isola
Phillip Isola
4 years
Super cool results on generative representation learning; thought-provoking about generative versus contrastive. Generative tries to model all info, which may make it less efficient, but perhaps sufficient in the end:
@ilyasut
Ilya Sutskever
4 years
Transformers trained to predict pixels generate plausible completions of images and learn excellent unsupervised image representations! To compensate for their lack of 2d prior knowledge, they are more expensive to train.
3
43
298
0
5
69
@phillip_isola
Phillip Isola
2 years
can't end on a negative one, so let's try to max this out: *peppify[infinity]* It's Saturday and I have to do laundry.
Tweet media one
4
2
68
@phillip_isola
Phillip Isola
6 months
The field changed a lot while writing this. Here is our graph of progress. (I joined in the middle and my first contribution was to slow things down...) But it's been fun to try to connect the old and the new. So many concepts reappear in each era, a spiral of progress. 3/4
Tweet media one
2
3
67
@phillip_isola
Phillip Isola
5 months
In particular, in a specific idealized world, we show that contrastive learners converge to a representation whose kernel is the pointwise mutual information function over the underlying events. On a simple color domain, this empirically holds. 5/8
Tweet media one
2
3
64
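Written out with the standard definition of pointwise mutual information (the notation here is mine, not necessarily the paper's):

```latex
% Kernel of the learned representation f on underlying events x, x':
K(x, x') \;=\; \langle f(x), f(x') \rangle
        \;=\; \mathrm{PMI}(x, x')
        \;=\; \log \frac{p(x, x')}{p(x)\, p(x')}
```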
@phillip_isola
Phillip Isola
1 year
@RRKRahul96 perhaps but I think similar principles help explain the extrapolation properties of modern neural nets. CNNs and transformers extrapolate in part due to the extreme constraints they place on the hypothesis space.
7
5
57
@phillip_isola
Phillip Isola
5 months
However there are many immediate objections to consider: * what about information that is _unique_ to one modality? * what about special-purpose systems that do not require general world knowledge? * are we measuring rep similarity in the right way? There’s lots more to do! 7/8
3
1
58
@phillip_isola
Phillip Isola
4 years
This is very intriguing, suggestive of the underlying sameness of so many problems. Reminds me of the Feynman quote: "Nature uses only the longest threads to weave her patterns, so each small piece of her fabric reveals the organization of the entire tapestry."
@IMordatch
Igor Mordatch
4 years
What are the limits to the generalization of large pretrained transformer models? We find minimal fine-tuning (~0.1% of params) performs as well as training from scratch on a completely new modality! with @_kevinlu , @adityagrover_ , @pabbeel paper: 1/8
4
71
353
1
8
59
@phillip_isola
Phillip Isola
2 years
I think we are in for a very interesting future of creative expression. To me, these tools do change things. Something is lost and something is gained. I really enjoyed making this, but also feel the pain that certain parts of this creative process are no longer uniquely human.
3
3
57
@phillip_isola
Phillip Isola
7 years
Really cool new high-res + editable version of #pix2pix from nvidia and my colleague @junyanz89 :
2
12
55
@phillip_isola
Phillip Isola
3 years
New work to appear at NeurIPS: How can we get agents to communicate meaningfully with each other? Simple idea: just have them broadcast compressed reps of their obs Makes decentralized coordination much easier! --> w/ @ToruO_O J Huh C Stauffer S Lim
Tweet media one
2
6
52
@phillip_isola
Phillip Isola
5 months
with @minyoung_huh , @thisismyhat , @TongzhouWang This will be a position paper at ICML 2024. 8/8
6
1
52
@phillip_isola
Phillip Isola
6 months
Thanks to everyone who helped with this book! Please send us errors and corrections if you find them. More will eventually be made available online. 4/4
4
2
53
@phillip_isola
Phillip Isola
4 years
Fun test to see if you are living in a dream (adapted from Solaris): - Pick a problem you can't mentally solve, but can check e.g., prime factorization - Solve with a computer - Mentally check - If checks out, then you are not in a dream! (at least not one produced by your mind)
5
4
49
@phillip_isola
Phillip Isola
3 years
This is beautiful, and more so when you think about all the technologies that interact to make this possible:
@RiversHaveWings
Rivers Have Wings
3 years
I scaled up CLIPDraw () a bit... "a beautiful epic wondrous fantasy painting of [the ocean / lightning / wind / a deep valley]":
Tweet media one
Tweet media two
Tweet media three
Tweet media four
10
88
428
0
6
50
@phillip_isola
Phillip Isola
3 years
We are presenting a few papers at NeurIPS this week. I’ll be at the posters and would be great to see folks there! Details follow: [1/n]
1
7
47
@phillip_isola
Phillip Isola
1 year
Re StableRep (), getting questions like: "isn't this because SD is trained on bigger data (LAION+CLIP)?" Yes, but I think it's more than that. In (Tab. 2) we equated training data and still found big boost from synthetic over real:
Tweet media one
1
7
47
@phillip_isola
Phillip Isola
5 months
The last sections of our paper explore implications and counterarguments. If there is indeed a platonic representation, then we should keep working to find it. We can marshal all kinds of data and architectures to this cause, rather than proceeding in disciplinary silos. 6/8
1
1
45
@phillip_isola
Phillip Isola
4 months
Super cool new paper/framework from @jxbz Tim Large et al. (in which I had a small role). Enables tuning lr on small model, then using same value on big model. Hopefully helps eliminate wasteful lr sweeps on big models.
@jxbz
Jeremy Bernstein
4 months
New paper and pip package: modula: "Scalable Optimization in the Modular Norm" πŸ“¦ πŸ“ We re-wrote the @pytorch module tree so that training automatically scales across width and depth.
Tweet media one
8
38
177
0
8
47
@phillip_isola
Phillip Isola
3 years
These are beautiful. It's interesting how the objects seem to have their own unique style, a bit distinct from other CLIP/NeRF styles. I'd like to play a video game rendered in this style.
@_akhaliq
AK
3 years
Zero-Shot Text-Guided Object Generation with Dream Fields abs: project page: combine neural rendering with multi-modal image and text representations to synthesize diverse 3D objects solely from natural language descriptions
5
62
316
0
6
46
@phillip_isola
Phillip Isola
2 years
"An anaglyph photo of a penguin flying through the jungle." #dalle 1/2
Tweet media one
Tweet media two
Tweet media three
2
2
43
@phillip_isola
Phillip Isola
6 years
Blog post (+updated paper and code) on our latest work, where we learn a loss function to train RL agents! w/ @rein_houthooft @richardchen100 Bradly Stadie, @fjwolski Jonathan Ho @pabbeel @jackclarkSF @OpenAI
@OpenAI
OpenAI
6 years
Releasing Evolved Policy Gradients, an experimental metalearning technique to let agents rapidly learn to solve novel tasks:
15
222
612
0
16
44
@phillip_isola
Phillip Isola
6 years
1
1
42
@phillip_isola
Phillip Isola
4 years
For US folks, if you haven't yet and can, please don't forget to vote!
0
0
42
@phillip_isola
Phillip Isola
2 years
This is cool because we usually think of GPT-3 as being an ungrounded language model but actually it has a bit of visual grounding built in:
@stas_kulesh
Stas Kulesh
2 years
I’ve asked my little AI tool (powered by @OpenAI’s GPT-3) to generate some palettes. The prompt was: “Generate a list of 7 hex codes for a color palette based on the description: DESCRIPTION” This is what I got back.
16
46
414
3
1
40
@phillip_isola
Phillip Isola
2 years
Giving 2 talks tomorrow at CVPR: 9:15am at MultiEarth : I'll talk about generative data for multiview learning 10am at Social Intelligence : I'll try to port some lessons from CV to multiagent AI Feel free to attend + say hi!
1
3
39
@phillip_isola
Phillip Isola
3 years
Super fun paper. I like thinking of GANs as producing a new kind of data, better and easier to work with than normal data, and this paper demonstrates that beautifully. With GAN data, finding correspondences becomes easy.
@_akhaliq
AK
3 years
GAN-Supervised Dense Visual Alignment abs: project page: github:
2
38
209
0
2
39
@phillip_isola
Phillip Isola
5 years
Surprising results! The class embedding space of BigGAN is more expressive than I would have thought. Favorite result: BigGAN trained on ImageNet can generate fairly plausible Places photos.
@anh_ng8
Anh Nguyen (Totti)
5 years
BigGAN samples are famously photo-realistic but limited in diversity for some classes. Slightly modifying only the class embeddings (network unchanged) can reduce the diversity gap by ~50%! Work with Long Mai and led by fantastic @MkQili !! Paper & video:
Tweet media one
2
64
234
0
4
39
@phillip_isola
Phillip Isola
2 years
Why is this interesting? Because, imo, gen models are the future of datasets. They are just like datasets but better. They serve the same purpose as datasets (they store data) but they can also do more: they are continuous, differentiable, and have latent controls. 3/n
Tweet media one
4
6
39
@phillip_isola
Phillip Isola
6 years
The simplicity of this model and result is beautiful: just train a big, task-agnostic generative model of text and you get a representation that, with a bit of finetuning, gives state of the art results on numerous problems in language.
@AlecRad
Alec Radford
6 years
What I've been working on for the past year! Inspired by CoVE, ELMo, and ULMFiT we show that a single transformer language model can be finetuned to a wide variety of NLP tasks and performs very well with little tuning/tweaking.
34
457
1K
0
7
36