Phillip Isola Profile
Phillip Isola

@phillip_isola

15,238
Followers
157
Following
106
Media
605
Statuses

Associate Professor in EECS at MIT, trying to understand intelligence.

Joined December 2016
@phillip_isola
Phillip Isola
6 years
#BigGAN is so much fun. I stumbled upon a (circular) direction in latent space that makes party parrots, as well as other party animals:
32
692
3K
@phillip_isola
Phillip Isola
6 months
Our computer vision textbook is released! Foundations of Computer Vision, with Antonio Torralba and Bill Freeman. It’s been in the works for >10 years. Covers everything from linear filters and camera optics to diffusion models and radiance fields. 1/4
Tweet media one
40
408
2K
@phillip_isola
Phillip Isola
5 months
New paper: The Platonic Representation Hypothesis In which we posit that _different_ foundation models are converging to the _same_ representation of reality. paper: website: code: 1/8
34
257
1K
@phillip_isola
Phillip Isola
2 years
Language-conditional models can act a bit like decision transformers, in that you can prompt them with a desired level of "reward". E.g., want prettier #dalle creations? "Just ask" by adding "[very]^n beautiful": n=0: "A beautiful painting of a mountain next to a waterfall."
Tweet media one
35
215
1K
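The "[very]^n beautiful" prompt family in this thread (n = 0, 1, 6, 22 appear as separate tweets) is trivial to generate programmatically; a minimal sketch, with a helper name of my own choosing:

```python
def beauty_prompt(n):
    """Build the tweet's prompt with n copies of "very" as a crude reward knob."""
    return "A " + "very " * n + "beautiful painting of a mountain next to a waterfall."

print(beauty_prompt(0))  # the n=0 prompt from the tweet
print(beauty_prompt(1))  # the n=1 prompt
```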
@phillip_isola
Phillip Isola
1 year
A simple, fun example to refute the common story that ML can interpolate but not extrapolate: Black dots are training points. Green curve is true data generating function. Blue curve is best fit. Notice how it correctly predicts far outside the training distribution! 1/3
Tweet media one
67
138
1K
@phillip_isola
Phillip Isola
1 year
We’re releasing a new image similarity metric and dataset! --> DreamSim: a metric which outperforms LPIPS, CLIP, and DINO on similarity and retrieval tasks --> NIGHTS: a dataset of synthetic images with human similarity ratings paper+code+data: 1/n
Tweet media one
12
197
925
@phillip_isola
Phillip Isola
2 years
An interesting thing about ChatGPT is you can script in it a bit like you would in a programming language. You can define functions, compose them, etc. Except all in natural language! This means you can write out common tasks and attach them to command names. For example: 1/n
Tweet media one
12
120
809
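The "scripting in natural language" idea above can be sketched as: each command name maps to a plain-English definition, and "calling" commands just means prepending the definitions and composing the command names in the prompt. The *shorten*/*emojify* definitions below are hypothetical paraphrases; the Python scaffolding is mine, not anything ChatGPT-specific:

```python
# Hypothetical natural-language "function library"
DEFINITIONS = {
    "shorten": 'When I type "*shorten* X", rewrite X in half as many words.',
    "emojify": 'When I type "*emojify* X", rewrite X using only emojis.',
}

def build_prompt(commands, text):
    """Compose natural-language commands into one prompt for a chat model."""
    header = "\n".join(DEFINITIONS[c] for c in commands)
    call = " ".join(f"*{c}*" for c in commands) + " " + text
    return header + "\n\n" + call  # send this string to the model

print(build_prompt(["emojify", "shorten"], "[R&J prologue]"))
```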
@phillip_isola
Phillip Isola
7 years
This figure from group norm () is super useful for anyone trying to keep track of how all these things relate:
Tweet media one
2
232
706
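The relationship that figure captures can be stated compactly: the four normalization schemes differ only in which axes of an (N, C, H, W) activation tensor the statistics are computed over. A NumPy sketch (shapes and group count are arbitrary choices of mine):

```python
import numpy as np

def normalize(x, axes):
    """Standardize x over the given axes, as in the normalization-layer family."""
    mu = x.mean(axis=axes, keepdims=True)
    var = x.var(axis=axes, keepdims=True)
    return (x - mu) / np.sqrt(var + 1e-5)

x = np.random.default_rng(0).normal(size=(8, 6, 4, 4))  # (N, C, H, W)

bn = normalize(x, (0, 2, 3))     # BatchNorm: per channel, across the batch
ln = normalize(x, (1, 2, 3))     # LayerNorm: per sample, across all channels
inorm = normalize(x, (2, 3))     # InstanceNorm: per sample, per channel
# GroupNorm: per sample, within groups of channels (here 2 groups of 3)
gn = normalize(x.reshape(8, 2, 3, 4, 4), (2, 3, 4)).reshape(x.shape)
```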
@phillip_isola
Phillip Isola
2 years
Back in 2018 at OpenAI, a few of us wrote a story with gpt as an AI "co-author". We didn't have an AI illustrator back then, but now we sort of do, so I tried plugging the text into #dalle. Here is the result! “The Bees”, a short story by humans & AIs:
Tweet media one
17
82
610
@phillip_isola
Phillip Isola
1 year
This looks like one of those results that marks a phase transition in science: for years people have anticipated that synthetic data would eventually outperform / boost real, but an imagenet scale result has been elusive. Finally models are good enough that it works!
@_akhaliq
AK
1 year
Synthetic Data from Diffusion Models Improves ImageNet Classification abs:
Tweet media one
18
200
950
13
108
611
@phillip_isola
Phillip Isola
2 years
n=22: "A very very very very very very very very very very very very very very very very very very very very very very beautiful painting of a mountain next to a waterfall."
Tweet media one
10
55
550
@phillip_isola
Phillip Isola
10 months
This paper is so cool: It shows several kinds of illusions that I had never seen before (e.g., color inversion illusion). It's exciting to see more and more cases like this, where AI opens up new kinds of art, rather than only imitating old forms.
10
140
561
@phillip_isola
Phillip Isola
2 years
I'm updating the "hacker's guide to deep learning" lecture for a course I'm teaching this semester () -- what are your favorite ~2022 DL tips and tricks that I should definitely include?
16
77
543
@phillip_isola
Phillip Isola
2 years
Wondering what to do in the era of GPT? One answer: do science! There is still so much to understand about _why_ these models work (or don’t). Here’s my group’s latest on the science of deep learning, newly accepted in TMLR:
6
77
538
@phillip_isola
Phillip Isola
2 years
I give a lot of talks but often only a few people see them. I’m going to try a new experiment where I write blog versions of some of the talks I give. Here’s the first one: A short, general-audience intro to "Generative Models of Images" -->
Tweet media one
5
52
389
@phillip_isola
Phillip Isola
1 year
This only worked because we had a good hypothesis space (few params, contains true fn). Point is: if you have the right hypothesis space, you can extrapolate, correctly, far outside the training distribution! 3/3
23
10
368
@phillip_isola
Phillip Isola
4 years
GANs get a lot of press for making photorealistic images, but to me the more impressive and useful feat is that they discover this organized "latent space" that underlies the visual world. This video is a really nice intro to that concept and why it is so cool:
4
65
372
@phillip_isola
Phillip Isola
3 months
I recently gave a talk on the platonic representation hypothesis at the Simons Institute, which is now online here: pdf of the slides for those interested, feel free to reuse:
4
54
343
@phillip_isola
Phillip Isola
3 years
More new work at ICCV: Training a GAN to explain a classifier Idea: visualize how an image would have to change in order for the predicted class to change. Can reveal attributes and biases underlying deep net decisions. w/ big team @GoogleAI -->
3
63
342
@phillip_isola
Phillip Isola
4 years
Surprising and fun result: Unpaired image translation without a deep net, just a _linear_ transformation: (and no GAN too!)
3
62
310
@phillip_isola
Phillip Isola
2 years
Now trying out parameterized natural language commands in ChatGPT. A bit like defining a function F(X;n) where n is a parameter. Here is the prompt: 1/n
Tweet media one
10
27
277
@phillip_isola
Phillip Isola
1 year
Should you train your vision system on real images or synthetic? In the era of stable diffusion, the answer seems to be: synthetic! One stable diffusion sample can be worth more than one real image. paper link:
@dilipkay
Dilip Krishnan
1 year
New paper!! We show that pre-training language-image models *solely* on synthetic images from Stable Diffusion can outperform training on real images!! Work done with @YonglongT (Google), Huiwen Chang (Google), @phillip_isola (MIT) and Lijie Fan (MIT)!!
Tweet media one
12
115
587
8
46
249
@phillip_isola
Phillip Isola
2 years
n=6: "A very very very very very very beautiful painting of a mountain next to a waterfall."
Tweet media one
1
14
217
@phillip_isola
Phillip Isola
4 years
Beautiful summary of different learning problems from a new paper by @__ishaan and David Lopez-Paz (rest of paper also looks interesting)
Tweet media one
0
34
217
@phillip_isola
Phillip Isola
1 year
How is that possible? Because we aren't fitting over all possible mappings x-->y, we are only considering fits of the form y = a*sin(x^2) + b*x. This "hypothesis space" extrapolates in the way shown above. Since the true fn is in this space, the extrapolation is ~correct. 2/3
21
8
205
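Since y = a*sin(x^2) + b*x is linear in the parameters (a, b), the "best fit" in this hypothesis space is just least squares over two basis functions. A sketch of the thread's example, with a made-up training range and made-up true parameters:

```python
import numpy as np

def features(x):
    # Basis functions of the hypothesis space y = a*sin(x^2) + b*x
    return np.stack([np.sin(x**2), x], axis=1)

# Hypothetical training data confined to a narrow range (true a=1, b=0.5)
a_true, b_true = 1.0, 0.5
x_train = np.linspace(0.0, 2.0, 50)
y_train = a_true * np.sin(x_train**2) + b_true * x_train

# Best fit within the hypothesis space: ordinary least squares
(a_hat, b_hat), *_ = np.linalg.lstsq(features(x_train), y_train, rcond=None)

# Predict far outside the training range: the fit extrapolates correctly
x_far = np.array([10.0, 20.0])
y_pred = features(x_far) @ np.array([a_hat, b_hat])
y_true = a_true * np.sin(x_far**2) + b_true * x_far
```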
@phillip_isola
Phillip Isola
11 months
New paper at #CoRL2023 ! "Distilled Feature Fields Enable Few-Shot Language-Guided Manipulation" How should robots represent the world around them? This paper's answer: as a field of foundation model features localized in 3D space. paper+code: 1/n
1
30
202
@phillip_isola
Phillip Isola
6 years
What's cool about this, to me, is that BigGAN learned something that looks like 3D rotation, but it did so just by modeling 2D images:
9
29
196
@phillip_isola
Phillip Isola
3 months
Very interesting work on a question I've been thinking a lot about: when can training a system on X' ~ G outperform training directly on X (where G is a gen model of X)? They find that retrieving task-relevant images from X outperforms sampling task-relevant images from G 1/n
@scottgeng00
Scott Geng
3 months
Will training on AI-generated synthetic data lead to the next frontier of vision models? 🤔 Our new paper suggests NO, for now. Synthetic data doesn't magically enable generalization beyond the generator's original training set. 📜: Details below 🧵 (1/n)
17
95
475
6
21
198
@phillip_isola
Phillip Isola
7 months
Distributed training using parallel LoRAs, infrequently synced. My fav part is the analogy to git, where lots of coders can work together on a project, coordinated by simple operators like pull, commit, merge. Potential implications toward community training of big models.
Tweet media one
@pulkitology
Pulkit Agrawal
7 months
Presenting a method for training models from SCRATCH using LoRA: 💡 20x reduction in communication 💡 3x savings in memory - Find out more: - Code available to try out - Scaling to larger models ongoing - led by Jacob Huh!
Tweet media one
6
57
391
6
23
188
@phillip_isola
Phillip Isola
3 years
Jumping on the CLIP+VQGAN bandwagon: "What is the answer to the ultimate question of life, the universe, and everything?" (using seed=42 of course)
4
24
186
@phillip_isola
Phillip Isola
2 years
Great to see GANs becoming competitive on text-to-image. GANs are still my favorite kind of generative model :)
@_akhaliq
AK
2 years
StyleGAN-T: Unlocking the Power of GANs for Fast Large-Scale Text-to-Image Synthesis significantly improves over previous GANs and outperforms distilled diffusion models in terms of sample quality and speed abs: project page:
21
184
902
3
18
164
@phillip_isola
Phillip Isola
1 year
@karpathy Neat! I think this is the same trick as studied in this great old paper?
3
13
165
@phillip_isola
Phillip Isola
5 years
If you are interested in GANs, and around Boston on 5/31, please come check out this workshop we are organizing: . Talks on theory and practice, arts and applications. Also accepting poster submissions here:
@MIT_Quest
MIT Quest for Intelligence
5 years
SAVE THE DATE! We’re co-hosting a GANs workshop @MIT on Friday 5/31 with @MITIBMLab . Tutorials, talks and posters on one of the hottest topics in AI.
0
11
22
6
38
165
@phillip_isola
Phillip Isola
2 years
n=1: "A very beautiful painting of a mountain next to a waterfall."
Tweet media one
1
9
140
@phillip_isola
Phillip Isola
1 year
This post prompted some interesting reactions :) so let me quickly respond to a few: 1. 'but you gave it the answer' -- yes, partially, and that's the point, the hypothesis space gives it the _form_ of the answer
@phillip_isola
Phillip Isola
1 year
A simple, fun example to refute the common story that ML can interpolate but not extrapolate: Black dots are training points. Green curve is true data generating function. Blue curve is best fit. Notice how it correctly predicts far outside the training distribution! 1/3
Tweet media one
67
138
1K
17
8
135
@phillip_isola
Phillip Isola
5 months
Why are reps converging? We suggest a few possibilities, including: As we train on more tasks, there are fewer reps that can satisfy all, leading to an Anna Karenina scenario (): all strong models are alike, each weak model is weak in its own way. 3/8
Tweet media one
2
9
135
@phillip_isola
Phillip Isola
4 months
On my way to CVPR: Antonio, Bill, and I will be at the MIT Press booth on Thursday, 4-4:30pm. We will be happy to sign books if you want to bring yours! We will also raffle away a few copies.
@phillip_isola
Phillip Isola
6 months
Our computer vision textbook is released! Foundations of Computer Vision, with Antonio Torralba and Bill Freeman. It’s been in the works for >10 years. Covers everything from linear filters and camera optics to diffusion models and radiance fields. 1/4
Tweet media one
40
408
2K
1
13
135
@phillip_isola
Phillip Isola
2 years
I like watching image models optimize because the optimization path reveals interesting visual connections. Here's an image being optimized toward “ocean”. First it makes clownfish, then seems to realize “oh those fish could be kayaks instead!” So clever and opportunistic.
Tweet media one
0
11
134
@phillip_isola
Phillip Isola
4 years
It's hard to be surprised these days, but these results surprised me; it really _seems_ to have captured the compositional nature of our world; amazing work!
@ilyasut
Ilya Sutskever
4 years
Synthetic capybaras in different styles
Tweet media one
12
91
667
0
7
131
@phillip_isola
Phillip Isola
9 months
One lens on synthetic data: Often you have a bunch of mappings X-->Y, Y-->Z, etc and you want other mappings implied by these. A simple approach is to use the given mappings to sample training data for the implied mappings. 1/3
3
9
130
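A toy sketch of that recipe (the function name and the toy mappings are mine; in practice f and g would be learned models):

```python
# Given mappings f: X->Y and g: Y->Z, synthesize training pairs
# for the implied mapping X->Z by composing the given mappings.
def make_synthetic_pairs(xs, f, g):
    return [(x, g(f(x))) for x in xs]

# Toy example: X->Y doubles, Y->Z adds one; the implied X->Z is 2x+1.
pairs = make_synthetic_pairs([1, 2, 3], lambda x: 2 * x, lambda y: y + 1)
# pairs is now training data for fitting an X->Z model directly
```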
@phillip_isola
Phillip Isola
3 years
This looks like a fascinating report: It gives a name to a major paradigm shift that has occurred over the last few years in AI: "foundation models" are the big pretrained nets on top of which almost everything else will be made. I love the term.
3
11
129
@phillip_isola
Phillip Isola
4 years
Ever wonder if all the fancy new contrastive objectives would be useful for regular old *supervised* learning? Turns out they can be!
@dilipkay
Dilip Krishnan
4 years
New paper on *Supervised Contrastive Learning*: A new loss function to train supervised deep networks, based on contrastive learning! Our new loss performs significantly better than cross-entropy across a range of architectures and data augmentations.
Tweet media one
11
84
363
5
16
125
@phillip_isola
Phillip Isola
6 months
@sirbayes @sindero We will release a free online version pretty soon! We agreed to release in stages. We definitely want there to be broad access.
7
3
118
@phillip_isola
Phillip Isola
6 months
One of the most fun parts for me has been making visualizations. To give a sample, here are a few, showing 1) embeddings layer by layer in an MLP, 2) weight sharing in a CNN, 3) a diffusion model, 4) an image captioning system 2/4
Tweet media one
Tweet media two
Tweet media three
Tweet media four
4
9
117
@phillip_isola
Phillip Isola
2 years
How much does #dalle know about 3D? Let's see by asking it to render stereo pairs. "An anaglyph photo of a cute lego elephant."
5
13
117
@phillip_isola
Phillip Isola
9 months
More of our work on learning vision from synthetic data from generative models. This time both the images and the text are synthetic!
@lijie_fan
Lijie Fan
9 months
🚀 Is the future of vision models Synthetic? Introducing SynCLR: our new pipeline leveraging LLMs & Text-to-image models to train vision models with only synthetic data! 🔥 Outperforming SOTAs like DinoV2 & CLIP on real images! SynCLR excels in fine-grained classification &
Tweet media one
3
42
197
1
12
114
@phillip_isola
Phillip Isola
5 months
We survey evidence from the literature, then provide several *new* results including: As LLMs get bigger and better, they learn representations that are more and more similar to those learned by vision models. And vice versa: strong visual reps are similar to LLM reps. 2/8
Tweet media one
3
9
113
@phillip_isola
Phillip Isola
6 years
A few more, this time between species. Dog-wolf, goldfinch-bunting, red panda-panda:
3
23
112
@phillip_isola
Phillip Isola
5 years
(1/2) New work! Contrastive Multiview Coding Paper+Code: Different views of the world capture different info, but important factors are shared. Learning to capture the shared info --> SOTA reps. Saturday @ ICML self-sup workshop w/ @YonglongT + @dilipkay
2
31
113
@phillip_isola
Phillip Isola
2 years
Tomorrow at ECCV, we are presenting “Totems: Physical Objects for Verifying Visual Integrity” Remember totems from Inception? We tried to make something *a bit* like that in reality. website: paper: 1/n
Tweet media one
3
18
110
@phillip_isola
Phillip Isola
4 years
Human imagination is compositional: e.g., you can picture the Notre Dame, on a grassy field, surrounded by oaks, ... Turns out GANs can too, in their latent space! We study to what extent GANs can compose parts, and provide some fun tools for doing so in the paper+code below:
@lucyrchai
Lucy Chai
4 years
Excited to share our ICLR 2021 paper on image composition in GAN latent space! joint with @jswulff @phillip_isola paper+code+colab: it's interactive and super fun to play with :)
10
123
570
1
18
104
@phillip_isola
Phillip Isola
4 years
Nice to see more theory on this. Paraphrasing: the only way to correctly colorize pikachu yellow is to first implicitly recognize that you are looking at a picture of pikachu!
@jasondeanlee
Jason Lee
4 years
Predicting What You Already Know Helps: Provable Self-Supervised Learning We analyze how predicting parts of the input from other parts (missing patch, missing word, etc.) helps to learn a representation that linearly separates the downstream task. 1/2
Tweet media one
2
105
521
2
11
99
@phillip_isola
Phillip Isola
3 years
How can we learn good visual reps from *environments*, rather than datasets? Requires exploring env to collect data to train rep We study this as adv game (curiosity) between an explorer and a contrastive learner At ICCV! w/ @du_yilun @gan_chuang -->
Tweet media one
0
8
93
@phillip_isola
Phillip Isola
2 years
"Google search" for generative models: with all the gazillions of models being trained now, I think tools like this will become more and more essential -- very exciting work!
@junyanz89
Jun-Yan Zhu
2 years
Introducing Modelverse (), an online model sharing and search platform, with the mission to help everyone share, discover, and study deep generative models more easily. Please share your models on Modelverse today. [1/4]
12
169
895
0
7
94
@phillip_isola
Phillip Isola
2 years
great! now to finish translating the bard into the twitter-verse, let's turn it into emojis: "*emojify* *shorten* *shorten* *shorten* *shorten* *shorten* *shorten* *simplify* [R&J prologue]" 5/n
Tweet media one
1
1
93
@phillip_isola
Phillip Isola
4 years
Different generative models (not just CNNs) tend to all make similar mistakes, especially at the patch level. This means you can train a fake detector on one kind of fake and it generalizes decently well to detecting fakes from held out models too! We analyze this ability here:
@lucyrchai
Lucy Chai
4 years
Just released our new project on using small patches to classify and visualize where artifacts occur in synthetic facial images, joint with @davidbau , Ser-Nam Lim, @phillip_isola code+paper available at:
1
12
62
1
13
92
@phillip_isola
Phillip Isola
2 years
Well-trained generative models are great but I also love the visual creativity of poorly optimized models. Sometimes the results are more interesting when the model is not quite doing what you told it to do:
Tweet media one
5
5
92
@phillip_isola
Phillip Isola
6 years
Nonparametric image synthesis has gone out of fashion, but this paper shows that it can still do amazing things when combined with deep nets to clean things up: It's amazing how much results have improved in just the last 1.5 years:
1
38
85
@phillip_isola
Phillip Isola
5 years
Work we did on visualizing memorability, using GANs!
@MIT_Quest
MIT Quest for Intelligence
5 years
What makes some images stick in the mind while others fade? Ask a GAN. via @MIT #iccv2019 @phillip_isola @AudeOliva @alexjandonian @L_Goetschalckx
0
5
34
1
9
81
@phillip_isola
Phillip Isola
3 years
My favorite thing here is we are not just supervising our way to this result by imitating artist examples. Rather, good sketches emerge, in part, as a consequence of what may be the _objective_ of line drawings: to communicate geometry and meaning. Demo:
@fredodurand
Fredo Durand
3 years
Our new work (with fun demo!) on making better line drawing by making them informative, as assessed by a neural network's ability to infer depth and semantics. With Caroline Chan and @phillip_isola
Tweet media one
7
37
200
3
10
81
@phillip_isola
Phillip Isola
5 years
Giving tutorial talk today on image-to-image translation. 9am at CVPR deep content creation tutorial: Slides here:
1
5
78
@phillip_isola
Phillip Isola
2 years
Imagine having a personalized library of functions like this written in natural language. I wonder how far you can take it. Can you parameterize the functions ("shorten by X%"), can you define loops ("repeat X times"), ... I guess people will find out! (or probably already have!)
5
1
78
@phillip_isola
Phillip Isola
5 months
What are we converging to? We hypothesize that there is an endpoint to all this convergence: a representation of the joint distribution over the underlying events that cause our observations. 4/8
3
4
74
@phillip_isola
Phillip Isola
7 months
Are game engines world simulators? Given a mesh+texture, a game can render a beautiful depiction of tree bark. But, typically, it doesn't model how the bark came to be in the first place, how the mesh+texture were created. Gen models, in a sense, do. 1/3
5
7
72
@phillip_isola
Phillip Isola
2 years
Revisiting this idea with GPT-4! Prompt: "When I type “eli[N,M] X”, please explain X like I’m age N, using M references to movies. Respond with one sentence." Now maybe it can help us all understand how GPT-4 works... 🧵
@phillip_isola
Phillip Isola
2 years
Now trying out parameterized natural language commands in ChatGPT. A bit like defining a function F(X;n) where n is a parameter. Here is the prompt: 1/n
Tweet media one
10
27
277
5
7
72
@phillip_isola
Phillip Isola
4 years
Super cool results on generative representation learning; thought-provoking about generative versus contrastive. Generative tries to model all info, which may make it less efficient, but perhaps sufficient in the end:
@ilyasut
Ilya Sutskever
4 years
Transformers trained to predict pixels generate plausible completions of images and learn excellent unsupervised image representations! To compensate for their lack of 2d prior knowledge, they are more expensive to train.
3
43
298
0
5
69
@phillip_isola
Phillip Isola
2 years
can't end on a negative one, so let's try to max this out: *peppify[infinity]* It's Saturday and I have to do laundry.
Tweet media one
4
2
68
@phillip_isola
Phillip Isola
6 months
The field changed a lot while writing this. Here is our graph of progress. (I joined in the middle and my first contribution was to slow things down...) But it's been fun to try to connect the old and the new. So many concepts reappear in each era, a spiral of progress. 3/4
Tweet media one
2
3
67
@phillip_isola
Phillip Isola
5 months
In particular, in a specific idealized world, we show that contrastive learners converge to a representation whose kernel is the pointwise mutual information function over the underlying events. On a simple color domain, this empirically holds. 5/8
Tweet media one
2
3
64
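Written out with the standard definition of pointwise mutual information (the notation here is mine, not necessarily the paper's):

```latex
% Kernel of the learned representation f on underlying events x, x':
K(x, x') \;=\; \langle f(x), f(x') \rangle
        \;=\; \mathrm{PMI}(x, x')
        \;=\; \log \frac{p(x, x')}{p(x)\, p(x')}
```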
@phillip_isola
Phillip Isola
1 year
@RRKRahul96 perhaps but I think similar principles help explain the extrapolation properties of modern neural nets. CNNs and transformers extrapolate in part due to the extreme constraints they place on the hypothesis space.
7
5
57
@phillip_isola
Phillip Isola
5 months
However there are many immediate objections to consider: * what about information that is _unique_ to one modality? * what about special-purpose systems that do not require general world knowledge? * are we measuring rep similarity in the right way? There’s lots more to do! 7/8
3
1
58
@phillip_isola
Phillip Isola
4 years
This is very intriguing, suggestive of the underlying sameness of so many problems. Reminds me of the Feynman quote: "Nature uses only the longest threads to weave her patterns, so each small piece of her fabric reveals the organization of the entire tapestry."
@IMordatch
Igor Mordatch
4 years
What are the limits to the generalization of large pretrained transformer models? We find minimal fine-tuning (~0.1% of params) performs as well as training from scratch on a completely new modality! with @_kevinlu , @adityagrover_ , @pabbeel paper: 1/8
4
71
353
1
8
59
@phillip_isola
Phillip Isola
2 years
I think we are in for a very interesting future of creative expression. To me, these tools do change things. Something is lost and something is gained. I really enjoyed making this, but also feel the pain that certain parts of this creative process are no longer uniquely human.
3
3
57
@phillip_isola
Phillip Isola
7 years
Really cool new high-res + editable version of #pix2pix from nvidia and my colleague @junyanz89 :
2
12
55
@phillip_isola
Phillip Isola
3 years
New work to appear at NeurIPS: How can we get agents to communicate meaningfully with each other? Simple idea: just have them broadcast compressed reps of their obs Makes decentralized coordination much easier! --> w/ @ToruO_O J Huh C Stauffer S Lim
Tweet media one
2
6
52
@phillip_isola
Phillip Isola
5 months
with @minyoung_huh , @thisismyhat , @TongzhouWang This will be a position paper at ICML 2024. 8/8
6
1
52
@phillip_isola
Phillip Isola
6 months
Thanks to everyone who helped with this book! Please send us errors and corrections if you find them. More will eventually be made available online. 4/4
4
2
53
@phillip_isola
Phillip Isola
4 years
Fun test to see if you are living in a dream (adapted from Solaris): - Pick a problem you can't mentally solve, but can check e.g., prime factorization - Solve with a computer - Mentally check - If checks out, then you are not in a dream! (at least not one produced by your mind)
5
4
49
@phillip_isola
Phillip Isola
3 years
This is beautiful, and more so when you think about all the technologies that interact to make this possible:
@RiversHaveWings
Rivers Have Wings
3 years
I scaled up CLIPDraw () a bit... "a beautiful epic wondrous fantasy painting of [the ocean / lightning / wind / a deep valley]":
Tweet media one
Tweet media two
Tweet media three
Tweet media four
10
88
428
0
6
50
@phillip_isola
Phillip Isola
3 years
We are presenting a few papers at NeurIPS this week. I’ll be at the posters and would be great to see folks there! Details follow: [1/n]
1
7
47
@phillip_isola
Phillip Isola
1 year
Re StableRep (), getting questions like: "isn't this because SD is trained on bigger data (LAION+CLIP)?" Yes, but I think it's more than that. In (Tab. 2) we equated training data and still found big boost from synthetic over real:
Tweet media one
1
7
47
@phillip_isola
Phillip Isola
5 months
The last sections of our paper explore implications and counterarguments. If there is indeed a platonic representation, then we should keep working to find it. We can marshal all kinds of data and architectures to this cause, rather than proceeding in disciplinary silos. 6/8
1
1
45
@phillip_isola
Phillip Isola
4 months
Super cool new paper/framework from @jxbz Tim Large et al. (in which I had a small role). Enables tuning lr on small model, then using same value on big model. Hopefully helps eliminate wasteful lr sweeps on big models.
@jxbz
Jeremy Bernstein
4 months
New paper and pip package: modula: "Scalable Optimization in the Modular Norm" πŸ“¦ πŸ“ We re-wrote the @pytorch module tree so that training automatically scales across width and depth.
Tweet media one
8
38
177
0
8
47
@phillip_isola
Phillip Isola
3 years
These are beautiful. It's interesting how the objects seem to have their own unique style, a bit distinct from other CLIP/NeRF styles. I'd like to play a video game rendered in this style.
@_akhaliq
AK
3 years
Zero-Shot Text-Guided Object Generation with Dream Fields abs: project page: combine neural rendering with multi-modal image and text representations to synthesize diverse 3D objects solely from natural language descriptions
5
62
316
0
6
46
@phillip_isola
Phillip Isola
2 years
"An anaglyph photo of a penguin flying through the jungle." #dalle 1/2
Tweet media one
Tweet media two
Tweet media three
2
2
43
@phillip_isola
Phillip Isola
6 years
Blog post (+updated paper and code) on our latest work, where we learn a loss function to train RL agents! w/ @rein_houthooft @richardchen100 Bradly Stadie, @fjwolski Jonathan Ho @pabbeel @jackclarkSF @OpenAI
@OpenAI
OpenAI
6 years
Releasing Evolved Policy Gradients, an experimental metalearning technique to let agents rapidly learn to solve novel tasks:
15
222
612
0
16
44
@phillip_isola
Phillip Isola
6 years
1
1
42
@phillip_isola
Phillip Isola
4 years
For US folks, if you haven't yet and can, please don't forget to vote!
0
0
42
@phillip_isola
Phillip Isola
2 years
This is cool because we usually think of GPT-3 as being an ungrounded language model but actually it has a bit of visual grounding built in:
@stas_kulesh
Stas Kulesh
2 years
I’ve asked my little AI tool (powered by @OpenAI’s GPT-3) to generate some palettes. The prompt was: “Generate a list of 7 hex codes for a color palette based on the description: DESCRIPTION” This is what I got back.
16
46
414
3
1
40
@phillip_isola
Phillip Isola
2 years
Giving 2 talks tomorrow at CVPR: 9:15am at MultiEarth : I'll talk about generative data for multiview learning 10am at Social Intelligence : I'll try to port some lessons from CV to multiagent AI Feel free to attend + say hi!
1
3
39
@phillip_isola
Phillip Isola
3 years
Super fun paper. I like thinking of GANs as producing a new kind of data, better and easier to work with than normal data, and this paper demonstrates that beautifully. With GAN data, finding correspondences becomes easy.
@_akhaliq
AK
3 years
GAN-Supervised Dense Visual Alignment abs: project page: github:
2
38
209
0
2
39
@phillip_isola
Phillip Isola
5 years
Surprising results! The class embedding space of BigGAN is more expressive than I would have thought. Favorite result: BigGAN trained on ImageNet can generate fairly plausible Places photos.
@anh_ng8
Anh Nguyen (Totti)
5 years
BigGAN samples are famously photo-realistic but limited in diversity for some classes. Slightly modifying only the class embeddings (network unchanged) can reduce the diversity gap by ~50%! Work with Long Mai and led by fantastic @MkQili !! Paper & video:
Tweet media one
2
64
234
0
4
39
@phillip_isola
Phillip Isola
2 years
Why is this interesting? Because, imo, gen models are the future of datasets. They are just like datasets but better. They serve the same purpose as datasets (they store data) but they can also do more: they are continuous, differentiable, and have latent controls. 3/n
Tweet media one
4
6
39
@phillip_isola
Phillip Isola
6 years
The simplicity of this model and result is beautiful: just train a big, task-agnostic generative model of text and you get a representation that, with a bit of finetuning, gives state of the art results on numerous problems in language.
@AlecRad
Alec Radford
6 years
What I've been working on for the past year! Inspired by CoVE, ELMo, and ULMFiT we show that a single transformer language model can be finetuned to a wide variety of NLP tasks and performs very well with little tuning/tweaking.
34
457
1K
0
7
36