Jonathan Lorraine

@jonLorraine9

4,808 Followers · 5,343 Following · 76 Media · 155 Statuses

Research scientist @NVIDIA | PhD in machine learning @UofT . Previously @Google / @MetaAI . Opinions are my own. 🤖 💻 ☕️

Toronto, Ontario
Joined November 2017
Pinned Tweet
@jonLorraine9
Jonathan Lorraine
5 months
New #NVIDIA #GTC24 paper 🎊 We generate high-quality 3D assets in only 400ms from text by combining (a) amortized optimization for speed, (b) surface rendering for quality, and (c) 3D data for robustness. ☕ LATTE3D project details: 🧵with many fun gifs
4
30
96
@jonLorraine9
Jonathan Lorraine
2 years
Gradient descent in games (like GANs) rotates around solutions. We solve this with a simple trick: complex momentum damps the oscillations. Come check out our #AISTATS2022 talk and poster: With @davidjesusacu @PaulVicol @DavidDuvenaud
4
74
493
@jonLorraine9
Jonathan Lorraine
1 month
Thrilled to announce I've completed my PhD! 🎉 My advisor, @DavidDuvenaud 's support and guidance made the journey enjoyable. You showed me that research can be fun, and I will always cherish our brainstorming sessions on the whiteboard. Thank you for taking a chance on me and
Tweet media one
25
3
276
@jonLorraine9
Jonathan Lorraine
25 days
⚡ My PhD thesis, “Scalable Nested Optimization for Deep Learning,” is now on arXiv! ⚡ tl;dr: We develop various optimization tools; highlights include: · Making the momentum coefficient complex for adversarial games like GANs. · Optimizing millions of hyperparameters
6
25
236
@jonLorraine9
Jonathan Lorraine
3 years
When minimizing objectives, randomly initializing + optimizing can fail to find different solutions. We give a branching optimization method for this using Lyapunov exponents..1/11 @AAMAS2022 With @PaulVicol @jparkerholder @TalKachman @Luke_Metz @j_foerst
3
43
218
@jonLorraine9
Jonathan Lorraine
16 days
Excited that I've reached 1000 citations! Incredibly grateful for all the support from my co-authors, mentors, and peers
Tweet media one
4
1
129
@jonLorraine9
Jonathan Lorraine
2 months
New #NVIDIA paper: Improving Hyperparameter Optimization with Checkpointed Model Weights We enhance hyperparameter optimization by adding the ability to condition cheap-to-evaluate surrogates for the loss on checkpointed model weights with a graph metanetwork. This allows us to
2
19
117
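A minimal sketch of the idea in the tweet above: condition a cheap loss surrogate on features of checkpointed weights plus the hyperparameters. The paper uses a graph metanetwork to encode the weights; here simple per-tensor statistics stand in for that encoder, and names like `weight_features` and `Surrogate` are hypothetical.

```python
# Hedged sketch: condition a loss surrogate on checkpointed model weights.
# Per-tensor statistics stand in for the paper's graph-metanetwork encoder.
import torch
import torch.nn as nn

def weight_features(state_dict):
    """Cheap stand-in for a learned weight encoder: per-tensor statistics."""
    feats = []
    for w in state_dict.values():
        w = w.float().flatten()
        feats += [w.mean(), w.std(), w.abs().max()]
    return torch.stack(feats)

class Surrogate(nn.Module):
    """Predicts final validation loss from (hyperparameters, weight features)."""
    def __init__(self, n_hparams, n_weight_feats, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_hparams + n_weight_feats, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, hparams, weight_feats):
        return self.net(torch.cat([hparams, weight_feats], dim=-1)).squeeze(-1)

# Fit `Surrogate` with MSE on logged (hparams, checkpoint, final loss) tuples,
# then rank candidate hyperparameters by predicted loss instead of retraining.
```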
@jonLorraine9
Jonathan Lorraine
11 months
New #NVIDIA paper: Real-time text-to-3D generation #ICCV2023 3D generation from text requires expensive per-prompt optimization. We train 1 model on many prompts for real-time generalization to unseen prompts, interpolations and more! ATT3D details:
1
27
112
@jonLorraine9
Jonathan Lorraine
2 years
When minimizing, randomly initializing + optimizing can fail to find different solutions. We give a branching optimization method using Lyapunov exponents..1/12 Come watch our oral @AAMAS2022 ! w/ @PaulVicol @jparkerholder @TalKachman @Luke_Metz @j_foerst
4
15
105
@jonLorraine9
Jonathan Lorraine
2 years
Thrilled to be starting my internship at @NVIDIA / @NVIDIAAI as a research scientist with @FidlerSanja ’s group at the Toronto AI Lab () 😃
4
1
87
@jonLorraine9
Jonathan Lorraine
3 years
Excited to begin my internship @google working on AutoML 😃
1
0
71
@jonLorraine9
Jonathan Lorraine
3 years
Thrilled to start my research internship @facebookai where I'll be working with @j_foerst 😃
2
1
68
@jonLorraine9
Jonathan Lorraine
2 years
ICML paper: Does your bilevel optimization behave in unexpected ways? This can arise when the inner or outer problem is overparameterized. We examine the surprising implicit bias of common strategies! w/ @PaulVicol @fpedregosa @DavidDuvenaud @RogerGrosse
Tweet media one
@PaulVicol
Paul Vicol
2 years
Our ICML paper: In bilevel optimization, usually the inner or outer problem is overparameterized. We investigate implicit bias of warm- vs cold-start algorithms and hypergradient approximations. With @jonLorraine9 @fpedregosa @DavidDuvenaud @RogerGrosse
Tweet media one
1
16
102
0
5
52
@jonLorraine9
Jonathan Lorraine
3 years
Excited to share our latest paper - Complex Momentum for Optimization in Games! Check out the main thread for more details. This animation shows sweeps over optimizer hypers. In purely adversarial games, any momentum with non-zero phase/arg has optimizer settings that converge.
@DavidDuvenaud
David Duvenaud
3 years
Gradient descent in differentiable games rotates around solutions instead of converging. For instance, in GANs. We solve this with a simple trick: complex momentum damps the oscillations. With @jonLorraine9 @davidjesusacu @PaulVicol
19
190
1K
1
5
31
@jonLorraine9
Jonathan Lorraine
3 years
If you want to tune high-dimensional hyperparameters for pre-training in arbitrary domains, check out this paper [at #NeurIPS2021 ]!
@DavidDuvenaud
David Duvenaud
3 years
Pre-training large models is useful, but adds many hyperparameters. E.g. task weights, or augmentations in SimCLR. We give a scalable, gradient-based way to tune these: w/ @RaghuAniruddh @jonLorraine9 @skornblith @MattBMcDermott
Tweet media one
6
41
228
0
4
26
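A hedged sketch of the gradient-based tuning idea in the quoted tweet, not the paper's exact method: differentiate the validation loss through one unrolled training step with respect to continuous task weights. The helpers `task_loss` and `functional_val_loss`, and the inputs `model`, `tasks`, and `val_batch`, are hypothetical placeholders.

```python
# Hedged one-step hypergradient sketch for tuning task-weight hyperparameters.
import torch

def one_step_hypergradient(model, tasks, val_batch, log_task_weights, lr=1e-2):
    # log_task_weights: tensor with requires_grad=True; exp keeps weights positive.
    weights = log_task_weights.exp()
    train_loss = sum(w * task_loss(model, batch)          # weighted multi-task loss
                     for w, (task_loss, batch) in zip(weights, tasks))
    grads = torch.autograd.grad(train_loss, list(model.parameters()),
                                create_graph=True)        # keep graph for 2nd diff
    # Simulated SGD step, differentiable with respect to the task weights.
    updated = [p - lr * g for p, g in zip(model.parameters(), grads)]
    val_loss = functional_val_loss(updated, val_batch)    # hypothetical helper
    return torch.autograd.grad(val_loss, log_task_weights)[0]
```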
@jonLorraine9
Jonathan Lorraine
2 years
Our method is a two-line change from standard momentum updates in JAX and PyTorch. It still gives real-valued updates. This is the code for our paper:
Tweet media one
4
0
23
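A minimal sketch of what such a two-line change could look like, assuming the convention of a complex momentum coefficient and buffer with only the real part of the buffer applied to the parameters; the signs and default phase here are illustrative, not the released code.

```python
# Hedged sketch of SGD with a complex momentum coefficient.
import numpy as np

def sgd_complex_momentum(params, grad_fn, lr=0.01,
                         beta=0.9 * np.exp(1j * np.pi / 8), steps=100):
    buf = np.zeros_like(params, dtype=np.complex128)   # change 1: complex buffer
    for _ in range(steps):
        g = grad_fn(params)
        buf = beta * buf - g                           # change 2: complex beta
        params = params + lr * buf.real                # updates stay real-valued
    return params
```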
@jonLorraine9
Jonathan Lorraine
3 years
Our work is inspired by and builds on “Ridge Rider: Finding Diverse Solutions by Following Eigenvectors of the Hessian”..11/11 from @jparkerholder @Luke_Metz @cinjoncin Hengyuan Hu @adamlerer @_aletcher @alex_peys @aldopacchiano @j_foerst
1
2
21
@jonLorraine9
Jonathan Lorraine
3 years
By varying the sampled subspace and game, there is a wide range of phenomena. If you’re working on optimization in differentiable games and looking for interesting, visualizable, 2-D diagnostic problems, try out this technique! 8/11
Tweet media one
1
2
20
@jonLorraine9
Jonathan Lorraine
11 months
If you are in Paris, come check out our work on real-time text-to-3D generation at #ICCV2023 on Friday from 10:30 AM-12:30 PM in Room "Foyer Sud" - 035!
@jonLorraine9
Jonathan Lorraine
11 months
New #NVIDIA paper: Real-time text-to-3D generation #ICCV2023 3D generation from text requires expensive per-prompt optimization. We train 1 model on many prompts for real-time generalization to unseen prompts, interpolations and more! ATT3D details:
1
27
112
0
3
19
@jonLorraine9
Jonathan Lorraine
3 years
Ridge rider (RR) finds diverse solutions in single-objective optimization by branching optimization at saddle points. Our method - Generalized RR (GRR) - branches at bifurcation points where small parameter changes lead to different learning dynamics..2/11
1
1
16
@jonLorraine9
Jonathan Lorraine
9 months
#NeurIPS2023 You can just ask LLMs which hyperparameters to use, and it works pretty well! w/ @michaelrzhang , Nishkrit Desai, @juhan_bae , and @jimmybajimmyba You can even directly optimize your model’s code with this.
@michaelrzhang
Michael Zhang
9 months
We've spent a lot of effort tuning hyperparameters for large models.. but what if they could also help us tune hyperparameters? New paper at NeurIPS FMDM workshop w/ Nishkrit Desai @juhan_bae @jonLorraine9 @jimmybajimmyba 🧵
Tweet media one
3
8
55
0
0
15
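A minimal sketch of the ask-an-LLM loop described above. `call_llm` is a hypothetical stand-in for whichever chat API is used; the paper's prompts and parsing differ.

```python
# Hedged sketch: ask an LLM for the next hyperparameter configuration.
import json

def suggest_hyperparameters(task_description, search_space, history):
    prompt = (
        f"You are tuning a model. Task: {task_description}\n"
        f"Search space: {json.dumps(search_space)}\n"
        f"Previous trials (config -> validation loss): {json.dumps(history)}\n"
        "Reply with a JSON object giving the next configuration to try."
    )
    reply = call_llm(prompt)          # hypothetical chat-completion call
    return json.loads(reply)

# Example call:
# suggest_hyperparameters(
#     "CIFAR-10 ResNet training",
#     {"lr": "loguniform(1e-4, 1e-1)", "weight_decay": "loguniform(1e-6, 1e-2)"},
#     [{"lr": 0.01, "weight_decay": 1e-4, "val_loss": 0.71}])
```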
@jonLorraine9
Jonathan Lorraine
3 years
Why Lyapunov exponents? They measure optimization trajectory separation speed, for perturbations in different directions (shown in blue/green). Trajectories separate fastest at bifurcations, where even small perturbations can cause trajectories to go to different solutions..5/11
Tweet media one
1
1
15
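A minimal sketch of that measurement for a single optimizer map, using a finite-difference perturbation (a careful estimator would, e.g., renormalize the perturbation periodically):

```python
# Hedged sketch: estimate how fast two nearby optimization trajectories separate.
import numpy as np

def lyapunov_exponent(step_fn, x0, direction, eps=1e-6, T=200):
    """step_fn maps parameters to parameters (one optimizer step); x0 is an array."""
    x, x_pert = x0.copy(), x0 + eps * direction / np.linalg.norm(direction)
    for _ in range(T):
        x, x_pert = step_fn(x), step_fn(x_pert)
    separation = np.linalg.norm(x_pert - x)
    return np.log(separation / eps) / T   # average exponential separation rate

# Example: gradient descent on f(x) = (x**2 - 1)**2
# step = lambda x: x - 0.05 * 4 * x * (x**2 - 1)
# lyapunov_exponent(step, np.array([0.1]), np.array([1.0]))
```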
@jonLorraine9
Jonathan Lorraine
3 years
Congrats @DavidDuvenaud !
@SloanFoundation
Sloan Foundation
3 years
Introducing… the winners of this year’s Sloan Research Fellowship! These extraordinary researchers represent some of the most exciting young minds working today—and we are thrilled to support them. Meet the winners here: #SloanFellow
Tweet media one
14
51
341
0
0
15
@jonLorraine9
Jonathan Lorraine
3 years
We scale up our method, finding diverse solutions in the iterated prisoner’s dilemma and evaluating the approximate Lyapunov exponent for GANs..10/11 Accepted @aamas2022 #AAMAS2022 Work helped by @VectorInst @UofTCompSci @MetaAI @GoogleAI @AI_Radboud
1
1
15
@jonLorraine9
Jonathan Lorraine
3 years
@3blue1brown Curious to see what happens if we use our method on the “Newton Fractal” – i.e., optimizing a cubic with Newton’s method. A great problem to test recursive branching 😀
0
1
13
@jonLorraine9
Jonathan Lorraine
3 years
You can create toy problems with various bifurcations using your favorite differentiable games – ex., the Iterated Prisoner’s Dilemma (IPD) or GANs – by taking a random subspace from each player’s parameters, which they optimize in. The exponent is largest near bifurcations..7/11
Tweet media one
1
1
13
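A minimal sketch of that construction: restrict each player to one random direction in its full parameter space, giving a 2-D game in the two scalar coordinates. `loss1`, `loss2`, and the base parameters are hypothetical placeholders for your game (GAN, IPD, ...):

```python
# Hedged sketch: build a 2-D diagnostic game from a random subspace per player.
import numpy as np

rng = np.random.default_rng(0)

def make_2d_game(loss1, loss2, theta1_base, theta2_base):
    d1 = rng.standard_normal(theta1_base.shape); d1 /= np.linalg.norm(d1)
    d2 = rng.standard_normal(theta2_base.shape); d2 /= np.linalg.norm(d2)

    def restricted_losses(a1, a2):
        # Each player's parameters vary only along its sampled direction.
        t1 = theta1_base + a1 * d1
        t2 = theta2_base + a2 * d2
        return loss1(t1, t2), loss2(t1, t2)

    return restricted_losses  # a 2-D game in the scalars (a1, a2)
```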
@jonLorraine9
Jonathan Lorraine
3 years
We include ablations over various design choices. For example, how long of an optimization horizon is necessary to find bifurcations..9/11
1
1
13
@jonLorraine9
Jonathan Lorraine
3 years
Leveraging automatic differentiation, it’s simple to use gradient-based optimization on the exponents to move our starting parameters to regions where bifurcations occur..6/11
Tweet media one
1
1
12
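A toy, hedged illustration of that idea: compute the separation exponent from a differentiable unroll, then use autodiff to ascend it, nudging the starting point toward a bifurcation. This 1-D example is illustrative only.

```python
# Hedged sketch: gradient ascent on a trajectory-separation exponent via autodiff.
import torch

def f(x):                      # toy objective with minima at x = +/- 1
    return (x ** 2 - 1) ** 2

def separation_exponent(x0, eps=1e-4, lr=0.05, T=20):
    """Log separation rate of two nearby gradient-descent trajectories."""
    x, x_p = x0, x0 + eps
    for _ in range(T):
        x   = x   - lr * torch.autograd.grad(f(x),   x,   create_graph=True)[0]
        x_p = x_p - lr * torch.autograd.grad(f(x_p), x_p, create_graph=True)[0]
    return torch.log(torch.abs(x_p - x) / eps) / T

x0 = torch.tensor(0.7, requires_grad=True)
for _ in range(20):            # ascend the exponent with respect to the start point
    lam = separation_exponent(x0)
    g, = torch.autograd.grad(lam, x0)
    x0 = (x0 + 0.1 * g).detach().requires_grad_(True)
# x0 should drift toward x = 0, where nearby trajectories split toward +/- 1.
```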
@jonLorraine9
Jonathan Lorraine
3 years
Ridge rider only finds saddle bifurcations. But, how can we find more general bifurcations, like those occurring in differentiable games? With Lyapunov exponent based objectives, leveraging a broad body of work from dynamical systems! 4/11
1
1
11
@jonLorraine9
Jonathan Lorraine
3 years
In single-objective optimization, the parameter updates form a conservative vector field, where saddle points are the only relevant bifurcation. However, in differentiable games the updates can form a non-conservative field, giving rise to new bifurcations - ex., Hopf..3/11
Tweet media one
1
1
11
@jonLorraine9
Jonathan Lorraine
2 years
@aamas2022 @PaulVicol @jparkerholder @TalKachman @Luke_Metz @j_foerst 📽️ Our talk is now available on YouTube! 📢 In-person talk + Q & A at #AAMAS oral sessions 1A6-2 (12-13 EST) and 3B6-2 (21-22 EST)
0
3
10
@jonLorraine9
Jonathan Lorraine
2 years
We’re not making momentum complex-valued just for the sake of it - it just happens to be a simple way to implement a generalized aggregated momentum ( with @james_r_lucas et al.) for the multi-player setting.
Tweet media one
1
0
9
@jonLorraine9
Jonathan Lorraine
2 years
Our complex buffer stores old gradient info, oscillating between adding & subtracting at a specified frequency. Classical momentum adds gradients, negative momentum () alternates every step.
Tweet media one
1
0
8
@jonLorraine9
Jonathan Lorraine
2 years
Notably, complex momentum can approach the acceleration of classical momentum in real eigenspaces, while still converging in purely imaginary eigenspaces
Tweet media one
1
1
9
@jonLorraine9
Jonathan Lorraine
2 years
We also develop a complex-valued Adam, which we use to train BigGAN on CIFAR-10 to better inception scores. The extra overhead is one momentum buffer, and one hyperparameter, which we give a practical initial guess for.
Tweet media one
1
0
7
@jonLorraine9
Jonathan Lorraine
4 months
@dereklim_lzh #NVIDIA #ICLR2024 spotlight paper: Graph Metanetworks We give a framework for processing neural nets with other neural nets, improving expressiveness and performance. My favorite use is generating the weights of implicit neural representations -- ex., in 3D generation
0
1
8
@jonLorraine9
Jonathan Lorraine
2 years
@VectorInst @UofTCompSci @MetaAI @GoogleAI @AI_Radboud Our work is inspired by and builds on “Ridge Rider: Finding Diverse Solutions by Following Eigenvectors of the Hessian”..12/12 from @jparkerholder @Luke_Metz @cinjoncin Hengyuan Hu @adamlerer @_aletcher @alex_peys @aldopacchiano @j_foerst
Tweet media one
1
0
8
@jonLorraine9
Jonathan Lorraine
2 years
An amazing guide! A great distillation of all the writing improvements you helped me with
@j_foerst
Jakob Foerster
2 years
I drafted a quick "How to" guide for writing ML papers. I hope this will be useful (if a little late!) for #NeurIPS2022 . Happy paper writing and best of luck!!
24
274
1K
0
1
8
@jonLorraine9
Jonathan Lorraine
5 months
Or, we could use LATTE3D to initialize text-to-4D methods like Align Your Gaussians: ( @HuanLing6 , @seungkim0123 , @karsten_kreis ). Here, we initialize with “DSLR photo of Rottweiler” and animate with “A dog running fast.”
1
3
7
@jonLorraine9
Jonathan Lorraine
2 years
Why Lyapunov exponents? They measure optimization trajectory separation speed, for perturbations in different directions (shown in blue/green). Trajectories separate fastest at bifurcations, where even small perturbations can cause trajectories to go to different solutions..5/12
Tweet media one
1
0
7
@jonLorraine9
Jonathan Lorraine
2 years
@3blue1brown - curious to see what happens if we use our method on the “Newton Fractal” – i.e., optimizing a cubic with Newton’s method. A good problem to test recursive branching :)
0
1
6
@jonLorraine9
Jonathan Lorraine
2 years
Open problems: Can we improve on our complex Adam? Which eigenstructures are useful for analysis of nested optimization, like GANs or meta-learning? Are there better forms of recurrently linked momentum?
2
1
6
@jonLorraine9
Jonathan Lorraine
2 years
Definitely some significant differences in even the simplest ways to featurize the tasks! I wonder if other teams find similar types of shifts 🤔
Tweet media one
@adamboazbecker
Adam Becker Habibi
2 years
The AutoML team at Google releasing the distribution of tasks that their users actually bring them ("prod"), as opposed to the tasks we typically find in the wild ("dev"). This is great: we need much more of this! Thank you for doing this @jonLorraine9
Tweet media one
1
1
4
0
1
6
@jonLorraine9
Jonathan Lorraine
2 years
By varying the sampled subspace and game, there is a wide range of phenomena. If you’re working on optimization in differentiable games and looking for interesting, visualizable, 2-D diagnostic problems, try out this technique! 8/12
Tweet media one
1
0
6
@jonLorraine9
Jonathan Lorraine
2 years
Ridge rider (RR) finds diverse solutions in single-objective optimization by branching optimization at saddle points. Our method - Generalized RR (GRR) - branches at bifurcation points where small parameter changes lead to different learning dynamics..2/12
1
0
5
@jonLorraine9
Jonathan Lorraine
2 years
In single-objective optimization, the parameter updates form a conservative vector field, where saddle points are the only relevant bifurcation. However, in differentiable games the updates can form a non-conservative field, giving rise to new bifurcations - ex., Hopf..3/12
Tweet media one
1
0
5
@jonLorraine9
Jonathan Lorraine
11 months
Our work is inspired by and builds on DreamFusion (, @poolio , @ajayjain , @jon_barron , @BenMildenHall ), Score-Jacobian Chaining (, @__whc__ , @RaymondYeh ), Magic3D (), and more!
1
0
4
@jonLorraine9
Jonathan Lorraine
2 years
Leveraging automatic differentiation, it’s simple to use gradient-based optimization on the exponents to move our starting parameters to regions where bifurcations occur..6/12
Tweet media one
1
0
5
@jonLorraine9
Jonathan Lorraine
5 months
Our work builds on MVDream (, @jianglong_ye , @jerrykingpku , @kejie_li , @YangZuoshi ), Magic3D (), ATT3D (), DreamFusion (, @poolio , @ajayjain , @jon_barron , @BenMildenHall ), and more!
1
0
5
@jonLorraine9
Jonathan Lorraine
2 years
Ridge rider only finds saddle bifurcations. But, how can we find more general bifurcations, like those occurring in differentiable games? With Lyapunov exponent-based objectives, leveraging a broad body of work from dynamical systems! 4/12
1
0
5
@jonLorraine9
Jonathan Lorraine
3 years
@mvladymyrov Very cool idea! Is it possible/easy to differentiate through the support set to the hyper transformer?
1
0
5
@jonLorraine9
Jonathan Lorraine
5 months
I’m most excited about using (vision) LLMs to build 3D worlds given access to fast text-to-3D generative tooling!
0
0
5
@jonLorraine9
Jonathan Lorraine
2 years
We scale up our method, finding diverse solutions in the iterated prisoner’s dilemma and evaluating the approximate Lyapunov exponent for GANs..10/12
Tweet media one
1
0
5
@jonLorraine9
Jonathan Lorraine
2 years
We include ablations over various design choices. For example, how long of an optimization horizon is necessary to find bifurcations..9/12
1
1
4
@jonLorraine9
Jonathan Lorraine
5 months
Work done with @kevincxie , @TianshiCao , @JunGao33210520 , @james_r_lucas , Antonio Torralba, @FidlerSanja , and Xiaohui Zeng at the @NVIDIA Toronto AI Lab: Supported indirectly by @MIT_CSAIL , @VectorInst , @UofTCompSci / @UofTArtSci / @UofT #UofT
Tweet media one
1
0
4
@jonLorraine9
Jonathan Lorraine
2 months
I’m most excited to see hyperparameter optimization methods of the future train surrogates on large amounts of existing optimization metadata – now including checkpointed model weights – to create methods that generalize to optimizing a diverse set of problems efficiently.
0
0
4
@jonLorraine9
Jonathan Lorraine
2 years
📢Talk + Q & A at #AAMAS oral sessions 1A6-2 (12-13 EST) and 3B6-2 (21-22 EST)..11/12 #AAMAS2022 Work helped by @VectorInst @UofTCompSci @MetaAI @GoogleAI @AI_Radboud
1
0
4
@jonLorraine9
Jonathan Lorraine
2 years
You can create toy problems with various bifurcations using your favorite differentiable games – ex., the Iterated Prisoner’s Dilemma (IPD) or GANs – by taking a random subspace from each player’s parameters, which they optimize in. The exponent is largest near bifurcations..7/12
Tweet media one
1
0
4
@jonLorraine9
Jonathan Lorraine
3 years
@DavidDuvenaud We can tune the phase or argument of our momentum coefficient to perform well on a wide range of games, ranging from minimization (i.e., real eigenvalues) to games with mixtures of real and adversarial (i.e. complex) eigenvalues.
Tweet media one
1
0
3
@jonLorraine9
Jonathan Lorraine
3 years
@bouzoukipunks @DavidDuvenaud @davidjesusacu @PaulVicol Great point! This plot is only for simultaneous updates. We compared simultaneous and alternating update versions, where negative momentum works well!
Tweet media one
1
0
3
@jonLorraine9
Jonathan Lorraine
5 months
LATTE3D has 2 stages: 1) Volumetric rendering trains texture & geometry, using 3D-aware SDS gradient & mask comparison for robustness. 2) Surface-based rendering enhances texture quality. Both stages optimize across prompts for quick generation.
Tweet media one
1
1
3
@jonLorraine9
Jonathan Lorraine
3 years
@bouzoukipunks @DavidDuvenaud @davidjesusacu @PaulVicol Alternating updates are bottlenecked by computing the first player's gradient before the second. If we can parallelize or share computation of both players' gradients - as is common in deep learning setups - we may want simultaneous updates, else alternating is likely better.
Tweet media one
1
0
3
@jonLorraine9
Jonathan Lorraine
11 months
I’m excited to extend our semantic prompt interpolation abilities. I foresee arbitrarily long 3D animations (4D) output by interpolating through a dense prompt chain. Think of extending the seasonal trees with huge prompt chains from LLMs. We could grow plants from seeds or more
0
0
2
@jonLorraine9
Jonathan Lorraine
3 years
@DavidDuvenaud We perform competitively with standard first-order optimizers. We tuned the extrapolation parameters for extragradient (EG) and optimistic gradient (OG), so every method shown generalizes gradient descent-ascent by adding a single parameter.
Tweet media one
0
0
2
@jonLorraine9
Jonathan Lorraine
11 months
Our method augments the text-to-3D pipeline to re-use the text-to-image model's text embedding to condition our 3D representation. We use a NeRF whose spatial parameters are output by a mapping hypernetwork that takes the text embedding as input.
Tweet media one
1
0
2
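A minimal sketch of that amortized mapping, with illustrative dimensions and module names rather than the paper's architecture:

```python
# Hedged sketch: a mapping network from a text embedding to NeRF parameters,
# so one network serves many prompts.
import torch
import torch.nn as nn

class TextToNeRFParams(nn.Module):
    def __init__(self, text_dim=768, hidden=512, nerf_param_dim=100_000):
        super().__init__()
        self.mapping = nn.Sequential(
            nn.Linear(text_dim, hidden), nn.SiLU(),
            nn.Linear(hidden, nerf_param_dim),   # flattened spatial parameters
        )

    def forward(self, text_embedding):
        return self.mapping(text_embedding)

# At render time, the predicted vector would be reshaped into the NeRF's
# spatial weights and rendered as usual; training backpropagates the
# text-to-3D loss into the mapping network across many prompts.
```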
@jonLorraine9
Jonathan Lorraine
11 months
For more details, check out the 3-minute video explaining our method. We also have a 30-second rundown and a 10-minute talk on our project website.
1
0
2
@jonLorraine9
Jonathan Lorraine
3 years
Alternating updates are bottlenecked by computing the first player's gradient before the second. If we can parallelize or share computation of both players' gradients - as is common in deep learning setups - we may want simultaneous updates, else alternating is likely better.
Tweet media one
1
0
2
@jonLorraine9
Jonathan Lorraine
11 months
Now, we can interpolate between prompts! Watch it make 3D animations transforming a 🐸 to a 🐻, interpolating 👗designs, simulating a 🚗’s wear and tear, or aging a 🐲 in real time.
1
0
2
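A minimal sketch of how such interpolations could be produced: blend the two prompts' text embeddings and query the amortized model at each blend weight. `embed_text` and `generate_3d` are hypothetical stand-ins for the text encoder and the trained model.

```python
# Hedged sketch: prompt interpolation by blending text embeddings.
import numpy as np

def interpolate_prompts(prompt_a, prompt_b, n_frames=30):
    e_a, e_b = embed_text(prompt_a), embed_text(prompt_b)
    frames = []
    for t in np.linspace(0.0, 1.0, n_frames):
        e_t = (1.0 - t) * e_a + t * e_b      # linear blend in embedding space
        frames.append(generate_3d(e_t))      # one real-time forward pass each
    return frames                            # e.g., frog -> bear transformation
```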
@jonLorraine9
Jonathan Lorraine
5 months
We quantitatively assess our method with the user-preference rate relative to LATTE3D at varying compute horizons. A rate below 50% indicates users prefer LATTE3D on average. We want methods in the top left.
Tweet media one
1
0
2
@jonLorraine9
Jonathan Lorraine
11 months
We can improve the quality of prompt interpolations by training on interpolants. This helps avoid simply dissolving between the prompts. We look at training on interpolations in the text-embedding, the loss, and more.
Tweet media one
1
0
1
@jonLorraine9
Jonathan Lorraine
5 months
Or, we can make creative stylizations of different shapes.
1
0
1
@jonLorraine9
Jonathan Lorraine
5 months
Using our optional point-cloud input, we can easily create higher-quality variants of user-provided reference shapes.
1
0
1
@jonLorraine9
Jonathan Lorraine
2 years
@Foivos_Diak @davidjesusacu @PaulVicol @DavidDuvenaud Thanks! Code will be made available with the AISTATS proceedings
0
0
1
@jonLorraine9
Jonathan Lorraine
5 months
Users can now quickly design entire scenes with one of our models by rapidly iterating on the design of an individual object or the collection of objects they use.
1
1
1
@jonLorraine9
Jonathan Lorraine
3 years
@veemon13 @DavidDuvenaud @davidjesusacu @PaulVicol Check out - potential benefits: (1) Different selections of complex momentum work best for different mixtures of eigenspaces. In some mixtures, negative momentum is best, while in others having an imaginary part helps.
Tweet media one
@jonLorraine9
Jonathan Lorraine
3 years
We can tune the phase or argument of our momentum coefficient to perform well on a wide range of games, ranging from minimization (i.e., real eigenvalues) to games with mixtures of real and adversarial (i.e. complex) eigenvalues.
Tweet media one
1
0
0
0
0
1
@jonLorraine9
Jonathan Lorraine
5 months
We support an optional, fast test-time optimization when a user desires a further quality boost on any prompt.
1
0
1
@jonLorraine9
Jonathan Lorraine
11 months
We generalize to unseen compositional prompts with the "a pig {activity} {theme}" template. Diagonal red boxes are held-out tests, with activities/themes as rows/columns. Since per-prompt optimization cannot generalize to unseen prompts without further optimization, we show its initialization.
1
0
1
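A minimal sketch of building such a compositional prompt grid with held-out diagonal entries; the activity and theme lists below are illustrative, not the paper's.

```python
# Hedged sketch: compositional prompt grid with held-out diagonal test cells.
activities = ["riding a bike", "playing the piano", "wearing a backpack"]
themes = ["made of lego", "in a snowy forest", "wearing a top hat"]

prompts = [f"a pig {a} {t}" for a in activities for t in themes]
held_out = {f"a pig {a} {t}" for i, a in enumerate(activities)
            for j, t in enumerate(themes) if i == j}   # diagonal test cells
train_prompts = [p for p in prompts if p not in held_out]
```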
@jonLorraine9
Jonathan Lorraine
11 months
We can fuse our method's initial guess for unseen prompts with finetuning methods, like the second stage of Magic3D. This allows us to easily increase the resolution from 64x64 to 512x512.
1
0
1
@jonLorraine9
Jonathan Lorraine
5 months
We include a narrated 30-second summary video here; our project webpage additionally has a video demonstrating our model's usage and a 3-minute overview explaining our method.
1
0
1
@jonLorraine9
Jonathan Lorraine
3 years
We perform competitively with standard first-order optimizers. We tuned the extrapolation parameters for extragradient (EG) and optimistic gradient (OG), so each method here generalizes gradient descent-ascent (GDA) by adding a single parameter.
Tweet media one
0
0
1
@jonLorraine9
Jonathan Lorraine
3 years
@veemon13 @DavidDuvenaud @davidjesusacu @PaulVicol (3) Even w/ alternating updates - where negative momentum works well - we may improve convergence with complex momentum.
Tweet media one
0
0
1
@jonLorraine9
Jonathan Lorraine
11 months
Our method scales to training on thousands of prompts. All objects shown below are for unseen prompts and are generated in real time.
1
0
1
@jonLorraine9
Jonathan Lorraine
3 years
@mgrankin @DavidDuvenaud @davidjesusacu @PaulVicol Our results with CIFAR-10 took >1500 GPU hours on an NVIDIA T4. Repeating similar experiments on ImageNet would have been infeasible for us, but is an interesting direction!
1
0
1
@jonLorraine9
Jonathan Lorraine
3 years
@auastro @DavidDuvenaud Yup - Hamiltonian and potential correspond to adversarial and cooperative. A minimax (or 2 player zero sum) objective like simple GANs can be either Hamiltonian, potential, or some combination. The terminology can be optimized too!
0
0
1
@jonLorraine9
Jonathan Lorraine
3 years
@DavidDuvenaud Also, the distribution of eigenvalues changes quite a bit during training.
Tweet media one
0
0
1
@jonLorraine9
Jonathan Lorraine
5 months
We can now train a model on ~100k prompts from ChatGPT, which is able to generalize to creating objects from arbitrary, unseen prompts at interactive rates!
1
0
1
@jonLorraine9
Jonathan Lorraine
11 months
Limitation 2: Underfitting. Some prompts are expected to fail - ex., complex materials or not being object-centric (like landscapes). But many fail for unclear reasons (in all text-to-3D pipelines), so it's hard to make large non-compositional prompt sets where all prompts train successfully.
1
0
1
@jonLorraine9
Jonathan Lorraine
3 years
@1austrartsua1 @DavidDuvenaud @davidjesusacu @PaulVicol Good point! We don't mean to dismiss EG due to 2 gradients - this is just a practical limitation. EG is a surprisingly robust choice (even in minimization!) if we tune the extrap param too.
Tweet media one
0
0
1
@jonLorraine9
Jonathan Lorraine
5 months
We compare our method to various SOTA text-to-3D methods, showing a favorable quality vs. compute tradeoff.
1
0
1
@jonLorraine9
Jonathan Lorraine
3 years
We contrast simultaneous and alternating update versions of our algorithm. In each case for this fixed learning rate, the best momentum was complex, but alternating updates may be more robust to misspecification.
Tweet media one
1
0
1
@jonLorraine9
Jonathan Lorraine
6 years
@JiaQingYap @DavidDuvenaud It depends on which hypernet architecture is chosen! For a hypernet with no hidden units, it scales the cost of a training iteration on the hypernet by about the number of hyperparameters.
0
0
1
@jonLorraine9
Jonathan Lorraine
5 months
Our model can even interpolate between a user-provided shape and text prompt at inference time. This is a new axis for user-controllability! 🎨
1
0
1
@jonLorraine9
Jonathan Lorraine
11 months
We train a single model on sets of text prompts for reduced training cost. For example, all the prompts from DreamFusion. We find components are re-used across prompts, helping explain the reduced cost and allowing generalization.
Tweet media one
1
0
1
@jonLorraine9
Jonathan Lorraine
11 months
We can chain together multiple text prompts to create longer animations. For example, a tree going through the seasons, from summer, to spring, to fall, to winter.
1
0
1