Jonathan Lorraine

@jonLorraine9

4,808 Followers · 5,343 Following · 76 Media · 155 Statuses

Research scientist @NVIDIA | PhD in machine learning @UofT . Previously @Google / @MetaAI . Opinions are my own. 🤖 💻 ☕️

Toronto, Ontario
Joined November 2017
Pinned Tweet
@jonLorraine9
Jonathan Lorraine
5 months
New #NVIDIA #GTC24 paper 🎊 We generate high-quality 3D assets in only 400ms from text by combining (a) amortized optimization for speed, (b) surface rendering for quality, and (c) 3D data for robustness. ☕ LATTE3D project details: 🧵with many fun gifs
4
30
96
@jonLorraine9
Jonathan Lorraine
2 years
Gradient descent in games (like GANs) rotates around solutions. We solve this with a simple trick: complex momentum damps the oscillations. Come check out our #AISTATS2022 talk and poster: With @davidjesusacu @PaulVicol @DavidDuvenaud
4
74
493
@jonLorraine9
Jonathan Lorraine
1 month
Thrilled to announce I've completed my PhD! 🎉 My advisor, @DavidDuvenaud 's support and guidance made the journey enjoyable. You showed me that research can be fun, and I will always cherish our brainstorming sessions on the whiteboard. Thank you for taking a chance on me and
Tweet media one
25
3
276
@jonLorraine9
Jonathan Lorraine
25 days
⚡ My PhD thesis, “Scalable Nested Optimization for Deep Learning,” is now on arXiv! ⚡ tl;dr: We develop various optimization tools; highlights include: · Making the momentum coefficient complex for adversarial games like GANs. · Optimizing millions of hyperparameters
6
25
236
@jonLorraine9
Jonathan Lorraine
3 years
When minimizing objectives, randomly initializing + optimizing can fail to find different solutions. We give a branching optimization method for this using Lyapunov exponents..1/11 @AAMAS2022 With @PaulVicol @jparkerholder @TalKachman @Luke_Metz @j_foerst
3
43
218
@jonLorraine9
Jonathan Lorraine
16 days
Excited that I've reached 1000 citations! Incredibly grateful for all the support from my co-authors, mentors, and peers
Tweet media one
4
1
129
@jonLorraine9
Jonathan Lorraine
2 months
New #NVIDIA paper: Improving Hyperparameter Optimization with Checkpointed Model Weights We enhance hyperparameter optimization by adding the ability to condition cheap-to-evaluate surrogates for the loss on checkpointed model weights with a graph metanetwork. This allows us to
2
19
117
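A minimal sketch of the idea in the tweet above: condition a cheap loss surrogate on features of checkpointed weights plus the hyperparameters. The paper uses a graph metanetwork to encode the weights; here simple per-tensor statistics stand in for that encoder, and names like `weight_features` and `Surrogate` are hypothetical.

```python
# Hedged sketch: condition a loss surrogate on checkpointed model weights.
# Per-tensor statistics stand in for the paper's graph-metanetwork encoder.
import torch
import torch.nn as nn

def weight_features(state_dict):
    """Cheap stand-in for a learned weight encoder: per-tensor statistics."""
    feats = []
    for w in state_dict.values():
        w = w.float().flatten()
        feats += [w.mean(), w.std(), w.abs().max()]
    return torch.stack(feats)

class Surrogate(nn.Module):
    """Predicts final validation loss from (hyperparameters, weight features)."""
    def __init__(self, n_hparams, n_weight_feats, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_hparams + n_weight_feats, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, hparams, weight_feats):
        return self.net(torch.cat([hparams, weight_feats], dim=-1)).squeeze(-1)

# Fit `Surrogate` with MSE on logged (hparams, checkpoint, final loss) tuples,
# then rank candidate hyperparameters by predicted loss instead of retraining.
```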
@jonLorraine9
Jonathan Lorraine
11 months
New #NVIDIA paper: Real-time text-to-3D generation #ICCV2023 3D generation from text requires expensive per-prompt optimization. We train 1 model on many prompts for real-time generalization to unseen prompts, interpolations and more! ATT3D details:
1
27
112
@jonLorraine9
Jonathan Lorraine
2 years
When minimizing, randomly initializing + optimizing can fail to find different solutions. We give a branching optimization method using Lyapunov exponents..1/12 Come watch our oral @AAMAS2022 ! w/ @PaulVicol @jparkerholder @TalKachman @Luke_Metz @j_foerst
4
15
105
@jonLorraine9
Jonathan Lorraine
2 years
Thrilled to be starting my internship at @NVIDIA / @NVIDIAAI as a research scientist with @FidlerSanja ’s group at the Toronto AI Lab () 😃
4
1
87
@jonLorraine9
Jonathan Lorraine
3 years
Excited to begin my internship @google working on AutoML 😃
1
0
71
@jonLorraine9
Jonathan Lorraine
3 years
Thrilled to start my research internship @facebookai where I'll be working with @j_foerst 😃
2
1
68
@jonLorraine9
Jonathan Lorraine
2 years
ICML paper: Does your bilevel optimization behave in unexpected ways? This can arise when the inner or outer problem is overparameterized. We examine the surprising implicit bias of common strategies! w/ @PaulVicol @fpedregosa @DavidDuvenaud @RogerGrosse
Tweet media one
@PaulVicol
Paul Vicol
2 years
Our ICML paper: In bilevel optimization, usually the inner or outer problem is overparameterized. We investigate implicit bias of warm- vs cold-start algorithms and hypergradient approximations. With @jonLorraine9 @fpedregosa @DavidDuvenaud @RogerGrosse
Tweet media one
1
16
102
0
5
52
@jonLorraine9
Jonathan Lorraine
3 years
Excited to share our latest paper - Complex Momentum for Optimization in Games! Check out the main thread for more details. This animation shows sweeps over optimizer hypers. In purely adversarial games, any momentum with non-zero phase/arg has optimizer settings that converge.
@DavidDuvenaud
David Duvenaud
3 years
Gradient descent in differentiable games rotates around solutions instead of converging. For instance, in GANs. We solve this with a simple trick: complex momentum damps the oscillations. With @jonLorraine9 @davidjesusacu @PaulVicol
19
190
1K
1
5
31
@jonLorraine9
Jonathan Lorraine
3 years
If you want to tune high-dimensional hyperparameters for pre-training in arbitrary domains, check out this paper [at #NeurIPS2021 ]!
@DavidDuvenaud
David Duvenaud
3 years
Pre-training large models is useful, but adds many hyperparameters. E.g. task weights, or augmentations in SimCLR. We give a scalable, gradient-based way to tune these: w/ @RaghuAniruddh @jonLorraine9 @skornblith @MattBMcDermott
Tweet media one
6
41
228
0
4
26
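A hedged sketch of the gradient-based tuning idea in the quoted tweet, not the paper's exact method: differentiate the validation loss through one unrolled training step with respect to continuous task weights. The helpers `task_loss` and `functional_val_loss`, and the inputs `model`, `tasks`, and `val_batch`, are hypothetical placeholders.

```python
# Hedged one-step hypergradient sketch for tuning task-weight hyperparameters.
import torch

def one_step_hypergradient(model, tasks, val_batch, log_task_weights, lr=1e-2):
    # log_task_weights: tensor with requires_grad=True; exp keeps weights positive.
    weights = log_task_weights.exp()
    train_loss = sum(w * task_loss(model, batch)          # weighted multi-task loss
                     for w, (task_loss, batch) in zip(weights, tasks))
    grads = torch.autograd.grad(train_loss, list(model.parameters()),
                                create_graph=True)        # keep graph for 2nd diff
    # Simulated SGD step, differentiable with respect to the task weights.
    updated = [p - lr * g for p, g in zip(model.parameters(), grads)]
    val_loss = functional_val_loss(updated, val_batch)    # hypothetical helper
    return torch.autograd.grad(val_loss, log_task_weights)[0]
```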
@jonLorraine9
Jonathan Lorraine
2 years
Our method is a two-line change from standard momentum updates in JAX and PyTorch. It still gives real-valued updates. This is the code for our paper:
Tweet media one
4
0
23
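A minimal sketch of what such a two-line change could look like, assuming the convention of a complex momentum coefficient and buffer with only the real part of the buffer applied to the parameters; the signs and default phase here are illustrative, not the released code.

```python
# Hedged sketch of SGD with a complex momentum coefficient.
import numpy as np

def sgd_complex_momentum(params, grad_fn, lr=0.01,
                         beta=0.9 * np.exp(1j * np.pi / 8), steps=100):
    buf = np.zeros_like(params, dtype=np.complex128)   # change 1: complex buffer
    for _ in range(steps):
        g = grad_fn(params)
        buf = beta * buf - g                           # change 2: complex beta
        params = params + lr * buf.real                # updates stay real-valued
    return params
```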
@jonLorraine9
Jonathan Lorraine
3 years
Our work is inspired by and builds on “Ridge Rider: Finding Diverse Solutions by Following Eigenvectors of the Hessian”..11/11 from @jparkerholder @Luke_Metz @cinjoncin Hengyuan Hu @adamlerer @_aletcher @alex_peys @aldopacchiano @j_foerst
1
2
21
@jonLorraine9
Jonathan Lorraine
3 years
By varying the sampled subspace and game, there is a wide range of phenomena. If you’re working on optimization in differentiable games and looking for interesting, visualizable, 2-D diagnostic problems, try out this technique! 8/11
Tweet media one
1
2
20
@jonLorraine9
Jonathan Lorraine
11 months
If you are in Paris, come check out our work on real-time text-to-3D generation at #ICCV2023 on Friday from 10:30 AM-12:30 PM in Room "Foyer Sud" - 035!
@jonLorraine9
Jonathan Lorraine
11 months
New #NVIDIA paper: Real-time text-to-3D generation #ICCV2023 3D generation from text requires expensive per-prompt optimization. We train 1 model on many prompts for real-time generalization to unseen prompts, interpolations and more! ATT3D details:
1
27
112
0
3
19
@jonLorraine9
Jonathan Lorraine
3 years
Ridge rider (RR) finds diverse solutions in single-objective optimization by branching optimization at saddle points. Our method - Generalized RR (GRR) - branches at bifurcation points where small parameter changes lead to different learning dynamics..2/11
1
1
16
@jonLorraine9
Jonathan Lorraine
9 months
#NeurIPS2023 You can just ask LLMs which hyperparameters to use, and it works pretty well! w/ @michaelrzhang , Nishkrit Desai, @juhan_bae , and @jimmybajimmyba You can even directly optimize your model’s code with this.
@michaelrzhang
Michael Zhang
9 months
We've spent a lot of effort tuning hyperparameters for large models.. but what if they could also help us tune hyperparameters? New paper at NeurIPS FMDM workshop w/ Nishkrit Desai @juhan_bae @jonLorraine9 @jimmybajimmyba 🧵
Tweet media one
3
8
55
0
0
15
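A minimal sketch of the ask-an-LLM loop described above. `call_llm` is a hypothetical stand-in for whichever chat API is used; the paper's prompts and parsing differ.

```python
# Hedged sketch: ask an LLM for the next hyperparameter configuration.
import json

def suggest_hyperparameters(task_description, search_space, history):
    prompt = (
        f"You are tuning a model. Task: {task_description}\n"
        f"Search space: {json.dumps(search_space)}\n"
        f"Previous trials (config -> validation loss): {json.dumps(history)}\n"
        "Reply with a JSON object giving the next configuration to try."
    )
    reply = call_llm(prompt)          # hypothetical chat-completion call
    return json.loads(reply)

# Example call:
# suggest_hyperparameters(
#     "CIFAR-10 ResNet training",
#     {"lr": "loguniform(1e-4, 1e-1)", "weight_decay": "loguniform(1e-6, 1e-2)"},
#     [{"lr": 0.01, "weight_decay": 1e-4, "val_loss": 0.71}])
```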
@jonLorraine9
Jonathan Lorraine
3 years
Why Lyapunov exponents? They measure optimization trajectory separation speed, for perturbations in different directions (shown in blue/green). Trajectories separate fastest at bifurcations, where even small perturbations can cause trajectories to go to different solutions..5/11
Tweet media one
1
1
15
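A minimal sketch of that measurement for a single optimizer map, using a finite-difference perturbation (a careful estimator would, e.g., renormalize the perturbation periodically):

```python
# Hedged sketch: estimate how fast two nearby optimization trajectories separate.
import numpy as np

def lyapunov_exponent(step_fn, x0, direction, eps=1e-6, T=200):
    """step_fn maps parameters to parameters (one optimizer step); x0 is an array."""
    x, x_pert = x0.copy(), x0 + eps * direction / np.linalg.norm(direction)
    for _ in range(T):
        x, x_pert = step_fn(x), step_fn(x_pert)
    separation = np.linalg.norm(x_pert - x)
    return np.log(separation / eps) / T   # average exponential separation rate

# Example: gradient descent on f(x) = (x**2 - 1)**2
# step = lambda x: x - 0.05 * 4 * x * (x**2 - 1)
# lyapunov_exponent(step, np.array([0.1]), np.array([1.0]))
```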
@jonLorraine9
Jonathan Lorraine
3 years
Congrats @DavidDuvenaud !
@SloanFoundation
Sloan Foundation
3 years
Introducing… the winners of this year’s Sloan Research Fellowship! These extraordinary researchers represent some of the most exciting young minds working today—and we are thrilled to support them. Meet the winners here: #SloanFellow
Tweet media one
14
51
341
0
0
15
@jonLorraine9
Jonathan Lorraine
3 years
We scale up our method, finding diverse solutions in the iterated prisoner’s dilemma and evaluating the approximate Lyapunov exponent for GANs..10/11 Accepted @aamas2022 #AAMAS2022 Work helped by @VectorInst @UofTCompSci @MetaAI @GoogleAI @AI_Radboud
1
1
15
@jonLorraine9
Jonathan Lorraine
3 years
@3blue1brown Curious to see what happens if we use our method on the “Newton Fractal” – i.e., optimizing a cubic with Newton’s method. A great problem to test recursive branching 😀
0
1
13
@jonLorraine9
Jonathan Lorraine
3 years
You can create toy problems with various bifurcations using your favorite differentiable games – ex., the Iterated Prisoner’s Dilemma (IPD) or GANs – by taking a random subspace from each player’s parameters, which they optimize in. The exponent is largest near bifurcations..7/11
Tweet media one
1
1
13
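A minimal sketch of that construction: restrict each player to one random direction in its full parameter space, giving a 2-D game in the two scalar coordinates. `loss1`, `loss2`, and the base parameters are hypothetical placeholders for your game (GAN, IPD, ...):

```python
# Hedged sketch: build a 2-D diagnostic game from a random subspace per player.
import numpy as np

rng = np.random.default_rng(0)

def make_2d_game(loss1, loss2, theta1_base, theta2_base):
    d1 = rng.standard_normal(theta1_base.shape); d1 /= np.linalg.norm(d1)
    d2 = rng.standard_normal(theta2_base.shape); d2 /= np.linalg.norm(d2)

    def restricted_losses(a1, a2):
        # Each player's parameters vary only along its sampled direction.
        t1 = theta1_base + a1 * d1
        t2 = theta2_base + a2 * d2
        return loss1(t1, t2), loss2(t1, t2)

    return restricted_losses  # a 2-D game in the scalars (a1, a2)
```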
@jonLorraine9
Jonathan Lorraine
3 years
We include ablations over various design choices. For example, how long of an optimization horizon is necessary to find bifurcations..9/11
1
1
13
@jonLorraine9
Jonathan Lorraine
3 years
Leveraging automatic differentiation, it’s simple to use gradient-based optimization on the exponents to move our starting parameters to regions where bifurcations occur..6/11
Tweet media one
1
1
12
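A toy, hedged illustration of that idea: compute the separation exponent from a differentiable unroll, then use autodiff to ascend it, nudging the starting point toward a bifurcation. This 1-D example is illustrative only.

```python
# Hedged sketch: gradient ascent on a trajectory-separation exponent via autodiff.
import torch

def f(x):                      # toy objective with minima at x = +/- 1
    return (x ** 2 - 1) ** 2

def separation_exponent(x0, eps=1e-4, lr=0.05, T=20):
    """Log separation rate of two nearby gradient-descent trajectories."""
    x, x_p = x0, x0 + eps
    for _ in range(T):
        x   = x   - lr * torch.autograd.grad(f(x),   x,   create_graph=True)[0]
        x_p = x_p - lr * torch.autograd.grad(f(x_p), x_p, create_graph=True)[0]
    return torch.log(torch.abs(x_p - x) / eps) / T

x0 = torch.tensor(0.7, requires_grad=True)
for _ in range(20):            # ascend the exponent with respect to the start point
    lam = separation_exponent(x0)
    g, = torch.autograd.grad(lam, x0)
    x0 = (x0 + 0.1 * g).detach().requires_grad_(True)
# x0 should drift toward x = 0, where nearby trajectories split toward +/- 1.
```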
@jonLorraine9
Jonathan Lorraine
3 years
Ridge rider only finds saddle bifurcations. But, how can we find more general bifurcations, like those occurring in differentiable games? With Lyapunov exponent based objectives, leveraging a broad body of work from dynamical systems! 4/11
1
1
11
@jonLorraine9
Jonathan Lorraine
3 years
In single-objective optimization, the parameter updates form a conservative vector field, where saddle points are the only relevant bifurcation. However, in differentiable games the updates can form a non-conservative field, giving rise to new bifurcations - ex., Hopf..3/11
Tweet media one
1
1
11
@jonLorraine9
Jonathan Lorraine
2 years
@aamas2022 @PaulVicol @jparkerholder @TalKachman @Luke_Metz @j_foerst 📽️ Our talk is now available on YouTube! 📢 In-person talk + Q & A at #AAMAS oral sessions 1A6-2 (12-13 EST) and 3B6-2 (21-22 EST)
0
3
10
@jonLorraine9
Jonathan Lorraine
2 years
We’re not making momentum complex-valued just for the sake of it - it just happens to be a simple way to implement a generalized aggregated momentum ( with @james_r_lucas et al.) for the multi-player setting.
Tweet media one
1
0
9
@jonLorraine9
Jonathan Lorraine
2 years
Our complex buffer stores old gradient info, oscillating between adding & subtracting at a specified frequency. Classical momentum adds gradients, negative momentum () alternates every step.
Tweet media one
1
0
8
@jonLorraine9
Jonathan Lorraine
2 years
Notably, complex momentum can approach the acceleration of classical momentum in real eigenspaces, while still converging in purely imaginary eigenspaces
Tweet media one
1
1
9
@jonLorraine9
Jonathan Lorraine
2 years
We also develop a complex-valued Adam, which we use to train BigGAN on CIFAR-10 to better inception scores. The extra overhead is one momentum buffer, and one hyperparameter, which we give a practical initial guess for.
Tweet media one
1
0
7
@jonLorraine9
Jonathan Lorraine
4 months
@dereklim_lzh #NVIDIA #ICLR2024 spotlight paper: Graph Metanetworks We give a framework for processing neural nets with other neural nets, improving expressiveness and performance. My favorite use is generating the weights of implicit neural representations -- ex., in 3D generation
0
1
8
@jonLorraine9
Jonathan Lorraine
2 years
@VectorInst @UofTCompSci @MetaAI @GoogleAI @AI_Radboud Our work is inspired by and builds on “Ridge Rider: Finding Diverse Solutions by Following Eigenvectors of the Hessian”..12/12 from @jparkerholder @Luke_Metz @cinjoncin Hengyuan Hu @adamlerer @_aletcher @alex_peys @aldopacchiano @j_foerst
Tweet media one
1
0
8
@jonLorraine9
Jonathan Lorraine
2 years
An amazing guide! A great distillation of all the writing improvements you helped me with
@j_foerst
Jakob Foerster
2 years
I drafted a quick "How to" guide for writing ML papers. I hope this will be useful (if a little late!) for #NeurIPS2022 . Happy paper writing and best of luck!!
24
274
1K
0
1
8
@jonLorraine9
Jonathan Lorraine
5 months
Or, we could use LATTE3D to initialize text-to-4D methods like Align Your Gaussians: ( @HuanLing6 , @seungkim0123 , @karsten_kreis ). Here, we initialize with “DSLR photo of Rottweiler” and animate with “A dog running fast.”
1
3
7
@jonLorraine9
Jonathan Lorraine
2 years
Why Lyapunov exponents? They measure optimization trajectory separation speed, for perturbations in different directions (shown in blue/green). Trajectories separate fastest at bifurcations, where even small perturbations can cause trajectories to go to different solutions..5/12
Tweet media one
1
0
7
@jonLorraine9
Jonathan Lorraine
2 years
@3blue1brown - curious to see what happens if we use our method on the “Newton Fractal” – i.e., optimizing a cubic with Newton’s method. A good problem to test recursive branching :)
0
1
6
@jonLorraine9
Jonathan Lorraine
2 years
Open problems: Can we improve on our complex Adam? Which eigenstructures are useful for analysis of nested optimization, like GANs or meta-learning? Are there better forms of recurrently linked momentum?
2
1
6
@jonLorraine9
Jonathan Lorraine
2 years
Definitely some significant differences in even the simplest ways to featurize the tasks! I wonder if other teams find similar types of shifts 🤔
Tweet media one
@adamboazbecker
Adam Becker Habibi
2 years
The AutoML team at Google releasing the distribution of tasks that their users actually bring them ("prod"), as opposed to the tasks we typically find in the wild ("dev"). This is great: we need much more of this! Thank you for doing this @jonLorraine9
Tweet media one
1
1
4
0
1
6
@jonLorraine9
Jonathan Lorraine
2 years
By varying the sampled subspace and game, there is a wide range of phenomena. If you’re working on optimization in differentiable games and looking for interesting, visualizable, 2-D diagnostic problems, try out this technique! 8/12
Tweet media one
1
0
6
@jonLorraine9
Jonathan Lorraine
2 years
Ridge rider (RR) finds diverse solutions in single-objective optimization by branching optimization at saddle points. Our method - Generalized RR (GRR) - branches at bifurcation points where small parameter changes lead to different learning dynamics..2/12
1
0
5
@jonLorraine9
Jonathan Lorraine
2 years
In single-objective optimization, the parameter updates form a conservative vector field, where saddle points are the only relevant bifurcation. However, in differentiable games the updates can form a non-conservative field, giving rise to new bifurcations - ex., Hopf..3/12
Tweet media one
1
0
5
@jonLorraine9
Jonathan Lorraine
11 months
Our work is inspired by and builds on DreamFusion (, @poolio , @ajayjain , @jon_barron , @BenMildenHall ), Score-Jacobian Chaining (, @__whc__ , @RaymondYeh ), Magic3D (), and more!
1
0
4
@jonLorraine9
Jonathan Lorraine
2 years
Leveraging automatic differentiation, it’s simple to use gradient-based optimization on the exponents to move our starting parameters to regions where bifurcations occur..6/12
Tweet media one
1
0
5
@jonLorraine9
Jonathan Lorraine
5 months
Our work builds on MVDream (, @jianglong_ye , @jerrykingpku , @kejie_li , @YangZuoshi ), Magic3D (), ATT3D (), DreamFusion (, @poolio , @ajayjain , @jon_barron , @BenMildenHall ), and more!
1
0
5
@jonLorraine9
Jonathan Lorraine
2 years
Ridge rider only finds saddle bifurcations. But, how can we find more general bifurcations, like those occurring in differentiable games? With Lyapunov exponent-based objectives, leveraging a broad body of work from dynamical systems! 4/12
1
0
5
@jonLorraine9
Jonathan Lorraine
3 years
@mvladymyrov Very cool idea! Is it possible/easy to differentiate through the support set to the hyper transformer?
1
0
5
@jonLorraine9
Jonathan Lorraine
5 months
I’m most excited about using (vision) LLMs to build 3D worlds given access to fast text-to-3D generative tooling!
0
0
5
@jonLorraine9
Jonathan Lorraine
2 years
We scale up our method, finding diverse solutions in the iterated prisoner’s dilemma and evaluating the approximate Lyapunov exponent for GANs..10/12
Tweet media one
1
0
5
@jonLorraine9
Jonathan Lorraine
2 years
We include ablations over various design choices. For example, how long of an optimization horizon is necessary to find bifurcations..9/12
1
1
4
@jonLorraine9
Jonathan Lorraine
5 months
Work done with @kevincxie , @TianshiCao , @JunGao33210520 , @james_r_lucas , Antonio Torralba, @FidlerSanja , and Xiaohui Zeng at the @NVIDIA Toronto AI Lab: Supported indirectly by @MIT_CSAIL , @VectorInst , @UofTCompSci / @UofTArtSci / @UofT #UofT
Tweet media one
1
0
4
@jonLorraine9
Jonathan Lorraine
2 months
I’m most excited to see hyperparameter optimization methods of the future train surrogates on large amounts of existing optimization metadata – now including checkpointed model weights – to create methods that generalize to optimizing a diverse set of problems efficiently.
0
0
4
@jonLorraine9
Jonathan Lorraine
2 years
📢Talk + Q & A at #AAMAS oral sessions 1A6-2 (12-13 EST) and 3B6-2 (21-22 EST)..11/12 #AAMAS2022 Work helped by @VectorInst @UofTCompSci @MetaAI @GoogleAI @AI_Radboud
1
0
4
@jonLorraine9
Jonathan Lorraine
2 years
You can create toy problems with various bifurcations using your favorite differentiable games – ex., the Iterated Prisoner’s Dilemma (IPD) or GANs – by taking a random subspace from each player’s parameters, which they optimize in. The exponent is largest near bifurcations..7/12
Tweet media one
1
0
4
@jonLorraine9
Jonathan Lorraine
3 years
@DavidDuvenaud We can tune the phase or argument of our momentum coefficient to perform well on a wide range of games, ranging from minimization (i.e., real eigenvalues) to games with mixtures of real and adversarial (i.e. complex) eigenvalues.
Tweet media one
1
0
3
@jonLorraine9
Jonathan Lorraine
3 years
@bouzoukipunks @DavidDuvenaud @davidjesusacu @PaulVicol Great point! This plot is only for simultaneous updates. We compared simultaneous and alternating update versions, where negative momentum works well!
Tweet media one
1
0
3
@jonLorraine9
Jonathan Lorraine
5 months
LATTE3D has 2 stages: 1) Volumetric rendering trains texture & geometry, using 3D-aware SDS gradient & mask comparison for robustness. 2) Surface-based rendering enhances texture quality. Both stages optimize across prompts for quick generation.
Tweet media one
1
1
3
@jonLorraine9
Jonathan Lorraine
3 years
@bouzoukipunks @DavidDuvenaud @davidjesusacu @PaulVicol Alternating updates are bottlenecked by computing the first player's gradient before the second. If we can parallelize or share computation of both players' gradients - as is common in deep learning setups - we may want simultaneous updates, else alternating is likely better.
Tweet media one
1
0
3
@jonLorraine9
Jonathan Lorraine
11 months
I’m excited to extend our semantic prompt interpolation abilities. I foresee arbitrarily long 3D animations (4D) output by interpolating through a dense prompt chain. Think of extending the seasonal trees with huge prompt chains from LLMs. We could grow plants from seeds or more
0
0
2
@jonLorraine9
Jonathan Lorraine
3 years
@DavidDuvenaud We perform competitively with standard first-order optimizers. We tuned the extrapolation parameters for extragradient (EG) and optimistic gradient (OG), so every method shown generalizes gradient descent-ascent by adding a single parameter.
Tweet media one
0
0
2
@jonLorraine9
Jonathan Lorraine
11 months
Our method augments the text-to-3D pipeline to re-use the text-to-image model's text embedding to condition our 3D representation. We use a NeRF whose spatial parameters are output by a mapping hypernetwork that takes the text embedding as input.
Tweet media one
1
0
2
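A minimal sketch of that amortized mapping, with illustrative dimensions and module names rather than the paper's architecture:

```python
# Hedged sketch: a mapping network from a text embedding to NeRF parameters,
# so one network serves many prompts.
import torch
import torch.nn as nn

class TextToNeRFParams(nn.Module):
    def __init__(self, text_dim=768, hidden=512, nerf_param_dim=100_000):
        super().__init__()
        self.mapping = nn.Sequential(
            nn.Linear(text_dim, hidden), nn.SiLU(),
            nn.Linear(hidden, nerf_param_dim),   # flattened spatial parameters
        )

    def forward(self, text_embedding):
        return self.mapping(text_embedding)

# At render time, the predicted vector would be reshaped into the NeRF's
# spatial weights and rendered as usual; training backpropagates the
# text-to-3D loss into the mapping network across many prompts.
```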
@jonLorraine9
Jonathan Lorraine
11 months
For more details, check out the 3-minute video explaining our method. We also have a 30-second rundown and a 10-minute talk on our project website.
1
0
2
@jonLorraine9
Jonathan Lorraine
3 years
Alternating updates are bottlenecked by computing the first player's gradient before the second. If we can parallelize or share computation of both players' gradients - as is common in deep learning setups - we may want simultaneous updates, else alternating is likely better.
Tweet media one
1
0
2
@jonLorraine9
Jonathan Lorraine
11 months
Now, we can interpolate between prompts! Watch it make 3D animations transforming a 🐸 to a 🐻, interpolating 👗designs, simulating a 🚗’s wear and tear, or aging a 🐲 in real time.
1
0
2
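A minimal sketch of how such interpolations could be produced: blend the two prompts' text embeddings and query the amortized model at each blend weight. `embed_text` and `generate_3d` are hypothetical stand-ins for the text encoder and the trained model.

```python
# Hedged sketch: prompt interpolation by blending text embeddings.
import numpy as np

def interpolate_prompts(prompt_a, prompt_b, n_frames=30):
    e_a, e_b = embed_text(prompt_a), embed_text(prompt_b)
    frames = []
    for t in np.linspace(0.0, 1.0, n_frames):
        e_t = (1.0 - t) * e_a + t * e_b      # linear blend in embedding space
        frames.append(generate_3d(e_t))      # one real-time forward pass each
    return frames                            # e.g., frog -> bear transformation
```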
@jonLorraine9
Jonathan Lorraine
5 months
We quantitatively assess our method with the user-preference rate relative to LATTE3D at varying compute horizons. A rate below 50% indicates users prefer LATTE3D on average. We want methods in the top left.
Tweet media one
1
0
2
@jonLorraine9
Jonathan Lorraine
11 months
We can improve the quality of prompt interpolations by training on interpolants. This helps avoid simply dissolving between the prompts. We look at training on interpolations in the text-embedding, the loss, and more.
Tweet media one
1
0
1
@jonLorraine9
Jonathan Lorraine
5 months
Or, we can make creative stylizations of different shapes.
1
0
1
@jonLorraine9
Jonathan Lorraine
5 months
Using our optional point-cloud input, we can easily create higher-quality variants of user-provided reference shapes.
1
0
1
@jonLorraine9
Jonathan Lorraine
2 years
@Foivos_Diak @davidjesusacu @PaulVicol @DavidDuvenaud Thanks! Code will be made available with the AISTATS proceedings
0
0
1
@jonLorraine9
Jonathan Lorraine
5 months
Users can now quickly design entire scenes with one of our models by rapidly iterating on the design of an individual object or the collection of objects they use.
1
1
1
@jonLorraine9
Jonathan Lorraine
3 years
@veemon13 @DavidDuvenaud @davidjesusacu @PaulVicol Check out - potential benefits: (1) Different selections of complex momentum work best for different mixtures of eigenspaces. In some mixtures, negative momentum is best, while in others having an imaginary part helps.
Tweet media one
@jonLorraine9
Jonathan Lorraine
3 years
We can tune the phase or argument of our momentum coefficient to perform well on a wide range of games, ranging from minimization (i.e., real eigenvalues) to games with mixtures of real and adversarial (i.e. complex) eigenvalues.
Tweet media one
1
0
0
0
0
1
@jonLorraine9
Jonathan Lorraine
5 months
We support an optional, fast test-time optimization when a user desires a further quality boost on any prompt.
1
0
1
@jonLorraine9
Jonathan Lorraine
11 months
We generalize to unseen compositional prompts with the "a pig {activity} {theme}" template. Diagonal red boxes are held-out tests, with activities/themes as rows/columns. Since per-prompt optimization cannot generalize to unseen prompts without further optimization, we show its initialization.
1
0
1
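A minimal sketch of building such a compositional prompt grid with held-out diagonal entries; the activity and theme lists below are illustrative, not the paper's.

```python
# Hedged sketch: compositional prompt grid with held-out diagonal test cells.
activities = ["riding a bike", "playing the piano", "wearing a backpack"]
themes = ["made of lego", "in a snowy forest", "wearing a top hat"]

prompts = [f"a pig {a} {t}" for a in activities for t in themes]
held_out = {f"a pig {a} {t}" for i, a in enumerate(activities)
            for j, t in enumerate(themes) if i == j}   # diagonal test cells
train_prompts = [p for p in prompts if p not in held_out]
```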
@jonLorraine9
Jonathan Lorraine
11 months
We can fuse our method's initial guess for unseen prompts with finetuning methods, like the second stage of Magic3D. This allows us to easily increase the resolution from 64x64 to 512x512.
1
0
1
@jonLorraine9
Jonathan Lorraine
5 months
We include a narrated 30-second summary video here; our project webpage additionally has a video demonstrating our model's usage and a 3-minute overview explaining our method.
1
0
1
@jonLorraine9
Jonathan Lorraine
3 years
We perform competitively with standard first-order optimizers. We tuned the extrapolation parameters for extragradient (EG) and optimistic gradient (OG), so each method here generalizes gradient descent-ascent (GDA) by adding a single parameter.
Tweet media one
0
0
1
@jonLorraine9
Jonathan Lorraine
3 years
@veemon13 @DavidDuvenaud @davidjesusacu @PaulVicol (3) Even w/ alternating updates - where negative momentum works well - we may improve convergence with complex momentum.
Tweet media one
0
0
1
@jonLorraine9
Jonathan Lorraine
11 months
Our method scales to training on thousands of prompts. All objects shown below are for unseen prompts and are generated in real time.
1
0
1
@jonLorraine9
Jonathan Lorraine
3 years
@mgrankin @DavidDuvenaud @davidjesusacu @PaulVicol Our results with CIFAR-10 took >1500 GPU hours on an NVIDIA T4. Repeating similar experiments on ImageNet would have been infeasible for us, but is an interesting direction!
1
0
1
@jonLorraine9
Jonathan Lorraine
3 years
@auastro @DavidDuvenaud Yup - Hamiltonian and potential correspond to adversarial and cooperative. A minimax (or 2 player zero sum) objective like simple GANs can be either Hamiltonian, potential, or some combination. The terminology can be optimized too!
0
0
1
@jonLorraine9
Jonathan Lorraine
3 years
@DavidDuvenaud Also, the distribution of eigenvalues changes quite a bit during training.
Tweet media one
0
0
1
@jonLorraine9
Jonathan Lorraine
5 months
We can now train a model on ~100k prompts from ChatGPT, which is able to generalize to creating objects from arbitrary, unseen prompts at interactive rates!
1
0
1
@jonLorraine9
Jonathan Lorraine
11 months
Limitation 2: Underfitting. Some prompts are expected to fail - ex., complex materials or not being object-centric (like landscapes). But many fail for unclear reasons (in all text-to-3D pipelines), so it's hard to make large non-compositional prompt sets where all prompts train successfully.
1
0
1
@jonLorraine9
Jonathan Lorraine
3 years
@1austrartsua1 @DavidDuvenaud @davidjesusacu @PaulVicol Good point! We don't mean to dismiss EG due to 2 gradients - this is just a practical limitation. EG is a surprisingly robust choice (even in minimization!) if we tune the extrap param too.
Tweet media one
0
0
1
@jonLorraine9
Jonathan Lorraine
5 months
We compare our method to various SOTA text-to-3D methods, showing a favorable quality vs. compute tradeoff.
1
0
1
@jonLorraine9
Jonathan Lorraine
3 years
We contrast simultaneous and alternating update versions of our algorithm. In each case for this fixed learning rate, the best momentum was complex, but alternating updates may be more robust to misspecification.
Tweet media one
1
0
1
@jonLorraine9
Jonathan Lorraine
6 years
@JiaQingYap @DavidDuvenaud It depends on which hypernet architecture is chosen! For a hypernet with no hidden units, it scales the cost of a training iteration on the hypernet by about the number of hyperparameters.
0
0
1
@jonLorraine9
Jonathan Lorraine
5 months
Our model can even interpolate between a user-provided shape and text prompt at inference time. This is a new axis for user-controllability! 🎨
1
0
1
@jonLorraine9
Jonathan Lorraine
11 months
We train a single model on sets of text prompts for reduced training cost. For example, all the prompts from DreamFusion. We find components are re-used across prompts, helping explain the reduced cost and allowing generalization.
Tweet media one
1
0
1
@jonLorraine9
Jonathan Lorraine
11 months
We can chain together multiple text prompts to create longer animations. For example, a tree going through the seasons, from summer, to spring, to fall, to winter.
1
0
1