Yilun Xu

@xuyilun2

1,560
Followers
292
Following
18
Media
75
Statuses

Research Scientist @NVIDIA, working on fundamental Gen AI. Prev: PhD @MIT_CSAIL, BS @PKU1898. Views are my own.

Joined September 2018
@xuyilun2
Yilun Xu
29 days
Introducing Discrete-Continuous Latent Variable Diffusion Models (DisCo-Diff 🕺), which augment continuous diffusion models with learnable global discrete latents. DisCo-Diff greatly simplifies the learning of diffusion models and strengthens their sampling trajectories (1/9)
Tweet media one
2
63
392
@xuyilun2
Yilun Xu
2 years
Get ready to upgrade your diffusion models😻! Our #ICLR2023 paper reduces variance in denoising score-matching, improving image quality, stability, and training speed. Experience the best image generation with the current SOTA FID of 1.90 on CIFAR-10
Tweet media one
6
67
381
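The variance-reduction idea above (the paper's Stable Target Field objective, STF) replaces the usual single-sample denoising target with a posterior-weighted average over a large reference batch. A minimal PyTorch sketch of that target, under our reading of the paper and assuming Gaussian perturbation kernels; the paper's exact weighting may differ in details:

```python
import torch

def stable_target(x_t, sigma, ref_batch):
    """Reduced-variance denoising score-matching target (STF-style sketch).

    x_t:       (B, D) perturbed samples
    sigma:     noise level of the Gaussian perturbation kernel
    ref_batch: (R, D) large batch of clean samples, including the one
               that generated each x_t
    """
    # posterior weights p(y_i | x_t) ∝ N(x_t; y_i, sigma^2 I), self-normalized
    d2 = torch.cdist(x_t, ref_batch) ** 2               # (B, R) squared distances
    w = torch.softmax(-d2 / (2 * sigma ** 2), dim=1)    # (B, R)
    y_bar = w @ ref_batch                               # (B, D) weighted mean
    return (y_bar - x_t) / sigma ** 2                   # averaged score target

# With a reference batch of size 1 (just the originating sample), this falls
# back to the standard high-variance DSM target, matching the author's note
# elsewhere in the feed that "naive DSM is STF with batch size = 1".
```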
@xuyilun2
Yilun Xu
3 months
Officially passed my PhD thesis defense today! I'm deeply grateful to my collaborators and friends for their support throughout this journey. Huge thanks to my amazing thesis committee: Tommi Jaakkola (advisor), @karsten_kreis, and @phillip_isola! 🎓✨
Tweet media one
Tweet media two
18
13
239
@xuyilun2
Yilun Xu
2 years
Excited to share our #NeurIPS2022 paper "Poisson Flow Generative Models". The PFGM ODE achieves:
- SOTA performance in the (continuous) normalizing flow family
- Faster sampling speed than SDEs in diffusion models
Paper: Code: … 1/n
Tweet media one
4
33
219
@xuyilun2
Yilun Xu
1 year
In diffusion models, samplers are primarily ODE-centric, overlooking slower stochastic methods. However, we show that a stochastic sampler can outperform previous samplers on Stable Diffusion if we use stochasticity correctly! Check out Restart Sampling:
2
34
174
@xuyilun2
Yilun Xu
1 year
I'll be at #ICML to present #PFGMpp (Thurs 1:30-3pm, Exhibit Hall 1 #545), and to discuss the new frontiers of generative models with an awesome panel at the #SPIGM workshop (). Happy to chat about diffusion models, PFGM, or new physics-inspired generative models!
Tweet media one
@xuyilun2
Yilun Xu
1 year
Excited to share PFGM++ #ICML2023: a physics-inspired generative model unifying diffusion models & PFGM! By embedding N-dim data in N+D-dim space, we achieve: ✨ Flexible D for robustness & rigidity ✨ Intermediate Ds outperform SOTA diffusion models (the D → ∞ limit)
Tweet media one
3
15
77
1
18
92
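Two concrete pieces of PFGM++ that follow from the N+D-dim embedding, as we read the paper: the heavy-tailed perturbation kernel can be sampled exactly via a Beta draw, and r = sigma*sqrt(D) aligns a PFGM++ noise schedule with a diffusion sigma-schedule, recovering diffusion as D grows. A hedged NumPy sketch, not the authors' code:

```python
import numpy as np

def perturb_pfgmpp(x, r, D, rng=np.random.default_rng(0)):
    """Sample from the PFGM++ perturbation kernel
    p_r(x_tilde | x) ∝ (||x_tilde - x||^2 + r^2)^{-(N+D)/2}.

    The kernel factorizes into a uniform direction and a radius R whose
    squared fraction R^2 / (R^2 + r^2) is Beta(N/2, D/2)-distributed.
    """
    N = x.size
    b = rng.beta(N / 2.0, D / 2.0)       # squared-radius fraction
    R = r * np.sqrt(b / (1.0 - b))       # perturbation radius
    u = rng.standard_normal(N)
    u /= np.linalg.norm(u)               # uniform direction on the unit sphere
    return x + R * u

# Schedule alignment with diffusion models: r = sigma * sqrt(D), so larger D
# interpolates toward the Gaussian (D -> inf) diffusion limit.
sigma, D = 0.5, 2048
r = sigma * np.sqrt(D)
```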
@xuyilun2
Yilun Xu
11 months
Restart has been accepted to #NeurIPS23 and is deployed in the popular webui. Let's combine the best of SDE (better quality) and ODE (faster sampling) samplers!
1
11
71
@xuyilun2
Yilun Xu
8 months
Presenting "Restart Sampling for Improving Generative Processes" @ #NeurlPS2023 today! Poster #808 5pm-7pm Come by to chat about the fast sampling for diffusion models!
Tweet media one
2
11
64
@xuyilun2
Yilun Xu
9 months
Non-IID sampling can promote diversity and mitigate memorization in diffusion models!
@GabriCorso
Gabriele Corso
9 months
New paper!🤗 Do all your samples from Stable Diffusion or Dall-E look very similar to each other? It turns out IID sampling is to blame! We study this problem and propose Particle Guidance, a technique to obtain diverse samples that can be readily applied to your diffusion model!
Tweet media one
4
86
439
0
6
53
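The mechanism behind Particle Guidance, as the quoted thread describes it: sample the whole batch jointly and add the gradient of a repulsive pairwise log-potential to each particle's drift, so the samples push apart. A minimal PyTorch sketch with an RBF kernel potential; `score_fn`, `alpha`, and the plain Euler step are illustrative choices, not the paper's exact setup:

```python
import torch

def particle_guidance_step(x, t, dt, score_fn, alpha=0.1, bandwidth=1.0):
    """One joint (non-IID) update for a batch of particles x: (n, D).

    Each particle follows the usual score-based drift plus the gradient of
    log Phi(x_1..x_n), where Phi is a repulsive pairwise RBF potential.
    """
    x = x.detach().requires_grad_(True)
    diff = x[:, None, :] - x[None, :, :]          # (n, n, D) pairwise offsets
    d2 = (diff ** 2).sum(-1)                      # squared pairwise distances
    # log-potential: negative kernel sum, so it is larger when particles spread
    log_phi = -torch.exp(-d2 / (2 * bandwidth ** 2)).sum()
    (grad_phi,) = torch.autograd.grad(log_phi, x)
    with torch.no_grad():
        return x + dt * (score_fn(x, t) + alpha * grad_phi)
```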
@xuyilun2
Yilun Xu
2 years
Check out our ICLR spotlight paper on constructing **orthogonal classifiers**, which enable new capabilities or outperform baselines in three tasks:
- Controlled style transfer
- Domain adaptation with label shifts
- Fairness
Paper: Code:
Tweet media one
Tweet media two
1
2
40
@xuyilun2
Yilun Xu
1 year
Excited to share our latest paper which establishes a duality between generative models and physical processes 😃.
@ZimingLiu11
Ziming Liu
1 year
Generative models have been inspired by physics, but Eureka-type "inspirations" are mysterious. Is there a systematic way to convert physical processes into generative models? The answer is yes! This will greatly expand the design space of generative models.
2
54
231
1
2
33
@xuyilun2
Yilun Xu
3 months
@kohjingyu This may not necessarily be a binary problem (diffusion versus auto-regressive). It is indeed possible to integrate the strengths of the two into a single model through a novel training pipeline. Stay tuned for our new model designed to achieve this 🙂
4
1
32
@xuyilun2
Yilun Xu
11 months
Quanta Magazine just released an article featuring our recent series of works on generative AI 😬
@QuantaMagazine
Quanta Magazine
11 months
Researchers are exploring whether “physics-inspired generative models” might offer more transparent and effective forms of artificial intelligence. Steve Nadis reports:
1
41
166
0
0
17
@xuyilun2
Yilun Xu
29 days
Diffusion models transform a simple Gaussian into the complex, multimodal data distribution through an ODE. This ODE mapping is necessarily highly complex, with strong curvature (see the middle figure). (2/9)
Tweet media one
1
3
16
@xuyilun2
Yilun Xu
2 years
@JosephJacks_ @nearcyan Thanks! Our recent experiments show that we can further achieve a 100x-200x speedup with no loss in image quality, given some improvements to the sampling methods. Stay tuned 😉
0
1
16
@xuyilun2
Yilun Xu
2 years
Happy to see the great application of V-information in the NLP domain😋
@swabhz
Swabha Swayamdipta
2 years
🎉🎉Super thrilled that our paper on Understanding Dataset Difficulty with V-usable information received an outstanding paper award at #ICML2022 !! 🥳Looking forward to the broader applications of this framework. It was a total delight working with my @allen_ai intern, @ethayarajh
16
29
349
1
0
15
@xuyilun2
Yilun Xu
29 days
An additional autoregressive model then fits the distribution of the discrete latents post hoc. The discrete latents capture global statistics in Euclidean space, such as layouts, shapes, and color variations. These statistics are complementary to semantics, such as class labels. (5/9)
Tweet media one
1
0
11
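A minimal sketch of that post-hoc stage: a tiny causal transformer fit to the sequence of discrete latent codes with next-token cross-entropy. The architecture below is illustrative rather than the paper's prior; `K` latents with codebook size `V` are assumptions:

```python
import torch
import torch.nn as nn

class LatentPrior(nn.Module):
    """Tiny causal transformer over K categorical latents (codebook size V),
    trained post hoc with next-token cross-entropy."""

    def __init__(self, K=10, V=100, d=128):
        super().__init__()
        self.tok = nn.Embedding(V + 1, d)     # +1 for a BOS token
        self.pos = nn.Embedding(K, d)
        layer = nn.TransformerEncoderLayer(d, nhead=4, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d, V)
        self.K, self.V = K, V

    def forward(self, z):                      # z: (B, K) integer codes
        bos = torch.full((z.size(0), 1), self.V, device=z.device)
        h = self.tok(torch.cat([bos, z[:, :-1]], dim=1))     # shifted input
        h = h + self.pos(torch.arange(self.K, device=z.device))
        mask = nn.Transformer.generate_square_subsequent_mask(self.K).to(z.device)
        h = self.blocks(h, mask=mask)          # causal self-attention
        return self.head(h)                    # (B, K, V) next-code logits

# training: logits = prior(z); loss = F.cross_entropy(logits.reshape(-1, V), z.reshape(-1))
```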
@xuyilun2
Yilun Xu
2 years
A very detailed and educational blog post on PFGM!
@r_o_connor
Ryan O'Connor
2 years
Stable Diffusion runs on physics-inspired Deep Learning. Researchers from MIT (first authors @ZimingLiu11 and @xuyilun2 ) have recently unveiled a new physics-inspired model that runs even faster! This introduction has everything you need to know 👇
2
34
162
0
1
12
@xuyilun2
Yilun Xu
29 days
Empirically, DisCo-Diff consistently improves model performance on several image synthesis tasks and on molecular docking. It achieves a new state of the art on ImageNet-64/ImageNet-128 with an ODE sampler. (6/9)
Tweet media one
1
1
9
@xuyilun2
Yilun Xu
29 days
To this end, we augment the diffusion model with learnable discrete latents, inferred by an encoder, and train the diffusion model and encoder end-to-end. The encoder is encouraged to encode the global discrete structure of the data into the latents, helping the denoiser reconstruct the data. (4/9)
1
1
9
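A minimal sketch of this end-to-end stage. The straight-through Gumbel-softmax used for the discrete latents is an assumption of the sketch (one common choice, not necessarily the paper's), and `encoder`, `denoiser`, and `sigma_dist` are placeholders:

```python
import torch
import torch.nn.functional as F

def disco_diff_loss(x0, encoder, denoiser, sigma_dist):
    """One DisCo-Diff training step (sketch): infer discrete latents from the
    clean image, condition the denoiser on them, train both end-to-end."""
    logits = encoder(x0)                               # (B, K, V) per-latent logits
    # differentiable discrete sampling via straight-through Gumbel-softmax
    # (an assumption of this sketch)
    z = F.gumbel_softmax(logits, tau=1.0, hard=True)   # (B, K, V) one-hot codes
    sigma = sigma_dist(x0.size(0)).view(-1, 1, 1, 1)   # sampled noise levels
    x_t = x0 + sigma * torch.randn_like(x0)            # perturb the data
    x0_hat = denoiser(x_t, sigma, z)                   # denoiser sees the latents
    return F.mse_loss(x0_hat, x0)                      # gradients reach the encoder
```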
@xuyilun2
Yilun Xu
29 days
We also test DisCo-Diff on molecular docking, building upon the DiffDock framework. Discrete latents provide improvements in this domain as well, with success rates on the full dataset increasing from 32.9% to 35.4% and from 13.9% to 18.5%. (7/9)
Tweet media one
1
0
9
@xuyilun2
Yilun Xu
1 year
In practice, Restart beats SDE and ODE samplers in both speed and quality on CIFAR-10 and ImageNet-64. Additionally, Restart achieves a better balance among text-image alignment, visual quality, and diversity on large-scale text-to-image Stable Diffusion! Code:
1
0
9
@xuyilun2
Yilun Xu
29 days
We believe DisCo-Diff could be further extended to text-to-image/video generation, where we would expect discrete latent variables to offer benefits complementary to the text conditioning, similar to how discrete latents boost performance in our class-conditional experiments. (8/9)
6
0
7
@xuyilun2
Yilun Xu
2 years
Inspired by the electric field in physics, we interpret the data points as electric charges on the z = 0 hyperplane in a space augmented with an additional dimension z. The electric field lines transform the data distribution into a uniform distribution on a large hemisphere. 3/n
Tweet media one
1
3
7
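In formulas, the field this picture refers to (up to the paper's choice of normalizing constant, where S_N(1) is the surface area of the unit N-sphere and tildes denote augmented (N+1)-dim coordinates):

```latex
E(\tilde{x}) \;=\; \frac{1}{S_N(1)} \int
\frac{\tilde{x} - \tilde{y}}{\lVert \tilde{x} - \tilde{y} \rVert^{N+1}}
\, p(y) \, \mathrm{d}y,
\qquad \tilde{y} = (y, 0).
```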
@xuyilun2
Yilun Xu
2 months
@jon_barron That's how we construct the prior distribution in PFGM (projecting a uniform distribution on the sphere onto a hyperplane). To generate a Gaussian, one could simply kick balls towards a cylinder at uniform angles in an infinite-dimensional space, as shown in PFGM++.
0
0
6
@xuyilun2
Yilun Xu
2 years
@alexjc Thanks for your interest! Our PFGM is different from diffusion models: diffusion models arise from thermodynamics, while PFGM is inspired by electrostatics. PFGM outperforms diffusion models in sample quality and sampling speed :), using a similar architecture.
0
1
6
@xuyilun2
Yilun Xu
29 days
Conversely, using the known global discrete structure of the data (e.g., the indices of modes) as input to the diffusion model reduces the curvature of the ODE path (see the right figure above). A key challenge remains: how do we infer this discrete structure directly from the data? (3/9)
1
0
6
@xuyilun2
Yilun Xu
1 year
ODE samplers are fast but plateau in performance, while SDE samplers deliver better samples at the cost of increased sampling time. We attribute this difference to sampling errors: ODEs involve smaller discretization errors, while the stochasticity in SDEs contracts accumulated errors.
Tweet media one
1
0
6
@xuyilun2
Yilun Xu
2 years
We also design a backward ODE for sampling. The backward ODE transforms samples from the uniform distribution on the hemisphere into the data distribution. 5/n
Tweet media one
1
1
5
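A minimal sketch of integrating that backward ODE, assuming a trained network `field_net` that returns the normalized field split into its data and augmented components (E_x, E_z); the Euler discretization and the geometric z-grid are illustrative:

```python
import math
import torch

@torch.no_grad()
def pfgm_backward_ode(x, z_max, field_net, n_steps=100, z_min=1e-3):
    """Integrate the PFGM backward ODE dx/dz = E_x / E_z from z_max down to
    z ~ 0, transporting hemisphere samples back to the data plane."""
    # geometric z-grid: most of the trajectory bending happens near z = 0
    zs = torch.logspace(math.log10(z_max), math.log10(z_min), n_steps + 1)
    for z_cur, z_next in zip(zs[:-1], zs[1:]):
        E_x, E_z = field_net(x, z_cur)         # normalized field components
        x = x + (z_next - z_cur) * E_x / E_z   # Euler step (z decreases)
    return x
```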
@xuyilun2
Yilun Xu
2 years
We learn the high-dimensional analogue of the electric field, termed the Poisson field, with a neural network. Specifically, we use a large batch to calculate the normalized field. 4/n
Tweet media one
1
1
5
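A sketch of that large-batch target: the empirical Poisson field at a perturbed point is an average of inverse-power contributions from a big batch of data charges, normalized to unit length before regression. The paper applies an additional rescaling of the target that this sketch omits:

```python
import torch

def empirical_poisson_field(x_tilde, batch_y, eps=1e-8):
    """Normalized empirical Poisson field at augmented points x_tilde (B, N+1),
    estimated from a large batch of augmented data charges batch_y (M, N+1)."""
    diff = x_tilde[:, None, :] - batch_y[None, :, :]         # (B, M, N+1)
    dist = diff.norm(dim=-1, keepdim=True).clamp_min(eps)    # (B, M, 1)
    N = x_tilde.size(1) - 1                                  # data dimension
    field = (diff / dist ** (N + 1)).mean(dim=1)             # (B, N+1) raw field
    # the network regresses onto the field's direction (normalized field)
    return field / field.norm(dim=-1, keepdim=True).clamp_min(eps)
```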
@xuyilun2
Yilun Xu
2 years
Our method achieves SOTA results on the CIFAR-10 dataset within the flow family, faster sampling speed than SDEs in score-based and diffusion models, and more robustness. It also scales to higher-resolution datasets, e.g., LSUN bedroom 256x256. n/n
Tweet media one
0
1
4
@xuyilun2
Yilun Xu
1 year
1
0
4
@xuyilun2
Yilun Xu
1 year
Joint work w/ @Goodeat258 , Xiang Cheng, @YonglongT , @ZimingLiu11 and Tommi Jaakkola
0
0
4
@xuyilun2
Yilun Xu
1 year
Based on these findings, we propose a novel sampling algorithm called Restart to better balance discretization errors and contraction. The sampler alternates between adding substantial noise in additional forward steps and strictly following a backward ODE.
1
0
4
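A minimal sketch of the Restart loop just described, assuming an EDM-style schedule where the noise standard deviation equals t; `ode_solve` stands in for any deterministic backward-ODE solver, and the (t_min, t_max, K) choices are tuned per task in the paper:

```python
import torch

@torch.no_grad()
def restart_sample(x_T, T, t_min, t_max, K, ode_solve):
    """Restart sampling (sketch). ode_solve(x, t_start, t_end) runs a
    deterministic backward ODE solver between two noise levels."""
    x = ode_solve(x_T, T, t_min)                       # descend into the restart interval
    for _ in range(K):
        # forward jump: fresh noise moves the marginal from t_min back to t_max
        x = x + torch.randn_like(x) * (t_max**2 - t_min**2) ** 0.5
        x = ode_solve(x, t_max, t_min)                 # strictly follow the ODE back
    return ode_solve(x, t_min, 0.0)                    # final denoise to t = 0
```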
@xuyilun2
Yilun Xu
10 months
@s_mandt I think other forward processes like PFGM++ and EDM already get FIDs smaller than the number in this paper on CIFAR-10?
1
0
1
@xuyilun2
Yilun Xu
11 months
In our updated version, we will show that Restart sampling can also produce better samples in the low-NFE regime (~20 NFE) on benchmarks and Stable Diffusion.
1
0
3
@xuyilun2
Yilun Xu
2 years
Just set up my Twitter account. Sorry for the huge delay 😃 @baaadas @shengjia_zhao @StefanoErmon
@DavidDuvenaud
David Duvenaud
4 years
I really like this new paper on "Usable Information under Computational Constraints". It generalizes Shannon information to consider the ease of making predictions using a particular representation. by @baaadas , @shengjia_zhao , @StefanoErmon et al.
4
31
250
0
0
2
@xuyilun2
Yilun Xu
2 years
joint work with @ZimingLiu11 , @tegmark and Tommi Jaakkola 2/n
1
0
2
@xuyilun2
Yilun Xu
3 months
@GuangHeLee1 Thanks, Guang-He bro!
0
0
2
@xuyilun2
Yilun Xu
2 years
@_rk_singhal Feel free to add it into the list :)
0
0
2
@xuyilun2
Yilun Xu
3 months
@DibyajyotiAch04 Thank you! I will share the link soon!
0
0
1
@xuyilun2
Yilun Xu
1 year
@janekm Thanks for the information! We plan to submit a PR to the diffusers repo, and will also take a look at automatic1111!
0
0
1
@xuyilun2
Yilun Xu
3 months
@DrYangSong Thank you Yang!
0
0
1
@xuyilun2
Yilun Xu
3 months
@chenlin_meng Thank you Chenlin!
0
0
1
@xuyilun2
Yilun Xu
3 months
@SahajGarg6 Thanks Sahaj!
0
0
1
@xuyilun2
Yilun Xu
11 months
@timudk Thanks Tim, we will look into it :)
0
0
1
@xuyilun2
Yilun Xu
2 years
@menghua_wu 😍cute dogs
0
0
1
@xuyilun2
Yilun Xu
2 years
awesome collaborators @hehaodele , Tianxiao Shen and Tommi Jaakkola
0
0
1
@xuyilun2
Yilun Xu
1 year
@YonglongT Congrats Cambridge A Long!
1
0
1
@xuyilun2
Yilun Xu
1 year
@madebyollin Thanks for pointing this out; unfortunately, that's a limitation of the current method, and one has to construct separate reference batches for different conditions. But please try STF if you can maintain these batches! (Naive DSM is STF with batch size = 1.)
0
0
1
@xuyilun2
Yilun Xu
3 months
@adityagrover_ Thank you, Aditya!
0
0
1