Jiatao Gu

@thoma_gu

3,627
Followers
1,851
Following
52
Media
405
Statuses

Machine Learning Researcher at @Apple ML Research (MLR) based in NYC | ex-FAIRer | PhD from HKU | Research on Generative AI for multimodalities. I can also speak Japanese.

New York, USA
Joined October 2012
Pinned Tweet
@thoma_gu
Jiatao Gu
4 months
🚀Excited to introduce KaleidoDiffusion -- a new method that improves conditional diffusion model generation by incorporating autoregressive latent priors! This allows us to generate much more diverse outputs, even with high CFG, just like a kaleidoscope🔭! (1/n)
Tweet media one
@_akhaliq
AK
4 months
Kaleido Diffusion Improving Conditional Diffusion Models with Autoregressive Latent Modeling Diffusion models have emerged as a powerful tool for generating high-quality images from textual descriptions. Despite their successes, these models often exhibit limited diversity in
Tweet media one
2
70
325
3
32
168
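A rough sketch of the idea in the Kaleido thread above, under my own assumptions rather than the paper's actual implementation: discrete latent tokens are first sampled from an autoregressive prior, and the classifier-free-guided diffusion sampler is then conditioned on both the prompt and those latents, so diversity comes from the latent draw even at high guidance scales. `ARPrior`, `cfg_sample`, and the dummy denoiser below are hypothetical stand-ins.

```python
import torch

class ARPrior(torch.nn.Module):
    """Toy autoregressive prior over discrete latent tokens (hypothetical stand-in)."""
    def __init__(self, vocab=32, dim=64):
        super().__init__()
        self.embed = torch.nn.Embedding(vocab + 1, dim)      # last index acts as BOS
        self.rnn = torch.nn.GRU(dim, dim, batch_first=True)
        self.head = torch.nn.Linear(dim, vocab)

    def sample(self, batch, k=8):
        tokens = torch.full((batch, 1), 32, dtype=torch.long)   # start from BOS
        for _ in range(k):
            h, _ = self.rnn(self.embed(tokens))
            probs = self.head(h[:, -1]).softmax(-1)
            tokens = torch.cat([tokens, torch.multinomial(probs, 1)], dim=1)
        return tokens[:, 1:]                                  # drop BOS

def cfg_sample(denoiser, cond, latents, steps=10, scale=5.0):
    """Toy CFG loop conditioned on (prompt, latents); a placeholder update rule, not a real sampler."""
    x = torch.randn(cond.shape[0], 3, 16, 16)
    for t in reversed(range(steps)):
        eps_c = denoiser(x, t, cond, latents)
        eps_u = denoiser(x, t, torch.zeros_like(cond), torch.zeros_like(latents))
        x = x - 0.1 * (eps_u + scale * (eps_c - eps_u))
    return x

# usage: different latent draws give different samples even with a large guidance scale
denoiser = lambda x, t, c, z: x * 0.0 + z.float().mean() * 0.01   # dummy network
prior, cond = ARPrior(), torch.randn(2, 64)
z = prior.sample(batch=2)
print(cfg_sample(denoiser, cond, z).shape, z.tolist())
```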
@thoma_gu
Jiatao Gu
11 months
📢 Introducing our latest research @Apple MLR for generating high-quality images & videos with a multi-resolution diffusion model -- Matryoshka Diffusion Models or MDM🪆, directly in pixel space (~1024px) without any VAEs or cascaded models. Code will be released soon! (1/n)
Tweet media one
@_akhaliq
AK
11 months
Matryoshka Diffusion Models paper page: Diffusion models are the de facto approach for generating high-quality images and videos, but learning high-dimensional models remains a formidable task due to computational and optimization challenges. Existing
Tweet media one
13
90
521
17
156
803
@thoma_gu
Jiatao Gu
3 years
I am super excited that the code of our recent ICLR2022 paper, "StyleNeRF: A Style-based 3D-Aware Generator for High-resolution Image Synthesis", has been released! Please check Paper: Code: Project page:
3
107
651
@thoma_gu
Jiatao Gu
2 years
Life Update: After four wonderful years at FAIR Labs, I've decided to move on to join Apple MLR led by Samy Bengio. I will continue working on representation learning and generative models for text, vision and multimodality. Feel free to reach out if you want to work together!
8
6
399
@thoma_gu
Jiatao Gu
4 years
Happy New Year!! I am super excited to share our new pre-print “Fully Non-autoregressive Neural Machine Translation: Tricks of the Trade”, joint work with @XiangKong4 . Please check it out! (1/2)
Tweet media one
3
37
228
@thoma_gu
Jiatao Gu
5 years
Super excited to announce my first NeurIPS paper was accepted! We propose the Levenshtein Transformer, which learns to insert and delete words iteratively for sequence generation and refinement tasks! Thanks to my reliable coauthors @ChanghanWang @JakeZzzzzzz !
Tweet media one
4
47
222
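A hedged sketch of the insert-and-delete refinement loop the tweet above describes (not the actual Levenshtein Transformer code): each iteration first drops tokens flagged by a deletion policy, then opens placeholder slots between surviving tokens, and finally fills each placeholder with a predicted token. The three policies here are trivial toy stand-ins for the model's learned classifiers.

```python
def refine(tokens, delete_policy, insert_policy, fill_policy, max_iters=1):
    """Toy insert/delete refinement loop in the spirit of Levenshtein Transformer."""
    for _ in range(max_iters):
        # 1) deletion: drop tokens the deletion policy marks for removal
        tokens = [t for t in tokens if not delete_policy(t)]
        # 2) placeholder insertion: decide how many slots to open after each token
        with_slots = []
        for t in tokens:
            with_slots.append(t)
            with_slots.extend(["<plh>"] * insert_policy(t))
        # 3) token prediction: fill every placeholder slot
        tokens = [fill_policy(t) if t == "<plh>" else t for t in with_slots]
    return tokens

# toy policies: delete the junk token, open one slot after "cat", fill slots with "the"
out = refine(
    ["a", "cat", "sat", "junk"],
    delete_policy=lambda t: t == "junk",
    insert_policy=lambda t: 1 if t == "cat" else 0,
    fill_policy=lambda t: "the",
)
print(out)  # ['a', 'cat', 'the', 'sat']
```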
@thoma_gu
Jiatao Gu
1 year
🪘🪘New pre-print!! I’m delighted to share our latest work @Apple MLR “BOOT👢: Data-free Distillation of Denoising Diffusion Models with Bootstrapping.” We explore a novel method that can distill your favorite diffusion models into ONE STEP without using training data!🔆 (1/6)
@_akhaliq
AK
1 year
BOOT: Data-free Distillation of Denoising Diffusion Models with Bootstrapping paper page: Diffusion models have demonstrated excellent potential for generating diverse images. However, their performance often suffers from slow generation due to iterative
6
42
163
1
47
185
@thoma_gu
Jiatao Gu
5 years
[1/7] Super excited to present our recent work -- mBART. We demonstrate multilingual denoising pre-training produces significant gains across a variety of machine translation tasks! Joint work with @YinhanL @NamanGoyal21 @xl_nlp @edunov @gh_marjan @ml_perception @LukeZettlemoyer
@AIatMeta
AI at Meta
5 years
We're releasing mBART, a new seq2seq multilingual pretraining system for machine translation across 25 languages. It gives significant improvements for document-level translation and low-resource languages. Read our paper to learn more:
Tweet media one
13
356
977
2
43
178
@thoma_gu
Jiatao Gu
9 months
Thrilled to announce that our "Matryoshka Diffusion Models" paper got accepted at #ICLR2024 ! Huge thanks to the amazing Apple MLR colleagues @zhaisf @YizheZhangNLP @jsusskin Navdeep Jaitly for their efforts. See you in Vienna! 🚀 #MachineLearning
@thoma_gu
Jiatao Gu
11 months
📢 Introducing our latest research @Apple MLR for generating high-quality images & videos with a multi-resolution diffusion model -- Matryoshka Diffusion Models or MDM🪆, directly in pixel space (~1024px) without any VAEs or cascaded models. Code will be released soon! (1/n)
Tweet media one
17
156
803
5
16
148
@thoma_gu
Jiatao Gu
10 months
I'll attend #NeurIPS in person next week, presenting our recent works: PLANNER (Tue morning, #1921) and Diffusion without Attention (Fri all day, workshop on DM). I'm excited to see you soon and chat about multimodal & diffusion models!
3
11
121
@thoma_gu
Jiatao Gu
2 years
I'm happy to share our latest work, “f-DM: A Multi-stage Diffusion Model via Progressive Signal Transformation.” This is joint work with my Apple colleagues, @zhaisf @icaruszyz @itsbautistam @jsusskin (1/6)
Tweet media one
@_akhaliq
AK
2 years
f-DM: A Multi-stage Diffusion Model via Progressive Signal Transformation abs: project page: propose f-DM, an end-to-end non-cascaded diffusion model that allows progressive signal transformations along diffusion
Tweet media one
0
9
65
2
19
112
@thoma_gu
Jiatao Gu
1 year
Excited to share that our work, "NerfDiff: Single-image View Synthesis with NeRF-guided Distillation from 3D-aware Diffusion", was accepted at #ICML2023 Huge thanks to amazing coauthors @alextrevith @kaien_lin @jsusskin @LingjieLiu1 and Ravi Ramamoorthi. See you in Honolulu!
@_akhaliq
AK
2 years
NerfDiff: Single-image View Synthesis with NeRF-guided Distillation from 3D-aware Diffusion abs: project page:
3
38
258
2
17
102
@thoma_gu
Jiatao Gu
10 months
It is huge fun working with Sasha! Please check our recent work exploring better architectures for diffusion models, where we replace all the attention with linear RNNs, achieving much better efficiency with no patchification needed. Thanks @NathanYan2012 @srush_nlp @Apple
@srush_nlp
Sasha Rush
10 months
As with LMs, modern Diffusion models rely heavily on Attention. This improves quality but requires patching to scale. Working with Apple, we designed a model without attention that matches top imagenet accuracy and removes this resolution bottleneck.
Tweet media one
10
112
667
1
9
90
@thoma_gu
Jiatao Gu
1 year
Excited to be in person for #ICML2023 in Hawai'i🌴🌊! I'll be presenting two posters (NerfDiff, σREPARAM) on Tuesday and giving a contributed talk (BOOT) on Friday. Please ping me if you want to chat about diffusion models, transformers, and 3D!!
Tweet media one
Tweet media two
Tweet media three
Tweet media four
2
14
76
@thoma_gu
Jiatao Gu
1 year
A bit over one year at Apple... the commit pattern clearly shows when the deadlines are😂
Tweet media one
4
1
69
@thoma_gu
Jiatao Gu
4 years
Super excited to announce that our recent work "Cross-lingual Retrieval for Iterative Self-Supervised Training (CRISS)" has been accepted as *spotlight* presentation at NeurIPS2020!! Congrats to all my amazing colleagues! @mr_cheu @xl_nlp and Yuqing Tang at @facebookai
@mr_cheu
Chau Tran
4 years
Introducing our new work "Cross-lingual Retrieval for Iterative Self-Supervised Training" () Joint work with Yuqing Tang, @xl_nlp , @thoma_gu ( @facebookai ) 0/4
Tweet media one
1
3
15
2
11
72
@thoma_gu
Jiatao Gu
1 year
Just arrived in Vancouver for #CVPR2023 from June 18-22. I'm thrilled about my first CVPR experience and eager to chat about generative models, 3D, and MLR! Please visit our poster on 3D-aware diffusion models at the 3DMV workshop () on June 19!
Tweet media one
Tweet media two
Tweet media three
Tweet media four
2
3
67
@thoma_gu
Jiatao Gu
1 year
I will be attending #ICCV2023 in person in Paris and presenting our poster on "Single-stage diffusion (SSD)-NeRF" on Wednesday 4th, 10:30 AM-12:30 PM! Looking forward to meeting people and talking about diffusion models and 3D!
@thoma_gu
Jiatao Gu
1 year
Please check our recent #ICCV2023 paper on SSD-NeRF () !! We proposed a unified view of 3D generation and reconstruction by learning a "single-stage" 3D diffusion model directly from 2D images.
0
5
38
0
3
63
@thoma_gu
Jiatao Gu
11 months
Please check out our paper for more details! 📃 Paper: Huge thanks to incredible collaborators @Apple MLR @zhaisf @YizheZhangNLP @jsusskin Navdeep Jaitly for their amazing contributions! 📷 A special thanks to @_akhaliq for reposting our work! (6/n)
Tweet media one
1
8
60
@thoma_gu
Jiatao Gu
11 months
Additional video results (7/n, n=7)
1
6
58
@thoma_gu
Jiatao Gu
7 years
Our submission to ICLR "Non-Autoregressive Neural Machine Translation" got accepted!! Congrats @jekbradbury @RichardSocher @CaimingXiong See you in Vancouver!!
1
5
56
@thoma_gu
Jiatao Gu
4 years
(1/3) Super excited to present our recent work: Neural Sparse Voxel Fields (NSVF): a hybrid neural scene representation for fast and high-quality free-viewpoint rendering. Joint work with @LingjieLiu1 (MPI), Zaw Lin (NUS), Tat-Seng Chua (NUS) and Christian Theobalt (MPI).
Tweet media one
2
9
53
@thoma_gu
Jiatao Gu
4 years
Super excited to announce that our recent work "Neural Sparse Voxel Fields (NSVF)"() has been accepted as *spotlight* presentation at NeurIPS2020!! Also the code and data have been released! Please checkout .
@thoma_gu
Jiatao Gu
4 years
(1/3) Super excited to present our recent work: Neural Sparse Voxel Fields (NSVF): a hybrid neural scene representation for fast and high-quality free-viewpoint rendering. Joint work with @LingjieLiu1 (MPI), Zaw Lin (NUS), Tat-Seng Chua (NUS) and Christian Theobalt (MPI).
Tweet media one
2
9
53
0
7
48
@thoma_gu
Jiatao Gu
11 months
MDM is a single generative model that handles various high-resolution targets: Images 🖼️ Text-to-Images 📜➡️🖼️ Text-to-Videos 📜➡️🎥 Distinct from existing works, MDM doesn't need a pre-trained VAE (e.g., SD) or multiple upscaling modules (e.g., IMAGEN). (2/n)
Tweet media one
Tweet media two
Tweet media three
2
7
45
@thoma_gu
Jiatao Gu
1 year
Sharing our recent #NeurIPS2023 paper on latent diffusion for text generation. PLANNER is a diffusion model in the latent space, connected with an autoregressive language decoder, which can generate more diverse and coherent text.
@YizheZhangNLP
Yizhe Zhang
1 year
🎉 Thrilled to announce that our Planner paper has been accepted at #NeurIPS2023 ! 📚 If you're searching for a latent text diffusion approach that creates diverse and coherent text, check out our research! 😄 Code will be released soon! #TextGeneration #Diffusion #NLG
0
2
15
1
3
43
@thoma_gu
Jiatao Gu
2 years
Our paper "data2vec" has been accepted as long presentation at ICML2022! Please check it out!!
@MichaelAuli
Michael Auli
2 years
data2vec is a long talk at this year's @icmlconf . Congratulations to @ZloiAlexei @mhnt1580 @QiantongX @arunbabu1234 @thoma_gu ! Updated paper: with new ablations showing that contextualized target representations work very well.
4
8
49
0
2
44
@thoma_gu
Jiatao Gu
11 months
How? We propose a diffusion process that denoises inputs at multiple resolutions jointly, using a NestedUNet architecture. Just like a "Matryoshka doll", our nested UNet embeds lower-resolution UNets inside the higher-resolution ones.🪆 We can do the same for both images & videos. (3/n)
Tweet media one
Tweet media two
Tweet media three
Tweet media four
1
4
40
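One possible way to read the "nested UNet" description in the thread above, given purely as a toy sketch under my own assumptions and not the released MDM architecture: a small low-resolution UNet sits inside the higher-resolution one, the outer model hands it a downsampled view, and the inner prediction is upsampled and fused back so that all resolutions are denoised jointly.

```python
import torch
from torch import nn
import torch.nn.functional as F

class TinyUNet(nn.Module):
    """Minimal stand-in for a low-resolution denoiser (not the real MDM UNet)."""
    def __init__(self, ch=16):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(3, ch, 3, padding=1), nn.SiLU(),
                                 nn.Conv2d(ch, 3, 3, padding=1))
    def forward(self, x):
        return self.net(x)

class NestedUNetSketch(nn.Module):
    """Toy nested denoiser: the inner (low-res) UNet is embedded in the outer one."""
    def __init__(self, ch=16):
        super().__init__()
        self.inner = TinyUNet(ch)                      # operates at half resolution
        self.outer_in = nn.Conv2d(3, ch, 3, padding=1)
        self.fuse = nn.Conv2d(ch + 3, 3, 3, padding=1)

    def forward(self, x_hi):
        x_lo = F.avg_pool2d(x_hi, 2)                   # hand the inner UNet a low-res view
        eps_lo = self.inner(x_lo)                      # low-res denoising prediction
        feat_hi = F.silu(self.outer_in(x_hi))          # high-res features
        eps_up = F.interpolate(eps_lo, scale_factor=2, mode="nearest")
        eps_hi = self.fuse(torch.cat([feat_hi, eps_up], dim=1))
        return eps_hi, eps_lo                          # supervise both resolutions jointly

model = NestedUNetSketch()
eps_hi, eps_lo = model(torch.randn(1, 3, 64, 64))
print(eps_hi.shape, eps_lo.shape)                      # (1, 3, 64, 64) and (1, 3, 32, 32)
```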
@thoma_gu
Jiatao Gu
5 years
Please check out our recent work with @seayong08 @kchonyc and Victor! We found that vanilla zero-shot NMT usually fails due to spurious correlations in the data, and we proposed simple approaches to fix it! Accepted by ACL2019. Thanks for your attention!
@arxiv_cscl
arXiv CS-CL
5 years
Improved Zero-shot Neural Machine Translation via Ignoring Spurious Correlations
0
0
1
2
7
40
@thoma_gu
Jiatao Gu
2 years
Attending #ECCV2022 Oct 23-27 in Tel Aviv, my first in-person conference in the last three years!! Happy to chat about research happening at @Apple MLR. We are also looking for interns who are interested in generative models for text, images, videos, 3D, and multimodal!
1
0
40
@thoma_gu
Jiatao Gu
11 months
With these improvements, MDM can train a single pixel-space model at impressive resolutions (e.g., 1024x1024). To achieve these results, we only need a compact dataset like CC12M and a few days of training with just 3-4 nodes of 8 A100 GPUs. 🔥🚀 (5/n)
Tweet media one
Tweet media two
Tweet media three
1
3
39
@thoma_gu
Jiatao Gu
1 year
Please check our recent #ICCV2023 paper on SSD-NeRF () !! We proposed a unified view of 3D generation and reconstruction by learning a "single-stage" 3D diffusion model directly from 2D images.
@haosu_twitr
Hao Su
1 year
Our paper 'Single-Stage Diffusion NeRF' will be presented at #ICCV2023 . We merge 3D diffusion with NeRF into a holistic model, providing priors for both 3D generation and reconstruction (from an arbitrary number of views). Check it out here: #NeRF #AI
0
24
115
0
5
38
@thoma_gu
Jiatao Gu
1 year
On my way to Kigali for #ICLR2023, attending in person to present our poster on diffusion models with signal transformations! It will be a long flight, arriving on April 30. Looking forward to seeing friends and chatting more about generative models, 3D, and opportunities at @Apple MLR!
@thoma_gu
Jiatao Gu
2 years
I'm happy to share our latest work, “f-DM: A Multi-stage Diffusion Model via Progressive Signal Transformation.” This is joint work with my Apple colleagues, @zhaisf @icaruszyz @itsbautistam @jsusskin (1/6)
Tweet media one
2
19
112
1
8
39
@thoma_gu
Jiatao Gu
9 months
Please check out this incredible introduction video about our recent effort on "diffusion models without attention"!! Thanks so much, @srush_nlp , for making this! Also, thanks, @NathanYan2012 , for your hard work. We should always stay curious and think outside the box!
@srush_nlp
Sasha Rush
9 months
New Video: RNNs for Diffusion? Short technical overview of "Diffusion Models without Attention" a recent paper on long-range models for image generation.
5
41
237
0
1
36
@thoma_gu
Jiatao Gu
2 years
For more details and results, please check the following: Paper: Project page: Thanks! (6/6)
0
5
31
@thoma_gu
Jiatao Gu
7 years
Our submission "Universal Neural Machine Translation for Extremely Low Resource Languages" has been accepted as the full paper oral presentation by NAACL-HLT 2018!! Please check out at This is a joint work with Hany and Jacob, congrats!
1
7
30
@thoma_gu
Jiatao Gu
9 months
Happy New Year!! Looking forward to 2024!
0
0
28
@thoma_gu
Jiatao Gu
2 years
@xutan_tx and I will virtually give a tutorial on "Non-autoregressive Sequence Generation" this Sunday, May 22, at 14:30-18:00 Irish Standard Time. #ACL2022NLP #NLProc Please come and check it out! More details:
1
6
25
@thoma_gu
Jiatao Gu
11 months
Besides, MDM isn't just innovative in its structure; we also propose a progressive training schedule that smoothly transitions from lower to higher resolutions, noticeably improving high-resolution generation.💡 (4/n)
Tweet media one
Tweet media two
1
2
25
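The progressive schedule mentioned in (4/n) above could look roughly like the toy sketch below; this is an assumed form for illustration, not the paper's exact recipe. Each resolution's loss is switched on at a milestone step and ramped linearly to full weight.

```python
def resolution_weights(step, milestones=(0, 10_000, 30_000), ramp=5_000):
    """Toy progressive schedule: resolution k starts contributing to the loss at
    milestones[k] and ramps linearly to full weight over `ramp` steps."""
    weights = []
    for start in milestones:
        if step < start:
            weights.append(0.0)
        else:
            weights.append(min(1.0, (step - start) / ramp))
    return weights  # e.g. loss weights for the [low, mid, high] resolution branches

for step in (0, 2_500, 12_000, 40_000):
    print(step, resolution_weights(step))
# 0      -> [0.0, 0.0, 0.0]   lowest resolution about to ramp in
# 2500   -> [0.5, 0.0, 0.0]
# 12000  -> [1.0, 0.4, 0.0]
# 40000  -> [1.0, 1.0, 1.0]
```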
@thoma_gu
Jiatao Gu
10 months
Please check our recent paper on DiffuSSM @NathanYan2012
@nousr_
no usr
10 months
i was curious if that new "mamba" layer could be used for image generation (tldr: ya prob) i hacked together a quick test following ideas from the diffusion transformer (DiT) and made "DiM". after an hour or so on my 4090 it seems that it's learning the oxford flowers dataset.
Tweet media one
20
40
334
1
5
25
@thoma_gu
Jiatao Gu
5 years
Thanks for checking out our new (in-progress) results! In joint work with @ChanghanWang @JakeZzzzzzz , we hope to provide a simple but efficient way of unifying sequence generation and refinement by learning both insertion and deletion operations!
@evolvingstuff
evolvingstuff
5 years
This is really cool - a transformer network that uses insertions and deletions as its primary operations. Roughly same performance, but up to 5x more efficient! Levenshtein Transformer
Tweet media one
1
51
175
1
6
22
@thoma_gu
Jiatao Gu
3 years
In this work, we propose StyleNeRF, a 3D-aware generative model for photo-realistic high-resolution image synthesis with high multi-view consistency, which can be trained on unstructured 2D images. Joint work with @LingjieLiu1 (MPII) Peng Wang (HKU) and Christian Theobalt (MPII)
Tweet media one
Tweet media two
Tweet media three
Tweet media four
1
4
23
@thoma_gu
Jiatao Gu
2 years
We ( @melbayad @MichaelAuli @EXGRV ) also worked on a very similar approach that adaptively changes the depth of decoding for MT, dating back to ICLR 2020
@mathemagic1an
Jay Hack
2 years
Current LLMs expend the same amount of computation on each token they generate. But some predictions are much harder than others! With CALM, the authors redirect computational resources to "hard" inferences for better perf (~50% speedup) Here's how 👇
12
118
1K
0
2
21
@thoma_gu
Jiatao Gu
4 years
In this work, we combine approaches from four aspects (data, model, loss function, and learning) and finally close the performance gap between fully non-autoregressive machine translation and Transformers while maintaining over 16x speed-up at inference time. (2/2)
Tweet media one
0
1
19
@thoma_gu
Jiatao Gu
5 years
Hi! We are at #NeurIPS2019 this week; come visit our poster on Levenshtein Transformer () today at 5pm! @ChanghanWang @JakeZzzzzzz
1
3
19
@thoma_gu
Jiatao Gu
6 years
Finally had my Ph.D. oral presentation! Looking forward to the next journey!
Tweet media one
0
0
18
@thoma_gu
Jiatao Gu
6 years
Check out our updated and extended version with clearer formulations and more experimental results!
@arxiv_cscl
arXiv CS-CL
6 years
Insertion-based Decoding with automatically Inferred Generation Order
0
1
5
0
3
19
@thoma_gu
Jiatao Gu
6 years
Thanks for checking out our new (in-progress) results! By doing insertion-based decoding, we can essentially generate a sequence in an arbitrary order!🤔 We can also make it learn to generate in a good order adaptively.😦🤭
0
1
17
@thoma_gu
Jiatao Gu
2 years
Thank you very much for coming to our tutorial! The slides for today's talk have been released on our webpage. Thanks again!
@thoma_gu
Jiatao Gu
2 years
@xutan_tx and I will virtually give a tutorial on "Non-autoregressive Sequence Generation" this Sunday, May 22, at 14:30-18:00 Irish Standard Time. #ACL2022NLP #NLProc Please come and check it out! More details:
1
6
25
1
2
18
@thoma_gu
Jiatao Gu
5 years
@kchonyc @ChanghanWang @stanfordnlp It turns out that 1130/2001 test pairs are exactly the same as pairs in the training set...
0
2
15
@thoma_gu
Jiatao Gu
3 years
0
2
13
@thoma_gu
Jiatao Gu
2 years
@dustinvtran What are "inverse CDF-like" tricks? There are no details or references, and I'm not even sure what the reasoning is here for why Yann is wrong. The following two tweets are also nonsense.
0
0
13
@thoma_gu
Jiatao Gu
1 year
Please check our recent work on latent diffusion for text generation led by my amazing colleague @YizheZhangNLP at Apple MLR!
@_akhaliq
AK
1 year
PLANNER: Generating Diversified Paragraph via Latent Language Diffusion Model paper page: propose PLANNER, a model that combines latent semantic diffusion with autoregressive generation, to generate fluent text while exercising global control over
Tweet media one
1
17
109
0
0
12
@thoma_gu
Jiatao Gu
6 years
Thanks for checking out our new (in-progress) results! By doing insertion-based decoding, we can essentially generate a sequence in an arbitrary order!🤔 We can also make it learn to generate in a good order adaptively.
@OfirPress
Ofir Press
6 years
New translation model (by @thoma_gu et al.) that does not generate the target sequence from left to right. Cool!
Tweet media one
0
17
64
0
3
12
@thoma_gu
Jiatao Gu
3 years
Check out our recent work on "unified" self-supervised learning for vision, speech and nlp!
@MichaelAuli
Michael Auli
3 years
New work! Humans appear to learn similarly for different modalities and so should machines! data2vec uses the same self-supervised algorithm to train models for vision, speech, and nlp. Paper: Blog: Code:
10
115
443
0
1
11
@thoma_gu
Jiatao Gu
7 years
Check out our new paper on Fully-Parallel Text Generation for Neural Machine Translation!!
0
2
11
@thoma_gu
Jiatao Gu
4 years
(3/3) With the sparse voxel structure, our method is over 10 times faster than the state-of-the-art (NeRF) at inference time while achieving higher quality results. Check out more at: paper: video:
Tweet media one
Tweet media two
Tweet media three
1
1
11
@thoma_gu
Jiatao Gu
2 years
This feels like a slippery slope argument. If there is a tool such as an LLM that can help us improve scientific writing, especially for non-English speakers, why should we ban it? What is the difference between an LLM and a dictionary? It is the author's responsibility to check the facts.
@Michael_J_Black
Michael Black
2 years
With LLMs for science out there ( #Galactica ) we need new ethics rules for scientific publication. Existing rules regarding plagiarism, fraud, and authorship need to be rethought for LLMs to safeguard public trust in science. Long thread about trust, peer review, & LLMs. (1/23)
30
130
507
1
1
10
@thoma_gu
Jiatao Gu
2 years
@emiel_hoogeboom @JonathanHeek @TimSalimans In our recent ICLR paper, we also proposed a very similar noise schedule adjustment for high-resolution and varying-resolution diffusion. Hope you may be interested!
@thoma_gu
Jiatao Gu
2 years
I'm happy to share our latest work, “f-DM: A Multi-stage Diffusion Model via Progressive Signal Transformation.” This is joint work with my Apple colleagues, @zhaisf @icaruszyz @itsbautistam @jsusskin (1/6)
Tweet media one
2
19
112
1
0
9
@thoma_gu
Jiatao Gu
11 months
@baaadas hmm I kind of disagree… even for the RS title, the training during a PhD is very useful and helpful in many respects… for example, you have more freedom to choose topics, to make mistakes, and to gain problem-solving skills without worrying about being fired?
0
0
9
@thoma_gu
Jiatao Gu
4 years
“Ceding land to appease Qin is like carrying firewood to put out a fire: as long as the firewood is not used up, the fire will not go out.”
0
0
9
@thoma_gu
Jiatao Gu
4 years
Please come to check our presentation and Q&A!
@LingjieLiu1
Lingjie Liu
4 years
If you're interested in neural scene representations and neural rendering, feel free to join us in 40mins on the Q&A live session of our NeurIPS Spotlight paper: Neural Sparse Voxel Fields: Q&A session at Dec 8th, 2020 @ 17:30 CET (8:30 AM PST)
0
2
14
0
0
9
@thoma_gu
Jiatao Gu
3 years
Please check the pre-recorded video for our ACL finding paper on fully non-autoregressive machine translation!
@XiangKong4
Xiang Kong
3 years
Fully NAT significantly reduces the inference latency, but with a quality drop compared to AT. Can we close the performance gap while maintaining the latency advantage? Please check out our ACL Findings paper: Fully Non-autoregressive Neural Machine Translation: Tricks of the Trade.
1
4
10
0
2
8
@thoma_gu
Jiatao Gu
2 years
There is one more thing... Together with the tutorial slides, we have finally open-sourced the code of last year's ACL 2021 paper on fully non-autoregressive translation (NAT). Joint work with @XiangKong4 Paper: Code:
Tweet media one
1
1
8
@thoma_gu
Jiatao Gu
1 year
Please check out our paper for more details! 📃 Paper Link: Huge thanks to my incredible collaborators @zhaisf @YizheZhangNLP @LingjieLiu1 @jsusskin for their amazing contributions! 👏 A special thanks to @_akhaliq for tweeting about our research! (6/6)
1
0
7
@thoma_gu
Jiatao Gu
4 years
(2/3) NSVF defines a set of voxel-bounded implicit fields organized in sparse voxels. We progressively learn the underlying voxel structures with a differentiable ray-marching operation from only a set of posed RGB images.
Tweet media one
Tweet media two
1
1
7
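As a hedged illustration of the sparse-voxel idea in (2/3) above (not the released NSVF code): ray marching only evaluates the implicit field at samples that land inside occupied voxels and skips empty space, which is the main source of the speed-up over dense sampling. The dense boolean `occupancy` grid here stands in for the learned sparse voxel structure.

```python
import numpy as np

def march_ray(origin, direction, occupancy, voxel_size=1.0, step=0.25, t_max=8.0):
    """Toy sparse-voxel ray marcher: sample points along the ray, but only keep
    samples whose voxel is marked occupied."""
    ts = np.arange(0.0, t_max, step)
    points = origin[None, :] + ts[:, None] * direction[None, :]
    idx = np.floor(points / voxel_size).astype(int)
    inside = np.all((idx >= 0) & (idx < np.array(occupancy.shape)), axis=1)
    keep = np.zeros(len(ts), dtype=bool)
    keep[inside] = occupancy[idx[inside, 0], idx[inside, 1], idx[inside, 2]]
    return points[keep]          # only these samples are fed to the implicit field

occ = np.zeros((8, 8, 8), dtype=bool)
occ[3:5, 3:5, 3:5] = True        # a small occupied region
samples = march_ray(np.array([0.0, 3.5, 3.5]), np.array([1.0, 0.0, 0.0]), occ)
print(len(samples), "of", int(8.0 / 0.25), "candidate samples hit occupied voxels")
```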
@thoma_gu
Jiatao Gu
11 months
@jbhuang0604 Can you give an example for the right side?
5
0
6
@thoma_gu
Jiatao Gu
1 year
Time to consider an arXiv ban like the *CL community?
@wjscheirer
Walter Scheirer
1 year
Motion 3: Repeal of the CVPR Social Media Ban Yes: 626 No: 418
2
19
86
1
0
7
@thoma_gu
Jiatao Gu
2 years
@kchonyc Thank you for your very kind efforts! I eventually finished mine…
1
0
6
@thoma_gu
Jiatao Gu
2 years
f-DM can produce high-quality samples on standard image generation benchmarks. Furthermore, we can readily manipulate the learned latent space and perform conditional generation tasks (e.g., super-resolution) without additional training. (5/6)
Tweet media one
Tweet media two
1
0
6
@thoma_gu
Jiatao Gu
1 year
@kwea123 People may have different reasons for learning NeRF. Don't tell people what they should do.
1
0
5
@thoma_gu
Jiatao Gu
10 months
@violet_zct Let’s catch up
0
0
5
@thoma_gu
Jiatao Gu
10 months
PLANNER: we propose a novel latent diffusion model for text generation; joint work with @YizheZhangNLP @zhaisf @jsusskin and Navdeep Jaitly. Diffusion Models without Attention: we combine diffusion models with SSMs; joint work with @NathanYan2012 @srush_nlp
0
0
5
@thoma_gu
Jiatao Gu
2 years
We propose a generalized family of DMs, which is end-to-end non-cascaded, and allows progressive signal transformations along diffusion, including downsampling, blurring, and VAEs. An interpolation-based formulation is used to smoothly bridge consecutive transformations. (3/6)
Tweet media one
Tweet media two
1
0
5
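A toy sketch of how such a multi-stage, non-cascaded forward process might be set up; this is my own loose reading of "progressive signal transformations" with interpolation between consecutive stages, offered as an assumption rather than f-DM's exact formulation. The clean target at stage k is f_k(x0), and within a stage it blends toward f_{k+1}(x0) before Gaussian noise is added.

```python
import torch
import torch.nn.functional as F

def stage_target(x0, stage, tau, transforms):
    """Toy interpolated target: blend between consecutive signal transformations.
    `tau` in [0, 1] is the progress within the current stage."""
    a = transforms[stage](x0)
    b = transforms[min(stage + 1, len(transforms) - 1)](x0)
    if a.shape != b.shape:                      # align resolutions before blending
        b = F.interpolate(b, size=a.shape[-2:], mode="bilinear", align_corners=False)
    return (1.0 - tau) * a + tau * b

def noisy_input(x0, stage, tau, sigma, transforms):
    """Forward process sketch: transform the signal progressively, then add noise."""
    target = stage_target(x0, stage, tau, transforms)
    return target + sigma * torch.randn_like(target)

# example chain of transformations: identity -> blur -> 2x downsample
transforms = [
    lambda x: x,
    lambda x: F.avg_pool2d(x, 3, stride=1, padding=1),
    lambda x: F.avg_pool2d(x, 2),
]
x0 = torch.randn(1, 3, 32, 32)
xt = noisy_input(x0, stage=1, tau=0.5, sigma=0.3, transforms=transforms)
print(xt.shape)
```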
@thoma_gu
Jiatao Gu
7 years
It's finally out!! Looks so cool!! Our non-autoregressive NMT also used distillation to make it possible! Hope to get more ideas from it.
@OriolVinyalsML
Oriol Vinyals
7 years
Almost 3000x speedup with Parallel WaveNet (Google Speech production model). Brought to you by @avdnoord et al. Drastic! More details & link to paper:
0
114
366
1
0
5
@thoma_gu
Jiatao Gu
11 months
@PreetumNakkiran I feel very upset about people posting memes like this… I don't think research opinions should be correlated with IQ scores or used for showing off.
1
0
5
@thoma_gu
Jiatao Gu
4 years
Also, our poster session is at Gather Town A3, Spot A2, 12:00 – 14:00 EST
@thoma_gu
Jiatao Gu
4 years
Please come to check our presentation and Q&A!
0
0
9
0
1
5
@thoma_gu
Jiatao Gu
11 months
@baaadas How do you know if you don't want to be a professor?
1
0
5
@thoma_gu
Jiatao Gu
2 years
@OfirPress @jeremy_r_cole @adihaviv @ori__ram @peter_izsak @omerlevy_ Isn't that very natural for a decoder-only model? The masked attention, being autoregressive in nature, must learn positions.
1
0
4
@thoma_gu
Jiatao Gu
2 years
To tackle the modeling challenges, we also identify the importance of adjusting the noise levels whenever the signal is sub-sampled. A resolution-agnostic SNR is proposed as a practical guide. (4/6)
Tweet media one
1
0
4
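One standard way to see why the noise level needs adjusting when the signal is sub-sampled, written out here as a hedged back-of-the-envelope derivation from basic Gaussian facts rather than necessarily the paper's exact argument:

```latex
% A noisy input at time t:  x_t = \alpha_t x_0 + \sigma_t \epsilon, \quad \epsilon \sim \mathcal{N}(0, I).
% Average-pooling over an s \times s window keeps the signal term (\approx \alpha_t \bar{x}_0)
% but averages s^2 i.i.d. noise values, shrinking the noise std by a factor of s, so
\operatorname{SNR}_{\downarrow s}(t) \;=\; \frac{\alpha_t^2}{(\sigma_t / s)^2} \;=\; s^2 \cdot \operatorname{SNR}(t).
% Keeping the SNR comparable across resolutions therefore means inflating \sigma_t by s
% (equivalently, shifting \log \operatorname{SNR} by -2 \log s) whenever the signal is
% sub-sampled by a factor of s.
```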
@thoma_gu
Jiatao Gu
4 years
0
0
4
@thoma_gu
Jiatao Gu
1 year
Finally arrived at the hotel… can get some rest before tomorrow's conference…
0
0
4
@thoma_gu
Jiatao Gu
9 months
@alextrevith Congrats on your great work!
1
0
3
@thoma_gu
Jiatao Gu
2 years
Unbelievable that we can still see the ending!!
@berserk_project
ベルセルク公式
2 years
Berserk will resume serialization in Young Animal issue 13, on sale June 24. To mark the resumption, we are publishing messages from the Young Animal editorial department and Kouji Mori. We humbly ask for your continued readership of Berserk. #BERSERK #ベルセルク
Tweet media one
Tweet media two
1K
82K
170K
0
0
4
@thoma_gu
Jiatao Gu
11 months
@unixpickle @Apple Unfortunately, the naive version should still be slower than LDM, as it has to go through the high-res images anyway... But we can easily combine methods like our previous work () to progressively grow the resolution during inference and reduce the gap.
1
1
4
@thoma_gu
Jiatao Gu
5 years
@kchonyc totally agreed!! And maybe it is still not too late to catch up again?🤪 We also have more knowledge to work on this better!
1
0
4
@thoma_gu
Jiatao Gu
4 years
Wow... where the dream began...
@kchonyc
Kyunghyun Cho
4 years
0
0
1
0
0
4
@thoma_gu
Jiatao Gu
2 years
@OfirPress @jeremy_r_cole @adihaviv @ori__ram @peter_izsak @omerlevy_ Just like you also don't need "positional embeddings" for LSTM-based LM or even simpler NNLM?
1
0
4
@thoma_gu
Jiatao Gu
7 years
Countdowns to top CV/NLP/ML/Robotics/AI conference deadlines #machinelearning
0
0
3
@thoma_gu
Jiatao Gu
3 years
😭
@berserk_project
ベルセルク公式
3 years
[Announcement of the passing of Kentarou Miura] Kentarou Miura, the author of Berserk, passed away on May 6, 2021, due to acute aortic dissection. We express our utmost respect and gratitude for Miura-sensei's artistic work and pray from our hearts for his repose. May 20, 2021, Hakusensha Co., Ltd., Young Animal Editorial Department
Tweet media one
0
203K
331K
0
0
3
@thoma_gu
Jiatao Gu
5 years
Tweet media one
0
1
3
@thoma_gu
Jiatao Gu
2 years
Despite the empirical success, diffusion models (DMs) are restricted to denoising in the ambient space. On the other hand, common generative models like VAEs employ a coarse-to-fine generation process. In this work, we are interested in combining the best of the two worlds. (2/6)
Tweet media one
Tweet media two
1
0
3
@thoma_gu
Jiatao Gu
7 years
0
0
1
@thoma_gu
Jiatao Gu
4 years
@zngu Not sure what you wanted to say. Whenever people report a speed-up, we should always state the baseline model being compared against. By your argument, any neural system might potentially be slower than SMT.
1
0
3
@thoma_gu
Jiatao Gu
5 years
@AlexRoseJo @ChanghanWang @JakeZzzzzzz Hi Alexander, the code is in internal fairseq and we will release it soon.
0
0
3
@thoma_gu
Jiatao Gu
2 years
@YiTayML @MIT_CSAIL @Saboo_Shubham_ They just came out around the same time, and BART did have a simpler formulation as an encoder-decoder model. I personally find the way you talk about things very annoying. Thanks
3
0
3
@thoma_gu
Jiatao Gu
4 years
@zngu @odashi_t @raphaelshu @kchonyc @jlibovicky @jasonleeinf Also, in my view, non-autoregressive approaches may or may not be useful in the end, as they have both potential and limitations. I think it is still a developing area. I am not sure we should limit ourselves by asking all papers to compare with the most highly optimized systems so far.
1
0
3
@thoma_gu
Jiatao Gu
2 years
It will introduce some classical methods of non-autoregressive generation for machine translation and their recent applications to various tasks, including GEC, ASR, TTS, and image generation!
0
0
3
@thoma_gu
Jiatao Gu
1 year
@tengyuma @HongLiu9903 @zhiyuanli_ @dlwh @percyliang @StanfordAILab @stanfordnlp @StanfordCRFM @Stanford Will this be useful for other domains, like speeding up the convergence of diffusion models?
0
0
2
@thoma_gu
Jiatao Gu
2 years
@itsbautistam Congrats!
1
0
3