Can you make a jigsaw puzzle with two different solutions? Or an image that changes appearance when flipped?
We can do that, and a lot more, by using diffusion models to generate optical illusions!
Continue reading for more illusions and method details 🧵
What do you see in these images?
These are called hybrid images, originally proposed by Aude Oliva et al. They change appearance depending on size or viewing distance, and are just one kind of perceptual illusion that our method, Factorized Diffusion, can make.
I'm at CVPR presenting "Visual Anagrams" on
- Tuesday: 10am, Poster #429
- Friday: Oral 6B @ 1pm, Poster #118 (pm)
Let me know if you want to chat!
Also, we manufactured a bunch of these "jigsaws with two solutions." If you want one, just hunt me down in the conference hall :)
Can we use motion to prompt diffusion models?
Our #ICLR2024 paper does just that. We propose Motion Guidance, a technique that allows users to edit an image by specifying “where things should move.”
See our website, paper, and code for more details (and more illusions)!
Website:
arXiv:
Code:
Colab notebook:
Big thanks to my collaborators @invernopark and @andrewhowens!
This is an image of Corgis, but when played as a spectrogram it sounds like dogs barking!
Really thankful I got the chance to work on this super fun project with first author @CzyangChen. Check out his thread for many more examples, and to see how they're made!
These spectrograms look like images, but can also be played as a sound! We call these images that sound.
How do we make them?
Look and listen below to find out, and to see more examples!
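Roughly, the recipe is to denoise a single spectrogram-shaped canvas with two pretrained diffusion models at once, one scoring it as an image and one scoring it as audio, and mix their noise estimates. A minimal sketch with placeholder names (not the actual implementation, which needs the two models to share a representation, e.g. a common latent space):

```python
def images_that_sound_step(x_t, image_prompt, audio_prompt,
                           image_denoiser, audio_denoiser, t, w=0.5):
    """Sketch: denoise one canvas with an image model and an audio model.

    `image_denoiser` and `audio_denoiser` are placeholders for two pretrained
    diffusion models that both operate on the same spectrogram-shaped input.
    """
    eps_img = image_denoiser(x_t, image_prompt, t)   # "looks like the image prompt"
    eps_aud = audio_denoiser(x_t, audio_prompt, t)   # "sounds like the audio prompt"
    return w * eps_img + (1 - w) * eps_aud           # mix the two estimates
```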
We can also make these images that change when viewed in grayscale. Since the human eye can't see color under dim lighting, there is actually a physical mechanism for this illusion: these images change appearance when taken from a bright room to a dim one!
Most orthogonal transformations on an image are pretty meaningless, but luckily permutations are a subset of these transformations. This is where the idea of a “visual anagram” comes from: images that change appearance under arbitrary permutations of their pixels!
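Roughly, a single denoising step then looks something like this (a minimal sketch with hypothetical function names, not the actual codebase): apply each view to the noisy image, predict noise under that view's prompt, undo the view, and average.

```python
import torch

def anagram_noise_estimate(x_t, prompts, views, inverse_views, denoiser, t):
    """Sketch: combine noise estimates across views.

    Placeholders:
    - denoiser(x, prompt, t)       -> the model's noise prediction
    - views[i] / inverse_views[i]  -> an orthogonal pixel transformation and
      its inverse (e.g. a flip, rotation, or jigsaw-piece permutation)
    """
    estimates = []
    for prompt, view, inv_view in zip(prompts, views, inverse_views):
        eps = denoiser(view(x_t), prompt, t)   # denoise the transformed image
        estimates.append(inv_view(eps))        # map the estimate back
    return torch.stack(estimates).mean(dim=0)  # average in the original frame
```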
But there’s a catch! We found that not every view would work. The view needs to satisfy two conditions. The first is linearity, which ensures the transformed image is the correct mix of noise and signal:
The second condition we call “statistical consistency.” The transformed noise needs to be iid Gaussian, as that’s the assumption in diffusion. It turns out this is only possible if your transformation is orthogonal.
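Concretely, in standard DDPM notation (my shorthand here, not a quote from the paper): a noisy image is x_t = √(ᾱ_t)·x_0 + √(1 − ᾱ_t)·ε with ε ~ N(0, I). For a linear view v, v(x_t) = √(ᾱ_t)·v(x_0) + √(1 − ᾱ_t)·v(ε), so the transformed image is still the correct mix of signal and noise. And if v is given by an orthogonal matrix A, then Aε ~ N(0, AAᵀ) = N(0, I), so the transformed noise is still iid Gaussian, exactly what the denoiser expects.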
And by extracting low frequencies from a real image, and generating the missing high frequencies with our method, we can make hybrid images from real images. In effect we are solving a (noiseless) inverse problem. Anyways, here's Thomas Edison turning into a lightbulb:
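A hedged sketch of that projection step (hypothetical names; `lowpass` stands in for any blur-style low-pass operator):

```python
def keep_real_low_frequencies(x0_estimate, real_image, lowpass):
    """Sketch: fix the low frequencies to a real image, generate only the rest.

    At each denoising step, the low-frequency component of the model's current
    clean-image estimate is swapped for the low frequencies of the real image,
    so only the missing high frequencies are actually generated.
    """
    return lowpass(real_image) + (x0_estimate - lowpass(x0_estimate))
```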
Our method works by decomposing an image into a sum of components. For example, into high and low frequencies, or into grayscale and color components. We then use a diffusion model to control each of these components individually, in a zero-shot manner.
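For the hybrid-image case, a heavily simplified sketch (hypothetical function names, not the paper's actual code): get a noise estimate for each prompt, then take the low frequencies from one and the high frequencies from the other.

```python
import torch.nn.functional as F

def lowpass(x, k=16):
    """Hypothetical low-pass filter: a stride-1 box blur (a Gaussian also works)."""
    return F.avg_pool2d(x, kernel_size=2 * k + 1, stride=1, padding=k)

def hybrid_noise_estimate(x_t, prompt_far, prompt_near, denoiser, t):
    """Sketch: control low and high frequencies with different prompts.

    `denoiser(x, prompt, t)` is a placeholder for a pretrained text-conditional
    diffusion model's noise prediction. The combined estimate takes its low
    frequencies from one prompt (what you see from far away) and its high
    frequencies from the other (what you see up close).
    """
    eps_far = denoiser(x_t, prompt_far, t)
    eps_near = denoiser(x_t, prompt_near, t)
    return lowpass(eps_far) + (eps_near - lowpass(eps_near))
```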
I had a wonderful time working w/ @ZhaoyingPan and @andrewhowens on our #NeurIPS2023 paper "Self-Supervised Motion Magnification." We propose a simple method for magnifying tiny motions in video, and also show some neat tricks like magnification targeting and test-time adaptation.
Our work is inspired by/related to DragGAN (@XingangP), DragonDiffusion (Chong Mou), and DragDiffusion (@YujunPeiyangShi). The technique from Universal Guided Diffusion (@arpitbansal297) is also quite important for our method to work.
Big thanks to @andrewhowens for advising me on this project. Please check out the links for more info and results!
website:
arXiv:
code:
visualization code:
Finally, using our method with certain decompositions reduces (roughly!) to prior work on spatial or compositional control of diffusion models. Details are in the paper.
@HaareBlond
Thanks! You may be interested in our recent work, led by @CzyangChen, which does weird but really, really cool things to sound and spectrograms.
Diffusion models have amazing image creation abilities. But how well does their generative knowledge transfer to discriminative tasks?
We present Diffusion Classifier: strong classification results with pretrained conditional diffusion models, *with no additional training*!
1/9
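The rough idea (a simplified sketch with placeholder names, not the exact procedure from the paper): score each candidate class by how well the class-conditioned model predicts the noise added to the image, and pick the class with the lowest error.

```python
import torch

def diffusion_classifier(x0, class_prompts, denoiser, add_noise, n_trials=64):
    """Sketch: zero-shot classification with a conditional diffusion model.

    Placeholders:
    - denoiser(x_t, prompt, t) -> predicted noise
    - add_noise(x0, eps, t)    -> the noisy image x_t at timestep t
    The class whose conditioning best explains the added noise wins.
    """
    errors = torch.zeros(len(class_prompts))
    for _ in range(n_trials):
        t = torch.randint(1, 1000, (1,)).item()  # random timestep
        eps = torch.randn_like(x0)               # random noise
        x_t = add_noise(x0, eps, t)
        for i, prompt in enumerate(class_prompts):
            pred = denoiser(x_t, prompt, t)
            errors[i] += ((pred - eps) ** 2).mean().item()
    return int(errors.argmin())                  # lowest error = predicted class
```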
Our method requires no finetuning, works on real images, and enables fine-grained editing of images with pretty complex motion. Here, we visualize the optical flow, and corresponding points between the original image and the “motion edited” image.
We can also extract motion from an existing video, and apply that motion to images. Here we take the spinning of the earth, and use it to rotate various animal faces.
@danbgoldman @andrewhowens @invernopark
Hi Dan, we were thinking of trying to print more. I'll add your name to a list of people who want one and I'll let you know if we figure it out. (Big fan of your work btw!)
Our method also has limitations, such as (a) failures on OOD flow fields, (b) potential identity loss, and (c) occasional convergence issues. It is also slow to sample from. We hope future work can help alleviate these issues.
@_jasonliu_
This is a cool idea! We were thinking that these images could be a form of steganography. Like, you're a spy and a message only appears when you look at the photo in dim lighting. It could also act as really lossy compression, but I think there are probably more practical methods.
We achieve this by doing diffusion guidance through an off-the-shelf optical flow network. Our proposed guidance loss encourages the edited image to have the user-specified motion w.r.t. the source image, as estimated by the flow network.
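In pseudocode, the core of that loss might look something like this (a hedged sketch with placeholder names; the actual method has additional terms and tricks):

```python
def motion_guidance_loss(src_image, edited_estimate, target_flow, flow_net):
    """Sketch of a flow-matching guidance loss.

    `flow_net(a, b)` is a placeholder for an off-the-shelf optical flow network
    (RAFT-like) that returns the flow from image a to image b. The loss asks
    the edited image to move the way the user specified.
    """
    pred_flow = flow_net(src_image, edited_estimate)
    return (pred_flow - target_flow).abs().mean()

# During sampling, the gradient of this loss w.r.t. the current clean-image
# estimate nudges each denoising step (classifier-guidance style), so the
# diffusion model itself never needs finetuning.
```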
We also wrote a simple GUI to make these dense motion fields. By just clicking and dragging, a user can segment out an object with SAM and create complex flow fields.
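For instance, a toy sketch (not the actual GUI code): given a SAM-style mask and a single drag vector, one can build a dense flow field that moves just the selected object.

```python
import numpy as np

def drag_to_flow(mask, dx, dy):
    """Toy sketch: turn a segmentation mask plus a drag (dx, dy) into dense flow.

    mask: (H, W) boolean array, e.g. from SAM
    returns: (H, W, 2) flow field that moves the masked object by (dx, dy)
             and leaves the background untouched.
    """
    flow = np.zeros((*mask.shape, 2), dtype=np.float32)
    flow[mask] = (dx, dy)
    return flow
```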
@eerac
This might work, you could try it out! I think you would have to be careful with the noise though... An uninvertible transformation might mess up the iid Gaussian-ness of it
@NagabhushanSN95
It's related! I think it's more that high frequency components of the image go away when you downsample. You could check out the hybrid images paper if you want more details:
@HaareBlond @invernopark @andrewhowens
Yeah, like you said, latent diffusion doesn't work *well* (but it does kind of work). Audio is really interesting as well! We sort of lucked out tho, because the views that work with this method correspond to visually interpretable views. idk if the same would hold for audio