Hello, view synthesis devotees. I invite you to some new work at
@eccvconf
. We gather tourist photos of famous landmarks and learn a new neural 3D representation that can synthesize new views with natural, modifiable lighting. We call it "Crowdsampling the Plenoptic Function".
It turns out that YouTube has tons of videos of people pretending to be statues. This is great for learning about the 3D shape of people! Cool new work from
@zl548
at CVPR19 from his Google internship.
Attention all looking glass lovers: This tweet is a shameless plug for a CVPR 2020 paper that asks a dumb question and finds an interesting answer. Can you tell if an image has been horizontally flipped or not?
We couldn't find a Fundamental matrix visualizer online, so we made one for our vision course. If you are an F-matrix fan, take a look & tell us if you find any problems. And please send pointers to other demos! (Credits: Alek Curless, Sri Chakra Kumar)
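For anyone curious what such a demo computes under the hood: given a fundamental matrix F, a point in image 1 maps to an epipolar line in image 2 via l' = F x. Here is a minimal NumPy sketch (the helper name `epipolar_line` and the toy rectified-stereo F are illustrative assumptions, not part of our demo's code):

```python
import numpy as np

def epipolar_line(F, x):
    """Given a fundamental matrix F and a point x = (u, v) in image 1,
    return the epipolar line l' = F @ x_h in image 2 as homogeneous (a, b, c)."""
    x_h = np.array([x[0], x[1], 1.0])
    l = F @ x_h
    # Normalize so a^2 + b^2 = 1; the point-line distance is then |l . x_h|.
    return l / np.linalg.norm(l[:2])

# Toy F for rectified stereo (pure horizontal translation): epipolar
# lines are horizontal, so a point at row v maps to the line y = v.
F = np.array([[0., 0.,  0.],
              [0., 0., -1.],
              [0., 1.,  0.]])
a, b, c = epipolar_line(F, (10.0, 25.0))  # line -y + 25 = 0, i.e. y = 25
```

A visualizer like ours essentially draws this line for each clicked point.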
Dear typesetting fanatics: I wrote a short LaTeX style guide with some tips & tricks that I find useful for making short and nice-looking papers. If you are working on ECCV papers or the like, maybe it will be useful to you, too.
For stylization fans,
@KaiZhang9546
's work called ARF: Artistic Radiance Fields is on Tuesday's docket at
@eccvconf
. It achieves nice, view-consistent 2D-to-3D style transfer results by fine-tuning a radiance field so that projections resemble the style of an input source image.
ARF: Artistic Radiance Fields
abs:
project page:
github:
create high-quality artistic 3D content by transferring the style of an exemplar image, such as a painting or sketch, to NeRF and its variants
Do you have the blues because you are getting broken 3D models from COLMAP or other 3D reconstruction pipelines? Ruojin has a nice new paper and codebase that can help! We invite you to check out our work on doppelganger images here:
Check out our
#ICCV2023
paper called Doppelgangers. We train a classifier to detect distinct but visually similar image pairs ("doppelgangers") and apply it to SfM disambiguation, enabling COLMAP to create correct 3D models in hard cases.
Project page:
Dear city lovers: here's new work at
@eccvconf
on observing many images of a city over time, and learning to factor lighting effects from scene appearance. This factorization lets us relight new images, even from new cities. Here we learn from NYC and create a full day in Paris.
To all the CVPR-heads out there -- check out
@KaiZhang9546
's work on inverse rendering in this morning's oral session! Relightable 3D meshes from photos, with really beautiful results.
Greetings from View Synthesis Land!
Richard Tucker and I had a fun
@cvpr2020
paper (from
@GoogleAI
) called "Single-View View Synthesis with Multiplane Images". The code (and Colab) is now available. Have fun out there!
web:
Colab:
Got an urge to render the world from Internet photo collections? The source code for
@moustafaMeshry
's CVPR2019 best paper finalist is now available: . Have fun out there!
Learned about Notre Dame Cathedral through computer vision and structure from motion, of all things, many years before I ever got a chance to visit. Very sad day.
Zhengqi’s new work is a very cool approach to single-image animation—these videos are really nifty!
This work turns a still image into a looping video by predicting frequency-space motion. It can also make your image interactive. The demo is really nice!
Excited to share our work on Generative Image Dynamics!
We learn a generative image-space prior for scene dynamics, which can turn a still photo into a seamless looping video or let you interact with objects in the picture. Check out the interactive demo:
This is so cool! Check out Richard Bowen's work today at
#3DV2022
. It considers what possible flow fields could arise if you were to hit a hypothetical "play" button on a still image.
@3DVconf
Hey there—code and data for our Crowdsampling the Plenoptic Function paper from
@eccvconf
is now available for all you tourism-heads out there.
github link:
Hello to all you light field lovers out there! We have new work with John Flynn and others on high-quality view synthesis from a camera array. We use soft layers to make nice pictures. Presented in Tuesday's afternoon oral session at
@cvpr19
.
I have a real soft spot for epipolar geometry—and so this tweet is a crass advertisement for some work of ours at
@eccvconf
that I think is nice. The idea is to learn local feature descriptors from pairs of images with known camera poses—no ground truth correspondence required.
I think it is pretty neat. This is work from Cornell Tech with
@zl548
,
@XianWenqi
, and
@AbeDavis
. You can find out more at , or watch this wonderful teaser video made by
@AbeDavis
.
This is so cool! Check out
@boyang_deng
's wonderful work on generating Streetscapes -- tours through imaginary street scenes, conditioned on a desired city layout and a text description. I like this wintry result a lot!
Thought about generating realistic 3D urban neighbourhoods from maps, dawn to dusk, rain or shine? Putting heavy snow on the streets of Barcelona? Or making Paris look like NYC? We built a Streetscapes system that does all these. See . (Showreel w/ 🔊 ↓)
Maybe it's just me, but for me, the award for the computer vision project whose webpage has survived for the longest time without breaking is "3D Photography on your Desk" by Jean-Yves Bouguet and Pietro Perona (1998).
In need of many examples of camera trajectories from videos? Check out our new RealEstate10K dataset! . This is the kind of data we used in our recent Stereo Magnification work on view synthesis (with Tinghui Zhou).
I'm really proud of
@zhengqi_li
, who put his heart into the DynIBaR work that got the Best Paper Honorable Mention nod at CVPR. And I'm really sad that he couldn't be there to experience it due to circumstances beyond his control.
Thanks for the nice photo and note,
@jon_barron
!
Hi everyone. I'm helping to organize tomorrow's ECCV 4D Vision Workshop. We have a lineup of great papers and speakers—some real vision enthusiasts—including
@RaquelUrtasun
, Michael Ryoo,
@davsca1
, Drago Anguelov,
@mapo1
,
@xiaolonw
, & Tom Funkhouser.
Zhengqi Li (Cornell PhD student) presents: MegaDepth! Big (100K+), diverse dataset of RGBD images derived from Internet multi-view stereo. Good for training RGB -> depth, generalizable to other datasets (e.g. KITTI).
Web: , arXiv:
This is work with Zhiqiu Lin, Jin Sun, and
@abedavis
. You can check it out at or visit the CVPR Q&A on "Visual Chirality" on Thursday. Or watch this nice teaser video from
@AbeDavis
. Thanks! Now back to your timeline.
Shamelessly plugging this talk tomorrow (Wednesday). My hat is off to the 3DGV organizers for putting together a great series of talks on cool 3D vision-style work!
@3_dgv
Seminar in 2 days!
@Jimantha
Noah Snavely will talk about "The Plenoptic Camera", joined by
@_pratul_
Pratul Srinivasan and Rick Szeliski. Please distribute the news to students/members in your groups.
Youtube:
3/10 11am Pacific
3/10 19:00 UK
The Google Ph.D. Fellowship Program has selected
@QianqianWang5
as one of its 2022 fellows. “I hope that my technology can enable us to create a rich and realistic virtual world,” - Qianqian Wang, computer science Ph.D. student at
@Cornell_tech
Read more:
Hello to all you fashion-heads out there! We invite you to our new ICCV paper on analyzing clothing in millions of photos around the world. We can discover world events and festivals purely from apparel! With U. Mall, K. Matzen, B. Hariharan & K. Bala.
We invite you to check out this nice work on view synthesis for dynamic scenes! Work from
@zl548
during his Adobe internship with
@oliverwang81
and
@simon_niklaus
.
Really great work we did with
@zl548
on practical novel view synthesis in space and time. Take any video and move the camera, or slow down time, or both!
Website:
With:
@zl548
,
@oliverwang81
,
@Jimantha
Very nice work led by
@QianqianWang5
on generalizing NeRF by incorporating principles from classic image-based rendering. Along with other work like pixelNeRF and GRF, I'm excited by these demonstrations of cross-scene generalization. (And I love this example miniature scene!)
Training NeRFs per-scene is so 2020. Inspired by image-based rendering, IBRNet does amortized inference for view synthesis by learning how to look at input images at render time. 15% drop in error, 80% fewer FLOPs than NeRF. Great work
@QianqianWang5
!
In case you missed it at ECCV:
@zl548
has a new dataset called CGIntrinsics. Ludicrously high-quality CG renderings for learning intrinsic images. You can predict state-of-the-art intrinsic images on real photos just by training on CG data!
#ECCV2018
It feels like CVPR20 ended 9 years ago—but I'm only now checking it out. I recommend the great FATE tutorial. On top of the challenges outlined, I imagine there are hurdles even in spearheading such a tutorial—thank you,
@timnitGebru
&
@cephaloponderer
!
I got the chance to read this paper in detail recently, and it is really cool, especially for all you feature matching–heads out there! I love the idea of computing descriptors on the basis of two images at once. Nice work,
@oliviawiles1
, Sebastien Ehrhardt, and Andrew Zisserman!
D2D: Learning to find good correspondences for image matching and manipulation
@oliviawiles1
, Sebastien Ehrhardt, Andrew Zisserman,
@Oxford_VGG
Idea: extract features conditionally on 2nd image.
1/
Glad to share our work “Neural 3D Reconstruction in the Wild” in SIGGRAPH 2022! We show that with a clever sampling strategy, neural-based 3D reconstruction can be better and faster than COLMAP. Check out the project page at: .
Our paper, “NeRF in the Wild”, is out! NeRF-W is a method for reconstructing 3D scenes from internet photography. We apply it to the kinds of photos you might take on vacation: tourists, poor lighting, filters, and all. (1/n)
View synthesis is super cool! How can we push it further to generate the world *far* beyond the edges of an image? We present Infinite Nature, a method that combines image synthesis and 3D to generate long videos of natural scenes from a single image.
Let's turn photos of ancient "revolutionary" (rotationally symmetric) artefacts into 3D and rotate them, or even change the lighting!
Our model learns to de-render a single image of a vase into shape, albedo, material & lighting, from just a single-image collection.
#CVPR2021
Totally agree with Beth: "The violence directed at Asian Americans, especially women, children and elderly, is against the very core values America is built on. This is why I am standing up and speaking up today."
Hi all, please consider nominating yourself to be reviewer for
#CVPR2022
. And please pass the word along, especially to those whose voices are not well represented in the vision community. This is one way to help guide the field.
Check out our CVPR 2023 Award Candidate paper, DynIBaR!
DynIBaR takes monocular videos of dynamic scenes and renders novel views in space and time. It addresses limitations of prior dynamic NeRF methods, rendering much higher quality views.
From our latest project, an homage to the original Photo Tourism visualizations by
@Jimantha
et al. - interpolating between camera pose, focal length, aspect ratio, and scene appearance from different tourist images. More details at
@_pratul_
@jon_barron
I'm a big logo head! I keep seeing ads for Zenni on the train. I'm intrigued by how the stylized Z and N are exact mirror images here, but not in "real life" - one has horizontal lines, the other vertical. Yet no problem interpreting the logo. Cool Gestalt-style logic at work!
There are few things I find more terror-inducing than cold calling people -- but I am finding that making US election-related volunteer calls leads to some pretty nice conversations. Some folks just want to chat right now.
Organizing the first NYC vision workshop was super fun! Shout out to other organizers
@elliottszwu
@Haian_Jin
and especially
@Jimantha
for the generous support!
This work led by
@Haian_Jin
is really nice. It takes text-to-image models and teases out their capability to light objects in a controllable way, much like Zero123 does for camera viewpoint. I'm really surprised that conditioning on environment maps can work this well!
Check out our recent work “Neural Gaffer: Relighting Any Object via Diffusion” 📷🌈, an end-to-end 2D relighting diffusion model that accurately relights any object in a single image under various lighting conditions.
🧵1/N:
Website:
I'm thinking tonight about our international students, PhD and MS students who over the last few years have faced so much uncertainty about their very presence in this country. This is a great moment for them, and a great moment for US universities and the US economy.
A new follow up to infinite nature is out! This time we show how an infinite nature model can be trained on *single image* collections, without any multi-view or video supervision at training time!
We call it infinite nature 𝘻𝘦𝘳𝘰 since it requires no video 🙂
#ECCV2022
oral
Image synthesis models can be used for visual data mining!
See our new
#ECCV2024
paper: "Diffusion Models as Data Mining Tools."
Project page:
Paper:
1/9
Happy Lunar New Year, and to all you astronomy fanatics a question -- If you and your family lived for generations in a village on the far side of the Moon, would you realize that the Earth existed?
Check out our
#SIGGRAPH2020
paper on Consistent Video Depth Estimation. Our geometrically consistent depth takes cool video effects to a whole new level!
Video:
Paper:
Project page:
Reminder about this CVPR registration support program -- please apply by April 15, 2022 if you'd like to be considered for a registration fee waiver! I hope that this effort can help increase inclusivity of CVPR. Application is here:
#CVPR2022
is committed to supporting students from communities that do not traditionally attend CVPR through waived registration fees, to foster a more inclusive, diverse and equitable conference.
1/2
If I'm not mistaken, ICCV camera-ready papers this year can be 9 pages + references (not 8)? If that's right, that is a first, and a very welcome, cool, and nice change!
For all you photography fanatics out there, a nice blog post about the photo with the longest known sightline captured to date -- 443km, from the Pyrenees to the French Alps.
Hi there. Code for our
@eccvconf
work from
@cornelltech
on learning where people could appear in an image is now online. Our (cool) method learns to predict potential people purely from observing data like Waymo's Open Dataset.
Code below—have a good day!
Where could people walk? Excited to share our
@eccvconf
paper on learning contextual walkability: "Hidden Footprints: Learning Contextual Walkability from 3D Human Trails"
Arxiv:
Website:
With: Jin Sun,
@QianqianWang5
,
@Jimantha
The upshot is that images seem to be full of low- and high-level chirality cues, and deep networks are pretty good at guessing when an image has been flipped. You might care if you're into data augmentation, image forensics, or self-supervision (or if you are a huge mirror-head).
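If you are one of those data augmentation folks, the core training setup is easy to picture: randomly mirror images and ask a classifier to recover the flip bit. A toy sketch of that labeling step (illustrative only; `make_chirality_example` is a made-up helper, and the real paper also has to control for low-level cues like JPEG and Bayer artifacts, which themselves carry chirality signal):

```python
import numpy as np

rng = np.random.default_rng(0)

def make_chirality_example(image):
    """Flip an image horizontally with probability 0.5 and return
    (image, label), where label 1 means the image was mirrored.
    A classifier trained on such pairs is being asked: 'was this flipped?'"""
    flipped = rng.random() < 0.5
    out = image[:, ::-1].copy() if flipped else image
    return out, int(flipped)

# Tiny demo on a fake 3x4 "image".
img = np.arange(12).reshape(3, 4)
x, y = make_chirality_example(img)
```

The surprising finding is that networks do well at this task, meaning horizontal flips are not as label-free as augmentation pipelines usually assume.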
For Lunar New Year/Spring Festival, another Moon-based note to all you Moon lunatics out there. One of my favorite gifs on Wikipedia is this one illustrating the apparent wobble of the Moon over the course of a month, called libration.
Tomorrow is Election Tuesday in the US. I have tons of extra candy. If you see me and show me a "I Voted" sticker, I will try to give you some candy! I have Skittles and Baby Ruths.
I am sorry for buzz marketing, but if you are looking for a TV show for a 3-5 year old, I recommend a math-themed PBS program called Peg + Cat. Our 4 year old loves it (and the songs are catchy).
This workshop was so cool! My live talk had some technical difficulties, so if you want to see a clean version of a talk on how to tell if you are in a mirror universe (and a bunch of other great talks on less obscure topics), please check this out!
Announcing WiGRAPH's Rising Stars in Computer Graphics! Ph.D. students and postdocs of underrepresented genders: apply for a two-year program of mentorship and workshops co-located with SIGGRAPH 2022&2023. Travel support provided.
#WiGRAPHRisingStars
[1/6]
I've personally benefited a ton from
@timnitGebru
and her work. Her earlier work on estimating demographics at scale from Street View has been a big inspiration to me. Her more recent work in ethics is truly foundational, and has helped me think about the world differently.
If you're a big lover of video decomposition, check out
@vickie_ye_
's nice paper on Deformable Sprites in the CVPR afternoon oral session. We are huge fans of layered video representations!
A deep network takes two images, learns to search for 2D matches between them, and then a loss function decides how much it likes the matches based on how much they deviate from the epipolar constraints derived from the camera poses, as in the visualization below.
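The "how much matches deviate from the epipolar constraints" part of such a loss can be sketched with the standard Sampson distance (a minimal NumPy sketch under my own naming, not the paper's actual loss; the toy rectified F below is an illustrative assumption):

```python
import numpy as np

def sampson_distance(F, x1, x2):
    """First-order geometric error of the epipolar constraint x2^T F x1 = 0.
    x1, x2: (N, 2) arrays of matched pixel coordinates in images 1 and 2."""
    x1h = np.hstack([x1, np.ones((len(x1), 1))])  # homogeneous coords, (N, 3)
    x2h = np.hstack([x2, np.ones((len(x2), 1))])
    Fx1 = x1h @ F.T    # epipolar lines in image 2, one row per match
    Ftx2 = x2h @ F     # epipolar lines in image 1
    num = np.sum(x2h * Fx1, axis=1) ** 2          # (x2^T F x1)^2
    den = Fx1[:, 0]**2 + Fx1[:, 1]**2 + Ftx2[:, 0]**2 + Ftx2[:, 1]**2
    return num / den

# Toy rectified-stereo F: epipolar lines are horizontal scanlines, so a
# match on the same row has zero error and an off-row match is penalized.
F_toy = np.array([[0., 0.,  0.],
                  [0., 0., -1.],
                  [0., 1.,  0.]])
d = sampson_distance(F_toy,
                     np.array([[10., 25.], [10., 25.]]),
                     np.array([[40., 25.], [40., 30.]]))
```

In a training loop, a differentiable version of this quantity (F derived from known camera poses) can score predicted matches without ground-truth correspondence.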
@CSProfKGD
Thanks, Kosta! Yes, there was an appearance from a 4-year-old who wasn't happy that I wasn't in play mode. (I moved to a different room, but forgot my trackball, so she started controlling my computer remotely.) I'm glad people seemed to be understanding of the heightened chaos.
That page also has a stunning photo of the Earth that looks like CGI but is actually a photo from the Lunar Reconnaissance Orbiter. This photo is new to me but is really amazing!
@taiyasaki
@ICCV_2021
Reviewers rock! Thank you so much for your hard work. Some didn't chime in on discussions, but I think that's because CVPR and life were happening. Also, I initiated discussions late... between CVPR and sick kids/self, I was a bad AC☹️. But many reviewers chimed in anyway. Thank you!
Thanks, Ben! And I should note that the original idea for multiplane images came from John Flynn, working with Graham Fyffe and
@debfx
. That idea was also presaged in John's prior DeepStereo view synthesis method, as well as Soft3D from Penner and Zhang.
Great overview from
@fdellaert
! I'd also like to highlight
@TinghuiZhou
and
@Jimantha
et al. for bringing volume rendering into deep learning for view synthesis with their paper Stereo Magnification in 2018.
Hi all, ICCV21 isn't even over yet, but
#CVPR2022
will be here before we know it, and deadlines for proposing workshops & tutorials are coming up soon. It would be great if the organizers had a diverse set of proposals on a range of topics, including societal impacts of CV.
Introducing Eclipse, a method for recovering lighting and materials even from diffuse objects!
The key idea is that standard "NeRF-like" data has all we need: a photographer moving around a scene to capture it causes "accidental" lighting variations. (1/3)
And for all y'all intrinsic image fanatics out there -- Zhengqi also has a cool new paper on learning intrinsic images supervised with time-lapse data: "Learning Intrinsic Image Decomposition by Watching the World".
Web: , arXiv:
Hope you are all doing well out there. ECCV22 workshop proposals are due tomorrow! It would be wonderful to see a diverse range of workshops at the conference.
#CVPR2022
is now accepting applications for travel grants.
Decisions will be made on a rolling basis, so please apply soon, and no later than 5/13 at 11:59 CST.
Links:
(student)
(advisor)
Source:
I defended my PhD thesis last week 🥳 Thank you to everyone that made this possible, including my advisor
@mapo1
, examiners
@Jimantha
@quantombone
and Daniel Cremers, and the amazing
@cvg_ethz
. As per the tradition, I received a nice commemorative hat 🎓 Now time for vacations 😎
For all you ECCV rebuttal writers out there... seems like reviewers can see your rebuttals in OpenReview even before the end of the rebuttal period, which may be surprising behavior to CMT-heads like me.
@CSProfKGD
As a reviewer I saw initial rebuttal from authors, and then edited version. So, you can edit, but whatever you have entered is already visible to the reviewers.
Layers! We love layers. On that note, a shameless plug for Shubham Tulsiani's work on *layered scene inference* -- predicting geometry in the form of *layered* depth maps from single images.
#ECCV2018
#layers
I heard some states are rightfully changing up their flag design, so it seemed like a good time to remind folks that my home state of Arizona has the best flag, and Wisconsin has the worst one.
... low-level artifacts due to JPEG compression and Bayer demosaicing, and high-level elements like shirt collars, musical instruments, and even eye gaze and hair part (?)