will grathwohl

@wgrathwohl

3,629
Followers
236
Following
82
Media
1,478
Statuses

Lover of raccoons and machine learning. Research Scientist at @GoogleDeepMind in NYC. Ph.D. from @UofTCompSci

New York, New York
Joined June 2010
Pinned Tweet
@wgrathwohl
will grathwohl
8 months
Really excited for people to see why my publication output has been diminished over the last year or so. Many thanks to everyone involved blasting out our eardrums to make this happen!
@demishassabis
Demis Hassabis
8 months
Thrilled to share #Lyria , the world's most sophisticated AI music generation system. From just a text prompt Lyria produces compelling music & vocals. Also: building new Music AI tools for artists to amplify creativity in partnership w/YT & music industry
111
535
3K
4
10
134
@wgrathwohl
will grathwohl
3 years
Not a bad view from the new apartment! Excited to start work at @DeepMind this Monday!!
Tweet media one
9
6
696
@wgrathwohl
will grathwohl
9 months
My team at Google Deepmind in NYC is hiring research engineers! If you're interested in helping add new capabilities to large-scale generative models, please apply: Feel free to reach out if you have questions :)
21
124
692
@wgrathwohl
will grathwohl
3 years
Hi everyone! I'm very pleased to announce that I'll be starting work as a research scientist at @DeepMind in their New York office with @rob_fergus , Ishita Dasgupta, Arun Ahuja, and @kaeserchen Just gotta finish dat Ph.D. and we're off! I'm so excited!
17
4
467
@wgrathwohl
will grathwohl
3 years
Hi all! Very pleased to share that my latest paper, "Oops I Took A Gradient: Scalable Sampling for Discrete Distributions", has been accepted to ICML for a long presentation. Energy-Based Models have seen amazing progress in the last few years...
6
48
329
@wgrathwohl
will grathwohl
2 years
Very happy to see such rapid progress in gradient-based samplers for discrete distributions! Some recent highlights include: Very nice, elegant, and simple ideas all!
3
33
215
@wgrathwohl
will grathwohl
3 years
Finally made it to New York! It’s been real Canada. Thanks for the PhD (and the love of my life). Taking them both with me!
Tweet media one
4
0
215
@wgrathwohl
will grathwohl
3 years
So this happened! Yay
@icmlconf
ICML Conference
3 years
ICML 2021 Outstanding Paper Award Honorable Mentions: 2/4. Will Grathwohl, Kevin Swersky, Milad Hashemi, David Duvenaud, and Chris Maddison 📜Oops I Took A Gradient: Scalable Sampling for Discrete Distributions (Tuesday 9am US Eastern)
2
10
134
11
6
197
@wgrathwohl
will grathwohl
3 years
excuse me, that's Dr. Dumbass to you
20
0
176
@wgrathwohl
will grathwohl
3 years
Ye
Tweet media one
4
0
168
@wgrathwohl
will grathwohl
1 year
Super proud of this work with the amazing @du_yilun and my great colleagues and friends from Deepmind, Google, MIT, and INRIA @jaschasd @conormdurkan @sedielem @rob_fergus @robinstrudel @ArnaudDoucet1 Josh Tenenbaum. More info to follow soon.
@_akhaliq
AK
1 year
Reduce, Reuse, Recycle: Compositional Generation with Energy-Based Diffusion Models and MCMC abs: project page:
1
29
165
3
20
121
@wgrathwohl
will grathwohl
2 years
Apologies to my adoring fans for the relative dark period in my research output. Starting a new job + procuring a green card for one’s wife can be somewhat distracting. Good shit in the pipeline. Yeet.
2
1
87
@wgrathwohl
will grathwohl
2 years
Major shout out to one of my brilliant research poppas @DavidDuvenaud for getting the 2022 @SloanFoundation fellowship in computer science. Great choice!!!!!
1
0
83
@wgrathwohl
will grathwohl
1 year
To succeed in science you either need the stupidity to believe you are smarter than those who came before and failed…or the intelligence to know that you are most certainly stupider and maybe they just forgot to try some dumb shit.
2
2
83
@wgrathwohl
will grathwohl
7 months
Very excited to be hosting a student researcher on my team at Google Deepmind next year. I'm looking to work with a PhD student who has a background in generative models and sampling. If you think that describes you, please apply here!
1
7
78
@wgrathwohl
will grathwohl
3 years
Your BERT is Secretly an Energy-Based Model and you should totally treat it like one! I love it! Super-cool work from @kartik_goyal_ , @redpony , and @BergKirkpatrick
1
14
68
@wgrathwohl
will grathwohl
2 years
Heading to London for two weeks to visit the main @DeepMind office! If I got any homiez out there in da UK hit me up and tell me where to get the best Sunday roast!
4
0
68
@wgrathwohl
will grathwohl
3 years
Hi everyone! Very honored to get the Honorable Mention for the Outstanding paper award at @icmlconf . If you have questions about discrete sampling and EBMs please come to Poster C2 at the noon (EST) poster session!
1
2
64
@wgrathwohl
will grathwohl
3 years
Hi all, the wait is over! The official code for "Oops I Took A Gradient: Scalable Sampling for Discrete Distributions" can be found here: which should be a good jumping-off point for your discrete EBM needs. Enjoy! :)
0
5
61
@wgrathwohl
will grathwohl
3 years
That feeling when you tryna reproduce baselines but the authors used hella data augmentation and didn’t put that in the paper…
2
2
62
@wgrathwohl
will grathwohl
3 years
By that of course I mean I got my PhD. None of this would’ve been possible if @zemelgroup and @DavidDuvenaud hadn’t taken a chance on me years ago. I learned so much from them on what it means to be a scientist and how to do research. I’ll never forget what they have done for me
6
0
59
@wgrathwohl
will grathwohl
2 years
The best part of #NeurIPS22 is obviously the significant boats gallery.
Tweet media one
4
2
58
@wgrathwohl
will grathwohl
3 years
Got my 1000th citation! Well timed with my defense.
0
1
55
@wgrathwohl
will grathwohl
2 years
Haven't tweeted in a while but y'all should check out It's a discrete sampler in the same spirit as Gibbs-With-Gradients but more directly acts as a discrete analog of Langevin (w/ gradient-based proposals)
2
3
51
@wgrathwohl
will grathwohl
3 years
Lots of EBMs in the orals this year at @iclr_conf ! Some favorites:
1
9
49
@wgrathwohl
will grathwohl
2 years
Inspiring stuff. It’s never too late. I spent 4 years working before deciding to pursue science and it was the best (and most difficult) decision I have ever made. My life changed completely and for the better.
@yufeizhao
Yufei Zhao
2 years
Congratulations to Jinyoung Park and Huy Pham for proving the Kahn-Kalai conjecture---a central open problem in probabilistic combinatorics. Truly exciting breakthrough! The story of Jinyoung's extraordinary path to mathematics:
6
152
774
1
2
48
@wgrathwohl
will grathwohl
3 years
I’ve been busy moving and trying to get my wife a green card (hard!) so I haven’t been following the research too close for a while. What are some of the best papers y’all have read the last few months?! Bonus points for generative models!
5
4
50
@wgrathwohl
will grathwohl
3 years
Honeymoon / PhD defense celebration in muskoka. Good vibes @afan_foryou
Tweet media one
2
0
47
@wgrathwohl
will grathwohl
1 year
We really gotta slow this AI thing down...not for any moral or risk based reason...I'm just really fuckin tired.
1
0
47
@wgrathwohl
will grathwohl
2 years
This is awesome! Bring RBMs back baby!
@lrjconan
Renjie Liao
2 years
We show that Gaussian RBMs can generate good images just like other generative models, despite the single-layer architecture. Key innovations: 1) Gibbs-Langevin sampling; 2) modified Contrastive Divergence. Paper: Code: 1/2
12
118
554
0
8
45
@wgrathwohl
will grathwohl
3 years
I agree with that!
@imbue_ai
Imbue
3 years
Learn about energy-based models, implicit functions, and why modular systems may be our best bet for general intelligence with @du_yilun from MIT!
0
11
64
1
0
45
@wgrathwohl
will grathwohl
3 years
I need a new book to read, gimme a recommendation!!! I’m looking for classic sci-fi — think Asimov, Dune, Clarke. That kinda stuff. The wackier and older the better!!!!
42
3
42
@wgrathwohl
will grathwohl
2 years
Pleased to say this has been accepted to AISTATS 2022! All credit to my amazing collaborators (especially @EliWeinstein6 ) who did everything. I learned so much from you all working on this and I’m happy for this outcome!
@deboramarks
Debora Marks
3 years
New paper: 'Optimal Design of Stochastic DNA Synthesis Protocols based on Generative Sequence Models' led by @EliWeinstein6
Tweet media one
2
14
72
0
3
39
@wgrathwohl
will grathwohl
3 years
def gwg(x, f):
  df = grad(f)(x)
  dx = -(2*x - 1) * df
  qx'x = logsoftmax(dx / 2.0)
  i = categorical_sample(qx'x)
  x' = flipdim(x, i)
  dx' = -(2*x' - 1) * df
  qxx' = logsoftmax(dx' / 2.0)
  if rand() < exp(f(x') - f(x) + qxx'[i] - qx'x[i]) then return x'
  return x
2
6
40
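For readers who want to run the tweet-sized pseudocode above, here is a minimal standard-library Python sketch of one Gibbs-With-Gradients step on binary vectors. It is an illustration under stated assumptions, not the official code: `f` is assumed to return an unnormalized log-probability, `grad_f` is a caller-supplied gradient function (standing in for `grad(f)`), and the names `gwg_step` and `flip_logits` are made up here. One deliberate difference from the tweet: the reverse proposal recomputes the gradient at `x'` instead of reusing `df`, which is an assumption about what a correct Metropolis-Hastings correction needs.

```python
import math
import random


def log_softmax(v):
    # Numerically stable log-softmax over a plain list of floats.
    m = max(v)
    lse = m + math.log(sum(math.exp(a - m) for a in v))
    return [a - lse for a in v]


def flip_logits(x, g):
    # Flipping bit i changes x_i by (1 - 2*x_i), so the first-order estimate
    # of f(flip_i(x)) - f(x) is -(2*x_i - 1) * g_i. Halve it, then normalize.
    return log_softmax([-(2 * xi - 1) * gi / 2.0 for xi, gi in zip(x, g)])


def gwg_step(x, f, grad_f, rng=random):
    """One Metropolis-Hastings step with a gradient-informed bit-flip proposal."""
    log_q_fwd = flip_logits(x, grad_f(x))
    weights = [math.exp(l) for l in log_q_fwd]
    i = rng.choices(range(len(x)), weights=weights)[0]

    x_new = list(x)
    x_new[i] = 1 - x_new[i]

    # Assumption: the reverse proposal uses the gradient evaluated at x_new.
    log_q_rev = flip_logits(x_new, grad_f(x_new))

    log_accept = f(x_new) - f(x) + log_q_rev[i] - log_q_fwd[i]
    if rng.random() < math.exp(min(0.0, log_accept)):
        return x_new
    return list(x)
```

Calling `gwg_step` repeatedly, feeding each output back in as `x`, gives a Markov chain over binary vectors. A toy target such as `f(x) = sum(theta_i * x_i)` with `grad_f(x) = theta` is an easy way to try it out.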
@wgrathwohl
will grathwohl
1 year
RLHF is just energy-based models
6
2
40
@wgrathwohl
will grathwohl
1 year
Is anyone else a lil peeved that they’re calling it “generative ai” instead of “generative models” ???
2
0
39
@wgrathwohl
will grathwohl
11 months
I’m in Hawaii…with all the dorks…all the dorks are in Hawaii
5
0
39
@wgrathwohl
will grathwohl
2 years
While I could celebrate some neurips acceptances, I prefer to freak out about the looming ICLR deadline. #workhardnotsmart #ichoosepanic
0
0
39
@wgrathwohl
will grathwohl
2 years
Oh boi I do not like coding
2
3
35
@wgrathwohl
will grathwohl
3 years
Yet another super cool work from this group on EBMs and their unique properties. I like the idea of dropping the MCMC (since our samplers never mix anyway!) and rephrasing it as an optimization problem. I’m curious to see how this would work in a more standard EBM as well.
@du_yilun
Yilun Du
3 years
How can we discover, unsupervised, the underlying objects and global factors of variation in the world? We show how to discover these factors as different energy functions. Website: w/ @ShuangL13799063 , @yash_j_sharma , Josh Tenenbaum, @IMordatch (1/4)
5
46
244
0
0
36
@wgrathwohl
will grathwohl
1 year
I don't support a 6 month "pause" on AI research...but I would support a collective 6 month "vacation" for all AI researchers
2
0
35
@wgrathwohl
will grathwohl
3 years
I cannot thank @DavidDuvenaud and Rich Zemel ( @zemelgroup ) enough for their amazing supervision during my Ph.D. None of this would have happened without their support and them taking a chance on me when I was an unknown, untested, and inexperienced researcher.
1
0
34
@wgrathwohl
will grathwohl
3 years
A few neat ICLR papers on discrete EBMs! These appear to build upon (and improve!!!) some ideas from Gibbs-With-Gradients for even more efficient sampling and better EBM training! I'll follow up after I dig into the details.
0
1
32
@wgrathwohl
will grathwohl
1 year
Was a blast!!!
@jmtomczak
Jakub Tomczak
1 year
After a short break, we are back to listen to an amazing @wgrathwohl talking about energy-based models! 🤩
Tweet media one
1
4
19
0
0
28
@wgrathwohl
will grathwohl
3 years
My oldest friend @PierceFulton passed away last week. He was a great man and an incredibly talented musician. When we were 5 years old he walked up to me and declared we would be best friends...and we were until he moved away to Vermont.
9
1
29
@wgrathwohl
will grathwohl
7 months
Have fun at NeurIPS y'all! Eat lots of turtles and crawdads for me. I'll be doing my darndest to not think about neural networks thank you. But in case anyone misses me there this is basically all yer missing...
2
0
26
@wgrathwohl
will grathwohl
3 years
Important topic! Understudied! Great idea for a workshop! Wish I knew about this sooner!
@jmhernandez233
Jose Miguel Hernández-Lobato
3 years
Happy to announce the 2021 NeurIPS workshop on Deep Generative Models and Downstream Applications. Abstract submission deadline Sep 17, 2021. Check it out!
1
28
202
0
0
25
@wgrathwohl
will grathwohl
3 years
I absolutely love this work. Very exciting to see domain agnostic architectures scale to domains where previously only heavily-specified and engineered architectures could work!
@GoogleDeepMind
Google DeepMind
3 years
To tackle all the challenges we meet while solving intelligence, we need tools that are as adaptable as possible. Announcing the paper & code for Perceiver IO, an architecture that handles a wide range of data and tasks, all while scaling gracefully: 1/4
Tweet media one
24
217
839
0
0
22
@wgrathwohl
will grathwohl
3 years
Thanks so much Dave. Working with you has been so much better than the best I could have ever imagined it could!
@DavidDuvenaud
David Duvenaud
3 years
Thanks @wgrathwohl for taking a chance on me as a then-unknown new prof, and then making me and @zemelgroup look good ever since then.
0
0
34
0
0
21
@wgrathwohl
will grathwohl
3 years
Plain boi
Tweet media one
2
0
20
@wgrathwohl
will grathwohl
3 years
a shockingly ubiquitous structure in discrete distributions: a differentiable energy function. The method is incredibly simple. We use a Taylor series to predict the energy values of nearby states. These predicted values are used to parameterize a proposal distribution for MH
Tweet media one
1
5
20
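The first-order idea in the tweet above can be written out for binary vectors. This is a sketch in assumed notation, not text from the thread: f is taken to be an unnormalized log-probability (the negative energy), x^{(i)} denotes x with bit i flipped, and the factor of 1/2 follows the pseudocode posted elsewhere in this feed.

```latex
% First-order Taylor estimate of the change in f from flipping bit i
f(x^{(i)}) - f(x) \;\approx\; \nabla f(x)^\top \bigl(x^{(i)} - x\bigr)
  \;=\; -(2x_i - 1)\,\frac{\partial f}{\partial x_i}(x) \;=:\; d_i(x)

% Proposal over which bit to flip, and the Metropolis-Hastings acceptance probability
q(i \mid x) \;=\; \frac{\exp\bigl(d_i(x)/2\bigr)}{\sum_j \exp\bigl(d_j(x)/2\bigr)},
\qquad
\alpha \;=\; \min\!\left(1,\; \exp\bigl(f(x^{(i)}) - f(x)\bigr)\,
  \frac{q(i \mid x^{(i)})}{q(i \mid x)}\right)
```

The middle equality holds because x^{(i)} - x is zero everywhere except coordinate i, where it equals 1 - 2x_i.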
@wgrathwohl
will grathwohl
3 years
Haven’t seen any raccoons in nyc yet but this dude is close enough.
Tweet media one
2
0
21
@wgrathwohl
will grathwohl
1 year
So excited to be a part of this!!
@jmtomczak
Jakub Tomczak
1 year
Together with @jesfrellsen & @pamattei , we organize Generative Modeling Summer School #GeMSS23 ! Please check out our webpage and apply! Deadline: April 10, 2023 Where: Copenhagen (Denmark) When: June 26-30, 2023 Lecturers: see below👇
Tweet media one
3
25
96
0
1
20
@wgrathwohl
will grathwohl
11 months
So proud of my research daddi and his amazing collaborators. Rich @zemelgroup helped me get into the field and I owe so much to him and the phenomenal community he helped create in Toronto. So well deserved. Congrats to everyone involved!
@david_madras
David Madras
11 months
Huge congrats to the authors on this, including my PhD supervisor @zemelgroup ! Not an exaggeration to say this line of work totally changed my academic career, it’s great to see their research foresight recognized
1
2
21
0
2
20
@wgrathwohl
will grathwohl
3 years
The crown jewel of my art collection is now framed. Getting this back to my apartment was an adventure! But certainly worth it! @dangarzi
Tweet media one
2
0
18
@wgrathwohl
will grathwohl
3 years
Thanks so much to everyone I've worked with during my Ph.D.
0
0
19
@wgrathwohl
will grathwohl
3 years
The finest work of art ever created now hangs in my apartment.
Tweet media one
0
0
17
@wgrathwohl
will grathwohl
3 years
@adjiboussodieng @PrincetonCS Adji is one of the greats! Y’all should apply!!!!
0
0
15
@wgrathwohl
will grathwohl
3 years
Love to see more work improving the *many* issues with EBM training. The method seems elegant. Curious @jesfrellsen have you looked into the kinds of likelihoods this method achieves??
@jesfrellsen
Jes Frellsen
3 years
Can an EBM be sandwiched? We ( @conggeng94 Wang Gao @jesfrellsen Hauberg) propose a bidirectional bound on EBM LL and link our upper bound to gradient penalties. This stabilizes training and gives HQ densities and samples. Enjoy the sandwich! #NeurIPS2021
Tweet media one
3
14
83
1
0
17
@wgrathwohl
will grathwohl
7 months
"...blah blah blah energy-based models blah blah..." "...Sampling! It's all sampling!" "...you know this is basically just SMC..." "...yea but do we know WHY this basket of 50 hacks works?" "...but have you thought about it as a sampling problem?" "...UNNORMALIZED..."
0
0
17
@wgrathwohl
will grathwohl
1 year
Is this like when Goku and Vegeta fused?
3
1
15
@wgrathwohl
will grathwohl
3 years
lil dr boy
2
0
15
@wgrathwohl
will grathwohl
7 months
These homiez rule. Go talk to them.
@sigmabayesian
Anshuk Uppal
7 months
On my way 🚨✈️🚨 to NOLA. I'll be presenting our work on scalable implicit VI on Wed from 5pm ( #1313 ). I am also looking for internships for 2024 in fundamental or applied generative modelling and uncertainty quantification. #NeurIPS2023
Tweet media one
0
5
37
1
2
14
@wgrathwohl
will grathwohl
3 years
!!!!!!!
@priyankjaini
Priyank Jaini
3 years
Following the work of @wgrathwohl , these equivariant EBMs extend to equivariant joint energy models with equivariant marginals and conditionals. These can be used to generate samples from an Equivariant conditional distribution. e.g molecule structure with pre-defined properties.
Tweet media one
1
3
14
0
0
16
@wgrathwohl
will grathwohl
1 year
Happy birthday will!
Tweet media one
1
0
15
@wgrathwohl
will grathwohl
2 years
I’d like this in my house plz
Tweet media one
5
0
15
@wgrathwohl
will grathwohl
8 months
Today on “Never thought my life would take me here” This is awesome #DreamTrackAI @TPAIN #drifting I cannot explain how incredible it’s been building this. So many times throughout this endeavor I’ve been shocked by the progress the field has made.
2
1
15
@wgrathwohl
will grathwohl
3 years
We call the method Gibbs-With-Gradients (GWG) due to its similarity with adaptive Gibbs sampling. GWG leads to considerable improvements when compared to other samplers which do not exploit known structures in the distribution
Tweet media one
1
1
15
@wgrathwohl
will grathwohl
2 years
So true. My offer from UofT was considerably lower than any US school. This has made recruiting good students from the US very difficult for them.
@Guodzh
Guodong Zhang
2 years
I feel the stipends for Canadian schools are much lower. I got ~2000 CAD/month even with TAing for three courses in my first year. The living cost of Toronto is higher than many cities in US. Most apartments around the campus would cost you 1000+ even sharing with others.
4
3
35
2
1
14
@wgrathwohl
will grathwohl
3 years
Code to be released soon, but you shouldn't need it! Gibbs-with-gradients can be implemented in a tweet! See below:
2
1
14
@wgrathwohl
will grathwohl
3 years
Thank you mom for the bottle of dom #phd #backtoamerica
Tweet media one
2
0
13
@wgrathwohl
will grathwohl
2 years
I really hope living in the west village is worth the space and the view...
Tweet media one
1
0
12
@wgrathwohl
will grathwohl
3 years
But all of the successful methods rely on modeling continuous data. Of course, much relevant data is discrete -- and we cannot apply recent EBM methods to this data. In this work, we present a simple and generic MCMC sampler for discrete distributions which exploits...
1
1
14
@wgrathwohl
will grathwohl
7 months
Silicon baking sheet liners are a game changer! …idk what’s going down at neurips but I guarantee what I’m doing rn is much more interesting
1
1
14
@wgrathwohl
will grathwohl
3 years
Gibbs-With-Gradients is not the first discrete sampling method to use gradients of the energy-function, but we find it scales more gracefully to high-dimensional data than previous methods. See here some results on RBMs of increasing dimension:
Tweet media one
1
1
14
@wgrathwohl
will grathwohl
3 years
Gibbs-With-Gradients enables the training of Deep Energy-Based Models on high-dimensional discrete data (images treated as 1-of-256 categoricals)
Tweet media one
1
1
13
@wgrathwohl
will grathwohl
3 years
The energy function here is an unconstrained ResNet and these models outperform VAEs in terms of log-likelihood
Tweet media one
1
1
12
@wgrathwohl
will grathwohl
3 years
Much love to my co-authors @kswersk @miladhash @DavidDuvenaud and @cjmaddison !!! See you all (virtually) in July!
1
0
11
@wgrathwohl
will grathwohl
3 years
Gibbs-With-Gradients allows us to train Potts models for protein structure prediction with maximum likelihood on very large proteins outperforming the standard pseudolikelihood
Tweet media one
1
0
12
@wgrathwohl
will grathwohl
1 year
just found a typo in my thesis
3
0
11
@wgrathwohl
will grathwohl
1 year
Erryone gotta see @RRRMovie that shit was incredible.
0
0
9
@wgrathwohl
will grathwohl
2 years
How do I tell my trainer I want the “Daniel Radcliffe in the Weird Al biopic body” ?
2
0
9
@wgrathwohl
will grathwohl
2 years
For those following along at home. In just a few minutes I will consume THE EELS #EELBOI #EELTARIFF #FREETHEEEL
Tweet media one
1
0
11
@wgrathwohl
will grathwohl
2 years
Highly recommend reaching out if you're interested. I'd absolutely be there if I wasn't in Europe right now :(
@msalbergo
Michael Albergo
2 years
⭐️We are hosting a workshop on measure transport, sampling, and diffusions at @FlatironInst @FlatironCCM @FlatironCCQ Nov-16-18. A few spots recently opened up for in-person attendance. If you are interested in participating, please reach out by email!
3
16
56
3
0
10
@wgrathwohl
will grathwohl
1 year
It was in bobcaygeon I saw the constellations reveal themselves one star at a time #canada #tragicallyhip
Tweet media one
0
0
10
@wgrathwohl
will grathwohl
8 months
Really cool stuff @du_yilun
@tim_garipov
Timur Garipov
8 months
Classifier guidance in diffusion enables generation conditioned on information that might not have been specified at training time. In our recent #NeurIPS2023 paper we show how this idea can be generalized to compose pre-trained diffusion models as well as GFlowNets. 🧵 1/N
Tweet media one
1
59
330
0
0
10
@wgrathwohl
will grathwohl
2 years
When yer boi runs an ai company and also gets you into cool shows…
Tweet media one
0
0
10
@wgrathwohl
will grathwohl
2 years
@OriolVinyalsML @mariatta ~sophisticated~ coders use 1/0
1
0
10
@wgrathwohl
will grathwohl
2 years
I’m sorry but I think you are confusing yourself with me ;)
@AndrewMayne
Andrew Mayne
2 years
"a raccoon wearing a hoodie working on his laptop late into the night" @OpenAI 's DALL-E 2 sees the true me...
Tweet media one
24
111
1K
0
0
10
@wgrathwohl
will grathwohl
1 year
This is awesome! These are exactly the kinds of applications I had in mind when working on Gibbs-With-Gradients. I'd love to hear more!
@PatrickOmid
Patrick Emami
1 year
This work applies some cool new ideas coming out of the discrete MCMC literature (e.g., GWG from @wgrathwohl et al. and some follow-up work) also need to shoutout my co-authors at NREL that helped make this happen! end/
0
0
2
0
0
9
@wgrathwohl
will grathwohl
2 years
I’m back baby!! Where’s the eels at!?!
2
0
9
@wgrathwohl
will grathwohl
3 years
I give so much love to @aspetuckmusic and the rest of his family. I am so sorry for your loss. We all share this pain. It is clear to me that his music touched so many people and as someone who cared about Pierce this means a lot to me. I hope you feel that too.
2
0
9
@wgrathwohl
will grathwohl
3 years
@roydanroy @fhuszar @DavidDuvenaud Poppa taught me: first you make figure 1, then you write the tweet, then…you do the science. Love you @DavidDuvenaud
0
0
9
@wgrathwohl
will grathwohl
2 years
Obliterated a good-ass roast. Gotta get me some eels next.
Tweet media one
4
0
8
@wgrathwohl
will grathwohl
3 years
Dayum! TBH I think we gotta stop at this point. Congrats.
@hojonathanho
Jonathan Ho
3 years
New paper on cascaded diffusion models for ImageNet generation! We outperform BigGAN-deep and VQ-VAE-2 on FID score and classification accuracy score: Work with @Chitwan_Saharia @wchan212 @fleet_dj @mo_norouzi @TimSalimans
Tweet media one
11
107
440
1
0
8
@wgrathwohl
will grathwohl
3 years
@david_madras Agreed. And in my experience, most of the real science happens while we’re all drinking wine after the talks end. Casually discussing ideas with rarely-seen friends and colleagues has led to most of my work and ideas and we’re completely robbed of that in the virtual setting.
0
0
7
@wgrathwohl
will grathwohl
3 years
I love this line of self-supervised learning! Beautiful, simple, powerful methods which can be trained on consumer hardware.
@ylecun
Yann LeCun
3 years
VICReg: Variance-Invariance-Covariance Regularization for Self-Supervised Learning. By Adrien Bardes, Jean Ponce, and yours truly. Insanely simple and effective method for self-supervised training of joint-embedding architectures (e.g. Siamese nets). 1/N
10
112
613
0
0
8