Hello! For my biennial post here, a bit of a personal update: after 5 great years at @MSFTResearch, this fall I'll be joining the CSE faculty at @UW.
Much love to the great folks at MSR I've had the pleasure of working with.
1/2
Hello! Just like @ilyaraz2, I'm making video lectures covering material from the class I taught at UW last fall. The first video is already recorded, and barring (additional) issues with YouTube, it'll be online on Monday. (1/3)
Hi! So I was coerced into writing this thread---er, I mean, let me tell you a little bit about quantum learning and testing, and about some of the papers @sitanch, Jordan Cotler, @RobertHuangHY, and I just posted on arXiv:
1/
Understanding adversarial examples is not only an important theoretical problem; they also pose security risks to systems that use AI as a component. We consider the real-world impact of adversarial ML, how system designers can mitigate these risks, and how researchers can help.
When can you learn a simple transformation of a Gaussian in high dimensions? We give the first nontrivial end-to-end guarantees for this basic problem, which also has applications to learning deep generative models.
Massive props to @sitanch for hard carrying; see his thread ⬇️
Excited to share something I've been working on over the last year, joint with @jerryzli, Yuanzhi Li, and Anru Zhang! We give provably efficient algorithms for learning a rich family of "pushforward distributions" inspired by generative models. 1/n
Slightly belated, but happy to announce a new paper with @sitanch, Brice Huang, and Allen Liu. We give tight lower bounds for basic quantum property testing problems without quantum memory.
tl;dr Non-adaptivity is all you need, a thread 🤪 1/n
Obvious bias aside: this talk was one of a number of exciting talks at the joint @MLFoundations/@SimonsInstitute workshop co-organized by Adam Klivans, Tselil Schramm, and myself. Check out the full list of talks and speakers here:
Just watched an incredible talk by @AlexGDimakis at the Simons Institute, highly recommended. Their Intermediate Layer Optimization technique for solving inverse problems with GANs makes a LOT of sense!
The empirical results on the famous blurred Obama face speak for themselves!
1/4
Lecture 5: Efficient filtering from spectral signatures.
The first "payoff" lecture! We give a super simple (~8 lines!) poly-time algorithm for robust mean estimation, with a fully self-contained analysis.
New video: A short proof of the spectral signatures lemma.
Alternative title: Jerry learns that manim is a thing, thinks it's amazing, and obsessively messes around with it all week. But hey, this time, all the constants are correct!
Lecture 4.5: Finite sample concentration with bounded second moments.
Sorry for the weird timing, YouTube hates me sometimes! Also: can you spot all of the bungled constants??? 🤔
for unknowable reasons sometimes YouTube decides that it wants to upload exactly 67% of my videos...sorry about the delay, but the next video lectures are coming!
If you're a talented student interested in a nonempty subset of {TCS, quantum, machine learning, AI}, please consider UW! It's a very exciting place for all of these areas!
2/2
I'm generally planning to post my lectures twice a week, but for my own sanity I'll only do one per week (hopefully on Mondays) until the NeurIPS deadline.
I'm planning to focus only on robust statistics, so I won't get to everything from the class. On the flip side, this lets me cover more topics in this area than I could in class, including robust covariance estimation, linear-time algos, and SoS-based methods. (2/3)
@thegautamkamath how odd, I spent the whole day reading physical copies of papers I have to review and yet my stress level has increased dramatically 🤔🤔
Big shout-out to all my co-authors, they are all insanely talented. Fun fact: Brice and Allen didn't know any quantum at all, and within a month or so of working with them, we had a working proof of this lower bound, which had eluded @sitanch and me for nearly 2 years...🙃 9/9
First off, what is a quantum distribution? These are known as "mixed states": a mixed state is any d-dimensional PSD matrix with trace 1. Notice that the eigenvalues of such a state form a probability distribution. You can think of mixed states as rotations of distributions in d-dim space.
2/
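Concretely, here's a tiny NumPy illustration of that definition (variable names are mine, just for illustration):

```python
# A mixed state is a PSD matrix with trace 1; its eigenvalues form a
# probability distribution.
import numpy as np

d = 4
rng = np.random.default_rng(1)

# A standard way to generate a random mixed state: A A^dagger, normalized.
A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
rho = A @ A.conj().T
rho /= np.trace(rho).real

eigvals = np.linalg.eigvalsh(rho)   # real, since rho is Hermitian
print(eigvals)                      # nonnegative...
print(eigvals.sum())                # ...and summing to 1: a distribution
```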
Big shoutout to my coauthors @sitanch, Jordan Cotler, and @RobertHuangHY, they're all super great and you should hire them (although @hseas already claimed @sitanch).
11/end
We also show that limited amounts of quantum memory are not sufficient for some of these tasks, and we show hierarchies for the power of entangled measurements. The techniques are surprisingly elementary, and we think they may have applications to classical learning!
10/
How do you interact with a mixed state? In the classical world, we get samples from our distribution. Quantumly, one designs a measurement scheme known as a POVM, and then measures the mixed state with this POVM. The outcome of this is a draw from a classical distribution.
3/
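In code, the measurement process above looks roughly like this (an illustrative sketch; the POVM chosen here is just the computational-basis measurement):

```python
# A POVM is a set {E_i} of PSD matrices summing to the identity;
# measuring rho yields outcome i with probability Tr(E_i rho).
import numpy as np

def measure(rho, povm, rng):
    probs = np.array([np.trace(E @ rho).real for E in povm])
    return rng.choice(len(povm), p=probs)

d = 4
rng = np.random.default_rng(2)
rho = np.diag([0.4, 0.3, 0.2, 0.1]).astype(complex)

# Simplest POVM: projectors onto the computational basis vectors.
basis_povm = [np.outer(e, e.conj()) for e in np.eye(d, dtype=complex)]
samples = [measure(rho, basis_povm, rng) for _ in range(5)]
print(samples)  # draws from the classical distribution (0.4, 0.3, 0.2, 0.1)
```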
In our new papers, we show *exponential* separations for a number of learning and testing tasks, including shadow tomography, purity testing, and some channel testing tasks. In fact, sometimes there are exponential separations even with very mild entanglement.
9/
Previously, @SebastienBubeck, @sitanch, and I showed that for the task of mixedness testing, there are polynomial gaps between the power of entangled and unentangled measurements. But are there tasks with much larger separations?
8/
In the same way that n draws from a classical distribution can be thought of as one draw from the joint distribution, n copies of a mixed state can be thought of as one big tensored up mixed state. One can then apply an arbitrary POVM to this entire tensor!
5/
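Concretely (illustrative sketch, names mine): the n-copy state is the n-fold Kronecker product, and an entangled measurement is just a POVM applied to that big matrix.

```python
# n copies of rho = one big tensored-up mixed state rho^{(x) n},
# a d^n-dimensional PSD matrix of trace 1.
import numpy as np
from functools import reduce

def n_copies(rho, n):
    return reduce(np.kron, [rho] * n)

rho = np.diag([0.7, 0.3]).astype(complex)
big = n_copies(rho, 3)
print(big.shape)            # (8, 8): a 2^3-dimensional mixed state
print(np.trace(big).real)   # still trace 1, so still a mixed state
```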
We show that the answer is no: adaptive algorithms still need Θ(d^{3/2}) samples. We construct a new hard instance based on Gaussian perturbations: unlike previous instances, the likelihood ratio for this instance has a really elegant self-similar structure. 5/n
But they also come at a cost: they require the algorithm to store many copies of the mixed state in memory simultaneously. They're also complex to form. In our papers, we ask if they're truly necessary for a number of basic quantum learning tasks.
7/
There are two key differences between classical and quantum learning. First, notice that the measurement operation is inherently interactive, so measurements can be chosen adaptively, which is not true classically. Second, measurements can be entangled.
4/
This allows us to do measurements that we can't do just by applying POVMs to each individual state. Such entangled measurements are quite powerful; many algos use them to get optimal statistical rates, see e.g. @BooleanAnalysis and John Wright's amazing papers.
6/
It's "folklore" that if these measurements are specified ahead of time, that you need Θ(d^3/2) samples. More recently,
@SebastienBubeck
,
@sitanch
, and myself showed that even adaptive algorithms require Ω(d^4/3) samples.
But this left the question: does adaptivity help? 4/n
@thesasho @BooleanAnalysis this reminds me of when I was learning probability theory: my prof tried to use Av notation for Markov chains and at some point literally threw his notes in frustration and walked out of class...good times
For state certification, we generalize the prior instance-optimal bounds of @BooleanAnalysis, @sitanch, and me, and show that they hold for arbitrary adaptive measurements, not just non-adaptive ones. Here too, you seem to get the same bounds with and without adaptivity. 7/n
This lets us boil the question down to controlling the behavior of a certain matrix martingale, which we can do using well-known techniques from matrix concentration inequalities. 🤯 6/n
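For context, since the tweet doesn't spell out which inequality: the standard tool of this flavor is a matrix analogue of Freedman's inequality (Tropp 2011). Sketched below for orientation, not claiming it's the exact statement the paper uses:

```latex
% Matrix Freedman (Tropp 2011), sketched. Let (Y_k) be a martingale of
% d x d self-adjoint matrices with differences X_k = Y_k - Y_{k-1}
% satisfying \lambda_{\max}(X_k) \le R almost surely, and let
% W_k = \sum_{j \le k} \mathbb{E}[X_j^2 \mid \mathcal{F}_{j-1}] be the
% predictable quadratic variation. Then for all t, \sigma^2 > 0:
\[
  \Pr\Big[\exists k :\ \lambda_{\max}(Y_k) \ge t \ \text{and}\ \|W_k\| \le \sigma^2\Big]
  \;\le\; d \cdot \exp\!\Big(-\frac{t^2/2}{\sigma^2 + Rt/3}\Big).
\]
```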
It's pretty fascinating: even though proving lower bounds against adaptive algorithms is often really hard, I don't know of any natural quantum state learning setting where it actually gives any advantage. Hence, non-adaptivity is all you need????? 🤔🤔🤔 8/n
We consider mixedness testing and state certification, which are the natural quantum analogues of uniformity and identity testing, respectively. For mixedness testing of a d-dim state, @BooleanAnalysis and John Wright showed that Θ(d) samples are necessary and sufficient. 2/n
But their tester needs heavily entangled measurements, which makes it impractical to implement on existing quantum devices.😭
So what if we only consider testers that don't use entangled measurements, i.e., only measure one copy of the state at a time? 🤔3/n
@ccanonne_ this is not quite matrix Bernstein, but you can often do the following: since the Frobenius norm is self-dual, you just need that ⟨M, U⟩ concentrates, where M is your matrix and U is any unit-Frobenius-norm matrix. Get a good tail bound for a fixed U, then union bound over a net of ~2^(d^2) of them.
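To unpack that reply a bit, here is the standard net argument it's gesturing at (a sketch, with unoptimized constants):

```latex
% Self-duality of the Frobenius norm gives
%   \|M\|_F = \sup_{\|U\|_F = 1} \langle M, U \rangle.
% Take a 1/2-net N of the unit Frobenius sphere in R^{d \times d};
% a volume argument gives |N| \le 5^{d^2}, and passing to the net
% loses only a factor of 2:
%   \|M\|_F \le 2 \max_{U \in N} \langle M, U \rangle.
% So if each fixed U satisfies \Pr[\langle M, U \rangle > t] \le e^{-c t^2},
% a union bound over the net yields
\[
  \Pr\big[\|M\|_F > 2t\big] \;\le\; 5^{d^2} \, e^{-c t^2},
\]
% which is meaningful once t \gtrsim d/\sqrt{c}.
```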
@ccanonne_ Yeah, what I suggested is exactly the analog of the proof technique used there. See for instance the discussion of Cor. 2.1.12 in my thesis (although it's almost certainly been written down before).