My team is hiring a Research Scientist (recent PhD grad or with a couple of years' experience)
Mission: develop fundamental, state-of-the-art tech at the intersection of imaging, vision, and machine learning. Come work w/ @2ptmvd, @hossTale & me
Apply:
Years ago when my wife and I were planning to buy our home, my dad stunned me with a quick mental calculation of loan payments. I asked him how - he said he'd learned the strange formula for compound interest from his father, who was a merchant in 19th century Iran.
🧵 1/4
I published this in a 1-pager:
P. Milanfar, “A Persian Folk Method of Figuring Interest”, Mathematics Magazine, vol. 69, no. 5, Dec. 1996
My late dad refused to be a co-author. But when it appeared, he printed it out, framed it, and hung it on the wall of the house. 🙂
4/4
(1/5) One of the most surprising and little-known results in classical statistics is the relationship between the mean, median, and standard deviation. If the distribution has finite variance, then the distance between the median and the mean is bounded by one standard deviation.
Almost every technical person knows about least-squares (LS), but most don't know about *total* least-squares (TLS).
These measure fitting error differently: LS minimizes the sum of squared vertical distances, whereas TLS minimizes the sum of squared orthogonal distances from the data to the fitted line
1/2
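A minimal NumPy sketch of the two fits (the data and noise level here are made up for illustration): LS regresses y on x, while TLS takes the principal direction of the centered data.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2 * x + 1 + rng.normal(scale=1.0, size=x.size)  # noisy line (toy data)

# Ordinary least squares: minimize sum of squared *vertical* distances.
A = np.column_stack([x, np.ones_like(x)])
slope_ls, intercept_ls = np.linalg.lstsq(A, y, rcond=None)[0]

# Total least squares: minimize sum of squared *orthogonal* distances.
# The TLS line passes through the centroid; the smallest singular
# vector of the centered data matrix is the line's normal.
X = np.column_stack([x - x.mean(), y - y.mean()])
_, _, Vt = np.linalg.svd(X, full_matrices=False)
nx, ny = Vt[-1]                       # normal to the TLS line
slope_tls = -nx / ny
intercept_tls = y.mean() - slope_tls * x.mean()

print(f"LS : y = {slope_ls:.3f} x + {intercept_ls:.3f}")
print(f"TLS: y = {slope_tls:.3f} x + {intercept_tls:.3f}")
```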
I’m just an average researcher. But something I’ve learned in 30 years of being a researcher is that if you’ve convinced yourself at age 25 that you can teach others how to be great researchers, you’ve still got a lot to learn.
Enjoyed visiting UC Berkeley’s Machine Learning Club yesterday, where I gave a talk on doing AI research. Slides:
In the past few years I’ve worked with and observed some extremely talented researchers, and these are the trends I’ve noticed:
1. When
The retina is arguably the most impressive part of the brain.
It’s the only part of the brain that faces the world directly - it’s a sensor and processor in one
It consumes 50% more energy per gram than the rest of the brain.
1000:1 compression from retina to optic nerve
There’s a single formula that makes all of your diffusion models possible: Tweedie's
Say 𝐱 is a noisy version of 𝐮 with 𝐞 ∼ 𝒩(𝟎, σ² 𝐈)
𝐱 = 𝐮 + 𝐞
MMSE estimate of 𝐮 is 𝔼[𝐮 | 𝐱] and would seem to require P(𝐮|𝐱). Yet Tweedie says P(𝐱) is all you need
1/3
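A tiny sanity check (my own illustration): with a Gaussian prior the score of the marginal is analytic, and Tweedie's formula 𝔼[𝐮 | 𝐱] = 𝐱 + σ² ∇ₓ log P(𝐱) reproduces the classical posterior mean.

```python
import numpy as np

# Gaussian prior u ~ N(mu0, tau^2), noise e ~ N(0, sigma^2);
# all numbers are illustrative.
mu0, tau, sigma = 1.0, 2.0, 0.5
x = 3.0

# Marginal: x ~ N(mu0, tau^2 + sigma^2), so its score is analytic.
score = -(x - mu0) / (tau**2 + sigma**2)
tweedie = x + sigma**2 * score           # E[u|x] via Tweedie

# Classical Gaussian posterior mean, for comparison.
posterior_mean = (tau**2 * x + sigma**2 * mu0) / (tau**2 + sigma**2)

print(tweedie, posterior_mean)           # both ~2.882
```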
The Kalman Filter was once a core topic in EECS curricula. Given its relevance to ML, RL, Ctrl/Robotics, I'm surprised that most researchers don't know much about it, and many papers just rediscover it. KF seems messy & complicated, but the intuition behind it is invaluable
1/4
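A minimal 1-D sketch of the predict/update loop (a toy constant-velocity model; all parameters are my own illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(1)
dt, q, r = 1.0, 1e-3, 0.25            # time step, process var, meas var
F = np.array([[1.0, dt], [0.0, 1.0]]) # constant-velocity dynamics
H = np.array([[1.0, 0.0]])            # we observe position only
Q = q * np.eye(2)
x = np.zeros(2)                       # state estimate (pos, vel)
P = np.eye(2)                        # estimate covariance

true_pos = 0.1 * np.arange(50)        # truth: velocity 0.1 per step
for z in true_pos + rng.normal(scale=0.5, size=50):
    # Predict through the dynamics.
    x = F @ x
    P = F @ P @ F.T + Q
    # Update with the new measurement.
    S = H @ P @ H.T + r               # innovation variance
    K = P @ H.T / S                   # Kalman gain
    x = x + (K * (z - H @ x)).ravel()
    P = (np.eye(2) - K @ H) @ P

print("estimated velocity:", x[1])    # should land near 0.1
```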
Think you understand Maximum Likelihood? Think again.
It seems like such a natural idea, but there’s an epic and turbulent history with numerous assaults on the core idea, culminating in a beautiful and complicated theory.
A highly entertaining account:
I'm releasing all the lectures and notes for an introductory course on Statistical Detection and Estimation I used to teach. The core material hasn't changed - it was an EE course, but it's as relevant today to AI researchers as ever before. Hope you find it useful.
Covers:
*
For a generation of Iranians who came of age during & just after the Islamic Revolution of 1979, current events in Afghanistan are not just heartbreaking, but also deeply personal & resonant with our own experiences.
I was smuggled alone out of Iran. I was 15. Here's my story.
One of the most surprising & little-known results in statistics is that the mean (μ) and median (m) are within a standard deviation (σ) of each other
|μ−m| ≤ σ
For unimodal densities the bound is even tighter
|μ−m| ≤ √(3/5) σ ≈ 0.7746 σ
This beautiful result first appeared in a 1932 paper by Hotelling & Solomons
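A quick check (my own example): for the unit-rate exponential, μ = 1, m = ln 2 ≈ 0.693, σ = 1, so |μ−m| ≈ 0.307 ≤ σ.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=1_000_000)   # a skewed distribution
mu, med, sigma = x.mean(), np.median(x), x.std()
print(abs(mu - med), "<=", sigma)                # ~0.307 <= ~1.0
```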
How to read a paper online:
1: open PDF in browser
2: skim abstract & figures
3: leave tab open for weeks
4: accidentally close tab
5: search for paper again
6: go to step 1
The original PageRank paper, the algorithm powering Google search, published in 1998, has been cited 17,139 times to date.
The original ResNet paper, published in 2016, has been cited 146,746 times to date.
To me, this seems extremely weird.
It’s been 20 years since I submitted my first paper with Nhat Nguyen and the late great Gene Golub on multi-frame super-res (SR). Here’s a thread, a personal story of SR as I’ve experienced it. It won’t be exhaustive or fully historical. Apologies to colleagues for any omissions
1/6 Iterating (i.e. repeatedly composing) a function is tricky business. You can get wild (even chaotic) behavior with simple functions like r(x) = cx(1-x)
That's why it's important to choose the nonlinear activation functions very carefully in neural networks.
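A sketch of that sensitivity (toy numbers of my own choosing): the same quadratic map is tame at c = 2.5, periodic at c = 3.2, and chaotic at c = 4.

```python
def iterate(c, x0=0.2, n=50):
    """Apply r(x) = c*x*(1-x) to x0, n times."""
    x = x0
    for _ in range(n):
        x = c * x * (1 - x)
    return x

for c in (2.5, 3.2, 4.0):
    # Two starting points 1e-9 apart: identical fates for small c,
    # wildly different ones at c = 4 (sensitive dependence).
    print(c, iterate(c, 0.2), iterate(c, 0.2 + 1e-9))
```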
Even technical people get this wrong:
Sample Standard Deviation (SD) vs Standard Error (SE)
You want an estimate m̂ of m=𝔼(x) from N independent samples xᵢ. Typical choice is the average or "sample" mean
How stable is this estimate? The Standard Error (SE) tells you
1/6
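A Monte Carlo sketch (illustrative numbers): the sample mean of N standard-normal draws fluctuates with spread SD/√N, not SD.

```python
import numpy as np

rng = np.random.default_rng(0)
N, trials = 100, 10_000

# Repeat the experiment many times: each row is one dataset of N samples.
means = rng.normal(size=(trials, N)).mean(axis=1)

print("spread of the sample mean:", means.std())      # ~0.1
print("SE = SD / sqrt(N):       ", 1 / np.sqrt(N))    # 0.1
```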
The perpetually undervalued least-squares:
minₓ‖y−Ax‖²
can teach a lot about some complex ideas in modern machine learning, including overfitting & double-descent.
Let's assume A is n-by-p. So we have n data points and p parameters
1/n
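A minimal sketch of the two regimes (my own toy data): for n > p the fit leaves residual error; for n < p the pseudoinverse interpolates the data exactly and returns the minimum-norm solution, the object double-descent analyses study.

```python
import numpy as np

rng = np.random.default_rng(0)

for n, p in [(100, 10), (10, 100)]:
    A = rng.normal(size=(n, p))
    y = rng.normal(size=n)
    x = np.linalg.pinv(A) @ y          # min-norm least-squares solution
    print(f"n={n:3d} p={p:3d}  residual={np.linalg.norm(A @ x - y):.3f}"
          f"  ||x||={np.linalg.norm(x):.3f}")
```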
Why don’t more people know about the gem that is Tweedie's formula?
Say 𝐱 is a noisy measurement of 𝐮
𝐱 = 𝐮 + 𝐞
w/ 𝐞 ∼ 𝒩(𝟎, σ² 𝐈)
The minimum mean-square (MMSE) estimate of 𝐮 is 𝔼[𝐮 | 𝐱]. Obviously we need the density P(𝐮|𝐱), right?
No! Tweedie says P(𝐱) is all you need!
1/2
Image-to-image models have been called 'filters' since the early days of comp vision/imaging. But what does it mean to filter an image?
If we choose some set of weights and apply them to the input image, what loss/objective function does this process optimize (if any)?
1/8
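One standard answer, sketched below (this is the textbook view, not necessarily the thread's full argument): a row-normalized weighted average is, pixel by pixel, the minimizer of a weighted quadratic loss. The weights and bandwidth here are my own illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
g = rng.normal(size=20)                       # toy 1-D "image"

# Similarity weights, row-normalized to sum to 1.
W = np.exp(-(g[:, None] - g[None, :]) ** 2 / 0.5)
W /= W.sum(axis=1, keepdims=True)

y_filt = W @ g                                # "filtering" the image

# For each pixel i, y_filt[i] = argmin_z sum_j W[i,j] * (z - g[j])**2,
# since a weighted quadratic is minimized by the weighted mean. Check i=7:
zs = np.linspace(g.min(), g.max(), 10_001)
loss = (W[7] * (zs[:, None] - g[None, :]) ** 2).sum(axis=1)
print(y_filt[7], zs[loss.argmin()])           # agree to grid precision
```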
“Mathematical rigor is like clothing: in its style it ought to suit the occasion, and it diminishes comfort and restricts freedom of movement if it is either too loose or too tight”
-G.F. Simmons
Physicists/Engineers know this well - too much rigor induces a fear of making
the only people I’ve seen dismissing this insightful article are ones who don’t seem to understand compression very well
the only surprise(?) is that it took a skilled writer and thinker, instead of one of us researchers, to make the case crystal clear
"Non-parametric" regression is often misunderstood. Are there no parameters? Hardly
It's just that non-parametric methods don't fit an explicit global model.
The overall fit is a patchwork of many local fits, whose total # of parameters may even exceed the # of data points.
1/n
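A minimal sketch with Nadaraya-Watson kernel regression (toy data and bandwidth of my own choosing): every prediction is a local weighted average, so the training points themselves play the role of parameters.

```python
import numpy as np

rng = np.random.default_rng(0)
x_train = np.sort(rng.uniform(0, 2 * np.pi, 100))
y_train = np.sin(x_train) + rng.normal(scale=0.2, size=100)

def nw_predict(x_query, h=0.3):
    # Gaussian weights from each query point to every training point.
    w = np.exp(-((x_query[:, None] - x_train[None, :]) ** 2) / (2 * h**2))
    w /= w.sum(axis=1, keepdims=True)
    return w @ y_train          # local weighted average of the data

x_query = np.linspace(0, 2 * np.pi, 5)
print(nw_predict(x_query))      # tracks sin(x) with no global model
```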
What do polar coordinates, polar matrix factorization, & Helmholtz decomposition of a vector field have in common? They’re all implied by Brenier’s Theorem: a cornerstone of Optimal Transport theory. It’s a fundamental decomposition result & deserves to be better known.
1/5
Here's a neat trick to impress your friends
Let's say you have some curve with a random shape, possibly even self-intersecting. Can you measure its length?
This isn't just a parlor trick -- it has many practical applications. For example, the curve could be a strand of DNA
1/n
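The trick itself isn't in this excerpt; one classical route (my assumption about where the thread goes) is the Cauchy-Crofton / Buffon's-noodle fact that a curve of length L dropped at random on parallel lines spaced d apart crosses them 2L/(πd) times on average, so counting crossings estimates length.

```python
import numpy as np

rng = np.random.default_rng(0)

# A wiggly test curve as a fine polyline; true length by summation.
t = np.linspace(0, 2 * np.pi, 2000)
curve = np.column_stack([t, np.sin(3 * t)])
seg = np.diff(curve, axis=0)
true_len = np.sqrt((seg**2).sum(axis=1)).sum()

# Drop the curve at a random angle and offset onto lines y = k*d and
# count crossings; E[#crossings] = 2L/(pi*d), so L ~ pi*d*n/2.
d, trials, crossings = 0.5, 2000, 0
for _ in range(trials):
    a = rng.uniform(0, np.pi)
    R = np.array([[np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]])
    y = (curve @ R.T)[:, 1] + rng.uniform(0, d)
    crossings += np.sum(np.floor(y[1:] / d) != np.floor(y[:-1] / d))

print(true_len, np.pi * d * crossings / (2 * trials))
```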
Take pixels gᵢ=g(xᵢ,yᵢ) of an image as nodes in a weighted, undirected graph. The weights on each edge are the similarity between pixels, measured w/ a sym pos def kernel
k(i,j) = exp[−d(gᵢ, gⱼ)]
g is encoded in K. What can we learn about g from K? Can we get g back from K?
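A concrete toy version of the setup (the distance and bandwidth are my own illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)
g = rng.uniform(size=16)                # 16 pixel values (toy "image")

d = (g[:, None] - g[None, :]) ** 2      # squared distance between pixels
K = np.exp(-d / 0.1)                    # symmetric positive definite kernel

# g is encoded in K: e.g. its spectrum reflects the structure of g.
evals = np.linalg.eigvalsh(K)
print(evals[-3:])                       # a few dominant eigenvalues
```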
This isn’t the abstract of the paper. It’s the whole paper.
“Equilibrium points in n-person games”
by John F. Nash Jr., PNAS January 1, 1950 36 (1) 48-49
It's a privilege to work with so many talented folks in @GoogleAI. The body of work @JeffDean describes in this blog is so broad and deep - I'm learning something new every day and fortunate to be able to contribute with my team on the vision/imaging side.
It’s been >20 years since I published my first work on multi-frame super-res (SR) w/ Nhat Nguyen and the late great Gene Golub. Here’s my personal story of SR as I’ve experienced it from theory, to practical algorithms, to deployment in product. In a way it’s been my life’s work
Congratulations to the authors of this lovely work that just won a best paper award at @siggraph #siggraph2023.
They achieve state-of-the-art visual quality with real-time (≥ 100 fps) novel-view synthesis at 1080p resolution, far exceeding NeRF approaches on both quality and
This is not a scene from Inception. The sorcery is real: this photo was taken with a very long focal-length lens. When the focal length is long, the field of view becomes very small and the resulting image appears flatter.
1/4
Motion blur is often misunderstood, because people think of it in terms of a single imperfect image captured at some instant in time.
But motion blur is in fact an inherently temporal phenomenon. It is a temporal convolution of pixels (at the same location) across time.
1/4
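A minimal sketch (a synthetic moving dot, my own construction): averaging frames over time, pixel by pixel, produces exactly the familiar streak.

```python
import numpy as np

# Motion blur as *temporal* convolution: average each pixel across frames.
T, W = 16, 32
frames = np.zeros((T, W))
for t in range(T):
    frames[t, 4 + t] = 1.0        # a dot moving one pixel per frame

blurred = frames.mean(axis=0)     # box filter across time, per pixel
print(np.nonzero(blurred)[0])     # the dot is smeared along its path
```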
The Gaussian is a nice bumpy shape, but sometimes we hope for a smooth (i.e. C∞) function like the Gaussian that is 𝒂𝒍𝒔𝒐 compactly supported.
One such class of functions is called "Bump functions"
1/6
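The classic example, as a quick sketch: smooth everywhere, yet exactly zero outside [−1, 1].

```python
import numpy as np

def bump(x):
    # C-infinity everywhere, identically zero for |x| >= 1.
    out = np.zeros_like(x, dtype=float)
    inside = np.abs(x) < 1
    out[inside] = np.exp(-1.0 / (1.0 - x[inside] ** 2))
    return out

x = np.linspace(-2, 2, 9)
print(bump(x))   # zero for |x| >= 1, positive and smooth inside
```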
1/7 The choice of nonlinear activation functions in neural networks can make a big difference. Why?
Because iterating (i.e. repeatedly composing) even simple nonlinear functions can be tricky. Wild, or chaotic behavior can emerge even with something as simple as a quadratic.
I figured out how the two formulae relate: the historical formula is the Taylor series of the exact formula around r=0.
But the crazy thing is that the old Persian formula goes back 100s (maybe 1000s) of years before Taylor's, having been passed down for generations
3/4
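A sketch of that relationship (the folk formula isn't reproduced in this excerpt, so I take the first-order expansion as its form, per the claim above; principal, rate, and term are illustrative):

```python
# Exact per-period payment on principal P at rate r over n periods:
#   A(r) = P * r * (1 + r)**n / ((1 + r)**n - 1)
# First-order Taylor expansion around r = 0 (my reconstruction of the
# folk formula's form, per the claim above):
#   A(r) ~ P/n + P * r * (n + 1) / (2 * n)
P, n = 100_000, 240

def exact(r):
    return P * r * (1 + r) ** n / ((1 + r) ** n - 1)

def folk(r):
    return P / n + P * r * (n + 1) / (2 * n)

for r in (0.001, 0.005):
    print(r, exact(r), folk(r))        # close for small r
```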
A quick thread on a ‘simple’ problem: denoising
x = u + e
Noisy signal = x
Clean signal = u
Gaussian iid noise = e
Two well-known approaches:
MAP: Maximum a-posteriori
MMSE: Minimum mean square error
They don’t coincide, but share several interesting properties
1/4
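A minimal numerical sketch of the gap (the Laplacian prior and all numbers are my own illustrative choices): MAP gives soft-thresholding, MMSE gives the posterior mean, and they visibly differ.

```python
import numpy as np

s, b = 1.0, 1.0                          # noise std, prior scale
x = 1.5                                  # one noisy observation

# MAP: argmax of p(u|x) under a Laplacian prior = soft-thresholding.
u_map = np.sign(x) * max(abs(x) - s**2 / b, 0.0)

# MMSE: the posterior mean E[u|x], by brute-force numerical integration.
u = np.linspace(-20, 20, 200_001)
post = np.exp(-((x - u) ** 2) / (2 * s**2) - np.abs(u) / b)
u_mmse = (u * post).sum() / post.sum()

print(u_map, u_mmse)                     # they don't coincide
```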