Interested in learning the mathematical foundations of Reinforcement Learning (RL)? Now is a good time! This semester, we are making the videos and lecture notes from my graduate-level RL theory course at Princeton available to the public. Here is week 1:
Announcing our new work, which shows that the transformer architecture can be a lot worse than RNNs at modeling sequences with long-term correlations, such as HMMs, and how to potentially fix it. Joint work with amazing collaborators Jiachen Hu and @qinghual2020.
Announcing our new paper, which studies OOD generalization under well-specified covariate shift and proves that, surprisingly, vanilla MLE without any importance weights is the best algorithm!
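To make the (perhaps counterintuitive) claim concrete, here is a minimal simulation sketch in a toy well-specified linear-Gaussian model. The setup and all numbers are my own illustration, not the paper's experiments: both vanilla MLE (ordinary least squares) and importance-weighted MLE are consistent when the model is well specified, but the weights only inflate variance.

```python
# Toy illustration (my own setup, not the paper's): under a well-specified
# model, vanilla MLE tends to beat importance-weighted MLE under covariate shift.
import numpy as np

rng = np.random.default_rng(0)
w_true, n, trials = 1.5, 200, 500

def fit(x, y, weights=None):
    # (Weighted) least squares = (importance-weighted) Gaussian MLE for y = w*x + noise.
    if weights is None:
        weights = np.ones_like(x)
    return np.sum(weights * x * y) / np.sum(weights * x * x)

risk_mle, risk_iw = [], []
for _ in range(trials):
    x = rng.normal(0.0, 1.0, n)                    # train covariates ~ N(0, 1)
    y = w_true * x + rng.normal(0.0, 1.0, n)       # well-specified linear model
    iw = np.exp(-0.5 * ((x - 2.0) ** 2 - x ** 2))  # density ratio N(2,1)/N(0,1)
    x_test = rng.normal(2.0, 1.0, 2000)            # shifted test covariates ~ N(2, 1)
    for w_hat, out in ((fit(x, y), risk_mle), (fit(x, y, iw), risk_iw)):
        out.append(np.mean(((w_hat - w_true) * x_test) ** 2))

print(f"vanilla MLE excess test risk: {np.mean(risk_mle):.4f}")
print(f"IW-MLE      excess test risk: {np.mean(risk_iw):.4f}")
```

The intuition: when the model class contains the truth, reweighting buys no bias reduction, so the extra variance from the weights is a pure loss.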
In the summer, @YuanhaoWang3, @qinghual2020, and I proposed learning Nash equilibria from human feedback in theory (inspired by the dueling-bandit literature). It's truly fascinating to witness the idea being used to create innovative, practically useful algorithms for LLMs.
Fast-forward ⏩ alignment research from @GoogleDeepMind! Our latest results enhance alignment outcomes in Large Language Models (LLMs). Presenting NashLLM!
Can we match the performance of optimally tuned SGD without knowing the problem parameters, including the diameter, Lipschitz/smoothness constants, and noise levels? See our recent work led by our amazing student @ahmedkhaledv2.
While many existing results in game theory focus on finding equilibria, an equally important goal is to learn rationalizable behaviors, which avoid iteratively dominated actions.
Ever wonder how to play multiplayer games (>2 players, such as Mahjong or Poker) well, and what the ultimate solution would be? Check out our paper on why classical equilibria and existing self-play systems are not enough, and how to address it:
Distinguished Professor Michael Jordan recently sat down with Barbara Rosario (Ph.D. I-School, 2005) for a series called "AI Stories." Jordan's episode was filmed in Trieste, Italy, and titled "Lives, Loves, and Technology." #BerkeleyStats
Ever wonder what the principled approach is to directly using existing standard (reward-based) RL techniques to handle RL from preferences? I will talk about "Is RLHF More Difficult than Standard RL?" at the ICML workshop "The Many Facets of Preference-based Learning" today.
@ ICML till Friday, happy to catch up! (Be mindful when using the Vienna metro: purchasing a ticket is not enough. Validate the ticket at a tiny blue machine before boarding, or face a fine of €100+. There is no mercy for first-time foreign travelers unaware of the regulation 😂)
Feel free to stop by our posters on RL theory and parameter-free optimization at NeurIPS! Our amazing student Qinghua Liu (@qinghual2020) is also on the job market this year.
Do you know how to *provably* solve multiagent reinforcement learning problems under partial observability? Check out our NeurIPS poster at Hall J #616 on Wed 4pm, which gives the first sample-efficient solution for learning partially observable Markov games.
Very glad to meet my old friend @shaneguML again! Princeton should also have a cat cafe. Very good business! It erases stress and heals your heart. :-)
@chijinML and I have known each other for 10 years, but the guy in the middle is a random French data scientist from Israel. *the badge is a time log, not a name tag*
Pro tip in Tokyo: if you want to meet random data scientists, @GoogleAI researchers, or @Princeton professors, go to a cat cafe.
We are excited to announce our recent work with @YuanhaoWang3, Dingwen Kong, and @yubai01, which presents new algorithms and the first sample-efficient guarantees for learning rationalizable equilibria.
CALL FOR PAPERS!!!
Excited to organize our new workshop on "The Many Facets of Preference-based Learning" at this year's ICML in Hawaii, with my amazing co-organizers @BengsViktor, Robert Busa-Fekete, Mohammad Ghavamzadeh, and Branislav Kveton.
Website:
Out-of-distribution (OOD) generalization is a core challenge in modern ML/foundation models. We consider the well-specified setting, as modern ML systems typically use very large and expressive models. This is joint work with @EmilyJge, Shange Tang, Jianqing Fan, and Cong Ma.
@RuntianZhai Thanks for pointing out your nice paper! While the two works prove related phenomena, the settings and underlying mechanisms are orthogonal and in some sense complement each other. We will add a comparison to your work in our next version.
@bremen79 @durdi4 @ahmedkhaledv2 [1] Thanks for your comments! I would also like to quickly remark on a few points.
A. Having coarse estimates of upper/lower bounds on the parameters is common in ML applications, as most optimizers have either explicit or implicit regularization.
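A generic illustration of how such coarse bounds are typically exploited (a sketch of the standard log-grid idea on a toy problem of my own, not necessarily our paper's algorithm): searching step sizes over a logarithmic grid between the coarse bounds multiplies the work by only about log(upper/lower).

```python
# Sketch (not our paper's algorithm): with only coarse bounds [lo, hi] on the
# right step size, a log-spaced grid search pays roughly a log(hi/lo) factor
# of extra work instead of requiring the unknown optimal value.
import numpy as np

def sgd(step, grad, x0, T=1000):
    x = x0
    for _ in range(T):
        x = x - step * grad(x)
        if not np.isfinite(x):  # this step size diverged; abandon it
            return np.inf
    return x

def tuning_free_sgd(grad, loss, x0, lo=1e-6, hi=1e2, T=1000):
    # O(log(hi/lo)) candidate step sizes; keep the one with the best final loss.
    steps = np.geomspace(lo, hi, num=int(np.ceil(np.log10(hi / lo))) + 1)
    return min((sgd(s, grad, x0, T) for s in steps), key=loss)

# Toy quadratic f(x) = 0.5 * a * x^2, with curvature a unknown to the tuner.
a = 7.3
x_final = tuning_free_sgd(grad=lambda x: a * x, loss=lambda x: 0.5 * a * x * x, x0=5.0)
print(f"final iterate: {x_final:.2e}")
```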
@iandanforth Those pathological behaviors can be the result of many factors. We will cover principled approaches to handling exploration, function approximation, and, later on, multiagent and partially observable settings, which relate to addressing some of the bad behaviors you mentioned.
@BishPlsOk Our paper concerns the basic goal of just winning the game, which might be the first step toward addressing those games, and is already highly non-trivial in the current context. I agree there are many other important aspects of practical games beyond just winning that are worth further study.
@bremen79 @durdi4 @ahmedkhaledv2 [3] B. The lower-order term is important, especially in stochastic optimization, given that the improvement brought by momentum in SGD also appears only in the lower-order terms.
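For context (a generic form, hedging on constants and exact exponents, which vary across papers), nonconvex smooth stochastic bounds often look like
$$\mathbb{E}\,\|\nabla f(x_{\mathrm{out}})\| \;\lesssim\; \frac{(\sigma^2 L \Delta)^{1/4}}{T^{1/4}} \;+\; \frac{(L \Delta)^{1/2}}{T^{1/2}},$$
so two methods with the same leading $T^{-1/4}$ term can still differ meaningfully in the lower-order $T^{-1/2}$ term at practical $T$.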
@bremen79 @durdi4 @ahmedkhaledv2 [4] In practice, T is often not large enough to enter the asymptotic regime: it may not be significantly larger than the high-order polynomial dependence on problem parameters that appears in the lower-order terms of prior works.
@PandaAshwinee @ahmedkhaledv2 The final goals are similar, but there are multiple versions of "parameter-free" used in prior work. We call it tuning-free to make our definition more formal/rigorous and to avoid confusion with prior definitions. The meat is in the algorithms and results, not the definition.
@bremen79 @durdi4 @ahmedkhaledv2 [2] Don't know D_upper? We can simply set it extremely large, so that the algorithm will never exceed that limit in practice. This is fine, since we only pay extra log factors.
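A hedged sketch of why this is cheap (a generic parameter-free-style bound shape, not our exact theorem): guarantees of the form
$$\mathrm{Regret}_T \;\lesssim\; D \sqrt{T \, \log\!\big(D_{\mathrm{upper}} / D_{\mathrm{lower}}\big)}$$
mean that inflating $D_{\mathrm{upper}}$ from, say, $10^3$ to $10^9$ (with $D_{\mathrm{lower}} = 1$) costs only about a $\sqrt{3}$ factor.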