Csaba Szepesvari Profile
Csaba Szepesvari

@CsabaSzepesvari

8,844 Followers
710 Following
110 Media
2,504 Statuses

"If there is not folly in the world, then the world itself is folly. You must understand that mistakes are not always regrets." - Paul Tobin, Bandette🤠

England, United Kingdom
Joined February 2016
@CsabaSzepesvari
Csaba Szepesvari
2 years
This semester I'll teach an undergraduate "intro to RL" course at the UofA. For the first lecture, I collected some exciting, recent, impactful applications of RL. Link to the relevant slides: I thought this may be worthwhile to share.
27
100
760
@CsabaSzepesvari
Csaba Szepesvari
6 years
Yours truly and his coauthor Tor Lattimore happily present the near-final draft of their upcoming bandit book at The pdf will stay free. In this phase we welcome reader comments. The book will be printed by #CambrideUniversityPress . Please share:)
7
174
422
@CsabaSzepesvari
Csaba Szepesvari
4 years
Interested in hearing about the theoretical foundations of RL from a multidisciplinary perspective (CS, control, stats, OR)? If so, join us at the (all virtual) RL Theory Bootcamp at the Simons Institute next week. Lectures in the morning and the afternoon ==>
4
79
387
@CsabaSzepesvari
Csaba Szepesvari
5 years
After a 2-year break, I'll be teaching a grad course in the fall. Go Bandits!
8
43
359
@CsabaSzepesvari
Csaba Szepesvari
5 years
Glad to announce the "Theory of RL" program at the Simons Institute in the Fall of 2020. DM me if you are interested! @SebastienBubeck @EmmaBrunskill Alan Malek @SeanMeyn Ambuj Tewari and Mengdi Wang are my awesome coorganizers.
4
41
227
@CsabaSzepesvari
Csaba Szepesvari
5 years
Is RL used in real applications? If so, how and where? And if not, why not and how can this be fixed? Join our excellent panelists and speakers at the half-day RL2 workshop organized at @icmlconf or submit a paper to present your views.
3
22
184
@CsabaSzepesvari
Csaba Szepesvari
4 years
I feel very much honoured to be selected for this role. To make the best of this job, hive mind of ML people on twitter, if you have any ideas about how to improve ICML, drop me a message (or just respond to this tweet).
@JohnCLangford
John Langford
4 years
Some decisions for ICML from the board: ICML General Chairs: 2022: Kamalika Chaudhuri @kamalikac 2023: Andreas Krause @arkrause ICML 2022 Program Chairs: Csaba Szepesvari @CsabaSzepesvari , Le Song @dasongle , and Stefanie Jegelka (maybe @StefanieJegelka )
1
12
179
17
2
175
@CsabaSzepesvari
Csaba Szepesvari
4 years
Friends: I am looking for theory-oriented postdocs in RL (with past theory experience). I would appreciate it if you spread the word.
1
86
162
@CsabaSzepesvari
Csaba Szepesvari
4 years
Just for counterbalancing, hats off to those reviewers who are still doing a great job! I know that you are out there and while your numbers could be diminishing, we need you to keep doing what you do (post inspired by reading actual good reviews doing my editorial job).
4
5
162
@CsabaSzepesvari
Csaba Szepesvari
4 years
Advice for future reviews: An important question to ask when figuring out whether to recommend accept or reject is "How difficult is it to fix the issues I found?" If very difficult, the paper can't be saved. If not too difficult, there is no reason to reject the paper.
5
10
144
@CsabaSzepesvari
Csaba Szepesvari
4 years
Broader impact predictions back in the day.
@fermatslibrary
Fermat's Library
4 years
Heinrich Hertz after proving the existence of radio waves stated that "it's of no use whatsoever" and regarding the applications of the discovery: "Nothing, I guess"
31
487
3K
1
13
145
@CsabaSzepesvari
Csaba Szepesvari
8 months
Our department is hiring theoreticians working on ML! If you are on the job market for faculty positions and have a strong track record in theory, this may be your dream job! Why apply? Read on.. 1/x
4
27
115
@CsabaSzepesvari
Csaba Szepesvari
4 years
This sounded like a crazy idea two weeks ago, but here we go! @RLtheory is the account to follow! Thanks to the speakers who already accepted our invitations! I hope the community will like this series!
@neu_rips
Gergely Neu
4 years
excited to announce a new series of virtual seminars on ~~~REINFORCEMENT LEARNING THEORY~~~ we've set this up with @CiaraPikeBurke and @CsabaSzepesvari to keep track of all the advances of this fast-paced field. hope others will also find it useful!
5
75
306
5
26
116
@CsabaSzepesvari
Csaba Szepesvari
5 years
I have a duty to spread the truth: "Don't worry about the overall importance of the problem; work on it if it looks interesting. I think there's a sufficient correlation between interest and importance. — David Blackwell" And remember:
0
16
113
@CsabaSzepesvari
Csaba Szepesvari
3 years
For whatever it's worth, I am offering a mentoring session at #AISTATS on Wednesday, April 14, 2021 18:30 MDT. All are welcome!
3
13
109
@CsabaSzepesvari
Csaba Szepesvari
6 years
Please share: The newly created "Foundations team" of @DeepMindAI have openings for research scientists with strong theoretical background, and an unstoppable interest in pushing the boundaries of AI and machine learning. PM me if you are interested. #ICML2018
3
37
107
@CsabaSzepesvari
Csaba Szepesvari
4 years
Just in case the travel restrictions would last until July, preorder our book now on Amazon:
4
8
106
@CsabaSzepesvari
Csaba Szepesvari
4 years
After creating a new homepage, I discovered, I used to have a blog. Since I already had it, why not add a new post? Here we go:
3
15
101
@CsabaSzepesvari
Csaba Szepesvari
4 years
Tomorrow we will have Martha White! She will talk about "Policy Gradient Methods as Approximate Policy Iteration: Advantages and Open Questions". Talks open to anyone! Join here:
@AmiiThinks
Amii
4 years
The @rlai_lab Tea Time Talks return! Hosted by Amii’s Chief Scientific Advisor Dr. Richard S. Sutton, the 20-minute talks are delivered by students, faculty and guests, and range from ideas starting to take root to finished projects. #AI #ML #RL
0
2
14
2
19
91
@CsabaSzepesvari
Csaba Szepesvari
1 year
RL Theory Seminars are back! First talk, Policy learning "without" overlap: Pessimism and generalized empirical Bernstein's inequality by Ying Jin!
2
20
92
@CsabaSzepesvari
Csaba Szepesvari
4 years
@roydanroy Of course, can't compete with Dan, but I am also still looking for postdocs -- right down in Edmonton, driving distance to the rockies. Awesome hikes, climbs, kayaking, .. + I can promise interesting RL theory problems and a fast paced environment:)
5
10
90
@CsabaSzepesvari
Csaba Szepesvari
3 months
Venting. Reviewer: The paper is bad because of X, Y and Z. Rebuttal: You are wrong on X, Y and Z + detailed explanation. Reviewer: I maintain my score. The paper is bad (no explanation given). How is this ever an acceptable behavior? Why does a reviewer think this is fine?
10
2
92
@CsabaSzepesvari
Csaba Szepesvari
3 years
@peter_richtarik 's recent post gave me this idea: As next year yours truly will be partially responsible for reviewing quality at ICML, and you just got your first round of reviews back from named conference, vent for me. I promise to listen.
27
9
92
@CsabaSzepesvari
Csaba Szepesvari
11 months
@jasondeanlee He skipped this. Vitanyi & Li's book, or the article below, gives you the answer. In one formulation (see attached pic), one has that maximum likelihood for a large class of distributions over one-way infinite sequences is implemented by Kolm-compression.
Tweet media one
3
4
88
@CsabaSzepesvari
Csaba Szepesvari
4 years
This is a mini water treatment plant that will be used to optimize the water treatment process using reinforcement learning. It's really awesome to see this happening in Alberta!
@ISLadapt
ISL Adapt
4 years
We are excited to advance the science of water treatment and AI with our partners @rlai_lab @UAlberta @AmiiThinks @DraytonValley and @ISLengineering ! 💧💻 Many thanks to our supporters @ABInnovates @NSERC_CRSNG for this #aiforgood opportunity!
0
4
31
1
5
90
@CsabaSzepesvari
Csaba Szepesvari
4 years
The third and final workshop in the RL theory program starts tomorrow. The topic is batch RL (sorry @jacobmbuckman ) and simulation-based optimization. All are welcome! The workshop will stream on Youtube. To join on zoom, you need to register.
2
15
90
@CsabaSzepesvari
Csaba Szepesvari
2 years
Offline RL is cool, but will it ever work? Next Tuesday, Yunzong Xu (MIT) will put the nail into the coffin of offline RL by showing us the proof of the correctness of a 2019 conjecture by Chen and Jiang that predicted bad bad news for offline RL.
0
9
89
@CsabaSzepesvari
Csaba Szepesvari
3 years
While some moments are pretty bleak (CMT mishaps), it warms my heart to see how many people care about @icmlconf . Thank you reviewers and other program committee members and I am looking forward to working with you in the coming year.
0
0
88
@CsabaSzepesvari
Csaba Szepesvari
5 years
#NevernendingReviewingSeason What makes a review good? (1) Objective; (2) helps the decision maker; (3) helps the authors; (4) polite. Constructive criticism is the expression. Constructive, not destructive.
2
12
84
@CsabaSzepesvari
Csaba Szepesvari
5 years
Happy to report that it seems chances are really high that we'll record and will post the lectures online. I'll test the tech on Friday to see whether it is able to track me as I zip from board to board.
2
5
83
@CsabaSzepesvari
Csaba Szepesvari
5 years
To the attention of friends of #ReinforcementLearning : After all those years, finally, our home, @rlai_lab from @UofAResearch is live on twitter.
@rlai_lab
Reinforcement Learning and Artificial Intelligence
5 years
Hello World! This account will share the latest news and updates about what the Reinforcement Learning and Artificial Intelligence (RLAI) Lab at the University of Alberta is up to. Let’s figure out intelligence!
2
27
152
3
6
81
@CsabaSzepesvari
Csaba Szepesvari
4 years
With some glitches, but we are done with the first of the series. Never knew so many people care about RL theory, yay! Great talk Chi Jin! Awesome audience! Next one can only be smoother:) Sign up here if you have not signed up yet:
3
6
78
@CsabaSzepesvari
Csaba Szepesvari
3 months
@thegautamkamath I grind for my students. And for the love of science and knowledge:) It's not rational, but I can't help it. I am not sure whether this sounds honest, but I really never cared about anything but my students and the joy I get from learning new things and connecting to others
2
2
74
@CsabaSzepesvari
Csaba Szepesvari
3 years
Unsolicited student email: "This is my second reminder. I believe your research team is one of the best positions for me to continue my studies, I would be thankful if you could respond to my initial email." (The student never carefully checked my homepage.) Go figure!
5
2
73
@CsabaSzepesvari
Csaba Szepesvari
3 months
We often hear about the theory-practice gap. At this workshop we will take a thorough look at this. Is there a gap? What is the nature of the gap? Who made it? Is it good to have the gap? If not, how to close it? I think this is super important for the healthiness of the field!
@arlet_workshop
ARLET
3 months
🧵 Thrilled to announce the #ICML RL workshop 'Aligning RL Experimentalists and Theorists'! We will have several talks and a panel delivered by a super lineup of speakers: @white_martha , @ShamKakade6 , @yayitsamyzhang , Dylan Foster, Niao He, @svlevine , and @MengdiWang10 . 1/3
1
14
65
1
11
72
@CsabaSzepesvari
Csaba Szepesvari
4 years
.. and we will finish every day with a bonus talk which brings in the perspective of some particular application. For registration (no fees, just to receive the zoom link) and further details, visit the bootcamp website.
0
7
72
@CsabaSzepesvari
Csaba Szepesvari
4 years
Tired of staring at the pages of the free pdf at ? Want to smell it, flip the pages? Visit the @CambridgeUP booth at #NeurIPS2020 or just head directly to for an incredible 30% discount! #BanditBook
1
4
72
@CsabaSzepesvari
Csaba Szepesvari
4 years
To the attention of grad students. New Mentor Session scheduled Who? Csaba Szepesvari When? Thu, 10 Dec 2020 18:00:00 GMT Description: PhD advice and virtual cookies Details about event:
1
11
72
@CsabaSzepesvari
Csaba Szepesvari
3 years
More awesome RL content; Reinforcement Learning, Bit by Bit by Xiuyuan (Lucy) Lu (DeepMind) Date / Time: Lecture 1: 9:30 AM - 10:30 AM (PT), April 20th (Tuesday) Lecture 2: 10:30 AM - 11:30 AM (PT), April 23rd (Friday) (Stanford RL forum!)
2
17
69
@CsabaSzepesvari
Csaba Szepesvari
4 years
It's here! This weekend, a fully online, pre-ICML, soothing "RL for real life" 2x3 hours virtual conference! Fantastic invited speakers & panel, moderators. Prepare and submit your questions in advance!!! All credit should go to my incredible coorganizers.
@yuxili99
Yuxi Li
4 years
Welcome to RL for Real Life Virtual Conference, June 27-28. , co-organized with @gabepsilon , Alborz Geramifard, Omer Gottesman, @LihongLi20 , Anusha Nagabandi, Zhiwei (Tony) Qin, @CsabaSzepesvari With two panels on general RL and RL+healthcare topics.
1
16
47
0
9
68
@CsabaSzepesvari
Csaba Szepesvari
5 years
Bandits going strong at UofA! 32 seats in the classroom all taken on the day when they became available.
3
1
69
@CsabaSzepesvari
Csaba Szepesvari
3 months
Now that the #COLT2024 decisions are out, I'd like to announce a workshop that we are organizing that will happen just before COLT. The workshop theme is RL Theory. All are welcome! Details here: Please spread the word!
2
20
68
@CsabaSzepesvari
Csaba Szepesvari
5 years
Illustration, slightly edited to protect anonymity: "paper feels incremental ..putting together well-known ideas in a straightforward manner." What can I say? Previous work missed even these. And straightforward once done. Reviewer also admitted not reading the proof. Great job?!
@scottniekum
Scott Niekum
5 years
ICML review rant: The ML community is screwed if we keep insisting that scientific inquiry about known algorithms isn't "novel" (even if it leads to major new capabilities / SoTA), but that engineering yet another new, incremental algorithm that we know nothing about is great.
24
201
1K
0
6
66
@CsabaSzepesvari
Csaba Szepesvari
4 years
Any tips on what to write as a broader impact statement for theory papers to be sent to NeurIPS? #powerofmath #poweroftheory
10
3
65
@CsabaSzepesvari
Csaba Szepesvari
2 years
1/x Our department has 2 Assistant Professor positions in AI/ML and one in Theoretical Computing Science. Here are the job ads. Our department is a super fun, collegial place. Ads:
1
16
63
@CsabaSzepesvari
Csaba Szepesvari
3 years
The moment when the hope that review quality can be improved appears to be fading into the void.. But: #NeverGiveUp #ICML2022
5
3
64
@CsabaSzepesvari
Csaba Szepesvari
5 years
New post on the inescapable appeal of Bayesian methods in the context of adversarial bandits. Or how Bayesian methods can help the agnostic. Hint: Minimax theorems open wormhole between distant corners of the universe.
0
17
63
@CsabaSzepesvari
Csaba Szepesvari
2 years
One day before reviews are due for Phase 1 at #ICML2022 , 50% of the reviewers have submitted zero reviews. The review load for this phase is <=2 papers and there were 19 days for writing these <=2 reviews. What percentage of reviewers will submit all of their reviews in time?
50-69: 166
70-89: 254
90-100: 114
just relax Csaba: 407
14
2
61
@CsabaSzepesvari
Csaba Szepesvari
4 years
Asking for a friend: A student wants to pick up intuition about Bregman divergences and their use in convex optimization/online learning. There are lots of excellent texts out there, but is there one that is strong on providing intuition? 1/x
5
4
61
@CsabaSzepesvari
Csaba Szepesvari
3 years
"What information to seek, how to seek that information, and what information to retain?" What else is there to know? A principled approach to this problem will be presented tomorrow by DeepMind's Xiuyuan Lu. Last RL Theory Seminar before the summer break!
0
7
60
@CsabaSzepesvari
Csaba Szepesvari
5 years
New favourite quote:)
@CompSciFact
Computer Science
10 years
'Just because you've implemented something doesn't mean you understand it.' -- Brian Cantwell Smith
5
169
126
0
2
59
@CsabaSzepesvari
Csaba Szepesvari
3 years
Super proud of Tor and Andras! It's a delight to have them in the team! The paper can be accessed from here:
@GoogleDeepMind
Google DeepMind
3 years
Huge congratulations to Tor and Andras! Their paper “Improved Regret for Zeroth-Order Stochastic Convex Bandits” was recently recognised for a best paper runner-up award by the flagship learning theory conference, COLT: 1/
7
55
322
1
4
57
@CsabaSzepesvari
Csaba Szepesvari
3 years
Exactly what the program committee needs to know! Thanks Mike! :-D
2
0
58
@CsabaSzepesvari
Csaba Szepesvari
2 years
I got many good comments, suggestions and I have significantly expanded the list. I am quite pleased with the result, RL seems to be doing quite well. Very nice applications and more in the works! Thanks everyone!
@CsabaSzepesvari
Csaba Szepesvari
2 years
This semester I'll teach an undergraduate "intro to RL" course at the UofA. For the first lecture, I collected some exciting, recent, impactful applications of RL. Link to the relevant slides: I thought this may be worthwhile to share.
27
100
760
0
5
57
@CsabaSzepesvari
Csaba Szepesvari
3 years
I am delighted to invite everyone tomorrow for the first RL Theory Seminar talk of 2021 by Andrea Zanette. Andrea will explain to us why and how batch reinforcement learning can be much harder than online RL. For details check out
0
11
56
@CsabaSzepesvari
Csaba Szepesvari
4 years
NeurIPS experience: Does anyone enjoy moving around a silly avatar with the speed of a snail in oversized rooms to get to specific posters?
9
0
56
@CsabaSzepesvari
Csaba Szepesvari
1 year
Wow, I just discovered this treat: Moritz Hardt and Ben Recht: "Patterns, predictions, and actions". I will surely recommend this for my students or whoever starts with this subject! Very cool. Thank you @beenwrekt !
3
2
56
@CsabaSzepesvari
Csaba Szepesvari
2 years
My typical day..
@docmilanfar
Peyman Milanfar
2 years
On the first page of my (1993) PhD Thesis. Still true.
13
111
1K
0
0
54
@CsabaSzepesvari
Csaba Szepesvari
4 years
Improper learning? Who would do that? Is not that bad by definition? Not even proper? Come to our seminar to find out what Max Simchowitz thinks about improper learning for non-stochastic control!
@RLtheory
RL Theory Virtual Seminars
4 years
Our next talk: 06/30: Max Simchowitz (UC Berkeley) "Improper Learning for Non-Stochastic Control" For details, please see the website:
0
9
29
1
8
52
@CsabaSzepesvari
Csaba Szepesvari
4 years
@thegautamkamath When I was a PhD student, a few times I was quite discouraged by some reviews. SIAM J. Opt told me in 2000 that exploration in finite MDPs is old-fashioned:) Soon enough though, I learned not to pay attention to failures or rejections and focused on positives. ==>
2
1
51
@CsabaSzepesvari
Csaba Szepesvari
5 years
Cool universality argument for SGD with FF neural nets: Take any learning alg A for learning Boolean functions without noise from a sample of size n. Then there is a NN architecture G(A,n) such that SGD+G(A,n)+Any reasonable loss with sequential processing "implements" A.
@DimitrisPapail
Dimitris Papailiopoulos
5 years
A tour de force by Abbe & Sandon, "Any function distribution that can be learned from samples in poly-time can also be learned by a poly-size neural net trained with SGD on a poly-time initialization with poly-steps" + "[this] does not hold for GD"
1
20
101
1
7
48
@CsabaSzepesvari
Csaba Szepesvari
4 years
@neu_rips being featured in @marcgbellemare 's talk (awesome talk Marc, by the way!! congrats again for all those involved!!). But Twitter does work, eh?
1
0
49
@CsabaSzepesvari
Csaba Szepesvari
7 years
I am very excited to announce that I am joining DeepMind, taking a two-year leave. I will miss people in Edmonton, but you should visit!
2
4
49
@CsabaSzepesvari
Csaba Szepesvari
4 years
@beenwrekt You mean no progress? Nah.. Btw, I like the style of some of these old papers that describe some unbaked idea for what they are, not trying to oversell them, making them look bigger than what they are (eg a heuristic is a heuristic..). Papers of this type won't make it today.
2
1
48
@CsabaSzepesvari
Csaba Szepesvari
4 years
You must see this, new webpage! ..after the service I have previously used to compile my publications-page stopped working (dire times..), put together in a day with the help of and
4
0
47
@CsabaSzepesvari
Csaba Szepesvari
4 years
@yisongyue Research is done in many small steps. You may think something goes unnoticed, but it may have influenced someone, who gets a new idea, writes another small thing. This leads to the next thing. Wait 20 years, the many little things add up and a much cleaner, deeper ==>
1
2
47
@CsabaSzepesvari
Csaba Szepesvari
4 years
..and next week we take a break to let the "Deep RL meets theory" workshop take the stage! Check out the program at: Do not forget to put all these events in your calendar! The most convenient way to do this is to go here:
@RLtheory
RL Theory Virtual Seminars
4 years
We are glad to announce that we are now officially part of the "Theory of RL" program at the Simons Institute! See our updated schedule that now includes two new speakers and the RL theory workshops at @SimonsInstitute .
1
5
66
0
8
46
@CsabaSzepesvari
Csaba Szepesvari
1 year
Aaditya Ramdas (not on twitter; good for him) is coediting a special issue for MLJ on "Conformal Prediction and Distribution-Free Uncertainty Quantification". Deadline Nov 30. Consider submitting if you have something! I will be looking forward to seeing what comes out of this!
2
5
43
@CsabaSzepesvari
Csaba Szepesvari
3 years
A frequent issue in batch RL is that evaluation methods are biased and the size of the bias is unknown. Come and join us tomorrow to learn from Yi Su about how to build optimizers that do almost as well as if the bias was known! For details:
1
11
45
@CsabaSzepesvari
Csaba Szepesvari
4 years
@Maggiemakar @zacharylipton For those who like books, I also love the Anthony-Bartlett book While it is quite short, it explains soo much about how SLT has evolved over the years!
0
6
44
@CsabaSzepesvari
Csaba Szepesvari
1 year
RL Theory Seminars is pleased to present a talk by Yujia Jin (Stanford) tomorrow on "VOQL: Towards Optimal Regret in Model-free RL with Nonlinear Function Approximation". For further details, check out
0
6
44
@CsabaSzepesvari
Csaba Szepesvari
3 years
Representation learning and exploration in RL together? Aditya Modi got you covered! Details? Well, you should come to the next talk! For details visit:
0
2
43
@CsabaSzepesvari
Csaba Szepesvari
1 year
Very happy for this! What a spectacular future for @UAlberta / @UAlbertaCS and @AmiiThinks !
@nathansttt
Nathan Sturtevant
1 year
A packed house to hear @BFlanaganUofA from the @UAlberta and @AmiiThinks announce that 20 new faculty will be hired in AI across campus in the next 3 years, with 5 of these positions in CS.
1
14
65
1
0
43
@CsabaSzepesvari
Csaba Szepesvari
3 years
Advice for people thinking of registering an email address at CMT or other similar reviewing systems: Register an email that is NOT associated with your school/workplace. School and workplace change. Then you will end up with multiple identities, which is not what you want:)
2
1
43
@CsabaSzepesvari
Csaba Szepesvari
11 months
Proud of my colleagues, winning an IJCAI distinguished paper award! Go @GoogleDeepMind @UAlbertaCS @AmiiThinks !
@mhutter42
Marcus Hutter
11 months
What do you get when you cross modern Machine Learning with good old-fashioned Search? An IJCAI distinguished paper award 🙂 for Levin Tree Search with Context Models:
6
32
183
1
2
41
@CsabaSzepesvari
Csaba Szepesvari
2 years
@pcastr SOMs are an awesome example of what curiosity-driven research looks like. Neither neuroscience, nor solving any real problem. Yet, one can still write books about SOMs, think about them in various ways, etc. Something to remember when judging relevance while reviewing!
2
1
42
@CsabaSzepesvari
Csaba Szepesvari
2 years
I hope everyone enjoyed ICLR. As promised, RL Theory seminars are back and we are super lucky to have Kwang-Sung Jun fixing our bad ideas about how to use Boltzmann exploration via the help of the mysterious "Maillard sampling" idea. Intrigued? Check out
0
8
42
@CsabaSzepesvari
Csaba Szepesvari
4 years
Why do we use softmax to represent policies? Could we use some other "transfer" function? Which one? Pros/cons? Come to see our posters to hear about the gravitational pull of softmax and how physicists are always right! I can't guarantee to be up at the time of the oral though:)
@rlai_lab
Reinforcement Learning and Artificial Intelligence
4 years
Come hear Jincheng Mei, Chenjun Xiao, @daibond_alpha , @LihongLi20 , @CsabaSzepesvari , Dale Schuurmans talk about "Escaping the Gravitational Pull of Softmax" on Tuesday. Oral: 0715–0730 MST Poster: 10–12pm MST Link: #NeurIPS2020
0
1
16
1
4
42
@CsabaSzepesvari
Csaba Szepesvari
4 years
Ladies and gentlemen! We are delighted to give you OPPO, optimistic policy optimization (very much related to the previous talk by the way!) to achieve efficient and effective exploration with linear function approximation in finite horizon MDPs as presented by Zhuoran Yang!
@RLtheory
RL Theory Virtual Seminars
4 years
Our next talk: 09/22: Zhuoran Yang (Princeton) "Provably Efficient Exploration in Policy Optimization" For details, please see the website:
0
8
42
0
5
42
@CsabaSzepesvari
Csaba Szepesvari
4 years
Our chance to stay positive during these dire times is to attend Simon's seminar tomorrow where I hope we learn that despite all other signs RL is not much harder than bandits. Long live RL, long live bandits!
@RLtheory
RL Theory Virtual Seminars
4 years
Our next talk: 11/24: Simon S. Du (University of Washington) "Is Reinforcement Learning More Difficult Than Bandits? A Near-optimal Algorithm Escaping the Curse of Horizon" For details, please see the website:
0
11
72
1
3
39
@CsabaSzepesvari
Csaba Szepesvari
4 years
Huge congratulations to my colleagues at @DeepMind ! This is a really awesome achievement!
@GoogleDeepMind
Google DeepMind
4 years
In a major scientific breakthrough, the latest version of #AlphaFold has been recognised as a solution to one of biology's grand challenges - the “protein folding problem”. It was validated today at #CASP14 , the biennial Critical Assessment of protein Structure Prediction (1/3)
134
3K
10K
0
0
41
@CsabaSzepesvari
Csaba Szepesvari
4 years
Please join us and Matthieu to hear about breaking news about how averaging and regularization work together to make your RL algorithms go faster!
@RLtheory
RL Theory Virtual Seminars
4 years
Reminder: this talk is coming up tomorrow! ***Note that the talk starts at 4PM UTC, one hour earlier than our regular time slot*** Public YouTube link: Sign up for the talk on Google Meet:
2
8
28
0
3
41
@CsabaSzepesvari
Csaba Szepesvari
4 years
Gentle reminder, this talk is happening tomorrow! I hope to see many of you there:)
@RLtheory
RL Theory Virtual Seminars
4 years
Our next talk: 06/16: Niao He (UIUC) "A Unified Switching System Perspective and O.D.E. Analysis of Q-Learning Algorithms" For details, please see the website:
0
9
41
0
9
40
@CsabaSzepesvari
Csaba Szepesvari
4 years
It is a great pleasure to have Fei Feng from UCLA speaking at our next seminar. Join us to learn about how to combine RL and unsupervised learning and keep everything provably efficient!
@RLtheory
RL Theory Virtual Seminars
4 years
Our next talk: 07/07: Fei Feng (UCLA) "Provably Efficient Exploration for RL with Unsupervised Learning" For details, please see the website:
0
17
75
0
5
40
@CsabaSzepesvari
Csaba Szepesvari
4 years
Join us on Tuesday to hear from Mengdi about the latest and greatest lower and upper bounds in off-policy evaluation with linear function approximation!
@RLtheory
RL Theory Virtual Seminars
4 years
Our next talk: 08/04: Mengdi Wang (Princeton / DeepMind) "Minimax-Optimal Off-Policy Evaluation with Linear Function Approximation" For details, please see the website:
1
14
55
0
6
40
@CsabaSzepesvari
Csaba Szepesvari
2 years
Huge improvements for the sample complexity of RL for representation learning in low-rank (linear) MDPs! How? Why? Really? Come check out the seminar of Masatoshi Uehara tomorrow! For details follow this link:
0
2
39
@CsabaSzepesvari
Csaba Szepesvari
4 years
We are delighted to have Shie give the next RL Theory Virtual Seminar. I hope to see many of you online at the seminar.
@RLtheory
RL Theory Virtual Seminars
4 years
Our next talk: 06/09: Shie Mannor (Technion) "Adaptive Trust Region Policy Optimization: Global Convergence and Faster Rates for Regularized MDPs" For details, please see the website:
1
13
63
0
4
40
@CsabaSzepesvari
Csaba Szepesvari
2 years
@ylecun @nanjiang_cs Perhaps better to focus on what needs to be done than on who is doing it or whether we call it RL or anything else. But I am glad you recognize that some sort of planning with models (or not?) will be needed! We are on the same page with this one. And Merry Christmas!! 2/2
2
1
39
@CsabaSzepesvari
Csaba Szepesvari
3 years
Pessimism is back on stage! Join the RL Theory Seminars tomorrow to hear from Paria Rashidinejad about *more reasons* of why being pessimistic in the batch RL setting is actually good. Fast rates? Adaptive optimality? Pessimism delivers!
0
5
38
@CsabaSzepesvari
Csaba Szepesvari
5 years
To the attention of strong final year PhD students, junior faculty in CS/Theory/..! Excellent opportunity to stay at Berkeley while the 'Theory of RL' and other programs are happening. Please pass it along to relevant candidates.
0
5
38
@CsabaSzepesvari
Csaba Szepesvari
4 years
Yours truly talks RL.. Thanks @TalkRLPodcast /Robin for having me!!
@TalkRLPodcast
TalkRL Podcast
4 years
Episode 10 @CsabaSzepesvari of DeepMind shares his views on Bandits, Adversaries, PUCT in AlphaGo / AlphaZero / MuZero, AGI and RL, what is timeless, and more!
0
10
48
0
2
38
@CsabaSzepesvari
Csaba Szepesvari
3 years
We are glad to welcome Tadashi! Btw, I still have some openings for postdocs. PM me if you are interested in theoretical foundations of RL, and, more broadly decision making (stay tuned!), or you know someone who could be good!
1
5
36
@CsabaSzepesvari
Csaba Szepesvari
6 years
Yep, good one! We could do more of this: "AI as a field is starving for a few carefully documented failures. [..] I can learn more by just being told why a technique won't work than by being made to read between the lines."
@shakir_za
Shakir Mohamed
6 years
#SundayClassicPaper 📜: McDermott (1976) 'Artificial Intelligence Meets Natural Stupidity'. As we critique our own field, it is useful to see what recurs from the critique of the past. The critique on 'Wishful Mnemonics' seems still relevant.
2
60
191
1
4
36
@CsabaSzepesvari
Csaba Szepesvari
4 years
@MarlosCMachado Great for them! While international universities are great, we should not forget that local universities can also be great. I did all my studies in Hungary and I don't regret this the tiniest bit! I met wonderful, dedicated, caring, knowledgeable profs there, which meant a lot!
3
0
36
@CsabaSzepesvari
Csaba Szepesvari
3 years
In RL being optimistic is often the "right thing" when learning interactively. But what happens in the batch case? Perhaps pessimism is then the best? Come join us next Tuesday to learn the answer and more from Ying Jin! For details check out
0
9
36
@CsabaSzepesvari
Csaba Szepesvari
3 years
Exploration! The hunt for the "right" characterization of sample efficiently learnable RL problem classes is not over yet! Enter the Bellman eluder dimension, which subsumes all that came before, as Qinghua Liu will kindly explain to all of us who care.
0
6
35
@CsabaSzepesvari
Csaba Szepesvari
4 years
@RandomlyWalking my students!
1
1
36