Shiori Sagawa @shiorisagawa Twitter profile

Shiori Sagawa

@shiorisagawa

4 years

We're excited to announce WILDS, a benchmark of in-the-wild distribution shifts with 7 datasets across diverse data modalities and real-world applications. Website: Paper: Github: Thread below. (1/12)

8

205

896

Shiori Sagawa

@shiorisagawa

3 years

We’ve released v1.1 of WILDS, our benchmark of in-the-wild distribution shifts! This adds the Py150 dataset for code completion + updates to existing datasets to make them faster and easier to use. Website: Paper: Thread 👇 (1/8)

2

83

349

Shiori Sagawa

@shiorisagawa

2 years

We’ll be presenting WILDS v2.0 as an oral at ICLR! We extended the WILDS benchmark of real-world shifts by adding unlabeled data, which can be used for domain adaptation and representation learning. Talk + poster: Paper: 🧵

8

60

289

Shiori Sagawa

@shiorisagawa

3 years

We’ll be organizing a NeurIPS Workshop on Distribution Shifts! We’ll focus on bringing together applications and methods to facilitate discussion on real-world distribution shifts. Website: Submission deadline: Oct 8 Workshop date: Dec 13

4

46

232

Shiori Sagawa

@shiorisagawa

3 years

Just in time for ICML, we’re announcing WILDS v1.2! We've updated our paper and added two new datasets with real-world distribution shifts. Website: Paper: ICML: Blog🆕: 🧵(1/9)

2

32

171

Shiori Sagawa

@shiorisagawa

2 years

Join us at the NeurIPS Workshop on Distribution Shifts (DistShift) tomorrow! When: Saturday, Dec 3, 9am-5pm Where: Room 388 - 390 Website: Virtual site:

1

19

118

Shiori Sagawa

@shiorisagawa

2 years

I'm excited to speak at the Principles of Distribution Shift workshop at #ICML2022 tomorrow at 9:50am in Ballroom 3! I'll be talking about extending the WILDS benchmark with unlabeled data. Please join us! The talk will also be streamed at .

0

18

104

Shiori Sagawa

@shiorisagawa

9 months

Join us at the NeurIPS Workshop on Distribution Shifts (DistShift) tomorrow! When: Friday, Dec 15, 9am-5pm Where: Room R06-R09 Website: Virtual site:

1

25

94

Shiori Sagawa

@shiorisagawa

2 years

We're excited to organize the DistShift workshop at NeurIPS 2022! Like last year, we'll focus on real-world shifts and bringing together methods and applications. Please consider submitting to the workshop!

Yoonho Lee

@yoonholeee

2 years

We're organizing the second Workshop on Distribution Shifts (DistShift) at #NeurIPS2022, which will bring together researchers and practitioners. Submission deadline: Oct 3 (AoE) Workshop date: Dec 3 Website:

2

28

127

8

13

70

Shiori Sagawa

@shiorisagawa

3 years

Excited to give a talk at the Rising Star Spotlights Seminar tomorrow at 9am PT! I'll talk about robustness to distribution shifts, focusing on DRO methods and the WILDS benchmark. Please join us, and thank you @trustworthy_ml for having me!

Trustworthy ML Initiative (TrustML)

@trustworthy_ml

3 years

1/ It’s Rising Star Spotlights Seminar ⭐️ time again! For this week’s TrustML seminar, we're delighted to host @shiorisagawa (Stanford) & @p_vihari (IITB) on Thurs Aug 19th 12pm ET 🎉🎉🥳 Register here: See this thread for the speaker & talk details👇

1

5

21

0

5

48

Shiori Sagawa

@shiorisagawa

2 years

Excited to give a talk at the Oxford Women in CS Seminar Series tomorrow, 6/16 at 9am PT! Please join us, and thank you @OxWoCS for having me!

Oxford Women in Computer Science

@OxWoCS

2 years

For our seminar speaker event this Thursday, Shiori Sagawa (@shiorisagawa) from Stanford will be talking about her work on distributionally robust optimization (DRO) as well as the WILDS benchmark 👏👏👏 Time: 5-6pm BST, Thursday, 16th June Sign up:

1

6

18

0

4

43

Shiori Sagawa

@shiorisagawa

2 years

I'll be moderating a breakout session on OOD generalization at the SCIS workshop at #ICML2022 today at 5:45pm. Please stop by Room 340 if you're interested in joining!

0

10

41

Shiori Sagawa

@shiorisagawa

2 years

@zacharylipton My talk was on this paper: ! We saw that success need not transfer across different shifts: domain adaptation algorithms, which work well on certain shifts like photos to sketches in DomainNet, often don’t work on the shifts in the WILDS benchmark.

Extending the WILDS Benchmark for Unsupervised Adaptation

Machine learning systems deployed in the wild are often trained on a source distribution but deployed on a different target distribution. Unlabeled data can be a powerful point of leverage for...

arxiv.org

1

3

25

Shiori Sagawa

@shiorisagawa

4 years

Most WILDS datasets consider the domain generalization setting, which tests generalization to unseen domains. In iWildCam, we train on photos from some camera traps, and test on other camera traps. Goal: classify animal species (for conservation/ecology). (3/12)

2

19
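The domain generalization setup described in this tweet (train on some camera traps, test on others) can be sketched as a split that holds out whole domains. This is only an illustrative sketch and not the WILDS package's own splitting code: the function name `split_by_domain`, the `camera_id` and `species` field names, and the toy records are all assumptions made for the example.

```python
import random


def split_by_domain(examples, domain_key, test_fraction=0.25, seed=0):
    """Hold out entire domains so no test domain is ever seen in training."""
    # Collect the distinct domains (e.g., camera trap IDs) in a stable order.
    domains = sorted({ex[domain_key] for ex in examples})
    rng = random.Random(seed)
    rng.shuffle(domains)

    # Reserve a fraction of the domains (at least one) for testing.
    n_test = max(1, round(len(domains) * test_fraction))
    test_domains = set(domains[:n_test])

    train = [ex for ex in examples if ex[domain_key] not in test_domains]
    test = [ex for ex in examples if ex[domain_key] in test_domains]
    return train, test, test_domains


# Hypothetical toy records: 8 cameras, 3 labelled photos each.
examples = [
    {"camera_id": cam, "species": label}
    for cam in range(8)
    for label in ("deer", "fox", "empty")
]
train, test, held_out = split_by_domain(examples, "camera_id")
```

Splitting at the level of cameras rather than individual photos is what makes the test set out-of-distribution: a random per-photo split would leak every camera into training.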

Shiori Sagawa

@shiorisagawa

4 years

Distribution shifts can cause significant degradation in ML systems deployed in the wild. We worked with domain experts to adapt datasets that reflect these real-world distribution shifts. On each dataset, we show a substantial out-of-distribution performance drop. (2/12)

2

0

18

Shiori Sagawa

@shiorisagawa

4 years

This was joint work with @PangWeiKoh and a team of incredible coauthors: HenrikMarklund @sangmichaelxie MarvinZhang @_bakshay @weihua916 @michiyasunaga @rlanasphillips @sarameghanbeery @jure @anshulkundaje @2plus2make5 @svlevine @chelseabfinn @percyliang (11/12)

1

0

14

Shiori Sagawa

@shiorisagawa

4 years

WILDS is available as an open-source Python package that automates data downloading and processing + has standardized evaluators/leaderboards + default models for all datasets. (9/12)

1

0

14

Shiori Sagawa

@shiorisagawa

3 years

In addition to the v1.2 release, we have a new blog post: . Please check it out! And as always, please reach out if you have any questions or feedback about the benchmark. (7/9)

WILDS: A Benchmark of in-the-Wild Distribution Shifts

A curated benchmark of 10 datasets with real-world distribution shifts.

ai.stanford.edu

1

3

13

Shiori Sagawa

@shiorisagawa

4 years

In Camelyon17, we train on lymph node sections from some hospitals, and test on a different hospital. Goal: predict breast cancer vs. normal tissue. (6/12)

1

12

Shiori Sagawa

@shiorisagawa

4 years

A huge thank you to the many others who generously volunteered their time and expertise to help us: We're actively expanding WILDS. Please let us know if you have any questions, feedback, or if you are interested in contributing a dataset! (12/12)

0

13

Shiori Sagawa

@shiorisagawa

4 years

Other datasets consider the subpopulation shift setting. In CivilComments, we train models to classify toxicity of online comments, and we want equally high performance on different demographic subpopulations (e.g., comments mentioning particular races). (7/12)

1

0

9
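Wanting "equally high performance on different demographic subpopulations" is often measured with a worst-group metric rather than an average. The sketch below shows one such metric under simplifying assumptions: each example belongs to exactly one group, and the function names, labels, and group names are hypothetical. The benchmark's actual groups and evaluation metric may be defined differently.

```python
from collections import defaultdict


def group_accuracies(y_true, y_pred, groups):
    """Return a dict mapping each group to its accuracy."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    return {group: correct[group] / total[group] for group in total}


def worst_group_accuracy(y_true, y_pred, groups):
    """Accuracy on the group where the model performs worst."""
    return min(group_accuracies(y_true, y_pred, groups).values())


# Hypothetical toy predictions over two subpopulations, "a" and "b".
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

per_group = group_accuracies(y_true, y_pred, groups)
average = sum(int(t == p) for t, p in zip(y_true, y_pred)) / len(y_true)
worst = worst_group_accuracy(y_true, y_pred, groups)
```

In this toy data the average accuracy (0.625) hides the fact that group "b" is only at 0.5, which is the kind of disparity a worst-group metric is meant to surface.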

Shiori Sagawa

@shiorisagawa

3 years

The GlobalWheat-WILDS detection dataset comprises images of wheat fields collected from 12 countries around the world. The task is to draw bounding boxes around instances of wheat heads in each image, and the distribution shift is over different locations. (2/9)

1

0

8

Shiori Sagawa

@shiorisagawa

4 years

In PovertyMap, we train on satellite images from some countries, and test on other countries. Goal: estimate asset wealth esp. in rural areas (for development and humanitarian efforts). (4/12)

1

10

Shiori Sagawa

@shiorisagawa

3 years

This was joint work with @PangWeiKoh and a team of incredible coauthors. Special thanks to HenrikMarklund IrenaGao @michiyasunaga @rlanasphillips @sangmichaelxie @sarameghanbeery TonyLee @2plus2make5 for all of their contributions to v1.1! (8/8)

0

9

Shiori Sagawa

@shiorisagawa

4 years

Finally, beyond the application areas above, we also survey distribution shifts in algorithmic fairness benchmarks and other application areas: medicine and healthcare, genomics, natural language and speech processing, code, education, and robotics. (10/12)

1

0

9

Shiori Sagawa

@shiorisagawa

4 years

In OGB-MolPCBA, we train on molecules with particular scaffolds, and test on other molecular scaffolds. Goal: predict biochemical properties (for drug development). (5/12)

1

8

Shiori Sagawa

@shiorisagawa

3 years

This was joint work with @PangWeiKoh and a team of incredible co-authors: . Special thanks to @bertonearnshaw @ImranSHaque @EtienneDavid @IanStavness @guowei_net @_bakshay @anshulkundaje Tony Marvin Henrik for all of their contributions to v1.2! (8/9)

1

0

8

Shiori Sagawa

@shiorisagawa

3 years

The RxRx1-WILDS dataset comprises images of genetically-perturbed cells taken with fluorescent microscopy and collected across 51 experimental batches. The task is to classify the genetic perturbation, and the distribution shift is over different experimental batches. (3/9)

1

8

Shiori Sagawa

@shiorisagawa

3 years

Finally, we have updated the leaderboard submission guidelines and added evaluation scripts and other infrastructure to support submission. For more details on the v1.2 update, please see our release notes: . (6/9)

Releases · p-lambda/wilds

A machine learning benchmark of in-the-wild distribution shifts, with data loaders, evaluators, and default models. - p-lambda/wilds

github.com

1

0

8

Shiori Sagawa

@shiorisagawa

3 years

For more details on the v1.1 update, please see our release notes: . We’re also currently working on a few new datasets that we’re hoping to include in a subsequent release, so please stay tuned! (7/8)

Releases · p-lambda/wilds

A machine learning benchmark of in-the-wild distribution shifts, with data loaders, evaluators, and default models. - p-lambda/wilds

github.com

1

0

7

Shiori Sagawa

@shiorisagawa

3 years

We’ve updated our paper to include results on GlobalWheat-WILDS and RxRx1-WILDS. We’ve also added an analysis of a distribution shift over cell types in a genomic dataset based on the ENCODE-DREAM challenge. (4/9)

1

0

7

Shiori Sagawa

@shiorisagawa

3 years

We’ve added a new dataset Py150. In this code completion dataset, we train on some Github repos and test on unseen repos. We evaluate the accuracy on the subpopulation of class and method tokens, as those are frequent queries in real-world settings. (2/8)

1

0

7

Shiori Sagawa

@shiorisagawa

4 years

Lastly, in the Functional Map of the World, we train models to classify land use on satellite images taken <= 2012 and test on images taken >= 2016. We want equally high performance across geographic regions. (8/12)

1

0

7

Shiori Sagawa

@shiorisagawa

3 years

Distribution shifts can pose significant robustness challenges in ML applications, but these real-world shifts are understudied in the ML research community today. By convening domain experts and methods-oriented researchers, we hope to accelerate research on this topic.

1

0

7

Shiori Sagawa

@shiorisagawa

3 years

It's been really exciting to see all the work done on WILDS, and we're looking forward to seeing all of the future progress! Thank you to all the users who have provided feedback as well. (9/9)

0

6

Shiori Sagawa

@shiorisagawa

3 years

On each of the WILDS datasets, including the two new ones, we show that there is a large gap between in-distribution and out-of-distribution performance. Measuring this gap is an important but subtle problem, and we’ve expanded our discussion on this in the paper (Sec 5). (5/9)

1

0

6

Shiori Sagawa

@shiorisagawa

3 years

We’ve updated the paper to include results on Py150 and other additional baseline experiments. (6/8)

1

0

6

Shiori Sagawa

@shiorisagawa

2 years

DistShift 2022 is jointly organized with @BeccaRoelofs @chelseabfinn @FannyYangETH @hsnamkoong MasashiSugiyama @jacobeisenstein JonasPeters @PangWeiKoh and @yoonholeee. Thank you to everyone who submitted and who helped us review. We hope to see you tomorrow!

0

1

7

Shiori Sagawa

@shiorisagawa

2 years

This was joint work with @PangWeiKoh* @tonyh_lee* IrenaGao*, and @sangmichaelxie @kendrick_shen @ananyaku @weihua916 @michiyasunaga HenrikMarklund @sarameghanbeery @EtienneDavid @IanStavness @guowei_net @jure @kate_saenko_ @tatsu_hashimoto @svlevine @chelseabfinn @percyliang.

1

5

Shiori Sagawa

@shiorisagawa

3 years

We have an exciting lineup of speakers with diverse expertise in different applications and methods! We’ll hear from @aleks_madry, @chelseabfinn, @stats_tipton, @emwebaze, JonasPeters, MasashiSugiyama, and @suchisaria.

1

0

4

Shiori Sagawa

@shiorisagawa

2 years

... - domain-adjusted regression for domain generalization (@ElanRosenfeld PradeepRavikumar @risteski_a) - distribution shifts in federated learning (@KrishnaPillutla @LaguelYassine JeromeMalick ZaidHarchaoui) - data feedback loops (@rtaori13 @tatsu_hashimoto)

1

0

4

Shiori Sagawa

@shiorisagawa

3 years

We’ll also have a panel discussion on future directions on robustness to distribution shifts. We’re very excited to hear from @AndyBeck, @jamiemmt, @judyfhoffman, and @tatsu_hashimoto!

1

0

4

Shiori Sagawa

@shiorisagawa

3 years

All of our baseline experiments are now available on @CodaLabWS. For reproducibility, this includes the exact commands used to run baseline experiments as well as all experiment outputs, including model parameters. (5/8)

1

0

4

Shiori Sagawa

@shiorisagawa

2 years

In addition, we'll have 6 spotlight talks on: - theory of domain generalization (@kefandong @tengyuma) - economic prediction benchmark (@keyonV @EmilPalikot TianyuDu @AyushKanodia @Susan_Athey DavidBlei) - invariant predictors (KangDu YuXiang) ...

1

0

4

Shiori Sagawa

@shiorisagawa

3 years

This workshop is jointly organized with @PangWeiKoh, FannyYang, @hsnamkoong, JiashiFeng, @kate_saenko_, @percyliang, @slbird, and @svlevine. Please reach out to distshift-workshop-2021@googlegroups.com for any questions, and we hope you’ll submit to and attend our workshop!

0

4

Shiori Sagawa

@shiorisagawa

2 years

Unlabeled data is a powerful source of leverage for improving out-of-distribution (OOD) performance. For example, existing domain adaptation algorithms improve OOD performance on standard domain adaptation benchmarks, such as shifting from photos to sketches in DomainNet.

1

0

4

Shiori Sagawa

@shiorisagawa

3 years

We updated some of the existing datasets and default models to make them significantly faster and easier to use. For most datasets, the training time is now less than 10 hours (on a V100). (3/8)

1

0

4

Shiori Sagawa

@shiorisagawa

3 years

@CianEastwood @sarameghanbeery Thanks! That's a great question -- we think these settings are promising directions for improving out-of-distribution performance too. We're hoping to look into extending the benchmark to support these in the future, and we'd be very interested if you do explore them!

0

4

Shiori Sagawa

@shiorisagawa

2 years

Special thanks to @tonyh_lee for overseeing all the infrastructure for the experiments and leaderboard! We’re also grateful to the many others who helped us with WILDS and the v2.0 update: .

1

3

Shiori Sagawa

@shiorisagawa

3 years

Some of these changes are breaking changes that will impact users who are currently running experiments with WILDS. Sorry about the inconvenience, and we ask all users to update their package. At this time, we don’t expect to make further changes to the existing datasets. (4/8)

1

0

3

Shiori Sagawa

@shiorisagawa

2 years

These results tell us that success doesn’t necessarily transfer across different types of distribution shifts, and it’s important to develop and evaluate algorithms on a wide range of distribution shifts. And there’s much work to be done to be robust to shifts in WILDS!

1

0

3

Shiori Sagawa

@shiorisagawa

2 years

We’re excited to see all the work using WILDS so far! Our leaderboard has a variety of approaches: transformations and augmentations (MBDG, LISA), invariance and distributional robustness (IID repr learning, CGD, Fish), ensembling (Model Soups), and test time adaptation (ARM).

1

0

3

Shiori Sagawa

@shiorisagawa

2 years

How well do successes on standard benchmarks transfer to other shifts, like those in WILDS? Shifts such as photos to sketches are useful, challenging diagnostics, but at the same time, there are many other types of shifts in the wild that we also want to make progress on.

1

0

3

Shiori Sagawa

@shiorisagawa

2 years

Erin Hartman on external validity in the social sciences; Alicia Wassink on demographic disparities in automatic speech recognition systems; and @sarameghanbeery on geospatial shifts in ecology and conservation.

1

3

Shiori Sagawa

@shiorisagawa

3 years

Please submit to our workshop! The submission deadline is on Oct 8, with an option to sign up for the mentorship program by late September. We’re broadly interested in methods, evaluations and benchmarks, and theory for distribution shifts, especially real-world ones.

1

0

3

Shiori Sagawa

@shiorisagawa

9 months

DistShift 2023 is jointly organized with @BeccaRoelofs @FannyYangETH @hsnamkoong MasashiSugiyama @jacobeisenstein @PangWeiKoh @tatsu_hashimoto and @yoonholeee. We hope to see you tomorrow!

0

3

Shiori Sagawa

@shiorisagawa

4 years

@josh_tobin_ Great question! For our datasets, the domain information is very easy to obtain (e.g., camera IDs come for free for iWildCam). So OOD detection on our datasets might not have great real-world motivations, although they could potentially be reasonable test beds.

0

2

Shiori Sagawa

@shiorisagawa

2 years

Beyond method development, it’s also been nice to see WILDS being used to study empirical trends on distribution shifts. For example, @oliviawiles1 et al. evaluate an extensive set of methods on WILDS and other benchmarks in .

2

0

2

Shiori Sagawa

@shiorisagawa

9 months

Our focus this year is distribution shifts in the context of foundation models. We're excited to explore the new challenges and approaches for distribution shifts raised by foundation models!

1

0

2

Shiori Sagawa

@shiorisagawa

2 years

Finally, check out the poster session from 1-2:30! We have 90 accepted papers this year.

1

0

2

Shiori Sagawa

@shiorisagawa

2 years

We'll then have a panel on future directions, featuring @bneyshabur, @david_sontag, Erin Hartman, and Pradeep Ravikumar. Our panelists span various applications (e.g., medicine, social sciences, reasoning) and methods (e.g., domain generalization, foundation models, causality).

1

0

2

Shiori Sagawa

@shiorisagawa

2 years

We then benchmarked representative domain adaptation methods, including domain-invariant, self-training, and self-supervised methods. Unlike on DomainNet, these algorithms often do even worse than standard training on WILDS, despite using additional unlabeled data.

1

0

1

Shiori Sagawa

@shiorisagawa

3 years

@wiebketous Thank you for the question, and yes, that's definitely something we're interested in! We welcome characterization of distribution shifts across various application areas, and we're not necessarily looking for novel solutions. We'll clarify this on the website.

1

0

2

Shiori Sagawa

@shiorisagawa

2 years

WILDS is available as a Python package, where you can use the WILDS datasets and its unlabeled data in just a few lines of code. We also have a leaderboard at , both for submissions with and without unlabeled data.

1

0

2

Shiori Sagawa

@shiorisagawa

2 years

@ZhongingAlong @Princeton Congratulations Ellen!! This is super exciting, and looking forward to all the great work to come from your group!

1

0

2

Shiori Sagawa

@shiorisagawa

2 years

Finally, our talk will be on Wednesday at 10am PT, and our poster will be on Thursday from 6:30am to 8:30pm PT. Please check it out, and hope to see you there!

0

1

Shiori Sagawa

@shiorisagawa

2 years

We're excited to hear from our 6 invited speakers: Mingsheng Long on transfer learning; @Reichstein_BGC on distribution shifts in earth and climate sciences; Pradeep Ravikumar on distributionally robust optimization; ...

1

0

1

Shiori Sagawa

@shiorisagawa

2 years

Our focus is on real-world distribution shifts, and we hope to bring together various communities that have been working on this topic, connecting methods and applications.

1

0

1

Shiori Sagawa

@shiorisagawa

2 years

To answer the above question, we added unlabeled data to 8 WILDS datasets, while keeping all labeled data and evaluation metrics unchanged. These unlabeled data can come from source domains, target domains, or extra domains that are in neither the training nor test distribution.

1

0

1