📣 Life update: I’ve joined OpenAI and am hiring researchers! 💥
I’m immensely grateful to all of my teammates at Google/DeepMind over the last ~5 years; you all have taught me so much. I’m excited to continue marching towards our shared mission to enable universal access to
Excited to share our newest work! 📝 Evaluation of LLMs is hard, especially for health equity. We provide a multifaceted human assessment framework, 7 newly-released adversarial datasets, and perform the largest human eval study on this topic to date. 🧵:
Excited to share the Med-PaLM 2 preprint! Physicians preferred Med-PaLM 2 answers over physician answers on eight of nine clinically relevant axes. Med-PaLM 2 also scored 86.5% on the MedQA licensing-exam-style benchmark (SOTA), 19% over Med-PaLM. 😁
Excited to share wider availability of our medical LLMs. It's been an exciting arc, from training the first Med-PaLM model this time last year, to Med-PaLM 2 and our trusted tester program just a few months later in April, and now more availability. Thanks to the team!!
📢 Big #HealthAI news: Our latest medically tuned model is here, and it's available to allowlisted @GoogleCloud customers. 🙌
Meet MedLM. It's a suite of models, built on Med-PaLM 2, that helps answer medical questions, summarize information, and more:
Today we announced our new medical LLM, Med-PaLM 2. On MedQA (USMLE), Med-PaLM 2 achieves accuracy of over 85%, going from a passing score to expert performance. Med-PaLM 2 beats our own previous SOTA by 18%.
With Tao Tu, @Mysiak, @vivnat, @AziziShekoofeh, @alan_karthi.
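As a rough illustration of how a multiple-choice benchmark like MedQA is scored, here is a minimal accuracy sketch. Note that `ask_model` is a hypothetical stub standing in for a real model call, and the sample item is invented for illustration:

```python
# Sketch: scoring a model on MedQA-style multiple-choice questions.
# `ask_model` is a hypothetical stand-in for a medical LLM call;
# real MedQA items have four to five answer options keyed A-E.

def ask_model(question: str, options: dict[str, str]) -> str:
    """Hypothetical model call; returns an option key like 'B'."""
    return "B"  # stubbed answer for illustration

def accuracy(dataset: list[dict]) -> float:
    """Fraction of questions where the model picks the keyed answer."""
    correct = sum(
        1 for item in dataset
        if ask_model(item["question"], item["options"]) == item["answer"]
    )
    return correct / len(dataset)

sample = [
    {"question": "Which vitamin deficiency causes scurvy?",
     "options": {"A": "Vitamin A", "B": "Vitamin C", "C": "Vitamin D"},
     "answer": "B"},
]
print(accuracy(sample))
```

Reported accuracies like 86.5% are simply this fraction computed over the full benchmark.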
Excited to share Med-PaLM Multimodal (Med-PaLM M), the first demonstration of generalist biomedical AI, a single model that can perform a range of biomedical tasks. Work from our fantastic team @GoogleAI @GoogleHealth @GoogleDeepMind.
Excited for this to come out! AMIE is a research system for diagnostic reasoning and conversations. In a double-blinded crossover study (kind of like a "medical Turing test"), it outperformed primary care physicians!
Happy to introduce AMIE (Articulate Medical Intelligence Explorer), our research LLM for diagnostic conversations. AMIE surpassed primary care physicians in conversational quality & diagnostic accuracy in a "virtual OSCE"-style randomized study. Preprint ➡️ (1/7)
Excited to announce our latest work exploring the potential of LLMs for differential diagnosis, including a human-in-the-loop study on real-world cases! See below thread for details. Grateful to work with such amazing teammates @GoogleAI @GoogleDeepMind @GoogleHealth.
*New Research Paper* - Diagnostic conundrums are an unsolved grand challenge for AI. We present a new research LLM optimized for differential diagnosis (DDx), tested in @NEJM challenges. Our LLM outperformed clinicians & other LLMs... (1/6)
Med-PaLM, a medical large language model from @Google, achieved a notable feat by exceeding the passing USMLE score early on. We were fortunate to have @thekaransinghal join us & deliver an insightful talk about Med-PaLM to our lab.
Catch Karan's talk at:
Through better alignment with the requirements of the medical domain, we also observe exciting improvements on other tasks, including long-form medical question answering.
Blog post:
We will share a preprint soon!
🪩 The @stateofaireport 2023 is now here.
Our 6th installment covers one of the most exciting years I can remember. The #stateofai report has everything you *need* to know across research, industry, safety and politics.
There’s lots in there, so here’s my director’s cut 🧵
2/ LLMs have immense potential to widen access to medical expertise, especially in global health settings. But without evaluation and mitigation of potential harms, these systems could widen persistent gaps in health outcomes. Existing tools for evaluation are limited.
3/ To bridge the gap, we developed a 3-part human assessment framework. We used multiple complementary methods, including a participatory approach, physician focus groups, actual Med-PaLM 2 failures, and iterative pilot evaluations to expand coverage across 6 dimensions of bias.
4/ We’re introducing EquityMedQA, 7 newly-released adversarial medical question answering datasets. They represent a portfolio of approaches for adversarial testing, including curation based on known issues, red teaming based on Med-PaLM 2 failures, and LLM-based generation.
5/ Finally, we applied our adversarial datasets and assessment rubrics to evaluate Med-PaLM 2. To increase coverage, we involved 806 raters across three rater groups: physicians, health equity experts, and consumers, for a total of 17k+ ratings.
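The multi-group rating aggregation in a study like this can be sketched as follows. The rater records, group names, and bias dimensions below are illustrative placeholders, not the actual EquityMedQA data:

```python
# Sketch: aggregating adversarial-eval ratings by rater group and
# bias dimension. All records here are invented for illustration.
from collections import defaultdict

ratings = [
    {"group": "physician",     "dimension": "stereotyping", "biased": True},
    {"group": "physician",     "dimension": "stereotyping", "biased": False},
    {"group": "equity_expert", "dimension": "stereotyping", "biased": True},
    {"group": "consumer",      "dimension": "withholding",  "biased": False},
]

def bias_report_rates(records):
    """Fraction of ratings flagging bias, per (group, dimension)."""
    counts = defaultdict(lambda: [0, 0])  # key -> [flagged, total]
    for r in records:
        key = (r["group"], r["dimension"])
        counts[key][0] += r["biased"]  # bool counts as 0/1
        counts[key][1] += 1
    return {k: flagged / total for k, (flagged, total) in counts.items()}

for key, rate in sorted(bias_report_rates(ratings).items()):
    print(key, f"{rate:.2f}")
```

Comparing these per-group rates is one simple way to see how different rater groups surface different potential biases.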
We're accepting applications for a research intern @GoogleAI to work on a project applying large language models (LLMs) to medical AI! Please apply here and reach out once you're team matching.
@vivnat @alan_karthi @AziziShekoofeh
Now our preprint for Med-PaLM 2 is up: We see a 19% improvement on the USMLE-style task, and answers to consumer queries are preferred over physician answers across eight of nine axes studied (factuality, harm, bias, ...).
6/ Different datasets, assessments, and rater groups surfaced different potential biases, suggesting the importance of using multiple complementary approaches. We identified new potential harms not measured in our previous bias evals.
Most importantly, through instruction prompt tuning, it produced greatly improved long-form answers to consumer queries, often comparable to physicians'. 92.6% of Med-PaLM answers were aligned with scientific consensus, compared to 92.9% for clinicians (baseline model: 61.9%).
Our results indicate rapid progress towards physician-level performance in medical question answering, highlighting the importance of both evaluation frameworks and alignment of models to societal values as we think about potential real-world impact of this technology.
7/ Some other personally interesting bits: (i) LLM-generated datasets surfaced potential biases, although differently than manually curated sets, (ii) Med-PaLM 2 answers were preferred more often than either Med-PaLM or physician answers.
8/ While our tools can surface potential biases in LLM-generated answers to medical questions, further evaluation contextualized to specific clinical settings is needed to assess whether deployment of these systems promotes equitable health outcomes.
9/ We’ve included all EquityMedQA adversarial questions and assessment rubrics with the preprint. We hope that the broader health AI community builds on these tools to realize our shared goal of systems that promote high-quality healthcare for all.
When we started this work, we set out to better understand the potential of building safe foundation models for medicine. We put together MultiMedQA, a benchmark of 7 medical question answering tasks spanning medical exams, medical research, and consumer queries.
Interested in learning more about the latest research on ML and analytics on decentralized data? Join @EmilyGlanz, @MatharyCharles, @KairouzPeter, myself, and others on Nov 10th for the Federated Learning and Analytics Research Workshop. Register below:
Excited to push the forefront of multimodal LLMs for Medicine!
We previewed an ambitious generalist approach with Med-PaLM M last week as the first demonstration of a generalist biomedical AI system that flexibly encodes and integrates multimodal biomedical data.
We started our team to catalyze the medical AI community and work on building more steerable, safe systems in a context where safety matters, in partnership with researchers, physicians, policymakers, and others. We're excited to share this milestone on our journey.
Biomedicine is highly multimodal, and Med-PaLM M is a multitask, multimodal large language model that achieves performance near or exceeding SOTA on (visual) question answering, radiology report generation, genomics variant calling, and more.
Moving forward, as biomedical models become more capable, it becomes more crucial to measure and mitigate safety risks, including hallucinated medical information and harmful uses of biological knowledge. We’re excited about grounding our safety research in this setting.
Moving beyond automated evaluation is crucial for safe real-world impact. In human evaluation of generated radiology reports, clinicians preferred model-produced reports over radiologist-produced reports 40.5% of the time on average, suggesting potential future clinical utility.
When we observed model limitations, we worked with physicians to train Med-PaLM, a state-of-the-art large language model aligned to the medical setting. It surpassed the passing score on US medical licensing exam-style questions for the first time.
Med-PaLM 2 improves Med-PaLM across multiple-choice and long-form medical question answering by leveraging PaLM 2, domain-specific tuning, and prompting strategies (new: ensemble refinement). We provide an overview here:
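A minimal sketch of the ensemble refinement idea mentioned above: sample several reasoning chains, then re-query the model conditioned on its own sampled chains. `sample_completion` is a hypothetical stand-in for a model call, and the details here differ from the actual Med-PaLM 2 recipe:

```python
# Sketch of two-stage "ensemble refinement" prompting.
# `sample_completion` is a hypothetical stochastic LLM stub; a real
# implementation would call a model API with temperature sampling.
import random

def sample_completion(prompt: str, temperature: float = 0.7) -> str:
    """Hypothetical stochastic LLM call (stubbed for illustration)."""
    return random.choice(["Answer: B", "Answer: B", "Answer: C"])

def ensemble_refine(question: str, n_chains: int = 5) -> str:
    # Stage 1: sample multiple independent reasoning chains.
    chains = [sample_completion(question) for _ in range(n_chains)]
    # Stage 2: condition on the question plus all sampled chains and
    # ask for a single refined answer.
    refinement_prompt = (
        question + "\n\nCandidate reasoning:\n" + "\n".join(chains)
    )
    return sample_completion(refinement_prompt, temperature=0.0)
```

The intuition is that the second pass can reconcile disagreements among the sampled chains rather than relying on a simple majority vote.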
Physician eval shows Med-PaLM 2’s answers to common consumer medical questions were preferred over physicians' across eight of nine axes. For example, answers were preferred for alignment with medical consensus 72.9% of the time, and for better knowledge recall 80.1% of the time.