Cohere is growing! If you’re passionate about building world-class LLMs and delivering them to customers, you should apply. I’m specifically looking for folks with experience in NLP data, eval, and annotation. Check out the roles here:
DMs are open!
Our new work "Asking and Answering Questions to Evaluate the Factual Consistency of Summaries" does exactly that. We use question generation and question answering models to evaluate whether summaries are factually consistent w/ the source text.
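For a sense of the pipeline's shape, here is a rough sketch of a QG/QA-style consistency check like the one described above: generate questions from the summary, answer them against both the summary and the source, and compare answers. The question-generation checkpoint name is a placeholder, and the scoring details (sampling, token-F1 answer comparison) are illustrative rather than the exact setup from the paper.

```python
# Illustrative QG/QA consistency sketch (not the paper's exact models or settings).
from collections import Counter
from transformers import pipeline

qg = pipeline("text2text-generation", model="YOUR_QG_CHECKPOINT")  # placeholder QG model
qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

def token_f1(a: str, b: str) -> float:
    """SQuAD-style token-overlap F1 between two answer strings."""
    a_toks, b_toks = a.lower().split(), b.lower().split()
    common = sum((Counter(a_toks) & Counter(b_toks)).values())
    if not a_toks or not b_toks or common == 0:
        return 0.0
    p, r = common / len(a_toks), common / len(b_toks)
    return 2 * p * r / (p + r)

def consistency_score(source: str, summary: str, n_questions: int = 5) -> float:
    # 1) Generate questions conditioned on the summary.
    outputs = qg(f"generate questions: {summary}",
                 num_return_sequences=n_questions, do_sample=True)
    questions = [o["generated_text"] for o in outputs]
    # 2) Answer each question against the summary and the source, 3) compare answers.
    scores = []
    for q in questions:
        ans_summary = qa(question=q, context=summary)["answer"]
        ans_source = qa(question=q, context=source)["answer"]
        scores.append(token_f1(ans_summary, ans_source))
    return sum(scores) / len(scores) if scores else 0.0
```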
🎉🎉🎉
Also, I'm hiring for an MLE/SWE! If you want to build LLMs with
@cohere
and are interested in developing challenging model evaluation settings + curating high-quality data, please reach out!
ICYMI: We also just opened our NYC office 👀
[Arena Update]
@cohere
's Command R is now top-10 on the Arena leaderboard🔥
It's now one of the best open models, reaching the level of top proprietary models. We find the model great at handling longer context, which we plan to separate out as a new category in Arena very soon.
Hello. I am popping up from Twitter lurking to claim the "Longest Time Between Life Update and Actually Announcing It" award:
I graduated from NYU this May and started working at
@CohereAI
in August as a tech lead for Data+Evaluation!
📣 We heard you liked the open weights we dropped last month, so we're doing it again, except more.
🎉 Introducing Command R+! 🎉 Really proud of what we've built and excited to see what y'all build on top of this!
⌘R+
Welcoming Command R+, our latest model focused on scalability, RAG, and Tool Use. Like last time, we're releasing the weights for research use; we hope they're useful to everyone!
Already tired of months-old papers at ACL? Looking for a hot, new preprint?
Check out SQuALITY 💨🍵!
SQuALITY is a long-document, question-focused summarization dataset. Unlike many existing summ. datasets, SQuALITY summaries are fully crowdsourced!
(1/8)
It's true, I successfully defended my dissertation yesterday! Big thanks to
@hhexiy
,
@ml_perception
,
@JoaoSedoc
for serving on the committee, and an especially big thank you to my advisors
@sleepinyourhat
and
@kchonyc
for advising and supporting me over the past five years.
there is an unreasonable number of "alex wang"s in the LLM space, between myself at Cohere, an Alex Wang at Perplexity,
@alexandr_wang
at Scale...truly blursed.
s/o Alex L. Wang for once maintaining a disambiguation of "alex wang"s in ML
Excited to share this work with the world, both the results and the actual model weights. Looking forward to seeing what the community will build with this! Stay tuned for more!
✍️details:
⚖️weights:
🤖chat:
⌘-R
Introducing Command-R, a model focused on scalability, RAG, and Tool Use. We've also released the weights for research use; we hope they're useful to the community!
I live in Toronto now and
#ACL2023NLP
happens to be here too! If you want to chat about LLMs, where to eat/drink in Toronto, or opportunities at
@cohere
, feel free to reach out or stop by the Cohere booth!
This drove me crazy for a while: We had internal experiments showing RM > LLM for evaluation, which felt really counterintuitive to me. Nice to get external confirmation, and thanks for building the benchmark
@natolambert
! :)
If you're interested in working with us shoot me a DM or email about yourself and something cool you've worked on recently! I'm looking for people interested in LLM evaluation and data creation, but we have plenty of other roles.
The new NYC office is sweet!!
Introducing Rerank 3! Our latest model focused on powering much more complex and accurate search.
It's the fastest, cheapest, and highest-performing reranker available. We're really excited to see how this model influences RAG applications and search stacks.
The SustaiNLP2020 (at
@emnlp2020
) Call for Submissions is up at . Submissions are evaluated on SuperGLUE performance and energy efficiency as measured by
@PeterHndrsn
's library. Come develop more energy-efficient NLP models! The deadline is Aug 28, and baseline code will be available soon!
New-ish paper for ACL 2019 comparing a diverse set of tasks for pretraining sentence encoders and augmenting existing pretrained LMs, made possible by great collaborators from Brown, Google, JHU, and many more, as well as oodles of compute.
Come hear me attempt to recap a couple years of progress in NLP in 5m, or tell me your favorite glue puns at the poster session immediately afterwards!
#NeurIPS2019
, catch the spotlight on our recently created SuperGLUE benchmark, which helps language understanding researchers set a new, higher bar for
#NLP
research. It's Wed 4:55-5:00 PM West Ballrooms A + B. Read more:
Benchmark:
🚨PREPRINT ALERT🚨
"On the Challenges of Using Black-Box APIs for Toxicity Evaluation in Research" w/ Beyza Ermis,
@PSH_Lewis
, and
@sarahookr
Paper:
Code:
Excited to be co-hosting a mentoring session on "establishing collaborations and networking" and "managing up" with
@ryanzhumich
and
@yangfeng_ji
for
#acl2020nlp
on 7/8 at 12pm ET. I imagine there will be a lot of learning from this session, especially by me😅
Sam is an amazing advisor and human being! This is incredibly deserved. The lab is also looking to hire researchers at various levels, and that's a great opportunity to work with Sam and the rest of us!
Excited to share this work! We look at (1) methods for measuring bias in word embeddings applied to sentence encoders, finding that these methods don't straightforwardly apply, and (2) tests for nuanced social biases that are difficult or impossible to study at the word level
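As background for (1), here is a minimal sketch of the word-level association test (a WEAT-style effect size) that this line of work builds on and adapts to sentence encoders. The vectors and target/attribute sets are supplied by the caller; this illustrates the general recipe rather than the paper's exact procedure.

```python
# WEAT-style effect size over embedding vectors (illustrative sketch).
import numpy as np

def cos(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def association(w, A, B):
    # Differential association of embedding w with attribute sets A and B.
    return np.mean([cos(w, a) for a in A]) - np.mean([cos(w, b) for b in B])

def weat_effect_size(X, Y, A, B):
    # Standardized difference of mean associations for target sets X and Y.
    x_assoc = [association(x, A, B) for x in X]
    y_assoc = [association(y, A, B) for y in Y]
    pooled = np.std(x_assoc + y_assoc, ddof=1)
    return (np.mean(x_assoc) - np.mean(y_assoc)) / pooled
```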
The birds-of-a-feather session on generation at
#acl2020nlp
was awesome! Discussion was lively and spawned *multiple* followup discussions. Kudos to
@sebgehr
,
@gh_marjan
, and another moderator whose name I missed!
There's another one in a few hours (5pm ET), highly recommended!
I'll also be presenting "Asking and Answering Questions to Evaluate the Factual Consistency of Summaries" (joint work with
@kchonyc
and
@ml_perception
) at sessions 9A (7/7, 1pm ET) and 10B (7/7, 5pm ET). Come chat and hang out!
The past four months have been a blitz of fast, fun, and cool projects. And I've been fortunate to learn from
@egrefen
@Nils_Reimers
Phil Blunsom and many others. There's cool stuff from Cohere on the horizon that I'm excited to share soon.
This is being presented at
@emnlpmeeting
on Friday at Session 2! Sadly none of us (
@yzpang97
,
@_angie_chen
,
@zhansheng
,
@sleepinyourhat
) could make it to Abu Dhabi, but feel free to reach out if you have questions or want to talk about summ., data quality, or crowdsourcing!
We're throwing 4 hackathons at each of our offices around the world!! If you're in NYC, London, Toronto, or SF come hang out with us and build with Command R and R+ 🛠️
There's a lot to do: making use of the multiple references, developing efficient human evaluation of long texts, and enabling long-text summ. with prompting. If this sounds interesting to you, check out the links below:
paper:
data:
(7/8)
We're now accepting applications for the 6th CSLI Undergraduate Summer Internship Program, which places students in Stanford labs for 8 weeks of mentored research. Housing and a stipend provided. Prior research experience not required:
Protein language models (pLMs) can give protein sequences likelihood scores, which are commonly used as a proxy for fitness in protein engineering. But what do likelihoods encode?
In a new paper (w/
@JacobSteinhardt
) we find that pLM likelihoods have a strong species bias!
1/
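For context on what a "likelihood score" means in practice, here is a minimal sketch of a masked-LM pseudo-log-likelihood for a protein sequence: mask each residue in turn and sum the log-probability the model assigns to it. The ESM-2 checkpoint is chosen purely for illustration and is not necessarily the scoring setup used in the paper.

```python
# Pseudo-log-likelihood of a protein sequence under a masked protein LM (sketch).
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model_name = "facebook/esm2_t6_8M_UR50D"  # small ESM-2 checkpoint, for illustration
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name).eval()

@torch.no_grad()
def pseudo_log_likelihood(sequence: str) -> float:
    ids = tokenizer(sequence, return_tensors="pt")["input_ids"]
    total = 0.0
    # Skip the special tokens at the start and end (CLS/EOS in ESM tokenizers).
    for pos in range(1, ids.shape[1] - 1):
        masked = ids.clone()
        masked[0, pos] = tokenizer.mask_token_id
        logits = model(input_ids=masked).logits
        log_probs = torch.log_softmax(logits[0, pos], dim=-1)
        total += log_probs[ids[0, pos]].item()
    return total

print(pseudo_log_likelihood("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"))
```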
Also, while I have you here, consider taking the NLP Community Metasurvey! Having an opinion is fun and seeing how your opinion lines up with the rest of the community is extra fun!
Inspecting the generated questions, we were surprised to find that they are often fluent, on-topic, and sensible. Nvidia has a great paper pushing on the question generation capabilities of existing models: .
SQuALITY is question-focused and multi-reference: For each story there are 5 questions, and for each question there are 4 reference summaries. The responses are highly diverse, an aspect of summarization that isn't well-represented in existing single-reference datasets. (4/8)
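Purely to illustrate that structure (field names here are hypothetical, not the released schema), one example might be represented roughly like this:

```python
# Hypothetical representation of one SQuALITY example (illustrative schema only).
from dataclasses import dataclass
from typing import List

@dataclass
class QuestionEntry:
    question: str
    references: List[str]  # 4 crowdsourced reference summaries per question

@dataclass
class StoryExample:
    story: str                      # Project Gutenberg story, roughly 4-6k words
    questions: List[QuestionEntry]  # 5 question-focused prompts per story
```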
Probably one of the best decisions I've made in the past five years has been to do my PhD at NYU. It's a great place to do cutting-edge ML and NLP research. Not to mention it's in NYC!
We spent several months working with Upwork writers and undergraduates to create summaries of Project Gutenberg stories (4-6k words long). We put a big focus on developing a protocol for collecting text responses that is cost-efficient while also maintaining quality. (3/8)
#EMNLP2020
is great, but it can be challenging to engage with so much research when it's getting late and you've spent most of the day "at" the conf... shout out to
@gregd_nlp
for putting the Language Generation session on his back and keeping the questions+discussion flowing😅
Human evaluators consider human-written summaries to be substantially better than summaries from state-of-the-art supervised summarization systems along several dimensions. Also, automatic metrics are a poor indicator of model quality for SQuALITY. (5/8)
The group is very collaborative and supportive, and is pursuing excitingly risky and fun lines of research. I highly recommend collaborating with the folks here and visiting whenever you're able!
Mondays are dumb.
Cloudy Mondays are dumb.
Damn, cloudy Mondays are dumb.
Those damn cloudy Mondays are dumb.
Dominate those damn cloudy dumb Mondays.
Tl;dr: dom dem dam dim dum days
#MondayMotivation
Common approaches for building summarization datasets (scraping, developing heuristics) have led to unexpected amounts of noise in the datasets. Crowdsourcing summaries is expensive (and consequently understudied), but, done carefully, it is one way to mitigate noise. (2/8)
Using NLP models to evaluate generated text is a promising direction, but it's clear there is a lot of (exciting!) work to be done to make these methods reliable.
This method correlates much better with human judgments of consistency than existing metrics on the XSUM and CNN/DM summarization datasets. Our method is especially effective on the latter, likely due to the somewhat extractive nature of the dataset.
Hongyao and Eric TA'd several of my classes, and I can attest that they are super smart and kind people working on exciting problems. I remember talking with Hongyao about strategic Doodle voting, the lessons of which I continue to use today. Congrats
@hongyaoma
and Eric!
The ACM SIGecom Dissertation Award for 2019 goes to Hongyao Ma
@hongyaoma
, with honorable mentions going to Rediet Abebe
@red_abebe
and Eric Balkanski. Read more about their dissertations here:
@rajammanabrolu
@natolambert
Mostly believing in the magic of a general-purpose LLM working better than a smaller task-specific model, nothing especially principled
@joechoochoy
@idavidrein
Anecdotally, there are some activities (mostly cognitively intensive games) where I'm starving afterwards even though I've only really been thinking
On the other hand, we find that the bottleneck in our metric is the QA model breaking down, despite the models being pretrained and finetuned on data sources quite similar to the test setting.