Laura Ruis

@LauraRuis

3,480
Followers
668
Following
47
Media
889
Statuses

Currently research intern @cohere , PhD supervised by @_rockt and @egrefen . Language and LLMs. Spent time at FAIR, Google, and NYU ( @LakeBrenden ). She/her.

London
Joined October 2019
Pinned Tweet
@LauraRuis
Laura Ruis
2 years
Before giving up on Twitter, check out our recent finding 🔥 LLMs are *not* zero-shot communicators! We show a limitation of #LLMs in interpreting language given the context of shared experience of the world, a ubiquitous part of communication ⬇️🧵 1/22
12
97
482
@LauraRuis
Laura Ruis
2 years
"Is your name GPT-3? Because you're the third most impressive thing I've seen today."
Tweet media one
36
115
1K
@LauraRuis
Laura Ruis
2 years
when you debug your NN for 2 days until you realise the lr is 1e3 instead of 1e-3 🙂🆒
36
15
891
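The bug in that tweet is easy to reproduce. A minimal sketch (plain gradient descent on f(x) = x², not the author's actual training code; the function name is hypothetical) shows how slipping the sign of the exponent turns slow convergence into immediate explosion:

```python
def gd_final_loss(lr, steps=20):
    """Run `steps` of gradient descent on f(x) = x^2 from x = 1; return final loss."""
    x = 1.0
    for _ in range(steps):
        x -= lr * 2 * x  # gradient of x^2 is 2x
    return x * x

# lr = 1e-3 shrinks the loss a little each step; lr = 1e3 blows it up by ~2000x per step.
```

A quick sanity check that the loss decreases over the first few steps catches this whole class of bug in minutes rather than two days.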
@LauraRuis
Laura Ruis
2 years
Some of the best #advice I got early in my PhD was from @rockt urging me to set up a good note-keeping system. One year in, my notes have helped me in so many situations! It took a few iterations, and I'll share the things that were most helpful for me personally here ⬇️🐊 1/13
14
73
562
@LauraRuis
Laura Ruis
1 year
Just ran 5 km without ChatGPT's help
17
36
452
@LauraRuis
Laura Ruis
3 years
When you have 2 phd advisors.
Tweet media one
13
20
444
@LauraRuis
Laura Ruis
1 year
ok sorry llama2
Tweet media one
8
26
307
@LauraRuis
Laura Ruis
3 years
I reproduced part of the Learning in High Dimension paper by @randall_balestr , @an_open_mind , & @ylecun . Join me on my journey towards understanding that *we shouldn't use interpolation (as def. in the paper) to explain models' generalization skills*:
Tweet media one
4
32
208
@LauraRuis
Laura Ruis
4 years
Yay! I accepted a PhD @ucl_dark with the most awesome advisors ( @egrefen & @_rockt )🥳! My mom thinks that after my PhD this image 👇will be classified as 'tiny horse', exhibiting clear common-sense knowledge of the size of microphones and horses so nothing but great expectations.
Tweet media one
10
4
199
@LauraRuis
Laura Ruis
2 years
This was *definitely* my highlight of NeurIPS 2022! After watching ML street talk countless hours myself (proof: see my recent raving tweet on their Chomsky episode), it’s pretty crazy to see myself featured. Thanks @MLStreetTalk , it was fun to talk to you!
@MLStreetTalk
Machine Learning Street Talk
2 years
At #NeurIPS2022 we spoke with @LauraRuis about her paper "Large language models are not zero-shot communicators" and implicatures with LLMS, co-authors @AkbirKhan @BlancheMinerva @sarahookr @_rockt @egrefen
Tweet media one
1
13
58
3
20
163
@LauraRuis
Laura Ruis
4 years
Pt 2 of the series on structured prediction with a #CRF is up! In this post I put a CRF on top of a bi-LSTM and show how to implement batched versions of the efficient algorithms belief propagation and Viterbi decoding in @PyTorch . Lmk what you think! 🐊
Tweet media one
1
35
145
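The batched Viterbi decoding that post describes can be sketched compactly. This is a hedged NumPy version, not the post's actual PyTorch code; the shapes and function name are assumptions. It does max-product decoding over a batch for a linear-chain CRF:

```python
import numpy as np

def viterbi_decode(emissions, transitions):
    """Batched Viterbi decoding for a linear-chain CRF.

    emissions:   (batch, seq_len, num_tags) per-step tag scores
    transitions: (num_tags, num_tags) score of moving from tag i to tag j
    returns:     (batch, seq_len) highest-scoring tag sequence per batch element
    """
    batch, seq_len, num_tags = emissions.shape
    # score[b, j] = best score of any path ending in tag j at the current step
    score = emissions[:, 0]                          # (batch, num_tags)
    backpointers = []
    for t in range(1, seq_len):
        # broadcast: (batch, i, 1) + (i, j) -> (batch, i, j)
        total = score[:, :, None] + transitions[None]
        backpointers.append(total.argmax(axis=1))    # best previous tag for each j
        score = total.max(axis=1) + emissions[:, t]
    # follow backpointers from the best final tag
    best_last = score.argmax(axis=1)                 # (batch,)
    path = [best_last]
    for bp in reversed(backpointers):
        best_last = bp[np.arange(batch), best_last]
        path.append(best_last)
    return np.stack(path[::-1], axis=1)
```

The key batching trick is the broadcasted `(batch, i, j)` sum, which scores every tag transition for the whole batch in one vectorised operation instead of looping over sequences.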
@LauraRuis
Laura Ruis
11 months
Very happy to announce that this paper just got accepted to #NeurIPS2023 as a spotlight 🥳😍 Will make a thread with new results we got since last year once the camera-ready is done
@LauraRuis
Laura Ruis
2 years
Before giving up on Twitter, check out our recent finding 🔥 LLMs are *not* zero-shot communicators! We show a limitation of #LLMs in interpreting language given the context of shared experience of the world, a ubiquitous part of communication ⬇️🧵 1/22
12
97
482
5
18
130
@LauraRuis
Laura Ruis
2 years
Lots of people have been sending me implicatures we used as examples in our paper that #chatGPT understands (i.e. explains well when prompted). So cool! Happy to see people interested in this. I wanted to write a short thread about my thoughts on what this means. ⬇️
2
30
130
@LauraRuis
Laura Ruis
4 years
Super excited to present: a benchmark for systematic generalization in grounded language understanding! Curious why pink brontosauri like the one below are relevant in this context? w/ @jacobandreas Marco Baroni Diane Bouchacourt and @LakeBrenden (1/n)
Tweet media one
1
27
124
@LauraRuis
Laura Ruis
2 years
4 easy steps I use to keep myself on track for the #ICLR2022 deadline: 1) meticulously plan to-do's for each day of the week 2) Start of day, ignore all to-do's for that day and do whatever I feel like 3) End of day transfer missed to-do's to another day 4) Repeat until deadline
7
4
124
@LauraRuis
Laura Ruis
9 months
I've been interning @cohere the past months and have been enjoying it so much! 🧡 Prior to joining I could not have imagined the immense value it would bring to my (academic) understanding of #LLMs to see how they are developed in practice. More on the project I’m doing later :)
2
0
113
@LauraRuis
Laura Ruis
9 months
More excited about #NeurIPS2023  🐊 than Christmas🤶 because I'll be presenting our spotlight poster on pragmatic understanding by #LLMs  🔍. Our **main insight** is that IFT at the example-level helps pragmatic understanding! Poster #312 Tues 5:15PM (OK to bring 🍷)
Tweet media one
3
19
100
@LauraRuis
Laura Ruis
2 years
Dear god I logged out of chatGPT and now it's at capacity this was the worst mistake of my life pls. send. help.
7
3
95
@LauraRuis
Laura Ruis
2 years
I simultaneously have chatGPT-fatigue and also think about it every day - it truly changed the discourse around LLMs. Plus, chatty-P provided us with _so_ many memes. The following is my current mental model of chatGPT (now popularly synonymous with LLMs) and what might be missing.
4
15
94
@LauraRuis
Laura Ruis
4 years
Happy news 🤩 This fall I'll join @LakeBrenden 's awesome lab at @NYUDataScience to do research on compositionality and machine common sense for a year. Can't wait to continue the collaboration & be back in NYC! ... and will be looking for PhD's starting 2021 😼
2
3
83
@LauraRuis
Laura Ruis
4 years
After learning that timezones exist when missing the ICML rebuttal deadline and getting rejected (🤧), gSCAN now got accepted to #NeurIPS2020 ! All in all a good 1st week of quarantine in NYC (also graduated and started a job at NYU).
@LauraRuis
Laura Ruis
4 years
Super excited to present: a benchmark for systematic generalization in grounded language understanding! Curious why pink brontosauri like the one below are relevant in this context? w/ @jacobandreas Marco Baroni Diane Bouchacourt and @LakeBrenden (1/n)
Tweet media one
1
27
124
1
2
79
@LauraRuis
Laura Ruis
1 year
Defended my upgrade viva today titled "Emergent Agency in Models of Language". I'll be working on questions like: do models fine-tuned with HF simulate agents? Is agency useful for a reliable model of language? How can we evaluate it?
Tweet media one
3
3
67
@LauraRuis
Laura Ruis
9 months
I am at #NeurIPS2023 ! Currently I'm thinking about what kind of understanding language models have, non-verbatim contamination in pretraining data, and how to evaluate this. Reach out if you're up for a chat!
2
3
59
@LauraRuis
Laura Ruis
2 years
I feel like workboat is winning
3
0
54
@LauraRuis
Laura Ruis
1 year
In our recent #ICML2023 Theory of Mind Workshop paper we ask: do LLMs encode the goals of agents? Super excited about this (ongoing) colab between @UCL_DARK , @carperai , @AiEleuther , UCL, and Cambridge ⬇️⬇️
2
8
46
@LauraRuis
Laura Ruis
4 years
#NeurIPS2020 is a big test for dealing with FOMO🥺. 1 session I have to attend though is this Thursday 6PM CET; I'll be presenting gSCAN in town A3 spot C2, come chat about compositional generalization! w/ @jacobandreas , Marco Baroni, Diane Bouchacourt, and @LakeBrenden
Tweet media one
1
6
45
@LauraRuis
Laura Ruis
6 months
New 📜: *Debating with More Persuasive LLMs Leads to More Truthful Answers*. Large-scale work led by @akbirkhan   @McHughes288 & @danvalentine256 : we find #debate  between more knowledgeable models helps less knowledgeable models answer q's w/o ground-truth
1
12
62
@LauraRuis
Laura Ruis
2 years
“Surely they can do this if _only_ you found the right prompt?” they’ll say. Besides the 6 templates we test, we take inspiration from the Sparrow paper ( @__nmca__ @mia_glaese ) and try additional, more elaborate instruction prompts for the zero-shot case. Doesn’t help.. 10/22
Tweet media one
1
4
41
@LauraRuis
Laura Ruis
2 years
Everything below I use @obsdmd for. I love how it's simply a markdown editor and file-structure visualiser at first but you can add all the plugins you need to make it ever more complex.
1
2
39
@LauraRuis
Laura Ruis
2 years
By popular request here is the link to the paper notes template!
@LauraRuis
Laura Ruis
2 years
Some of the best #advice I got early in my PhD was from @rockt urging me to set up a good note-keeping system. One year in, my notes have helped me in so many situations! It took a few iterations, and I'll share the things that were most helpful for me personally here ⬇️🐊 1/13
14
73
562
1
5
39
@LauraRuis
Laura Ruis
3 years
Thanks for the shoutout Yannic! Also I'm glad you recognize how professional my GitHub picture is 🤝
@ykilcher
Yannic Kilcher 🇸🇨
3 years
🌐This week in ML News🌐 - Cedille: A French Language Model based on GPT-J🇫🇷 - The first multilingual model to win WMT🌍 - YOU: The private alternative to Google Search🔏 - Alibaba DAMO creates 10T parameter model🏗️ - AI finds profitable Meme Tokens💰
Tweet media one
1
7
68
0
1
37
@LauraRuis
Laura Ruis
2 years
The field made great progress on classic compositionality like verb-object binding, but it doesn't transfer to less studied cases like adverb-verb comp. We dive into this and find that adding (a lot) more data doesn't help without extras like modularity. With @LakeBrenden 💃
1
4
33
@LauraRuis
Laura Ruis
2 years
This highlights the sensitivity of models to prompts and an area for improvement on prompt robustness! 16/22 (meme credit: @akbirkhan )
Tweet media one
1
4
33
@LauraRuis
Laura Ruis
2 years
At my first in-person conference _ever_ this week in Toronto! 🥺💃 Reach out if you want to get a coffee or a beer :) #CogSci2022 #CovidPhDs
1
3
33
@LauraRuis
Laura Ruis
2 years
I fill out the same note template for every paper I read. It asks for metadata, content tags, a brief personal thought about the paper, etc. I make sure to write a tl;dr in my own words after reading. DM me if you want the template :) (it's adjusted from @y0b1byte 's template)
3
2
30
@LauraRuis
Laura Ruis
4 years
Jasmijn's supervision of my first research project years ago is the reason I got where I am now - mentors like her are invaluable. Congratulations, Dr. Jasmijn!🔥
0
1
30
@LauraRuis
Laura Ruis
2 years
@AndrewYNg Data Distributional Properties Drive Emergent In-Context Learning in Transformers - about understanding why in-context learning happens
1
0
28
@LauraRuis
Laura Ruis
2 years
Just pls for the love of The Spaghettimonster can we stop using the word “solve” altogether unless it pertains to puzzles
@JLenzyy
Julian Lenz
2 years
I want to urge fellow audio-AI researchers to stop talking about 'solving' audio. People, we work in a creative domain You don't 'solve' art. Bach did not 'solve' baroque music. The term alienates us from musicians. We should be here to empower and enhance creativity!
9
45
276
2
1
25
@LauraRuis
Laura Ruis
4 years
AI Twitter expectation: oh yay 🥺 I can follow all my favourite researchers and read their brilliant thoughts on important AI-ery AI Twitter reality: <researchers spending 90hrs a week on meming this week's scandalous tweet>
0
0
27
@LauraRuis
Laura Ruis
1 year
First author of a paper I'm on knows how to give credit where credit is due.
Tweet media one
1
0
25
@LauraRuis
Laura Ruis
2 years
@Simeon_Cps You can read Appendix A, where we show how often this goes wrong and how often right with this exact example. It's an illustrative example. We do a systematic analysis in the paper and show that InstructGPT-3 gets similar examples right 70% of the time. Not sure what's misleading
0
1
26
@LauraRuis
Laura Ruis
1 year
Important lessons for modern #LLM research in Woodward’s 1998 seminal paper on goal-directed action. We should consider the possibility that LLMs are using other generalisation methods than reasoning when they pass our tests (like repeating semantic patterns from train data)
Tweet media one
1
1
25
@LauraRuis
Laura Ruis
1 year
This library is coming at a perfect time; it unlocks so many interesting projects. Check out the demo!
@YuxiangJWu
Yuxiang (Jimmy) Wu @ACL2024
1 year
Introducing ChatArena 🏟 - a Python library of multi-agent language game environments that facilitates communication and collaboration between multiple large language models (LLMs)! 🌐🤖 Check out our GitHub repo: #ChatArena #NLP #AI #LLM 1/8 🧵
Tweet media one
13
143
616
0
6
26
@LauraRuis
Laura Ruis
2 years
Work with awesome collaborators @akbirkhan @BlancheMinerva @sarahookr @_rockt @egrefen If this thread is long, it’s only because I had to keep the meme-to-information ratio at a respectable level (and because there’s a lot to unpack) 2/22
1
3
25
@LauraRuis
Laura Ruis
2 years
A huge thanks to @AiEleuther , @OpenAI , and @CohereAI for providing a PhD student with the necessary compute and infrastructure for all these experiments! 20/22
Tweet media one
1
1
23
@LauraRuis
Laura Ruis
2 years
When we speak, there are complex underlying intentions and beliefs at play. These intentions are what make us mostly coherent and give our language meaning. LLMs simulate this agency to some extent (), but are not agents and hence sometimes show incoherence.
1
3
21
@LauraRuis
Laura Ruis
3 years
@Ngasii_ @valavoosh Her books tell everything and they are amazing to read (this particular story in "heart of a woman")
3
1
24
@LauraRuis
Laura Ruis
1 year
btw alignment was solved in the 90s by IBM
1
1
24
@LauraRuis
Laura Ruis
2 years
This aspect of language is studied by the field of #pragmatics . Side note: besides being important for communication, we can also use it to infer human values 👀 Check out this recent cool theoretical work emphasising this for value #alignment : 6/22
@GoogleDeepMind
Google DeepMind
2 years
How can conversational agents be aligned with human values? New research from @Dr_Atoosa and @IasonGabriel explores this question using philosophy and linguistics:
Tweet media one
6
59
230
1
2
24
@LauraRuis
Laura Ruis
4 years
Very valuable to watch great minds respectfully disagreeing on the role and origin of compositionality in intelligence and what that means for the design of artificial intelligence, among many other things
@criticalneuro
Ida Momennejad
4 years
Join us at the salon at 4 PM ET today! Let's be happy anxious together and chat with the brilliant Jay McClelland into a celebration :-)
0
10
50
1
1
22
@LauraRuis
Laura Ruis
2 years
What linguists don't want you to know is that linguistics is a Ponzi scheme where they make up words to lure new linguists into studying those words who make up new words etc.
Tweet media one
3
0
23
@LauraRuis
Laura Ruis
2 years
I thank OP for the praise, but I have to say he misses the point. The message of the paper is that pragmatics of LLMs is lacking, for some classes more than others, and 1 class of instruct-tuning is promising for this capability (OpenAI-style) and the other isn't (Flan-T5 style).
@rasbt
Sebastian Raschka
2 years
Just read the excellent "Large language models are not zero-shot communicators" () paper on InstructGPT LLM a few weeks ago. Great paper but (probably) already out of date. That's the current pace of AI research for you.
Tweet media one
6
29
191
1
0
20
@LauraRuis
Laura Ruis
2 years
Also, get yourself some great mentors because it's priceless :)
3
0
22
@LauraRuis
Laura Ruis
2 years
If you think this is cool (which it is) come to the @LaReL2022 Workshop at #NeurIPS next week! This work is part of our late-breaking results slot ⬇️😍
@AIatMeta
AI at Meta
2 years
Meta AI presents CICERO — the first AI to achieve human-level performance in Diplomacy, a strategy game which requires building trust, negotiating and cooperating with multiple players. Learn more about #CICERObyMetaAI :
239
843
4K
1
3
21
@LauraRuis
Laura Ruis
9 months
Honestly so happy to be part of DARK (did you know DARK is hiring PhD students 🫣)
@UCL_DARK
UCL DARK
9 months
Members of the lab have a *huge* presence at #NeurIPS2023 this year! In fact, almost everyone of the lab is in NOLA 🥳 Come find us at our posters, workshops, competitions, or talks! See 🧵for more info.
Tweet media one
1
15
76
0
3
22
@LauraRuis
Laura Ruis
2 years
Come chat with me at my poster! 8:30 am today at #CogSci2022
@LakeBrenden
Brenden Lake @ CogSci2024
2 years
People can understand new adverb-verb combos in ways that stump machines: after learning how to "walk cautiously", one can "cycle cautiously." In new #CogSci2022 paper, @LauraRuis shows how modularity + structured data augmentation leads to real progress
Tweet media one
2
3
32
0
4
21
@LauraRuis
Laura Ruis
2 years
Submit to the language and RL workshop at #NeurIPS2022 ! Deadline September 22nd 🕺
@LaReL2022
LaReL Workshop 2022
2 years
Announcing the second edition of the Language and Reinforcement learning (LaReL) workshop at #NeurIPS2022 ! We're calling for papers at the intersection of language and reinforcement learning. Submission deadline: 22 September, 2022, AoE. 🧵⬇️ 1/9
1
23
80
0
3
21
@LauraRuis
Laura Ruis
2 years
Every day of course means 2 or 3 times a week come on guys my work ethic fluctuates like the #BTC price
2
0
20
@LauraRuis
Laura Ruis
1 year
Prompting can be unintuitive.
Tweet media one
Tweet media two
1
0
19
@LauraRuis
Laura Ruis
2 years
Incredibly happy with my choice for a PhD @ai_ucl with @UCL_DARK . Feel free to DM me with questions if you're considering a PhD! 🐊 Will already say here; I think choice of supervisor is the most important thing!
@_rockt
Tim Rocktäschel
2 years
If you want to do a PhD in cutting-edge AI, this is the place. @AI_UCL provides a fantastic cohort-based PhD training program with strong connections to industry and entrepreneurship. At @UCL_DARK we are fortunate to have @LauraRuis @_robertkirk @akbirkhan funded by this program.
0
6
26
0
4
20
@LauraRuis
Laura Ruis
2 years
We are excited about the prospects of large language models becoming “communicators” by improving their pragmatic language skills! Check the paper out for details Paper: Blogpost: Dataset: 19/22
1
4
18
@LauraRuis
Laura Ruis
2 years
InstructGPT might understand the syntax and semantics of the question perfectly but still miss the meaning entirely. This question has a *pragmatic* implication: an inquiry about the location of a phone 4/22
2
1
19
@LauraRuis
Laura Ruis
1 year
My Spotify queue feature is working as a stack and I understand why we do coding interviews now
0
2
18
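The joke lands because a queue and a stack differ only in which end you remove from. A tiny sketch of the difference (the track names are hypothetical):

```python
from collections import deque

tracks = ["first added", "second added", "third added"]

# queue (FIFO): the track added first plays first
queue = deque(tracks)
fifo = [queue.popleft() for _ in tracks]

# stack (LIFO): the track added most recently plays first
stack = list(tracks)
lifo = [stack.pop() for _ in tracks]
```

A queue-labelled feature that dequeues from the wrong end behaves exactly like `lifo` above, which is the interview-question distinction the tweet is poking at.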
@LauraRuis
Laura Ruis
2 years
“But what about in-context examples?!” they’ll ask. We add up to 30 examples from a dev set to the prompt. For most models, it doesn’t help much. Davinci-002 can get close to human performance at k=30. But again, for context-heavy examples the gap is still 9% with humans. 11/22
2
1
19
@LauraRuis
Laura Ruis
2 years
For example, imagine Esther asking “Can I use your stapler?” and Juan responding “Here’s the key to my office.” Juan is implicating that (1) Esther can use the stapler, (2) the stapler is located in the office, and (3) the office is currently locked. 13/22
1
1
19
@LauraRuis
Laura Ruis
2 years
PS. also check out this work showing that LLMs' empirical failures are not a problem of the next-word prediction objective and that, in theory, they could learn the entailment semantics from text, even though the necessary context is not directly part of the window! 22/22
@lambdaviking
William Merrill ✈️ACL🇹🇭
2 years
Thanks to Meryl for covering our recent work on semantics and language models on the CDS blog! The paper proves entailment prediction can be reduced to language modeling, and shows how to extract entailment from an “ideal” LM. Check out the blog to learn more!
0
2
16
3
2
19
@LauraRuis
Laura Ruis
2 years
Let’s start off with an example of the type of language we’re investigating. Consider the following exchange between a human and InstructGPT (text-davinci-002 temp. 0) 3/22
2
2
19
@LauraRuis
Laura Ruis
10 months
The commenting in a JS file I wrote when I first started programming is .. excessive 🙃
Tweet media one
1
0
18
@LauraRuis
Laura Ruis
2 years
Again, as with DALL-E prompts, the thing that I enjoy the most is the creativity of the human prompts. To get something interesting out of #chatGPT , the prompt needs to be interesting. A beautiful collaboration of human and model creativity.
@spiantado
steven t. piantadosi
2 years
still just unbelievable
Tweet media one
2
7
99
0
0
17
@LauraRuis
Laura Ruis
11 months
DARK is just getting better and better 😍
@UCL_DARK
UCL DARK
11 months
We are super excited to announce that Dr Roberta Raileanu ( @robertarail ) and Dr Jack Parker-Holder ( @jparkerholder ) have joined @UCL_DARK as Honorary Lecturers! Both have done impressive work in Reinforcement Learning and Open-Endedness, and our lab is lucky to get their support.
Tweet media one
4
12
86
1
0
18
@LauraRuis
Laura Ruis
2 years
Like prior work we found LLMs handle some prompt templates better than others. However there is no single prompt template that dominates - some models prefer structured prompts (no. 1 below) whilst others prefer natural prompt templates (no. 2 below). 14/22
Tweet media one
1
2
18
@LauraRuis
Laura Ruis
2 years
E.g. a note titled "LLN": "The law of large numbers is the most important theorem in ML; it allows estimating expectations by sample avgs. We use it in max likelihood est. The law says that if we have independent data from a source, we can recover properties of the source."
1
1
18
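The note's claim is directly checkable: averages of independent samples converge to the source's expectation. A minimal sketch with a uniform source whose mean is known (the function name is hypothetical):

```python
import random

def sample_average(n, seed=0):
    """Average n independent draws from Uniform(2, 4), whose true mean is 3."""
    rng = random.Random(seed)
    return sum(rng.uniform(2.0, 4.0) for _ in range(n)) / n

# By the law of large numbers, sample_average(n) approaches 3.0 as n grows.
```

This is exactly the mechanism maximum-likelihood estimation relies on: expectations under the data distribution are replaced by sample averages, which the law guarantees are good estimates for enough independent data.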
@LauraRuis
Laura Ruis
2 years
We evaluate LLMs in 4 groups: base models (like OPT), instructable LLMs with an unknown method (like InstructGPT-3), instructable LLMs finetuned on downstream tasks (T0 and Flan-T5), and LLMs finetuned on dialog (BlenderBot) and find that they struggle compared to humans. ⬇️ 8/22
Tweet media one
1
1
18
@LauraRuis
Laura Ruis
2 years
So much going on in that scale plot, most notably: InstructGPT-3 outperforms all! But, even the best model performs much worse than humans (14% worse), especially on a subset of the data that requires context to be resolved (24% worse, shown in section 4.1 of the paper). 9/22
1
1
18
@LauraRuis
Laura Ruis
2 years
Jk I feel exactly the same and still think about M&Ms every day
1
0
18
@LauraRuis
Laura Ruis
2 years
We design a pragmatic language understanding task, schematically depicted below. This image shows one of our prompt templates, but we test 6 different templates to control for sensitivity to the wording. 7/22
Tweet media one
2
1
18
@LauraRuis
Laura Ruis
2 years
I think an important missing component is #agency (setting goals and achieving them). There are levels to agency. Lizards can find food under uncertainty. A monkey sets diverse goals and achieves them under more uncertainty. We are at the top of this self-defined agency-pyramid.
1
0
16
@LauraRuis
Laura Ruis
2 years
In only 40 minutes #LaReL2022 will start in room 391! Come by to get up to date with the field of RL and language 🥳
@LaReL2022
LaReL Workshop 2022
2 years
Now that we’ve introduced all our speakers and our late-breaking results, we’re super excited for tomorrow! Come to chat with people interested in language & RL, and to get up to date on what’s happening in this rapidly growing field! The workshop is in room 319 at #NeurIPS2022
2
5
13
0
5
17
@LauraRuis
Laura Ruis
2 years
On a scale of 1 to pretty worrying how worrying is it that I've given over my full autonomy as a programmer to copilot within hours of enabling it
2
1
16
@LauraRuis
Laura Ruis
2 years
And different models react completely differently to in-context prompting. Below you can see InstructGPT-3-175B benefiting relatively equally from few-shot examples for all templates, but Cohere-52B only benefiting for structured prompts, and OPT-175B vice-versa. 🤔 15/22
Tweet media one
1
2
17
@LauraRuis
Laura Ruis
2 years
And, on top of all this, in our work we test on a *very* simple binary resolution task (resolving to “yes” or “no”). Humans resolve much more complex implicatures intuitively in conversation, leaving the door wide open for more complex benchmarks in the future. 12/22
1
2
17
@LauraRuis
Laura Ruis
2 years
It makes a good point.
Tweet media one
1
0
15
@LauraRuis
Laura Ruis
2 years
Sources of inspiration: - Ape language: From conditioned response to symbol by Savage-Rumbaugh - The Symbolic Species by Deacon - Symbolic Behaviour in AI by Santoro, Lampinen et al - The Evolution of Agency by Michael Tomasello - Language models as agent models by Andreas
2
1
15
@LauraRuis
Laura Ruis
4 years
Quarantine day 1billion, bit fed up w/ watching Bridgerton & watching my fav influencer watch Bridgerton (millenials, amirite 🧐) - wrote a blog on structured prediction with CRFs! Pt 1: Content: PGMs, CRFs, belief prop, and viterbi! Feedback v welcome 😊
0
2
16
@LauraRuis
Laura Ruis
1 year
@Nabil_Alouani_ A croissant
1
0
15
@LauraRuis
Laura Ruis
2 years
Humans, having had an experience of losing their own phone in the past, infer that the speaker is looking for their phone. This illustrates an essential aspect of our everyday usage of language: interpreting it given the context of our shared experience 🤝 5/22
1
1
15
@LauraRuis
Laura Ruis
2 years
🙏Massive thanks to all my collaborators without whom this large-scale work would definitely not have been possible 🫶 @akbirkhan @BlancheMinerva @sarahookr @_rockt @egrefen That’s it. 21/22
Tweet media one
1
1
15
@LauraRuis
Laura Ruis
1 year
that definitely clears up any confusion. thanks 🦙2
Tweet media one
@LauraRuis
Laura Ruis
1 year
ok sorry llama2
Tweet media one
8
26
307
1
0
13
@LauraRuis
Laura Ruis
9 months
Doing a PhD at DARK has been a great choice for me. Ed and Tim are amazing mentors and have created a really great environment to do research in.
@UCL_DARK
UCL DARK
9 months
We ( @_rockt , @egrefen , @robertarail , and @jparkerholder ) are looking for PhD students to join us in Fall 2024. If you are interested in Open-Endedness, RL & Foundation Models, then apply here: and also write us at ucl-dark-admissions @googlegroups .com
3
20
65
0
0
15
@LauraRuis
Laura Ruis
1 year
Reading something by a researcher whose work I admire and she writes something that is (perhaps strangely) very motivational to me: "The first answer guided my research for about 20 years, but I now believe that it is wrong"
0
0
13
@LauraRuis
Laura Ruis
2 years
Big fan of @AiEleuther ’s eval harness. In our recent work on implicature and LLMs we had to run lots of different prompt templates on lots of different large models (OPT-175, BLOOM-176, etc) and with the eval harness it literally took a few lines of code and some yaml files 👏
0
3
13
@LauraRuis
Laura Ruis
1 year
The Wikipedia page of the Sally-Anne test references only the Kosinski paper claiming GPT-4 has advanced #TheoryOfMind under the header “artificial intelligence” 🤔
Tweet media one
1
0
12
@LauraRuis
Laura Ruis
2 years
At the end of each day I spend 30 minutes on upkeep. Transfer random links I sent to myself to separate reading lists, clean some of the metadata, polish some stuff in project notes sections and move it to standalone notes, etc.
1
1
13
@LauraRuis
Laura Ruis
1 year
This is a great paper not only because of the clever control tasks, but also because it relates current evals of LLMs to decades old discussions on psychologism vs behaviorism
@TomerUllman
Tomer Ullman
2 years
So about the 'Large Language Models Learned Theory-of-Mind(?)' discussion: Has ToM emerged in current LLMs? I doubt it.
Tweet media one
Tweet media two
20
88
340
1
3
13
@LauraRuis
Laura Ruis
1 year
Consider submitting to SoLaR! Should be a really exciting and important workshop to join this year at NeurIPS
@solarneurips
SoLaR @ NeurIPS2024
1 year
The NeurIPS 2023 workshop on Socially Responsible Language Modelling Research (SoLaR) is accepting submissions! The deadline is Sep 29. Check out our submission guide:
0
6
20
0
1
12
@LauraRuis
Laura Ruis
7 months
This paper shaped my thinking about using human feedback for evaluation (and with that, fine-tuning), check it out!
@tomhosking
Tom Hosking
7 months
"Human Feedback is not Gold Standard" was accepted at ICLR 2024 🥳 I'd love to chat about the limits of human feedback wrt LLM alignment (and about @cohere ) if you're going to be at the conference! 🇦🇹 Thanks again to @max_nlp for making it an awesome internship experience ❤️
5
26
185
0
0
13
@LauraRuis
Laura Ruis
8 months
Starting soon!! Come chat with us
@LauraRuis
Laura Ruis
9 months
More excited about #NeurIPS2023  🐊 than Christmas🤶 because I'll be presenting our spotlight poster on pragmatic understanding by #LLMs  🔍. Our **main insight** is that IFT at the example-level helps pragmatic understanding! Poster #312 Tues 5:15PM (OK to bring 🍷)
Tweet media one
3
19
100
0
1
12
@LauraRuis
Laura Ruis
10 months
This work has some super interesting insights on RLHF vs SFT
@_robertkirk
Robert Kirk
11 months
Excited to share work from my FAIR internship on understanding the effects of RLHF on LLM generalisation and diversity: While RLHF outperforms SFT in-distribution and OOD, this comes at the cost of a big drop in output diversity! Read more below🧵
Tweet media one
3
57
336
0
2
12
@LauraRuis
Laura Ruis
2 years
I'm personally blown away by chatGPT's capabilities, it's absolutely incredible at explaining things, compositional generalisation of concepts, simulating a VM, coherence, creativity, writing essays, poems, and more!
1
0
12