Laura Ruis

@LauraRuis

3,480
Followers
668
Following
47
Media
889
Statuses

Currently research intern @cohere , PhD supervised by @_rockt and @egrefen . Language and LLMs. Spent time at FAIR, Google, and NYU ( @LakeBrenden ). She/her.

London
Joined October 2019
Pinned Tweet
@LauraRuis
Laura Ruis
2 years
Before giving up on Twitter, check out our recent finding 🔥 LLMs are *not* zero-shot communicators! We show a limitation of #LLMs in interpreting language given the context of shared experience of the world, a ubiquitous part of communication ⬇️🧵 1/22
12
97
482
@LauraRuis
Laura Ruis
2 years
"Is your name GPT-3? Because you're the third most impressive thing I've seen today."
Tweet media one
36
115
1K
@LauraRuis
Laura Ruis
2 years
when you debug your NN for 2 days until you realise the lr is 1e3 instead of 1e-3 🙂🆒
36
15
891
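The bug in that tweet is easy to reproduce. A minimal sketch (plain gradient descent on f(x) = x², not the author's actual training code; the function name is hypothetical) shows how slipping the sign of the exponent turns slow convergence into immediate explosion:

```python
def gd_final_loss(lr, steps=20):
    """Run `steps` of gradient descent on f(x) = x^2 from x = 1; return final loss."""
    x = 1.0
    for _ in range(steps):
        x -= lr * 2 * x  # gradient of x^2 is 2x
    return x * x

# lr = 1e-3 shrinks the loss a little each step; lr = 1e3 blows it up by ~2000x per step.
```

A quick sanity check that the loss decreases over the first few steps catches this whole class of bug in minutes rather than two days.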
@LauraRuis
Laura Ruis
2 years
Some of the best #advice I got early in my PhD was from @rockt urging me to set up a good note-keeping system. One year in, my notes have helped me in so many situations! It took a few iterations, and I'll share the things that were most helpful for me personally here ⬇️🐊 1/13
14
73
562
@LauraRuis
Laura Ruis
1 year
Just ran 5 km without ChatGPT's help
17
36
452
@LauraRuis
Laura Ruis
3 years
When you have 2 phd advisors.
Tweet media one
13
20
444
@LauraRuis
Laura Ruis
1 year
ok sorry llama2
Tweet media one
8
26
307
@LauraRuis
Laura Ruis
3 years
I reproduced part of the Learning in High Dimension paper by @randall_balestr , @an_open_mind , & @ylecun . Join me on my journey towards understanding that *we shouldn't use interpolation (as def. in the paper) to explain models' generalization skills*:
Tweet media one
4
32
208
@LauraRuis
Laura Ruis
4 years
Yay! I accepted a PhD @ucl_dark with the most awesome advisors ( @egrefen & @_rockt )🥳! My mom thinks that after my PhD this image 👇will be classified as 'tiny horse', exhibiting clear common-sense knowledge of the size of microphones and horses so nothing but great expectations.
Tweet media one
10
4
199
@LauraRuis
Laura Ruis
2 years
This was *definitely* my highlight of NeurIPS 2022! After watching ML street talk countless hours myself (proof: see my recent raving tweet on their Chomsky episode), it’s pretty crazy to see myself featured. Thanks @MLStreetTalk , it was fun to talk to you!
@MLStreetTalk
Machine Learning Street Talk
2 years
At #NeurIPS2022 we spoke with @LauraRuis about her paper "Large language models are not zero-shot communicators" and implicatures with LLMS, co-authors @AkbirKhan @BlancheMinerva @sarahookr @_rockt @egrefen
Tweet media one
1
13
58
3
20
163
@LauraRuis
Laura Ruis
4 years
Pt 2 of the series on structured prediction with a #CRF is up! In this post I put a CRF on top of a bi-LSTM and show how to implement batched versions of the efficient algorithms belief propagation and Viterbi decoding in @PyTorch . Lmk what you think! 🐊
Tweet media one
1
35
145
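The batched Viterbi decoding that post describes can be sketched compactly. This is a hedged NumPy version, not the post's actual PyTorch code; the shapes and function name are assumptions. It does max-product decoding over a batch for a linear-chain CRF:

```python
import numpy as np

def viterbi_decode(emissions, transitions):
    """Batched Viterbi decoding for a linear-chain CRF.

    emissions:   (batch, seq_len, num_tags) per-step tag scores
    transitions: (num_tags, num_tags) score of moving from tag i to tag j
    returns:     (batch, seq_len) highest-scoring tag sequence per batch element
    """
    batch, seq_len, num_tags = emissions.shape
    # score[b, j] = best score of any path ending in tag j at the current step
    score = emissions[:, 0]                          # (batch, num_tags)
    backpointers = []
    for t in range(1, seq_len):
        # broadcast: (batch, i, 1) + (i, j) -> (batch, i, j)
        total = score[:, :, None] + transitions[None]
        backpointers.append(total.argmax(axis=1))    # best previous tag for each j
        score = total.max(axis=1) + emissions[:, t]
    # follow backpointers from the best final tag
    best_last = score.argmax(axis=1)                 # (batch,)
    path = [best_last]
    for bp in reversed(backpointers):
        best_last = bp[np.arange(batch), best_last]
        path.append(best_last)
    return np.stack(path[::-1], axis=1)
```

The key batching trick is the broadcasted `(batch, i, j)` sum, which scores every tag transition for the whole batch in one vectorised operation instead of looping over sequences.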
@LauraRuis
Laura Ruis
11 months
Very happy to announce that this paper just got accepted to #NeurIPS2023 as a spotlight 🥳😍 Will make a thread with new results we got since last year once the camera-ready is done
@LauraRuis
Laura Ruis
2 years
Before giving up on Twitter, check out our recent finding 🔥 LLMs are *not* zero-shot communicators! We show a limitation of #LLMs in interpreting language given the context of shared experience of the world, a ubiquitous part of communication ⬇️🧵 1/22
12
97
482
5
18
130
@LauraRuis
Laura Ruis
2 years
Lots of people have been sending me implicatures we used as examples in our paper that #chatGPT understands (i.e. explains well when prompted). So cool! Happy to see people interested in this. I wanted to write a short thread about my thoughts on what this means. ⬇️
2
30
130
@LauraRuis
Laura Ruis
4 years
Super excited to present: a benchmark for systematic generalization in grounded language understanding! Curious why pink brontosauri like the one below are relevant in this context? w/ @jacobandreas Marco Baroni Diane Bouchacourt and @LakeBrenden (1/n)
Tweet media one
1
27
124
@LauraRuis
Laura Ruis
2 years
4 easy steps I use to keep myself on track for the #ICLR2022 deadline: 1) meticulously plan to-do's for each day of the week 2) Start of day, ignore all to-do's for that day and do whatever I feel like 3) End of day transfer missed to-do's to another day 4) Repeat until deadline
7
4
124
@LauraRuis
Laura Ruis
9 months
I've been interning @cohere the past months and have been enjoying it so much! 🧡 Prior to joining I could not have imagined the immense value it would bring to my (academic) understanding of #LLMs to see how they are developed in practice. More on the project I’m doing later :)
2
0
113
@LauraRuis
Laura Ruis
9 months
More excited about #NeurIPS2023  🐊 than Christmas🤶 because I'll be presenting our spotlight poster on pragmatic understanding by #LLMs  🔍. Our **main insight** is that IFT at the example-level helps pragmatic understanding! Poster #312 Tues 5:15PM (OK to bring 🍷)
Tweet media one
3
19
100
@LauraRuis
Laura Ruis
2 years
Dear god I logged out of chatGPT and now it's at capacity this was the worst mistake of my life pls. send. help.
7
3
95
@LauraRuis
Laura Ruis
2 years
I simultaneously have chatGPT-fatigue and also think about it every day - it truly changed the discourse around LLMs. Plus, chatty-P provided us with _so_ many memes. The following is my current mental model of chatGPT (now popularly synonymous with LLMs) and what might be missing.
4
15
94
@LauraRuis
Laura Ruis
4 years
Happy news 🤩 This fall I'll join @LakeBrenden 's awesome lab at @NYUDataScience to do research on compositionality and machine common sense for a year. Can't wait to continue the collaboration & be back in NYC! ... and will be looking for PhD's starting 2021 😼
2
3
83
@LauraRuis
Laura Ruis
4 years
After learning that timezones exist when missing the ICML rebuttal deadline and getting rejected (🤧), gSCAN now got accepted to #NeurIPS2020 ! All in all a good 1st week of quarantine in NYC (also graduated and started a job at NYU).
@LauraRuis
Laura Ruis
4 years
Super excited to present: a benchmark for systematic generalization in grounded language understanding! Curious why pink brontosauri like the one below are relevant in this context? w/ @jacobandreas Marco Baroni Diane Bouchacourt and @LakeBrenden (1/n)
Tweet media one
1
27
124
1
2
79
@LauraRuis
Laura Ruis
1 year
Defended my upgrade viva today titled "Emergent Agency in Models of Language". I'll be working on questions like: do models fine-tuned with HF simulate agents? Is agency useful for a reliable model of language? How can we evaluate it?
Tweet media one
3
3
67
@LauraRuis
Laura Ruis
9 months
I am at #NeurIPS2023 ! Currently I'm thinking about what kind of understanding language models have, non-verbatim contamination in pretraining data, and how to evaluate this. Reach out if you're up for a chat!
2
3
59
@LauraRuis
Laura Ruis
2 years
I feel like workboat is winning
3
0
54
@LauraRuis
Laura Ruis
1 year
In our recent #ICML2023 Theory of Mind Workshop paper we ask: do LLMs encode the goals of agents? Super excited about this (ongoing) colab between @UCL_DARK , @carperai , @AiEleuther , UCL, and Cambridge ⬇️⬇️
2
8
46
@LauraRuis
Laura Ruis
4 years
#NeurIPS2020 is a big test for dealing with FOMO🥺. 1 session I have to attend though is this Thursday 6PM CET; I'll be presenting gSCAN in town A3 spot C2, come chat about compositional generalization! w/ @jacobandreas , Marco Baroni, Diane Bouchacourt, and @LakeBrenden
Tweet media one
1
6
45
@LauraRuis
Laura Ruis
6 months
New 📜: *Debating with More Persuasive LLMs Leads to More Truthful Answers*. Large-scale work led by @akbirkhan   @McHughes288 & @danvalentine256 : we find #debate  between more knowledgeable models helps less knowledgeable models answer q's w/o ground-truth
1
12
62
@LauraRuis
Laura Ruis
2 years
“Surely they can do this if _only_ you found the right prompt?” they’ll say. Besides the 6 templates we test, we take inspiration from the Sparrow paper ( @__nmca__ @mia_glaese ) and try additional, more elaborate instruction prompts for the zero-shot case. Doesn’t help.. 10/22
Tweet media one
1
4
41
@LauraRuis
Laura Ruis
2 years
Everything below I use @obsdmd for. I love how it's simply a markdown editor and file-structure visualiser at first but you can add all the plugins you need to make it ever more complex.
1
2
39
@LauraRuis
Laura Ruis
2 years
By popular request here is the link to the paper notes template!
@LauraRuis
Laura Ruis
2 years
Some of the best #advice I got early in my PhD was from @rockt urging me to set up a good note-keeping system. One year in, my notes have helped me in so many situations! It took a few iterations, and I'll share the things that were most helpful for me personally here ⬇️🐊 1/13
14
73
562
1
5
39
@LauraRuis
Laura Ruis
3 years
Thanks for the shoutout Yannic! Also I'm glad you recognize how professional my GitHub picture is 🤝
@ykilcher
Yannic Kilcher 🇸🇨
3 years
🌐This week in ML News🌐 - Cedille: A French Language Model based on GPT-J🇫🇷 - The first multilingual model to win WMT🌍 - YOU: The private alternative to Google Search🔏 - Alibaba DAMO creates 10T parameter model🏗️ - AI finds profitable Meme Tokens💰
Tweet media one
1
7
68
0
1
37
@LauraRuis
Laura Ruis
2 years
The field made great progress on classic compositionality like verb-object binding, but it doesn't transfer to less studied cases like adverb-verb comp. We dive into this and find that adding (a lot) more data doesn't help without extras like modularity. With @LakeBrenden 💃
1
4
33
@LauraRuis
Laura Ruis
2 years
This highlights the sensitivity of models to prompts and an area for improvement on prompt robustness! 16/22 (meme credit: @akbirkhan )
Tweet media one
1
4
33
@LauraRuis
Laura Ruis
2 years
At my first in-person conference _ever_ this week in Toronto! 🥺💃 Reach out if you want to get a coffee or a beer :) #CogSci2022 #CovidPhDs
1
3
33
@LauraRuis
Laura Ruis
2 years
I fill out the same note template for every paper I read. It asks for metadata, content tags, a brief personal thought about the paper, etc. I make sure to write a tl;dr in my own words after reading. DM me if you want the template :) (it's adjusted from @y0b1byte 's template)
3
2
30
@LauraRuis
Laura Ruis
4 years
Jasmijn's supervision of my first research project years ago is the reason I got where I am now - mentors like her are invaluable. Congratulations, Dr. Jasmijn!🔥
0
1
30
@LauraRuis
Laura Ruis
2 years
@AndrewYNg Data Distributional Properties Drive Emergent In-Context Learning in Transformers - about understanding why in-context learning happens
1
0
28
@LauraRuis
Laura Ruis
2 years
Just pls for the love of The Spaghettimonster can we stop using the word “solve” altogether unless it pertains to puzzles
@JLenzyy
Julian Lenz
2 years
I want to urge fellow audio-AI researchers to stop talking about 'solving' audio. People, we work in a creative domain You don't 'solve' art. Bach did not 'solve' baroque music. The term alienates us from musicians. We should be here to empower and enhance creativity!
9
45
276
2
1
25
@LauraRuis
Laura Ruis
4 years
AI Twitter expectation: oh yay 🥺 I can follow all my favourite researchers and read their brilliant thoughts on important AI-ery AI Twitter reality: <researchers spending 90hrs a week on meming this week's scandalous tweet>
0
0
27
@LauraRuis
Laura Ruis
1 year
First author of a paper I'm on knows how to give credit where credit is due.
Tweet media one
1
0
25
@LauraRuis
Laura Ruis
2 years
@Simeon_Cps You can read Appendix A, where we show how often this goes wrong and how often right with this exact example. It's an illustrative example. We do a systematic analysis in the paper and show that InstructGPT-3 gets similar examples right 70% of the time. Not sure what's misleading
0
1
26
@LauraRuis
Laura Ruis
1 year
Important lessons for modern #LLM research in Woodward’s 1998 seminal paper on goal-directed action. We should consider the possibility that LLMs are using other generalisation methods than reasoning when they pass our tests (like repeating semantic patterns from train data)
Tweet media one
1
1
25
@LauraRuis
Laura Ruis
1 year
This library is coming at a perfect time; it unlocks so many interesting projects. Check out the demo!
@YuxiangJWu
Yuxiang (Jimmy) Wu @ACL2024
1 year
Introducing ChatArena 🏟 - a Python library of multi-agent language game environments that facilitates communication and collaboration between multiple large language models (LLMs)! 🌐🤖 Check out our GitHub repo: #ChatArena #NLP #AI #LLM 1/8 🧵
Tweet media one
13
143
616
0
6
26
@LauraRuis
Laura Ruis
2 years
Work with awesome collaborators @akbirkhan @BlancheMinerva @sarahookr @_rockt @egrefen If this thread is long, it’s only because I had to keep the meme-to-information ratio at a respectable level (and because there’s a lot to unpack) 2/22
1
3
25
@LauraRuis
Laura Ruis
2 years
A huge thanks to @AiEleuther , @OpenAI , and @CohereAI for providing a PhD student with the necessary compute and infrastructure for all these experiments! 20/22
Tweet media one
1
1
23
@LauraRuis
Laura Ruis
2 years
When we speak, there are complex underlying intentions and beliefs at play. These intentions are what make us mostly coherent and give our language meaning. LLMs simulate this agency to some extent (), but are not agents and hence sometimes show incoherence.
1
3
21
@LauraRuis
Laura Ruis
3 years
@Ngasii_ @valavoosh Her books tell everything and they are amazing to read (this particular story in "heart of a woman")
3
1
24
@LauraRuis
Laura Ruis
1 year
btw alignment was solved in the 90s by IBM
1
1
24
@LauraRuis
Laura Ruis
2 years
This aspect of language is studied by the field of #pragmatics . Side note: besides being important for communication, we can also use it to infer human values 👀 Check out this recent cool theoretical work emphasising this for value #alignment : 6/22
@GoogleDeepMind
Google DeepMind
2 years
How can conversational agents be aligned with human values? New research from @Dr_Atoosa and @IasonGabriel explores this question using philosophy and linguistics:
Tweet media one
6
59
230
1
2
24
@LauraRuis
Laura Ruis
4 years
Very valuable to watch great minds respectfully disagreeing on the role and origin of compositionality in intelligence and what that means for the design of artificial intelligence, among many other things
@criticalneuro
Ida Momennejad
4 years
Join us at the salon at 4 PM ET today! Let's be happy anxious together and chat with the brilliant Jay McClelland into a celebration :-)
0
10
50
1
1
22
@LauraRuis
Laura Ruis
2 years
What linguists don't want you to know is that linguistics is a Ponzi scheme where they make up words to lure new linguists into studying those words who make up new words etc.
Tweet media one
3
0
23
@LauraRuis
Laura Ruis
2 years
I thank OP for the praise, but I have to say he misses the point. The message of the paper is that pragmatics of LLMs is lacking, for some classes more than others, and 1 class of instruct-tuning is promising for this capability (OpenAI-style) and the other isn't (Flan-T5 style).
@rasbt
Sebastian Raschka
2 years
Just read the excellent "Large language models are not zero-shot communicators" () paper on InstructGPT LLM a few weeks ago. Great paper but (probably) already out of date. That's the current pace of AI research for you.
Tweet media one
6
29
191
1
0
20
@LauraRuis
Laura Ruis
2 years
Also, get yourself some great mentors because it's priceless :)
3
0
22
@LauraRuis
Laura Ruis
2 years
If you think this is cool (which it is) come to the @LaReL2022 Workshop at #NeurIPS next week! This work is part of our late-breaking results slot ⬇️😍
@AIatMeta
AI at Meta
2 years
Meta AI presents CICERO — the first AI to achieve human-level performance in Diplomacy, a strategy game which requires building trust, negotiating and cooperating with multiple players. Learn more about #CICERObyMetaAI :
239
843
4K
1
3
21
@LauraRuis
Laura Ruis
9 months
Honestly so happy to be part of DARK (did you know DARK is hiring PhD students 🫣)
@UCL_DARK
UCL DARK
9 months
Members of the lab have a *huge* presence at #NeurIPS2023 this year! In fact, almost everyone of the lab is in NOLA 🥳 Come find us at our posters, workshops, competitions, or talks! See 🧵for more info.
Tweet media one
1
15
76
0
3
22
@LauraRuis
Laura Ruis
2 years
Come chat with me at my poster! 8:30 am today at #CogSci2022
@LakeBrenden
Brenden Lake @ CogSci2024
2 years
People can understand new adverb-verb combos in ways that stump machines: after learning how to "walk cautiously", one can "cycle cautiously." In new #CogSci2022 paper, @LauraRuis shows how modularity + structured data augmentation leads to real progress
Tweet media one
2
3
32
0
4
21
@LauraRuis
Laura Ruis
2 years
Submit to the language and RL workshop at #NeurIPS2022 ! Deadline September 22nd 🕺
@LaReL2022
LaReL Workshop 2022
2 years
Announcing the second edition of the Language and Reinforcement learning (LaReL) workshop at #NeurIPS2022 ! We're calling for papers at the intersection of language and reinforcement learning. Submission deadline: 22 September, 2022, AoE. 🧵⬇️ 1/9
1
23
80
0
3
21
@LauraRuis
Laura Ruis
2 years
Every day of course means 2 or 3 times a week come on guys my work ethic fluctuates like the #BTC price
2
0
20
@LauraRuis
Laura Ruis
1 year
Prompting can be unintuitive.
Tweet media one
Tweet media two
1
0
19
@LauraRuis
Laura Ruis
2 years
Incredibly happy with my choice for a PhD @ai_ucl with @UCL_DARK . Feel free to DM me with questions if you're considering a PhD! 🐊 Will already say here; I think choice of supervisor is the most important thing!
@_rockt
Tim Rocktäschel
2 years
If you want to do a PhD in cutting-edge AI, this is the place. @AI_UCL provides a fantastic cohort-based PhD training program with strong connections to industry and entrepreneurship. At @UCL_DARK we are fortunate to have @LauraRuis @_robertkirk @akbirkhan funded by this program.
0
6
26
0
4
20
@LauraRuis
Laura Ruis
2 years
We are excited about the prospects of large language models becoming “communicators” by improving their pragmatic language skills! Check the paper out for details Paper: Blogpost: Dataset: 19/22
1
4
18
@LauraRuis
Laura Ruis
2 years
InstructGPT might understand the syntax and semantics of the question perfectly but still miss the meaning entirely. This question has a *pragmatic* implication: an inquiry about the location of a phone 4/22
2
1
19
@LauraRuis
Laura Ruis
1 year
My Spotify queue feature is working as a stack and I understand why we do coding interviews now
0
2
18
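The joke lands because a queue and a stack differ only in which end you remove from. A tiny sketch of the difference (the track names are hypothetical):

```python
from collections import deque

tracks = ["first added", "second added", "third added"]

# queue (FIFO): the track added first plays first
queue = deque(tracks)
fifo = [queue.popleft() for _ in tracks]

# stack (LIFO): the track added most recently plays first
stack = list(tracks)
lifo = [stack.pop() for _ in tracks]
```

A queue-labelled feature that dequeues from the wrong end behaves exactly like `lifo` above, which is the interview-question distinction the tweet is poking at.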
@LauraRuis
Laura Ruis
2 years
“But what about in-context examples?!” they’ll ask. We add up to 30 examples from a dev set to the prompt. For most models, it doesn’t help much. Davinci-002 can get close to human performance at k=30. But again, for context-heavy examples the gap is still 9% with humans. 11/22
2
1
19
@LauraRuis
Laura Ruis
2 years
For example, imagine Esther asking “Can I use your stapler?” and Juan responding “Here’s the key to my office.” Juan is implicating that (1) Esther can use the stapler, (2) the stapler is located in the office, and (3) the office is currently locked. 13/22
1
1
19
@LauraRuis
Laura Ruis
2 years
PS. also check out this work showing that LLMs' empirical failures are not a problem of the next-word prediction objective and that, in theory, they could learn the entailment semantics from text, even though the necessary context is not directly part of the window! 22/22
@lambdaviking
William Merrill ✈️ACL🇹🇭
2 years
Thanks to Meryl for covering our recent work on semantics and language models on the CDS blog! The paper proves entailment prediction can be reduced to language modeling, and shows how to extract entailment from an “ideal” LM. Check out the blog to learn more!
0
2
16
3
2
19
@LauraRuis
Laura Ruis
2 years
Let’s start off with an example of the type of language we’re investigating. Consider the following exchange between a human and InstructGPT (text-davinci-002 temp. 0) 3/22
2
2
19
@LauraRuis
Laura Ruis
10 months
The commenting in a JS file I wrote when I first started programming is .. excessive 🙃
Tweet media one
1
0
18
@LauraRuis
Laura Ruis
2 years
Again, as with DALL-E prompts, the thing that I enjoy the most is the creativity of the human prompts. To get something interesting out of #chatGPT , the prompt needs to be interesting. A beautiful collaboration of human and model creativity.
@spiantado
steven t. piantadosi
2 years
still just unbelievable
Tweet media one
2
7
99
0
0
17
@LauraRuis
Laura Ruis
11 months
DARK is just getting better and better 😍
@UCL_DARK
UCL DARK
11 months
We are super excited to announce that Dr Roberta Raileanu ( @robertarail ) and Dr Jack Parker-Holder ( @jparkerholder ) have joined @UCL_DARK as Honorary Lecturers! Both have done impressive work in Reinforcement Learning and Open-Endedness, and our lab is lucky to get their support.
Tweet media one
4
12
86
1
0
18
@LauraRuis
Laura Ruis
2 years
Like prior work we found LLMs handle some prompt templates better than others. However there is no single prompt template that dominates - some models prefer structured prompts (no. 1 below) whilst others prefer natural prompt templates (no. 2 below). 14/22
Tweet media one
1
2
18
@LauraRuis
Laura Ruis
2 years
E.g. a note titled "LLN": "The law of large numbers is the most important theorem in ML; it allows estimating expectations by sample avgs. We use it in max likelihood est. The law says that if we have independent data from a source, we can recover properties of the source."
1
1
18
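The note's claim is directly checkable: averages of independent samples converge to the source's expectation. A minimal sketch with a uniform source whose mean is known (the function name is hypothetical):

```python
import random

def sample_average(n, seed=0):
    """Average n independent draws from Uniform(2, 4), whose true mean is 3."""
    rng = random.Random(seed)
    return sum(rng.uniform(2.0, 4.0) for _ in range(n)) / n

# By the law of large numbers, sample_average(n) approaches 3.0 as n grows.
```

This is exactly the mechanism maximum-likelihood estimation relies on: expectations under the data distribution are replaced by sample averages, which the law guarantees are good estimates for enough independent data.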
@LauraRuis
Laura Ruis
2 years
We evaluate LLMs in 4 groups: base models (like OPT), instructable LLMs with an unknown method (like InstructGPT-3), instructable LLMs finetuned on downstream tasks (T0 and Flan-T5), and LLMs finetuned on dialog (BlenderBot) and find that they struggle compared to humans. ⬇️ 8/22
Tweet media one
1
1
18
@LauraRuis
Laura Ruis
2 years
So much going on in that scale plot, most notably: InstructGPT-3 outperforms all! But, even the best model performs much worse than humans (14% worse), especially on a subset of the data that requires context to be resolved (24% worse, shown in section 4.1 of the paper). 9/22
1
1
18
@LauraRuis
Laura Ruis
2 years
Jk I feel exactly the same and still think about M&Ms every day
1
0
18
@LauraRuis
Laura Ruis
2 years
We design a pragmatic language understanding task, schematically depicted below. This image shows one of our prompt templates, but we test 6 different templates to control for sensitivity to the wording. 7/22
Tweet media one
2
1
18
@LauraRuis
Laura Ruis
2 years
I think an important missing component is #agency (setting goals and achieving them). There are levels to agency. Lizards can find food under uncertainty. A monkey sets diverse goals and achieves them under more uncertainty. We are at the top of this self-defined agency-pyramid.
1
0
16
@LauraRuis
Laura Ruis
2 years
In only 40 minutes #LaReL2022 will start in room 391! Come by to get up to date with the field of RL and language 🥳
@LaReL2022
LaReL Workshop 2022
2 years
Now that we’ve introduced all our speakers and our late-breaking results, we’re super excited for tomorrow! Come to chat with people interested in language & RL, and to get up to date on what’s happening in this rapidly growing field! The workshop is in room 319 at #NeurIPS2022
2
5
13
0
5
17
@LauraRuis
Laura Ruis
2 years
On a scale of 1 to pretty worrying how worrying is it that I've given over my full autonomy as a programmer to copilot within hours of enabling it
2
1
16
@LauraRuis
Laura Ruis
2 years
And different models react completely differently to in-context prompting. Below you can see InstructGPT-3-175B benefiting relatively equally from few-shot examples for all templates, but Cohere-52B only benefiting for structured prompts, and OPT-175B vice-versa. 🤔 15/22
Tweet media one
1
2
17
@LauraRuis
Laura Ruis
2 years
And, on top of all this, in our work we test on a *very* simple binary resolution task (resolving to “yes” or “no”). Humans resolve much more complex implicatures intuitively in conversation, leaving the door wide open for more complex benchmarks in the future. 12/22
1
2
17
@LauraRuis
Laura Ruis
2 years
It makes a good point.
Tweet media one
1
0
15
@LauraRuis
Laura Ruis
2 years
Sources of inspiration: - Ape language: From conditioned response to symbol by Savage-Rumbaugh - The Symbolic Species by Deacon - Symbolic Behaviour in AI by Santoro, Lampinen et al - The Evolution of Agency by Michael Tomasello - Language models as agent models by Andreas
2
1
15
@LauraRuis
Laura Ruis
4 years
Quarantine day 1billion, bit fed up w/ watching Bridgerton & watching my fav influencer watch Bridgerton (millenials, amirite 🧐) - wrote a blog on structured prediction with CRFs! Pt 1: Content: PGMs, CRFs, belief prop, and viterbi! Feedback v welcome 😊
0
2
16
@LauraRuis
Laura Ruis
1 year
@Nabil_Alouani_ A croissant
1
0
15
@LauraRuis
Laura Ruis
2 years
Humans, having had an experience of losing their own phone in the past, infer that the speaker is looking for their phone. This illustrates an essential aspect of our everyday usage of language: interpreting it given the context of our shared experience 🤝 5/22
1
1
15
@LauraRuis
Laura Ruis
2 years
🙏Massive thanks to all my collaborators without whom this large-scale work would definitely not have been possible 🫶 @akbirkhan @BlancheMinerva @sarahookr @_rockt @egrefen That’s it. 21/22
Tweet media one
1
1
15
@LauraRuis
Laura Ruis
1 year
that definitely clears up any confusion. thanks 🦙2
Tweet media one
@LauraRuis
Laura Ruis
1 year
ok sorry llama2
Tweet media one
8
26
307
1
0
13
@LauraRuis
Laura Ruis
9 months
Doing a PhD at DARK has been a great choice for me. Ed and Tim are amazing mentors and have created a really great environment to do research in.
@UCL_DARK
UCL DARK
9 months
We ( @_rockt , @egrefen , @robertarail , and @jparkerholder ) are looking for PhD students to join us in Fall 2024. If you are interested in Open-Endedness, RL & Foundation Models, then apply here: and also write us at ucl-dark-admissions @googlegroups .com
3
20
65
0
0
15
@LauraRuis
Laura Ruis
1 year
Reading something by a researcher whose work I admire and she writes something that is (perhaps strangely) very motivational to me: "The first answer guided my research for about 20 years, but I now believe that it is wrong"
0
0
13
@LauraRuis
Laura Ruis
2 years
Big fan of @AiEleuther ’s eval harness. In our recent work on implicature and LLMs we had to run lots of different prompt templates on lots of different large models (OPT-175, BLOOM-176, etc) and with the eval harness it literally took a few lines of code and some yaml files 👏
0
3
13
@LauraRuis
Laura Ruis
1 year
The Wikipedia page of the Sally-Anne test references only the Kosinski paper claiming GPT-4 has advanced #TheoryOfMind under the header “artificial intelligence” 🤔
Tweet media one
1
0
12
@LauraRuis
Laura Ruis
2 years
At the end of each day I spend 30 minutes on upkeep. Transfer random links I sent to myself to separate reading lists, clean some of the metadata, polish some stuff in project notes sections and move it to standalone notes, etc.
1
1
13
@LauraRuis
Laura Ruis
1 year
This is a great paper not only because of the clever control tasks, but also because it relates current evals of LLMs to decades old discussions on psychologism vs behaviorism
@TomerUllman
Tomer Ullman
2 years
So about the 'Large Language Models Learned Theory-of-Mind(?)' discussion: Has ToM emerged in current LLMs? I doubt it.
Tweet media one
Tweet media two
20
88
340
1
3
13
@LauraRuis
Laura Ruis
1 year
Consider submitting to SoLaR! Should be a really exciting and important workshop to join this year at NeurIPS
@solarneurips
SoLaR @ NeurIPS2024
1 year
The NeurIPS 2023 workshop on Socially Responsible Language Modelling Research (SoLaR) is accepting submissions! The deadline is Sep 29. Check out our submission guide:
0
6
20
0
1
12
@LauraRuis
Laura Ruis
7 months
This paper shaped my thinking about using human feedback for evaluation (and with that, fine-tuning), check it out!
@tomhosking
Tom Hosking
7 months
"Human Feedback is not Gold Standard" was accepted at ICLR 2024 🥳 I'd love to chat about the limits of human feedback wrt LLM alignment (and about @cohere ) if you're going to be at the conference! 🇦🇹 Thanks again to @max_nlp for making it an awesome internship experience ❤️
5
26
185
0
0
13
@LauraRuis
Laura Ruis
8 months
Starting soon!! Come chat with us
@LauraRuis
Laura Ruis
9 months
More excited about #NeurIPS2023  🐊 than Christmas🤶 because I'll be presenting our spotlight poster on pragmatic understanding by #LLMs  🔍. Our **main insight** is that IFT at the example-level helps pragmatic understanding! Poster #312 Tues 5:15PM (OK to bring 🍷)
Tweet media one
3
19
100
0
1
12
@LauraRuis
Laura Ruis
10 months
This work has some super interesting insights on RLHF vs SFT
@_robertkirk
Robert Kirk
11 months
Excited to share work from my FAIR internship on understanding the effects of RLHF on LLM generalisation and diversity: While RLHF outperforms SFT in-distribution and OOD, this comes at the cost of a big drop in output diversity! Read more below🧵
Tweet media one
3
57
336
0
2
12
@LauraRuis
Laura Ruis
2 years
I'm personally blown away by chatGPT's capabilities, it's absolutely incredible at explaining things, compositional generalisation of concepts, simulating a VM, coherence, creativity, writing essays, poems, and more!
1
0
12