What syntactic generalizations can domain-general learning algorithms acquire from predicting the next word? A paper summarizing the case of filler–gap dependencies (+ theoretical implications!) with
@roger_p_levy
and
@rljfutrell
A project I'm involved in on smaller-scale, cognitively plausible AI was covered in the New York Times today! Thanks
@oliverwhang21
for your interest in our work!
Excited for day 1 of
#EMNLP2023
! I’ll be at
#CoNLL
checking out the
#BabyLM
posters today and tomorrow. Then presenting at the main conf: work on multilingual cognitive modeling, word lengths, and prosody!
This week at
#AMLaP2023
I’ll be presenting two posters about ongoing projects! If you’re around and want to chat, let me know! Excited to connect folks and, of course, eat some Pintxos! 🥘🍢😋 (Teasers for the posters below!)
The goat that the cat that the dog that the stick that the fire that the water that the ox that the butcher that the Angel of Death that the Holy One, Blessed be He smote killed slaughtered drank quenched burned beat bit ate was purchased by my father for two zuzim.
🔔New Preprint Alert🔔 Why can some presuppositions be used to introduce new information into a discourse while others cannot? Check out “Presupposing Novel Information: A Cross-Trigger Experiment in English” together with
@TeaAnd_OrCoffee
and
@roger_p_levy
2019
#BlackboxNLP
paper is out! Neural LMs trained on English can suppress and recover syntactic expectations, approximating stack-like data structures; but recovery is imperfect! With
@roger_p_levy
@rljfutrell
Ever wondered whether prosody - the melody of speech 🎶 - conveys information beyond the text or whether it's just vocal gymnastics? 🏋️♂️ Great work by
@MIT
and
@ETH_en
soon to be seen at
@emnlpmeeting
Neural Language Models' incremental predictions underpredict human sensitivity to ungrammatical sentences! New
#ACL2021
paper with
@roger_p_levy
and Pranali Vani
#NLProc
Congratulations to Cui Ding, who was awarded the Semester Prize by the president of
@UZH_ch
for her Master's thesis
@cl_uzh
co-supervised by
@weGotlieb
on Mouse Tracking for Reading, a low-cost alternative to eye-tracking. Find out more here:
📣 New paper📢 Alexandre Cremers, myself, and
@benspect
test the influence of prior belief on exhaustivity inferences. This paper combines it all: comprehension / production experiments, lots of RSA modeling, as well as a deep dive into model analysis.
@thomashikaru
I tried more explicit prompting, which worked OK -- "the box moved from the shelf fell on the floor" is technically a garden path, but boxes don't really move themselves much, so idk if I would be tricked. Then it just falls apart...
The piece is all about the BabyLM challenge, which invites folks to train a language model from scratch on the amount of linguistic input available to the typically-developing child. If this sounds interesting, please consider submitting:
*Allots a free hour for ACL review*
... 2 hours later ...
"... Aside from the five paragraphs above, could the authors address this list of twelve questions as well as the fifteen missing references I compiled along with their major takeaways..."
New work from Anastasia Kobzeva,
@sArehalli
,
@tallinzen
: "Neural networks can learn patterns of island-insensitivity in
#Norwegian
", which will be presented at
#SCiL2023
Preprint available here:
comments welcome!
@BlancheMinerva
@roger_p_levy
@rljfutrell
Thanks for your interest! We have a platform for analysis of ANNs, including some transformers, with a larger variety of syntactic structures. Check it out at
The most amazing part imo: Models learn different rules for different types of dependencies! When we compare FGD (top) with dependency-like expectations for gendered pronouns (bottom) ... no island effects! See the paper for results on the unboundedness of the dependency as well!
I urge interested readers to consult
@weGotlieb
et al. (in press): we find that the "GRNN" LSTM of
@xsway_
et al. 2018 trained on a childhood's worth of English shows substantial success on filler–gap dependencies and the island constraints on them. 1/3
@shota_momma
@wavyphd
This is right on. As we say in the paper there are two big arguments in favor of nativism: Poverty-of-Stimulus style claims and x-linguistic distribution. Our results provide good evidence against POS for islands.
We simulate eye tracking in the browser. We blur out a piece of text, except just above the mouse tip. Participants have to move the cursor to reveal and read the text. Then, we analyze their movements similar to eye tracking data. 👀🖱️
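A rough sketch of the idea in Python (a hypothetical helper, not the actual MoTR pipeline): assign each cursor sample to the word box it falls in and sum up the dwell time per word.

```python
# Hypothetical sketch of the analysis idea, not the actual MoTR code:
# map cursor samples onto word bounding boxes and accumulate dwell time per word.

def word_dwell_times(samples, word_boxes):
    """samples: list of (t_ms, x, y) cursor positions; word_boxes: {word_idx: (x0, y0, x1, y1)}."""
    dwell = {i: 0.0 for i in word_boxes}
    for (t0, x, y), (t1, _, _) in zip(samples, samples[1:]):
        for i, (x0, y0, x1, y1) in word_boxes.items():
            if x0 <= x <= x1 and y0 <= y <= y1:
                dwell[i] += t1 - t0  # time the unblurred window spent over word i
                break
    return dwell
```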
In the same theme, we have “Language Model Quality Correlates with Psychometric Predictive Power in Multiple Languages” with
@tpimentelms
,
@clara__meister
and
@ryandcotterell
. We test the relationship between LM quality and cognitive modeling power cross-linguistically.
Finally, for 3️⃣ our results are rather striking 😱 When we train generalized additive models to fit the surprisal / reading time relationship, the models recover an essentially linear curve (shown in green in this figure). For more in-depth statistical tests, see the paper!
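If you want to try this at home, here is a minimal Python sketch with pygam on placeholder data; the paper's actual analyses use more predictors and different tooling, so treat this only as an illustration of the approach.

```python
# Minimal sketch, assuming per-word surprisals and reading times are already in hand.
import numpy as np
from pygam import LinearGAM, s

surprisal = np.random.gamma(2.0, 2.0, 1000)                 # placeholder surprisals
rt = 200 + 15 * surprisal + np.random.normal(0, 30, 1000)   # placeholder reading times (ms)

gam = LinearGAM(s(0)).fit(surprisal.reshape(-1, 1), rt)     # smooth of surprisal
grid = gam.generate_X_grid(term=0)
curve = gam.partial_dependence(term=0, X=grid)              # inspect this for (non-)linearity
```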
#2
is an information-theoretic treatment of regressions during reading. 📖👀 We ask: how well does pointwise mutual information (PMI) between words predict regressions? Also excited for this one because it’s (as far as I know) the first large x-linguistic analysis of regressions out there! 🌎🌍🌏
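For the record, the quantity in question (illustrative only, not our exact estimator): the PMI of a word with its context is its out-of-context log-probability subtracted from its in-context log-probability.

```python
import math

def pmi_bits(p_word_given_context, p_word):
    """Pointwise mutual information between a word and its preceding context, in bits."""
    return math.log2(p_word_given_context) - math.log2(p_word)

pmi_bits(0.08, 0.01)  # a word 8x more likely in context than overall -> 3.0 bits
```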
We argue that these data support a hybrid theory for some triggers, as elements that both introduce a presupposition and act as anaphors. We also argue that the same mechanism that drives the variation above determines when a trigger is obligatory.
😮Surprisal Theory 😮 posits that a word’s contextual predictability should be related to its processing effort. It’s often argued that the functional form of this relationship should be simple: specifically, if predictability is measured as surprisal, the relationship should be linear ↗️
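If you want to compute surprisal yourself, here is a hedged sketch using GPT-2 via Hugging Face transformers; the papers in this thread use their own LMs and tokenization, so this is just one way to get per-token surprisals.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

ids = tok("The goat that the cat bit ate the cabbage", return_tensors="pt").input_ids
with torch.no_grad():
    logits = model(ids).logits                     # (1, seq_len, vocab)
logprobs = torch.log_softmax(logits, dim=-1)
# surprisal of token t is -log2 p(token_t | tokens_<t); shift predictions by one position
surprisals = -logprobs[0, :-1].gather(1, ids[0, 1:, None]).squeeze(1) / torch.log(torch.tensor(2.0))
```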
#1
is a methods project. We introduce Mouse Tracking for Reading (MoTR), a real-time processing measurement tool that produces reading times similar to eye tracking and runs in the browser. 🖱️🐁👀 Shout out to colleagues
@CuiDing_CL
,
@mrinmayasachan
and
@LenaAJaeger
Interestingly, this was the only prompt I tried for which the model was on the right track. But instead of Lenin, these photos look like ... President Taft??
For 2️⃣ our results are mixed. Using contextual entropy as a predictor alongside surprisal can help predict reading times, but alone, it is not better than surprisal. For a much more detailed exploration, see another recent publication from this team:
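(For concreteness: by contextual entropy we mean the entropy of the LM's distribution over the upcoming word. A minimal sketch, not the exact implementation used in the papers:)

```python
import torch

def contextual_entropy_bits(next_word_logits):
    """Entropy of the model's next-word distribution at a given position, in bits."""
    logp = torch.log_softmax(next_word_logits, dim=-1)
    return -(logp.exp() * logp).sum() / torch.log(torch.tensor(2.0))
```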
Recently, this linear relationship has been questioned, with some studies showing linear ↗️, super-linear ⤴️, and even sub-linear ↘️ relationships. But these debates have been based almost exclusively on English experimental data 🇺🇸🇬🇧🇦🇺🇳🇿
@Alexfan80136457
We believe it can be a good option when (i) there are budget considerations (eye tracking is expensive) and also (ii) geographic considerations (hard to take precision eye trackers into the field). We hope to validate (ii) in future work!
🌎🌏 These validation experiments were conducted in English, but I’m hopeful that MoTR can be used to collect eye-tracking style data in lots of languages! It can also run locally on the computer, and we are excited about possible applications in fieldwork. 🌍🌏
This paper builds on previous work by Alex Göbel and Nadine Bade (
@GT1902
), and I want to acknowledge their wonderful previous contributions to these topics!
For 1️⃣ our results are a resounding “yes”! We find that in every language tested, surprisal helped us predict reading times, both when estimated from a large multilingual language model (top row) and from medium or even smaller-sized monolingual models (bottom rows).
Ever want to run an eye tracking study but hampered by the cost? 💸 Maybe you wish you could bring your eye tracker somewhere, but it’s just too darn heavy? 🏋🏽 Well, we might have the solution!
We use the multilingual 👁️MECO eye tracking dataset 👁️ and run experiments on 11 languages across 5 language families: Indo-European, Turkic, Koreanic, Afroasiatic, and Uralic.
@shota_momma
@wavyphd
I will say, there are a lot of great, non-nativist theories out there that could explain the "origins" of islands, including processing accounts and the discourse-structural accounts of
@adelegoldberg1
and
@Benambridge
2️⃣ We collected MoTR reading times for more naturalistic reading, specifically for the Provo Corpus. We find that the MoTR RTs correlate well with the eye tracking RTs in this setting. We also find a (roughly) linear relationship between MoTR RTs and surprisal.
Ever wondered which verbs are mostly transitive, or mostly ditransitive? I collected all this information and more and posted it on GitHub. It turns out "wondered" is intransitive 99.09595843251683% of the time.
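If you're curious how counts like this could be gathered, here's a hypothetical sketch using spaCy dependency parses (not the actual script behind the GitHub numbers, and it only distinguishes transitive from intransitive uses):

```python
import spacy
from collections import Counter

nlp = spacy.load("en_core_web_sm")
counts = Counter()

def tally(text):
    # count each verb token as transitive if it has a direct object, else intransitive
    for tok in nlp(text):
        if tok.pos_ == "VERB":
            has_dobj = any(c.dep_ == "dobj" for c in tok.children)
            counts[(tok.lemma_, "transitive" if has_dobj else "intransitive")] += 1

tally("I wondered whether she read the book.")  # "wondered" intransitive, "read" transitive
```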
@postylem
Yeah, good question! My understanding from Appendix B of Smith and Levy (2013) is that these modeling choices didn't change the outcome for their dataset. But that might not be true for our dataset!
🐁⚙️🐭 Feel free to get in touch with us if you think MoTR could be useful to answer your research questions! Data, scripts and code to implement MoTR in Magpie can be found at 🐁⚙️🐭
One interesting thing is that, in certain situations, some RSA models actually predict anti-exhaustivity effects! That is, “I ordered tea” becomes a cost-effective way to communicate that the speaker ordered both tea and a croissant!
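For intuition, here is a toy RSA recursion in Python, with assumed utterances, priors, and costs; whether anti-exhaustivity actually emerges depends on the model variant and parameters explored in the paper, so this only shows the machinery, not our result.

```python
import numpy as np

worlds = ["only_tea", "tea_and_croissant"]
utterances = ["tea", "tea and croissant"]
truth = np.array([[1, 1],     # "tea" is literally true in both worlds
                  [0, 1]])    # "tea and croissant" only in the second
prior = np.array([0.2, 0.8])  # assumed prior belief over worlds
cost = np.array([0.0, 2.0])   # assumed extra cost of the longer utterance
alpha = 4.0                   # speaker rationality

L0 = truth * prior
L0 /= L0.sum(axis=1, keepdims=True)                 # literal listener P(world | utterance)
S1 = np.exp(alpha * (np.log(L0.T + 1e-12) - cost))
S1 /= S1.sum(axis=1, keepdims=True)                 # pragmatic speaker P(utterance | world)
L1 = S1.T * prior
L1 /= L1.sum(axis=1, keepdims=True)                 # pragmatic listener P(world | utterance)
print(dict(zip(worlds, L1[0])))                     # posterior after hearing "tea"; compare to prior
```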
@LChoshen
Ah, that tweet had nothing to do with the outcome, only my own extraneous commitment. Copious engagement is the hallmark of all good reviews, both reject and accept!
A word’s predictability (quantified as surprisal) is suspected to influence its reading time. But almost all studies investigating this are in English! Does it hold up cross-linguistically? 🌍🌐🌎
@mathemagic1an
Haha Linguistic Relativity is certainly intuitive and it *has* to be true on a very mechanistic level. The problem for me is that to test it rigorously you'd need to control for culture and language.
@mathemagic1an
So you'd need (i) a population that speaks the language but isn't exposed to the culture and (ii) a population that is exposed to the culture but not the language. Basically, impossible to test rigorously. 🤷
A whole range of factors may be at play, including the information content of the presupposed material or whether or not interlocutors trust each other. But in this work we focus on semantic properties and information structure.
The thing I love about MoTR is that it’s really simple. But does it work? 🤷🏽 To test it out, we conducted two types of studies in English. We were pleasantly surprised at what we found.
@ahnaphor
I screen by asking if the participant considers themself a speaker of language x after the experiment (the hope here being that they won't select `yes` just to participate/get paid.)
We ask: 1️⃣ Does surprisal help predict reading times? 2️⃣ Does expected surprisal (i.e., contextual entropy) help predict reading times? 3️⃣ And what is the functional relationship between surprisal and reading times?