🚨Long thread warning: excited to share that I defended my PhD thesis earlier in May!
Here's my thesis, Learning Language Structures through Grounding:
1/
Personal update: I'll be starting in July 2024 as an Assistant Professor
@UWCheritonCS
and a Faculty Member
@VectorInst
! Looking forward to working with all the amazing folks!
Prospective students: if you are interested in NLP and/or comp. linguistics, please consider applying!
Large language models show reasoning abilities in English with chain-of-thought prompting - but how well do they reason in other languages?
New preprint📄: Language models are multilingual chain-of-thought reasoners. (1/n)
Honored to receive the 2021 Google PhD fellowship in natural language processing. Thanks
@GoogleAI
for the support! Kudos to my advisors and mentors: thanks for teaching me everything over the past years, and for showing me concrete examples of best researchers---yourselves!
Continuing our tradition of supporting outstanding graduate students in their pursuit of research in computer science and related fields, we congratulate our 13th annual PhD Fellowship Program recipients! See the list of 2021 Fellowship recipients below:
🚨(Not Really) Old Paper Alert🚨: sharing our 2-year-old NeurIPS paper that I’m still quite excited about.
We learn grounded, neuro-symbolic CCGs from multi-modal data and demonstrate nearly perfect compositional generalization to unseen sentences and scenes. (1/)
Late post but let’s do this! Happy to share our
#EMNLP2022
work on translating natural language to executable code with execution-aware minimum Bayes risk decoding
📝Paper:
📇Code:
📦Data (codex output):
(1/n)
Just got a paper w/ scores 4, 4, 4 rejected by
#acl2020nlp
, but the comments from the meta-reviewer and all reviewers are super, super constructive. Would like to say thank you to them all!
Though time is quite limited, I'm happy to spend most of my weekend reviewing for
#iclr2023
- my assigned papers are all interesting, carefully written and relevant (to me), as most ICLR papers I've reviewed before - kudos to the ICLR matching system (and my ACs)!
#ACL2023
attendees: Welcome to Canada! 🇨🇦
I'll be at the conference from Monday to Wednesday. First time attending a conference without presenting a paper, and I’m sure I’ll enjoy all the cool presentations. Old & new friends: please don’t hesitate to come & say hi!
I am hiring NLP/ML PhD students at UWaterloo, home to 5 NLP professors! Apply by Dec 1
Strong consideration will be given to those who can tackle the challenge below: Can we use LMs' hidden states to reason about multiple problems simultaneously?
Retweets/shares appreciated🥰
This has been one of the most exciting posters I’ve visited at EMNLP2022. Neat results showing that syntax and semantics are learnably separated across spectra!
For
#EMNLP2022
, we (w/
@robvanderg
,
@barbara_plank
) look through differentiable, rainbow-colored glasses to find linguistic timescale profiles for 7
#NLProc
tasks across 6 languages 🌈
📑
📽️
💬 10th Dec 9:00 at Poster Session 7 & 8
Are there any resources/studies showing which words (in any language) are more likely to be mispronounced (by either native speakers or L2 learners)? Any pointers are appreciated!
I very much enjoyed this paper, and of course, the poster! Large-sized data and LLMs present a fantastic opportunity for studying cultural differences.
"आज-कल NLP Research के साथ बने रहना उतना ही आसान है जितना कि मानसून में भीगने से बचे रहना!" ("Keeping up with NLP research these days is about as easy as staying dry in the monsoon!") Did you understand? How about LMs? Our
#ACL2023
Findings paper explores multilingual models' cultural understanding through figurative language in 7 langs 🌎(1/9)
Third-year PhD student Freda Shi bridges the gap between linguistics and computer science in her natural language processing research. Follow the link to learn more:
#computerscience
#womeninstem
Surprisingly, PaLM-540B shows decent multilingual reasoning ability, solving >40% of the problems in each of the 10 investigated languages, including underrepresented ones (such as Bengali and Swahili) that cover <0.01% of the pretraining tokens. (3/n)
Back in 2017, when thinking about visually grounded syntax induction (), I dreamed for a second that we could parse images in similar ways---apparently it was too difficult for me then (and now), so I'm super excited to see this! Congrats on the nice work!
Introducing
#CVPR2022
GroupViT: Semantic Segmentation Emerges from Text Supervision 👨👩👧
Without any pixel labels ever, our Grouping ViT can group pixels bottom-up into open-vocabulary semantic segments. The only training data is 30M noisy image-text pairs.
Same here. Even worse: I feel I'm probably not qualified to review some of them -- no experience in this domain, not quite familiar with recent work, no labmates or close friends working on it -- while relevant papers (I thought) were not assigned to me.
Got 5 papers to review for ARR today, all from different AEs, the due date is Dec 16! Logged into the system, there's no option to reject the assignment or discuss with AEs to extend the deadline/find a replacement. I wonder what's the average review load for Nov🤔
@ReviewAcl
In this work, we introduce the Multilingual Grade School Math (MGSM) dataset, by manually translating 250 English GSM8K test examples to 10 typologically diverse languages, and investigate language models’ reasoning ability with it. (2/n)
1. Chain-of-thought prompting is essential to the reasoning performance of both GPT-3 and PaLM; notably, reasoning steps in English (EN-CoT) almost always outperform those in the same language as the problem (Native-CoT). (5/n)
I'll be talking about Visually Grounded Neural Syntax Acquisition, one of the listed papers, on Monday 4:00 pm at Session 3E! This is a joint work with Jiayuan Mao,
@kevingimpel
and Karen Livescu.
Paper:
Project page:
2 great surveys centered around the above 2 senses of grounding, respectively:
In the Harnad (1990) sense, , by
@ybisk
,
@universeinanegg
,
@_jessethomason_
and colleagues
In the Clark & Brennan (1991) sense, , by folks incl.
@ybisk
4/
In my thesis, I discuss a family of tasks---learning language structures from supervision in other sources (through grounding)---and methods for each of the considered tasks.
As many have recognized, grounding is a highly ambiguous term. More in 🧵
2/
Prior work has mainly categorized grounding into 2 types: semantic grounding (finding meanings for forms; Harnad, 1990) and communicative grounding (finding common ground in dialogue; Clark and Brennan, 1991 + earlier work in pragmatics).
3/
The multilingual reasoning abilities of language models also extend to other tasks: on XCOPA, a multilingual commonsense reasoning dataset, PaLM-540B sets a new state of the art (89.9% average accuracy) using only 4 examples, outperforming the prior best by 13.8%. (7/n)
An interesting and counterintuitive example of grounding under this formalization is GroupViT by
@Jerry_XU_Jiarui
,
@xiaolonw
, and folks, where an image segmentation model is trained from textual supervision---vision can be grounded in language, too!
8/
Thanks to
@McAllesterDavid
, our anti-grounding prof at
@TTIC_Connect
: thank you for all the inspiring conversations and writings that push back on the idea of grounding, e.g., . I hope (and believe) my grounding above is not what you are against :)
10/
@ybisk
@universeinanegg
@_jessethomason_
One exception is acoustically grounded word embeddings (e.g., Settle et al., 2019), which encode acoustic knowledge into word embeddings. Perhaps no one thinks the pronunciation of a word is its meaning, but still, this is an acceptable usage of "grounding."
5/
@ybisk
@universeinanegg
@_jessethomason_
In my thesis, I propose the following definition of grounding, unifying all the cases above.
Grounding means processing the primary data X with supervision from source Y (the ground), where the mutual information I(X; Y) > 0, so we can find meaningful connections between them.
6/
@ybisk
@universeinanegg
@_jessethomason_
In real-world scenarios, the conditional entropy H(Y|X) is almost always > 0, meaning that the ground is usually more complicated, from certain perspectives, than what is to be grounded.
7/
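The two quantities in this thread can be sketched on a toy discrete joint distribution (the names and probabilities below are hypothetical, purely for illustration; they are not from the thesis):

```python
import math

# Hypothetical joint distribution over a form X (a word) and a
# ground Y (an image label); the numbers are illustrative only.
joint = {
    ("cat", "cat_img"): 0.4,
    ("cat", "dog_img"): 0.1,
    ("dog", "cat_img"): 0.1,
    ("dog", "dog_img"): 0.4,
}

# Marginals p(x) and p(y).
px, py = {}, {}
for (x, y), p in joint.items():
    px[x] = px.get(x, 0.0) + p
    py[y] = py.get(y, 0.0) + p

# I(X; Y) = sum_{x,y} p(x,y) * log2( p(x,y) / (p(x) p(y)) )
mi = sum(p * math.log2(p / (px[x] * py[y])) for (x, y), p in joint.items())

# H(Y | X) = -sum_{x,y} p(x,y) * log2( p(y|x) ),  with p(y|x) = p(x,y)/p(x)
cond_ent = -sum(p * math.log2(p / px[x]) for (x, y), p in joint.items())

print(mi > 0, cond_ent > 0)  # grounding is possible, and the ground is noisy
```

Here I(X; Y) > 0, so connecting X to Y is meaningful, while H(Y|X) > 0 reflects that the ground carries information beyond what X determines.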
2. When example problems in the same language as the problem of interest are available, use them for prompting. If not, use examples from a diverse set of languages. (6/n)
I'm extremely thankful to my advisors Karen and
@kevingimpel
& my committee members and mentors
@lukezettlemoyer
and
@roger_p_levy
, for the great questions and suggestions on my thesis.
12/
To my friends, mentors, coauthors, and everyone who has offered direct or indirect help over the past years: please read my thanks in the acknowledgments, which are probably the most exciting part of every PhD thesis.
14/14
Excited to have the work on tree-based neural sentence modeling (joint with my excellent collaborators Hao Zhou, Jiaze Chen and Lei Li) accepted by
#EMNLP2018
I had some difficulty figuring out the horizontal scroll (横批; héng pī)---I eventually realized that in this case it should be read from left to right, while we typically write it from right to left in China :) Happy New Year to my friends who are celebrating!
The difficulty of expressing "nothing": This is a clever attempt to write a spring couplet (chūnlián 春聯), not in the usual Sinoglyphs / Chinese characters, but in pictographs: (source) I could figure out about half of the character equivalents (rebuses…
@denny_zhou
@kchonyc
I believe both explanations are valid, although marginalizing over reasoning paths that share the same result is probably the most natural way to think about it. My thesis (P123) discusses three explanations of SC and MBR-Exec (…)
Special thanks to
@MichaelHBowling
, Dale Schuurmans, and
@nidhihegde
for the wonderful discussion on grounding at a dinner a year ago. The conversation has made the term grounding (in my mind) more articulable.
9/
@MorrisAlper
@moranynk
@ElorHadar
@RGiryes
Excited to see more work on quantifying visual concreteness! Our ACL'19 work on quantifying text span concreteness and using it for syntactic parsing might also be of interest:
#EMNLP2018
"A Tree-based Decoder for NMT", a framework for incorporating trees in target side of MT systems. We compare constituency/dependency/non-syntactic binary trees, find surprising result that non-syntactic trees perform best, and try to explain why
Couldn't agree more. I voted for "that's syntax", but I wouldn't be happy to see a paper using "syntactic features" to refer only to POS tags---and I've been unhappy about this more than 3 times.
@emilymbender
At the very least it's a misleading use of the term. To me it's like doing linear regression and calling it a neural approach... technically true (linear regression can be seen as a 1-neuron neural network) but I don't see why anyone would say it (w/o context) if not to oversell.
The
@COLM_conf
reviewing period has started. Reviewers should now receive emails, and all papers are now assigned. Thanks to all our ACs who adjusted assignments in the last few days. Happy reviewing all!
Definitely one of my top 3 favourite papers :) It marries deep learning with a minimal set of universal grammar rules for grounded language learning. It draws inspiration from lexicalist linguistics and cognitive science (bootstrapping from core knowledge).
Madhur's course is really nice! I'd recommend it to everyone who wishes to review/learn some fundamental mathematical concepts related to machine learning.
@EugeneVinitsky
Madhur Tulsiani runs a very similar course every other year (this page has links to iterations of the class; the later ones have more refined notes).
Of course, the work covered in this thesis is built on the foundation of the literature—my thanks go to the authors of the papers I cited. I hope I've discussed your work in a fair way.
13/
@maojiayuan
@jiajunwu_cs
@roger_p_levy
As in a CCG, each lexicon entry has its syntactic type and semantic representation. We induce the syntax and semantics of the questions, execute the neuro-symbolic semantic program with visual input, and reward the parser if the execution result is correct. (4/)
@tallinzen
That’s part of the reason why I started using GitHub to manage my working papers. Another part is the nice combination of VSCode & LaTeX workshop.
@UndefBehavior
Sorry to hear this! As an alternative, my coauthors and I tried to publish at ML conferences (for our case, NeurIPS) on highly linguistic topics. We got constructive feedback from reviewers, but very little attention for our presentation.
@sharonlevy21
Congrats on the excellent work! I found the Table 1 example very interesting: these four sentences are clearly negative to me, and I can't imagine anyone labeling any of them positive---I wonder whether more data could fix this?
@kanishkamisra
I started using OneNote (not intended for managing todos, though). I just start a new page for all of the week's todos each Monday morning and copy leftovers over from the prior week.
@joycjhsu
This is cool! Can I ask a quick question - why would humans say "no" to the teaser question? From a quick glance, it could perfectly be a "wug" to me :)
We are also aware that the method comes with efficiency issues in complicated real-world settings, and that’s an exciting direction to explore in the future! (13/13)
@denny_zhou
Oh wow, I'm surprised that the homework includes a clear chain-of-thought example! Also I think this is a challenging example for LMs: it's nontrivial to work out generalization from <-2, /3> operations to <+2, /4>, even for the 20-year-ago me.
In summary, we show what will happen when neuro-symbolic models meet grammars (generalized CCGs), where we achieve significantly improved performance on compositional generalization. (12/)
Surprisingly, with only program input and no access to ground-truth output, MBR-Exec shows significant improvement over all execution-unaware methods on Python program generation, tested on the MBPP dataset.
(4/n)
I am trying to make a submission to
@emnlp2020
, but the system asks me to fill out the reviewer data form. I filled it out, it said "thank you", I was redirected to the home page, and everything repeated again. Has anyone had a similar problem?
@wzhao_nlp
@denny_zhou
Agreed and that’s our motivation for proofreading multiple times before submitting a paper :) I’d love to see more human experiments and rigorous comparison!
Language models spread their probability over multiple programs with slightly different implementations but the same underlying functionality. When we only have one chance to choose an output program for each natural language sentence, how should we select it?
(2/n)
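The selection idea can be sketched as follows (a minimal toy, assuming hypothetical names: `programs` are candidate code strings sampled from an LM, and `execute` runs a candidate on shared test inputs; real candidate selection involves more careful execution and matching):

```python
from collections import Counter

def mbr_exec_select(programs, execute):
    # Execution-based selection sketch: run every candidate, group
    # candidates by their execution result, and return a program from
    # the largest agreement group (majority vote over behaviors).
    results = {p: execute(p) for p in programs}
    votes = Counter(results.values())
    return max(programs, key=lambda p: votes[results[p]])

# Toy usage: three sampled "programs"; two implement doubling, one squaring.
candidates = ["lambda x: x + x", "lambda x: 2 * x", "lambda x: x ** 2"]
best = mbr_exec_select(candidates, lambda p: eval(p)(3))
print(eval(best)(3))  # 6: the majority behavior wins over the outlier
```

The point is that semantically equivalent but textually different programs pool their "votes" through execution, which per-sequence likelihood alone cannot do.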
@m2saxon
@yuvalmarton
This is an excellent point! Starting with America ≠ US, I spent some time realizing America = US in most of my conversations, and then some more time bringing back America ≠ US.