Kelly Marchisio (St. Denis) Profile
Kelly Marchisio (St. Denis)

@cheeesio

1,981
Followers
614
Following
45
Media
606
Statuses

Multilingual NLP @cohere . Formerly: PhD @jhuclsp Alexa Fellow @amazon dev @Google MPhil @cambridgenlp EdM @hgse 🔑🔑¬🧀 ( @kelvenmar20 )

Connecticut, USA
Joined June 2019
@cheeesio
Kelly Marchisio (St. Denis)
2 months
Train steps: [34500/40000] (86.3%) GPU utilization: 100% Saving checkpoint: 7BB_marchisio_stdenis_pretrain_mixture_v0_34500.ckpt
Tweet media one
59
135
3K
@cheeesio
Kelly Marchisio (St. Denis)
4 years
Freshly added to my .bash_profile: alias cdd="cd ../.." alias cddd="cd ../../.." alias cdddd="cd ../../../.." I'm not lazy, I'm... efficient... right??
18
16
240
@cheeesio
Kelly Marchisio (St. Denis)
3 months
How does quantization affect multilingual LLMs? 🌍 For wide adoption, multilingual LLMs must be highly-performant *and* lightweight. 📈 🪶 We analyze SOTA multilingual LLMs in 23 languages under various quantization techniques to find out! 📜
Tweet media one
9
60
241
@cheeesio
Kelly Marchisio (St. Denis)
10 months
(mostly positive!) Reflections as a female AI researcher at #NeurIPS2023 part 1 of ?? A man from unnamed-but-very-well-known-AI-company approached me near the booth. He asked about my research and biggest challenges in multilingual NLP. A rapidfire back-and-forth ensued: 1/5 🧵
3
11
192
@cheeesio
Kelly Marchisio (St. Denis)
3 years
Just trained an MT model. The output for every test sentence is: " I & amp ; apos ; m sorry . I & amp ; apos ; m sorry . I & amp ; apos ; m sorry . I & amp ; apos ; m sorry . I & amp ; apos ; m sorry ." It's not your fault, little buddy! It's me, not you!
4
3
184
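The repeated "& amp ; apos ;" in that output looks like an apostrophe that was HTML-escaped twice and then split by a tokenizer. A minimal sketch of one way to undo that, assuming the entities were only broken up by whitespace: the helper name `unescape_fully` and the regular expression are illustrative, not part of any MT toolkit.

```python
import html
import re

# Hypothetical helper: matches an entity whose pieces were split by a
# tokenizer, e.g. "& amp ;" or "& apos ;".
ENTITY_WITH_SPACES = re.compile(r"&\s*(\w+)\s*;")


def unescape_fully(text, max_rounds=5):
    """Glue split entities back together and unescape until nothing changes."""
    for _ in range(max_rounds):
        glued = ENTITY_WITH_SPACES.sub(r"&\1;", text)
        decoded = html.unescape(glued)
        if decoded == text:
            break
        text = decoded
    return text


print(unescape_fully("I & amp ; apos ; m sorry ."))
```

The regular expression is deliberately loose, so it can also glue together ordinary text such as "A & B ; C"; fixing the preprocessing step that escaped twice is usually the better remedy.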
@cheeesio
Kelly Marchisio (St. Denis)
1 year
Life update! Tomorrow, I join @CohereAI as a Member of Technical Staff!
14
4
182
@cheeesio
Kelly Marchisio (St. Denis)
10 months
I was pleased to be treated as an equal, and for the opportunity to sharpen my intellectual battle sword ⚔️🤺 (and proud that I was deffo 💯💯💯 correct 🤪💁‍♀️🏋️‍♀️🏄‍♀️🕵️‍♀️😜) 5/5
1
1
146
@cheeesio
Kelly Marchisio (St. Denis)
2 years
I'm on the job market! (industry/post-doc/faculty) I work on multilinguality and low-resource NLP, with a focus on computational efficiency. Please don’t hesitate to reach out with opportunities (DM/email)! Applying broadly, flexible location! 🏖️❄️🏔️🌴☔️
3
33
109
@cheeesio
Kelly Marchisio (St. Denis)
2 years
New year, new name! I'll still publish as "Kelly Marchisio", but socially, you can call me "Kelly St. Denis" :)
Tweet media one
8
0
110
@cheeesio
Kelly Marchisio (St. Denis)
3 years
Engagement posts are sooo cliché -- so we trained a neural language model to write ours: Engaged Engaged Engaged for her big big big move on the big fella hasn't even when we were top-notch! from the happiest and her unravel and her *<expletive>* today🤣🤣🤣🤣 #princesscut
Tweet media one
7
3
102
@cheeesio
Kelly Marchisio (St. Denis)
5 months
Tweet media one
24
0
99
@cheeesio
Kelly Marchisio (St. Denis)
1 year
Done!!
@jhuclsp
JHU CLSP
1 year
Congratulations to Kelly Marchisio @cheeesio (advised by Philipp Koehn) on successfully defending her @JHUCompSci PhD thesis "Multilinguality from Embedding Spaces: Algorithmic, Geometric and Data Considerations." Kelly will join @CohereAI @HopkinsEngineer
0
2
28
20
2
97
@cheeesio
Kelly Marchisio (St. Denis)
2 months
@mayhewsw Expected date to run first inference is Sept 2 - we’re currently setting up our eval suite, so remains to be seen. I have high hopes for this one
2
0
85
@cheeesio
Kelly Marchisio (St. Denis)
2 years
Introducing ✨Mini-Model Adaptation✨ - a new parameter- and compute-efficient method for rapid adaptation of pretrained models to new languages! 🧵1/5
Tweet media one
3
8
77
@cheeesio
Kelly Marchisio (St. Denis)
3 years
Me: "Hm, I don't know much about X. I should watch a video." First comment on said video: 😳
Tweet media one
3
1
70
@cheeesio
Kelly Marchisio (St. Denis)
2 years
Thesis-writing starts TODAY. Join me on my thesis-writing journey for a PhD in Computer Science / AI / ML / NLP! 1/N
3
2
63
@cheeesio
Kelly Marchisio (St. Denis)
10 months
But they don’t with me, and I don’t with them. I have brilliant female computer scientist friends, we just don’t tend to engage with each other this way. I left thinking “WOAH that was aggressive! But he’d do the same if I were male.” 4/5
3
2
62
@cheeesio
Kelly Marchisio (St. Denis)
4 months
DONE done. 👩‍🎓
@jhuclsp
JHU CLSP
4 months
Congrats to CLSP’ers graduating this year! 🥳🥳 Photo credit: @esalesk
Tweet media one
Tweet media two
0
12
90
3
1
61
@cheeesio
Kelly Marchisio (St. Denis)
6 months
At @cohere , we prioritize multilinguality!
@aidangomez
Aidan Gomez
6 months
Multilinguality is something that is crucial for equitable utility of this technology. We want our models to work for as many people, organizations, and markets as possible. We perform strongly across 10 languages and we're eager to expand this further.
Tweet media one
3
5
91
1
3
58
@cheeesio
Kelly Marchisio (St. Denis)
2 years
Honored to be named an Amazon Fellow!
@AmazonScience
Amazon Science
2 years
Amazon and @HopkinsEngineer announced the first PhD fellowships and faculty research awards recipients as part of the JHU + Amazon Initiative for Interactive AI. Learn why Alexa AI VP @natarajan_prem says these projects will help drive new advances in AI. #ArtificalIntelligence
2
7
38
3
4
57
@cheeesio
Kelly Marchisio (St. Denis)
3 months
We all get a little *confused* sometimes 🫢🫨😵‍💫 - joint work with @seb_ruder @weiyinko_ml Alex Bérard, Théo Dehaze, hot off the press! ♨️
@seb_ruder
Sebastian Ruder
3 months
Understanding and Mitigating Language Confusion 😵‍💫 User: ¿De qué trata nuestro artículo? LLM: We analyze one of LLMs’ most jarring errors: their failure to generate text in the user’s desired language. 📑 💻
Tweet media one
5
43
182
6
7
55
@cheeesio
Kelly Marchisio (St. Denis)
6 months
Our prioritization of multilinguality extends even to our tokenizer. Better tokenization -> better representations -> better cost-efficiency for you! 💸
@aidangomez
Aidan Gomez
6 months
One subtlety worth mentioning is how significant the tokenizer is to the cost to use models in non-english languages. Our tokenizer is meaningfully better than others at the 9 non-English languages, achieving up to a 2x effective cost reduction to use.
Tweet media one
5
13
124
1
6
56
@cheeesio
Kelly Marchisio (St. Denis)
2 years
GBO (qualifying exam) passed! 🎉 -> now, 😴🍕🍦 (it's like gym/tan/laundry, but for newly-minted PhD candidates 👩‍💻)
7
0
53
@cheeesio
Kelly Marchisio (St. Denis)
2 years
The ability to extract accurate translation dictionaries from monolingual embedding spaces depends critically on their geometric similarity--"degree of isomorphism." We address this root-cause of faulty X-lingual mapping with ✨IsoVec✨ #EMNLP2022 🧵1/N
3
7
49
@cheeesio
Kelly Marchisio (St. Denis)
10 months
He left, then returned for sources. I left, pleased for standing my ground, & that the “battle” had *happened*. Let me explain: Respectful intellectual argument is a valuable skill to be honed. My male colleagues do it to each other. They hang, they banter & spar, they hack. 3/5
1
1
48
@cheeesio
Kelly Marchisio (St. Denis)
3 years
Hilarious that this pops up now while I’m at EMNLP. Nine years ago - I’d coded my first “hello world” only about 6 weeks earlier - my my how things have changed! 💻 💕 🤓
Tweet media one
2
1
44
@cheeesio
Kelly Marchisio (St. Denis)
4 years
Our new unsupervised MT work is up on Arxiv
@arxiv_cs_cl
cs.CL Papers
4 years
When Does Unsupervised Machine Translation Work?. (arXiv:2004.05516v1 []) #NLProc
0
11
21
0
8
41
@cheeesio
Kelly Marchisio (St. Denis)
6 months
We released our best multilingual LLM yet, with support for 10 languages and open weights for research! Check it out! 🌍🌏🌎
@CohereForAI
Cohere For AI
6 months
C4AI Command R+ is a state-of-the-art RAG-optimized model with advanced tool use to automate sophisticated tasks, including multi-hop tool use. ✨ Command R+ is optimized for general reasoning and excels at multilingual performance evaluated across 10 languages. 🌎
Tweet media one
1
2
22
1
3
40
@cheeesio
Kelly Marchisio (St. Denis)
2 years
Headed to ✨NAACL 2022✨ tomorrow! Looking forward to an exciting week of chats about multilinguality and low-resource MT. Come say “hi” if you see me!
3
2
40
@cheeesio
Kelly Marchisio (St. Denis)
5 months
please show me the training data 🙃
Tweet media one
0
1
39
@cheeesio
Kelly Marchisio (St. Denis)
2 years
Might supervised and unsupervised MT be mutually-beneficial? In our #NAACL2022 work, we ask whether the training methods result in systematically different output beyond what is visible via quality metrics like adequacy or BLEU. 🧵1/4
3
5
37
@cheeesio
Kelly Marchisio (St. Denis)
10 months
Arrived in New Orleans for #NeurIPS2023 ! I’ll be at the Cohere booth tomorrow (Mon) 2:30-3:30pm, and 9-11:30am Tues-Thurs - come by if you want to chat about anything and everything multilingual NLP!
0
3
36
@cheeesio
Kelly Marchisio (St. Denis)
7 months
So thrilled to show you what we’ve been working on!!
@aidangomez
Aidan Gomez
7 months
⌘-R Introducing Command-R, a model focused on scalability, RAG, and Tool Use. We've also released the weights for research use, we hope they're useful to the community!
31
186
1K
0
0
36
@cheeesio
Kelly Marchisio (St. Denis)
1 year
Took a break from thesis-writing on Monday to visit @esalesk at JHU's Edible Book Festival, presenting her edible rendition of our advisor's book! Cake recipe generated with Bard! 🤓 @jhuclsp
Tweet media one
6
3
34
@cheeesio
Kelly Marchisio (St. Denis)
9 months
New year, new manager! So excited to work with you, @seb_ruder ! Folks - come join us!
@seb_ruder
Sebastian Ruder
9 months
I'm excited to announce that I've joined @cohere to help make LLMs more multilingual! It’s crazy how the capabilities of NLP models have evolved over the last years. I’m thrilled to work with a team full of smart, dedicated and kind individuals to push the boundaries of LLMs.
60
24
848
1
0
34
@cheeesio
Kelly Marchisio (St. Denis)
1 year
To appear in Findings of ACL2023! @artetxem @PSH_Lewis @yihong_thu
@cheeesio
Kelly Marchisio (St. Denis)
2 years
Introducing ✨Mini-Model Adaptation✨ - a new parameter- and compute-efficient method for rapid adaptation of pretrained models to new languages! 🧵1/5
Tweet media one
3
8
77
3
4
33
@cheeesio
Kelly Marchisio (St. Denis)
2 years
From Monday until early October, I'll be interning with @artetxem at Meta AI in London. If you'll be in 🇬🇧 in the next few months, let's meet up!
0
1
31
@cheeesio
Kelly Marchisio (St. Denis)
10 months
… Him: “I don’t believe that” Me: cites sources … Him: “With infinite computation, that’s not true” Me: “Sure, but we live in reality. Infinity isn’t real.” … etc. etc. etc. 2/5
1
1
31
@cheeesio
Kelly Marchisio (St. Denis)
3 months
Very fun to work with @johnamqdang on this project!
@johnamqdang
John Dang
3 months
Is RLHF effective for aligning multilingual LLMs? 🤔 Our work studies multilingual preference optimization to train a new SOTA multilingual LLM, advancing the frontier of alignment techniques to 23 languages covering half the world’s population 🌎! 🧵 📜
Tweet media one
15
54
180
0
2
29
@cheeesio
Kelly Marchisio (St. Denis)
3 months
(1) Automatic metrics severely underestimate damage from quantization. ⚠️ While automatic evals estimate deterioration of a quantized model relative to FP16 across tasks at a modest −0.3% for French and −1.7% for Japanese, humans report the drops as −16.6% and −16.0% 👎👎
Tweet media one
1
5
27
@cheeesio
Kelly Marchisio (St. Denis)
1 year
I’m here in Toronto! #ACL2023NLP I’ll present Mini-Model Adaptation in a virtual poster session tomorrow 11:00–12:30 Toronto time, and again in-person at the RepL4NLP workshop on Thursday. Come say 👋!! @yihong_thu @PSH_Lewis @artetxem
@cheeesio
Kelly Marchisio (St. Denis)
1 year
To appear in Findings of ACL2023! @artetxem @PSH_Lewis @yihong_thu
3
4
33
0
6
26
@cheeesio
Kelly Marchisio (St. Denis)
2 years
Life-hack: I watched a 3-minute YouTube video about steaming milk and now I’ve been complimented at the office two days in a row and called a “pro.” Please spam me other ways I can fool others into thinking I’m competent in 3mins or less!!! (Voilà ☕️)
Tweet media one
6
0
22
@cheeesio
Kelly Marchisio (St. Denis)
2 years
It’s my decade codeaversary! Right around this time 10 years ago, I coded my first line: a “hello world” in C. My life has never been the same 🥰
2
0
22
@cheeesio
Kelly Marchisio (St. Denis)
2 months
I'll be presenting our recent work "How Does Quantization Affect Multilingual LLMs?" at the Cohere4AI ML Efficiency Group on Friday at noon Eastern (GMT-4). Come join in on the fun! To join, please fill out the form:
@Sree_Harsha_N
SreeHarsh (C#)
2 months
At the ML efficiency group, excited to have @cheeesio to present work on 'How does quantization affect multilingual LLMs'. Quantization is ever present in the large model stack -- but it can have unintended impacts on quality. Join in to find out :)
2
1
20
0
3
22
@cheeesio
Kelly Marchisio (St. Denis)
2 years
Low-resourcedness + domain/script shift + noise dramatically ⬇️⬇️ geometric similarity of word embedding spaces. #EMNLP2022 We improve BLI on non-isomorphic spaces using a new optimal transport-based graph-matching algorithm. 9am Sunday in Abu Dhabi! 1/4🧵
2
3
22
@cheeesio
Kelly Marchisio (St. Denis)
2 years
Public Service Announcement!! Watch this: Python2: round(1.5) -> 2.0 round(2.5) -> 3.0 Cool. Python3: round(1.5) -> 2. round(2.5) -> … 2. What?!?! (And the type change, too!) 🧵 1/3
2
5
22
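The behaviour in that tweet follows from Python 3's built-in `round`, which rounds exact ties to the nearest even integer and returns an `int` when called with one argument. Below is a minimal sketch of how to get the Python 2 style result (ties away from zero, float result) with the standard `decimal` module; the helper name `round_half_up` is an illustrative assumption, not a standard function.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(x, ndigits=0):
    """Round ties away from zero, as Python 2's round() did, returning a float."""
    # Decimal(1).scaleb(-ndigits) is 1, 0.1, 0.01, ... for ndigits = 0, 1, 2, ...
    quantum = Decimal(1).scaleb(-ndigits)
    # Going through str(x) uses the float's shortest repr, so 2.5 becomes Decimal("2.5").
    return float(Decimal(str(x)).quantize(quantum, rounding=ROUND_HALF_UP))


# Python 3 built-in: ties go to the nearest even integer, result is an int.
print(round(1.5), round(2.5))                  # 2 2
# Sketch above: ties go away from zero, result is a float.
print(round_half_up(1.5), round_half_up(2.5))  # 2.0 3.0
```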
@cheeesio
Kelly Marchisio (St. Denis)
2 years
I'll be presenting two papers starting 30mins from now at #EMNLP2022 ! ✨IsoVec✨ (below) as a poster, and 📈BLI... using Graph Matching via Optimal Transport 📉 (tweeted yesterday) live in Hall B! Join me! @jhuclsp @n_verma1 @AliSaadEldin @kevinduh Carey Priebe, Philipp Koehn
@cheeesio
Kelly Marchisio (St. Denis)
2 years
The ability to extract accurate translation dictionaries from monolingual embedding spaces depends critically on their geometric similarity--"degree of isomorphism." We address this root-cause of faulty X-lingual mapping with ✨IsoVec✨ #EMNLP2022 🧵1/N
3
7
49
0
1
21
@cheeesio
Kelly Marchisio (St. Denis)
1 year
Day 22-30ish: The full draft is complete! A few hours per week turned into all-day-every-day for a week or two, between adding intro/abstract/future work/conclusion, and making requested edits from my committee. I defend *tomorrow* at 2pm Eastern at JHU!
0
0
20
@cheeesio
Kelly Marchisio (St. Denis)
2 months
It's our final push on improving multilingual MMLU! If you speak any of the languages below, please consider contributing!
@CohereForAI
Cohere For AI
2 months
With 1️⃣ week left in our MMLU Translation sprint, we are 22% through the task. ⌛️ Korean, Arabic, Vietnamese, Amharic, German, Indonesian, Chinese, Sinhala, Nepali, and Swedish are all closing in on the goal! 🥅 🌎 Speak these languages? Join us:
Tweet media one
0
8
21
0
5
18
@cheeesio
Kelly Marchisio (St. Denis)
3 months
(2) Languages are disparately affected by quantization: non-Latin script languages are impacted worst 🥺 We knew they were poorly represented in training data & tokenization, causing ⏬ performance and ⏫ cost/latency. Now we know they’re treated unfairly in quantization, too 😟
Tweet media one
2
1
18
@cheeesio
Kelly Marchisio (St. Denis)
3 months
(3) Challenging tasks degrade fastest. 📉 For example, mathematical reasoning (MGSM) and generative tasks as evaluated by humans and LLM-as-a-Judge suffer a large performance penalty under quantization.
Tweet media one
1
2
17
@cheeesio
Kelly Marchisio (St. Denis)
7 months
Have fun! 🤖
@aidangomez
Aidan Gomez
7 months
Here are the weights:
4
19
125
0
2
18
@cheeesio
Kelly Marchisio (St. Denis)
3 years
Our Findings of EMNLP 2021 paper, “An Analysis of Euclidean vs. Graph-Based Framing for BLI from Word Embedding Spaces”, is now public: Code: Paper: *thread* 1/5
2
2
17
@cheeesio
Kelly Marchisio (St. Denis)
27 days
Doing my PhD at JHU CS was a true joy! Join them!
@DanielKhashabi
Daniel Khashabi 🕊️
27 days
Computer Science @ JHU is hiring in ALL areas: 🔑 Apply early for flexible scheduling + potential early offer. Our department is expanding fast, especially in AI-adjacent fields. Come join us!
1
17
83
0
2
17
@cheeesio
Kelly Marchisio (St. Denis)
4 years
Finally! Finished my 2019 New Year's resolution 🥳🎉☕️ What’s next? I’ve got Hogben’s Mathematics for the Million on the list. (And please excuse the crude coffee mug - a neural network named the color 😅)
Tweet media one
2
0
15
@cheeesio
Kelly Marchisio (St. Denis)
2 months
@mayhewsw I expect to update X with a preprint within a few weeks of training completion - stay tuned
1
0
15
@cheeesio
Kelly Marchisio (St. Denis)
4 years
For our 4th date, Martin and I took apart a computer together. For our anniversary, he surprised me with this - the GPU from that night 😭😭😭 "Love you too much to process" -- not quite sure if he's referring to me or the GPU 🤷‍♀️
Tweet media one
Tweet media two
0
0
14
@cheeesio
Kelly Marchisio (St. Denis)
1 year
Days ~15-17: Defense date is set! **Wednesday 7 June, 2-4pm** I now have to deliver the full draft to my committee members 2 weeks early, by next Wednesday. I've been sending my advisor draft chapters every few days. Final research content chapter today!
1
0
15
@cheeesio
Kelly Marchisio (St. Denis)
2 years
I've overhauled the ✨IsoVec✨ code to make it more usable -- give it a try! (and feel free to reach out with questions) Github: Paper:
@cheeesio
Kelly Marchisio (St. Denis)
2 years
The ability to extract accurate translation dictionaries from monolingual embedding spaces depends critically on their geometric similarity--"degree of isomorphism." We address this root-cause of faulty X-lingual mapping with ✨IsoVec✨ #EMNLP2022 🧵1/N
3
7
49
0
2
15
@cheeesio
Kelly Marchisio (St. Denis)
3 years
in-person @ EMNLP this week - let’s meet for ☕️! (or 🍹🥛🧉🥂🧋🫖🍻🧃!)
0
0
14
@cheeesio
Kelly Marchisio (St. Denis)
2 years
Day 3: Today I read that Ernest Hemingway allegedly said, “write drunk, edit sober.” Turns out he *didn't* actually say this, which is a real shame because for a moment there I thought he'd cured my writer's block. Anywho, I copied the JHU thesis template today: it exists! 🍻
Tweet media one
2
0
14
@cheeesio
Kelly Marchisio (St. Denis)
4 months
A pleasant "good morning" from my dear friend, Command R 🥰
Tweet media one
0
0
13
@cheeesio
Kelly Marchisio (St. Denis)
2 years
Day 2: Skimmed over the thesis of a former lab-mate as an example, then made a rough outline: 3/N
Tweet media one
2
0
13
@cheeesio
Kelly Marchisio (St. Denis)
3 months
The ability to serve low-compute models is *critical* for wide global adoption. Even widely-used W8 quantization leads to degradation detectable by humans for some languages, and W4 is even worse. Consider multilinguality as a key evaluation criterion for efficient models!
1
0
12
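To make "W8" and "W4" in that tweet concrete: weight-only quantization stores each weight in 8 or 4 bits instead of 16. The sketch below uses plain symmetric round-to-nearest quantization with one scale per tensor, which is a generic textbook scheme and an assumption here, not necessarily any of the methods evaluated in the paper; the weight values and function names are made up for illustration.

```python
def quantize_dequantize(weights, bits):
    """Symmetric round-to-nearest quantization with a single per-tensor scale.

    Assumes at least one weight is non-zero (otherwise the scale would be 0).
    """
    qmax = 2 ** (bits - 1) - 1                       # 127 for 8 bits, 7 for 4 bits
    scale = max(abs(w) for w in weights) / qmax      # float value of one integer step
    quantized = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return [q * scale for q in quantized]            # map back to floats


def mean_abs_error(original, restored):
    return sum(abs(a - b) for a, b in zip(original, restored)) / len(original)


# Made-up weights, for illustration only.
weights = [0.013, -0.42, 0.27, 0.9, -0.731, 0.05, -0.08, 0.333]
err_w8 = mean_abs_error(weights, quantize_dequantize(weights, 8))
err_w4 = mean_abs_error(weights, quantize_dequantize(weights, 4))
print(f"W8 mean abs error: {err_w8:.5f}")
print(f"W4 mean abs error: {err_w4:.5f}")
```

With fewer bits the integer grid is coarser, so the reconstruction error on these weights is larger at 4 bits than at 8. How that error turns into task-level or human-perceived degradation is what the thread reports and is not something this toy example measures.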
@cheeesio
Kelly Marchisio (St. Denis)
2 years
Actual footage of me, a vim user, trying to quit nano 🤬
2
0
12
@cheeesio
Kelly Marchisio (St. Denis)
10 months
Come check out our poster tomorrow!
@yihong_thu
Yihong Chen
10 months
If you're deciding which #NeurIPS23 poster to check out tomorrow, don't forget our forgetting paper! Visit poster #328 Thursday morning to dive into the world of active forgetting. Discover how it enhances language models with greater language plasticity. See you there!
Tweet media one
Tweet media two
Tweet media three
2
14
48
0
3
12
@cheeesio
Kelly Marchisio (St. Denis)
10 months
@ahmetustun89 and I will be chatting about multilingual research at Cohere & C4AI today at #NeurIPS2023 ! Stop by and say “hello”! 👋 🌍🌏🌎
@CohereForAI
Cohere For AI
10 months
@NeurIPSConf @fraser_mince @dzungdinhh This afternoon at 2:30p - 3:30p CT join us at the booth to meet @ahmetustun89 and @cheeesio as they chat with attendees about “Multilingual Research & Innovation at Cohere and Cohere For AI.”
0
0
2
0
6
11
@cheeesio
Kelly Marchisio (St. Denis)
2 years
These are the types of questions & answers that get me excited about using ChatGPT -- The ones that are hard to ask traditional search engines, because punctuation/syntax really matters!
Tweet media one
0
1
10
@cheeesio
Kelly Marchisio (St. Denis)
3 years
Same! It me!
@jacasiegel
Jaclyn A. Siegel, PhD
3 years
If you ever see me in person, please say hi. Please approach me at conferences and assume we are best friends. Yes, I want to get coffee or drinks or dinner and talk about your cool new project or hobby or family.
54
226
5K
1
0
9
@cheeesio
Kelly Marchisio (St. Denis)
1 year
Legend. Many a late-night spent watching Professor Strang's lectures on 2x speed to understand my linear algebra homework. The impact this man has had on budding scientists/mathematicians is astounding!
@mitregressions
MIT:REGRESSIONS
1 year
Professor Strang gave his last Linear Algebra lecture today after 66 years at MIT. Strang was among the first to upload his classes to MIT OpenCourseWare when it first came online in the early 2000s. His 18.06 lectures have been viewed millions of times around the world
52
1K
7K
0
0
9
@cheeesio
Kelly Marchisio (St. Denis)
3 years
Re: Hybrid format -- I know some have felt bogged down with the amount of time it takes to make a recording + (poster / in-person talk) + paper. But, I am "reading" *so* many more of your papers now! I hope if/when we go back to in-person-only, the 10min videos will stay 📖
2
0
9
@cheeesio
Kelly Marchisio (St. Denis)
3 years
Just watched this very clear talk from EMNLP 2021 on Underline. Might help explain our findings in "When Does Unsupervised Machine Translation Work?", particularly Table 5 on instability in BLI
@arxiv_cscl
arXiv CS-CL
4 years
Analyzing the Surprising Variability in Word Embedding Stability Across Languages
0
1
2
0
0
9
@cheeesio
Kelly Marchisio (St. Denis)
4 years
Just received the cutest little “Work From Home Intern” Android from @Google for my remote internship. Thanks to Google Translate Research @markuseful @GrangierDavid for hosting me this summer!
Tweet media one
0
0
9
@cheeesio
Kelly Marchisio (St. Denis)
3 years
@fchollet 22, almost by accident, after a bachelors in psychology/sociology. Changed my life and has brought me more excitement, joy, and fulfillment than I ever could have imagined from a career
0
0
9
@cheeesio
Kelly Marchisio (St. Denis)
9 months
So excited to work with you, @johnamqdang !
@johnamqdang
John Dang
9 months
Excited to announce that I've joined @cohere as a Research Scholar to work on Multilingual RLHF for LLMs! Thrilled to be working with @ahmetustun89 @cheeesio @KreutzerJulia @sarahookr and the @CohereForAI team!
6
5
105
0
0
9
@cheeesio
Kelly Marchisio (St. Denis)
10 months
I’ll be at #NeurIPS2023 next week! Let’s meet up!
@sarahookr
Sara Hooker
10 months
Excited to be in New Orleans next week! 🎉 Very proud of the work we will be presenting, with many posters, talks and presentations ahead. Come chat with the @CohereForAI @cohere team. Happy to connect -- looking forward to catching up with friends old and new.
2
9
70
0
0
8
@cheeesio
Kelly Marchisio (St. Denis)
1 year
@davidbau @iclr_conf @srush_nlp @boknilev I'd love to see a write-up of this after, for those of us who won't be in Rwanda!
0
0
8
@cheeesio
Kelly Marchisio (St. Denis)
2 years
Day 1: Feeling energized after listening to Episode 151 of @marvettelacy ’s podcast: “Writing a shitty paragraph takes 10 minutes, tops.” Let’s gooooooo! 💪🏽 2/N
2
0
8
@cheeesio
Kelly Marchisio (St. Denis)
1 year
Day 10: Phew! No one tells you (...ok fine, plenty of people told me) that interviewing full-time at the end of a PhD means squeezing in writing in any spare energised moment. 1 hour til liftoff - can I crack out a couple sections? ✈️ ☕️
Tweet media one
1
0
8
@cheeesio
Kelly Marchisio (St. Denis)
9 months
@sarahookr We had work about efficient adaptation via the embedding layer alone at ACL and NeurIPS this year!
@JayAlammar
Jay Alammar
1 year
Here's my colleague Kelly Marchisio ( @cheeesio ) presenting “Mini-Model Adaptation: Efficiently Extending Pretrained Models to New Languages via Aligned Shallow Training” Work with @yihong_thu @PSH_Lewis @artetxem at @cohere @forai_ml @ucl_nlp @MetaAI @jhuclsp @RekaAILabs
1
14
60
1
0
8
@cheeesio
Kelly Marchisio (St. Denis)
2 years
Day 4: Printed out my relevant publications, and I'm deciding which parts will be moved to overall intro/background sections vs. which will stay in-chapter with research findings. These two will def need merging, as "BLI for Low Res..." was a follow-on paper to "An Analysis..."
Tweet media one
1
0
7
@cheeesio
Kelly Marchisio (St. Denis)
2 years
I’ll be presenting this live in 30mins. Come stop by!
@cheeesio
Kelly Marchisio (St. Denis)
2 years
Might supervised and unsupervised MT be mutually-beneficial? In our #NAACL2022 work, we ask whether the training methods result in systematically different output beyond what is visible via quality metrics like adequacy or BLEU. 🧵1/4
3
5
37
0
1
8
@cheeesio
Kelly Marchisio (St. Denis)
7 months
Coverage of our NeurIPS23 paper, led by @yihong_thu !
@QuantaMagazine
Quanta Magazine
7 months
To learn more flexibly, a new machine learning model selectively forgets what it already knows. @settostun reports:
0
24
78
0
0
8
@cheeesio
Kelly Marchisio (St. Denis)
2 years
Day 7: Unexpected 🎁 from my past self: In many of my latex docs, I’d commented-out alternate phrasings, paragraphs that I didn’t have space for, additional derivations, mathematical intuition, etc. Now with unlimited space, these are given new life!
1
0
8
@cheeesio
Kelly Marchisio (St. Denis)
8 months
Alright, hats off to GitHub Copilot 🎩 I wrote only the comments and a tiny post-edit to specify behavior of keep, but that's because my prompt was unclear. (OK I know it didn't actually 🖨️, but variable assignment is what I actually wanted so I could play with it myself.)
Tweet media one
1
0
8
@cheeesio
Kelly Marchisio (St. Denis)
1 year
Day 11: Personally, I ❤️ the new required "Limitations" section for *ACL conferences. When written well, they clarify work and (counterintuitively?) make the authors' main claims stronger. Keeping them in my thesis!
Tweet media one
1
0
7
@cheeesio
Kelly Marchisio (St. Denis)
2 years
Day 5: Decided that "BLI for Low Res..." and "An Analysis..." definitely belong together under the broader category of *Graph Matching Methods for Bilingual Lexicon Induction*. This morning, I spent an hour combining their setups into a "Shared Experimental Setup" section.
Tweet media one
1
0
7
@cheeesio
Kelly Marchisio (St. Denis)
2 years
@FromPhDtoLife Take ~5 years between ugrad & PhD to work, make some money (invest!), have a blast in your early-mid 20s, re-evaluate whether PhD is truly the path for you. If it is, go for it full-force!
0
0
7
@cheeesio
Kelly Marchisio (St. Denis)
1 year
Day 13: Now that I can talk freely about it, the final chapter is ✨Mini-Model Adaptation✨! I got feedback that I should "modernize" my thesis; What does multilinguality from embedding spaces look like in the age of LLMs? Here's a response! #ACL2023NLP
1
0
6
@cheeesio
Kelly Marchisio (St. Denis)
5 years
X-mas gifts received when your BF knows you too well! -- A NN was trained to name new paint colors. This is what it came up with 😂😂
Tweet media one
0
0
6
@cheeesio
Kelly Marchisio (St. Denis)
5 years
Video of my presentation of our recently published work: Found a couple nice summaries of it from MT Summit, also:
@WeCNLP
WeCNLP
5 years
The videos for the invited and lightning talks at WeCNLP 2019 are up! #WeCNLP19
1
5
19
0
0
6
@cheeesio
Kelly Marchisio (St. Denis)
1 year
Happening in 1hr! Gathertown link here, poster 938:
@CohereForAI
Cohere For AI
1 year
Today we are excited to have work from several of our Cohere Research staff being presented, take a look at where you can find our colleagues @PSH_Lewis @cheeesio @bminixhofer Phil Blunsom @satwik1729 and @tomhosking .
Tweet media one
2
2
6
1
2
6
@cheeesio
Kelly Marchisio (St. Denis)
4 years
There’s a certain joy in reading a book which recommends using a slide rule for division 🥰 ☕️
Tweet media one
1
0
6
@cheeesio
Kelly Marchisio (St. Denis)
3 years
Equal Contribution: Kelly Marchisio and Martin St. Denis, 23 Dec 2021. (*some repetition removed)
0
0
6
@cheeesio
Kelly Marchisio (St. Denis)
2 years
Day 6.5: Wearing this as a critical note-to-self.
2
0
6
@cheeesio
Kelly Marchisio (St. Denis)
1 year
Day 14: Time to re-commit to a writing habit! Interviewing is a full-time job, and each requires prep--so I've fallen off the writing 🚂 recently. To defend in June, I'm re-committing to 1hr writing sessions, 3x/week. Achievable, measurable! 🏆📏 All Aboard!! 🤪🤸🪩
1
0
5
@cheeesio
Kelly Marchisio (St. Denis)
4 months
Excited to be part of this collab! 🗺️
@CohereForAI
Cohere For AI
4 months
📣 Announcing our new cross-institutional collaboration. We've brought together researchers invested in improving multilingual benchmarks. We're starting with MMLU, a heavily translated dataset used for multilingual evals that doesn't capture cultural nuances. Let's address this
Tweet media one
1
28
114
0
0
5