💬🤖 In Vision-and-Language Navigation (VLN), tons of new methods are fueled by breakthroughs in LLMs and VLMs, and new challenges keep popping up. We've put together a systematic survey of trends and future directions to catch you up :)
@zhan1624
🎉Thrilled to share that our paper "World-to-Words: Grounded Open Vocabulary Acquisition through Fast Mapping in Vision-Language Models" was selected for the outstanding paper award at
#ACL2023NLP
! Thanks
@aclmeeting
:-)
Let's take grounding seriously in VLMs because...
🧵[1/n]
Want to edit your image with language descriptions in less than 3s? Ever questioned the need for prolonged inversion in text-guided editing? We are happy to release ♾ InfEdit (with demo), a flexible framework for fast, faithful and consistent editing.
🔗
Honestly, the quality of the batches I reviewed for EMNLP and NeurIPS isn't much better than that of random arXiv papers. It’s clear that some submissions are from beginners, and while I tried to give constructive feedback, come on, mentors, you guys can do better. Reviewing service is a
The moment I hate privileged reviewers: "Why don't you pre-train something larger but still trainable, like GPT-3 or LLaMA?" Oh, sure, let me just casually whip up a 175 billion parameter model on 300 billion tokens with 20 random seeds :)
Seriously, I also started to remind
Happy to share some “slow research” - it's been 15 months and it's now (finally) on arXiv! Our language development is different from that of LLMs. We're asking: How do you interactively babysit a language model from scratch, and would it help?🤔
🔗
@Michigan_AI
Emerged or not emerged, that may not be the right question to ask for Theory of Mind (ToM) in
#LLM
. In our theme track paper in the Findings of
#EMNLP2023
@emnlpmeeting
, we asked ourselves (1) what constitutes a machine ToM? (2) How to better evaluate ToM in LLMs?...🧵[1/n]
The NeurIPS submission process is frustrating, and the review process is equally so. While it’s annoying when some reviewers focus solely on criticizing novelty, looking at the batch of papers I have to review, I can’t help but feel that some people are using this as an excuse to
Life Update: Starting early next year, I'll be embarking on a new journey as a research intern with
@AdobeResearch
. Always passionate about language + embodiment in my research, I've now realized some answers I seek might lie beyond texts and 2D interactions.
Excited to give a guest lecture on “Connecting Language to the World: Towards Language Grounding to Vision and Embodied Agents” at UCSD. 😇
Thanks Prof.
@ZhitingHu
for hosting!
Happy to receive the Weinberg Cognitive Science Fellowship! **Scalable** computational models of cognition are at the core of my passion. There's so much to explore in the developmental trajectories and emergent capabilities of large-scale models and human-like machine learning.
Recent life updates:
A lot has been going on towards the end of my second year as a Ph.D. student.
1. Excited to join
@AmazonScience
over the summer with the Embodied AI team!
2. Got 2 papers accepted to
#ACL2023
and 1 paper accepted to
#IJCAI2023
. See you in Toronto and Macau!
A tad late to the party, but happy to share that CycleNet has been accepted to
#NeurIPS2023
@NeurIPSConf
! Consistency has been a pain in text-guided image editing with
#DiffusionModels
and here is our solution to guarantee cycle consistency...🧵[1/n]
📍
Just landed my work at both
#EMNLP2023
(12.6-12.10) and
#NeurIPS2023
(12.10-12.16), but the Singapore ➡️ New Orleans flight takes over a day...😨
Do I just pick one or attempt the marathon?
I actually spent my first day at
#CVPR2024
with
@UW_Robotics
. I can definitely see why so much great work is happening here! Huge thanks to
@DJiafei
for the fantastic tour, he is simply amazing!
Same story +1. Our paper on visually grounded language learning was rejected by AAAI 2023. After revisions, this paper ( ) received an outstanding paper award at ACL 2023. Peer reviews were helpful in improving paper quality!
We had a similar story. Our paper on vocabulary learning received all rejections at ICLR 2021 (). We revised the paper and submitted it to ACL 2021, where it received the best paper award. But I do not complain about the reviews; we did take the comments and revised substantially.
Heading to
#CVPR2024
in Seattle next week with 2 posters and a workshop paper. If you are interested in scalable multimodal learning (specifically 3D and video representations for LLMs) and embodied AI, please DM me and connect, I am ready for 10 cups of coffee every day😄☕
Since I started my Ph.D. 2 years ago, I've ALWAYS waited until my papers were officially accepted and the camera-ready versions prepared before releasing & advertising any arXiv preprints. This might change soon. I'm considering uploading some work to arXiv earlier than usual…🧵
Fantastic seminar talk by our esteemed
@Michigan_AI
alumnus
@lajanugen
on Task Planning with Large Language Models! Always inspiring to see Wolverines leading the way. 💛💙
Mark your calendar for our upcoming seminars and check out the recordings 👉
I went through 3 rounds of exchanges with the authors whose papers I reviewed, but when it comes to my own submission, there's no feedback even though they asked for 4 more experiments. This really makes me think twice about submitting and serving the ARR in the future ... :(
I'll be presenting papers at
#ACL2023NLP
in Toronto. Catch me at poster sessions 1 and 7. Looking forward to reconnecting with old friends and meeting new ones. 🤓 Let's chat about (vision-)language models, grounding, embodied AI, multimodal interaction, and theory of mind ✨
Whenever I can, I like to take my time to properly cite papers, looking up details like the conference names, page numbers, or even workshop names, instead of just referencing the arXiv version. What do people think about this? Just curious.
Truly grateful for the recognition bestowed upon me by CSE for my dedication to teaching. My ultimate aspiration is to make NLP accessible to every student. It fills me with immense joy to realize that my efforts are positively impacting the education of others. 😊
We're thrilled to announce the Outstanding GSI and IA Award winners for 22-23! Sending our heartfelt congratulations and thanks to these exceptional student teachers! Your enthusiasm, expertise, and dedication have had a lasting impact on CSE students
#IChoseUMich
|
#GoBlue
Thrilled to share that I will be staying in our wonderful
@Michigan_AI
this fall as a Ph.D. student, working with Prof. Joyce Chai on the exciting
#NLProc
research! Thank you, Joyce, for your trust and support!
Since 2024, I've undertaken 17 paper reviews across a range of conferences like NAACL, CVPR and COGSCI, with another 4 to go for ACL. This commitment has increasingly encroached upon my free time off work. Conversations with ACs also revealed a mutual sentiment of exhaustion...
We are excited to host Professor Ming Yin () to present her work on human-AI collaborative decision-making!
I met Prof. Yin at IJCAI 2023 last year and I gained so much insight from her. Can't wait to catch up on her new work.
👉
Definitely gonna remember to print the poster before the deadline, because taking it on the plane is way too much of a pain…
Anyways, at
#NeurIPS2023
now!
Open to any random coffee talk about multimodality, cognitively motivated AI, embodied AI, and basically anything.🤓
I'm unable to join
#EMNLP2023
in person because of a conflicting travel commitment for a talk. Missing my NLP colleagues already.🥹
However, you can meet
@jhsansom
and
@RoihnPeng
at the in-person poster session! Also DMs are open for discussions about ToM or any other topics.😬
Reflecting as the year comes to a close:
So, my friends outside of the academic world keep asking me why I picked a career in research. They wonder if I'm trying to change the world with AI. Honestly, after thinking about it, my answer is no.
Huge congrats to everyone who got their papers accepted to
#ACL2024NLP
! 🎉
Our SpLU-RoboNLP Workshop @
#ACL2024
is still open for submissions! We have a non-archival option and welcome papers accepted to ACL Findings (And, yes, RSS too😆) for presentation.
@gimdong58085414
I can relate. I’ve also noticed several uncited paraphrases of my papers, but it seems there is no way to report plagiarism on
@arxiv_org
…🤦♂️
Life is not Lipschitz continuous. There are sudden burnouts and aha moments popping up when you least expect them, so don't base all your future plans solely on your current momentum :)
Referential grounding is not a one-to-one mapping. When queried about multiple objects, models hallucinate more. We provide insights on whether grounded pretraining, scaling, and post-training techniques can effectively reduce object hallucinations. (Spoiler: Not necessarily)
🎉Happy to share my first paper "Multi-Object Hallucination in Vision-Language Models" in
#ALVR
at
#ACL2024
! Do VLMs hallucinate more in multi-object scenes? Shortcuts? Spurious correlations? What factors contribute to their hallucinations? 👀
🔗 🧵[1/n]
Excited to co-organize this wonderful venue focused on Spatially Grounded NLP and Human-Robotics Communication. We invite you to submit your papers or propose talks for SpLU-RoboNLP at
#ACL2024
!🥳
I will be presenting this work at
#IJCAI2023
in Macau. Feel free to drop by our presentation on Wednesday, Aug 23rd, 15:30-16:50 in the Humans and AI track!
🎉We are happy to share that our paper "Towards Collaborative Plan Acquisition through Theory of Mind Modeling in Situated Dialogue" has been accepted at
#IJCAI2023
@IJCAIconf
! 🧵[1/n]
Mood for reviewing for NeurIPS and ARR:
A gentle reminder to those claiming to have created a general or holistic benchmark: you did try your best but no such thing exists, sorry.
Today, we are thrilled to welcome
@DrGloWash
to our Michigan AI seminar series to share insightful research on human-centered AI.
For upcoming
@Michigan_AI
seminars and recordings 👉
We're proud to present 16 student projects from our NLP class in Fall 2022! These papers showcase the hard work of our talented students. We invite you to visit our website and discover their innovative work.
🔗
#NLP
#studentprojects
It would be great to study vision-language navigation (VLN):
1⃣ in a dynamic environment;
2⃣ with freeform, sensorimotor-grounded dialogue;
3⃣ to handle unexpected situations.
But how? Check out our new
#EMNLP2022
Findings paper *DOROTHIE* from the
#SLEDLab
@Michigan_AI
.
A🧵
We are hosting
@BondiElizabeth
's AI Seminar on Realizing AI for Impact: Uncertainty and Human-Agent Collaboration in Multi-Agent Systems for Public Health and Conservation :-)
🔗
Today, we solemnly remember the 15th anniversary of the devastating 2008 Sichuan earthquake. As someone born and raised in Sichuan, it remains etched as the deepest trauma in the hearts of our people. My heartfelt condolences to all who were affected.
As the new semester kicks off, we're excited to restart our
@Michigan_AI
Seminar Series! Today, we warmly welcome
@DejiaoZhang
, who will be delving into code representation learning. Don't miss out on this insightful session, it's happening in an hour!
👉
We
@Michigan_AI
are thrilled to host an amazing seminar with Gaowen Liu and
@yuzhang_shang
from
@Cisco
Research, delving into cutting-edge developments in quantization, pruning, and knowledge distillation!
For upcoming seminars and recordings 👉
Due to visa issues, I won't be attending
#ICML2024
in person, but I will be giving a contributed talk remotely at the LLMs and Cognition workshop. Looking forward to discussions on scalable pre-training and cognitive science!
For a paper we submitted in August, a reviewer kindly asked us "how do you think about this" plus a link to a paper that was released this month ............😅
Unfortunately, I won't be able to attend
#ACL2024NLP
this year (with so much work on my plate and feeling too exhausted to travel and socialize)...However, I encourage you to check out our
#SpLU
-
#RoboNLP
workshop and our paper "Multi-Object Hallucination in Vision-Language
Hello, SLED (
@michigan_AI
) has a Twitter now! 👋 We'll kick off our account by sharing some of our members' upcoming work at
#ACL2023NLP
. Follow us to keep up!
Instead, I'm really into understanding how intelligence works, not just in human intelligence but intelligence in a broader sense, as a piece of science. For me, the coolest part of AI research is just satisfying my own intellectual curiosity :-)
We study whether embodied experiences (description-video pairs from a simulator) and social experiences (human-vehicle dialogues where another human plays the role of the vehicle) enhance the development of video-language models for interactive autonomous driving agents. Demo included!
🚗 Excited to share our latest work on building outdoor embodied agents: DriVLMe, appearing at VLADR @
#CVPR2024
! 🥳
🔗
DriVLMe is a video-language-model-based agent designed for seamless communication between humans and autonomous vehicles, enabling
We're absolutely delighted to have
@leejayyoon
join us at the
@Michigan_AI
Seminar today! Can’t wait for this engaging presentation titled "Automatically Capturing and Reflecting Latent Label Dependencies in Machine Learning Models." 🤖
As the semester begins, we're excited to restart the Michigan AI
@michigan_AI
seminar series!
Check out our initial lineup of speakers, and keep an eye out for more updates and new additions! 🙌
Speakers (ordered by the schedule):
Bryan Pardo
Today we're excited to host an engaging seminar featuring
@maia_jacobs
, in which she will explore the complexities of interactive design for AI in medicine and healthcare!
For upcoming
@Michigan_AI
seminars and recordings 👉
Thank you for the great invite! I’m excited to talk about my CausalNLP research in one of the rising star talks at
@michigan_AI
Symposium on the 17th (Tue). See you in Ann Arbor! (I’m also happy to chat throughout the upcoming week at UMich!)
Unexpectedly yet excitingly, we are wrapping up this year's
@michigan_AI
seminar with
@saprmarks
, on Sparse Feature Circuits and their role in discovering and editing interpretable causal graphs in
#LLM
!!
🔗
Ok, every year when I look up my bids for NeurIPS, I can't help but feel like I've been scooped again. It's like the field is on rocket fuel….😫 But is there something wrong with the configuration this year? I can even bid on papers that I proofread for my previous coauthors…
Chatting with pals and suddenly a flashback hits: I tried something somewhat similar to Chain of Thought with GPT-2 back in 2021. Got some results but we gave up as it was far from robust. So, keep those brainwaves flowing, maybe your next big break is just a GPT-3 away! 😂
Thanks
@ml_collective
for hosting! We will also present this work at the Large Language Models and Cognition Workshop @
#ICML2024
. Happy to chat about how interaction and agentic behaviors could shape language learning.
🚀 Join us tomorrow at 10AM PDT for our
#DLCT
session with
@ziqiao_ma
! Explore how human social interactions shape language learning and discover the trial-and-demonstration (TnD) framework's impact on AI. Don't miss insights on efficient word learning in language models! 🌟
📢Call for POSTERS/DEMOS!
Do you have work in
#AI
that you would like to share?
Participate with a poster or demo in the
#MichiganAI22Symposium
.
Submit the title/abstract by Oct. 22.
Registration also required (free).
In Dec 2023,
@TairanHe99
convinced me that humanoid teleoperation is the future - moving beyond the foundation model hypes and potentially tackling the interactive data scarcity for robotics. I knew the team had worked tirelessly for this groundbreaking release. Congrats!
🤖 Introducing H2O (Human2HumanOid):
- 🧠 An RL-based human-to-humanoid real-time whole-body teleoperation framework
- 💃 Scalable retargeting and training using large human motion dataset
- 🎥 With just an RGB camera, everyone can teleoperate a full-sized humanoid to perform
My 5 reviews for
#EMNLP2023
are each longer than the combined length of the 5 reviews I got from
#NeurIPS2023
. Do unto others as you would have them do unto you :-)
OpenDevin
An Open Platform for AI Software Developers as Generalist Agents
Software is one of the most powerful tools that we humans have at our disposal; it allows a skilled programmer to interact with the world in complex and profound ways. At the same time, thanks to
Our SpLU-RoboNLP workshop is set for August 16th with
#ACL2024
. We invite you to join us and look forward to the incredible talks from our outstanding speakers :)
Thank you Freda (
@fredahshi
) for graciously accepting our invitation to speak at our
#AI
Seminar by
@MichiganAI
. 😆 I can’t wait to gain valuable insights from her presentation on Learning Syntactic Structures from Visually Grounded Text and Speech.
👉
We are hosting Prof. Sidney D'Mello in today’s
@michigan_AI
seminar; he will share his vision for the next generation of human-AI partnerships, from autonomy to synergy.
🔗
Excited to be involved in this collaborative initiative!😬 Alignment research shouldn't be a static, one-way process but rather a continuous, reciprocal engagement between humans and AI. AI and HCI researchers need to communicate to facilitate this. Check out this thread...👇
📢Is current “human-AI alignment” research clarified and comprehensive? 🤔 We systematically reviewed 400+ papers across HCI, NLP, and ML to develop a framework for 👫<>🤖"Bidirectional Human-AI Alignment", encompassing the dual paths of “Aligning AI to Human” and “Aligning Human
Instruction following has been a critical but challenging task for Embodied AI like household robots. Check out how our work DANLI addresses this problem with deliberate planning 🤖
P.S. I will be attending EMNLP 2022 in person, happy to chat and connect then!
💭Ever wondered how deliberative symbolic planning can help in Embodied AI🤖? Check out our new
#EMNLP2022
paper *DANLI* from the
#SLEDLab
(
@michigan_AI
). Meet us on Friday, Dec. 9 in Oral Session 3!
Pre-print:
Video:
A thread🧵
How would trials and demonstrations from interactions influence neural language acquisition from the ground up? Our experiments (on both BookCorpus and
@babyLMchallenge
) reveal that TnD accelerates word acquisition for student models with equal or smaller numbers of parameters.
Check out this great work by
@ZeYanjie
bridging 3D representations and diffusion policies for visual imitation learning. Very impressed by the improvement in avoiding safety violations! Next step: get LLMs in the loop to tackle long-horizon tasks?
Introduce 3D Diffusion Policy (DP3), a simple visual imitation learning algorithm that achieves:
- 55.3% relative improvement on 72 simulated tasks, most with 10 demos
- 85% success rates on 4 real-world tasks, with 40 demos🥟🌯
Open-sourced!
Code/Data:
The SLED group just had our first mentee symposium: our amazing master's and undergraduate members gave updates on their research and shared their research interests. Special credit to
@jed_yang
for moderating this wonderful event!
🚀🎉 Exciting news! Registration for Michigan AI 2024 Symposium is NOW OPEN, link below.
🎟️✨ Don’t miss out on this incredible experience - join us for a day of all things
#AI
!