Did you know there are other dialog agents like ChatGPT?
And what if I told you the secret sauce is IFT, RLHF, CoT, and SFT?
We explain each of these terms and why they are relevant to ChatGPT by comparing it with 4 other dialog agents.
Check our blog:
Here's hoping I don't need to update this slide again before my talk next week
@emnlpmeeting
If anyone is planning to release anything next week, please lmk soon
Am I missing any text-only LLMs?
You can create your own chatbot by fine-tuning a pre-trained causal LLM to follow instructions.
Here is a list of datasets on
@huggingface
hub that you can use for instruction fine-tuning (IFT). /0
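At its core, IFT just means flattening each (instruction, response) pair into one training string and continuing with the usual causal-LM objective. A minimal sketch, assuming an Alpaca-style template (the markers and field names are illustrative conventions, not tied to any specific hub dataset):

```python
def format_ift_example(instruction: str, response: str, context: str = "") -> str:
    """Flatten one instruction/response pair into a single training
    string for supervised instruction fine-tuning (IFT). The
    Alpaca-style template below is one common convention; hub
    datasets use many different schemas."""
    if context:
        prompt = (f"### Instruction:\n{instruction}\n\n"
                  f"### Input:\n{context}\n\n"
                  f"### Response:\n")
    else:
        prompt = f"### Instruction:\n{instruction}\n\n### Response:\n"
    return prompt + response

# One formatted example, ready to be tokenized for causal-LM training
example = format_ift_example("Translate to French.", "Bonjour !", context="Hello!")
```

During training, the loss is often masked so it is only computed on the response tokens, not the prompt.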
Thanks to Open Science, we are releasing Zephyr, a 7B-parameter model that is as good as ChatGPT on AlpacaEval
Our model is created using:
-
@MistralAI
Mistral 7B base model
- The UltraChat dataset for SFT
- The UltraFeedback dataset for DPO
Other results and demo link in the thread below
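The DPO step in the recipe above optimizes a simple pairwise objective. A minimal sketch of the per-example loss, assuming you already have the summed log-probabilities of each response under the policy and the frozen reference model (beta = 0.1 is a commonly used value, not necessarily the one used here):

```python
import math

def dpo_loss(policy_chosen: float, policy_rejected: float,
             ref_chosen: float, ref_rejected: float,
             beta: float = 0.1) -> float:
    """Direct Preference Optimization loss for one preference pair.
    Each argument is the summed log-probability of the chosen or
    rejected response under the policy or the reference model."""
    logits = beta * ((policy_chosen - ref_chosen)
                     - (policy_rejected - ref_rejected))
    # -log(sigmoid(logits)), written in a numerically stable form
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))
```

When the policy matches the reference, the loss is log 2; it shrinks as the policy assigns relatively more probability to the chosen response than to the rejected one.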
Life update: I have joined
@huggingface
and I will be working alongside
@douwekiela
@Thom_Wolf
@mmitchell_ai
and all the amazing folks here. I am excited to continue pushing research on model understanding and evaluation.
New preprint alert!
Introducing GeDi (pronounced Jedi): A Powerful New Method for Controlling Language Models.
Paper:
Code:
Blog:
This paper has a bunch of really cool results. Here are a few.
Just finished teaching my last class on Interpreting ML models and it has been such a rewarding experience!
We learned a ton of methods covering feature and instance attributions on three data modalities and evaluated each for plausibility and faithfulness.
4 hands-on projects!
This was my first time submitting more than one paper to a conference, and I am happy to announce that I have 3 long papers at
#acl2020nlp
1. ERASER benchmark for interpretability
2. Causal and commonsense physical reasoning
3. Gender debiasing for word embedding
#nlproc
#silverlining
I am studying the ML model lifecycle and had a hypothesis that recent ML models have shorter lifecycles, i.e., their usage peaks and dies out quickly as they are replaced by newer, more efficient models (DALL-E --> Stable Diffusion). So I did a systematic analysis of 65K models on the HF hub.
I came to terms with the fact that I'd have to update the timeline every so often, but I must admit that I did not think I'd have to update the models' access status so frequently.
PaLM: closed --> limited
Claude: closed --> limited
Stoked to share that our tutorial on Responsible Generative AI got accepted at both
@FAccTConference
and
@icmlconf
Looking forward to meeting everyone but not looking forward to updating this slide
I'm open to suggestions on specific topics to cover.
#NLProc
does not have a standard benchmark for interpretability. I am stoked to announce ERASER: the first-ever effort on unifying and standardizing NLP tasks with the goal of interpretability.
If you are interested in learning to interpret ML models using the
@huggingface
workflow, this is your last chance to sign up for the course that starts in less than 2 weeks.
It is a hands-on 4-week course with exciting projects each week. Sneak peek of week 3 below.
Is open-source having its ChatGPT moment?
LLaMA 2 is here (). When LLaMA was released earlier in the year, it was a pivotal moment for the OSS community. The advancement in LLMs has accelerated massively since, with research artifacts inspired by or
I am stoked to share that I am among the select individuals around the world who will take on the *huge* responsibility of serving on the
@UN
's AI Advisory Board along with some prominent individuals including
@miramurati
@LatifaMKarim
@HKitano
Sharad Sharma, and many more.
Our paper on Systematic Error Analysis and Labeling (SEAL) has been accepted at the EMNLP demo track!
Problem: How can we help users find systematic bugs in their models?
E.g., an image classification model on low-light images, or a sentiment classifier on gym reviews
#emnlp2022
If I told you the following based on our learnings from working on LLM evaluations using humans and GPT-4, which ones would surprise you most? What is your intuition behind them?
1. GPT-4 has a positional bias and is predisposed to generate a rating of "1" in a pairwise preference
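One common mitigation for that positional bias is to query the judge twice with the candidate order swapped and only keep verdicts that survive the swap. A sketch, where `judge` is a hypothetical callable wrapping the pairwise GPT-4 prompt and returning "1" or "2" for whichever position won:

```python
def debiased_pairwise(judge, prompt: str, answer_a: str, answer_b: str) -> str:
    """Run a pairwise LLM judge in both orders; a verdict that flips
    when the order is swapped is treated as a tie (positional bias)."""
    first = judge(prompt, answer_a, answer_b)   # "1" = first slot wins
    second = judge(prompt, answer_b, answer_a)  # same pair, order swapped
    if first == "1" and second == "2":
        return "A"  # A wins regardless of position
    if first == "2" and second == "1":
        return "B"  # B wins regardless of position
    return "tie"    # verdict depended on position

# A judge that always picks position 1 (pure positional bias) yields only ties:
always_first = lambda p, x, y: "1"
print(debiased_pairwise(always_first, "q", "a", "b"))  # tie
```

The same two-pass trick extends to Likert-style ratings by averaging scores across both orderings.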
I have had the honor to work with
@miramurati
every week as part of our work on the UN's AI Advisory. I have no doubt she will be able to lead the most powerful AI startup through this turbulence
Here's a v0 Datasets explorer:
The embeddings use datasets' descriptions & paper abstracts. Here are some interesting things you can do. cc
@YJernite
@radamar
@hima_lakkaraju
and I really enjoyed presenting our tutorial on Generative AI meets Responsible AI at
@FAccTConference
.
I got many requests for our slides, so I added them to my webpage
Thanks,
#FAccT2023
, for a great conference and fantastic audience!
Seeing all the EMNLP reviewers increase their scores after I initiated a discussion based on what does and does not count as a good reason for rejecting a paper is pure joy.
Almost feels like it's for my own paper :)
#ACduties
#emnlp2020
Sundar asked Google employees to spend a few hours every day stress-testing their chatbot Bard.
Bing's Sydney showed its malevolent alter ego to
@kevinroose
which led to
@OpenAI
committing to improving chatbot behavior.
What they need is red-teaming
I will be giving a talk tomorrow morning in
@NIST
's AI Measurement and Evaluation colloquia series on the topic of evaluating LLMs.
I'll be discussing evaluating a chatbot like ChatGPT and how we are thinking about it
@huggingface
while working on an open-source alternative.
- Interpreting LLMs using LLMs
- Red-teaming LLMs using LLMs
- Evaluating LLMs using LLMs
(where the first LLM is smaller than the second)
I am seeing a trend. What's next?
You can interactively compare the
@databricks
Dolly instruction-tuned model here
Do you agree more with the 3B model or the 7B?
RLHF might help: preference data is easier to collect, but you need a ton of it.
Would sufficient human-written instruction data offset the need for RLHF?
Excited to announce our latest work on Explaining Solutions to Physical ReasonIng Tasks (ESPRIT), an interpretable framework for representing complex physical concepts such as gravity, friction, and collision using natural language, accepted at
#acl2020nlp
!
Really proud of this collaboration with
@tableau
research! We have the interactive demo deployed as
@huggingface
space
You can interactively evaluate and analyze the model on various data slices. By default, it shows performance on US protected groups. (1/4)
I am delighted to share our work on "Interactive Model Cards". This was a collaboration with Mar Drouhard,
@jesse_vig
and
@nazneenrajani
, which we'll be presenting at the
@FAccTConference
!
Paper:
Demo:
(1/2)
I am stoked to be featured on the cover of this well-written NYT article
I believe *alignment* is the secret sauce behind ChatGPT. Having worked on RLHF, including data collection from external vendors, and fine-tuning hundreds of open-access models at
@zacharylipton
Hold remote mentorship group sessions. Topics could be: applying to grad school, applying for jobs, help with editing papers and slides, etc.
It seems like only yesterday that we moved from ATX to the Bay Area!
Grateful to
@SFResearch
and
@RichardSocher
for supporting me as I adjusted to my first full time job while being a new mother.
Here's to many more exciting years
@SFResearch
Influence functions are great for debugging ML models, but they cannot be used in practice because they are prohibitively expensive. FastIF is a more practical and efficient solution for model interpretability and debugging.
I am hiring a Research Scientist to work broadly on Explainable AI (XAI) at
@SFResearch
with a fun and friendly team of talented researchers committed to ethical AI practice. You should be available to join in the next few months.
JD:
Please DM me for any questions.
So jealous of Sama rn. He got to read all the eulogies and know who his friends and enemies are. And come back even stronger and more powerful than ever!
I will be giving an invited talk at the Toronto Machine Learning Summit (TMLS) about my work
@SFResearch
on how we can train language models to generate explanations and use them for performance gain on downstream tasks as well as be transferred to out-of-domain tasks
#tmls2019
@RichardSocher
We are doing all of this
@huggingface
while building an open-source alternative to ChatGPT called H4
Stay tuned for high-quality SFT and RLHF data.
Are standard NLP benchmarks good enough for evaluating chatty LLMs? ๐ค
In my experience, they are good for evaluating pretraining and in-context learning but not for SFT or RLHF models.
Here is a straightforward example I tried on both Falcon and RedPajama. Falcon got 1/2, and
Announcing RedPajama 7B trained on 1T tokens!
- Instruct, chat, base, and interim checkpoints on
@huggingface
- The instruct model outperforms all open 7B models on HELM benchmarks
- The 5TB dataset has been used to train over 100 models
Details in the thread below
Can GeDi be used to debias this
#GPT3
generation?
@benwkrause
and
@AkhileshGotmare
used GeDi for filtering
#GPT3
generations and the results are fascinating!
We have pushed the code so you can try GeDi for
#GPT3
while you still have API access
Thank you,
@aclmeeting
organizers, for putting together an amazing virtual conference. Learned a lot and also enjoyed Zoom mentoring + paper discussions.
PS: I can no longer watch any video at a normal rate; 1.5x is the new normal
#acl2020nlp
Recently I have been thinking deeply about questions on evaluating LLMs for emerging capabilities.
One thing I worry about is overfitting to current capabilities and I'd imagine this becomes even more of a problem in policy where things move even slower.
I spend about 5-6 hours each week interacting with
@mmitchell_ai
and many more working with her, and I 100% agree with everything in this thread! So much respect and gratitude for everything she does
4-5 years ago
@mmitchell_ai
was a semifinalist for the MIT Tech Review 35 under 35. I wrote a letter of support. One of the things I mentioned was that her work has been so underappreciated in the field of "AI." 1/n
1/3 I had my O1 (Extraordinary ability) visa interview on Monday morning at
@USCGFlorence
and they put my case through additional background checks, even though I have an approved O1 petition.
I traveled to Florence to present my research on Explainable AI at
#acl2019nlp
Excited to have
@jesse_vig
join us
@SFResearch
!
Jesse has done amazing work on visualizing the inner workings of various
#NLProc
models. Looking forward to working with him on more cutting-edge research in interpretability.
Stay tuned!
Updated the CoS-E repo with code to reproduce results from our ACL paper on commonsense reasoning using natural language explanations
Check it out here:
Better late than never!
UPDATE: Got an email from the consulate that my visa has been approved and I will get it on my passport on Monday. Thank you all for your support!
Very happy that I am part of a very supportive community!
#acl2019nlp
will forever be etched in my memory
I think I found a solution to jet lag -- give an invited talk the very next day so that you keep making last-minute changes and won't have time to nap
#EMNLP2022
Due to COVID-19, we have decided to shift the NeurIPS timeline 3 weeks back, giving authors additional time and flexibility. We hope this is helpful to the NeurIPS community! Abstracts now due May 27, paper deadline June 3. Good luck all - stay safe and well.
#neurips2020
I have worked and co-authored papers with Drago. He was a very kind soul and went out of his way to help people. I am incredibly shocked to hear this news. We exchanged emails 2 weeks ago.
Life is so uncertain.
Condolences to his family and friends.
We crossed 100,000 public AI models on the
@huggingface
hub available for free to all. Thank you to the whole community of contributors. Proud to make ML more open & collaborative!
Check out the latest work from
@nazneenrajani
: Explaining Solutions to Physical Reasoning Tasks (ESPRIT), an innovative framework which unifies commonsense physical reasoning and interpretability using natural language explanations.
The work is done in collaboration with a lot of amazing folks
@huggingface
. This would not have been possible without the Mistral model, the UltraChat and UltraFeedback datasets, and the MT-Bench and AlpacaEval evaluations.
Super excited to be in ATX for recruiting and speaking
@UTCompSci
Looking forward to sharing my experience working
@SFResearch
with old friends and new folks! I will be presenting at FAI on Friday.
Both my
#ICLR2021
paper poster sessions are today between 5-7pm PST (Session 9).
1. Interpreting protein LMs: in spot A3
2. Counterfactuals to evaluate DST in spot A4
Stop by to learn more about our work+current research directions
I am doing my best to flatten two curves right now. One is the
#COVID19
curve and the other is my daughter's screen time curve while being in quarantine.
We have had some remarkable success in the last two days; hopefully the same is true for the
#COVID19
curve
#FlattenTheCurve
Check out StackLlama, a research artifact we open-sourced as we build our open-source alternative to ChatGPT/Claude.
Let us know what you think and what you would like to see more of:
1. Datasets for SFT and RLHF
2. Instruction fine-tuned/RLHF models
3. Knowledge and findings
Excited to introduce: StackLlama
An end-to-end tutorial for training Llama with RLHF on preference data such as the StackExchange questions!
Blog:
Demo:
Code:
The resulting model is surprisingly fun!
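Before the RLHF step in a pipeline like this, a reward model is trained on the preference pairs (e.g., StackExchange answers ranked by upvotes) with a Bradley-Terry pairwise loss: the chosen answer's scalar reward should exceed the rejected one's. A minimal scalar sketch of that loss (a toy version of what preference-training libraries compute over batches):

```python
import math

def reward_pair_loss(r_chosen: float, r_rejected: float) -> float:
    """Pairwise (Bradley-Terry) loss for reward-model training:
    -log(sigmoid(r_chosen - r_rejected)), written stably."""
    x = r_chosen - r_rejected
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))
```

The loss is log 2 when the two rewards tie and falls toward zero as the margin between chosen and rejected grows; the trained reward model then scores generations during the PPO stage.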
More like a fully disconnected conference with zero communication to invited speakers about their time slots.
I declined the invite to speak. In any case, it seemed like sprinkling token women into a dude lineup.
If you wanted to hear about our work on H4, it's not happening here.
I'm looking forward to discussing the
@huggingface
ecosystem of NLP models, evaluation, and documentation.
Here's one fact from the talk -- 0.2% of the models drive >80% of the usage on
@huggingface
Join me for more exciting results and findings.
Our LLM leaderboard broke the internet and took the community by storm!
We have now expanded our leaderboard to include Human evals in partnership with
@scale_AI
and GPT-4 evals
The most fascinating result to me is how the most human-aligned model, GPT-4, is actually not
Interesting work by
@nazneenrajani
, our team at
@SFResearch
and
@Yale
on ESPRIT, a framework for commonsense reasoning about physics in natural language. It generates interpretable descriptions of physical events.
Paper:
Blog:
I had a lot of fun talking about my work
@SFResearch
at FAI
@UTCompSci
during my visit to ATX. I am hiring for the RS position with a focus on XAI. We are a fun and friendly research team. If you are interested, please apply here:
Announcing the 2nd Annual
@Salesforce
Research Deep Learning Grant!
We're looking for diverse individuals with innovative ideas who can join us in shaping the future of AI.
Apply today, and earn up to $50,000!
Is this what alignment/safety tax looks like in practice?
There are tradeoffs between alignment and performance. I can imagine that aligning GPT-4 via RLHF after a point leads to a massive degradation in performance.
Lots of people are wondering whether
#GPT4
and
#ChatGPT
's performance has been changing over time, so Lingjiao Chen,
@james_y_zou
and I measured it. We found big changes including some large decreases in some problem-solving tasks:
New preprint on the interpretability of protein language models. The most fascinating result for me was how well attention learns the 3D structure of proteins!
We found that attention captures not just structure but also protein functions such as binding sites.
My first time
@FAccTConference
and my first reaction is "it's sparse". It's tiny compared to my first conference, which was ACL 2015 in Beijing.
This is a keynote, and they put out tables with chairs, and it's still so sparse
ChatGPT makes many factual errors, but I still feel its impact on learning and education can be disruptive.
I can already imagine my 5yo using ChatGPT with strong guardrails to learn and have her understanding about things corrected. Hope RLHF works well for basic concepts.
When ChatGPT came out I thought I wouldn't use it for learning because of its tendency to slip in some BS among 10 helpful explanations. Then I tried it, and found that it forces me to think critically about every sentence, which is the most effective mindset for learning.
Why does the
@ACL2019_Italy
camera-ready deadline overlap with the
@NAACLHLT
main conference? I spent much of my day working on my paper, and I saw a lot of Overleaf screens all around :-/
#naacl2019
The Alpaca Moment of Code is here โจ
We released the instruction-tuned version of
@BigCodeProject
's StarCoder, called StarChat Alpha
Check it out:
More details in the blog:
Congrats to the team!
I tried the interactive demo and here's what I found:
1. It is not sure if women should be allowed to vote (img 1).
2. The bot is a woman called Jane (img 2).
3. It claims FB tracks us all over the internet and might have been involved in rigging the 2016 elections (imgs 3-n contd).
(1/4) Meet BlenderBot 3, the first publicly available 175B-parameter chatbot with model weights, code & datasets. It can chat about nearly any topic & is designed to learn & improve by conversing with people in the real world.
Try the interactive demo:
Please join me for my virtual presentation of our
#acl2020nlp
paper on explaining solutions to physical reasoning tasks
@SFResearch
#ICLR2020
virtual booth tomorrow at 3 pm PST.
Details:
Our NLP team got 16 papers (11 long, 2 short, and 3 Findings) at
#emnlp2020
, which cover dialogue, summarization, question answering, multilingual, few-shot, NLI, semantic parsing, data augmentation, etc. Congrats to team members and coauthors. More info about papers coming soon!
The
@HuggingFace
community tab is a game changer for machine learning models!
Datasets / models / demos are no longer static objects created once and left to collect dust.
Instead, they can be discussed and improved by the open-source community through pull requests.
As we are wrapping up our project on creating the secret sauce of *alignment* for open-access models, we plan to release recipes and artifacts in the coming weeks in this repo
Make sure to watch/star so you don't miss out.