Btw, I joined @OpenAI and this is what I've been up to so far.
We've just released a paper on training LLM critics to enhance human feedback for training LLMs.
Kudos to the incredible team: @nmca, @gadzin1203, @agentydragon, Juan, @janleike. Excited for what's ahead 🚀
We’ve trained a model, CriticGPT, to catch bugs in GPT-4’s code. We’re starting to integrate such models into our RLHF alignment pipeline to help humans supervise AI on difficult tasks:
𓅪𓅪𓅪 Sparrow 𓅪𓅪𓅪
It was amazing to work on this dialogue agent and train it from human feedback! Sparrow searches Google to improve and back up its claims, and follows a set of rules to be less harmful.
Large language models can produce falsehoods, discriminatory language, and other unsafe behaviour. Introducing Sparrow: a dialogue agent that can search the internet and is trained to be more helpful, correct, and harmless using RL from human feedback. 1/
We tuned a massive language model to support its answers with quotes from the web.
I'm delighted to share this after putting much effort into it. Privileged to work on this project at @deepmind and collaborate with an amazing team: @jacobmenick, @__nmca__, @geoffreyirving et al.
Introducing GopherCite, a fine-tuned version of Gopher trained with human feedback to back up its claims with supporting evidence from the web. GopherCite can also answer questions about a given document, or abstain when unsure.
Learn more: 1/