Presenting PaLM-E 562B, a single generalist model across robotics, language, and vision-language tasks. It showcases multimodal chain-of-thought reasoning and can reason over multiple images!
And positive transfer enables it to work well on robots!!!
Check out Danny's thread 👇
What happens when we train the largest vision-language model and add in robot experiences?
The result is PaLM-E 🌴🤖, a 562-billion-parameter, general-purpose, embodied vision-language generalist across robotics, vision, and language.
Website:
1/ Reflecting on 2022: we shared our most advanced language model, PaLM - a single 540B-parameter dense language model for multiple domains & tasks, trained over two TPU v4 Pods.
Research paper:
Blog post:
Introducing the 540 billion parameter Pathways Language Model. Trained on two Cloud #TPU v4 pods, it achieves state-of-the-art performance on benchmarks and shows exciting capabilities like mathematical reasoning, code writing, and even explaining jokes.
This quarter, Stanford’s Advances in Foundation Models Class (CS 324) will be partnering with the Stanford MLSys Seminar to host a special talk series on foundation models!
Our first talk will be by @tri_dao. Catch us *TOMORROW* at 3:30 PT:
But how long did it need to train? Training PaLM 62B to 1.3 trillion tokens results in significant gains, as suggested by Chinchilla data scaling. However, it does not bridge the gap to PaLM 540B, which has 5x the training FLOP count.
See updated results in:
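A quick back-of-the-envelope check on that 5x gap, using the common C ≈ 6ND approximation for dense-transformer training FLOPs (N = parameters, D = training tokens); the ~780B-token figure for PaLM 540B is from the PaLM paper:

```python
# Approximate training compute via C ~= 6 * N * D
# (N = parameter count, D = training tokens).

def train_flops(n_params: float, n_tokens: float) -> float:
    """Rough total training FLOPs for a dense transformer."""
    return 6 * n_params * n_tokens

palm_62b = train_flops(62e9, 1.3e12)    # ~4.8e23 FLOPs
palm_540b = train_flops(540e9, 780e9)   # ~2.5e24 FLOPs

print(f"PaLM 62B @ 1.3T tokens : {palm_62b:.2e}")
print(f"PaLM 540B @ 780B tokens: {palm_540b:.2e}")
print(f"ratio: {palm_540b / palm_62b:.1f}x")  # ~5x, matching the tweet
```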
Incredibly fun and interesting panel discussion with Percy Liang (@percyliang) and Angela Fan! Thank you so much to Sasha (@srush_nlp) for the amazing work organizing and moderating this panel!
Super excited about discussing Gemini and LLM-related advances from Google at the Beyond Scaling Panel tomorrow afternoon at NeurIPS, jointly with Sasha Rush (@srush_nlp), Angela Fan, Percy Liang (@percyliang), and Jie Tang (@jietang).
Medicine is inherently multimodal.
Thrilled to share Med-PaLM M, the first demonstration of a generalist multimodal biomedical AI system, with a stellar team @GoogleAI @GoogleDeepMind @GoogleHealth
Paper:
PaLM-SayCan combines the understanding of language models with the real-world capabilities of a helper robot. The accuracy improvements in robotic task execution from PaLM combined with SayCan are impressive. Examples of task-planning:
1) We updated the underlying LLM to PaLM (), resulting in PaLM-SayCan. This revealed an interesting trend:
Improving the underlying LLM resulted in much higher robotics (!) performance (halving the error rate)
Today developers can start building with our first version of Gemini Pro through Google AI Studio at .
Developers have a free quota and access to a full range of features including function calling, embeddings, semantic retrieval, custom knowledge
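A minimal sketch of that first call, assuming the google-generativeai Python SDK as it shipped at the Gemini Pro launch (model names and call shapes may have changed in later versions):

```python
# pip install google-generativeai
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # key obtained from Google AI Studio

model = genai.GenerativeModel("gemini-pro")
response = model.generate_content("Write a haiku about TPUs.")
print(response.text)
```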
Excited about these improvements to the PaLM model:
1) U-PaLM: finetune with UL2 mixture-of-denoisers
2) Flan-PaLM: finetune on 1.8K tasks phrased as instructions
You can even stack these two methods!
U-PaLM:
Flan-PaLM:
Introducing U-PaLM 540B! @GoogleAI
Training PaLM with UL2's mixture-of-denoisers (sketched below) with only 0.1% more compute unlocks:
- Much better scaling 📈
- Emergent abilities on BIGBench 😎
- Saving 2x compute (4.4 million TPU hours!) 🔥
- New prompting ability
link:
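A rough sketch of the mixture-of-denoisers idea from the UL2 paper: each training example is corrupted by one of three denoiser families (R = regular T5-style span corruption, X = extreme spans or corruption rates, S = sequential prefix-LM). Everything below is illustrative, single-span simplification included, not the actual training code:

```python
import random

DENOISERS = [
    ("R", dict(span_len=3, rate=0.15)),   # regular span corruption (T5-style)
    ("X", dict(span_len=32, rate=0.50)),  # extreme: long spans / heavy corruption
    ("S", None),                          # sequential: prefix-LM continuation
]

def make_denoising_example(tokens):
    name, cfg = random.choice(DENOISERS)
    if name == "S":
        # Prefix-LM: condition on a random prefix, predict the suffix.
        split = random.randint(1, len(tokens) - 1)
        return name, tokens[:split], tokens[split:]
    # Simplification: mask one span covering ~rate of the sequence; real
    # span corruption masks several spans of mean length cfg["span_len"],
    # each replaced by a distinct sentinel token.
    n_mask = min(len(tokens) - 1, max(1, int(cfg["rate"] * len(tokens))))
    start = random.randint(0, len(tokens) - n_mask)
    inputs = tokens[:start] + ["<extra_id_0>"] + tokens[start + n_mask:]
    targets = tokens[start:start + n_mask]
    return name, inputs, targets
```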
Can robots think? For our series finale, @deanwrussell reports on AI research stretching the limits of machine learning and studying if robotic sentience REALLY matters for our future.
Combining safety and interpretability via affordance grounding with the PaLM language model in robotics is really impressive. PaLM-SayCan results show that the system chooses the correct sequence of skills 84% of the time and executes them successfully 74% of the time.
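The core of that skill selection, as a sketch: the LLM scores a skill's usefulness for the instruction, an affordance (value) function scores its feasibility from the current state, and the robot runs the skill with the highest product. The function names here are illustrative placeholders, not the released code:

```python
def select_skill(instruction, state, skills, llm_score, affordance):
    """llm_score(instruction, skill) -> p(skill helps | instruction)
    affordance(state, skill)         -> p(skill succeeds | state)"""
    combined = {
        s: llm_score(instruction, s) * affordance(state, s)
        for s in skills
    }
    return max(combined, key=combined.get)  # execute this skill next
```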
Learn how we combined our latest language model, PaLM, with robot learning algorithms to create PaLM-SayCan, a robotics system that uses natural language to complete complex tasks in a real-world environment →
If you're at #NeurIPS2023, come chat with us, the Gemini team! We're at the Google booths tomorrow from 1:30-3:00 to answer your questions on what it's like to work on Gemini.
I'm excited to head to @NeurIPSConf #NeurIPS2023 this week.
We'll be having a couple of "Chat with the Gemini Team" events in the @GoogleDeepMind / @GoogleResearch booth areas on Tuesday and Wednesday from 1:30 to 3:00 PM (New Orleans time). Quite a few Gemini team members will
11/ This is just a glimpse of the exciting research with PaLM - the list is too long to summarize here, but I am incredibly grateful to all the amazing collaborators and researchers at @GoogleAI for their contributions and innovations. And super excited for what's next!!!
Can multi-100B param language models be served efficiently? We think so! Today we’re announcing the PaLM inference paper and releasing code for low-latency, high-throughput inference of 8B–540B models on TPU v4.
Paper:
Code: 1/5
4/ Minerva finetunes PaLM on mathematical content and scientific papers to solve mathematical questions using step-by-step natural language reasoning, establishing a new SOTA on the STEM benchmarks MATH and MMLU-STEM.
Very excited to present Minerva🦉: a language model capable of solving mathematical questions using step-by-step natural language reasoning.
Combining scale, data, and other techniques dramatically improves performance on the STEM benchmarks MATH and MMLU-STEM.
Have you ever “heard” yourself talk in your head? Turns out it's a useful tool for robots too!
Introducing Inner Monologue: feeding continual textual feedback into LLMs allows robots to articulate a grounded “thought process” to execute long, abstract instructions 🧵👇
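In rough pseudocode, the loop looks like the sketch below: the LLM proposes one step at a time, and textual feedback (success detection, scene descriptions) is appended back into the prompt so the plan stays grounded. All helper callables are illustrative placeholders:

```python
def inner_monologue(instruction, llm, execute, get_feedback, max_steps=20):
    transcript = f"Task: {instruction}\n"
    for _ in range(max_steps):
        step = llm(transcript + "Robot: next action?")
        if "done" in step.lower():
            break
        transcript += f"Robot action: {step}\n"
        execute(step)                 # run the low-level skill
        feedback = get_feedback()     # e.g. "Success: False" or a scene caption
        transcript += f"Feedback: {feedback}\n"  # close the loop
    return transcript
```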
See how VisionAir is using Federated Learning and the TensorFlow Java API to estimate air quality from smartphone photos, while keeping user privacy in mind. 🤳🌏
Read the blog →
New from Google Research! Language models perform amazing feats, but often still "hallucinate" unsupported content. Our model, RARR🐯, automatically researches & revises the output of any LM to fix hallucinations and provide citations for each sentence. 🧵
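The pipeline, reduced to a hedged three-stage sketch (the stage helpers are placeholders standing in for the paper's components, not RARR's actual code):

```python
def rarr(output, gen_questions, search, revise):
    questions = gen_questions(output)             # what claims need checking?
    evidence = [search(q) for q in questions]     # retrieve supporting documents
    revised, citations = revise(output, evidence) # edit unsupported sentences
    return revised, citations                     # text + per-sentence citations
```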
Excited about this @GoogleAI work on "PaLM: Scaling Language Modeling with Pathways" with many authors. Be sure to check out the accompanying 83-page PDF!
Pathways Language Model (PaLM) is a new advanced AI model that uses a technique called chain of thought prompting to do complex tasks like solve math word problems — and even explain its reasoning process step-by-step.
#GoogleIO
3.2/ The multilingual capabilities of PaLM are surprising and powerful. For example, you can ask novel questions in Bengali and get surprisingly good answers in both English and Bengali, even though it has never seen parallel sentences in both languages.
Today at #GoogleIO @sundarpichai showed some examples of the capabilities of the PaLM 540B language model. For example, you can prompt the model with:
"I will ask a question in Bengali and get English and Bengali answers"
And then give it two examples of this behavior.
(cont)
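The shape of such a few-shot prompt, with placeholder text since the actual I/O examples aren't reproduced here:

```python
# Two worked examples establish the pattern; the model then continues it,
# answering the new Bengali question in both languages.
prompt = """I will ask a question in Bengali and get English and Bengali answers.

Question (Bengali): <example question 1>
Answer (English): <example answer 1>
Answer (Bengali): <example answer 1, in Bengali>

Question (Bengali): <example question 2>
Answer (English): <example answer 2>
Answer (Bengali): <example answer 2, in Bengali>

Question (Bengali): <new question>
"""
```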
Learn about chain of thought prompting, a method that equips language models to decompose multi-step problems into intermediate steps, enabling models of sufficient scale to solve complex reasoning problems that are not solvable with standard prompting. →
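A minimal example in the style of the paper's canonical exemplar: the few-shot demonstration includes the intermediate reasoning, so a sufficiently large model imitates that structure on the new problem:

```python
prompt = """Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls.
Each can has 3 tennis balls. How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 balls.
5 + 6 = 11. The answer is 11.

Q: The cafeteria had 23 apples. If they used 20 to make lunch and bought
6 more, how many apples do they have?
A:"""
# Expected continuation, steps included:
# "The cafeteria had 23 apples. They used 20, leaving 3. 3 + 6 = 9.
#  The answer is 9."
```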
Congratulations to the Celestini Program India 2018 student team supported by @MarconiSociety! The Android demo app they built to predict Air Quality in Delhi using #TFLite was featured at #TFDevSummit today and in the demos. Link:
Delighted to share our new @GoogleHealth @GoogleAI @Deepmind paper at the intersection of LLMs + health.
Our LLMs building on Flan-PaLM reach SOTA on multiple medical question answering datasets including 67.6% on MedQA USMLE (+17% over prior work).
5/ Tasks that seem simple to humans are actually incredibly complex for helper robots. PaLM-SayCan showcases how a robotics system uses PaLM to interpret natural language to complete complex tasks in a real-world environment.
Tasks that seem simple to humans — like cleaning up a spilled drink — are actually incredibly complex for helper robots. That’s why Google Research and Everyday Robots are using language models to improve robot learning.
6/ Flan-PaLM instruction-tunes the 540B PaLM model to follow instructions, establishing a new SOTA on the MMLU benchmark and proving helpful in the zero-shot setting with high accuracy.
New paper + models!
We extend instruction finetuning by
1. scaling to 540B model
2. scaling to 1.8K finetuning tasks
3. finetuning on chain-of-thought (CoT) data
With these, our Flan-PaLM model achieves a new SoTA of 75.2% on MMLU.
Chain of thought prompting.
Encouraging language models to "show their work" makes them both more interpretable and more accurate at complex reasoning tasks, solving math problems, etc.
@Mxbonn @petewarden For MobileNet V2, when finegrain_classification_mode is set to False, the model shrinks the last layer for small multipliers. Please feel free to email me if there are further questions.
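An illustrative sketch of what the flag controls in the TF-Slim MobileNet V2 definition; the exact rounding rules in the real code may differ:

```python
def last_layer_width(depth_multiplier: float, finegrain: bool,
                     base: int = 1280, divisor: int = 8) -> int:
    # The flag matters for multipliers < 1: with finegrain on, the final
    # feature layer keeps its full width so small models retain
    # classification capacity; with it off, it shrinks like other layers.
    if finegrain or depth_multiplier >= 1.0:
        return base
    scaled = int(base * depth_multiplier)
    return max(divisor, (scaled + divisor // 2) // divisor * divisor)

print(last_layer_width(0.35, finegrain=False))  # 448 (shrunk)
print(last_layer_width(0.35, finegrain=True))   # 1280 (kept)
```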
@AllennxDD @arankomatsuzaki Allen, PaLM 540B is actually SOTA in the STEM category of MMLU. There was a correction from copying the GitHub leaderboard. Please see table 6 in the updated version.
Now you can train your Object Detection models on Cloud TPUs! Learn how in this end-to-end walkthrough. Bonus: we'll run the trained model on a phone using TensorFlow Lite and detect pets.
Read the post here →