In August, @PortkeyAI crossed 2 Billion total requests processed through our platform 💥
(this number was 0 last year!)
We're humbled to be production partners for thousands of leading AI companies around the world, like @getpostman @quizizz @italic @Pepper_Content
📢 Exciting News!!
Portkey is now natively integrated with @llama_index! 🎉
Building resilient and production-ready LLM apps has just become extremely easy 💆
Portkey adds 4 core production capabilities to any LlamaIndex app: ⬇️
Over *$8 Million USD* in committed investments & credits are available for early-stage AI startups today.
But where can you find them?
We scoured the depths of the internet (and Twitter) to compile just that!
🥁 Introducing... The AI Grants Finder
🚀 Portkey now natively integrates with Tembo to conduct vector searches on Postgres
Use multiple LLM providers to create embeddings on @tembo_io, without changing your setup
Try it out:
🚀New Guide on the Portkey Blog: DSPy in Production
- by @ganarajpr, based on his talk at the LLMs in Prod BLR meetup.
Learn how to leverage DSPy by @lateinteraction to tackle real-world challenges, optimize AI pipelines, and revolutionize e-commerce operations.
Read it now:
Portkey is now a first-class provider in the @vercel AI SDK.
When should you use Portkey's AI gateway with the AI SDK?
1. You're routing requests to multiple models & providers.
2. You're going to production and need metrics on cost, performance and accuracy
3. You want to
Live Now: LLMs in Production Event from SF! 🌟
We're kicking off an insightful evening exploring production-specific questions and issues on scaling LLM apps!
Link to join the livestream in the next tweet! ↓
feat. @lightspeedvp @databricks @llama_index @getpostman @yi_ding
We just shipped a series of changes which have significantly improved the Gemini 1.5 Flash latency (>3x reduction) and output tokens per second (>2x more)⚡️🚢
New York, you can't miss this.
Catch @jumbld and @ayushgarg_xyz at the first-ever LLMs in Prod meetup on the East Coast.
We'll be sharing our hard-won lessons in shipping AI apps to production.
Limited seats. Please register soon! Link ↓
There hasn't been a year in the past 8 that I haven't used nginx
With the Portkey AI gateway, the best platform teams are at the best position for AI too :)
Pro tip: Lua unlocks the unlimited nginx power
Billions of tokens. 200+ LLMs. 5,000 Github stars.
Behind these numbers lies a story waiting to be told.
Join Portkey's CEO @jumbld at the Bangalore Gen AI Meetup, as he unveils the lessons learnt building a massive-scale OSS AI gateway, a critical component in unlocking the
🚨 Heads Up: Major Update from OpenAI! 🚨
Starting Jan 4, say goodbye to 33 models, including the iconic text-davinci-003 (aka GPT-3). This is the largest model sunset at OpenAI yet! 🌆
What's changing? Read on! ⬇️ 1/3
While @AnthropicAI's Claude Sonnet 3.5 has seen very quick adoption in production, @GoogleDeepMind just released the new Gemini 1.5 Pro experimental (0801) model, which is now ranked #1 on the LMSYS Chatbot Arena leaderboard, neck-and-neck with GPT-4 and Claude 3.5 Sonnet!
The
Streaming from cache is now live and is supported in these 3 Portkey routes: /v1/chat/completions, /v1/completions, and /v1/prompts/:id/completions.
Available across all providers.
We also cache function streaming calls now!
New Portkey + @LangChainAI Cookbooks ✨ thanks to @yujian_tang!
See how easy it is to start making your Langchain agents reliable, cost-efficient, and production-grade using Portkey:
Explore the Cookbooks:
- Portkey + OpenAI + The Milvus Project RAG Agent:
.@jumbld unveils what's next for Portkey live on @CNBC
🔗 New integrations with popular projects
🛣️ AI gateway with load balancing, fallbacks, retries, and more features
🛠️ Advanced LLM fine-tuning
...and so much more!
We're just getting started 🚀
Capturing a special moment from our community! 📸
Mohamed shared some fantastic feedback about our features at Portkey - from seamless integrations to our game-changing analytics, it's feedback like this that keeps us going 🚀
Our LLMs in Prod server is about to hit 200 members! 🎉
To celebrate, we are giving limited edition Portkey merch to the 200th new member - which includes the apocryphal Yud t-shirt, and more.
Who will be the lucky 200th? RT for karma!
➡️
Over the last 8 months, Portkey has processed a total of 300 Billion+ tokens across 100+ LLMs
Every day, more and more of our customers are taking their LLM apps to production using Portkey
And this is just the start.
🚀 Last week, we had an absolute blast co-hosting the AI Agents: Inter-continental Hackathon with @meetaugustai at their BLR Chapter!
24 hours of pure innovation, collaboration, and building the future of AI agents.
A massive shoutout to all the participants, our co-hosts-
We are open sourcing our Guardrails on the Gateway framework today.
But why is this important?
We built the gateway to bring some uniformity to the many LLM APIs out there, and added fallbacks and load balancing to make routing more reliable.
That was a good start!
We're launching our open-source AI guardrails framework on our AI gateway today.
Been building it with inputs from 600+ teams who use the gateway in production and have collectively made 1.4 billion API requests on our hosted platform itself!
trying our luck with an HN launch
.@MistralAI just released Codestral Mamba, a revolutionary 7B-parameter coding model based on the Mamba architecture.
With linear time inference, 256k token context, and Apache 2.0 license, it's set to transform code generation.
Try out the Codestral Mamba model using Portkey!
If you're looking for an easy way to switch to the latest OpenAI embedding models, configs are a great way to do that.
You can switch to the latest models without touching your codebase. Also lets you try multiple models in parallel as you experiment.
We call this
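As a rough sketch, that kind of model switch lives in a gateway config rather than in code. The `strategy` / `targets` / `override_params` key names below follow Portkey's documented config schema, but treat the exact shape (and the placeholder virtual key name) as an assumption to verify against current docs:

```python
# Sketch (schema assumed): a gateway config that splits embedding
# traffic 50/50 between an old and a new OpenAI embedding model.
# "virtual-openai-key" is a hypothetical stored-key name.
config = {
    "strategy": {"mode": "loadbalance"},
    "targets": [
        {
            "virtual_key": "virtual-openai-key",
            "weight": 0.5,
            "override_params": {"model": "text-embedding-ada-002"},
        },
        {
            "virtual_key": "virtual-openai-key",
            "weight": 0.5,
            "override_params": {"model": "text-embedding-3-small"},
        },
    ],
}
# Cutting over fully to the new model is then a config edit
# (weights 0 / 1), not a code change.
```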
Ever stumbled upon a buggy request in your logs and wished you could instantly rerun it to fix it?
Say Hello to Log Replays! 🔄
With Log Replays, you can open the full request in a fresh prompt playground straight from your logs — easily replay the buggy request and edit it
Introducing 7 Spells of Portkey🪄
Our production-grade features that help your AI app run at scale
Day 1: 🧠 Semantic Cache
✅ Serve faster responses
✅ Save API costs
Seamless integration with your existing workflows.
#Portkey
#LLMOps
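A minimal sketch of what turning this on looks like, assuming the gateway's `cache.mode` / `max_age` config keys (check the docs for the current schema):

```python
# Sketch (schema assumed): semantic caching matches new requests to
# cached ones by meaning (embedding similarity), not exact string match,
# so paraphrased queries can still be served from cache.
config = {
    "cache": {
        "mode": "semantic",  # "simple" would mean exact-match caching
        "max_age": 3600,     # cache TTL in seconds
    }
}
```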
Team @PortkeyAI's been busy lately:
✔️ Now processing 10B+ LLM tokens *every day*
✔️ Added open source guardrails for enforcing real-time LLM behavior
✔️ Improved routing, tracing, and governance features
✔️ New integrations with @vercel, @phidatahq, @crewAIInc, and many more
Attention Isn't All You Need
Mamba: A New Approach to LLMs. This State Space Model achieves Transformer-level performance with RNN-like efficiency.
@MistralAI's Codestral Mamba, with just 7B parameters, outperforms many larger, widely-used LLMs.
We've done a deep dive to
🚀 AI Devs! Don't miss this:
@jumbld in conversation with @yujian_tang on "Enforcing real-time LLM behavior with Guardrails on the Gateway"
Learn how to
→ Build secure, robust AI apps
→ Prevent hallucinations
→ Implement effective guardrails
👇 Registration link below:
Level up your @vercel chatbots with Portkey's conditional routing config! 🚀
🌍 Route requests based on user type, data sensitivity, or any custom condition.
Switch models on-the-fly, handle traffic spikes, and ensure compliance effortlessly.
Check out the code:
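A conditional routing config can be sketched roughly like this. The `conditions` / `query` / `then` / `default` shape is assumed from the gateway docs, and the metadata field and target names are made up for illustration:

```python
# Sketch (schema assumed): requests tagged with metadata
# {"user_type": "paid"} route to a stronger model; everything else
# falls through to the default target.
config = {
    "strategy": {
        "mode": "conditional",
        "conditions": [
            {
                "query": {"metadata.user_type": {"$eq": "paid"}},
                "then": "strong-model",
            },
        ],
        "default": "base-model",
    },
    "targets": [
        {"name": "strong-model", "virtual_key": "openai-key"},
        {"name": "base-model", "virtual_key": "mistral-key"},
    ],
}
```

Because routing lives in the config, switching which model serves which user segment never touches app code.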
Paid users ➡️
"Easy to use, easy to navigate... we could see the value immediately. Having all the LLMs together, logs, and latency data helps us identify issues much faster!"
- @orask, CTO of @XP3co
AI Gateway coupled with Observability is a game-changer for productionizing AI
Hear from Oras
Today is a special day - we raised $3M
But something more important happened:
@isro's Chandrayaan-3 became the only mission in the world to land on the Moon's south pole 🌖
As we build a world-class LLMOps platform out of India, we take immense inspiration from what ISRO has achieved
✨Portkey is now a native provider in the @vercel AI SDK!
Build lightning-fast GenAI apps with Vercel, while ensuring they're robust with Portkey's powerful features.
Check out the docs:
Thanks @lgrammel for helping with this integration!
Potterverse unite! 🪄
Thrilled to share that @PatronusAI's industry-leading evaluators for retrieval accuracy, hallucination detection, toxicity, and much more are now available on Portkey Gateway.
1. Add your Patronus API key to Portkey
2. Define Guardrail checks & set
Introducing @PatronusAI + @PortkeyAI 🚀
@PortkeyAI is the leading open-source AI gateway. It's blazing fast and supports 200+ LLMs. Today, you can use Patronus evaluators all within Portkey ✨
Bringing AI agents to production just got a whole lot easier! 🚀
Over the past year, we've seen AI agents evolve from experimental tech to production-ready tools. But deploying them at scale? That's been a challenge.
With Portkey, we are addressing that challenge head-on.
80+ open-source models from @FireworksAI_HQ are now available on Portkey!
✅ Test your existing prompts on faster, cheaper open source models
✅ Switch to open source models without changing any underlying code
✅ Version, track, and deploy your prompts to production with one
Next up was @RajaswaPatil from @getpostman on building Postbot
From disjointed features to a smooth-talking AI assistant - this team's journey with Postbot is a masterclass in evolving AI architectures.
Multi-agent systems, here we come!
OpenAI's new o1-preview and o1-mini models are supported on Portkey.
Compared to gpt-4o, these models reflect for a long time, and are able to answer questions like "how many r's in the word strawberry?", "how many words in your output?" exceptionally well.
o1 models work
.@LangChainAI & Portkey, better together.
Supercharge your Langchain apps & agents with Portkey's production-grade tools: logging, tracing, caching, tagging and retries.
All with 1 LoC change
🗃️ Caching, tracing, tagging, retries, and more with @PortkeyAI
Add one line of headers:
llm = OpenAI(headers=Portkey.Config(api_key="<PORTKEY_API_KEY>"))
and get caching, tracing, tagging, retries, and more with Portkey!
Docs:
🚨 PSA: Set the Fallback mode ON
Last Thursday, the @AnthropicAI API was unstable for 12 hours.
Due to Portkey Gateway's fallback feature, 99.86% of the requests made to Anthropic during this time succeeded despite Anthropic being down.
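A minimal fallback sketch, assuming the gateway's `strategy` / `targets` config schema and placeholder virtual key names (verify the exact keys against current docs):

```python
# Sketch (schema assumed): try Anthropic first; if the request fails,
# the gateway replays it against OpenAI automatically, so callers
# see a success instead of an outage.
config = {
    "strategy": {"mode": "fallback"},
    "targets": [
        {"virtual_key": "anthropic-key"},  # primary provider
        {"virtual_key": "openai-key"},     # used only if the primary fails
    ],
}
```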
A massive thank you to @meronogbai & Luke Vanagas for their recent contributions to the Gateway!
Meron's contribution adds support for multiple user messages with Google Gemini, and Luke's implementation of Anthropic's input_tokens and output_tokens response params allows
🔔Calling all Bengaluru AI builders🔔
Learn how to build production-grade & reliable AI apps this Sat at the @huggingface @Inferless_ party
Our CTO @ayushgarg_xyz will demo how you can add critical production functionalities on top of your existing LLM workflows with Portkey✨
Last month, we brought #LLMsinProd to Bengaluru, and the AI community showed up in force!
500+ registrations in just 4 days for a practitioner-led deep dive into production AI
From building useful agents at scale at @getpostman, to putting DSPy in production at @zoro_uk, here's a
A crucial detail about the GPT-4o mini model, pointed out by #LLMsinProd community member @kleneway -
The model's max output tokens (which haven't grown much, relative to all the increases in max input tokens) are 4x those of all GPT-4 models and 2x those of Gemini & Anthropic models!
.@OpenAI just dropped GPT-4o mini: a game-changer for AI accessibility! 🚀
✅ 60% cheaper than GPT-3.5 Turbo
✅ Outperforms competitors on key benchmarks
✅ Multimodal: supports both text and vision
Try out the new GPT-4o mini with Portkey! 👇
🎉Thrilled to see @springroleinc successfully implementing Semantic Caching for production use cases!
This single change will help reduce costs, eliminate latency for at least 20% of their requests, and make Albus more reliable.
Keep making waves, Team Albus! 🚀
Yesterday was a big day for us at Albus.🙌
We deployed:
1. Teams concept to add privacy over wiki data
2. Non-authed API to Slack Connect *anywhere*
3. Improved pre-processing for env. stability
4. Semantic caching, thanks to @PortkeyAI 🙏
5. Bug fixes, &
one more feature 🧵👇
Ever feel like you're playing catch-up with @OpenAI's model updates?
- Which models are being deprecated?
- Which models are being announced?
- Which models are currently available?
We've distilled the essentials into a quick, easy-to-understand snapshot with no jargon👇
.@hey_ario uses Portkey caching to run their GitHub workflows 24x faster and save thousands of dollars.
"Portkey is a no brainer for people who use AI in their workflows - it really is the best caching solution"
- Kiran Prasad, Senior MLE
tired: i'll do a git diff between my prompt versions and track changes manually
wired: i'll just click "Update" on Portkey's prompt playground and get accurate diff summaries 💆
We are thrilled to be a launch partner for Meta Llama 3.
Experience Llama 3 now at up to 350 tokens per second for Llama 3 8B and up to 150 tokens per second for Llama 3 70B, running in full FP16 precision on the Together API! 🤯
🤖 You can now chat with our AI Gateway repo, powered by Llama 3.1 405B, using @huggingface assistants!
Ask questions about the repo,
Get code snippets to use any LLM,
Streamline your AI integrations!
Check it out and start building!
👉🏼
@UCBerkeley
Welcome @PortkeyAI to OSS Alley!
A foundational model ops platform to help companies ship gen AI apps & features with confidence!
- Monitor usage, latency & costs
- Manage models with ease.
Introducing Cerebras Inference
‣ Llama3.1-70B at 450 tokens/s – 20x faster than GPUs
‣ 60c per M tokens – a fifth the price of hyperscalers
‣ Full 16-bit precision for full model accuracy
‣ Generous rate limits for devs
Try now:
Big news! 🚀 Portkey is now part of the @MongoDB partner ecosystem
Portkey AI Gateway + MongoDB's robust storage = Production-ready AI apps, faster & smarter
Here's the scoop ↓
New on Portkey: Take DBRX (@databricks) & Mixtral-8x22B (@MistralAI) to prod with full-stack monitoring and reliability baked in (w/o making any changes to your existing OpenAI code)
All thanks to @togethercompute inference!
We're honoured to contribute to the LLM Survey Report by the @mlopscommunity
The survey underscores the pressing need to address production & reliability related challenges for LLM apps.
Read the in-depth report here:
.@AnthropicAI just released a guide on Contextual Retrieval, a method that cuts RAG retrieval failures by up to 67%.
✨We've adapted this powerful technique to work with Portkey's AI Gateway.
Now you can easily implement Contextual Retrieval across 250+ language models
🎉 Exciting news! Portkey now seamlessly integrates with @pyautogen!
Portkey helps you bring your Autogen agents to production fast!
Check out the details in the Autogen docs here:
.@pyautogen is one of the most popular frameworks for building multi-agent systems. The agents are customizable, conversable, and seamlessly allow human participation.
We created a simple cookbook on how to use AutoGen with Portkey to get full-stack observability on key metrics
🌟 Bangalore AI enthusiasts!
Swing by our booth at #KCD2023 TODAY until 5 PM and snag your $50 in AI credits!
🎉 Let’s explore some live demos and talk all things AI. Can't wait to meet you all! 🤝
👉 @TFUGIndia @TFUGBangalore
🚨 LLM Pricing Alert! 📉
@MistralAI slashes prices across models:
- open-mistral-nemo-2407: 50% off (Now $0.15/0.15 per 1M tokens)
- mistral-large-2407: 33% off (Now $2/6 per 1M tokens)
- codestral-2405: 80% off (Now $0.2/0.6 per 1M tokens)
If you're using the AI Gateway, we're
huge unlock: Portkey's multimodal prompt templates now support an unlimited context window
- Create complex, context-rich prompts without limitations
- Leverage multiple modalities, including text and images
- Integrate with your existing workflows and deploy with ease
Portkey is designed from the ground up to be modular and adapt to your workflows, WHILE making them production-ready and reliable. This means, you can bring your existing services and workflows to Portkey seamlessly:
✅ Bring Your Own Guardrails
✅ Bring Your Own Agents
✅ Bring
The team at @Walmarttech has built one of the most impressive semantic caching stacks in the industry, one that works at mega scale.
For the first time, we're bringing an exclusive peek inside how this was built, in a fireside chat with @RohitChatter, Walmart's Chief Arch.
Link ⬇️
👾 Calling all #GenAI folks out there! If you haven't given @PortkeyAI a whirl yet, you're in for a treat. No, they didn't send us a howler, but their service is legit awesome. Check 'em out – you won't regret it! 🕹️🔥
#AIAdventures
Using a single Portkey endpoint, you can now connect with *100+* open source models, switch from one to another, and set up complex routing logic, all without breaking a sweat!
We're now integrated with Together AI, Perplexity, Mistral AI, and Anyscale 💥
Are we missing any?
We analysed latencies for GPT-3.5 & 4 over the last few months, and there are some fascinating trends emerging:
The highlight? Latency gap between the two is narrowing, with GPT-4 getting faster!
⬇️
August is going to be special for Portkey. Here's what we shipped just last week -
📊 LLM Analytics 2.0
🛠️ Few-shot prompting
➕ Claude instant 1.2
📨 New invitations flow
☁️ Deployments for Azure OpenAI
💧 Streaming for Anthropic & Cohere
Quick recap🧵
🦙 Llama 2 is shaking things up in the world of LLMs - it's open-source, customizable, and holds its ground against @OpenAI models.
But, did you know it offers more up-to-date data and ethical considerations? Dive into the nuances with our new read! 📖
Llama 2 has been an extremely popular open-source model since its launch.
Now that the hype has settled down and we're seeing what it's capable of, here's a primer on Meta's flagship LLM.
It's aimed at answering:
- What is Llama 2?
- Can I use Llama 2 instead of OpenAI's model?
✨Build a secure customer support bot that protects user's PII data
Use a @LangChainAI chatbot and @PatronusAI PII guardrails with Portkey. Detect and block PII. Keep user info safe.
Link to the cookbook-
👇see, when you go from building prototypes to running llm apis at scale, the game changes.
and that's where something like Portkey can make life a bit easier🌺
just route through a proxy, add two headers, and you've got automatic retries and caching. small changes, big impact!
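A minimal sketch of that setup, assuming Portkey's documented base URL and header names (treat both as assumptions and verify before relying on them):

```python
# Sketch: point an OpenAI-compatible client at the gateway and attach
# two headers; retries/caching are then configured gateway-side, with
# no further app-code changes.
PORTKEY_BASE_URL = "https://api.portkey.ai/v1"
headers = {
    "x-portkey-api-key": "<PORTKEY_API_KEY>",
    "x-portkey-provider": "openai",  # which upstream provider to route to
}
# e.g.: client = OpenAI(base_url=PORTKEY_BASE_URL, default_headers=headers)
```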
for something that sees quite popular usage, OpenAI's embedding API seems to fail quite often, often enough that I can't just let a batch job run for thousands of requests w/o a robust caching and retry mechanism in place. more a mild itch/annoyance than a big issue, but.
We are absolutely thrilled to announce that Portkey & @AporiaAI are teaming up to bring Aporia's SOTA AI Guardrails to our open-source AI Gateway.
This is a special moment for us — Gateway is built from the ground up to help companies take their AI apps to production.
We help
BUT WAIT.
Can I actually use AI Gateways like @PortkeyAI in the Gemini API Competition???
I'm getting worried here... 😬
I need an OFFICIAL reply.
#buildwithgemini