Fireworks AI

@FireworksAI_HQ

6,316 Followers · 72 Following · 141 Media · 488 Statuses

🎆 Generative AI Platform built for developers

Joined September 2022
Pinned Tweet
@FireworksAI_HQ
Fireworks AI
6 days
🚀Llama 3.2 is now live on Fireworks! @AIatMeta ’s Llama 3.2 models (1B, 3B, 11B, and 90B) are accessible now! 🧵 📖 Read more
@FireworksAI_HQ
Fireworks AI
11 months
We are opening our public platform to all developers at no cost for a two-week period, specifically for API usage. Please visit to begin. In exchange, we would love your valuable feedback on how we can improve our services to assist you in creating
@FireworksAI_HQ
Fireworks AI
10 months
We released our tuned Mixtral chat a few hours ago. Play with it through our app or API: . Big thanks to @MistralAI’s new addition of this MoE model. We are very excited about it.
@FireworksAI_HQ
Fireworks AI
10 months
@MistralAI’s latest Mixtral MoE 8x7B model has been on Fireworks since Saturday (2 days before it was officially released!). 🏆 Quality: beats GPT-3.5 on most benchmarks. 🏎 Speed: fastest inference engine, reaching 175 tokens/sec. 💸 Pricing: $0.45-$0.60/million tokens for
@FireworksAI_HQ
Fireworks AI
1 year
Fireworks' blazing-fast LLM inference is now available on Poe! Today, we’re excited to bring the power of the new Mistral 7B Instruct model to the @poe_platform powered by our lightning-fast Fireworks inference platform. You can now have conversations with the Mistral 7B bot and
@FireworksAI_HQ
Fireworks AI
10 months
As thanks to Fireworks users, we give you a few gifts: 🐎 llama-70b-chat is now 1.5-2x faster running on the same hardware. More to come. 🎇 is FREE for API usage for two weeks. 🐪 Yi and Zephyr models are enabled. Happy Thanksgiving and
@FireworksAI_HQ
Fireworks AI
9 months
What's the most performant way to serve Mixtral and other open-source MoE models? Fireworks investigated this topic and came up with our proprietary serving stack with 4x the speed compared to vLLM and negligible quality impact! Read about our findings here
@FireworksAI_HQ
Fireworks AI
10 months
Fireworks is excited to raise the quality bar by launching our function calling model and API as an alpha launch! We’ve fine-tuned a model specifically to reliably call APIs, even when provided with multi-turn context and numerous functions! In our evals, we achieved accuracy on
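Function calling, as described above, means the model is given JSON schemas for the available functions and returns a structured call rather than free text. A minimal sketch of the request/response shape (this follows the common OpenAI-style convention; the function name and fields are illustrative, not Fireworks' actual API):

```python
import json

# Illustrative tool schema handed to the model alongside the conversation.
tools = [{
    "type": "function",
    "function": {
        "name": "get_stock_price",  # hypothetical function
        "description": "Get the latest price for a ticker symbol.",
        "parameters": {
            "type": "object",
            "properties": {"ticker": {"type": "string"}},
            "required": ["ticker"],
        },
    },
}]

# Instead of prose, the model replies with a structured call like this,
# which the application parses and executes.
call = {"name": "get_stock_price", "arguments": json.dumps({"ticker": "NVDA"})}
args = json.loads(call["arguments"])
print(args["ticker"])  # NVDA
```

The multi-turn robustness the tweet mentions matters because the model must keep emitting valid, schema-conformant calls even after several rounds of conversation.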
@FireworksAI_HQ
Fireworks AI
10 months
@MistralAI 8x7B is live on Fireworks. Try it now at Warning: this is not an official implementation, as the model code hasn’t been released yet. Results might vary. We will update it once Mistral does the release. More perf improvements are landing soon
@FireworksAI_HQ
Fireworks AI
1 year
🎉Mistral 7B Instruct is now available on the platform! Try out the model here:
@FireworksAI_HQ
Fireworks AI
1 year
🚀 Launching the GenAI Platform: bringing fast, affordable, and customizable Large Language Models (LLMs) to developers. Use open-source foundation models and deploy your own LoRA adapters with up to 20–120x cost reduction. 1/9
@FireworksAI_HQ
Fireworks AI
10 months
Mixtral: one more expert to break the tie. Mixtral has 8 experts, but only 2 are active for each token. Does activating more than 2 help? Surprisingly, it helps in fp8, but not in the original 16-bit precision. With this trick, fp8 can almost match fp16 on MMLU! Why is that? 1/5
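The "one more expert" trick can be illustrated numerically: when the 2nd- and 3rd-ranked gating scores are nearly tied, fp8 quantization noise can flip which expert makes the top-2 cut, while routing to a 3rd expert absorbs the tie. A toy sketch with made-up numbers (not the real Mixtral router):

```python
import numpy as np

def top_k(scores, k):
    """Indices of the k highest-scoring experts."""
    return set(np.argsort(scores)[-k:])

gates = np.array([0.40, 0.2501, 0.2499, 0.10])  # experts 1 and 2 nearly tied
noise = np.array([0.0, -0.001, 0.001, 0.0])     # small quantization perturbation

print(top_k(gates, 2) == top_k(gates + noise, 2))  # False: top-2 pick flipped
print(top_k(gates, 3) == top_k(gates + noise, 3))  # True: top-3 absorbs the tie
```

With top-2 routing the perturbed gate sends the token to a different expert than fp16 would, degrading quality; including the tied 3rd expert makes the active set stable under the noise.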
@FireworksAI_HQ
Fireworks AI
6 months
Fireworks now offers the first (to our knowledge) hosted, instruct variant of Mixtral 8x22B! Try it at … or download it at … Thanks @Teknium1 and @NousResearch for the great dataset!
@ArtificialAnlys
Artificial Analysis
6 months
Mixtral 8x22B is an exciting launch but is not yet ready for production use for most use-cases The version of the model released by Mistral is the base model and is not instruct/chat fine-tuned. This means that it isn’t designed for the prompt & answer style that most
@FireworksAI_HQ
Fireworks AI
1 year
ChatFireworks is live on LangChain! You can now use open-source chat models like Mistral 7B and Llama 2 Chat in your @LangChainAI applications. LLM developers can combine Fireworks chat models with system prompts and memory to build fast and high-performing conversational AI
@FireworksAI_HQ
Fireworks AI
8 months
Fireworks is now the fastest provider for Mixtral 8x7b Instruct at 200 tokens/s! Graph courtesy of @ArtificialAnlys
@FireworksAI_HQ
Fireworks AI
2 months
🚀 Exciting news! Fireworks AI is one of the first platforms to offer Llama 3.1 for production use from day one in partnership with @AIatMeta . With expanded context length, multilingual support, and the powerful Llama 3.1 405B model, developers can now leverage unmatched AI
@FireworksAI_HQ
Fireworks AI
10 months
We released prompt caching for LLM inference with 5-10x faster time to the first token on long repeated prompts. LLMs are often invoked with repeated prompts: system prompt, few-shot examples, entire previous conversation for multi-turn chat, Q&A for a single document, etc. 1/4
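The prompt-caching win comes from reusing the attention (KV) state computed for a shared prompt prefix. A toy sketch of why multi-turn chat benefits so much, where each request repeats the entire prior conversation (conceptual only, not Fireworks' implementation):

```python
def shared_prefix_len(a: list[str], b: list[str]) -> int:
    """Length of the common token prefix between two token sequences."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

# Multi-turn chat: each request resends the whole prior conversation.
prev = ["<sys>", "You", "are", "helpful", "<user>", "Hi", "<asst>", "Hello"]
curr = prev + ["<user>", "What's", "new?"]

reused = shared_prefix_len(prev, curr)
print(f"{reused}/{len(curr)} tokens can reuse cached KV state")
```

Since the time to first token is dominated by processing the prompt, skipping the already-computed prefix yields the 5-10x speedups quoted above on long repeated prompts.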
@FireworksAI_HQ
Fireworks AI
11 months
For performant LLM inference, no deployment is one-size-fits-all. A low-latency bot has different requirements from a high-throughput summarization model. Learn about Fireworks’ many deployment configurations in our new blog 1/8
@FireworksAI_HQ
Fireworks AI
1 year
Finetuned Mistral is now supported on the Fireworks inference platform! You can now upload finetuned Mistral PEFT add-ons to the Fireworks platform for fast inference. Check out our cookbook for how to finetune and upload a custom Mistral model to Fireworks: Finetune the model:
@FireworksAI_HQ
Fireworks AI
1 year
We are excited to announce new updates available on the Fireworks Mistral-7B Poe Bot: 1/ Sliding Window Attention - process sequences up to 32K efficiently thanks to sliding window optimization. 2/ Proper BOS Handling - Mistral-instruct now follows the conversation template it
@FireworksAI_HQ
Fireworks AI
1 year
We are excited to team up with @LangChainAI to bring access to open-source LLMs like Mistral 7B via the LangChain Prompt Playground. This enables developers to efficiently explore and optimize their prompts on open-source LLMs. Here is how it works:
@FireworksAI_HQ
Fireworks AI
6 months
We are pleased to announce the availability of the open-source Llama 3 8B and 70B models with 8k context, served from our blazing-fast inference stack.
@FireworksAI_HQ
Fireworks AI
1 year
You can get 11x higher throughput and 30% lower latency with #Falcon #LLM, thanks to Multi-Query Attention (MQA). Learn how co-designing model and system architecture boosts efficiency. More in the Fireworks Gen AI Platform blog: #AI 1/8
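The MQA gains above follow from shrinking the KV cache: all query heads share a single key/value head, cutting cache size (and the memory bandwidth spent reading it) by roughly the head count. A back-of-the-envelope sketch with illustrative numbers (not Falcon's exact configuration):

```python
def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch, bytes_per=2):
    """KV cache size: K and V each store layers*kv_heads*head_dim per position."""
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per

mha = kv_cache_bytes(layers=32, kv_heads=32, head_dim=128, seq_len=2048, batch=8)
mqa = kv_cache_bytes(layers=32, kv_heads=1,  head_dim=128, seq_len=2048, batch=8)
print(f"MHA: {mha/2**30:.1f} GiB, MQA: {mqa/2**30:.2f} GiB, ratio {mha//mqa}x")
# MHA: 8.0 GiB, MQA: 0.25 GiB, ratio 32x
```

Because decoding is memory-bound, a 32x smaller cache frees bandwidth and GPU memory for much larger batches, which is where the throughput multiplier comes from.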
@FireworksAI_HQ
Fireworks AI
11 months
⚡️StableDiffusion XL on Fireworks is faster than ever! Our SDXL APIs support generating images of 1024x1024 in 30 steps in 2 seconds. Try it out in the console:
@FireworksAI_HQ
Fireworks AI
1 year
Code infilling enables the use of LLMs for code completion or docstring generation. But using SoTA LLMs (e.g., Code Llama) for infilling is tricky - you need proper whitespace formatting, and not all model variants can do accurate infilling. 1/4
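Infilling-capable models expect the prompt rearranged around sentinel tokens: the code before and after the cursor is marked as prefix and suffix, and the model generates the middle. A sketch of assembling such a prompt (the sentinel spellings below follow the Code Llama style but vary by model and tokenizer, so check the model card):

```python
def fim_prompt(prefix: str, suffix: str) -> str:
    """Assemble a fill-in-the-middle prompt; the model completes after <MID>.

    Sentinel spellings are model-specific; these follow Code Llama's scheme.
    Whitespace around the cursor must be preserved exactly, since the model
    was trained on verbatim code splits.
    """
    return f"<PRE> {prefix} <SUF>{suffix} <MID>"

before = "def add(a, b):\n    "
after = "\n    return result"
prompt = fim_prompt(before, after)
print(prompt.startswith("<PRE>"))
```

This is why the tweet flags whitespace handling: an extra or missing space next to the sentinels shifts the model off the token distribution it saw in training, and only the variants trained with these sentinels can infill at all.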
@FireworksAI_HQ
Fireworks AI
7 months
We’re removing our waitlist and providing general access to dedicated deployments! As part of the launch, we’re adding support for 42 new models in dedicated deployments, including Nous Hermes models and Deepseek Coder! Dedicated deployments use the ultra-efficient Fireworks
@FireworksAI_HQ
Fireworks AI
13 days
🔥Meet Multi-LoRA, a FireOptimizer capability that lets you personalize AI at scale and deliver the best experience for each customer and use case, with 100x cost-efficiency. Serve hundreds of fine-tuned models on a single base model at the same cost as a single base model.
@FireworksAI_HQ
Fireworks AI
1 year
Code Llama 🦙is out! Try it on the FREE and FAST Fireworks AI #LLM Inference platform at
@FireworksAI_HQ
Fireworks AI
3 months
Announcing Yi-Large on @fireworksai_hq ! We're excited to be among the first providers of Yi-Large, joining Nvidia in offering this state-of-the-art model. Yi-Large ranks among the top LLMs, closely trailing GPT-4, Gemini 1.5 Pro, and Claude 3 Opus on the LMSYS benchmark
@FireworksAI_HQ
Fireworks AI
4 months
We are excited to partner with @HamelHusain and @dan_s_becker on the conference "Mastering LLMs: A Conference For Developers & Data Scientists" and offer ALL students $250 in credit on the @FireworksAI_HQ platform. Registrations for the course are closing soon!
@FireworksAI_HQ
Fireworks AI
7 months
Happy March! Fireworks is announcing our Spring 2024 platform updates - designed for improved production usage at scale. 🧵 💨Faster, more prod-ready serverless models - Mixtral Instruct and Llama 70B have become even faster, with speeds up to 300
@FireworksAI_HQ
Fireworks AI
1 year
We are excited to bring fast inference for open-source LLMs to the Vercel AI SDK. We can't wait to see how developers use this. You can try the Llama 2 models here:
@jaredpalmer
Jared Palmer
1 year
Just switched 's Llama 2 provider to @thefireworksai . Huge perf improvement.
@FireworksAI_HQ
Fireworks AI
1 year
Fireworks achieves 3.5x better latency for blazing-fast code completion! Faster and more accurate code completion is essential for building production-grade AI-powered coding assistants. We are excited to publish our new use case showing how our fast LLM inference platform
@FireworksAI_HQ
Fireworks AI
1 year
The Fireworks Inference Platform is fast, but how? An important technique is CUDA graphs, which can achieve a 2.3x speedup for LLaMA-7B inference. Learn about CUDA graphs, the complexity of applying them to #LLM inference, and more in our new deep dive. 1/6
@FireworksAI_HQ
Fireworks AI
2 months
Llama 3.1 8B, 70B Instruct are now available for fine-tuning. Fine-tuning guide -
@FireworksAI_HQ
Fireworks AI
3 months
Gemma 2 9B is now on Fireworks! We're the first hosted provider (to our knowledge) to offer Google's latest open-source LLM. Try Gemma 2 in our UI playground or via our OpenAI-compatible API!
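An "OpenAI-compatible API" means existing OpenAI-client code can target Fireworks by swapping the base URL and model name. A sketch of the request payload (the base URL and model identifier below are assumptions based on Fireworks' public naming convention, not taken from this page):

```python
import json

FIREWORKS_BASE_URL = "https://api.fireworks.ai/inference/v1"  # assumed endpoint

def build_chat_request(model: str, prompt: str) -> dict:
    """Assemble an OpenAI-style chat.completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 128,
    }

payload = build_chat_request(
    "accounts/fireworks/models/gemma2-9b-it",  # hypothetical model id
    "Explain mixture-of-experts in one sentence.",
)
# With the official openai SDK, the same payload would be sent by creating
# OpenAI(base_url=FIREWORKS_BASE_URL, api_key=...) and calling
# client.chat.completions.create(**payload).
print(json.dumps(payload, indent=2))
```

Compatibility at this layer is the whole point of the announcement: no new SDK to learn, just a different base URL and model string.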
@FireworksAI_HQ
Fireworks AI
10 months
Fireworks is committed to bringing the best model quality and performance to the community. We’ve launched the Qwen-72B model on the Fireworks platform and @poe_platform , to provide outstanding performance on English and Chinese tasks! Qwen-72B outperformed LLaMA2-70B on all
@FireworksAI_HQ
Fireworks AI
3 months
Fireworks is raising $52M, led by @sequoia ! "We’re using the funding to make a shift towards compound AI systems that can orchestrate across multiple models with different modalities and tools” Learn more from our live Bloomberg interview:
@FireworksAI_HQ
Fireworks AI
10 months
@MistralAI Try out the new Mixtral 8x7B mixture-of-experts model fine-tuned by @thefireworksai for chat on @poe_platform !
@FireworksAI_HQ
Fireworks AI
4 months
Excited to announce custom model import, on-demand H100s and auto-scaling to and from 0 on Fireworks! Use thousands of HuggingFace models with 60% faster speeds and 53% lower costs on your own GPU 🧵
@FireworksAI_HQ
Fireworks AI
6 months
We are super excited to partner with @StabilityAI in bringing the state-of-the-art image generation models Stable Diffusion 3 (SD3) and Stable Diffusion 3 Turbo (SD3-turbo) to developers with the @FireworksAI_HQ enterprise-grade distributed inference service. Read more about the
@FireworksAI_HQ
Fireworks AI
2 months
DeepSeek Coder V2 is now available on Fireworks! Try either the full model at 128K context length or the Lite model at 168K context length! These models are served with DeepSeek's multi-head latent attention (MLA) to drastically reduce the KV cache footprint.
@FireworksAI_HQ
Fireworks AI
7 months
Fireworks features top serving speeds: per OpenRouter, Mixtral Nitro tokens/sec are even faster than Groq!
@OpenRouterAI
OpenRouter
7 months
ICYMI: you can now track model performance over time. Here's the early data for Mixtral Nitro 🚀
@FireworksAI_HQ
Fireworks AI
2 months
Fireworks is serving Llama 3.1 405B (fp8 quantized) at $3 per 1M input/output tokens! We're offering the model ~3x cheaper than competing platforms thanks to the unparalleled efficiency of the Fireworks inference stack. Try it now at
@FireworksAI_HQ
Fireworks AI
3 months
. @cursor_ai has trained a specialized model on the "fast apply" task, which involves planning and applying code changes. The fast-apply model surpasses the performance of GPT-4 and GPT-4o, achieving speeds of ~1000 tokens/s (approximately 3500 char/s) on a 70B model.
@FireworksAI_HQ
Fireworks AI
11 months
Segmind Stable Diffusion 1B (SSD-1B) is now available on Fireworks! It’s now possible to generate 1024x1024 images in 30 steps in just 1 second using SSD-1B on the Fireworks inference platform. SSD-1B is one of the most impressive and high-performing diffusion models available
@FireworksAI_HQ
Fireworks AI
7 months
We hear you and we're committed to creating the best experience for rapid, iterative fine-tuning. We've removed the minimum price for fine-tuning! Create smaller tuning jobs for < $1, effective immediately!
@yar_vol
floating point
7 months
@lqiao @FireworksAI_HQ This is awesome, but do you really need to charge a $3 minimum for EVERY finetune? I thought we'd be able to do some iterative fine-tuning, but this obviously quickly adds up. OpenAI does not charge a minimum fixed price. It is maybe irrational, but I hesitate to try :)
@FireworksAI_HQ
Fireworks AI
2 months
Excited to announce our judges for this Sunday's agents and compound AI hackathon with @LangChainAI and @FactoryAI ! Join us to push the boundaries of agents and compound AI and to chat with folks like @matanSF , @srochiramani , @dzhulgakov , @swyx , @hwchase17 , @rolandgvc , @dvendrow
@FireworksAI_HQ
Fireworks AI
3 months
Compound AI Systems combine specialized models, retrievers, and external tools for specific tasks, offering greater flexibility and performance compared to single, mega models that can be less efficient and harder to specialize. Join us at CampFire Connect on July 11th at 10 AM
@FireworksAI_HQ
Fireworks AI
10 months
Introducing the CreativeQR bot on @poe_platform , powered by @thefireworksai image generation! Generate beautiful images with scannable QR codes seamlessly embedded. Try it now at
@FireworksAI_HQ
Fireworks AI
6 months
We’re excited to work with @MongoDB to make it easier, faster, and safer for developers to build #GenAI applications. By leveraging our highly curated and optimized open-source models and MongoDB, developers can now build faster, lower their TCO, and improve quality. Learn more about
@FireworksAI_HQ
Fireworks AI
6 months
Mixtral MoE 8x22b is now available on Fireworks! Try out the base model here and look out for an instruct model soon!
@FireworksAI_HQ
Fireworks AI
1 year
🎇 Fireworks Generative AI Platform 🔝 Top OSS models behind a stable API 🐎 Optimized performance with low cost 🥠 Fine-tuned models for specific use cases 🦜 Native LangChain integration 💸 Try it for free Incredibly excited to share this initial release. More details in 🧵
@FireworksAI_HQ
Fireworks AI
7 months
We’re now serving Gemma 7B Instruct on the Fireworks platform! Try out Google’s latest model on Fireworks to enjoy fast inference speeds, token-based pricing, and an OpenAI-compatible, user-friendly API. Get started on our playground at or through our API
@FireworksAI_HQ
Fireworks AI
10 months
@MistralAI You can try it live on Fireworks. We did our best to reverse-engineer the implementation and will update it once the official model code is out.
@FireworksAI_HQ
Fireworks AI
7 months
Reduced pricing: we’ve switched from separate pricing for input and output tokens to one flat price. We estimate that pricing should be ~20% cheaper for the median user, and all queries will be cheaper except those with an input:output token ratio greater than ~10:1.
@FireworksAI_HQ
Fireworks AI
5 months
Llama 3 8B and 70B are now available for fine-tuning. Fireworks lets you deploy 100 fine-tuned models for fast, serverless inference at no extra cost! Get started here 👇
@FireworksAI_HQ
Fireworks AI
1 month
👀 Check out the latest RAG tutorial using Llama 3.1, built with @astrodotbuild, @SurrealDB and @FastAPI, where you can update the embeddings live and retrieve and add context via the Llama 3.1 405B model
@FireworksAI_HQ
Fireworks AI
4 months
Excited to announce that our SDXL, Playground v2.5 and Segmind SSD image generation models are faster and less expensive! 💨 0.8 seconds to generate a 30-step image in our tests. Independently benchmarked to be the fastest! 💰 Prices reduced ~35% from $0.006 to $0.0039 for a
@ArtificialAnlys
Artificial Analysis
4 months
Congratulations @FireworksAI_HQ on improving the speed of their Text to Image model APIs! Fireworks has reduced generation time by ~40%, from ~2.8s to ~1.7s for Playground v2.5 and ~1.9s to ~1.2s for SDXL. Fireworks has also reduced prices ~35%, positioning it amongst the
@FireworksAI_HQ
Fireworks AI
2 months
Thrilled to see Fireworks on the Forbes Next Billion-Dollar Startups list! No surprise that AI companies are leading the charge—AI is reshaping industries, and we're at the forefront. Proud of our unstoppable team, but we're just getting started. Join us and help shape the future
@Forbes
Forbes
2 months
#BillionDollarStartups : Artificial intelligence dominates this year’s list of 25 venture-backed startups we think most likely to reach a $1 billion valuation.
@FireworksAI_HQ
Fireworks AI
2 months
Looking to run inference on state-of-the-art infrastructure? Fireworks AI is the first to offer Llama 3.1 inference using both Nvidia and AMD GPUs. We’re committed to providing the best hardware for unmatched performance and cost efficiency. With Nvidia H100 and AMD Instinct
@KarimBhalwani
Karim
2 months
Thrilled to share that @Meta 's Llama 3.1 family of models, including 8B, 70B and 405B, runs seamlessly on @AMD 's AI GPUs, empowering pioneers like @FireworksAI_HQ to offer one of the fastest and most efficient inference engines from the start. We are grateful for the opportunity
@FireworksAI_HQ
Fireworks AI
6 months
We’re getting a new look! Check out our new logo and color scheme. Keep an eye out for more changes at !
@FireworksAI_HQ
Fireworks AI
5 months
🚨 New Blog Alert 🚨 Find out how a group of Gen AI enthusiasts used @FireworksAI_HQ at the Mistral SF Hackathon to make an LLM play DOOM, a video game created in 1993 that has gained cult status among hackers! Special thanks to Bhav Ashok ( @SammieAtman ), @hopingtocode , and Paul
@FireworksAI_HQ
Fireworks AI
3 months
Join us next week at CampFire Connect to discover the latest from @FireworksAI_HQ , see new demos, and explore compound AI systems with Fireworks AI CEO @lqiao and @LangChainAI CEO @hwchase17 . 📅 11th July, 10 AM PST 📌 Online RSVP now  👇
@FireworksAI_HQ
Fireworks AI
7 months
Building function-calling apps with open-source models has never been easier! We’ve built an example function calling app with FireFunction-v1 for generating images and getting/plotting stock prices. Try it here or build off of our open-source code here
@FireworksAI_HQ
Fireworks AI
7 months
Excited to announce the Firefunction Playground in our redesigned UI! It's never been faster to get started with open-source function calling. Add one of our example functions or your own function to see how FireFunction-v1 will make decisions. Get code with your functions and
@FireworksAI_HQ
Fireworks AI
10 months
The official Mixtral 8x7b instruct model released earlier today is now live on Fireworks. Give it a try at !
@FireworksAI_HQ
Fireworks AI
10 months
It was a wild day! @MistralAI dropped just the model weights as a torrent this morning. Together with the community, we reverse-engineered the architecture from the parameter names! The model went live on our inference platform just a few hours later, before the official code release!
@FireworksAI_HQ
Fireworks AI
1 year
Exciting preview of what's to come! Stay tuned for more details about what we're building at @thefireworksai
@dzhulgakov
Dmytro Dzhulgakov
1 year
@marktenenholtz @huggingface And TGI doesn't run Multi Query Attention yet, just broadcasts. Falcon is unique among open models to have MQA today btw. With special optimizations for MQA the numbers look even better for the inference service we're building at
@FireworksAI_HQ
Fireworks AI
9 months
🔥 Fire up inference with Mixtral + LoRA on Fireworks platform!
@FireworksAI_HQ
Fireworks AI
6 months
DBRX Instruct ( @DbrxMosaicAI ) is now available for serverless inference! DBRX is hosted as an experimental model and will be hosted serverless at least through April 2024.
@FireworksAI_HQ
Fireworks AI
4 months
Are you a developer who loves the Fireworks platform? We’re hiring a developer advocate! You’d be writing guides/content, shepherding our developer community, hosting live events and influencing the direction of our developer products. Apply below or email kanika@fireworks.ai
@FireworksAI_HQ
Fireworks AI
1 month
🔥 Join Fireworks AI and Shape the Future of AI! At Fireworks AI, we’re building the fastest, most scalable AI infrastructure, backed by top investors like Benchmark and Sequoia. Our team of industry veterans from Meta's PyTorch, Google, and more, is on a mission to empower AI
@FireworksAI_HQ
Fireworks AI
5 months
> We've worked with @FireworksAI_HQ to deploy our fast-apply model with strong speculative edits support. They have a fantastic inference engine and built out api support for our custom speculation logic. Glad to play a part in bringing a better experience to @cursor_ai
@amanrsanger
Aman Sanger
5 months
And, speculative edits give us a 13x speedup for this task over vanilla generation with llama-3-70b. (5/7)
@FireworksAI_HQ
Fireworks AI
10 months
@MistralAI And we just made Mixtral 8x7Bi faster. More speed-ups are on the way
@FireworksAI_HQ
Fireworks AI
10 months
Llama Guard 7B is also now available at Fireworks, hot off the press!
@FireworksAI_HQ
Fireworks AI
2 months
We're collaborating with @helicone_ai to bring LLM observability features to Fireworks users! Now you can track costs, usage, time to first token, and other metrics to optimize your AI apps. To get started:
@FireworksAI_HQ
Fireworks AI
11 months
New in Fireworks Image Generation: SSD-1B, image2image, ControlNet, and more! Read our blog: Take your image-generation apps to the next level with the following new image generation features on our fast inference platform:
@FireworksAI_HQ
Fireworks AI
4 months
Incredibly excited to announce the SD3-Medium API powered by @FireworksAI_HQ . Access the newest state-of-the-art image model from @StabilityAI with unprecedented sub-second latency from our model optimizations
@StabilityAI
Stability AI
4 months
Today, we’re thrilled to announce the open weights for Stable Diffusion 3 Medium, the latest and most advanced text-to-image AI model in our Stable Diffusion 3 series! This new release represents a major milestone in the evolution of generative AI and continues our commitment to
@FireworksAI_HQ
Fireworks AI
5 months
“Programming is the art of telling another human being what one wants the computer to do. We should continually strive to transform every art into a science: in the process, we advance the art” - Donald Knuth in "The Art of Computer Programming" Today, we are heading into a leap forward
@FireworksAI_HQ
Fireworks AI
5 months
> It's cost, speed and quality! Thank you, @beyang and @sourcegraph , for a great customer testimonial. @FireworksAI_HQ is excited to be the backbone for delivering several Gen AI experiences at low latency and optimal cost without compromising on the quality of responses.
@FireworksAI_HQ
Fireworks AI
3 months
Excited to be co-hosting an Agents and Compound AI Hackathon! Join us in SF on August 11 to build AI systems that can utilize multiple models, tools and knowledge bases!
@LangChainAI
LangChain
3 months
💻 Join us for an Agents and Compound AI Hackathon in San Francisco on Sunday, August 11th, hosted by @FireworksAI_HQ , @FactoryAI , and @LangChainAI . Apply here ➡ What exactly defines an agent? An agent is a system that uses an LLM to determine the
@FireworksAI_HQ
Fireworks AI
7 months
We’re now serving Playground v2.5 text-to-image model on Fireworks! To our knowledge, we’re the fastest available Playground v2.5 provider, with inference speeds of ~1.2 seconds for a 1024 x 1024 image. Playground v2.5 (from @playground_ai ) offers dramatically improved quality
@FireworksAI_HQ
Fireworks AI
9 months
We’re partnering with @awscloud and @MongoDB to sponsor a DevOps for GenAI Hackathon on Wednesday, January 24th in NYC! Join us to learn more about building innovative applications with techniques like RAG and function-calling! RSVP at for more details
@FireworksAI_HQ
Fireworks AI
6 months
Awesome usage of Fireworks' new fine-tuning service!
@DynamicWebPaige
👩‍💻 Paige Bailey
6 months
🔫 Badass! A team at the @MistralAI hackathon in SF trained the 7B open-source model to play DOOM, based on an ASCII representation of the current frame in the game. 🤯 @ID_AA_Carmack
@FireworksAI_HQ
Fireworks AI
10 months
We are proud to sponsor #NeurIPS2023 ! Our team will be at booth 802 from Monday (12/11) to Thursday (12/14) in New Orleans. We hope to see you there!
@FireworksAI_HQ
Fireworks AI
6 months
Awesome application of FireFunction-v1 and Fireworks' Mixtral model: automatically generated pull request descriptions!
@vladblagoje
Vladimir Blagojevic
6 months
When you’re not stuck writing glue code for every LLM tool integration, innovative ideas start to appear everywhere you look. 😉 Introducing two GitHub Actions based on Haystack 2.0: PR Auto and Reno Auto
@FireworksAI_HQ
Fireworks AI
3 months
Live now! Join us to learn about our latest features, watch new demos, and discuss Compound AI Systems.
@FireworksAI_HQ
Fireworks AI
7 months
Faster Mixtral speeds from our spring update are starting to register in benchmarks! Check out Mixtral on Fireworks for the fastest widely available speed, the best consistency and newly reduced pricing!
@ArtificialAnlys
Artificial Analysis
7 months
Fireworks AI has supercharged their Mixtral 8x7B offering impacting 3 critical metrics ‣ Optimized throughput and are now achieving up to 200 tokens/second, second only to Groq ‣ Reduced output token pricing to 1/3 of previous pricing. Now charging $0.5/M input & output
@FireworksAI_HQ
Fireworks AI
5 months
@mag_pl @GroqInc Of course, with total response times (time to first 100 tokens) of just 1 second and Latency (seconds to first chunk) of 0.26 seconds, our platform delivers unrivalled speed without sacrificing an ounce of quality! Excited to keep pushing the envelope on fast, cost-effective Gen
@FireworksAI_HQ
Fireworks AI
10 months
@rauchg @vercel It’s our pleasure serving this model for the Vercel playground.
@FireworksAI_HQ
Fireworks AI
11 months
Fireworks is SOC 2 Type II and HIPAA compliant! We are pleased to report that the Fireworks AI inference platform is both SOC 2 Type II and HIPAA compliant. Achieving both SOC 2 Type II and HIPAA compliance is a testament to our proactive approach to safety and data security.
@FireworksAI_HQ
Fireworks AI
10 months
We hypothesize that for some tokens, the gating scores of the 2nd and 3rd experts are almost identical in the original model. Quantization adds noise that causes the gating to pick the wrong expert. 2/5
@FireworksAI_HQ
Fireworks AI
7 months
Awesome application of FireFunction-v1! It's especially cool to see Vexa call image generation, text embedding and text generation through Fireworks!
@n4ze3m
Nazeem
7 months
Introducing Vexasearch v2-beta, a small side project built around Function Call. Vexa can generate images, search for information on the internet, and ask questions on specific URLs using the amazing @FireworksAI_HQ function call. Powered by @LangChainAI
@FireworksAI_HQ
Fireworks AI
1 year
Want to have fun creating stunning images with a fast and easy-to-use API? We've now made it easy to generate images using StableDiffusion XL via the Fireworks generative AI platform. You can now integrate the best image-generation capabilities into your applications.
@FireworksAI_HQ
Fireworks AI
10 months
Interestingly, running an extra expert is almost free because there’s room in per-expert batches. The speed and throughput of FP8 models (including Mixtral) are much higher than FP16. This makes FP8 a good performance-vs-accuracy trade-off. 4/5
@dzhulgakov
Dmytro Dzhulgakov
10 months
@MistralAI model is hot: with mixture-of-experts, like GPT-4! It promises faster speed and lower cost than a model of the same size and quality. Fascinatingly, the speed-up is unevenly distributed: running on a laptop or the biggest GPU server benefits the most. Here’s why 🧵 1/7