Maziyar PANAHI

@MaziyarPanahi

2,852 Followers
528 Following
980 Media
10,459 Statuses

Principal AI/ML/Data Engineer @CNRS @ISCPIF | Spark NLP Lead | ❤️ #opensource

Paris, France
Joined October 2010
Pinned Tweet
@MaziyarPanahi
Maziyar PANAHI
15 days
I AM A LLAMA QWEN! 👑 Thank you & good night. 🤗
Tweet media one
15
15
133
@MaziyarPanahi
Maziyar PANAHI
14 days
@ylecun « unpredictable regulatory environment. » You broke the already existing rules against using your users' personal and private data to train any model, and now you're dangling those models while complaining about the EU's regulations? If you want to cross the line of going from research to
26
7
543
@MaziyarPanahi
Maziyar PANAHI
1 year
@TansuYegen Chasing balloons in the sky and balls at the beach, seriously don’t we have better things to do?
9
6
239
@MaziyarPanahi
Maziyar PANAHI
3 months
what in the good lord is this!!! Phi-3-mini-128k?!!
4
28
239
@MaziyarPanahi
Maziyar PANAHI
4 months
Feeling the gap between the Llama-3-8B and Llama-3-70B models by @AIatMeta ? Not sure how to use your extra vRAM? Look no further! I am excited to introduce three new Llama-3 models in 11B, 13B, and 16B sizes! Find all 3 models on @huggingface
Tweet media one
20
25
232
@MaziyarPanahi
Maziyar PANAHI
1 month
1/4 🚀 Exciting news for AI enthusiasts! Check out NuExtract, a cutting-edge LLM designed for structured extraction tasks. It transforms any text into a structured output with just a template! 🤗 Open-source and available on @huggingface ! 🌟 More info:
Tweet media one
6
43
218
@MaziyarPanahi
Maziyar PANAHI
3 months
Introducing an experimental Llama-3-8B-Instruct with 32k context length in GGUF format: - A big thanks to @winglian for doing the thorough testing - A big thanks to @nisten for running tests Available on @huggingface and @LMStudioAI
Tweet media one
@nisten
nisten
3 months
@MaziyarPanahi @Orwelian84 @winglian Just tested yours at q4km, and 19k input prompt, it works well, you're good
Tweet media one
1
0
7
11
36
215
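For anyone wanting to try a build like this locally, here is a minimal sketch using llama-cpp-python with the extended 32k window; the repo and file names are assumptions for illustration, so check the actual Hugging Face listing before running.

```python
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Hypothetical repo/file names for illustration only; look up the exact
# GGUF filename on the Hugging Face repo before running.
model_path = hf_hub_download(
    repo_id="MaziyarPanahi/Llama-3-8B-Instruct-32k-GGUF",
    filename="Llama-3-8B-Instruct-32k.Q4_K_M.gguf",
)

# n_ctx matches the extended 32k context; lower it if you run out of memory.
llm = Llama(model_path=model_path, n_ctx=32768, n_gpu_layers=-1)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarize this long report: ..."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```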
@MaziyarPanahi
Maziyar PANAHI
3 months
The most downloaded models I've ever had on @huggingface are the GGUF models for Llama-3-70B!!! 🚀 They have been downloaded more than 36,000 times in just under 24 hours!!! You people need to get a life! ❤️🙏🏽
Tweet media one
12
14
174
@MaziyarPanahi
Maziyar PANAHI
3 months
Finally! A fix for the Llama-3 tokenizer has been merged! We had to use workarounds, but once the original tokenizer has the "eos_token" set correctly, we won't need any extra steps anymore.
Tweet media one
5
16
168
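As a quick sanity check related to this fix, the snippet below inspects how the tokenizer's end-of-sequence token is configured; the commented workaround reflects the kind of extra step that was needed before the fix and is an illustration, not the content of the merged PR.

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")
print("eos_token:", tok.eos_token, "-> id", tok.eos_token_id)

# Pre-fix style workaround (illustrative only): force the end-of-turn token
# so generation stops where the chat template expects it to.
# tok.eos_token = "<|eot_id|>"
```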
@MaziyarPanahi
Maziyar PANAHI
3 months
Something terrible happened over the weekend! I waited until now, but it's time to bring it up. @MistralAI has put all their models behind gated access. You must now individually accept their use of your data for each model! I have uploaded them all back on @huggingface
Tweet media one
5
15
165
@MaziyarPanahi
Maziyar PANAHI
3 months
It finally happened! I made it to the Top 10 of the Open LLM Leaderboard by @huggingface , right on the edge! Thank you all! ❤️ This is the very first fine-tuned model I've created based on Llama-3-70B, released by @metaai . I will be releasing 16 more fine-tuned models!🚀
Tweet media one
13
9
146
@MaziyarPanahi
Maziyar PANAHI
2 months
Wait, we have new DeepSeek models?! For coding this time! 😳 It’s a 236B MoE that supports up to 338 programming languages! 😱 and it has 128k context length! 🚀 Available on @huggingface :
7
22
144
@MaziyarPanahi
Maziyar PANAHI
1 year
@TheChiefNerd I am on iOS 16.3.1 and I don't have this! Maybe it's only for the US and not Europe? So I guess my battery is bad because it's an iPhone!
Tweet media one
59
10
116
@MaziyarPanahi
Maziyar PANAHI
3 months
Google released PaliGemma models! They are open vision-language models inspired by PaLI-3 and built with open components such as the SigLIP vision model and the Gemma language model. This space includes models fine-tuned on a mix of downstream tasks, with inference via @huggingface 🤗
Tweet media one
3
22
115
@MaziyarPanahi
Maziyar PANAHI
2 months
You guys are just too much! 😂 More than 45K downloads in less than 24 hours? The local LLM community on @huggingface is on fire!🔥 Hugging Face recently introduced a "Use this model" feature to load your favorite model directly onto your desktop.
Tweet media one
7
5
114
@MaziyarPanahi
Maziyar PANAHI
1 month
Say hello to the new model, 🦎Meta Chameleon, just released yesterday by @AIatMeta ! 🔥 Available in 7B and 34B sizes that support mixed-modal input and text-only outputs. Thanks for the open-source multimodal models! 🚀 (Please release them on @huggingface )
Tweet media one
8
12
110
@MaziyarPanahi
Maziyar PANAHI
3 months
Great job @intern_lm on the "LLaVA-Llama-3-8B" models! 🚀 - LLaVA-Llama-3-8B collection: - @huggingface Space: Looking forward to the GGUF models. ❤️
Tweet media one
@MaziyarPanahi
Maziyar PANAHI
3 months
@mervenoyann I duplicated your LLaVA NeXT space and made this one for LLaVa-Llama-3-8B: Very interesting model! The space needs some improvements, especially in the chat interface to fill the height. Thanks for the jump start 😊
1
5
32
5
15
105
@MaziyarPanahi
Maziyar PANAHI
2 months
🎉 Excited to share some incredible milestones in my journey with fine-tuning Llama-3 and Phi-3 models on @huggingface ! 🚀 🔹 My fine-tuned Llama-3 70B model is ranked #1 among all other FT models 🔹 My Phi-3 Mini fine-tuned model holds the #1 spot among all Phi-3 Mini FT
Tweet media one
13
10
99
@MaziyarPanahi
Maziyar PANAHI
1 year
@amasad Adding a . to mark the end of the sentence. 😂
Tweet media one
1
5
95
@MaziyarPanahi
Maziyar PANAHI
4 months
Are you curious how good the Llama-3-8B-Instruct model is? Join our discussion here:
6
7
92
@MaziyarPanahi
Maziyar PANAHI
27 days
Weekend plan! 🎉 Attempting to learn TextGrad by @stanford , perhaps a mix of: - AutoGrad, - DSPy, - natural language gradients, - and some magic! Just when I thought I was getting DSPy, boom, TextGrad drops in! My brain: "You shall not pass!" 😂
5
6
89
@MaziyarPanahi
Maziyar PANAHI
2 months
Wow! This thing is flying! This is a 2-bit quantized model of Qwen-7B-Instruct! It can’t get any smaller! You can open this directly from @huggingface into @LMStudioAI . 🚀 Congrats to @Alibaba_Qwen and the whole team! 👏🏽💙
5
13
89
@MaziyarPanahi
Maziyar PANAHI
11 days
Thrilled to announce Mistral-Nemo-Instruct-2407 support in Llama.cpp! This wouldn't have been possible without the incredible contributions from the community. A huge thank you to everyone who participated and helped make this happen! 🩷
5
15
86
@MaziyarPanahi
Maziyar PANAHI
2 months
I love desktop apps so much that I made one for HuggingChat, developed by @huggingface ! 🤗
8
11
83
@MaziyarPanahi
Maziyar PANAHI
1 year
🔥Having fun with Dolly v2 by @databricks ! Here is a quick demo of Dolly v2 (12B). Thanks to TextStreamer by @huggingface I can now get that nice feeling of ChatGPT🤗
7
10
83
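A small sketch of the TextStreamer setup mentioned above, shown with the 3B Dolly variant so it fits on modest hardware; model choice and generation settings are illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer

model_id = "databricks/dolly-v2-3b"  # smaller sibling of the 12B demoed above
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# TextStreamer prints tokens to stdout as they are generated, which gives
# the ChatGPT-like streaming feel described in the tweet.
streamer = TextStreamer(tok, skip_prompt=True)
inputs = tok("Explain what an open-source LLM is in one paragraph.", return_tensors="pt").to(model.device)
model.generate(**inputs, streamer=streamer, max_new_tokens=120)
```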
@MaziyarPanahi
Maziyar PANAHI
4 months
Just uploaded "WizardLM_evol_instruct_V2_196k" back to @huggingface Datasets. It had been removed, and I used this dataset a lot in my fine-tuning. Hope it helps others.
3
14
82
@MaziyarPanahi
Maziyar PANAHI
1 month
Something big is about to happen to the Open LLM Leaderboard by @huggingface ! 🤗 Place your bets, people! The closest guess wins a 1-year free HF Pro subscription. 🥳 (just kidding) @OpenLLMLeaders
Tweet media one
7
7
80
@MaziyarPanahi
Maziyar PANAHI
1 year
@NoContextBrits Your followers!? Oh, are you Jesus now, are you? 😂
Tweet media one
0
6
80
@MaziyarPanahi
Maziyar PANAHI
3 months
I spent the whole weekend trying to implement parallel function calling with local LLMs! I had some success, and now I'm off to create some GGUFs for the newly released 'Yi-1.5' models, which are licensed under Apache 2.0 by @01AI_Yi . (A generic sketch of the idea follows below.)
Tweet media one
4
13
79
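As an illustration of the parallel function calling idea (a generic sketch, not the author's actual implementation), once a local LLM emits a list of tool calls in one turn, they can be executed concurrently and the results fed back into the conversation:

```python
import json
from concurrent.futures import ThreadPoolExecutor

# Hypothetical tools; in practice these would match the tool schema the
# model was prompted with.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"

def get_population(city: str) -> str:
    return f"{city}: roughly 2.1M people"

TOOLS = {"get_weather": get_weather, "get_population": get_population}

# Pretend the model returned two tool calls in a single turn.
model_output = (
    '[{"name": "get_weather", "arguments": {"city": "Paris"}},'
    ' {"name": "get_population", "arguments": {"city": "Paris"}}]'
)
calls = json.loads(model_output)

# Execute all requested calls in parallel, then return results to the LLM.
with ThreadPoolExecutor() as pool:
    results = list(pool.map(lambda c: TOOLS[c["name"]](**c["arguments"]), calls))
print(results)
```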
@MaziyarPanahi
Maziyar PANAHI
2 months
There is a new Mistral-7B V3! It just dropped an hour ago from @MistralAI on @huggingface - Both Base and Instruct are available 👏🏽 - 32K context length - Extended vocabulary to 32768 - Function calling support!
3
8
74
@MaziyarPanahi
Maziyar PANAHI
3 months
Another day, another series of Llama-3 models released on @HuggingFace ! This time, it's the big brother, the 70B! 🚀 Since Llama-3's release last week, I've created 27 models with more than 230K downloads. The community's support has been priceless! Thanks to all and @metaai ❤️
Tweet media one
9
7
73
@MaziyarPanahi
Maziyar PANAHI
4 months
If you cannot wait to run "microsoft/WizardLM-2-8x22B" on your desktop, you can start with the Q2_K GGUF that I just uploaded 🚀
Tweet media one
@MaziyarPanahi
Maziyar PANAHI
4 months
And we have taken off!
Tweet media one
0
2
10
7
9
70
@MaziyarPanahi
Maziyar PANAHI
1 year
@DC_Draino @InvestigateJ6 Wait, is this real? What did I just watch? What was the point of launching that into the crowd for no reason! (Also, he seemed pretty proud!)
Tweet media one
15
24
63
@MaziyarPanahi
Maziyar PANAHI
4 months
And it's done! 🚀 Every single possible GGUF model for Mixtral-8x22B-Instruct-v0.1 by @MistralAI is available to use on @huggingface . From IQ1 all the way to the FP16! (you'll have imatrix data as well) Thanks 🙏🏽
Tweet media one
@MaziyarPanahi
Maziyar PANAHI
4 months
GGUFs are coming! Never doubt! 🚀 PS: I am also worried they might delete them later! 😂
Tweet media one
4
0
30
2
8
68
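To see which of those quants exist and pick one that fits your hardware, a listing like the one below works; the repo id is an assumption, so substitute the real one from the Hub.

```python
from huggingface_hub import HfApi

# Hypothetical repo id for illustration; IQ1_* files are the smallest,
# FP16 the largest, so pick the biggest one that fits your RAM/VRAM.
api = HfApi()
files = [
    f for f in api.list_repo_files("MaziyarPanahi/Mixtral-8x22B-Instruct-v0.1-GGUF")
    if f.endswith(".gguf")
]
for f in sorted(files):
    print(f)
```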
@MaziyarPanahi
Maziyar PANAHI
3 months
Phi-3: Do you love me? ❤️ Coming up next on @huggingface , a new series of Phi-3 fine-tunes. 🚀 - I love it to code! - I love it to say everything in JSON! - I love it to talk! Let's see if they will love me back!
Tweet media one
8
4
63
@MaziyarPanahi
Maziyar PANAHI
3 months
🚀 The first fine-tuned models to score higher than Llama-3-70B & achieve the best MMLU/GSM8K at the same time! - 3 of the top 10 spots on the Open LLM Leaderboard are now held by these fine-tuned models - Achieved the highest MMLU / GSM8K on the @huggingface Leaderboard
Tweet media one
4
5
63
@MaziyarPanahi
Maziyar PANAHI
2 months
And it is out! SD3 Medium by @StabilityAI is now available on @huggingface ! 🚀
Tweet media one
5
11
63
@MaziyarPanahi
Maziyar PANAHI
3 months
Oh my god! It's raining Llama-3 today! Based on the amazing work of @winglian ❤️, I created the "Llama-3-8B-Instruct-64k" model! Already being tested, quantized, and uploaded to @huggingface 🚀 Who wants them?!
Tweet media one
@winglian
Wing Lian (caseus)
3 months
I'm up to 96k context for Llama 3 8B. Using PoSE, we did continued pre-training of the base model with 300M tokens to extend the context length to 64k. From there we increased the RoPE theta to further attempt to extend the context length. 🧵
Tweet media one
26
61
444
3
6
60
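For readers curious about the RoPE-theta knob mentioned in the quoted thread, the sketch below shows where it lives in a transformers config; the values are hypothetical, and on its own this is not the full recipe, since the quoted work also does continued pre-training (PoSE) on ~300M tokens.

```python
from transformers import AutoConfig, AutoModelForCausalLM

base = "meta-llama/Meta-Llama-3-8B"
cfg = AutoConfig.from_pretrained(base)

# Hypothetical values for illustration; Llama-3 ships with rope_theta=500000
# and an 8k window. Raising these without further training will not by
# itself produce a usable 64k/96k model.
cfg.rope_theta = 2_000_000
cfg.max_position_embeddings = 65536

model = AutoModelForCausalLM.from_pretrained(base, config=cfg)
```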
@MaziyarPanahi
Maziyar PANAHI
1 month
Something big is coming… 🚀 👀 Stay tuned for an exciting reveal from ‘Calme’ series. Are you ready to meet ‘Calme-2’ on @huggingface ? 🤗
Tweet media one
4
4
61
@MaziyarPanahi
Maziyar PANAHI
1 month
🎉 Exciting news for all Qwen fans! 🚀 I've just released 8 new models based on the powerful Qwen2 base model! Developed by @Alibaba_Qwen , these models are leading the way at @OpenLLMLeaders 🤗 Find them all on @huggingface (7B models are under Apache 2.0 license 💙):
5
8
59
@MaziyarPanahi
Maziyar PANAHI
4 months
First time for everything! Trending on @huggingface 🤗🚀
Tweet media one
6
5
57
@MaziyarPanahi
Maziyar PANAHI
23 days
The one and only @ClementDelangue , CEO of @huggingface 🤗 P.S. Not sure why we both look so surprised! 😂
Tweet media one
@MaziyarPanahi
Maziyar PANAHI
23 days
Correction, now we are ready to take off! 😂
Tweet media one
0
3
14
2
2
57
@MaziyarPanahi
Maziyar PANAHI
3 months
🚨New GGUF alert! - 5x new high-quality IQ based quantized models for "Meta-Llama-3-70B-Instruct" by @AIatMeta - 10x new GGUF models for "Llama-3-Smaug-8B" by @abacusai Models are available on @huggingface (links are in the comments) - as always, thanks for the support ❤️
6
12
57
@MaziyarPanahi
Maziyar PANAHI
12 days
To all the axolotl lovers and DPO fans, make sure you check out this beautiful PR by @fozziethebeat (it now matches the SFT).
Tweet media one
1
4
56
@MaziyarPanahi
Maziyar PANAHI
2 months
How is the fine-tuning of Qwen2-72B going so far? Getting there! 😅 Who's winning: - MaziyarPanahi/Llama-3-70B-Instruct-DPO-v0.2 - MaziyarPanahi/Qwen2-72B-Instruct-v0.1 - meta-llama/Meta-Llama-3-70B-Instruct - Qwen/Qwen2-72B Thanks, @winglian for Qwen2 + FSDP in @axolotl_ai 🚀
Tweet media one
2
7
54
@MaziyarPanahi
Maziyar PANAHI
9 days
And the GGUF models for Meta-Llama-3.1-70B-Instruct model are now ready! Find them on @huggingface :
Tweet media one
2
10
51
@MaziyarPanahi
Maziyar PANAHI
2 months
Meet the new monster in town: Nemotron-4 by @NVIDIAAI , featuring 340 billion parameters! Get them while they're hot on @huggingface 🔥
Tweet media one
6
8
50
@MaziyarPanahi
Maziyar PANAHI
11 days
🎉 Another big news today! Llama.cpp now supports SmolLM models by @huggingface ! 🤖 All 3 models will be quantized and available soon 🔥 ✨ HuggingFaceTB/SmolLM-135M-Instruct 🌟 HuggingFaceTB/SmolLM-360M-Instruct 💪 HuggingFaceTB/SmolLM-1.7B-Instruct
@MaziyarPanahi
Maziyar PANAHI
11 days
Thrilled to announce Mistral-Nemo-Instruct-2407 support in Llama.cpp! This wouldn't have been possible without the incredible contributions from the community. A huge thank you to everyone who participated and helped make this happen! 🩷
5
15
86
3
3
54
@MaziyarPanahi
Maziyar PANAHI
11 days
I have generated over 6K rows for a synthetic French CoT Legal dataset, and I'm quite satisfied with the results achieved using only a local LLM. Special thanks to @Teknium1 for the "Nous-Hermes-2-Mixtral-8x7B-DPO" model! Excellent balance between quality and inference speed.
Tweet media one
@MaziyarPanahi
Maziyar PANAHI
12 days
I love using local LLMs to generate synthetic datasets! I test various models, have Claude judge the outputs, and then choose the best local LLM for each dataset. @NousResearch , this model consistently produces high-quality results. Any insights into why?
2
0
23
3
4
48
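A sketch of the kind of generation loop used for a synthetic CoT dataset like this, pointed at a local OpenAI-compatible server; the endpoint, model name, and prompt are assumptions, not the author's exact setup.

```python
import json
from openai import OpenAI

# Local OpenAI-compatible endpoint (e.g. llama.cpp server or LM Studio);
# URL and model name are placeholders.
client = OpenAI(base_url="http://localhost:1234/v1", api_key="not-needed")

question = "Un salarié peut-il refuser une modification de son contrat de travail ?"
prompt = (
    "Réponds à la question juridique suivante avec un raisonnement étape par "
    "étape (chain of thought), puis une conclusion.\n\n" + question
)

resp = client.chat.completions.create(
    model="Nous-Hermes-2-Mixtral-8x7B-DPO",
    messages=[{"role": "user", "content": prompt}],
    temperature=0.7,
)
row = {"question": question, "cot_answer": resp.choices[0].message.content}
print(json.dumps(row, ensure_ascii=False)[:300])
```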
@MaziyarPanahi
Maziyar PANAHI
17 days
Excuse me?! Did they seriously just drop a record-breaking model like it's nothing? Zero shame! 🥰 Hey @huggingface , how about a new feature? Trending Alert: ping us when a model gets 20+ likes in 24 hours? For science, obviously! 🤗😅
@opengvlab
OpenGVLab
17 days
Today, we released InternVL2-Llama3-76B on HuggingFace 🤗 Hope to get your "like ❤️" on the page 🥰!
4
30
229
1
2
47
@MaziyarPanahi
Maziyar PANAHI
2 months
And on that note, heading to bed! 😂 Phi-3 fine-tune models:
Tweet media one
@OpenLLMLeaders
OpenLLMLeaders
2 months
New model added to the leaderboard!
Model Name
Overall rank: 1530
Rank in 3B category: 1
Benchmarks: Average 70.26 | ARC 63.48 | HellaSwag 80.86 | MMLU 69.24 | TruthfulQA 60.66 | Winogrande 72.77 | GSM8K 74.53
1
3
19
3
6
47
@MaziyarPanahi
Maziyar PANAHI
1 month
From my experience testing Gemma-2 (27B), recently released by Google, in the Medical Advanced RAG setup, I've learned the following. Specs: - DSPy - Citations - Long context (up to 7K, to work with both Llama-3 & Gemma-2) - Handling patient reports (messy) - Must answer fully!
Tweet media one
6
3
47
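A rough sketch of what a cite-your-sources RAG module looks like in DSPy with a locally served Gemma-2; the client class, endpoint, and field names are assumptions and may differ across DSPy versions.

```python
import dspy

# Assumes Gemma-2 is served behind a local OpenAI-compatible endpoint;
# adjust the client class/arguments to your DSPy version and server.
lm = dspy.OpenAI(model="gemma-2-27b-it", api_base="http://localhost:8080/v1/", api_key="local")
dspy.settings.configure(lm=lm)

class CitedAnswer(dspy.Signature):
    """Answer fully from the patient-report passages and cite them as [n]."""
    context = dspy.InputField(desc="retrieved patient-report passages, possibly messy")
    question = dspy.InputField()
    answer = dspy.OutputField(desc="complete answer with [n] citations")

qa = dspy.ChainOfThought(CitedAnswer)
pred = qa(context="[1] ... patient report excerpt ...", question="Which treatment was started?")
print(pred.answer)
```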
@MaziyarPanahi
Maziyar PANAHI
3 months
Great job @Gradient_AI_ ! This one is very close to the Instruct and that's pretty impressive! ❤️🚀👏🏽
Tweet media one
@OpenLLMLeaders
OpenLLMLeaders
3 months
New model added to the leaderboard!
Model Name
Overall rank: 527
Rank in 70+B category: 42
Benchmarks: Average 75.04 | ARC 67.58 | HellaSwag 86.4 | MMLU 77.19 | TruthfulQA 54.68 | Winogrande 83.98 | GSM8K 80.44
0
1
5
3
6
45
@MaziyarPanahi
Maziyar PANAHI
4 months
@chefeitenko @rileybrown_ai @midjourney Why do you think CAPTCHA is asking humans to identify them? 😆
1
0
45
@MaziyarPanahi
Maziyar PANAHI
1 month
So it is finally here! The new and much improved Open LLM Leaderboard 2.0! 👿 In honor of this new leaderboard, I am publishing new fine-tuned Qwen2 models! 🚀 Follow my 👑Queen collection on @huggingface for all the new Qwen2 models!
4
4
43
@MaziyarPanahi
Maziyar PANAHI
10 days
Now that's why I subscribe to the @huggingface Pro plan! I don't even have access to the model, but I can hit those Inference Endpoints to use Llama-3.1 models!!! 🤗
Tweet media one
3
7
42
@MaziyarPanahi
Maziyar PANAHI
4 months
One of the community members tried the IQ3-XS quants of "WizardLM-2-8x22B" by @WizardLM_AI . This is such a complicated question, and such an advanced and coherent response! I am quite impressed!
Tweet media one
@MaziyarPanahi
Maziyar PANAHI
4 months
People have started to enjoy the new "WizardLM-2-8x22B" by @WizardLM_AI in GGUF format! If this is IQ3-XS, I don't even wanna know what the 8-bit is capable of! 🚀
Tweet media one
1
0
9
1
7
43
@MaziyarPanahi
Maziyar PANAHI
3 months
I am happy to announce the release of the v0.3 fine-tuning of Llama-3-8B-Instruct using the DPO dataset. As always, the template is ChatML! 😊 You can download it on @huggingface
Tweet media one
@MaziyarPanahi
Maziyar PANAHI
3 months
I am about to upload my second fine-tuned Llama-3-8B-Instruct (v0.2) to @huggingface . In the meantime, could you please explain to me what is happening with my 3rd run? (v0.3) Loss starts at: 0.6931 Loss ends at: 0.0026
Tweet media one
4
1
12
6
2
41
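To confirm the ChatML template on a fine-tune like this, the tokenizer can render a conversation directly; the repo id below is an assumption, so check the author's Hub profile for the exact v0.3 name.

```python
from transformers import AutoTokenizer

# Hypothetical repo id for illustration.
tok = AutoTokenizer.from_pretrained("MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Give me one tip for faster local inference."},
]
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)  # a ChatML template shows <|im_start|> / <|im_end|> markers here
```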
@MaziyarPanahi
Maziyar PANAHI
1 year
How big is your LLM? 😁 Putting "databricks/dolly-v2-3b" to work! 😎 - I love taking AI solutions apart and trying to replace the closed parts with open-source solutions - Nothing wrong with OpenAI, in fact, it's a great service. But it locks you in & you need to share data
3
6
40
@MaziyarPanahi
Maziyar PANAHI
1 year
@PeakLifeDT 4L of water! You'll spend half a day in the toilet!
7
1
40
@MaziyarPanahi
Maziyar PANAHI
16 days
Seriously?! It hasn't even been a day, and you've all downloaded the GGUF models as if your lives depend on them! Animals! ❤️ Nearly 17K downloads on @huggingface for the newly released Mathstral-7B model by @MistralAI ! 🤗 It's like Black Friday for nerds!
Tweet media one
@MaziyarPanahi
Maziyar PANAHI
17 days
"Mathstral 7B is a model specializing in mathematical and scientific tasks, based on Mistral 7B." GGUF models are available on @huggingface
3
4
28
3
6
40
@MaziyarPanahi
Maziyar PANAHI
3 months
Earlier today I submitted this model to the Open LLM Leaderboard. There is room for improvement in our quest to extend Llama-3's context length. 🚀
Tweet media one
Tweet media two
@Gradient_AI_
Gradient
3 months
We just released the first Llama-3 8B with a context length of over 160K onto Hugging Face! SOTA LLMs can learn to operate on long context with minimal training (< 200M tokens, powered by @CrusoeEnergy 's compute) by appropriately adjusting RoPE theta. 🔗
Tweet media one
31
78
422
3
2
39
@MaziyarPanahi
Maziyar PANAHI
3 months
Up next on @huggingface ! Coming to you this week: - New fine-tuned Llama-3-70B models - New fine-tuned Llamixtral-3 models (Mixture of Llama-3 in 24B and 47B) - New fine-tuned Qwen1.5-32B models
Tweet media one
2
8
38
@MaziyarPanahi
Maziyar PANAHI
4 months
The new WizardLM-2 7B is so fast in @LMStudioAI ! 🚀
Tweet media one
4
4
36
@MaziyarPanahi
Maziyar PANAHI
8 days
Mistral Large 1-bit and 2-bit models are ready to use! 🚀 1-bit: IQ1_S & IQ1_M 2-bit: IQ2_XS & Q2_K Find them on @huggingface 🤗:
@MaziyarPanahi
Maziyar PANAHI
9 days
Never doubt! GGUFs for "Mistral-Large-Instruct-2407" are coming to @huggingface 🤗
Tweet media one
1
0
33
3
5
50
@MaziyarPanahi
Maziyar PANAHI
1 year
@levelsio Like with every new technology! 😂
Tweet media one
0
9
36
@MaziyarPanahi
Maziyar PANAHI
3 months
Oh no! I just kicked myself out of the top 10 on the Open LLM Leaderboard @huggingface ! 🤣 v0.4, welcome to the top 10! Goodbye, v0.1; we wish we could have kept you around!
Tweet media one
@MaziyarPanahi
Maziyar PANAHI
3 months
It finally happened! I made it to the Top 10 of the Open LLM Leaderboard by @huggingface , right on the edge! Thank you all! ❤️ This is the very first fine-tuned model I've created based on Llama-3-70B, released by @metaai . I will be releasing 16 more fine-tuned models!🚀
Tweet media one
13
9
146
3
4
36
@MaziyarPanahi
Maziyar PANAHI
15 days
Mistral NeMo: our new best small model. A state-of-the-art 12B model with 128k context length, built in collaboration with NVIDIA, and released under the Apache 2.0 license.
4
3
36
@MaziyarPanahi
Maziyar PANAHI
2 months
Thanks to this beautiful work, here are the GGUF models for the new “Codestral-22B-v0.1” model by @MistralAI 🩷 You can find them on @huggingface and use them directly in your local LLM apps! 🚀
Tweet media one
@rodrimora
Rodri Mora aka Bullerwins
2 months
@MaziyarPanahi I have converted it; it seems to work now with Transformers:
2
1
5
1
5
35
@MaziyarPanahi
Maziyar PANAHI
1 year
@MattWallace888 @elonmusk That episode! I mean that episode!!!!!
7
0
32
@MaziyarPanahi
Maziyar PANAHI
2 months
Aww 🥰! You guys are so cute! 🤗 @huggingface at #vivatech
Tweet media one
7
3
34
@MaziyarPanahi
Maziyar PANAHI
3 months
Llama-3-8B quantized to 2-bit! It's a sassy little one! 🚀Super fast! @LMStudioAI (MacBook Pro, M2 Max) 🤗Download the models from @huggingface
1
0
32
@MaziyarPanahi
Maziyar PANAHI
3 months
I won't go into detail about why such behavior, especially over the weekend, is far from professional. Nor will I entertain the idea that you should be grateful for free stuff and prepared for them to pull the rug out from under you! Truly free access:
0
1
33
@MaziyarPanahi
Maziyar PANAHI
11 days
📢 New dataset alert! 🚀 Ready to use in @axolotl_ai ! Just drop these into your SFT fine-tune YAML and you're all set! 💻✨ (maybe warm up a bit for Llama-3.1 release tomorrow!) Can't wait to give this a spin! Big thanks to @arcee_ai for making it happen! 🙌
@LucasAtkins7
Lucas Atkins
11 days
Today Arcee is releasing two datasets: 1. The Tome - this is a 1.75 million sample dataset that has been filtered to train strong generalist models. This is the dataset that was used to train Spark and Nova 2. Agent-Data: This is Arcee-Agent's dataset, comprising different
Tweet media one
4
13
62
0
3
33
@MaziyarPanahi
Maziyar PANAHI
2 months
The Qwen2-72B base model, released yesterday by @Alibaba_Qwen , has achieved the highest MMLU on the @huggingface Open LLM Leaderboard! 😱
Tweet media one
5
3
30
@MaziyarPanahi
Maziyar PANAHI
9 days
Never doubt! GGUFs for "Mistral-Large-Instruct-2407" are coming to @huggingface 🤗
Tweet media one
@MistralAI
Mistral AI
9 days
126
361
2K
1
0
33
@MaziyarPanahi
Maziyar PANAHI
3 months
@mervenoyann I duplicated your LLaVA NeXT space and made this one for LLaVa-Llama-3-8B: Very interesting model! The space needs some improvements, especially in the chat interface to fill the height. Thanks for the jump start 😊
1
5
32
@MaziyarPanahi
Maziyar PANAHI
18 days
Thank you for releasing the code! This is big! 💥 Would love to see this in TRL @huggingface and @axolotl_ai ❤️ 🤗
@CanXu20
Can Xu
18 days
🔥 Excited to share the other key Technology of WizardLM-2! 📙AutoEvol: Automatic Instruction Evolving for Large Language Models 🚀We build a fully automated Evol-Instruct pipeline to create high-quality, highly complex instruction tuning data: -------- 🧵 --------
Tweet media one
Tweet media two
10
28
217
3
8
32
@MaziyarPanahi
Maziyar PANAHI
4 months
Can't wait for "Meta-Llama-3-70B-Instruct" to finish downloading? 😎 Start playing with the Q2 on your laptop! @huggingface 🚀
Tweet media one
3
4
31
@MaziyarPanahi
Maziyar PANAHI
6 years
Visualizing our French Twitter archive in 2017: more than 10 million geo-tagged tweets processed and visualized using #MapD on top of 4 Tesla P100 GPUs. #BigData #visualization #geospatial #Twitter
Tweet media one
Tweet media two
Tweet media three
0
20
30
@MaziyarPanahi
Maziyar PANAHI
10 days
Let's test the new Llama-3.1-8B model locally! 🔥
5
1
30
@MaziyarPanahi
Maziyar PANAHI
17 days
"Mathstral 7B is a model specializing in mathematical and scientific tasks, based on Mistral 7B." GGUF models are available on @huggingface
@MistralAI
Mistral AI
17 days
Tweet media one
Tweet media two
86
307
2K
3
4
28
@MaziyarPanahi
Maziyar PANAHI
8 days
Remember when AI was just for tech giants? Now we're running models locally! It's amazing how far we've come. 🤗
Tweet media one
@maximelabonne
Maxime Labonne
8 days
Due to popular demand, I've updated this figure to include DeepSeek-V2 and Mistral Large 2. It's also more zoomed for readability.
Tweet media one
18
73
352
4
5
42
@MaziyarPanahi
Maziyar PANAHI
1 month
Here are the GGUF models for the great "Firefunction-V2" model by @FireworksAI_HQ 🚀 Available on @huggingface - Original model: - Quants:
Tweet media one
1
4
30
@MaziyarPanahi
Maziyar PANAHI
4 months
GGUFs are coming! Never doubt! 🚀 PS: I am also worried they might delete them later! 😂
Tweet media one
@reach_vb
Vaibhav (VB) Srivastav
4 months
Damn straight! Mistral just dropped the Mistral 8x22B Instruct weights 🔥 > 90.8% on GSM8K maj @8 > 44.6% on math maj @4 Also Mistral throwing shade on Cohere lol
Tweet media one
8
28
228
4
0
30
@MaziyarPanahi
Maziyar PANAHI
1 month
Me, immediately! 🤣
Tweet media one
6
1
30
@MaziyarPanahi
Maziyar PANAHI
10 days
The GGUF models for Llama-3.1-8B are going to come out pretty quick!
Tweet media one
2
1
29
@MaziyarPanahi
Maziyar PANAHI
4 months
12k downloads in less than 24 hours!!! Please have some mercy on @huggingface 🤗 Make AI accessible, and lo and behold, it's like everyone suddenly has a need for it! ❤️🚀
Tweet media one
2
3
29
@MaziyarPanahi
Maziyar PANAHI
14 days
⚡️ Haha! I am planning to bring OpenAI in-house! 🖥️ Snagging a pre-loved Mac Studio to become my offline AI powerhouse! 💪 I am talking Qwen2, Llama-3, Yi-1.5, and Gemma – all the cool kids of the SLM world, ready to party on one machine! Lab squad, assemble! 🔥
@Prince_Canuma
Prince Canuma
16 days
FastMLX: Turn your powerful Mac into an AI home server 🚀 Using my M3 Max (96GB URAM) to run a VLM, streaming responses to my M1 MacBook Air over WiFi.🔥 > pip install -U fastmlx
12
38
258
4
3
29
@MaziyarPanahi
Maziyar PANAHI
2 months
I am tired of your mediocrity, the "Real Housewives of AI" show, all the "regarding" statements, and the deception that something incredible is coming! Maybe it's just me, but I get more from my local LLMs for €0! See ya! 👋🏼 @OpenAI
Tweet media one
5
0
29
@MaziyarPanahi
Maziyar PANAHI
1 month
Tell me you are GPU poor without telling me you are GPU poor!
Tweet media one
6
3
29
@MaziyarPanahi
Maziyar PANAHI
4 months
Mixtral-8x22B-Instruct-v0.1, going wild on TOOLS & FUNCTION CALLING: "<unk>", "<s>", "</s>", "[INST]", "[/INST]", "[TOOL_CALLS]", "[AVAILABLE_TOOLS]", "[/AVAILABLE_TOOLS]", "[TOOL_RESULT]", "[/TOOL_RESULTS]"
4
3
29
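For a sense of how those special tokens frame a tool-calling request, here is an approximate sketch; the exact spacing and ordering of the official chat template may differ, so prefer the tokenizer's own chat template with tool support when available.

```python
import json

# Hypothetical tool schema; the framing below only approximates the official
# Mixtral-8x22B-Instruct template using the special tokens from the tweet.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

prompt = (
    "[AVAILABLE_TOOLS]" + json.dumps(tools) + "[/AVAILABLE_TOOLS]"
    "[INST] What's the weather in Paris right now? [/INST]"
)
print(prompt)
```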
@MaziyarPanahi
Maziyar PANAHI
29 days
This proves that Q8 in Llama.cpp (GGUF) can go head to head with FP16 and FP32! This saves a lot of VRAM! Thank you @bartowski1182 🙏
@bartowski1182
bartowski
29 days
Finished up testing the experimental quants with some interesting results (tl;dr: no more fp16 embed/output, now q8 embed/output). Basically, across 8 categories I found that quantizing the embeddings and outputs to Q8 was equal to or better than FP16 in 6 of them. I feel
Tweet media one
5
7
57
3
7
28
@MaziyarPanahi
Maziyar PANAHI
3 months
I just re-uploaded "WizardLM-70B-V1.0" to @huggingface This has been one of my favorite LLMs for a long time, trained and released by @WizardLM_AI I am uploading it back since it doesn't exist anymore and I think it deserves to live on! ❤️🚀
Tweet media one
3
4
28
@MaziyarPanahi
Maziyar PANAHI
3 months
Spark NLP just hit 100 million downloads on PyPI! 🚀Huge THANK YOU to our incredible community for your support over the past 7 years. I'm profoundly grateful to be part of this remarkable journey with such an inspiring and dedicated team. Here's to many more milestones ahead! 🥳
Tweet media one
2
0
27
@MaziyarPanahi
Maziyar PANAHI
15 days
Considering deploying local AI for your business? Here's something to consider for your team: 1. Mac Studio: Approx. $2,500 - Full-featured workstation - Versatile for various tasks 2. NVIDIA H100 GPU: $33K-$40K (if available) - Requires server infrastructure - High
@ac_crypto
Alex Cheema - e/acc
15 days
How long does it take to get distributed inference running locally across 2 MacBook GPUs from a fresh install? About 60 seconds, running @exolabs_ Watch till the end, I chat to the cluster using @__tinygrad__ ChatGPT web interface Code is open source 👇
17
50
627
2
0
28
@MaziyarPanahi
Maziyar PANAHI
4 months
I am just gonna say it: being in Central European Time (CET) sucks! You wake up and models have already been announced, converted, quantized, hell, even fine-tuned on @huggingface There's nothing left to do but download and enjoy!
3
3
28
@MaziyarPanahi
Maziyar PANAHI
3 months
I've been testing the new GPT-4o (Omni) in ChatGPT. I am not impressed! Not even a little! Faster, cheaper, multimodal, these are not for me. The code interpreter is all I care about, and it's as lazy as it was before! We have @AnthropicAI in the EU; I am considering the switch. Feedback?
9
4
27