@ylecun
« unpredictable regulatory environment. »
You broke the already existing rules against using your users' personal and private data to train models, and now you're dangling those models while complaining about the EU's regulations? If you want to cross the line of going from research to
Filling the gap between the Llama-3-8B and Llama-3-70B models by
@AIatMeta
? Not sure how to use your extra VRAM? Look no further!
I am excited to introduce three new Llama-3 models in 11B, 13B, and 16B sizes!
Find all 3 models on
@huggingface
1/4 🚀 Exciting news for AI enthusiasts! Check out NuExtract, a cutting-edge LLM designed for structured extraction tasks. It transforms any text into a structured output with just a template!
🤗 Open-source and available on
@huggingface
!
🌟 More info:
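The template-driven extraction described above can be sketched in a few lines. The prompt layout below is an assumption based on NuExtract's model card, and the model reply is mocked for illustration:

```python
import json

# Template-driven extraction, NuExtract-style: you supply a JSON template
# whose keys name the fields to extract, and the model fills in the values.
# (Prompt layout is an assumption; the reply below is mocked.)
template = {"name": "", "size": "", "license": ""}
text = "Yi-1.5 is a 34B model released under the Apache 2.0 license."

prompt = f"### Template:\n{json.dumps(template, indent=4)}\n### Text:\n{text}\n"

# A structured-extraction model would return JSON matching the template:
reply = '{"name": "Yi-1.5", "size": "34B", "license": "Apache 2.0"}'
extracted = json.loads(reply)
print(extracted["license"])
```

The nice part of the template approach is that the output is machine-parseable by construction: `json.loads` either succeeds with the schema you asked for or fails loudly.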
Introducing an experimental Llama-3-8B-Instruct with 32k context-length in GGUF format:
- A big thanks to
@winglian
for doing the thorough testing
- A big thanks to
@nisten
for running tests
Available on
@huggingface
and
@LMStudioAI
The most downloaded model I've ever had on
@huggingface
is the GGUF models for Llama-3-70B!!! 🚀
It has been downloaded more than 36,000 times in just under 24 hours!!! You people need to get a life! ❤️🙏🏽
Finally! A fix for Llama-3 tokenizer has been merged!
We had to make workarounds, but if the original tokenizer has the "eos_token" correctly set, we won't need any extra steps anymore.
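For anyone who hit this before the fix: the symptom was generation running past the end of a chat turn, because Llama-3's config pointed "eos_token" at the document-end token `<|end_of_text|>` while chat turns actually end with `<|eot_id|>`. A minimal sketch of the config-level workaround (file contents simplified for illustration):

```python
import json

# Simplified view of Llama-3's tokenizer_config.json: before the fix,
# eos_token pointed at the document-end token, so chat generation
# never stopped at the end of a turn.
cfg = {"eos_token": "<|end_of_text|>"}

# The workaround: point eos_token at the end-of-turn token instead
# (or pass both token ids as stop tokens at generation time).
cfg["eos_token"] = "<|eot_id|>"
print(json.dumps(cfg))
```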
Something terrible happened over the weekend! I waited until now, but it's time to bring it up.
@MistralAI
has put all their models behind gated access. You must now individually accept their use of your data for each model!
I have uploaded them all back on
@huggingface
It finally happened! I made it to the Top 10 of the Open LLM Leaderboard by
@huggingface
, right on the edge! Thank you all! ❤️
This is the very first fine-tuned model I've created based on Llama-3-70B, released by
@metaai
. I will be releasing 16 more fine-tuned models!🚀
Wait, we have new DeepSeek models?! For coding this time! 😳
It’s a 236B MoE that supports up to 338 programming languages! 😱 And it has 128k context length! 🚀
Available on
@huggingface
:
Google released PaliGemma models! They are open vision-language models inspired by PaLI-3 and built with open components such as the SigLIP vision model and the Gemma language model.
This space includes models fine-tuned on a mix of downstream tasks, with inference via
@huggingface
🤗
You guys are just too much! 😂 More than 45K downloads in less than 24 hours?
The local LLM community on
@huggingface
is on fire!🔥
Hugging Face recently introduced a "Use this model" button to load your favorite model directly onto your desktop.
Say hello to the new model, 🦎Meta Chameleon, just released yesterday by
@AIatMeta
! 🔥
Available in 7B and 34B sizes that support mixed-modal input and text-only outputs.
Thanks for the open-source multimodal models! 🚀 (Please release them on
@huggingface
)
@mervenoyann
I duplicated your LLaVA NeXT space and made this one for LLaVa-Llama-3-8B:
Very interesting model! The space needs some improvements, especially in the chat interface to fill the height. Thanks for the jump start 😊
🎉 Excited to share some incredible milestones in my journey with fine-tuning Llama-3 and Phi-3 models on
@huggingface
! 🚀
🔹 My fine-tuned Llama-3 70B model is ranked #1 among all other FT models
🔹 My Phi-3 Mini fine-tuned model holds the #1 spot among all Phi-3 Mini FTs
Weekend plan! 🎉
Attempting to learn TextGrad by
@stanford
, perhaps a mix of:
- AutoGrad,
- DSPy,
- natural language gradients,
- and some magic!
Just when I thought I was getting DSPy, boom, TextGrad drops in! My brain: "You shall not pass!" 😂
Wow! This thing is flying! This is a 2-bit quantized model of Qwen-7B-Instruct! It can’t get any smaller!
You can open this directly from
@huggingface
into
@LMStudioAI
. 🚀
Congrats to
@Alibaba_Qwen
and the whole team! 👏🏽💙
Thrilled to announce Mistral-Nemo-Instruct-2407 support in Llama.cpp!
This wouldn't have been possible without the incredible contributions from the community. A huge thank you to everyone who participated and helped make this happen! 🩷
🔥Having fun with Dolly v2 by
@databricks
!
Here is a quick demo of Dolly v2 (12B). Thanks to TextStreamer by
@huggingface
I can now get that nice ChatGPT-style streaming feel 🤗
Just uploaded "WizardLM_evol_instruct_V2_196k" back on the
@huggingface
Datasets.
It was removed and I used this dataset a lot in my fine-tuning. Hope it helps others.
Something big is about to happen to the Open LLM Leaderboard by
@huggingface
! 🤗
Place your bets, people! The closest guess wins a 1-year free HF Pro subscription. 🥳 (just kidding)
@OpenLLMLeaders
I spent the whole weekend trying to implement parallel function calling with local LLMs! I had some success, and now I'm off to create some GGUFs for the newly released 'Yi-1.5' models, which are licensed under Apache 2.0 by
@01AI_Yi
.
There is new Mistral-7B V3! It just dropped an hour ago by
@MistralAI
on
@huggingface
- Both Base and Instruct are available 👏🏽
- 32K context length
- Extended vocabulary to 32768
- Function calling support!
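For context, function calling means the model can emit a structured call to a tool you declare up front. Here is a minimal sketch of a tool declaration in the common OpenAI-style JSON schema; `get_weather` is a hypothetical tool, and the exact format Mistral's endpoints expect is documented on their side:

```python
# Hypothetical tool declaration in the common OpenAI-style schema;
# check Mistral's docs for the exact format their API expects.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

# Given this schema and a user message like "What's the weather in Paris?",
# a function-calling model replies with a structured call such as
# {"name": "get_weather", "arguments": {"city": "Paris"}} instead of prose.
print(tools[0]["function"]["name"])
```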
Another day, another series of Llama-3 models released on
@HuggingFace
! This time, it's the big brother, the 70B! 🚀
Since Llama-3's release last week, I've created 27 models with more than 230K downloads.
The community's support has been priceless! Thanks to all and
@metaai
❤️
@DC_Draino
@InvestigateJ6
Wait, is this real? What did I just watch? What was the point of launching that into the crowd for no reason! (Also, he seemed pretty proud!)
And it's done! 🚀 Every single possible GGUF model for Mixtral-8x22B-Instruct-v0.1 by
@MistralAI
is available to use on
@huggingface
.
From IQ1 all the way to the FP16! (you'll have imatrix data as well)
Thanks 🙏🏽
Phi-3: Do you love me? ❤️
Coming up next to
@huggingface
, new series of Phi-3 fine-tunes. 🚀
- I'd love it to code!
- I'd love it to say everything in JSON!
- I'd love it to talk!
Let's see if they will love me back!
🚀 The first fine-tuned models to score higher than Llama-3-70B & achieve the best MMLU/GSM8K at the same time!
- 3 of the top 10 spots on the Open LLM Leaderboard are now held by these fine-tuned models
- Achieved the highest MMLU / GSM8K on
@huggingface
Leaderboard
Oh my god! It's raining Llama-3 today!
Based on an amazing work of
@winglian
❤️, I created "Llama-3-8B-Instruct-64k" model!
Already being tested, quantized, and uploaded on
@huggingface
🚀
Who wants them?!
I'm up to 96k context for Llama 3 8B. Using PoSE, we did continued pre-training of the base model with 300M tokens to extend the context length to 64k. From there, we increased the RoPE theta to attempt to extend the context length further.
🧵
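The RoPE-theta trick above works because each rotary dimension pair rotates at frequency base**(-2i/d); raising the base slows every rotation, so positions far beyond the training length stay distinguishable. A quick sketch (Llama-3 ships with theta = 500k; the 8M "extended" value here is made up for illustration):

```python
def rope_freqs(base: float, dim: int) -> list:
    # Rotation frequency of rotary dimension pair i, out of dim/2 pairs.
    return [base ** (-2 * i / dim) for i in range(dim // 2)]

default = rope_freqs(500_000.0, 128)      # Llama-3's shipped RoPE theta
extended = rope_freqs(8_000_000.0, 128)   # hypothetical raised theta

# A larger base lowers every non-trivial frequency, i.e. slower rotation
# per position, which stretches the usable context window.
assert all(e <= d for d, e in zip(default, extended))
```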
🎉 Exciting news for all Qwen fans! 🚀 I've just released 8 new models based on the powerful Qwen2 base model!
Developed by
@Alibaba_Qwen
, these models are leading the way at
@OpenLLMLeaders
🤗 Find them all on
@huggingface
(7B models are under Apache 2.0 license 💙):
🚨New GGUF alert!
- 5x new high-quality IQ based quantized models for "Meta-Llama-3-70B-Instruct" by
@AIatMeta
- 10x new GGUF models for "Llama-3-Smaug-8B" by
@abacusai
Models are available on
@huggingface
(links are in the comments) - as always, thanks for the support ❤️
How is the fine-tuning of Qwen2-72B going so far? Getting there! 😅 Who's winning:
- MaziyarPanahi/Llama-3-70B-Instruct-DPO-v0.2
- MaziyarPanahi/Qwen2-72B-Instruct-v0.1
- meta-llama/Meta-Llama-3-70B-Instruct
- Qwen/Qwen2-72B
Thanks,
@winglian
for Qwen2 + FSDP in
@axolotl_ai
🚀
🎉 Another big news today! Llama.cpp now supports SmolLM models by
@huggingface
! 🤖
All 3 models will be quantized and available soon 🔥
✨ HuggingFaceTB/SmolLM-135M-Instruct
🌟 HuggingFaceTB/SmolLM-360M-Instruct
💪 HuggingFaceTB/SmolLM-1.7B-Instruct
I have generated over 6K rows for a synthetic French CoT Legal dataset, and I'm quite satisfied with the results achieved using only a local LLM.
Special thanks to
@Teknium1
for the "Nous-Hermes-2-Mixtral-8x7B-DPO" model! Excellent balance between quality and inference speed.
I love using local LLMs to generate synthetic datasets! I test various models, have Claude judge the outputs, and then choose the best local LLM for each dataset.
@NousResearch
, this model consistently produces high-quality results. Any insights into why?
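The selection loop described here (generate with several local models, have a stronger judge pick the winner per dataset) can be sketched as follows; `generate` and `judge` are hypothetical stubs standing in for calls to a local LLM server and to Claude:

```python
def generate(model: str, prompt: str) -> str:
    # Stub: in practice, call your local LLM server here.
    return f"[{model}] draft answer for: {prompt}"

def judge(outputs: dict) -> str:
    # Stub: in practice, send all candidates to a judge model and parse
    # its verdict; here we just pick the longest answer for illustration.
    return max(outputs, key=lambda m: len(outputs[m]))

candidates = ["nous-hermes-2-mixtral-8x7b-dpo", "llama-3-70b-instruct"]
prompt = "Rédige un raisonnement juridique étape par étape."  # French CoT-style prompt
outputs = {m: generate(m, prompt) for m in candidates}
best = judge(outputs)
print(best)
```

Running the same prompts through each candidate and only then judging keeps the comparison fair: every model sees identical inputs, and the judge never knows which model produced which answer if you strip the tags first.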
Excuse me?! Did they seriously just drop a record-breaking model like it's nothing? Zero shame! 🥰
Hey
@huggingface
, how about a new feature? Trending Alert: ping us when a model gets 20+ likes in 24 hours? For science, obviously! 🤗😅
New model added to the leaderboard!
Model Name
Overall rank: 1530
Rank in 3B category: 1
Benchmarks
Average: 70.26
ARC: 63.48
HellaSwag: 80.86
MMLU: 69.24
TruthfulQA: 60.66
Winogrande: 72.77
GSM8K: 74.53
From my experience testing Gemma-2 (27B), recently released by Google, in my advanced medical RAG setup, I've learned the following.
Specs:
- DSPy
- Citations
- Long context (up to 7K to work with Llama-3 & Gemma-2)
- Handling patient reports (messy)
- Must answer fully!
New model added to the leaderboard!
Model Name
Overall rank: 527
Rank in 70+B category: 42
Benchmarks
Average: 75.04
ARC: 67.58
HellaSwag: 86.4
MMLU: 77.19
TruthfulQA: 54.68
Winogrande: 83.98
GSM8K: 80.44
So it is finally here! The new and much improved Open LLM Leaderboard 2.0! 👿
In honor of this new leaderboard, I am publishing new fine-tuned Qwen2 models! 🚀
Follow my 👑Queen collection on
@huggingface
for all the new Qwen2 models!
Now that's why I subscribe to
@huggingface
Pro plan! I don't even have access to the model, but I can hit those Inference Endpoints to use Llama-3.1 models!!! 🤗
One of the community members tried the IQ3-XS quants of "WizardLM-2-8x22B" by
@WizardLM_AI
—this is such a complicated question!
Such an advanced and coherent response! I am quite impressed!
People started to enjoy the new "WizardLM-2-8x22B" by
@WizardLM_AI
in GGUF format!
If this is IQ3-XS, I don't even wanna know what the 8bit is capable of! 🚀
I am happy to announce the release of the v0.3 fine-tuning of Llama-3-8B-Instruct using the DPO dataset. As always, the template is ChatML! 😊
You can download it on
@huggingface
I am about to upload my second fine-tuned Llama-3-8B-Instruct (v0.2) on
@huggingface
In the meantime, could you please explain to me what is happening with my 3rd run? (v0.3)
Loss starts at: 0.6931
Loss ends at: 0.0026
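For what it's worth, the starting value is exactly what DPO predicts at initialization: with the policy equal to the reference model, the implicit reward margin is zero, so the loss is -log sigmoid(0) = ln 2 ≈ 0.6931. A loss collapsing to ~0.0026 means the model separates nearly every chosen/rejected pair, which often signals preference pairs that are too easy (or memorized) rather than a great run. A quick check:

```python
import math

def dpo_loss(margin: float, beta: float = 0.1) -> float:
    # DPO loss = -log(sigmoid(beta * reward_margin))
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

init = dpo_loss(0.0)  # policy == reference, so the margin is 0
assert abs(init - math.log(2)) < 1e-12
print(round(init, 4))  # 0.6931
```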
How big is your LLM? 😁 Putting "databricks/dolly-v2-3b" to work! 😎
- I love taking AI solutions apart and trying to replace the closed parts with open-source solutions
- Nothing wrong with OpenAI, in fact, it's a great service. But it locks you in & you need to share data
Seriously?! It hasn't even been a day, and you've all downloaded the GGUF models as if your lives depend on them! Animals! ❤️
Nearly 17K downloads on
@huggingface
for the newly released Mathstral-7B model by
@MistralAI
! 🤗
It's like Black Friday for nerds!
We just released the first Llama-3 8B with a context length of over 160K onto Hugging Face! SOTA LLMs can learn to operate on long contexts with minimal training (< 200M tokens, powered by
@CrusoeEnergy
's compute) by appropriately adjusting RoPE theta.
🔗
Up next on
@huggingface
! Coming to you this week:
- New fine-tuned Llama-3-70B models
- New fine-tuned Llamixtral-3 models (Mixture of Llama-3 in 24B and 47B)
- New fine-tuned Qwen1.5-32B models
Oh no! I just kicked myself out of the top 10 on the Open LLM Leaderboard
@huggingface
! 🤣
v0.4, welcome to the top 10! Goodbye, v0.1; we wish we could have kept you around!
Mistral NeMo
Mistral NeMo: our new best small model. A state-of-the-art 12B model with 128k context length, built in collaboration with NVIDIA, and released under the Apache 2.0 license.
Thanks to this beautiful work, here are the GGUF models for the new “Codestral-22B-v0.1” model by
@MistralAI
🩷
You can find them on
@huggingface
and use them directly in your local LLM apps! 🚀
I won't go into detail about why such behavior, especially over the weekend, is far from professional. Nor will I entertain the idea that you should be grateful for free stuff and prepared for them to pull the rug out from under you!
Truly free access:
📢 New dataset alert!
🚀 Ready to use in
@axolotl_ai
! Just drop these into your SFT fine-tune YAML and you're all set! 💻✨ (maybe warm up a bit for Llama-3.1 release tomorrow!)
Can't wait to give this a spin! Big thanks to
@arcee_ai
for making it happen! 🙌
Today Arcee is releasing two datasets:
1. The Tome - this is a 1.75 million sample dataset that has been filtered to train strong generalist models. This is the dataset that was used to train Spark and Nova
2. Agent-Data: This is Arcee-Agent's dataset, comprising different
🔥 Excited to share the other key Technology of WizardLM-2!
📙AutoEvol: Automatic Instruction Evolving for Large Language Models
🚀We build a fully automated Evol-Instruct pipeline to create high-quality, highly complex instruction tuning data:
-------- 🧵 --------
Damn straight! Mistral just dropped the Mistral 8x22B Instruct weights 🔥
> 90.8% on GSM8K maj@8
> 44.6% on MATH maj@4
Also Mistral throwing shade on Cohere lol
12K downloads in less than 24 hours!!! Please have some mercy on
@huggingface
🤗
Make AI accessible, and lo and behold, it's like everyone suddenly has a need for it! ❤️🚀
⚡️ Haha! I am planning to bring OpenAI in-house!
🖥️ Snagging a pre-loved Mac Studio to become my offline AI powerhouse!
💪 I am talking Qwen2, Llama-3, Yi-1.5, and Gemma – all the cool kids of the SLM world, ready to party on one machine!
Lab squad, assemble! 🔥
FastMLX: Turn your powerful Mac into an AI home server 🚀
Using my M3 Max (96GB unified RAM) to run a VLM, streaming responses to my M1 MacBook Air over WiFi. 🔥
> pip install -U fastmlx
I am tired of your mediocrity, the "Real Housewives of AI" show, all the "regarding" statements, and the deception that something incredible is coming!
Maybe it's just me, but I get more from my local LLMs for €0! See ya! 👋🏼
@OpenAI
Finished up testing the experimental quants, with some interesting results (tl;dr: no more FP16 embed/output; now Q8 embed/output)
Basically, across 8 categories, I found that quantizing the embeddings and outputs to Q8 was equal to or better than FP16 in 6 of them. I feel
I just re-uploaded back "WizardLM-70B-V1.0" to
@huggingface
This has been one of my favorite LLMs for a long time, trained and released by
@WizardLM_AI
I am uploading it back since it doesn't exist anymore and I think it deserves to live on! ❤️🚀
Spark NLP just hit 100 million downloads on PyPI! 🚀Huge THANK YOU to our incredible community for your support over the past 7 years. I'm profoundly grateful to be part of this remarkable journey with such an inspiring and dedicated team. Here's to many more milestones ahead! 🥳
Considering deploying local AI for your business? Here's something to consider for your team:
1. Mac Studio: Approx. $2,500
- Full-featured workstation
- Versatile for various tasks
2. NVIDIA H100 GPU: $33K-$40K (if available)
- Requires server infrastructure
- High
How long does it take to get distributed inference running locally across 2 MacBook GPUs from a fresh install?
About 60 seconds, running
@exolabs_
Watch till the end, I chat to the cluster using
@__tinygrad__
ChatGPT web interface
Code is open source 👇
I am just gonna say it: being in Central European Time (CET) sucks! You wake up and models have already been announced, converted, quantized, hell, even fine-tuned on
@huggingface
There's nothing left to do but download and enjoy!
I’ve been testing the new GPT-4o (Omni) in ChatGPT. I am not impressed! Not even a little! Faster, cheaper, multimodal: these are not for me.
Code Interpreter is all I care about, and it’s as lazy as it was before!
We have
@AnthropicAI
in the EU, so I am considering the switch. Any feedback?