Alpay Ariyak @AlpayAriyak Twitter profile

Pinned Tweet

Alpay Ariyak

@AlpayAriyak

9 months

Excited to present our newest update! Mixtral still in progress ⚙️

OpenChat

@OpenChatDev

9 months

🚀Announcing OpenChat-3.5 Update 0106: 𝗪𝗼𝗿𝗹𝗱’𝘀 𝗕𝗲𝘀𝘁 𝗢𝗽𝗲𝗻 𝗦𝗼𝘂𝗿𝗰𝗲 𝟳𝗕 𝗟𝗟𝗠! Experience ChatGPT & Grok-level AI locally 💿! Surpassing Grok-0 (33B) across all 4 benchmarks and Grok-1 (???B) on average and 3/4 benchmarks 🔥. 🎯 This update mainly enhanced

32

132

639

0

2

40

Alpay Ariyak

@AlpayAriyak

6 months

I ran humaneval (base and plus) on the new GPT-4-Turbo-2024-04-09, and it ranks #1 on both

21

48

416

Alpay Ariyak

@AlpayAriyak

2 months

I have an exciting update - I’ve recently joined Together AI @togethercompute as a Research Scientist to lead Post-Training! Having the time of my life :)

36

5

282

Alpay Ariyak

@AlpayAriyak

10 months

Finished cooking up the best Open Source 7B LLM to date, release coming

24

13

264

Alpay Ariyak

@AlpayAriyak

6 months

HumanEval benchmark results for the new Mixtral 8x22B, DBRX-base, Qwen1.5-72B, and Mixtral 8x7B:

4

24

190

Alpay Ariyak

@AlpayAriyak

5 months

Just another day at the office 💜

5

2

157

Alpay Ariyak

@AlpayAriyak

7 months

🤭

7

20

142

Alpay Ariyak

@AlpayAriyak

10 months

Excited to introduce the World's Best Open Source 7B LLM! This is only the beginning, another release coming in the next 1-2 weeks :) Any guesses?

OpenChat

@OpenChatDev

10 months

Introducing the 𝗪𝗼𝗿𝗹𝗱’𝘀 𝗕𝗲𝘀𝘁 𝗢𝗽𝗲𝗻 𝗦𝗼𝘂𝗿𝗰𝗲 𝟳𝗕 𝗟𝗟𝗠 - OpenChat-3.5-1210, further surpassing ChatGPT and Grok models. This upgrade to the widely adopted OpenChat-3.5 is focused on increasing the performance in one of the most important areas for LLMs -

25

146

804

11

5

106

Alpay Ariyak

@AlpayAriyak

6 months

- If you fine-tune Llama 3, you HAVE to include "Llama 3" at the beginning of the name of your release
- You are not allowed to train on outputs generated by Llama 3 (and its fine-tunes) unless you are training Llama 3

6

15

99

Alpay Ariyak

@AlpayAriyak

3 months

Planning to build a centralized library for synthetic data generation for LLM training(sft first, then pt, rlhf, etc). Will allow you to build pipelines that combine and chain multiple data generation methods together, and modify them individually. Will be housing both

8

100

Alpay Ariyak

@AlpayAriyak

6 months

By request: @CohereForAI 's Command-R+ results on HumanEval (base & evalplus) - ranks #32 (64%) and #33 (56.7%) respectively

Alpay Ariyak

@AlpayAriyak

6 months

I ran humaneval (base and plus) on the new GPT-4-Turbo-2024-04-09, and it ranks #1 on both

21

48

416

12

9

82

Alpay Ariyak

@AlpayAriyak

5 months

@carrigmat Try @WizardLM_AI ’s WizardLM-2 8x22B, beats every open source LLM I’ve tried, including Llama-3 70B Instruct, by far

5

3

65

Alpay Ariyak

@AlpayAriyak

5 months

Happy to see that our new Llama 3 8B-based OpenChat-3.6 (still training) gets it right. No CoT prompt needed either.

Private LLM

@private_llm

5 months

We’ve never seen a 7B model get the famous Sally test right, let alone a 3.8B model. 💪Even the Llama 3 8B Instruct model fails this test. Coming 🔜 to your 📱

5

17

90

4

7

66

Alpay Ariyak

@AlpayAriyak

5 months

graduated, can finally build with no distractions

11

0

64

Alpay Ariyak

@AlpayAriyak

7 months

OpenChat-3.6 coming 💙

Google DeepMind

@GoogleDeepMind

7 months

We’re releasing Gemma 2B and 7B, which achieve best-in-class performance for their sizes compared to other models, and can run on a developer laptop or computer. They also surpass much larger models on key benchmarks while meeting our standards for safe and responsible outputs.

92

75

386

5

3

62

Alpay Ariyak

@AlpayAriyak

6 months

I’m so happy to share the first paper I’m listed as an author on ❤️

AK

@_akhaliq

6 months

Aurora-M The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order Pretrained language models underpin several AI applications, but their high computational cost for training limits accessibility. Initiatives such as BLOOM and

21

65

334

8

5

57

Alpay Ariyak

@AlpayAriyak

5 months

nvidia: only adding fp8 support to H100s
deepspeed dropping FP6 on A100s:

Rohan Paul

@rohanpaul_ai

5 months

LLaMA-70b inferencing using only a single GPU and achieving 1.69x-2.65x higher normalized inference throughput than the FP16 baseline. with Six-bit quantization (FP6) 🔥 Deepspeed has just recently released this Paper and also integrated the FP6 quantization - "FP6-LLM:

10

107

607

1

43

Alpay Ariyak

@AlpayAriyak

6 months

@OfficialLoganK @Google Reading this coming from you just feels like I’m in a parallel universe

4

1

42

Alpay Ariyak

@AlpayAriyak

3 months

If you’ve spoken to me since the official announcement of WizardLM-2 in April, there’s a 99% chance I was in your ear rambling about how much I was looking forward to the paper detailing their new training data synthesis pipeline - today is my Christmas. Thank you @WizardLM_AI

WizardLM

@WizardLM_AI

3 months

🎉Today we are announcing Evol-Instruct V2 !!! 🔥 Auto Evol-Instruct is one of the most important technologies for WizardLM-2. Paper link: We build a fully automated Evol-Instruct pipeline, allowing WizardLM-2 to be extended from three evolved domains

2

45

212

3

35

Alpay Ariyak

@AlpayAriyak

3 months

Pleasantly surprised to see that our old @MistralAI 7B-based OpenChat model from January is the most popular generalist LLM fine-tune released by the open source on @OpenRouterAI 🫶

OpenRouter

@OpenRouterAI

3 months

Claude 3.5 Sonnet is now more popular than Llama, Mistral, and MythoMax combined 👀

6

16

175

4

3

34

Alpay Ariyak

@AlpayAriyak

5 months

We’re just getting started 💜🚀

RunPod

@runpod_io

5 months

Excited to announce our $20 million Seed led by @intelcapital and @DellTechCapital to make AI training and inference as seamless as possible at scale. A huge thank you to our investors, early supporters, and all the devs and companies that use and love our platform ❤️ We're

18

20

161

3

1

29

Alpay Ariyak

@AlpayAriyak

6 months

If you're surprised that WizardLM-2 was removed, you haven't been following WizardLM well enough. Some events I can remember:
- Evol-Instruct code deleted right after announcement about it being open-sourced
- WizardLM code deleted
- WizardLM repo renamed to abcd for a while lol?

4

0

28

Alpay Ariyak

@AlpayAriyak

3 months

I’m at the @aiDotEngineer fair - hit me up if you’re around, let’s meet :)

0

1

26

Alpay Ariyak

@AlpayAriyak

6 months

prediction for 2024: someone either has or will cheat the LMSys Arena Elo Leaderboard

4

1

25

Alpay Ariyak

@AlpayAriyak

7 months

This also goes for @MistralAI and @GoogleAI : When you drop models with quirky architectures, how hard is it to release some damn full fine-tuning code? So much compute was wasted in the open source community trying to get Mixtral and Gemma fine-tuning working

Felix

@felix_red_panda

7 months

the grok model was published under apache2 license and it's 296.4GB torrent

17

86

696

2

0

25

Alpay Ariyak

@AlpayAriyak

4 months

Excited to present our latest release!

OpenChat

@OpenChatDev

4 months

🚀Introducing OpenChat 3.6 🌟Surpassed official Llama3-Instruct—with 1-2M synthetic data compared to ~10M human labels 🤫GPTs are close to limits—excel at generation but fall short at complex tasks 🎯We are training next gen—capable of deterministic reasoning and planning 🔗

9

68

300

1

24

Alpay Ariyak

@AlpayAriyak

6 months

Hello there new neighbor @DbrxMosaicAI ( @lmsysorg Chatbot Arena)

0

1

23

Alpay Ariyak

@AlpayAriyak

7 months

Grok openchat fine-tune speedrun begins

1

2

21

Alpay Ariyak

@AlpayAriyak

3 months

New Orca paper, data generation pipeline looks quite promising for performance and quality in real world usage. Microsoft is greedy with datasets, so it likely won’t be released, but please share the generation prompts at least, so the pipeline can be accurately reproduced 🙏🏼

arindam mitra

@Arindam1408

3 months

In just three weeks, our small data focused team of 4 members were able to do necessary coding and generate over 25 million instructions, covering more than 17 skills.

1

2

15

1

2

19

Alpay Ariyak

@AlpayAriyak

7 months

Excited to share our latest model :)

OpenChat

@OpenChatDev

7 months

🚀 The World's First Gemma fine-tune based on openchat-3.5-0106 data and method (C-RLFT). Almost the same performance as the Mistral-based version. 6T tokens = secret recipe? HuggingFace:

11

31

185

0

1

14

Alpay Ariyak

@AlpayAriyak

6 months

Code: UI: Thank you for your amazing work @JiaweiLiu_

GitHub - evalplus/evalplus.github.io

Contribute to evalplus/evalplus.github.io development by creating an account on GitHub.

github.com

0

15

Alpay Ariyak

@AlpayAriyak

5 months

free 4090s, food, RunPod credits and immaculate vibes, pull up 💜

RunPod

@runpod_io

5 months

We're hosting our first hackathon in SF! We'll be giving away 4090s to everyone who wins and over $100k in prizes 125 hackers, all at RunPod HQ, building cool stuff, May 17-18. We'll also have free Chipotle 🌶️ Speakers dropping soon, apply here:

7

5

61

0

1

14

Alpay Ariyak

@AlpayAriyak

10 months

OpenChat 3.5 is now 1 of the 5 available models on @huggingface 's HuggingChat! @imonenext

2

1

14

Alpay Ariyak

@AlpayAriyak

9 months

Zhaojian Yu

@yfngnin4

9 months

@TeamCodeLLM_AI We are confirming Microsoft's open source policy. If approved, we will release all code, data and models.

3

0

13

0

1

14

Alpay Ariyak

@AlpayAriyak

6 months

They're all base models, so had to play around with the prompt a bit to minimize completions such as "TODO: implement", "pass", etc. - there were a lot. The final and most successful template that was used on all: f"""The following function correctly solves the problem

2

1

12
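The placeholder problem described above can be measured before any benchmark score is computed. The following is a minimal sketch, assuming completions are plain strings holding a function body; the pattern list and the function names `is_placeholder_completion` and `placeholder_rate` are hypothetical and not taken from the original evaluation code. Only "TODO: implement" and "pass" come from the tweet; the `...` pattern is an added assumption.

```python
import re

# Hypothetical markers of a non-answer. "TODO" and "pass" are the examples
# named in the tweet; the bare "..." line is an assumption added here.
PLACEHOLDER_PATTERNS = [
    re.compile(r"#\s*TODO", re.IGNORECASE),
    re.compile(r"^\s*pass\s*$"),
    re.compile(r"^\s*\.\.\.\s*$"),
]


def is_placeholder_completion(completion: str) -> bool:
    """Return True when every non-blank line is a placeholder (or there are none)."""
    lines = [line for line in completion.splitlines() if line.strip()]
    if not lines:
        return True
    return all(
        any(pattern.search(line) for pattern in PLACEHOLDER_PATTERNS)
        for line in lines
    )


def placeholder_rate(completions: list[str]) -> float:
    """Fraction of completions that contain no real code."""
    if not completions:
        return 0.0
    flagged = sum(is_placeholder_completion(c) for c in completions)
    return flagged / len(completions)
```

Comparing `placeholder_rate` across candidate prompt templates is one way to choose the template that elicits the fewest empty answers from a base model.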

Alpay Ariyak

@AlpayAriyak

7 months

@erhartford I agree that it matters, but:
1. We don’t know if the benchmarks they provided in the November blog are of a fine-tuned version of Grok-1 or the released base
2. They focused their comparison on instruction/chat models, so it’s fair for us to include openchat in said comparison

1

10

Alpay Ariyak

@AlpayAriyak

3 months

This will be a fully open source effort, so reach out if you’re interested in taking part! The end goal is for it to be THE open source library for synthetic data. What I’m most excited about is decomposing data generation pipelines from research into building blocks that you can

Alpay Ariyak

@AlpayAriyak

3 months

Planning to build a centralized library for synthetic data generation for LLM training(sft first, then pt, rlhf, etc). Will allow you to build pipelines that combine and chain multiple data generation methods together, and modify them individually. Will be housing both

8

100

2

0

10

Alpay Ariyak

@AlpayAriyak

1 year

Nearly all Instruction-tuned Multi-Modal Large Language Models were created with synthetic data generated by 𝘁𝗲𝘅𝘁-𝗼𝗻𝗹𝘆 GPT models, some incredible examples are LLaVA, Otter, LLaVAR. Now imagine how the Open-Source MLLM landscape will change with GPT-Vision-generated data

0

1

9

Alpay Ariyak

@AlpayAriyak

11 months

The GSM8K results on the new Open LLM Leaderboard seem a bit off. I don't expect the exact same results as reported in the papers, but the difference between MetaMath and WizardMath is too high. WizardMath-70b should not score only 4.09. I'm guessing prompt issue?

4

2

8

Alpay Ariyak

@AlpayAriyak

3 months

Also sorry to the lovely team for asking for the paper every couple of weeks, I appreciate the updates and keeping my hopes up when I thought it would never come out ❤️❤️❤️

1

0

8

Alpay Ariyak

@AlpayAriyak

3 months

@CanXu20 Thank you for releasing this @CanXu20 !!!

0

8

Alpay Ariyak

@AlpayAriyak

6 months

@Geronimo_AI Now all models from Microsoft must undergo an evaluation of toxicity to be released, and WizardLM were unaware of this (doesn't mean they will align the model though, only disclose the results)

WizardLM

@WizardLM_AI

6 months

🫡 We are sorry for that. It’s been a while since we’ve released a model months ago😅, so we’re unfamiliar with the new release process now: We accidentally missed an item required in the model release process - toxicity testing. We are currently completing this test quickly

59

68

682

0

8

Alpay Ariyak

@AlpayAriyak

7 months

@Teknium1 They only used Open Source implementations of evol code instruct data

1

0

7

Alpay Ariyak

@AlpayAriyak

3 months

@Teknium1 @chargoddard Congratulations on the release, amazing job ❤️ How many nodes did you use for the fft?

1

0

6

Alpay Ariyak

@AlpayAriyak

3 months

‼️ Both technical and non-technical background welcome ‼️

2

0

7

Alpay Ariyak

@AlpayAriyak

6 months

The only guardrail in place at the moment is checking whether the model reveals its identity in the response. But since the users control what the input is, it's very easy to bake in subtle (and undetectable by current guardrail) ways to identify the model

3

0

7

Alpay Ariyak

@AlpayAriyak

6 months

@altryne @CohereForAI Yeah, I'm surprised a model scoring this low on coding is so high on the arena leaderboard. Maybe this is why I can't seem to find any official humaneval scores reported by Cohere anywhere

1

0

7

Alpay Ariyak

@AlpayAriyak

6 months

@CohereForAI Ran MBPP as a sanity check - better, but still not as close to GPT-4 and Claude as it is on the lmsys arena. Very interesting

0

6

Alpay Ariyak

@AlpayAriyak

10 months

@francoisfleuret Thank you for trying our model! I loved your book. Does this apply the openchat template? Trying these questions with the correct template and 2 different sets of sampling parameters(t=0.5, top_p=0.95 & t=0, top_p=1), got much better answers. Maybe a quant and/or param issue?

1

0

6

Alpay Ariyak

@AlpayAriyak

3 months

@AarushSah_ Now delete prod db

1

0

6

Alpay Ariyak

@AlpayAriyak

2 months

Can we please get more details @AIatMeta

3

0

6

Alpay Ariyak

@AlpayAriyak

8 months

This is a great initiative! Very happy to see our latest model OpenChat-3.5-0106 there, which scored 3rd overall and 1st at Customer Support Dialogue

PatronusAI

@PatronusAI

8 months

Today, we’re excited to announce the Enterprise Scenarios Leaderboard on Hugging Face, the first LLM leaderboard for real world use cases! 🏆

4

19

75

2

1

6

Alpay Ariyak

@AlpayAriyak

9 months

❤️

Nous Research

@NousResearch

9 months

Nous Research is excited to announce the closing of our $5.2 million seed financing round. We're proud to work with passionate, high-integrity partners that made this round possible, including co-leads @DistributedG and @OSSCapital , with participation from @vipulved , founder

74

75

855

0

5

Alpay Ariyak

@AlpayAriyak

7 months

I thought they were going to release the instruct/chat version, huge respect for this, truly a delight for anyone looking to fine-tune it. Now release the fine-tuning code @grok @elonmusk so the open-source community doesn’t have to waste a ton of money trying to get it working

Teknium (e/λ)

@Teknium1

7 months

Yes, it is a base model apparently:

20

24

373

1

0

6

Alpay Ariyak

@AlpayAriyak

11 months

3 benchmarks added to @huggingface Open LLM Leaderboard - Winogrande, GSM8K & DROP. As a result, Yi-34B is now the top model. This is a step in the right direction, but there is still no coding benchmark, despite coding being one of the main LLM use cases

1

0

6

Alpay Ariyak

@AlpayAriyak

2 months

@aidan_mclau @hyperbolic_labs For 3192 Max Tokens, 0 temp, the results were:
@hyperbolic_labs 405B Instruct (bf16): 190
@togethercompute 405B Instruct (fp8): 187
The code doesn't seem to communicate in the final score if any requests were skipped entirely due to error. I initially ran on hyperbolic with an

0

5
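The concern above, that a final score can hide requests skipped because of errors, can be illustrated with a small tally that reports skipped requests next to the pass count. This is a sketch only: the outcome labels, `RunSummary`, `summarize`, and `report` are hypothetical names and do not describe the evaluation code used in the thread.

```python
from dataclasses import dataclass


@dataclass
class RunSummary:
    passed: int = 0
    failed: int = 0
    skipped: int = 0  # requests that never returned an answer

    @property
    def total(self) -> int:
        return self.passed + self.failed + self.skipped


def summarize(outcomes: list[str]) -> RunSummary:
    """Tally outcomes labelled 'pass', 'fail', or 'error' (hypothetical labels)."""
    summary = RunSummary()
    for outcome in outcomes:
        if outcome == "pass":
            summary.passed += 1
        elif outcome == "fail":
            summary.failed += 1
        elif outcome == "error":
            summary.skipped += 1
        else:
            raise ValueError(f"unknown outcome: {outcome!r}")
    return summary


def report(summary: RunSummary) -> str:
    """Render the score so skipped requests are visible, not silently dropped."""
    return (
        f"{summary.passed}/{summary.total} passed "
        f"({summary.skipped} skipped due to request errors)"
    )
```

With a report of this shape, two providers can only be compared fairly when their skipped counts are both zero, or at least equal.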

Alpay Ariyak

@AlpayAriyak

11 months

@lmsysorg I was playing around with evolving benchmark questions with @WizardLM_AI Evol-Instruct method a while back, and got some interesting results, especially with Breadth evolution. Less cheating than rephrasing, but not fair either - how far can we push this line 😏

0

5

Alpay Ariyak

@AlpayAriyak

6 months

@Teknium1 That makes me want to test Grok-1

1

0

5

Alpay Ariyak

@AlpayAriyak

3 months

@BanghuaZ @lmsysorg Looking forward to this 👀

0

5

Alpay Ariyak

@AlpayAriyak

2 months

@KemingLu612 Thank you! I’m literally generating data with Qwen this very moment haha

1

0

5

Alpay Ariyak

@AlpayAriyak

1 year

@ldjconfirmed Congratulations @ldjconfirmed you’re killing it🎉

0

4

Alpay Ariyak

@AlpayAriyak

3 months

@AarushSah_ @GroqInc @GavinSherry @kraken_9076 @JonathanRoss321 @geeksplainer @RickLamers @KapadiaSoami Congratulations, well deserved!!

1

0

4

Alpay Ariyak

@AlpayAriyak

5 months

@abacaj @carrigmat @WizardLM_AI No actual applications, just based on vibes from personal usage testing on what I normally use GPT-4 and Opus for. The difference vs Llama 3 70B was really apparent when I took the lateral thinking prompts people use (eg the Sally one), rewrote and made new ones based on them:

1

0

4

Alpay Ariyak

@AlpayAriyak

1 year

@osanseviero Thank you for mentioning us, but a small correction - it’s OS Skunkworks :))

1

0

4

Alpay Ariyak

@AlpayAriyak

3 months

This

Teknium (e/λ)

@Teknium1

3 months

If you believe you can't exceed a teacher model at a task with synthetic data alone, then how is this SOTA? Synthetic data is real and is not something that has to cause a mode collapse or top out at the previous SOTA

14

10

186

0

4

Alpay Ariyak

@AlpayAriyak

11 months

This stems entirely from WizardMath requiring a specific prompt template, and serves as a clear example of why we need to evaluate with custom prompts. MetaMath vs WizardMath gain for 70b on GSM8k:
- with prompt templates: +0.8%
- without: +990%

0

1

4
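The two gains quoted above are relative differences between two scores. A minimal sketch of that arithmetic follows; the function name `relative_gain_pct` is hypothetical and the tweet does not state the underlying scores, so none are assumed here.

```python
def relative_gain_pct(new_score: float, base_score: float) -> float:
    """Percentage gain of new_score over base_score."""
    if base_score <= 0:
        raise ValueError("base_score must be positive")
    return (new_score - base_score) / base_score * 100.0


def score_ratio(gain_pct: float) -> float:
    """Convert a percentage gain back into a ratio of scores."""
    return 1.0 + gain_pct / 100.0
```

By this arithmetic, a gain of +990% corresponds to one score being about 10.9 times the other, while +0.8% corresponds to a ratio of about 1.008, which is why the missing prompt template changes the comparison so drastically.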

Alpay Ariyak

@AlpayAriyak

3 months

@Arindam1408 Plans to release the dataset?

0

3

Alpay Ariyak

@AlpayAriyak

3 months

@Teknium1 @chargoddard Of H100 SXMs or something else?

1

0

3

Alpay Ariyak

@AlpayAriyak

1 year

@Sentdex @LambdaAPI We’re working on something similar, but with each expert being a LoRA adapter, rather than a full fine-tuned version of the base model

0

3

Alpay Ariyak

@AlpayAriyak

1 month

@abacaj Welcome back

0

3

Alpay Ariyak

@AlpayAriyak

1 year

@SanhEstPasMoi @felix_red_panda @thukeg @vilm_hq @skunkworks_ai To add to this, the incredible Otter based on OpenFlamingo:

0

3

Alpay Ariyak

@AlpayAriyak

1 year

@Yampeleg @far__el @Sentdex @Yampeleg :

0

3

Alpay Ariyak

@AlpayAriyak

7 days

@victorsungo Congratulations @victorsungo and all of @WizardLM_AI team! Well deserved :) People don't talk about Arena Learning and Auto-Evol-Instruct enough

0

3

Alpay Ariyak

@AlpayAriyak

10 months

Don’t forget @imonenext and the OpenChat team for the incredible OpenChat 3.5 used as base model for this, which was already the highest scoring OS model on MT-Bench :)

Banghua Zhu

@BanghuaZ

10 months

Forgot to add this, but huge kudos to the whole team: Evan Frick, @WthThao (two co-first authors), @zhuhl98 and Jiantao Jiao. Also huge thanks to the open source communities for their great work: @lmsysorg , @huggingface , @AIatMeta , @MistralAI , @alignment_lab , @AnthropicAI ,

0

14

0

2

Alpay Ariyak

@AlpayAriyak

6 months

@vitaliychiley Thank you for your work and congrats on the release! We'd like to fine-tune it at @OpenChatDev and open source that. Could you please provide the training code/script and hyperparameters you used for DBRX instruct?

0

2

Alpay Ariyak

@AlpayAriyak

7 months

@SciumoInc @erhartford These are the exact values and benchmarks, Grok’s scores were taken directly from xai’s announcement in November . Or did you mean something else?

1

0

2

Alpay Ariyak

@AlpayAriyak

10 months

@francoisfleuret @gblazex I set up the same quantized version in Text-Generation-WebUI with llama.cpp as backend and was able to get impressive results. Not sure what the exact equivalent args in llama.cpp would be, but I detailed everything and put more examples in this gist:

2

0

2

Alpay Ariyak

@AlpayAriyak

1 year

@altryne @Yampeleg @far__el Soon 🚀

0

2

Alpay Ariyak

@AlpayAriyak

6 months

@abacaj This

0

2

Alpay Ariyak

@AlpayAriyak

4 months

@DavidM4302 @giffmana Thank you for putting this together, sent you a little redeemable token of appreciation from RunPod :)

0

2

Alpay Ariyak

@AlpayAriyak

2 months

@yacineMTB Are you training a model to classify whether an image is the loss meme or not?

6

0

2

Alpay Ariyak

@AlpayAriyak

3 months

Similar performance to DoReMi while using 10x less compute 👀

Qian Liu ✈️ COLM 2024

@sivil_taram

3 months

Still following your human intuition to mix corpora from different sources for language model pre-training 🧠? Everyone says that data mixture has a big impact on model performance, but how - and why🕵️? Did you know that web corpora are actually highly impactful for downstream

5

51

178

0

2

Alpay Ariyak

@AlpayAriyak

1 year

@Teknium1 I was so excited until I realized it was only ~10% of the full dataset 🥲

1

0

2

Alpay Ariyak

@AlpayAriyak

10 months

@ArielNLee Thank you Ariel! Platypus was part of the dataset, so your work is greatly appreciated as well 🙌🏼

0

2

Alpay Ariyak

@AlpayAriyak

2 months

@aidan_mclau @hyperbolic_labs Are you sure the 2 providers you tried don’t just have different default hyperparameters?

1

0

2

Alpay Ariyak

@AlpayAriyak

3 months

@altryne Happy birthday, brother!)

0

2

Alpay Ariyak

@AlpayAriyak

6 months

@agihippo before you rek(y)a(self) reference to chorus of 1:24-1:30 :)

Ice Cube - Check Yo Self (Remix) (Official Music Video)

REMASTERED IN HD! Official Music Video for Check Yo Self performed by Ice Cube.Follow Ice Cube:Instagram: https://www.instagram.comTwitter: https://twitter.c...

www.youtube.com

1

0

2

Alpay Ariyak

@AlpayAriyak

11 months

@far__el @Teknium1 @erhartford GSM8k is also not MCQ

1

0

2

Alpay Ariyak

@AlpayAriyak

6 months

@b_arbaretier Yes, but you can have a lot of deployments up simultaneously. Our deployments add up to a 950k tokens per minute limit, which is incredible for generating data in combination with @LiteLLM ’s router

2

0

2
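The idea of spreading data-generation requests over several deployments, each with its own tokens-per-minute limit, can be sketched in plain Python. This is not LiteLLM's router API; `Deployment`, `total_tpm`, and `pick_deployment` are hypothetical names, and any per-deployment limits passed in are illustrative, since the tweet only gives the 950k combined figure.

```python
import random
from dataclasses import dataclass


@dataclass
class Deployment:
    name: str
    tpm_limit: int             # tokens-per-minute quota for this deployment
    used_this_minute: int = 0  # tokens already spent in the current window

    @property
    def remaining(self) -> int:
        return max(self.tpm_limit - self.used_this_minute, 0)


def total_tpm(deployments: list[Deployment]) -> int:
    """Combined quota across all deployments."""
    return sum(d.tpm_limit for d in deployments)


def pick_deployment(deployments, tokens_needed: int, rng=random):
    """Pick a deployment with enough headroom, weighted by remaining quota.

    Returns None when no deployment can serve the request in this window.
    """
    if tokens_needed <= 0:
        raise ValueError("tokens_needed must be positive")
    candidates = [d for d in deployments if d.remaining >= tokens_needed]
    if not candidates:
        return None
    weights = [d.remaining for d in candidates]
    chosen = rng.choices(candidates, weights=weights, k=1)[0]
    chosen.used_this_minute += tokens_needed
    return chosen
```

Weighting by remaining quota steers traffic toward the least-loaded deployment, so the combined limit is approached evenly instead of one deployment being exhausted first. A real setup would also reset `used_this_minute` at each window boundary.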

Alpay Ariyak

@AlpayAriyak

1 month

@winglian The 🐐

0

2

Alpay Ariyak

@AlpayAriyak

2 months

@sam_paech Also this

Alpay Ariyak

@AlpayAriyak

2 months

@aidan_mclau @hyperbolic_labs For 3192 Max Tokens, 0 temp, the results were:
@hyperbolic_labs 405B Instruct (bf16): 190
@togethercompute 405B Instruct (fp8): 187
The code doesn't seem to communicate in the final score if any requests were skipped entirely due to error. I initially ran on hyperbolic with an

0

5

0

2

Alpay Ariyak

@AlpayAriyak

7 months

is it even worth it… will hold off to see if there will be findings and developments that help justify the cost of making the fine-tuning work correctly, hyperparameter exploration and the fine-tuning itself. Like how Guan put it, “this might be another case of Falcon 180B”

0

2

Alpay Ariyak

@AlpayAriyak

5 months

@Teknium1 Thank you :)

0

2

Alpay Ariyak

@AlpayAriyak

7 months

@nisten @abacaj 💙

0

2