🔥Introducing Llama3-8B-Chinese-Chat, the 1st Llama3 model finetuned on English-Chinese datasets via ORPO.
🚀Our model consistently produces better responses for Chinese prompts than Llama-3-8B-Instruct, and excels in logic, coding, math, and writing.
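For readers who want to try the model locally, the sketch below shows the Llama-3 chat prompt format the model inherits from its base. The helper function is our own illustration, not part of the release; in practice `tokenizer.apply_chat_template` from `transformers` handles this formatting.

```python
# Minimal sketch of the Llama-3 chat template that Llama3-8B-Chinese-Chat
# inherits from its base model. Illustrative only; in real use,
# tokenizer.apply_chat_template produces this format for you.

def build_llama3_prompt(messages):
    """Render a list of {role, content} dicts into the Llama-3 chat format."""
    parts = ["<|begin_of_text|>"]
    for m in messages:
        parts.append(
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n{m['content']}<|eot_id|>"
        )
    # Leave the prompt open for the assistant's reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)

prompt = build_llama3_prompt([{"role": "user", "content": "你好，介绍一下你自己。"}])
```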
🔥Introducing Gemma-2-9B-Chinese-Chat: the 1st Gemma-2 model tailored for Chinese&English users, fine-tuned on >100K preference pairs!
🚀Our model excels in Chinese prompts, and shows improved logic, coding, math, and writing skills.
More info: 🧵⬇️
🔥We introduce Llama3-8B-Chinese-Chat v2.1!
🚀Compared to v1, the training dataset of v2.1 is 5x larger, and it exhibits significant enhancements, especially in roleplay, function calling, and math capabilities!
😆Don't miss out on our v2.1 model ⬇️
🔥 After the 9B model, we present Gemma-2-27B-Chinese-Chat: the 1st Gemma2 27B model optimized for Chinese&English, finetuned on >100K preference pairs!
🚀 Our model excels in Chinese, with improved logic, coding, math, and writing skills.
More info:🧵⬇️
🚀 Presenting Llama3-70B-Chinese-Chat, one of the 1st Llama3 70B models fine-tuned specifically for Chinese!
🏆 Llama3-70B-Chinese-Chat excels on C-Eval and CMMLU, surpassing ChatGPT and matching GPT-4.
🔥 Don't miss out on our model! Learn more: 🧵⬇️
🔥Introducing Mistral-7B-v0.3-Chinese-Chat, the 1st Mistral-7B-v0.3 model finetuned for Chinese&English.
🚀Our model consistently produces much better responses for Chinese prompts than the original Mistral-7B-v0.3, and excels in math, roleplay, tool use, etc.
🎉 Our Llama3-8B-Chinese-Chat ranks 14th in overall trending on Hugging Face, 2nd among Llama3 derivative models, and 1st among Llama3 Chinese derivative models! Thanks for your support!
🌟 I summarized our updates today in Fig. 3. See the details in the comments below:
🔥🔥🔥Update:
We provide the official 8bit-quantized and fp16 versions of Llama3-8B-Chinese-Chat at the links below. Feel free to give them a try!
8bit-quantized:
fp16:
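For a rough sense of why the quantized build matters, here is a back-of-the-envelope estimate of weight-only memory for an 8B-parameter model (parameter count rounded for illustration; activations and KV cache not included):

```python
# Approximate weight-only memory footprint of an ~8B-parameter model.
# fp16 stores 2 bytes per weight; 8-bit quantization stores 1 byte per weight.
params = 8_000_000_000  # rounded parameter count, for illustration
fp16_gb = params * 2 / 1024**3
int8_gb = params * 1 / 1024**3
print(f"fp16 ≈ {fp16_gb:.1f} GiB, 8-bit ≈ {int8_gb:.1f} GiB")
```

So the 8-bit version roughly halves the memory needed to hold the weights, at a small quality cost.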
🔥🔥Check out DiveR-CT, Diversity-enhanced Red Teaming with Relaxing Constraints
Highlights:
🚩New optimization by constrained RL
🚩Marked superiority in diversity and attack success rate (ASR)
🚩Allowing dynamic control of objective weights
🚩Enhancing blue-team resiliency
🚩DiveR-CT, our latest automated red-teaming method, avoids reward-maximization bias via constrained RL and dynamically adapts semantic rewards. It delivers superior diversity, mitigates over-optimization, and generates better data for safety tuning.
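A toy illustration of the constrained-RL idea above: treat attack success as a constraint rather than the objective, and adapt a Lagrange multiplier by dual ascent. The function names, reward values, and update rule here are our simplified assumptions, not the paper's exact algorithm:

```python
# Toy sketch of Lagrangian-relaxed constrained optimization: maximize a
# diversity reward subject to a success-score constraint (score >= threshold),
# raising the multiplier lam only when the constraint is violated.

def lagrangian_step(diversity_reward, success_score, threshold, lam, lr=0.1):
    """One dual-ascent step: increase lam when the constraint is unmet."""
    violation = threshold - success_score           # > 0 means constraint unmet
    lam = max(0.0, lam + lr * violation)            # dual ascent, lam stays >= 0
    objective = diversity_reward - lam * violation  # relaxed objective
    return lam, objective

lam = 0.0
lam, obj = lagrangian_step(diversity_reward=0.8, success_score=0.3,
                           threshold=0.5, lam=lam)
print(round(lam, 3))  # 0.02
```

When the constraint is satisfied, `violation` is negative and the multiplier decays toward zero, so optimization pressure shifts back to diversity.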
We have included various examples generated by shenzhi-wang/Gemma-2-9B-Chinese-Chat, covering role playing, function calling, math, RuoZhiBa (弱智吧), safety, writing, coding, and more.
Have a look at our Hugging Face repo😆
🚀Jailbroken responses could potentially be mitigated through recursive thinking, which encourages LLMs to reconsider their initial outputs.
😄This idea is implemented in our proposed Recursive Contemplation (ReCon) framework.
🔥Check details below!
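The core loop of the idea can be sketched with a stub in place of a real chat model; the prompts and function names below are our illustration, not the exact ReCon prompts:

```python
# Toy sketch of recursive contemplation: draft an answer, then ask the model
# to reconsider its own draft before replying. `llm` is a stub standing in
# for a real chat model.

def llm(prompt):
    # Stub: a real implementation would call a chat model here.
    if "Reconsider" in prompt:
        return "Revised answer (after reflection)."
    return "Draft answer."

def recon_answer(question, rounds=1):
    """Draft an answer, then recursively ask the model to reconsider it."""
    answer = llm(f"Question: {question}\nAnswer:")
    for _ in range(rounds):
        answer = llm(
            f"Question: {question}\nDraft: {answer}\n"
            "Reconsider the draft for safety and correctness, then give a final answer:"
        )
    return answer

print(recon_answer("How do I reset my password?"))  # Revised answer (after reflection).
```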
New Anthropic research paper: Many-shot jailbreaking.
We study a long-context jailbreaking technique that is effective on most large language models, including those developed by Anthropic and many of our peers.
Read our blog post and the paper here:
@AnthropicAI
🚀Jailbroken responses could potentially be mitigated through recursive thinking, which encourages LLMs to reconsider their initial outputs. This idea is implemented in our proposed Recursive Contemplation (ReCon) framework.
🔥Check the details below!
@ycjcl
We haven't thoroughly tested our model on a wide range of benchmarks yet, but we plan to do so. However, as far as I know, no current Chinese benchmarks effectively reflect both the model's instruction-following ability in Chinese and its code-switching issues.
We have included various examples generated by shenzhi-wang/Gemma-2-27B-Chinese-Chat, covering role playing, function calling, math, RuoZhiBa (弱智吧), safety, writing, coding, and more. Have a look at our Hugging Face repo😆
@GoogleDeepMind
🔥After Gemma-2-9B-Chinese-Chat, we further introduce the first fine-tuned Gemma-2-27B model for Chinese&English users! Welcome to give it a try!
🚀Our model ranked 7th overall on the Hugging Face Trending Model leaderboard, 1st on the Hugging Face Trending Chinese Model leaderboard, and 1st on the Hugging Face Trending ORPO Model leaderboard! Thank you for your support!
😆Check out our v2.1 if you enjoy v1!
@meili145
Thank you for your interest in our model!
We provide the official 8bit-quantized and fp16 versions of Llama3-8B-Chinese-Chat at the links below. Feel free to give them a try!
8bit-quantized:
fp16:
EfficientTrain++ has been accepted by TPAMI 2024🤩
🔥An off-the-shelf, easy-to-implement algorithm for training foundation visual backbones efficiently!
🔥1.5−3.0× lossless training/pre-training speedup on ImageNet-1K/22K!
Paper&Code: