Yang You Profile
Yang You

@YangYou1991

7,729
Followers
412
Following
79
Media
373
Statuses

Presidential Young Professor at @NUSingapore . @Forbes 30 under 30. Ph.D. from @UCBerkeley . Founder, President and Chairman of @HPCAITech and Colossal-AI.

Joined January 2014
Pinned Tweet
@YangYou1991
Yang You
1 year
I am happy to share that our paper won the Distinguished Paper Award of @RealAAAI Congratulations to my Ph.D. student @zhengzangw , who is the first author of this paper. #AAAI23 #AAAI #AI #ArtificialIntelligence
Tweet media one
15
36
358
@YangYou1991
Yang You
2 years
It is my first time teaching face-to-face courses after I became a professor :-)
Tweet media one
Tweet media two
Tweet media three
303
1K
16K
@YangYou1991
Yang You
3 months
Say hello to Grok-1's new PyTorch+HuggingFace edition! 🚀 314 billion parameters, 3.8x faster inference. Easy to use, open-source, and optimized by Colossal-AI. 🤖 Dive in: #Grok1 #ColossalAI 🌟 Download Now:
34
119
736
@YangYou1991
Yang You
4 months
Exciting News from Open-Sora! 🚀 They've just made the ENTIRE suite of their video-generation model open source! Dive into the world of cutting-edge AI with access to model weights, comprehensive training source code, and detailed architecture insights. Start building your dream
15
158
624
@YangYou1991
Yang You
4 months
🚀 🌐 Build your own video generation model like #Sora ! Experience the power of replication without the price tag! Open-Sora delivers a low-cost implementation of Sora, cutting costs by a staggering 46%. Expand your sequences to nearly a million with this innovative open-source
18
116
461
@YangYou1991
Yang You
8 months
Time flies! I got my PhD from Berkeley 1218 days ago. My first PhD student is graduating. That is my first achievement :-)
Tweet media one
7
6
380
@YangYou1991
Yang You
1 year
AAAI distinguished paper award!
Tweet media one
Tweet media two
12
14
374
@YangYou1991
Yang You
4 months
Want to train a model like #Sora ? Check out our new project #OpenDiT ! OpenDiT is an easy-to-use, fast, and memory-efficient system for training and deploying DiT models, which are the foundation of models like Sora. With OpenDiT, you can achieve: * Up to 80% faster in training *
Tweet media one
Tweet media two
5
68
327
@YangYou1991
Yang You
4 years
I officially graduated from UC Berkeley. It really was a great journey!
Tweet media one
10
0
285
@YangYou1991
Yang You
4 years
First day as a faculty member at Computer Science Department of @NUSComputing Excited!
Tweet media one
8
0
253
@YangYou1991
Yang You
3 months
Speedup Open-Sora's training by 3x and inference by 2x with our novel DSP (Dynamic Sequence Parallelism)! For 10s 512x512 videos, Open-Sora's inference time: 1xH800: 106s 8xH800: 45s 8xH800+DSP: 22s DSP can be seamlessly adapted to all multi-dimensional transformers, unlocking
Tweet media one
2
59
250
@YangYou1991
Yang You
3 years
I am excited to be appointed as "Presidential Young Professor" at @NUSComputing
Tweet media one
20
5
209
@YangYou1991
Yang You
6 months
I am happy to share that our paper has been accepted by ICLR as an ORAL paper (1.2% acceptance rate). InfoBatch: Lossless Training Speed Up by Unbiased Dynamic Data Pruning InfoBatch randomly prunes a portion of less informative samples based on the
Tweet media one
1
40
216
@YangYou1991
Yang You
2 years
Had a great dinner with Dr. Kai-Fu Lee (the man I admire enormously) @kaifulee
Tweet media one
Tweet media two
4
6
180
@YangYou1991
Yang You
4 years
I'm grateful to graduate from @Berkeley_EECS with the Lotfi A. Zadeh Prize. I'm excited to announce that I will join the National University of Singapore as a tenure-track assistant professor at the Department of Computer Science in @NUSComputing
18
4
178
@YangYou1991
Yang You
1 year
I am happy to share that our paper won the Outstanding Paper Award of ACL. We propose CAME to simultaneously achieve two goals: fast convergence as in traditional adaptive methods, and low memory usage as in LLM training.
4
12
159
@YangYou1991
Yang You
2 months
🌟 Get ready for cinematic magic with Open-Sora! 🎉 It generates 16s & 720p video. Say hello to seamless storytelling, where your vivid imagination comes to life in high-definition with just a prompt! 📹✨ Open-Sora's bucket strategy redefines efficiency, with only 64 GPUs,
7
32
146
@YangYou1991
Yang You
4 years
Goodbye United States! Hello Singapore! I will quarantine for 14 days.
Tweet media one
Tweet media two
7
0
124
@YangYou1991
Yang You
1 year
The new semester has started! Just finished my first class this semester :-)
Tweet media one
Tweet media two
2
0
121
@YangYou1991
Yang You
4 years
Students of @UCBerkeley usually got the Ph.D. lollipop when they submitted dissertations. I could not do that because of COVID-19. However, @GradDivision mailed it to me from 13590.66 km away! What a great tradition! What a big surprise! Thanks a lot! @GradDivision @Berkeley_EECS
Tweet media one
2
3
118
@YangYou1991
Yang You
8 months
Our paper was published on May 26th of 2021 and it was also accepted by ACL. We clearly named the method ring self-attention (i.e. ring attention). I did not find any substantial difference between ring self-attention and ring attention. To my knowledge, our work is the first
@haoliuhl
Hao Liu
9 months
New paper w/ @matei_zaharia @pabbeel on transformers with large context size. We propose RingAttention, which allows training sequences that are device count times longer than those of prior state-of-the-arts, without attention approximations or incurring additional overhead.
Tweet media one
Tweet media two
10
180
851
6
18
106
@YangYou1991
Yang You
3 years
I am excited to be mentioned twice at #SC21 Awards Ceremony :-)
Tweet media one
Tweet media two
1
0
90
@YangYou1991
Yang You
6 months
Colossal-AI team just released SwiftInfer - a TensorRT-based implementation of StreamingLLM, boosting inference performance by a whopping 46%! In the scenario of long-text multi-round conversations, StreamingLLM can improve the ability of LLM to understand and remember context,
0
12
78
@YangYou1991
Yang You
2 years
I'll attend NeurIPS: please let me know if you want to chat or grab a coffee (or watch the FIFA World Cup)! DMs are open. Excited to be finally back at an in-person NeurIPS after 3 years 😁 #NeurIPS2022
Tweet media one
2
3
74
@YangYou1991
Yang You
4 years
I am happy to be rated as a good reviewer by a top AI conference.
Tweet media one
2
0
72
@YangYou1991
Yang You
2 years
Would you like to accelerate AI model training by 10x? Do you want an easy-to-use system that abstracts away all the repetitive nonsense from under the hood? Fret not, Colossal-AI is now open-source!
0
16
67
@YangYou1991
Yang You
3 years
It's my honor to receive IEEE @ComputerSociety TCHPC early career award! Hopefully see you at @Supercomputing 😄
Tweet media one
9
0
61
@YangYou1991
Yang You
1 year
Aloha! I am in Hawaii for attending #ICML2023 My research group has 2 papers this year. Also, our startup's Colossal-AI is a platinum sponsor!
Tweet media one
2
2
61
@YangYou1991
Yang You
4 years
I finally finished my 14-day quarantine with a negative COVID-19 testing result 😁 It's like 14 weeks 😂
Tweet media one
1
0
59
@YangYou1991
Yang You
2 years
Because of COVID-19, I left Berkeley without attending the graduation ceremony. It is amazing to meet my advisor again after 2 years!
Tweet media one
0
2
57
@YangYou1991
Yang You
2 years
All of our 5 paper submissions have been accepted by #CVPR2022 Congrats to my students! Hopefully, see you in New Orleans! More details can be found here:
Tweet media one
2
1
56
@YangYou1991
Yang You
4 years
My Ph.D. dissertation: fast and accurate machine learning on distributed systems and supercomputers.
Tweet media one
2
3
52
@YangYou1991
Yang You
1 year
It's great to have dinner with Prof. Jennifer Widom, the dean of @StanfordEng
Tweet media one
0
1
50
@YangYou1991
Yang You
4 years
Goodbye 2020. Hello 2021!
Tweet media one
Tweet media two
1
0
46
@YangYou1991
Yang You
8 months
The former premier of China passed away. He is a visionary leader who dedicates himself to the progress and well-being of his nation. Emerging from humble origins, he, through his exceptional talent and wisdom, ascended to the nation's highest echelons of leadership. Tasked with
Tweet media one
0
2
42
@YangYou1991
Yang You
2 months
🔥 Exciting news in AI! 20% enhancement in training efficiency for LLaMA3 8B and 70B! Colossal-AI offers tailored solutions for LLaMA3 models, significantly boosting training efficiency and setting new standards with exceptional performance. Check out the open-source project on
0
8
42
@YangYou1991
Yang You
3 years
NUS computer science Ph.D. program (full scholarship) has a Spring intake. The deadline is June 15th. Here is the application information: My research group's information can be found at
0
9
40
@YangYou1991
Yang You
4 months
Prompt Learning: forcing human beings to fit machines Instruct Learning: forcing machines to fit human beings
4
3
39
@YangYou1991
Yang You
2 years
To my idol: you are my hero forever and I hope you all the best in your future endeavors.
Tweet media one
4
1
39
@YangYou1991
Yang You
1 year
Congrats to my coauthors!
Tweet media one
@YangYou1991
Yang You
1 year
I am happy to share that our paper won the Outstanding Paper Award of ACL. We propose CAME to simultaneously achieve two goals: fast convergence as in traditional adaptive methods, and low memory usage as in LLM training.
4
12
159
0
0
34
@YangYou1991
Yang You
2 years
Thanks to Dr. Kai-Fu Lee @kaifulee , we had a great meeting with OpenAI President @gdb and Chief Scientist @ilyasut
Tweet media one
1
0
36
@YangYou1991
Yang You
4 years
I am happy to see our LAMB optimizer was included in MLPerf's BERT implementation. Google finished BERT training in 24 seconds based on MLPerf. However, MLPerf used its own convergence metric, which is different from Mr. Jacob Devlin's baseline.
1
2
33
@YangYou1991
Yang You
1 year
It is my pleasure to be the session chair for ML: Optimization at #AAAI23 Our session will cover the latest techniques in machine learning optimization. If you are interested in improving the efficiency of ChatGPT, Stable Diffusion, DALL·E 2, and AlphaFold 2, come to talk to us!
Tweet media one
1
3
35
@YangYou1991
Yang You
27 days
🔥 Getting Chatbot Arena model rankings with 2000× less time (5 minutes) and 5000× less cost ($0.6), simply by mixing the off-the-shelf benchmarks! 🚀 Introducing our MixEval, a revolutionary #LLMs evaluation paradigm that's fast, cheap, and precise! By blending real-world
@NiJinjie
Jinjie Ni
27 days
How to get ⚔️Chatbot Arena⚔️ model rankings with 2000× less time (5 minutes) and 5000× less cost ($0.6)? Maybe simply mix the classic benchmarks. 🚀 Introducing MixEval, a new 🥇gold-standard🥇 LLM evaluation paradigm standing on the shoulder of giants (classic benchmarks).
Tweet media one
10
64
233
2
7
35
@YangYou1991
Yang You
11 months
Congrats to our team for giving a great talk at #ICML2023
Tweet media one
Tweet media two
Tweet media three
Tweet media four
1
0
33
@YangYou1991
Yang You
1 year
I will be speaking at the 37th AAAI Conference on Artificial Intelligence on Feb 7th and 8th! I’ll be discussing efficiently training large AI models like GPT-3 and Stable Diffusion. See you there. #AAAI23 #AAAI #AI #ArtificialIntelligence @RealAAAI
Tweet media one
2
1
32
@YangYou1991
Yang You
1 year
We are actively seeking talented postdoctoral researchers specializing in LLM and MLSys. If you have a passion for these fields, please click on the links below for more information and to apply.
Tweet media one
Tweet media two
0
7
31
@YangYou1991
Yang You
2 years
NBA Finals!!!
1
0
29
@YangYou1991
Yang You
3 years
Berkeley was ranked as No.1 by Forbes on the top US colleges list! Berkeley is the first public university to win Forbes’ top ranking. That's amazing! I miss CAL :-)
Tweet media one
0
3
30
@YangYou1991
Yang You
2 years
Congratulations to Prof. Jack Dongarra for winning the Turing award! Well deserved! BTW, I want to mention that my advisor Prof. James Demmel @Berkeley_EECS also made a significant contribution to HPC and Numerical Libraries. This picture can tell us something :-)
Tweet media one
0
1
30
@YangYou1991
Yang You
4 years
Congratulations to Bill Gropp, who was recently elected as IEEE Computer Society president! Bill was the host of my faculty job interview at the UIUC CS department. He gave me a good piece of career advice. He was a very nice person. I wish him the best of luck for the new job!
Tweet media one
0
0
27
@YangYou1991
Yang You
8 months
I am excited to get on the list of "100 Most Influential Chinese" by Forbes China.
Tweet media one
2
0
29
@YangYou1991
Yang You
2 years
To have a happy life, we should find more people loving us, instead of minimizing the number of people hating us. The number of people hating us really does not matter, but the number of people loving us means how far we can go :-)
Tweet media one
2
0
28
@YangYou1991
Yang You
2 years
Our two tutorials have been accepted by @RealAAAI . It is my privilege to teach AI to top AI experts in the world. See you in Washington DC! #AAAI23 Tutorial 1: Colossal-AI: Scaling AI Models in Big Model Era Tutorial 2: Large-scale Deep Learning Optimization Techniques
Tweet media one
0
2
29
@YangYou1991
Yang You
3 years
Our new work!
@_akhaliq
AK
3 years
Sparse-MLP: A Fully-MLP Architecture with Conditional Computation pdf: abs:
Tweet media one
0
31
114
0
2
26
@YangYou1991
Yang You
2 years
Our new work :-)
@_akhaliq
AK
2 years
FastFold: Reducing AlphaFold Training Time from 11 Days to 67 Hours abs: github:
Tweet media one
4
70
304
1
1
28
@YangYou1991
Yang You
9 months
Based on current techniques, LLM query will be more expensive than search engine query. LLM inference is mainly using matrix-matrix multiply. Search engine (e.g. PageRank algorithm) is based on matrix-vector multiply. Each database query is just a matching. Matrix-matrix
3
1
28
@YangYou1991
Yang You
8 months
ChatGPT (OpenAI) = The iOS of AI. LLaMA (Meta) = The Android of AI. Who will win? Both
6
6
27
@YangYou1991
Yang You
2 years
😂😂
Tweet media one
1
2
23
@YangYou1991
Yang You
1 year
Congrats to the ChatGPT leadership team!
@YangYou1991
Yang You
2 years
Thanks to Dr. Kai-Fu Lee @kaifulee , we had a great meeting with OpenAI President @gdb and Chief Scientist @ilyasut
Tweet media one
1
0
36
0
2
22
@YangYou1991
Yang You
3 years
Transformers are transforming everything in deep learning. All the attention of deep learning is now on "Attention"!
0
1
21
@YangYou1991
Yang You
4 years
I'd like to share a paper recently published by Google: "Exploring the limits of Concurrency in ML Training on Google TPUs". It shows how Google finishes the training of large deep learning models within one minute.
Tweet media one
0
2
21
@YangYou1991
Yang You
2 years
I hope this figure helps me to say "Happy New Year to Everyone!" :-)
Tweet media one
4
2
20
@YangYou1991
Yang You
4 years
I am looking for a postdoc or research fellow. Please feel free to contact me if you are interested.
0
5
21
@YangYou1991
Yang You
4 years
Thanks so much!
0
0
20
@YangYou1991
Yang You
4 years
A peaceful protest at @UCBerkeley . I am happy to see they draw lines on the ground to implement social distancing. I find many of them are actually elderly people, who are vulnerable to COVID-19 and violence. I want to thank them for what they did for the community.
Tweet media one
1
1
19
@YangYou1991
Yang You
3 years
Today, Google highlights Dr. Lotfi A. Zadeh on its search engine! It was my honor to receive the first Lotfi A. Zadeh Prize by @Berkeley_EECS
Tweet media one
0
0
19
@YangYou1991
Yang You
6 months
Excited to kick off the new semester! There's nothing quite like teaching in a bustling classroom packed with so much talent. Here's to a great term ahead! 😊📚
Tweet media one
Tweet media two
Tweet media three
Tweet media four
1
0
19
@YangYou1991
Yang You
4 years
Our new work with @quocleix and @tanmingxing . People now can finish the ImageNet training in 1 minute. However, a 75.9% convergence accuracy is probably too low to be practical. We achieve 83% ImageNet top-1 accuracy in 1 hour, which is a speed world record.
@_akhaliq
AK
4 years
83% ImageNet Accuracy in One Hour pdf: abs:
Tweet media one
1
7
43
1
3
18
@YangYou1991
Yang You
2 years
Our paper has been accepted by #AAAI23 CowClip: Reducing CTR Prediction Model Training Time from 12 hours to 10 minutes on 1 GPU Paper: Code:
0
2
19
@YangYou1991
Yang You
9 months
Excited to share our #ICCV2023 paper: Fine-tuning Vision-Language Models without Zero-Shot Transfer Degradation (ZSCL). ZSCL outperforms the pre-trained model on downstream tasks and maintains its zero-shot transferability to other tasks. paper: blog:
Tweet media one
0
4
18
@YangYou1991
Yang You
2 years
Introducing #ICLR2022 Concurrent Adversarial Learning for Large-Batch Training Motivation: Large-batch training has become a widely used technique when training neural networks with a large number of GPU/TPU processors.
Tweet media one
6
1
19
@YangYou1991
Yang You
3 years
Our Colossal-AI project is currently ranked #1 by Github Trending (python category)
Tweet media one
0
0
19
@YangYou1991
Yang You
2 years
Check our new paper at #NeurIPS2022 Random Sharpness-Aware Minimization We propose a novel random smoothing-based SAM (R-SAM) algorithm. R-SAM essentially smooths the loss landscape and improves the approximation of the inner maximization.
0
4
17
@YangYou1991
Yang You
9 months
Excited to introduce our #ICCV2023 paper Dataset Quantization (DQ). DQ achieves lossless training performances with 2% data keep ratio on language tasks and 60% data keep ratio on vision tasks. Just check out our paper and project:
Tweet media one
1
2
16
@YangYou1991
Yang You
2 years
The major source of the energy cost for training AI models comes from moving the data? Communication costs 10x more energy than computation. Please correct me if I'm wrong :-) For GPT-3: The communication energy cost is 4.7e+26 PJ. The computation energy cost is 3.6e+25 PJ.
Tweet media one
3
2
16
@YangYou1991
Yang You
3 years
PyTorch implementation of LARS for ImageNet: PyTorch implementation of LAMB for ImageNet: Both of them can achieve at least 76.7% accuracy in 90 epochs for both large batch sizes and small batch sizes.
0
1
14
@YangYou1991
Yang You
3 years
Our new paper: ONES automatically manages the elasticity of each AI job based on the workload to maximize GPU utilization and improve scheduling efficiency. Experiments on 64 GPUs show great results. This paper will appear on @Supercomputing #SC21 ()
Tweet media one
1
1
15
@YangYou1991
Yang You
2 years
I am happy to share that my Ph.D. advisor Prof. James Demmel and I will give an invited talk at @odsc See you in San Francisco!
Tweet media one
1
0
14
@YangYou1991
Yang You
9 months
Dubai = Do yoU Buy AI?
Tweet media one
Tweet media two
Tweet media three
Tweet media four
1
1
15
@YangYou1991
Yang You
2 years
Once again, Colossal-AI is ranked as No.1 on the GitHub Python trending list!
Tweet media one
0
2
14
@YangYou1991
Yang You
2 years
I just published Embedding Training With 1% GPU Memory and 10 Times Less Budget, an Open Source Solution for Super-Large Recommendation Model Training on a Single GPU
0
1
12
@YangYou1991
Yang You
2 years
Our new work!
@_akhaliq
AK
2 years
FaceMAE: Privacy-Preserving Face Recognition via Masked Autoencoders abs: Compared to previous sota, FaceMAE consistently reduces at least 50% error rate on LFW, CFP-FP and AgeDB
Tweet media one
1
10
48
0
2
12
@YangYou1991
Yang You
4 years
Chinese Academy of Sciences released a benchmark for fast AI training. They are not the first team to do this. MLPerf is already a huge success. But they have a good summary of how researchers reduced the ImageNet training time from 29 hours to 1 minute.
Tweet media one
1
1
11
@YangYou1991
Yang You
4 months
Thanks for sharing our work. Congrats to the team led by @VictorKaiWang1 and @liuzhuang1234
@_akhaliq
AK
4 months
Neural Network Diffusion Diffusion models have achieved remarkable success in image and video generation. In this work, we demonstrate that diffusion models can also generate high-performing neural network parameters. Our approach is simple, utilizing an autoencoder and a
23
249
1K
0
2
12
@YangYou1991
Yang You
4 years
Congratulations to a former NUS Ph.D. student
@shw3ta_shinde
Shweta Shinde
4 years
Thrilled to share that I will be joining ETH Zurich ( @ETH_en ) as an assistant professor in the CS department ( @CSatETH ). Super excited to move to Switzerland this autumn and work with the amazing students and faculty.
22
3
341
0
0
12
@YangYou1991
Yang You
5 months
This is a photo of me generated by AI :-)
Tweet media one
4
0
12