Most people don't know that Lightning Studios offer:
- free persistent storage
- free persistent environments
- unlimited background execution
- VSCode, PyCharm (any IDE) integration
Set up your Studio environment once and reuse it any time 🤯🤯
In this new 90-minute lecture, I show how to pretrain a 3B LLM from scratch. No edits. No detail skipped.
Companies want you to believe pretraining models is super hard and costly. With the right tools, it's not.
- We start by tuning the model on a cheap A10G.
- Then we scale
Excited to announce the release of Bolts!
- Linear + logistic regression on TPUs/GPUs
- self-supervised learning
- RL
- GANs
- GPT
- Callbacks library
- Datamodules library
All powered by Lightning and rigorously tested.
Don't waste more time implementing your own baselines...
Bolts is a new Deep Learning research and production toolbox from PyTorch Lightning. Iterate faster with pre-trained models, components, callbacks, and data sets, all modular, tested, and optimized for GPUs/TPUs.
Simply subclass, override, and train.
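For context, the subclass-and-override pattern looks roughly like this (a minimal sketch; the pl_bolts class name and constructor args are assumptions from the docs, so verify against your installed version):

```python
# Minimal sketch of subclass-and-override (pl_bolts class/constructor
# assumed from the docs; verify against your installed version).
import torch
from pl_bolts.models.autoencoders import VAE

class MyVAE(VAE):
    # override just the piece you care about; everything else is inherited
    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=1e-4)

model = MyVAE(input_height=32)
# then train as usual: pl.Trainer().fit(model, train_dataloader)
```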
Excited to announce our new compiler - Thunder!
(built in collaboration with NVIDIA). 🤯 🤯
Thunder is a source-to-source compiler for PyTorch. It speeds up PyTorch models.
As an example, it speeds up Llama 2 7B by 40%.
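Usage is meant to be a one-liner. A hedged sketch (`thunder.jit` is the entry point as I understand lightning-thunder; the exact API may differ across versions):

```python
# Hedged sketch: thunder.jit is the entry point as I understand
# lightning-thunder; the exact API may differ across versions.
import torch
import thunder

model = torch.nn.Sequential(torch.nn.Linear(512, 512), torch.nn.GELU())
thunder_model = thunder.jit(model)        # trace + generate optimized code
out = thunder_model(torch.randn(8, 512))  # runs the compiled module
```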
Excited to announce the launch of GPT-42.
- Half the size of GPT-3 (100 billion parameters)
- runs on 775 watts a day (2000 calories)
- can do one-shot learning
- multi-modal
- does NOT require 9,000 GPUs
- 300,000 years worth of evolution research!
Inference API coming soon!
LLM? old news. New hotness will be LVMs (Large Vision Models).
This time, looks like @GoogleAI is finally staying ahead of @OpenAI.
How well does it work?
a 🧵-> 1/n
Excited to release our latest paper (with @kchonyc) which establishes a conceptual framework for characterizing contrastive learning methods (SimCLR, BYOL, CPC, AMDIM, SwAV). (work done at @facebookai)
Btw... this was the motivation for @PyTorchLightnin
We sped up the best open-source LLM (Falcon 40B) by 20% simply by porting it over to use Lightning.
We'll keep making it faster over the next few weeks. Stay tuned!
Available now in Lit-parrot:
It's PhD application season 😑.
PhDs will tell you not to get one.
Non-PhDs will tell you to do it.
Job postings tell you to get one.
What to do?
a 🧵 on:
- industry vs academia
- schools vs advisor
- minority applicants and how not to get intimidated
- age for a PhD
1/9
AI should be fully open source and part of the collective knowledge!
Excited to announce a fully open-source (Apache 2.0), high-performance implementation of LLaMA
Join our discord (@LightningAI) to build AI
Code for @OpenAI's very deep VAE was released.
Cool trick mentioned: "Gradient Skipping - Skip an update when a grad norm is above a threshold (if the KL blows up)"
VAE twitter, what other tricks do you use normally?
Code:
Paper:
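For reference, the gradient-skipping trick looks roughly like this (a sketch; the threshold value is illustrative, not from the paper):

```python
# Sketch of gradient skipping (threshold is illustrative, not the
# paper's value): skip the update when the grad norm blows up.
import torch

def step_with_grad_skipping(model, optimizer, loss, threshold=400.0):
    optimizer.zero_grad()
    loss.backward()
    # max_norm=inf computes the total grad norm without clipping anything
    grad_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), float("inf"))
    if grad_norm < threshold:
        optimizer.step()  # only update on sane gradients
```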
Round 2... I use CodeLlama 70B and Mixtral MoE to write code to finetune a model on 16 GPUs (multi-node) 🤯🤯
Video has zero edits. This is a realistic iterative development workflow.
TL;DR: Both are good. Mixtral MoE is super fast and writes clean code.
More below 👇🏻
Neural networks get a bad reputation for being black boxes.
In this tutorial, I’ll show you how to use backpropagation to change the input so as to classify it as whatever you would like.
Had fun making this with @alfcnz
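The core idea, as a minimal sketch (model interface and hyperparameters are illustrative, not the tutorial's exact code):

```python
# Minimal sketch: freeze the weights, backprop into the input until the
# model assigns the class you want (model/hyperparams are illustrative).
import torch
import torch.nn.functional as F

def optimize_input(model, x, target_class, steps=100, lr=0.1):
    model.eval()
    x = x.clone().requires_grad_(True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.cross_entropy(model(x), torch.tensor([target_class]))
        loss.backward()  # gradients flow into the input, not the weights
        opt.step()
    return x.detach()
```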
Best investment Zuck made was starting Facebook AI (@AIatMeta).
Even after this move, I'm pretty sure Wall Street still doesn't know what Llama is...
@PyTorchLightnin has become a favorite for AI researchers and industry experts... but we thought a high-level framework was missing for common use cases (Lightning is mid-level).
Excited to introduce Flash!
Blog:
Repository:
Super cool! Excited to announce our partnership with @facebookai to help standardize research and production code with @PyTorchLightnin.
If you're a company not using Lightning, look into it! It's great for research AND production :)
I feel like @PyTorch dataloaders are underrated... I remember the early days messing with HDF5, CSV, NumPy arrays, etc... data processing used to take forever.
Dataloaders are brilliant.
2024 AI predictions ⚡️
1. 1B models will outperform 70B models.
2. Models will be deployed on CPUs for almost free. Not API services.
3. Data quality will yield the next 10x boost in performance.
4. A combination of open source models will beat the best private models.
Here's a deep dive into VAEs for color images (not MNIST for a change!), and a matching implementation in @PyTorchLightnin.
- ELBO, KL and reconstruction intuition
- Implementation
Colab:
Github:
Blog:
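For reference, a generic Gaussian-VAE ELBO looks roughly like this (a sketch, not the exact code from the repo; MSE reconstruction is one common choice for color images):

```python
# Generic Gaussian-VAE ELBO sketch (not the repo's exact code).
import torch

def elbo_loss(x, x_hat, mu, logvar):
    # reconstruction term: how well the decoder rebuilds the input
    recon = torch.nn.functional.mse_loss(x_hat, x, reduction="sum")
    # KL term: keep the posterior q(z|x) close to the N(0, I) prior
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl  # minimizing this maximizes the ELBO
```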
Yesterday, AI became about corporate self-interest. A divorce from the broad AI research field that made these companies possible in the first place.
PyTorch Lightning and @LightningAI will not sell out; we commit to continuing to give back to the AI community and open source.
Since OpenAI’s surprise release of its GPT-4 model yesterday, there has been a raft of online criticism about the accompanying 'technical report.' Thanks to @_willfalcon for sharing his thoughts in this new Q&A:
Excited to release @PyTorchLightnin 1.1.0. Built in collaboration with @facebookai, you can now train any NLP Transformer model, SwAV (self-supervised), Speech or Image model with 60% less memory!
With ZERO code changes!
Read more here:
Don’t work on LLMs without learning a few key principles!
Today we release our next set of lectures on LLMs and NLP!
- working with embeddings
- limitations of RNNs
- self-attention principles
- how LLMs work
@karpathy, love your minGPT demo!
I removed 100+ lines of boilerplate by adapting it to use @PyTorchLightnin.
Also gave you half precision, multi-GPU and TPU training for free ;)
Here's the demo:
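The wrapper is roughly this shape (a sketch; minGPT's forward returns (logits, loss) when targets are passed, and the Trainer flags follow that era's Lightning API):

```python
# Sketch of the wrapper (minGPT's forward returns (logits, loss) when
# targets are passed; Trainer flags follow that era's Lightning API).
import torch
import pytorch_lightning as pl
from mingpt.model import GPT  # @karpathy's repo

class LitGPT(pl.LightningModule):
    def __init__(self, gpt: GPT):
        super().__init__()
        self.gpt = gpt

    def training_step(self, batch, batch_idx):
        x, y = batch
        _, loss = self.gpt(x, y)
        return loss

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=3e-4)

# half precision / multi-GPU come from Trainer flags, not model code:
# pl.Trainer(precision=16, gpus=8).fit(LitGPT(gpt), train_loader)
```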
In 2015 I attended @NeurIPSConf for the first time. That was the year I built my first neural network (in Theano). After that conference I was so convinced about the future of deep learning that I quit my job at @GoldmanSachs and went all in on DL!
How did you get started?
Excited to launch "Open in Studio" 🤯🤯 - the Open in Colab alternative 🙂
✅ Open Github repos
✅ Open to specific notebooks/files
✅ Persistent storage
✅ Persistent environment
✅ Live code together
✅ T4, L4, A100, H100 and more instances
✅ 22 free GPU hours per month
If you're working on a new ML product/startup and need funding, please reach out!
I'm especially interested in supporting fellow Latinxs (and any underrepresented group).
I can't wait to see what you build next and to help get you there!
@black_in_ai @_LXAI @wimlds
Glad we’re inspiring more companies to give free GPUs 🔥 - always lead from the front
All users get 22 free GPU hours on Lightning Studios
our free tier has even more!
Excited to announce our Series A funding of $18.6M!
If you thought Lightning gave you superpowers, wait till you try our new platform @_gridai!
With no code changes, spin up hundreds of models across thousands of GPUs on the cloud of your choice!
Finetuning does not have to be expensive! We put together a guide that discusses trade-offs between compute and time requirements.
Would love to hear others' experiences in the responses 👇🏼
Excited to launch torch-metrics - @PyTorch metrics optimized for distributed training at scale.
We thought the Lightning metrics could be useful to all @PyTorch users beyond just @PyTorchLightnin :)
Highly recommend this video on writing optimized CUDA kernels by @marksaroufim from the @PyTorch team.
Perf checklist:
- coalesced global memory access
- maximize occupancy
- memory or compute bound
- minimize control divergence
... + 4 other items
In this video, I convert a VAE from the @PyTorch repo into Lightning in under 45 minutes.
As is obvious from the video, it's a faithful attempt to replicate the experience of doing this for an unseen project.
It's official, @PyTorchLightnin has hit 100k downloads in just 5 months! 45k just in the last 2 months.
If you are still coding @PyTorch for-loops by hand or doing your own distributed training, you should REALLY check out Lightning.
🛠️Tooling Tuesday🛠️
Today, we share a @GoogleColab notebook implementing a Transformer with @PyTorch, trained using @PyTorchLightnin.
We show both encoder and decoder, train with teacher forcing, and implement greedy decoding for inference.
👇1/N
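Greedy decoding itself is tiny. A minimal sketch (the model's (src, tgt) -> logits interface and the special-token ids are assumptions, not the notebook's exact code):

```python
# Minimal greedy decoding sketch; the model's (src, tgt) -> logits
# interface and the special-token ids are assumptions.
import torch

@torch.no_grad()
def greedy_decode(model, src, bos_id, eos_id, max_len=50):
    ys = torch.tensor([[bos_id]])  # start with <bos>
    for _ in range(max_len):
        logits = model(src, ys)                           # (1, T, vocab)
        next_id = logits[:, -1].argmax(-1, keepdim=True)  # most likely token
        ys = torch.cat([ys, next_id], dim=1)
        if next_id.item() == eos_id:
            break
    return ys
```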
Anyone want to run their PyTorch code on TPUs?? Now you can with Lightning...
Best part is - no need to change your code!!! It's 100% hardware-agnostic.
Who are the best ML engineers you know?
we are hiring in NYC and SF. 🚀🚀
Lightning is only 45 people 🤯🤯
(vs the 200+ at most AI startups).
- We are flat, focused and lean.
- No meetings or noise.
- Ship or bust
Every leader ships - My GH below
CPC, AMDIM, SimCLR: BatchNorm in contrastive learning is needed to keep networks from "cheating" and thus collapsing.
The ablations below: turns out BatchNorm IS what keeps BYOL from collapsing.
sigh... smh
Excited about our transfer learning tutorial with @PyTorchLightnin.
Our ResNet-50, pretrained using unsupervised learning, achieved double the performance of a torchvision-pretrained ResNet-50.
Our model was trained using SwAV (from @facebookai, implemented in Lightning). 1/n
Fall semester week ⑩ practicum with @_willfalcon! 🎉
Learn about:
• Supervised and self-supervised transfer learning and fine-tuning;
• Formatting your code with @PyTorch @PyTorchLightnin;
• Access the latest paper re-implementation.
Lit-LLaMA can now be finetuned with LLaMA-Adapter.
LLaMA-Adapter is a technique to finetune LLMs that tweaks only 1.2 million parameters. Within 1 hour, "LLaMA becomes an instruction-following model!"
LLaMA-Adapter paper - (nice
Lightning is focused, small and lean 😊😊
I grew up in special operations (SEAL training) and operate our company with the same mindset.
Incredibly proud of our small, FOCUSED, team for building a product that's already redefining AI development.
(Also, shout out to our fellow
I had a lot of fun chatting with Mathilde from @facebookai about her latest work SwAV, which is the state of the art for self-supervised learning!
Paper:
Colab:
We finally launched Lightning Studio - Google Colab and SageMaker done “the Lightning way” ⚡⚡
Took us ~3 years to develop the “iPhone” experience for AI tooling.
This video shows a very simple way to save ~60% on cloud cost by first debugging on CPU,
Amazing to cross 10k @github stars (559 days) for @PyTorchLightnin.
Whoever is out there coding the next cool thing, keep at it!
But I'm now a small part of this equation. Lightning is now a true community-driven project, so congrats to the contributors and community!
Thunder is the number 5 trending repo on GitHub! 🤯🤯
Thunder speeds up PyTorch models up to 40% (it's early still 😅😅). Contribute and support it so we can make PyTorch models even faster!
My approach to AI research has been to work on fundamental science with less focus on chasing new SOTAs. But it seems like the community over-indexes on work with a 1% improvement to SOTA instead?
For senior researchers, how do you balance this? What about early career?
We're kicking off our Large scale Infinite Training initiative (LIT-training) - LLMs for all ⚡⚡⚡
Donate your spare compute capacity to train Lit-LLaMA fully open-source and transparently.
AI is the new electricity, it cannot be owned by gatekeepers!
Excited to finally release @gridai_! Run any code on the cloud at scale from your laptop with ZERO code changes.
Over the years we've been passionate about helping the AI community scale up with @PyTorchLightnin. It's time to bring that to your ML infra!
Still haven’t checked out @PyTorchLightnin? If you’re still using Keras, TensorFlow, or simply haven’t refactored your PyTorch code, it doesn’t hurt to see what it’s like in the Lightning world!
🚀 New feature alert 🔥⚡️
Today we take a big step to increase the flexibility of @PyTorchLightning with LightningLite: get the scaling benefits of Lightning without necessarily needing the full framework.
Convert your @PyTorch code in seconds.
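The pattern looks roughly like this (a sketch based on the 1.5-era docs; method names as I recall them, so check your Lightning version):

```python
# Sketch of the LightningLite pattern per the 1.5-era docs
# (method names as I recall them; check your Lightning version).
import torch
from pytorch_lightning.lite import LightningLite

class Lite(LightningLite):
    def run(self):
        model = torch.nn.Linear(32, 2)
        optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
        model, optimizer = self.setup(model, optimizer)  # device/precision handled
        for _ in range(10):
            x = torch.randn(64, 32, device=self.device)
            loss = model(x).pow(2).mean()
            self.backward(loss)  # replaces loss.backward()
            optimizer.step()
            optimizer.zero_grad()

Lite(accelerator="cpu").run()  # scale with accelerator="gpu", devices=4, etc.
```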
Lightning 1.0.0 is here!
We also launched our new homepage - the home for everything Lightning, from blogs to tutorials to docs!
We're also on producthunt today!
The best use of AI (imho) is to augment humans, not replace them.
The focus of Muse () is to help artists get inspiration...
We also made it free and open source so it can run on your laptop without needing the cloud
Weekend treat - L4 GPUs are now available on Studios 🔥🔥
- $0.88 per hour
- 17 free hours per month 🤯
- choose 1 or 4 L4 GPUs ⚡⚡⚡⚡
- 1/2 the price of A10G 🤑
- 24 GB VRAM
- free persistent storage included
Had you told me in 2012, when I was in the middle of going through @us_navyseals training, that I would write a deep learning framework used by thousands just a few years later, I would have said “what is coding???”
1/n
Humbled to hear so many people are interested in joining the first @PyTorchLightnin maintainers' group. I'm looking to build a diverse team of 5 peeps to scale the explosive growth! 2.5k GitHub stars, 46k downloads in a few months. For those interested:
100,000 users in < 5 months 🤯🤯🤯
next stop: 1 million users!
Lightning AI Studio adoption by top AI labs and enterprises has been truly humbling... The 100,000th user gets free Pro tier for 3 months 👇🏼👇🏼
Congrats to Suhas Nandiraju at Crosby Health, the 100,000th user! who
Congrats to the @AIatMeta team on Llama 3 🤯🤯
It is by FAR the best open source model I've played with!
Run your personal llama 3 in a Lightning Studio now... let me know what you think about the model!
⚡ PyTorch @LightningAI 2.0 is out! 🤯
Fireside chat at (12 ET):
- Why AI has to stay OpenSource ⚡⚡
- History of PyTorch @LightningAI
- New features in 2.0 to help you with foundation models
- How we introduced our final, stable API
👉
1/3 As a researcher from a non-traditional background, I take attribution very seriously and embrace constructive scientific discussion that accelerates learning and advances progress. Flash was developed to serve the evolving needs of PyTorch Lightning users.
A new open source model drops every day…
but it feels like people just tweet what they hear instead of what they experience?? 🤔
My gut says no one is using them in reality… just talking about potentially using them…
what am I missing?
... Pretraining on 32 H100s on Lightning Studios 🤯🤯!
A single Studio can scale from interactive development to multi-node training.
Full video walkthrough to pretrain a 3B param LLM on Studios.
This is a MAJOR milestone for the AI community!! To say I'm super excited is an understatement.
Thanks for your leadership here @soumithchintala, and support from @MetaAI @ylecun @schrep
Purest commitment to open source I've seen to date 😍
Live now, casually connect your Cursor IDE to @LightningAI Studio for remote development on GPUs 🤯🤯
✅ 22 GPU hours free per month
✅ Unlimited storage
✅ Auto sleep to save costs
✅ Persistent environment
✅ Persistent filesystem
ODE twitter, would love to know some of your use cases for ODEs!
For those not familiar with neural ODEs, here’s a walk-through (by @MichaelPoli6) for implementing neural ODE models with @Diffeq_ml, powered by @PyTorchLightnin.
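If you just want the core idea, here's a generic sketch using torchdiffeq's odeint (a related library, used only to illustrate the concept, not the exact torchdyn walkthrough):

```python
# Generic neural ODE sketch with torchdiffeq's odeint (a related
# library, used here only to illustrate the idea).
import torch
from torchdiffeq import odeint

class VectorField(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(2, 64), torch.nn.Tanh(), torch.nn.Linear(64, 2)
        )

    def forward(self, t, x):  # dx/dt = f(t, x), parameterized by a net
        return self.net(x)

f = VectorField()
x0 = torch.randn(16, 2)        # batch of initial states
t = torch.linspace(0, 1, 10)   # integration times
trajectory = odeint(f, x0, t)  # (10, 16, 2): states along the solution
```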
I just open-sourced a research library I built for myself as I work on my PhD. It's like Keras but with more control for research purposes. Would love contributors if anyone's interested in polishing it up!
@NYUDataScience @PyTorch #AI #DataScience
Training with multiple GPUs has many approaches. I cover Data Parallel, Distributed Data Parallel, and the new, more efficient sharded training, which works with any distributed data parallel mode.
So far we are seeing 60% savings in memory by enabling sharded.
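Switching between these is a one-line Trainer change. A sketch (flag names follow the 1.x-era API as I recall it; newer versions use strategy=...):

```python
# Sketch: switching multi-GPU strategies is a Trainer flag (names follow
# the 1.x-era API as I recall it; newer versions use strategy=...).
from pytorch_lightning import Trainer

Trainer(gpus=4, accelerator="dp")                          # Data Parallel
Trainer(gpus=4, accelerator="ddp")                         # Distributed Data Parallel
Trainer(gpus=4, accelerator="ddp", plugins="ddp_sharded")  # sharded DDP (~60% memory savings)
```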
If you are in a PhD, you WILL get whatever job you are interested in at the end of it. But spend that time learning new things, growing intellectually, and engaging in scientific discourse. Don't rush it! It's about the journey, not the destination.
3/9