I just posted my 10,000th reply in the @PyTorch discuss forum! Thanks everyone for creating such a great community, for the guidance and mentorship I received, and @soumithchintala for starting this journey.
Thanks a ton for awarding me the @PyTorch superhero award at the conference! It really made my day, and it was great seeing so many community members in real life. You all are amazing!
Wow, this arrived in the mail today from @NVIDIAAI. I’m touched beyond words by the kudos from Jensen and @soumithchintala. I am truly grateful to be able to work in such a unique team with unparalleled colleagues and the @PyTorch community. Thanks to all who made it possible.
This line of @PyTorch code fascinates me every time I come across it:
y = x_backward + (x_forward - x_backward).detach()
As @ThomasViehmann explained: "It gets you x_forward in the forward, but the derivative will act as if you had x_backward"
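A minimal sketch of the trick in action, assuming a quantization-style use case where `torch.round` is the non-differentiable forward op (all names hypothetical, not from the original thread):
import torch

x = torch.randn(4, requires_grad=True)
x_forward = torch.round(x)  # non-differentiable op used in the forward pass
x_backward = x              # op whose gradient we want in the backward pass

# y equals x_forward in the forward pass; the detached difference carries no
# gradient, so the backward pass only sees the x_backward term.
y = x_backward + (x_forward - x_backward).detach()
y.sum().backward()
print(x.grad)  # all ones: the identity's gradient, not round()'s zero gradient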
Thank you for the kind words and for being part of this enjoyable community, @ThomasViehmann! A huge thank you to all of you who make the community and discussion forum so great! 🥳
Post when you are stuck with @PyTorch code not quite doing what you want, and you will more often than not see @ptrblck_de helping you out with patience, kindness, and unparalleled expertise. Yesterday, he reached 30 thousand replies! 🥳
Thank you so much! ❤️
Our @PyTorch team at @nvidia is recruiting! If you love PyTorch and are interested in working on deep learning compilers, automatic code generation, or the PyTorch core, please apply:
Also, send me a DM if you want to talk about the positions! :)
Native Automatic Mixed Precision Training is available in the latest @PyTorch nightly binaries and master! No need to build apex anymore. Check out the examples:
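A minimal sketch of a native AMP training step, assuming a `model`, an `optimizer`, and a CUDA data `loader` (hypothetical names, not from the linked examples):
import torch

scaler = torch.cuda.amp.GradScaler()
for data, target in loader:
    optimizer.zero_grad()
    # Run the forward pass with automatic mixed precision.
    with torch.cuda.amp.autocast():
        output = model(data)
        loss = torch.nn.functional.cross_entropy(output, target)
    # Scale the loss to avoid underflowing float16 gradients; step() unscales
    # before the optimizer update, and update() adjusts the scale factor.
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()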
Today marks 5 years since the public release of PyTorch! We didn't expect to come this far, but here we are 🙂 - 2K Contributors, 90K Projects, 3.9M lines of "import torch" on GitHub. More importantly, we're still receiving lots of love and having a great ride. Here's to the future!
Today, 5 years ago, @ThomasViehmann posted for the first time in the @PyTorch forum. 🥳 I am very grateful for his continuous activity, and I am sure it has had a huge impact on many users (including me). A huge thank you, and here's to the next 5 years! 🎉
PyTorch 1.10 is here!
Highlights include:
- CUDA Graphs APIs updates
- Several frontend APIs moved to Stable
- Automatic fusion in the JIT compiler for CPUs and GPUs
- Android NNAPI now in beta
Blog:
Release:
⚡ PyTorch @LightningAI 2.0 is out! 🤯
Fireside chat at (12 ET):
- Why AI has to stay open source ⚡⚡
- History of PyTorch @LightningAI
- New features in 2.0 to help you with foundation models
- How we introduced our final, stable API
👉
I'm honored to have the opportunity to be the keynote speaker at the Ecosystem Day and am looking forward to meeting as many community members as possible throughout the day! :)
We are excited to announce that @ptrblck_de will deliver the opening keynote for the morning session of PyTorch Ecosystem Day! Register now for #PTED21 here:
@PyTorch implementation of the StyleGAN Generator (Karras et al., @NvidiaAI) with pretrained weights by @ThomasViehmann and me. It was a pleasure to work with him on this project. Feedback always welcome!
Amazing work being done by the @PyTorch team at @NVIDIAAI on the new code generation stack enabling automated fusion for dynamic shapes.
Check out session S31952 at #GTC21! Christian Sarofeen will walk you through the design, benefits, and future directions!
Check out this blog post for the latest on nvFuser, our new default deep learning compiler for NVIDIA GPUs. nvFuser has unique capabilities built just for PyTorch & can achieve great speedups on NLP & Vision networks with its runtime-generated kernels.
Are you missing some @PyTorch layers in the C++ API and are interested in contributing? No in-depth knowledge necessary! Have a look at this thread for more information:
Check out blendtorch by @cheind: integration of Blender renderings into @PyTorch datasets! "We utilize Eevee, a new physically based real-time renderer, to synthesize images and annotations at 60FPS and thus avoid stalling model training in many cases."
Thank you, @ThomasViehmann! I'm humbled by your words and hope my posts can live up to these kudos. Thank you all for being part of this community and for making it so enjoyable. To cite Tom: "I came for PyTorch, but I stayed for the company." :)
The wise and inspiring @ptrblck_de posted his 20,000th post today, helping literally thousands of people to get ahead with their @PyTorch projects. Thank you!
Also, this means 10,000 posts in 2020 alone:
@ajayj_
import torch

def nan_hook(name):
    def hook(m, input, output):
        # Naive check: assumes the module returns a single tensor and warns
        # if any of its values are NaN or +/-Inf.
        if not torch.isfinite(output).all():
            print("Invalid output in {}".format(name))
    return hook

# Register the check on every submodule (assumes `model` is defined).
for name, module in model.named_modules():
    module.register_forward_hook(nan_hook(name))
@StasBekman You are most likely using CUDA 11.7+, which ships with lazy kernel/module loading and is enabled by default in PyTorch 1.13.1+. (Run `export CUDA_MODULE_LOADING=EAGER` to disable it as a test.)
v1.6: native mixed-precision support from NVIDIA (~2x perf improvement), distributed perf improvements, new profiling tool for memory consumption, Microsoft commits to developing and maintaining Windows PyTorch.
Release Notes:
Blog:
@rasbt @lantiga @PyTorch It was great finally meeting you! Your books and lectures were my reference while digging into ML, and now I even got a signed copy of your new book! Time to build an LLM from scratch!
Are you interested in a fully automated GPU code generation system designed and implemented in @PyTorch? Join this GTC session to learn more about the latest updates from our nvFuser team, presented by Christian Sarofeen!
Exciting update for “tracing with primitives” in @PyTorch! If you are interested in contributing and helping us implement PyTorch’s logic in a more readable, hackable, and composable way, please reach out!
We recently posted the third update in PyTorch's "tracing with primitives" series. See . Bringing PyTorch's logic into Python is exciting, and we invite the community to participate!
@aamaljoseph @PyTorch Haha, happy to help. For the shared screenshot: using `strict=False` will ignore the key errors and thus skip those parameters, so make sure this is really what you want.
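A minimal sketch of that behavior, assuming a `model` and a partially matching checkpoint file (both hypothetical):
import torch

# strict=False silently skips mismatched keys instead of raising an error.
incompatible = model.load_state_dict(torch.load("checkpoint.pt"), strict=False)
# Inspect what was actually skipped before trusting the loaded model.
print(incompatible.missing_keys)     # params left at their initialized values
print(incompatible.unexpected_keys)  # checkpoint entries that were ignored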
@vmirly @PyTorch @ThomasViehmann This line was suggested for a workflow where a ReLU is used in the forward pass but a LeakyReLU should be used in the backward pass. Thread:
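A minimal standalone sketch of that workflow using the detach trick from above (hypothetical example, not from the linked thread):
import torch
import torch.nn.functional as F

x = torch.randn(4, requires_grad=True)
x_forward = F.relu(x)                              # values used in the forward
x_backward = F.leaky_relu(x, negative_slope=0.01)  # gradient used in backward
y = x_backward + (x_forward - x_backward).detach()
y.sum().backward()
print(x.grad)  # 1.0 where x > 0 and 0.01 elsewhere, i.e. LeakyReLU's gradient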
@wightmanr @soumithchintala @rom1504 For convs, enable the cuDNN v8 API via `TORCH_CUDNN_V8_API_ENABLED=1` and `bfloat16` should be used. It's experimental at the moment but should soon be enabled by default.
@ajayj_ If you are using modules, you could register a forward hook and add debug print statements to quickly check which layer might be causing it. This code is quite naive but might give you enough information to start digging into the model.
@deliprao The PyTorch binaries ship with their own CUDA dependencies, so you would only need to install a proper NVIDIA driver (not the full CUDA toolkit). `torch.compile` uses `ptxas` for its code-gen, and this binary should also ship in the current nightlies now.
To help developers get started with PyTorch, we’re making the 'Deep Learning with PyTorch' book, written by Luca Antiga and Eli Stevens, available for free to the community:
Join us at @nvidia GTC next week to hear from PyTorch researchers and contributors on the latest for 2.0, performance, and other talks!
🔎 Read more about the PyTorch talks:
🖥️ Register for free:
What a great blog post describing how TorchDynamo and nvFuser can speed up models easily. It even provides a notebook to reproduce the results. "we have not seen any drawback implied by the use of this library, the acceleration just comes for free" 🎉🥳
We have recently tested the excellent TorchDynamo prototype from the @PyTorch team and benchmarked it vs @onnxruntime and TensorRT.
TL;DR: big boost in inference perf + ease of use without major drawback. 👏 @jansel0 & team!
Particle filters are general algorithms for inferring the state of a system with noisy dynamics and noisy measurements. Here's an example with a robot in a circular room. Red=true robot, blue=guesses, occasional red line=noisy range sensor measurement. Details in thread 1/
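A minimal sketch of one bootstrap particle filter step for a 1-D state, under assumed Gaussian dynamics and measurement noise (all names and parameters hypothetical, not from the linked thread):
import torch

n_particles = 1000
particles = torch.randn(n_particles)  # initial state guesses
weights = torch.full((n_particles,), 1.0 / n_particles)

def step(particles, weights, measurement, process_std=0.1, meas_std=0.5):
    # 1) Predict: push every particle through the noisy dynamics.
    particles = particles + process_std * torch.randn_like(particles)
    # 2) Update: reweight by the likelihood of the noisy measurement.
    weights = weights * torch.exp(-0.5 * ((measurement - particles) / meas_std) ** 2)
    weights = weights / weights.sum()
    # 3) Resample: draw particles in proportion to their weights.
    idx = torch.multinomial(weights, n_particles, replacement=True)
    return particles[idx], torch.full_like(weights, 1.0 / n_particles)

particles, weights = step(particles, weights, measurement=torch.tensor(0.3))
estimate = (weights * particles).sum()  # posterior mean as the state estimate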
We’re excited to announce the first-ever PyTorch Ecosystem Day #PTED21, a virtual event designed for our ecosystem and industry communities to showcase their work and discover new opportunities to collaborate and network!
Apply now 👇
What a great interview! "I came for PyTorch, but I stayed for the company." I can only confirm @ThomasViehmann's statement; he became a very good friend through PyTorch.
Giveaway + Release:
Here's my interview with 3 great contributors to @PyTorch: all about their book Deep Learning with PyTorch by @ManningBooks, open source, and PyTorch.
Eli Stevens, @lantiga and @ThomasViehmann
Audio:
Video:
@francoisfleuret @PyTorch What would be your use case? Do you want to check your script for functional issues, such as indexing errors, or would you like to emulate a specific GPU architecture and check device kernels for potential issues?
@Sathishtheta I don't create specific time slots, but there is often some time to answer a few questions while waiting for a source build to finish, between meetings, or even when waiting for a docker pull. I just enjoy it and can still learn new stuff from users. :)
@NVIDIAAI just released the MONAI framework in @PyTorch! 🎉 Coming from a Biomedical Engineering background, it's great to see these toolkits. Check out the examples, and contributions are always welcome! :)
PS: It also comes with @pytorch_ignite examples ;)
If you are interested in hearing more about general nvFuser features, its progress, as well as the new Python interface, check out the [A41255] GTC session, Tuesday, Sep 20 from 12:00 PM - 12:50 PM PDT:
@jsotterbach @PyTorch @NVIDIAAI I'm a bit confused about how data parallelism and ensembles fit together. The former sends a chunk of the dataset to each model clone (using multiple GPUs) and syncs their updates, while the latter trains different models, potentially using the same data. How are you combining them?
@ThomasViehmann @PyTorch Congrats! Your posts are always insightful, and we are all lucky to have you in the forums. I also learn a ton from reading your posts, as seen here :P
We're excited to host the second annual PyTorch Developer Conference, featuring talks, discussions and posters from the core-devs, ecosystem, and industry.
Date: Oct 10th, 2019 in San Francisco. Space is limited, apply for an invite at
Something with @PyTorch, TorchDrift, our book Deep Learning with PyTorch, and yours truly. Wednesday May 19 at 1pm Pacific time (10pm in Bergamo). #PTCV
It’s been 5 years since we launched @pytorch. It’s much bigger than we expected -- usage, contributors, funding. We’re blessed with success, but not perfect. A thread (mirrored at ) about some of the interesting decisions and pivots we’ve had to make 👇
@radekosmulski
> I guess maybe momentum in Adam is affected, but for anything else sparse should not make any difference?
Yes, this would be the difference and would thus yield different results. Check this gist:
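A minimal sketch of that momentum difference, assuming a toy embedding trained with dense `Adam` vs. `SparseAdam` (hypothetical setup, not the linked gist):
import torch
import torch.nn as nn

dense = nn.Embedding(3, 2)
sparse = nn.Embedding(3, 2, sparse=True)
sparse.weight.data.copy_(dense.weight.data)

opt_d = torch.optim.Adam(dense.parameters(), lr=0.1)
opt_s = torch.optim.SparseAdam(sparse.parameters(), lr=0.1)

for emb, opt in ((dense, opt_d), (sparse, opt_s)):
    emb(torch.tensor([0])).sum().backward()  # step 1 touches row 0
    opt.step(); opt.zero_grad()
    emb(torch.tensor([1])).sum().backward()  # step 2 touches row 1 only
    opt.step(); opt.zero_grad()

# Row 0 now differs: dense Adam kept moving it in step 2 through its momentum
# buffer, while SparseAdam only updated the rows seen in the batch.
print(dense.weight[0], sparse.weight[0])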
Just created a repo with a few scripts I've written as code samples on the @PyTorch discussion board and which I find quite handy. Maybe you'll find something useful there! ;)
Thanks @PyTorch for the awesome Developer Conference! A lot of interesting talks in an awesome location. It was great to finally meet so many folks from the community in real life. Huge shout-out to the organizers!
@Ritika_Borkar @jeremyphoward @BruceHolmer @PyTorch This exactly! I add a warm-up loop (e.g. for 10 iters) using dummy tensors for the forward and backward pass to let cudnn find the fastest kernels.
Also, this code snippet might be useful for profiling multiple processes:
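A minimal sketch of such a warm-up loop, assuming a CUDA `model` and a fixed input shape (hypothetical names, not the snippet referenced above):
import torch

torch.backends.cudnn.benchmark = True  # profiles kernels per new input shape
dummy = torch.randn(8, 3, 224, 224, device="cuda")
# Run a few forward/backward passes so cudnn can pick the fastest kernels
# before any real timing or training starts.
for _ in range(10):
    model(dummy).sum().backward()
torch.cuda.synchronize()             # make sure the warm-up work has finished
model.zero_grad(set_to_none=True)    # drop the gradients accumulated above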
@rasbt `pytorch_model.to(memory_format=torch.channels_last)` if `amp` is enabled, and using `torch.backends.cudnn.benchmark = True` might give you an additional speedup. Note that cudnn.benchmark will profile the kernels for each new input shape, so be careful if dynamic shapes are used.
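A minimal sketch combining those suggestions, assuming a CNN `pytorch_model` and a CUDA input batch `data` (hypothetical names):
import torch

torch.backends.cudnn.benchmark = True  # re-profiles on each new input shape
pytorch_model = pytorch_model.to(memory_format=torch.channels_last)
data = data.to(memory_format=torch.channels_last)
with torch.cuda.amp.autocast():  # channels_last pays off mostly with amp
    out = pytorch_model(data)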
@PyTorch To learn more about what “primitive operations” are and how they can be used to break complicated operators into simpler blocks, take a look at the first update: