"XGBoost is all you need?"
Ok, but a word of caution for when your data has categorical variables & you're using any tree-based or boosted-tree method like RandomForest, XGB etc
One hot encoding could ruin things when the categorical variable has many levels
1/n
Know your ML evaluation metrics
Got highly imbalanced data? Probably you'd want to 're'consider ROC.
While ROC AUC is a very popular metric because of characteristics like being insensitive to class distribution, it isn't a good choice
1/
How cool is it to try your ML models live rather than just training them in Jupyter notebooks?
Very cool! And even cooler when it's done as easily as it gets with
@Gradio
and
@huggingface
🤗space!
I'm ready to try my bean plant classifier!
1/
Got reached out to for a data scientist role by a London-based company using AI in medicine. They liked my LinkedIn activity & see me as a good fit based on what my LI makes me seem interested in - I don't even post much as I've always been more interested in the+
As a Data Scientist, I'll be smart if I quickly understand the business problem, frame it, understand how and what data is needed (or already used). In some cases knowing how to write integration pipelines also helps.
Ofc, modelling is equally important. But obviously no one +
Data loading shouldn't be a bottleneck in the model training pipeline. With my new blog, learn how
@PyTorch
ensures this.
I also explore & implement the latest DataPipes from TorchData.🌟Pretty cool.
Wrote this one on
@weights_biases
posts section.
Are you setting out to learn
@PyTorch
? Follow along with my blog for a line-by-line explanation as you create your first neural net classifier.
I wrote it on
@weights_biases
posts section. Here's the link -
Ever used regularisation (L1, L2) and wondered why it's advised to standardise the features (x1 x2...xn) before doing so?
In L1 and L2 regularisation, we aim to shrink the magnitudes of coefficient estimates.
1/n
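Why standardise first? The penalty shrinks coefficients by magnitude, so a feature on a bigger scale gets an artificially tiny coefficient and partly dodges the penalty. A minimal sketch with scikit-learn (the data and pipeline here are my illustration, not from the thread):

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
X[:, 1] *= 1000.0                      # second feature on a much larger scale
y = X[:, 0] + X[:, 1] / 1000.0 + rng.normal(scale=0.1, size=200)

# Standardising first puts both features on equal footing, so the L1
# penalty shrinks their coefficients fairly.
model = make_pipeline(StandardScaler(), Lasso(alpha=0.1))
model.fit(X, y)
coefs = model.named_steps["lasso"].coef_
```

With scaling, both (equally important) features keep similar coefficients; without it, the large-scale feature's coefficient would be near-zero purely because of its units.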
Looking to connect with people in the ML space that actively contribute to open source.
I know how welcoming & helpful open source folks are.
help me get started? :)
I'd just want to chat a bit, won't take much of your time.
Outliers in your data?
Among the many ways to deal w it, if one is going for regularisation, then -
L1 (Lasso) is more robust to outliers while L2 (Ridge) isn't really.
1/n
Why do you choose '5' fold cross validation?
Or why not any other k than the one you go for in k fold cross validation?
Well, I shouldn't be asking here. Should ask the good old Bias Variance tradeoff instead.
Here's why -
1/n
But Hash encoding & Dracula are two encoding schemes recommended for categorical variables with many levels.
Of course, these do not come without their cons.
6/6
working through RoPE's math & realising it's nothing but revising my undergrad lin algebra (intuition & concept) was fun.
btw, I'll be presenting today on extending the context of models using RoPE - 2230 IST
@forai_ml
.
Paper:
theoretical (mathy, nitty gritty) side of things & that, according to me, wouldn't really help me scale as a creator, or maybe that isn't what people would be interested in reading.
Still, learning in public is Mighty!
think I should follow strategic posting now :)
Two types of classification tasks and how to implement each in
@PyTorch
:
I've come across this confusion a bunch of times now about choosing the right loss function for a Classification Task using Deep Learning in
@PyTorch
.
Here's a simple explanation:
1/3
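The two cases can be sketched like this (a minimal illustration with random tensors, assuming the standard torch.nn losses):

```python
import torch
import torch.nn as nn

# Binary classification: one output logit per example + BCEWithLogitsLoss.
# The sigmoid is applied inside the loss, so the model outputs raw logits.
binary_logits = torch.randn(8, 1)
binary_targets = torch.randint(0, 2, (8, 1)).float()
binary_loss = nn.BCEWithLogitsLoss()(binary_logits, binary_targets)

# Multi-class classification: C output logits per example + CrossEntropyLoss.
# Log-softmax is applied inside the loss; targets are class indices, not one-hot.
multi_logits = torch.randn(8, 3)
multi_targets = torch.randint(0, 3, (8,))
multi_loss = nn.CrossEntropyLoss()(multi_logits, multi_targets)
```

In both cases the model's last layer stays a plain Linear: no sigmoid/softmax in forward(), since the loss handles it.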
So, I saw a post laying a 'set of golden rules' for dealing with missing data.
Wrong on so many levels.
No set of rules exists unless one considers the following--
Why is the data missing?
Is it even worth imputing?
A thread👇
Learn to create lists in
@PyTorch
the correct way!
The wrong way
To create a
@PyTorch
NN with a variable no. of layers, a plain python list might be a common choice to store the network layers (nn.Module) by appending.
This becomes a source of error. See in code? 👇
1/3
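A minimal sketch of the failure mode (toy layer sizes, not the thread's original code):

```python
import torch.nn as nn

class BadNet(nn.Module):
    def __init__(self, n_layers=3):
        super().__init__()
        # Plain python list: the Linear layers are NOT registered as
        # submodules, so their weights never show up in .parameters()
        # and are never trained, saved, or moved to the GPU.
        self.layers = [nn.Linear(4, 4) for _ in range(n_layers)]

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

n_params = sum(p.numel() for p in BadNet().parameters())   # 0: nothing registered
```

The forward pass runs fine, which is what makes this bug sneaky: the optimizer just silently gets an empty parameter list.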
In the data space, I've learnt more by writing than by reading.
Few weeks back, when I started learning pytorch, I wrote a blog to explain every detail of a code of mine that constructed the most basic NN using pytorch.
To cater to my readers well, I made sure every little detail +
Creating Custom Models in
@PyTorch
? Make sure you aren't making this 👇 error. Let's learn in 5 steps.
Step 1:
Firstly, to "correctly" create optimizable parameters in PyTorch without running into gradient errors, we need to ensure parameters are leaf tensors.
1/5
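A minimal sketch of the leaf vs non-leaf distinction (my toy tensors, not the thread's code):

```python
import torch

# Leaf: created directly by the user with requires_grad=True.
w = torch.randn(3, requires_grad=True)

# Non-leaf: w2 is the *result* of an operation (it has a grad_fn),
# so its .grad stays unpopulated and optimizers will complain about it.
w2 = w * 2.0

# A safe way to get an optimizable leaf tensor: wrap it in nn.Parameter.
param = torch.nn.Parameter(torch.randn(3))
```

Rule of thumb: anything you hand to an optimizer should be a leaf; if you need to transform a parameter, do the transform inside forward(), not at creation time.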
Ever wanted to set different learning rates for different layers/parameters while training your neural networks in
@PyTorch
?
Let's learn how to do that with
@PyTorch
in 2 steps 👇
1. We will create the simplest neural network with 2 layers:
1/2
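The two steps might look like this (a sketch; layer sizes and learning rates are made up):

```python
import torch
import torch.nn as nn

# Step 1: the simplest network with 2 (Linear) layers.
model = nn.Sequential(nn.Linear(10, 20), nn.ReLU(), nn.Linear(20, 2))

# Step 2: pass parameter *groups* instead of model.parameters().
# Each dict can override the optimizer's default lr.
optimizer = torch.optim.SGD(
    [
        {"params": model[0].parameters(), "lr": 1e-2},  # first layer: larger lr
        {"params": model[2].parameters()},              # last layer: default lr
    ],
    lr=1e-3,
)
```

This is the same mechanism used for discriminative fine-tuning, e.g. small lr on pretrained layers and a bigger one on a fresh head.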
Do not one hot encode your categorical variable while using XGB if it has many levels.
One hot encoding works ok & might even give a performance boost if the no. of levels is small.
Curious why?
2/n
My
@weights_biases
blogathon submission explains CPCA - a simple & useful dimensionality reduction algorithm where you work with not 1, but 2 datasets to explore patterns in the target data.
Also, +
sometimes I really feel I should do an MS in 'Applied' math and stats. crazy how class 10th's moving average can be used for analysing time series data. it helps smooth the series, assess trends by removing seasonality and even forecast the future. no rocket science, just +
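That class-10th moving average in two lines of pandas (the toy series is mine, just trend + a repeating wiggle):

```python
import pandas as pd

# A short "monthly" series: upward trend plus a repeating seasonal wiggle.
s = pd.Series([10, 12, 16, 11, 13, 17, 12, 14, 18, 13, 15, 19])

# A 3-month centred moving average smooths the wiggle and exposes the trend.
smoothed = s.rolling(window=3, center=True).mean()
```

The window ends have no full neighbourhood, so the first and last values come out NaN; a window matching the seasonal period removes the seasonality best.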
Working with Neural Networks?
Using a CV architecture to predict whether a brain's MR scan classifies as cancerous, non-cancerous or any other such class?
Rather than a single prediction from the neural net, wouldn't it be better if we could generate confidence intervals?
1/n
There it is! Few seconds and here's the prediction.
Was nice to build my own web app as part of Building end to end Vision Applications taught by Dr. Abubakar
@abidlabs
at CoRise
@corise_
!!
With a lot of levels comes a lot of sparsity.
So when one hot encoding many levels (equivalent to creating as many new variables), only a small fraction of data points will have the value 1 for any single level (read: variable)
Why's this a problem?
3/n
z-test vs t-test
z-test: the underlying statistic can be approximated to follow the std. normal distribution.
t-test: the same, but following the t distribution instead.
The catch: the t distribution is more accurate in case of small samples. 👇
What to do?
PR Curve is a better choice.
Both Precision & Recall deal with the positive (minority) class of interest. So in this example, while Recall values are equal, Precision informs how many positives as predicted by the model are true positives.
4/
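A quick sketch of computing the PR curve and its area on imbalanced data (synthetic labels and scores of my own, assuming scikit-learn):

```python
import numpy as np
from sklearn.metrics import auc, precision_recall_curve

rng = np.random.default_rng(0)
y_true = np.zeros(1000, dtype=int)
y_true[:10] = 1                    # heavy imbalance: only 10 positives
scores = rng.random(1000)
scores[:10] += 0.5                 # positives tend to score higher

precision, recall, _ = precision_recall_curve(y_true, scores)
pr_auc = auc(recall, precision)    # area under the PR curve
```

Unlike ROC AUC, the PR baseline is the positive prevalence (here 1%), so a mediocre model can't hide behind the huge pile of easy true negatives.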
A serious error.
Feature selection/engineering on tabular data is a crucial step in any machine learning problem.
BUT! Hold on and double check you are doing it right.
If you are using Cross-validation or validation set holdout approaches for estimating test error..
1/n
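The usual fix the thread is pointing at: do the feature selection *inside* each CV fold, not once on the full data. A sketch with a scikit-learn Pipeline (dataset and selector choice are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

X, y = make_classification(n_samples=200, n_features=50, random_state=0)

# Because selection lives inside the pipeline, it is re-fit on each fold's
# training split only: the validation data is never peeked at.
pipe = make_pipeline(SelectKBest(f_classif, k=10),
                     LogisticRegression(max_iter=1000))
scores = cross_val_score(pipe, X, y, cv=5)
```

Selecting features on all of X first and cross-validating afterwards leaks information from the validation folds into the selection step, making the test-error estimate optimistically biased.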
People talk about fancy Machine Learning models.
Today, I'll talk about whether anything fancy is needed when it comes to solving problems/answering questions using data.
An acquaintance was very keen to find the best outlier detection technique.
Clarity of concept goes a long way!
Recently had an interview where I was asked something about pre-trained BERT models that I'd never read or thought of before.
But, since I had the gist of what actually goes on inside the BERT architecture, I was able to answer on point +
New blog on
@PyTorch
soon!
I'll be talking about how Pytorch handles data effectively and efficiently.
Along, I'll also demonstrate the new DataPipe functionality from the TorchData library.
Stay tuned 👋🔥
@marktenenholtz
it would be what I failed to do myself: don't learn X first completely & then Y & then Z & so on. Take up a problem and learn on the go. so for eg. one really doesn't need to be vv good at python to learn ML. ofc it's an advantage to be that but def not required (at least to start)
@marktenenholtz
found it! guess I got a good memory so I just remembered this is your post.
don't know what people get out of plagiarism but it's so irritating
it would've really driven me up the wall had I been in your place.
Do you use drop_last = True in your PyTorch DataLoader?
I do.
Here's what it is-
Setting it to True will drop the last batch in each epoch in case the dataset at hand cannot be evenly divided into batches of equal size.
1/n
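In code, with a toy 10-sample dataset and batch size 4 (my example):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

dataset = TensorDataset(torch.arange(10).float())   # 10 samples, batch_size 4

# drop_last=False: batch sizes 4, 4, 2 (a smaller, ragged last batch)
# drop_last=True:  batch sizes 4, 4    (the ragged batch is dropped)
keep = [len(b[0]) for b in DataLoader(dataset, batch_size=4, drop_last=False)]
drop = [len(b[0]) for b in DataLoader(dataset, batch_size=4, drop_last=True)]
```

Dropping the ragged batch is handy when a tiny batch would give a noisy gradient or break code (e.g. BatchNorm with batch size 1).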
I don't know whether one needs to know math for an industry ML role or not
But what I know is that engineering skills are sooo needed. maybe more :D
What's interesting to me is that I feel this latter skillset is no less required in research as well :)
ps: No rigid agenda x
Trees split on those variables that yield the "purest" nodes.
Easy to see why a one hot encoded variable typically won't lead to very pure nodes & hence the tree won't split on it
no matter how important the original categorical variable might be as a feature
4/n
Naturally, this would also interfere with feature importance generated by RandomForest or any other method as even if the splits happen on these hot encoded levels, they'll most likely not happen near the root
What to do then?
Well, here's needed knowledge from experience
5/n
it's a real deal, at least for me, to implement research papers.
it's a basic one (!= not useful; rather, very useful), still taking a lot of effort.
Anyone who's in a regular practice of doing this?
(will also do a thread once I finish. hoping I finish.)
@osanseviero
for a quick comprehensive overview of pos embeddings:
follows RoPE:
Detailed blogs by the authors of RoPE are the best resources.
For GQA, its paper is short and v easily understandable once you know MQA which again is v simple.
Full fine-tuning LLMs on downstream tasks comes with a lot of GPU memory usage + storage costs.
Let us look into a PEFT technique called Adaptors for efficient transfer learning in LLMs with an example application in Transformers!👇
1/7
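The core of an Adapter is a tiny bottleneck block with a residual connection, dropped after a frozen pretrained sublayer. A minimal sketch (dimensions and names are illustrative, not from the thread):

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: down-project -> nonlinearity -> up-project + residual.
    Only these few weights are trained; the big pretrained weights stay frozen."""
    def __init__(self, d_model=768, bottleneck=64):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)
        self.up = nn.Linear(bottleneck, d_model)
        self.act = nn.GELU()

    def forward(self, x):
        # Residual connection: the adapter learns a small *correction*
        # to the frozen layer's output, starting near identity.
        return x + self.up(self.act(self.down(x)))

adapter = Adapter()
out = adapter(torch.randn(2, 10, 768))                     # (batch, seq, d_model)
trainable = sum(p.numel() for p in adapter.parameters())
```

~100k trainable parameters per adapter vs hundreds of millions for full fine-tuning, which is the whole point of PEFT.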
Communication - lack of it could ruin any data project, be it industry or research
& worse, it could cost you loads of time before things get ruined
Communication - more than half the work is done if this is done properly
This isn't preach, it's what I'm experiencing these days :)
Transposing data in PyTorch?
x.T is deprecated in PyTorch's latest release when used with tensors of dimensionality other than 0 or 2.
Worth noting why - probably because it doesn't work like how we would want when dealing with batches of data (matrices).
1/2
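What to use instead for batched matrices (my toy shapes):

```python
import torch

x = torch.randn(8, 3, 5)        # e.g. a batch of 8 matrices, each 3 x 5

# x.T on a 3-D tensor reverses *all* dims to (5, 3, 8), which is rarely
# what you want for batches, and it's deprecated for ndim != 2.
# For batched matrices, swap only the last two dims:
xt = x.transpose(-2, -1)        # (8, 5, 3); x.mT is shorthand for the same
```

`transpose(-2, -1)` (or `.mT`) treats the leading dim as the batch and transposes each matrix individually, which matches how batched matmul and most layers interpret the data.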
What type of questions can I expect in a coding round for Data Scientist role?
Pandas assignment already done! 🤔 I'm wondering what this round holds.
Anyone that's gone through a similar round?
Fingers crossed 🤞
Lately, I've been realising how important a good understanding of pytorch's autograd is for any practitioner.
To this end, I'm planning to write a series of blog posts explaining how the autograd engine works with computational graphs, and related concepts.
1/2
Information lies in variability - Central idea on which dimensionality reduction by PCA is based.
But, what if we want to capture variability only due to a specified cause/reason & not care about other sources of variability.
For eg. one might want to capture +
Was reading about Markov Processes - guess they apply to us perfectly.
The future state given the present & past depends only on the present no matter what the past was.
Beaut!
my first encounter with word vectors was like - we use some algorithms to convert words into vectors in a way s.t. synonyms have similar vectors
this isn't even the most appropriate definition & of course left me sort of uninterested if not clueless
read on to know the most basic
The more I study time series, the more interesting it gets.
What comes in the way sometimes however, is those math heavy proofs/conditions.
(Not the usual ones though, they are smooth to go :))
Skipping them for now, let's see them if I go for a PhD lol XD
this could be very (very, very) misleading (or misinforming).
the decision boundary of logistic is linear in *its most basic form*.
that s shaped sigmoid is **NOT** the decision boundary.
this image makes it look like that sigmoid graph is separating the blue class +
1. Logistic Regression
It's a classification model used when the target is categorical.
It is a statistical model that in its basic form uses a logistic function to model a binary dependent variable, although many more complex extensions exist.
Crucial to constantly revisit the business problem while solving & evaluating the Machine learning problem.
Even if the business problem isn't dynamic- REVISE IT!
We get tempted to try fancy data science techniques and tools & forget what we are here for - THE BUSINESS PROBLEM!
"Non linear decision boundaries cannot be solved by logistic regression" - One who understands *just the basics* of what logistic regression is, would know this is TOTALLY wrong.
I've now lost count of blogs, threads, posts etc. that state this.
1/2
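The point in code: a linear classifier on *expanded* features gives a curved boundary in the original space. A sketch with scikit-learn (the circles dataset and degree are my choice):

```python
from sklearn.datasets import make_circles
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Two concentric circles: no straight line separates the classes.
X, y = make_circles(n_samples=400, factor=0.3, noise=0.05, random_state=0)

# Logistic regression on polynomial features: the boundary is linear in the
# expanded space (x1, x2, x1^2, x1*x2, x2^2), hence curved - here roughly a
# circle - in the original space.
clf = make_pipeline(PolynomialFeatures(degree=2),
                    LogisticRegression(max_iter=1000))
clf.fit(X, y)
acc = clf.score(X, y)
```

So "logistic regression can't do non-linear boundaries" confuses the *model family in its basic form* with what it can do after feature engineering (or kernels).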
train test split in time series cannot be done the way it is done with other types of data, that is, dividing the whole data at random. here the chronology that's inherent in the data needs to be followed in the train & test sets as well.
this is needed as most modelling techniques+
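A chronology-respecting split in code, using scikit-learn's TimeSeriesSplit as one common tool (my toy data):

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(10).reshape(-1, 1)    # 10 chronologically ordered observations

# Each fold trains only on the past and validates on the future;
# nothing is shuffled.
splits = list(TimeSeriesSplit(n_splits=3).split(X))
for train_idx, test_idx in splits:
    assert train_idx.max() < test_idx.min()   # train always precedes test
```

The training window grows fold by fold while the test window rolls forward, mimicking how the model would actually be used: fit on history, predict the future.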
starting w non stationary time series today.
I'm loving studying time series so far. Does it come under machine learning? :)
After all, it's modelling here as well.
In fact, I feel it's challenging cause we are mainly concerned w extrapolation here, among other things.
Scratched the surface of some techniques for efficient ML inference (Quantization, Pruning etc.) - Interesting topics!
Nice to experiment w these techniques in
@PyTorch
.
Tensors' gradients unexpectedly None in
@PyTorch
?
Let's debug 👇🙌
Follow 4 simple checks and you'll have your answer.
1. tensor.requires_grad == True
2. tensor.is_leaf == True, or equivalently
tensor.grad_fn is None; if it is not None (a non-leaf tensor), use retain_grad() on it.
1/2
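The first two checks in code (my toy tensors):

```python
import torch

# Check 1: requires_grad must be True, else autograd ignores the tensor
# entirely and .grad stays None.
a = torch.randn(3)                    # requires_grad=False -> a.grad stays None
b = torch.randn(3, requires_grad=True)

# Check 2: .grad is only populated on *leaf* tensors by default.
c = b * 2                             # non-leaf: c.grad_fn is not None
c.retain_grad()                       # ask autograd to keep c's grad too

loss = c.sum()
loss.backward()
```

After backward(), `a.grad` is still None (check 1 fails for it), while `b.grad` and, thanks to retain_grad(), `c.grad` are populated.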
Came across hierarchical softmax while revisiting the negative sampling algo.
NS is essentially a simpler alternative to HS.
HS was a technique introduced to mitigate the heavy computational complexity involved in learning word embeddings using algos like word2vec.
1/2
Seasonality, which is one of the major components of time series data, can occur in two types - Single & Multiple.
Single - when there is one dominant seasonal pattern in the data; more likely to be seen in low frequency data like monthly or yearly. for eg. in a monthly data+
Fine-tuning an LLM taking up too much GPU memory?
Heavily Parameterized Large Language Models + Basic Linear Algebra Theorem = Save GPU memory! 💯
Let’s talk about LoRA, a PEFT technique that relies on a simple concept - decomposition of non-full rank matrices.
1/7
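A minimal sketch of the LoRA idea: a frozen pretrained weight W plus a trainable low-rank update B @ A (dimensions, rank and scaling below are illustrative, not from the thread):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base weight plus a trainable low-rank update B @ A.
    Trainable params: r*(d_in + d_out) instead of d_in*d_out."""
    def __init__(self, d_in=512, d_out=512, r=8, alpha=16):
        super().__init__()
        self.base = nn.Linear(d_in, d_out, bias=False)
        self.base.weight.requires_grad_(False)        # freeze pretrained W
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(d_out, r))  # B = 0: update starts at 0
        self.scale = alpha / r

    def forward(self, x):
        # W x + (alpha/r) * B A x, with B A of rank at most r
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear()
out = layer(torch.randn(2, 512))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
```

Here only 8192 parameters train vs 262144 in the frozen base weight; after fine-tuning, B @ A can even be merged back into W so inference pays no extra cost.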
Curious to know if ML practitioners use LMMs. Linear mixed effect models are an important and very interesting class of models. They let you model correlated data.
I used LMMs to model pollutant levels in Beijing's air over years.
Model inference in
@PyTorch
TIL that computational graphs can be used not just for backprop, but for inference as well.
We could create and export our model's graph and use it for inference later without the model checkpoint file.
1/2
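One way to do this is TorchScript tracing: the recorded graph plus weights gets saved as a single file, loadable later without the Python model class. A sketch (toy model and filename are mine):

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2)).eval()
example = torch.randn(1, 4)

# trace() records the computational graph by running the example input
# through the model; the saved file bundles graph + weights.
traced = torch.jit.trace(model, example)
torch.jit.save(traced, "model_traced.pt")

# Later / elsewhere: no model class, no checkpoint file needed.
loaded = torch.jit.load("model_traced.pt")
with torch.no_grad():
    out = loaded(example)
```

The loaded graph produces the same outputs as the original model, and can also run from C++ via libtorch.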
The correct way - Use ModuleList
ModuleList functions similarly to a python list & is meant to store nn.Module objects, just as a python list is used to store objs like ints, strings etc.
The parameters of different layers are registered & accessible using .parameters().
3/3
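The correct version in code (same toy layer sizes as one might use above; my illustration):

```python
import torch.nn as nn

class GoodNet(nn.Module):
    def __init__(self, n_layers=3):
        super().__init__()
        # ModuleList registers each Linear as a submodule, so their
        # parameters show up in .parameters(), get trained, saved in
        # state_dict, and moved along with .to(device).
        self.layers = nn.ModuleList(nn.Linear(4, 4) for _ in range(n_layers))

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x

n_params = sum(p.numel() for p in GoodNet().parameters())   # 3 * (16 + 4) = 60
```

Compare with a plain python list holding the same layers: .parameters() would be empty and the optimizer would train nothing.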
Tbh, 'classic' machine learning is the easiest thing I've studied in the field of Statistics so far. Now some may say ML isn't included in Statistics. For me, it is and I'll call it that way only.
Why it seemed easiest to me could be a combination of my interest and hold on the basics 👇
Electricity consumption high during the day, less during the night - I observed this in a time series data. (so it's like up and down with day and night)
What is this? seasonal not cyclic
Another series, I observe is going up then down up down.. so on.
This? cyclic not seasonal+
to compare two (or more) classification models when the data is highly imbalanced
It can be overly optimistic in case of highly imbalanced data
So say, with 100k negative examples & 10 positives - Model A & B both correctly identify 9 out of 10 positives. (true positives)
2/
For anyone who writes code - no matter the role, industry, or purpose - learning time and space complexity is inevitable.
And there's no argument to this.
Another way to detect outliers, this one for multidimensional data.
The last few Principal Components.
Generally the last few PCs capture very little of the variance present in the original data. So, a plot of the last PC against all data points can be used to find the points for which this PC 👇
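The idea can be sketched like this: the last PC captures the tiny "residual" direction of the data, so a point that breaks the data's structure scores big on it (synthetic data; plain NumPy SVD as my stand-in for a PCA library):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
# Column 2 is (almost) a linear combination of the first two.
X[:, 2] = X[:, 0] + X[:, 1] + rng.normal(scale=0.01, size=200)
X[0] = [1.0, 1.0, -2.0]        # outlier: badly violates that relation

# PCA via SVD on centred data; the *last* right singular vector is the
# low-variance residual direction.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
last_pc_scores = Xc @ Vt[-1]
outlier = int(np.argmax(np.abs(last_pc_scores)))
```

Normal points score near zero on the last PC (only noise lives there), while the planted outlier's score is orders of magnitude larger, so it stands out immediately in the plot.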
Language Modelling -Recurrent Networks vs Feed Forward NNs. A thread, no math -a gentle explanation
First up, what's Language modelling? It's when you start to write a reply to this thread & your Google keyboard recommends your next words
Now how to train a model to do just this?
getting the machine to learn from data's past - ML
getting my brain's machine to learn from my life's past - *?*
while the former is cool and I'm okay at it ig, I hope I get good at the latter :)
word2vec- slight technicality: do we consider two vectors per word?
(initially, during the learning phase)
One vector when the word is a central word.
Other, when the word acts as context word.
@PyTorch
's Sequential vs ModuleList; & also their combination!
3 simple steps!
nn.Modules stored in Sequential are connected in a "cascaded" way - the output of the 1st Module in Sequential becomes the input to the 2nd Module - so we need to take care of dimensions.
In code 👇
1/3
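The cascading in code (toy sizes of my choosing):

```python
import torch
import torch.nn as nn

# Sequential chains modules: each output feeds the next module,
# so adjacent dimensions must line up (10 -> 20 -> 20 -> 5).
net = nn.Sequential(
    nn.Linear(10, 20),
    nn.ReLU(),
    nn.Linear(20, 5),
)
out = net(torch.randn(8, 10))
```

Contrast with ModuleList, which only *stores* modules: it defines no forward order, so you wire the data flow yourself inside forward().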
one question for data science people: how would you answer if an application asks you about your 'programming experience'? projects that demonstrate the same?
Doesn't this sound more on the engineering side?
And if you were to put ML(modelling)projects here, what would those be?
@PyTorch
Tip!
If the dataset isn't too big and you decide to keep it in GPU memory, and use the DataLoader to load mini batches..
Do not forget to specify the `generator` parameter of the DataLoader as shown. 👇
1/2
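A sketch of the tip: match the shuffling generator's device to the data's device (written to fall back to CPU so it runs anywhere; on a GPU box, `device` becomes "cuda" and the generator follows):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

device = "cuda" if torch.cuda.is_available() else "cpu"
data = torch.arange(10).float().to(device)   # whole (small) dataset on device
dataset = TensorDataset(data)

# When the dataset lives on the GPU, pass a generator on the same device;
# otherwise shuffling can fail with a cpu/cuda generator device mismatch.
loader = DataLoader(
    dataset,
    batch_size=4,
    shuffle=True,
    generator=torch.Generator(device=device),
)
batches = [b[0] for b in loader]
```

Keeping a small dataset resident on the GPU skips the per-batch host-to-device copies, which is exactly when this generator detail bites.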