Big Data Trainer • Founder & CEO, TrendyTech • Tweets about #BigData & #DataEngineering. Helping you get a hike & find your Dream Job. Join 20,000+ students
I asked my team member to give me one decent resume.
I got one resume,
I checked the ATS score, it was 38
I did a lot of research over the past few days,
I made multiple changes, without using any AI software (to strike the right balance between the ATS score & human
From a CTC of 12 LPA to 30+ LPA in 7 months
Multiple offers & an extremely good hike,
that's what well-planned learning can help you achieve.
It's such a joy to see your students getting tremendous success.
My advice to everyone out there, a few failures in life should not
SQL Full course in 9 hours - (totally free & rock solid content)
This includes all major topics like
1. SQL Fundamentals, CRUD Operations
2. Primary Key vs Unique Key, Auto Increment Values -
3. DDL vs DML, Truncate vs Delete
4. Foreign Key Constraint
5. Distinct, Order By,
Here Comes the Gold mine for Data Engineers
Python Complete Playlist (4 videos already released)
lecture 1 -
lecture 2 -
lecture 3 -
lecture 4 -
The 5th lecture is coming on Tuesday
23 SQL videos that will make you fall in love with SQL
SQL Basics (14 videos)
• SQL Fundamentals, CRUD Operations & Setting Environment -
• Primary Key vs Unique Key, Auto Increment Values -
• DDL vs DML, Truncate vs Delete -
Get 10X more Interview calls - ATS Score 38 to 100
I'm giving you free access to my session where I will show how to optimize your resume & take the ATS score from 38 to 100.
Yes, you heard it right - a fully ATS-compliant resume with a score of 100.
This will help you get 10X more
A Data Engineering Roadmap beyond just cracking Interviews!
Here is a 32-week Step-by-Step Plan
1. Introduction to Big Data/DataLake Storage (3 weeks)
Big Data - The Big Picture, Linux Commands, Introducing the Multi Node Practice Environment, Distributed Storage
2.
I'm giving you free access to my session where I will show how to optimize your resume & take the ATS score from 38 to 100.
Yes, you heard it right - a fully ATS-compliant resume with a score of 100.
This will help you get 10X more Interview calls.
In this session I have talked
All Data Engineers should definitely read these 10 posts..
1. From 0 to Hero in SQL - Follow this Plan
2. Crunching Big Data in absolute layman terms 🔥
3. Normalization vs Denormalization
4. Super
Secret Revealed - From 0 interview calls to 100+ calls
I asked one of my team members to give me one resume (one struggling to fetch interview calls)
I got one resume,
I checked the ATS score, it was 38
I did a lot of research over the past few days,
I made multiple changes,
Big Data End to End Pipeline on major Cloud Platforms!
Ingest -> Store -> Process -> Serve
Ingest - Get the data from multiple sources using some ingestion framework.
Example - AWS Glue, Azure Data Factory, NiFi, Sqoop
Store - Since we are going to store a huge amount of data
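The Ingest -> Store -> Process -> Serve flow above can be sketched in plain Python. This is a toy stand-in, not a real cloud pipeline - the sample rows, the list-as-lake, and the totals are all made up for illustration:

```python
def ingest():
    # Pull raw records from a source (files, DBs, APIs); hard-coded here.
    return ["2024-01-01,store1,100", "2024-01-01,store2,250"]

def store(rows, lake):
    # Land the raw data in the "lake" (a list stands in for object storage).
    lake.extend(rows)

def process(lake):
    # Aggregate: total sales across all landed rows.
    return sum(int(r.split(",")[2]) for r in lake)

def serve(result):
    # Expose the processed result to downstream consumers.
    return {"total_sales": result}

lake = []
store(ingest(), lake)
print(serve(process(lake)))  # {'total_sales': 350}
```

In a real pipeline each function is replaced by a managed service (e.g. an ingestion framework, object storage, a Spark job, a serving layer), but the hand-off shape stays the same.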
Here Comes the Gold mine for Data Engineers - A complete Pack (Free)
Complete SQL Free Course -
Complete Python Free Course (6 videos already released till now)
lecture 1 -
lecture 2 -
lecture 3 -
Order of execution in a SQL query
We all know SQL, but most of us do not understand the internals of it.
Let me take an example to explain this better.
SELECT p.plan_name, COUNT(plan_id) AS total_count
FROM plans p
JOIN subscriptions s ON s.plan_id = p.plan_id
WHERE p.plan_name
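The logical order a database evaluates that query in is FROM/JOIN -> WHERE -> GROUP BY -> HAVING -> SELECT -> ORDER BY, not top-to-bottom as written. Here is a runnable sketch of a completed version of the query (the table contents and the WHERE/GROUP BY completion are hypothetical, since the original post is truncated):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE plans (plan_id INTEGER PRIMARY KEY, plan_name TEXT);
CREATE TABLE subscriptions (sub_id INTEGER PRIMARY KEY, plan_id INTEGER);
INSERT INTO plans VALUES (1, 'basic'), (2, 'premium');
INSERT INTO subscriptions (plan_id) VALUES (1), (1), (2);
""")

# Logical order: FROM -> JOIN -> WHERE -> GROUP BY -> SELECT -> ORDER BY
rows = conn.execute("""
    SELECT p.plan_name, COUNT(s.plan_id) AS total_count
    FROM plans p
    JOIN subscriptions s ON s.plan_id = p.plan_id
    WHERE p.plan_name IN ('basic', 'premium')   -- hypothetical completion
    GROUP BY p.plan_name
    ORDER BY total_count DESC
""").fetchall()
print(rows)  # [('basic', 2), ('premium', 1)]
```

This is also why a column alias like total_count can be used in ORDER BY but not in WHERE: the WHERE clause runs before the SELECT list is evaluated.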
Big Data Interviews in 2020 vs Big Data Interviews in 2024.
It's amazing to see how the technology landscape has shifted in the last 4 years.
I conducted Big Data Mock Interviews back in 2020,
Here is a playlist (2020)
We again started conducting these
From 0 to Hero in SQL - Follow this Plan
SQL Basics (14 videos)
• SQL Fundamentals, CRUD Operations & Setting Environment -
• Primary Key vs Unique Key, Auto Increment Values -
• DDL vs DML, Truncate vs Delete -
From a CTC of 5.37 LPA to 30 LPA in a span of just 3 years!
This story is of my amazing and hardworking student (not revealing the name, as company names & CTC are disclosed)
Here is his journey -
year 2020 - 5.37 LPA CTC with overall experience of 6 years
Got to know about my "Big
CICD for Data Engineers in a Super Easy way!
Let's say you are working on a RetailAnalysis project & have a Jira ticket, RA-17843, assigned to you
If you are a developer you would create a feature branch,
feature-RA-17843 & work on it.
As soon as you make a git push, and github
Data Engineering - 10 Managerial round Interview Questions
1. What is the size of your cluster?
2. How much data do you deal with on a daily basis?
3. What is your role in your big data project?
4. Are you using an on-premise setup or are you working on the cloud?
5. Which big data
Step by Step Plan to learn Big Data (All Free resources Included)
1. Learn SQL Basics -
SQL will be used at a lot of places - Hive/Spark SQL/RDBMS queries
Joins & windowing functions are very important
2. Learn Programming/Python for Data Engineering -
If you are learning SQL, learn all the below things..
1. SQL Fundamentals, CRUD Operations
2. Primary Key vs Unique Key, Auto Increment Values -
3. DDL vs DML, Truncate vs Delete
4. Foreign Key Constraint
5. Distinct, Order By, Limit, Like Keyword
6. Order of execution in SQL
7.
30 Data Engineering videos on trending topics - Interview preparation
1. Explaining Data Lake Versus Data Warehouse -
2. Learn Columnar Storage -
3. Analysing the failed Spark Jobs using Log Files -
4.
Get 10X more Interview calls (Free access to the session)
Take your ATS score close to 100 now!
Here is what I did..
I asked my team member to give me one decent resume.
I got one resume,
I checked the ATS score, it was 38
I did a lot of research over the past few days,
I
One person whom I truly admire in the field of Data Engineering is
@EcZachly
(Zach Wilson)
Here are 9 excellent technical posts by him.
I urge all the Big Data Enthusiasts to check these.
1. order of execution in SQL
2. Important Tips - Data
Apache Spark - Let's cover multiple scenarios in this post
consider you have a 20 node spark cluster
Each node is of size - 16 cpu cores / 64 gb RAM
Let's say each node has 3 executors,
with each executor of size - 5 cpu cores / 21 GB RAM
=> 1. What's the total capacity of
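Working that first question out in code, using exactly the numbers from the scenario above:

```python
# Cluster layout from the scenario
nodes = 20
executors_per_node = 3
cores_per_executor = 5        # CPU cores per executor
memory_per_executor_gb = 21   # RAM per executor, in GB

total_executors = nodes * executors_per_node                 # 20 * 3  = 60
total_cores = total_executors * cores_per_executor           # 60 * 5  = 300
total_memory_gb = total_executors * memory_per_executor_gb   # 60 * 21 = 1260

print(total_executors, total_cores, total_memory_gb)  # 60 300 1260
```

So the cluster can run 300 tasks in parallel (one per core) with roughly 1260 GB of executor memory in total. Note each node contributes 15 of its 16 cores and 63 of its 64 GB; the remainder is typically left for the OS and daemons.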
Recently Zach Wilson
@EcZachly
has created a public Github repo with all the resources, books, companies, and social media accounts you should be following to stay current on data engineering topics.
This should act like a gold mine for all the Data Engineering enthusiasts.
If
I am starting with a Free Python Course for Data folks on my youtube channel
Here is my promise - this will be even better than the paid courses.
The first video will be released on 26th February, Monday @ 5 pm
After receiving wonderful feedback on my free SQL series, it was
Working 8 PM to 5 AM at a call center to help my family through a financial crisis.
Walking 3 km every day to attend my MCA coaching classes in order to save a few bucks.
Working on a Diwali night for an extra 300 rupees.
Working in extreme dental pain, even when I was supposed to get a
2024 - Rockstar Data Engineer Roadmap
Prerequisites
---------------------
1. Linux commands
2. Programming fundamentals (preferably python)
3. SQL is very important
You should learn the below things
--------------------------------------
1. Distributed Computing Fundamentals
I am giving Free access to my session where I have covered 10 recently asked Pyspark Interview questions.
If you are going for a pyspark Interview most likely you will face these questions.
I have given the answers to all of them, and you need to portray it the same way in your
Python for Data Engineers / Data Analysts & Data Scientists
I have released 3 videos till now:
video 1 -
- Introduction
- Installing python 3
- Our first program
- Variable
- Datatypes
- Type errors are caught at runtime
- Typecasting
- String
These 30 videos will help you for your next Data Engineering Interview!
1. Explaining Data Lake Versus Data Warehouse -
2. Learn Columnar Storage -
3. Analysing the failed Spark Jobs using Log Files -
Even if you have done a few python paid courses, check these 6 videos that I have uploaded on youtube.
You will truly realize that you didn't know things this way & in this depth.
1.
2.
3.
4.
I asked my team member to give me one decent resume.
I got one resume,
I checked the ATS score, it was 26
I did a lot of research over the past few days,
I made multiple changes, without using any AI software (to strike the right balance between the ATS score & human
If you are learning SQL, learn all the below things..
1. SQL Fundamentals, CRUD Operations
2. Primary Key vs Unique Key, Auto Increment Values -
3. DDL vs DML, Truncate vs Delete
4. Foreign Key Constraint
5. Distinct, Order By, Limit, Like Keyword
6. Order of execution in SQL
7.
Big Data End to End Pipeline
Ingest -> Store -> Process -> Serve
Ingest - Get the data from multiple sources using some ingestion framework.
Example - AWS Glue, Azure Data Factory, NiFi
Store - Since we are going to store a huge amount of data we need a Distributed/Object
Normalization vs Denormalization
Normalization is a process of dividing the data into multiple smaller tables with an intent to reduce data redundancy & inconsistency.
However, Denormalization is the exact opposite of the above idea.
Denormalization is the technique of combining
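A tiny Python illustration of the trade-off (the customer/order data here is hypothetical): normalized data keeps each fact in exactly one place, while denormalized data repeats it on every row for faster reads.

```python
# Normalized: customer details live in exactly one place.
customers = {1: {"name": "Asha", "city": "Pune"}}
orders_normalized = [
    {"order_id": 101, "customer_id": 1},
    {"order_id": 102, "customer_id": 1},
]

# Denormalized: the same details repeated on every order row
# (no join needed to read, but the data is redundant).
orders_denormalized = [
    {"order_id": 101, "name": "Asha", "city": "Pune"},
    {"order_id": 102, "name": "Asha", "city": "Pune"},
]

# A city change is ONE update when normalized...
customers[1]["city"] = "Mumbai"

# ...but N updates (and an inconsistency risk if one row is missed)
# when denormalized.
for row in orders_denormalized:
    row["city"] = "Mumbai"
```

This is why OLTP systems lean normalized (cheap, safe writes) and analytical warehouses often lean denormalized (cheap reads, no joins).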
Apache Spark Partition skew explained in a super simple way
Let's say we have 1 lakh coins of different denominations and we want to find the total sum.
If one person has to do that, then it's a monolithic style and this will take time.
So the best way is to distribute it to
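A minimal Python sketch of that analogy (the denominations and worker count are made up): split the coins across workers, let each worker sum its own share, then combine the partial sums. Partition skew is when one worker's share is far bigger than the rest, so every other worker finishes early and waits for the straggler.

```python
import random
random.seed(7)

# 1 lakh coins of random denominations
coins = [random.choice([1, 2, 5, 10]) for _ in range(100_000)]

def partition(data, workers):
    # Round-robin split -> evenly sized (non-skewed) partitions
    return [data[i::workers] for i in range(workers)]

chunks = partition(coins, 10)
partial_sums = [sum(chunk) for chunk in chunks]  # each "worker" sums its share
total = sum(partial_sums)                        # combine the partial results

assert total == sum(coins)  # same answer as the single-person (monolithic) way
```

In Spark the same idea applies to keys: if 90% of the rows share one key, one partition does 90% of the work, and techniques like salting the key exist to break that partition up.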
I am developing an end to end project on Azure Data Engineering and will release it on my youtube channel.
This project will be in the healthcare domain.
For now I will design and develop the version 1 of it.
There will be 3 datasets
- Patients data
- Clinical Trial data (results
I am looking to hire 2 people for developing innovative solutions around Big Data technologies.
Initially I will provide a paid internship and then if things go well,
I will convert it to a full time role.
My preferred choice for this hiring will be
- Candidates with a career
From yesterday, I kept exploring different names for the Biggest Data Engineering Community that I am building.
So finally, here is the name.
"The Data Engineers Club"
Deciding on a name is just the very first step, from the upcoming week we will be in action with a lot of
I have a vision to create a really Strong community in Data Engineering (The Biggest Community)
To achieve this mission, I am starting with multiple Non Profit initiatives.
I don't know till what level this will go, but I am determined to give my best!
A few of the things that I
1,00,000 Subscribers on YouTube 🔥
No clickbaits, no controversies, Just Educational Content.
On May 4, 2019 I posted my first video on my YouTube channel.
So far we have uploaded complete SQL course & there is an ongoing Python series for students.
In the long run, we don't
7 offers in Big Data with a whopping 230% hike
Ideally I do not recommend my students taking more than 3 offers,
but I can't stop anyone!
Got this message from one of my students who cracked multiple companies.
As a mentor, it motivates me when I receive such messages from my
Till 2004 - used to be below average student throughout
2004 - failed in 12th Pre-boards
2006 - Got 1.5 lakh+ rank in AIEEE even after 1 year coaching
2007 - Could only manage to get admission in BCA Distance
I had neither the skills nor the money :)
Looks like it's the end of my
10 offers in Big Data with a whopping 90% hike (~2X CTC)
Got this message from one of my students who cracked multiple companies.
As a mentor, it motivates me when I receive such messages from my students.
I offer a super premium big data program that gives top results.
if you
I have talked to a lot of students who recently attended the interview on Azure cloud.
The way interviews are conducted is mostly this: the interviewer asks one question and has a lot of related follow-up questions.
For Azure Storage accounts I have framed 3 sets of questions which
12 ways to improve the performance of your Spark jobs
1. Caching Data In Memory
2. Join Strategy Hints for SQL Queries
3. Coalesce Hints for SQL Queries
4. Adaptive Query Execution
I have seen a lot of candidates who keep postponing the interviews.
They keep saying I am still preparing, or I am not prepared.
They feel they need more time, and want perfection.
Remember one thing, you will never feel that you are 100% prepared.
My suggestion, decide on a
Today's Data Engineering Mock interview will be at next level.
It will simulate the SQL & DSA round for Data Engineers in a product based Company.
The one who is taking the interview works at Google & the one who is attending the interview works at Walmart.
Both of these
As part of the Data Engineers Club, we got a few mock interviews done this weekend.
I sincerely thank the volunteers, who are making this a huge success!
We will soon be uploading those to YouTube so that they can help a bigger crowd.
I will talk about what went well & what could
Learn Apache Spark Step by Step (Follow the Sequence)
1. A quick introduction to the Spark API
2. Overview of Spark - RDD, accumulators, broadcast variable
3. Spark SQL, Datasets, and DataFrames:
4.
"Sir I travelled all the way from Singapore, Just to meet you! "
My students made me realise "everything is possible" - Student Success Celebration at The Taj Bangalore
Key highlights of the event
>> 130 Selected Students were Invited, 126 Attended showing their love for me.
I am super excited to announce the launch of my new program
"The Elite Data Engineering Program"
It's a 20-week program
High level topics covered in the Elite DE program are:
=>Distributed Storage & Processing Fundamentals
=>Pyspark in Depth
=>A lot of performance tuning
What is Databricks?
Databricks is a company formed by the creators of Apache Spark.
Databricks provides an Apache Spark-based unified analytics platform optimized for the cloud.
When we talk about the open-source version of Spark, we have to deal with all the below challenges:
=>
20 Recently asked Pyspark Interview questions
1. Difference between client and cluster mode
2. What is partition skew and what are the reasons for it? How do you solve partition skew issues?
3. What is a broadcast join in Apache Spark?
4. What is the difference between partitioning and bucketing?
Most important question asked in Apache Spark Interviews.
How do you optimize your Spark jobs?
Here are 10 ways you can optimize your Spark job!
1. Filter irrelevant data as early as possible
2. Make sure the broadcast hash join strategy comes into play when joining one large and
My new Data Engineering batch is starting tomorrow. It's a 32-week extensive program covering the fundamentals, Pyspark, Azure Cloud, AWS Cloud, Streaming, Performance Tuning and Projects.
Who should opt for this?
People in IT who want to move into Data Engineering, or people
10 trending questions asked in Apache Spark interviews
1. How is the initial number of partitions in a dataframe calculated?
2. What happens internally when you execute spark-submit?
3. What is partition skew and how do you tackle it?
4. What are the Spark optimization techniques you
Data Engineering - Walmart Complete hiring Process
I have created a youtube video for all the aspiring Data Engineers, who are looking to get into Walmart.
In this video, I have covered the below questions
1. How to get an interview call from Walmart?
2. Positions that you can
As part of the Data Engineers Club we have got an amazing start.
In the last 2 weeks we managed to conduct 12 Mock interviews.
I will be uploading all of these to my youtube channel.
For the next 30 days, every single day we will release one mock interview.
This is going to be
I just uploaded a new video on my youtube channel covering 10 pyspark interview questions which were recently asked in the interviews.
Here are the questions which I have answered in the video.
1. Difference between client mode and cluster mode
2. what is partition skew,