Machine learning for finance is hard.
So let's take a step-by-step case study in this 🧵
This time - let's ask ML to find patterns without giving it any direction, and use those patterns to mitigate risk.
Oh, and we'll backtest the ML model too!
Enjoy!
Machine learning for finance is hard!
So let's take a simple case, and break it down step-by-step in this 🧵
The application - a simple trend prediction on the SPY using a random forest.
Enjoy!
Machine learning for finance can be tricky!
So let's explore with another step-by-step case study 🧵
This time, let's use ML to find time series patterns across multiple instruments.
The aim - to find like patterns in the market and build a scanner.
Enjoy!
Machine learning in finance is quite elusive
When I began this journey 2 years ago, there weren't many end-to-end examples to learn from
If you're new to this area, here are 5 🧵s I have put together so your journey can be smoother than mine!
More to come in the future 🙂
Machine learning for finance is not all about prediction.
So let's take a step-by-step case study in this 🧵
This time - let's cluster trades to try and find similar ones in the future
Enjoy!
The Stockbee Market Monitor by
@PradeepBonde
is probably one of the best free resources out there.
It's amazing for gauging situational awareness of the market.
Let's take a look from a data-science perspective in predicting trends in SPY 🧵
Machine learning for finance can be tricky
So let's break it down in a step-by-step case study 🧵
This time - let's create a model, and use explainable ML techniques to understand how it's making decisions.
And hopefully we'll gain some trading insights along the way!
Enjoy!
Finding optimal trading parameters is great - but they may not respond or adapt to market changes
This is where ML may be able to help us!
So let's take a step-by-step case study in this 🧵and use ML to try and predict the best trading parameters each day
Enjoy!
Good visualisations are key to quickly assessing strategy performance
So here's a 🧵to discuss two I use frequently:
1. Monte Carlo Plots
2. Win Rate Heatmap
And yes, all Python code is included, enjoy!
I spent 100s of hours making an ML breakout scanner.
The outcome - scanning the entire market in ≤ 2 mins per day.
The thing is, I hardly use it now... but it taught me a lot.
Here are 10 things I learned from this experience 🧵
(My favourite is Tip 7!)
Financial ML is not all about price prediction
IMO, there is power in:
• Clustering to find like assets (using KMeans)
• Clustering to find similar trades
Also - optimisation methods for strategy tuning :
• Genetic algos
• Bayesian opt
• Particle swarm optimisation
Learning Python for finance?
A good place to start is with coding simple strategies.
Here's a simple 🧵to show an example (code inside).
Best part is, it's only 38 lines of code.
Oh, and thank you
@QuantifiedStrat
for original post!
Bull Market Signal Strategy – How To Predict A Bull Market (Backtest)
The bull signal is the following: 18 consecutive closes above the 200 SMA in a bear market. After the bull market signal, the forward returns tend to be pretty strong. The market is up 21.84% on average one
Getting long-term historical intraday data for free is not easy.
For example, yfinance limits to only 7 days for 1m bars.
However, you can get 2 years from
@polygon_io
!
Here's a simple step-by-step 🧵to doing your first intraday price API call with Polygon.
In the linked tweet I showed an application for ML in finance.
Predicting trends in the SPY.
This was with a random forest model, so, how does a deep learning LSTM compare? 🤔
Can it learn from the OHLCV time series alone?
Let's explore in this step-by-step 🧵
To be fair - I suck at finding patterns for trading
But machines aren't too bad at pattern finding
This is why I use machine learning in my research!
One technique I like to use - feature importance
It helps me understand what may be useful to dig into further.
Simple ML idea for finance
Clustering for stop-loss setting:
• Use kmeans to group similar trades
• Per cluster, get the x% quantile for the move in the opposite direction for winners only
This may help keep you in similar looking winners for future events - worth a test!
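A minimal sketch of the stop-loss idea above, run on a synthetic trade log. The column names (`gap_pct`, `rel_volume`, `pnl_pct`, `mae_pct`), the 3 clusters and the 90% quantile are all assumptions for illustration, not values from the original post.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 300

# Hypothetical trade log: two entry features, the trade result, and the
# largest move against the position (adverse excursion) as a positive %.
trades = pd.DataFrame({
    "gap_pct": rng.normal(5, 2, n),
    "rel_volume": rng.lognormal(0, 0.5, n),
    "pnl_pct": rng.normal(1, 4, n),
    "mae_pct": rng.uniform(0, 6, n),
})

# Group similar trades using the entry features only.
scaled = StandardScaler().fit_transform(trades[["gap_pct", "rel_volume"]])
trades["cluster"] = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(scaled)

# Per cluster, take a high quantile of the adverse move among winners only.
winners = trades[trades["pnl_pct"] > 0]
stop_by_cluster = winners.groupby("cluster")["mae_pct"].quantile(0.90)
print(stop_by_cluster)
```

A new trade would be assigned to its nearest cluster and given that cluster's stop distance. Whether this actually keeps you in winners is something to backtest, not assume.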
An ML idea for trading:
• Cluster to find patterns
• Fit a classifier using your clusters as labels
• Use SHAP to explain what features are important per cluster
Explainable, unsupervised learning!
Perhaps I'll make a 🧵about this soon!
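Here is a rough sketch of that pipeline on synthetic data. To keep it dependency-light it swaps SHAP for scikit-learn's permutation importance, computed one-vs-rest per cluster, so treat it as a stand-in for the SHAP step rather than the same thing. The feature names and the choice of 3 clusters are made up.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(1)

# Synthetic data: three groups separated on the first two features,
# with a third feature that is pure noise.
X = np.vstack([
    rng.normal([0, 0, 0], 1.0, (150, 3)),
    rng.normal([6, 0, 0], 1.0, (150, 3)),
    rng.normal([0, 6, 0], 1.0, (150, 3)),
])
feature_names = ["feat_a", "feat_b", "noise"]

# Step 1: cluster to find patterns.
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Steps 2 and 3: per cluster, fit a "this cluster vs the rest" classifier
# and measure how much each feature matters to it.
importance_per_cluster = {}
for k in np.unique(labels):
    y_k = (labels == k).astype(int)
    clf = RandomForestClassifier(n_estimators=100, max_depth=3, random_state=0)
    clf.fit(X, y_k)
    result = permutation_importance(clf, X, y_k, n_repeats=5, random_state=0)
    importance_per_cluster[int(k)] = dict(zip(feature_names, result.importances_mean))

for k, scores in importance_per_cluster.items():
    print(k, {name: round(float(v), 3) for name, v in scores.items()})
```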
Machine learning projects in finance can be complicated
Part of developing successful models is having a well-defined approach
Here's my typical step-by-step approach for developing classification models 🧵
I hope it's useful for you!
Screenshotting charts, what a pain!
No easy way to get chart consistency...
And don't get me started on the manual effort!
If only we could automate this somehow for our deep dives? 🤔
Well, with < 100 lines of Python, you can create charts like these en masse 🧵
I started to progress with machine learning in trading when I took the "Keep It Simple Stupid" approach
Attempts 2 years ago
• Deep learning
• Reinforcement Learning
Now
• Unsupervised ML for data mining
• Tree based classification
• Anything I can do to remove complexity!
Trend lines are some of the most discretionary processes in trading.
With some math magic - they can be algorithmic.
Here's a 🧵 describing my first attempt at this.
Spoiler alert - the Python code is only 35 lines long!
Calendar Effects in Long-Term Treasuries: TLT Seasonal Trading Strategies Revealed (End of Month Insights):
In this strategy, we look at some specific calendar effects in long-term Treasuries. We backtest some calendar effects by using the ETF with the ticker code TLT which
News events are well known to drive up stock prices
So I imagine many want to collect news for backtesting (like me!)
Here's a simple step-by-step guide using
@polygon_io
to collect data for free 🧵
The case study, the 2023 PLTR earnings surprise!
My friend and I are making a machine learning swing trading model.
We were interested in its performance month by month to improve it.
For some reason, 118 trades in July 2022, all winners. 😂
Stuff like this you'd never see from a simple histogram of trade results.
One key thing I have recently learned about classification models:
It's not just about being right
It's about accurately gauging how likely the model is right!
This is where we can use one cool technique - probability calibration.
Let's explore in this short 🧵
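A small sketch of what calibration looks like with scikit-learn's `CalibratedClassifierCV`. The data is synthetic and i.i.d. (an assumption that does not hold for market data), and the model settings are arbitrary.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import train_test_split

# Synthetic binary classification problem (not market data).
X, y = make_classification(n_samples=2000, n_features=10, n_informative=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Uncalibrated model.
raw = RandomForestClassifier(n_estimators=100, max_depth=3, random_state=0)
raw.fit(X_train, y_train)

# Same model wrapped in a calibrator fitted with 5-fold cross-validation.
calibrated = CalibratedClassifierCV(
    RandomForestClassifier(n_estimators=100, max_depth=3, random_state=0),
    method="isotonic",
    cv=5,
)
calibrated.fit(X_train, y_train)

# Brier score: mean squared error of the predicted probabilities (lower is better).
brier_raw = brier_score_loss(y_test, raw.predict_proba(X_test)[:, 1])
brier_cal = brier_score_loss(y_test, calibrated.predict_proba(X_test)[:, 1])
print(f"Brier raw: {brier_raw:.4f}, calibrated: {brier_cal:.4f}")
```

For time series you would likely want chronological folds rather than the shuffled split used here.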
And here we have it!
A link to some of the code I used to develop my threads on Python/ML.
I'll iteratively add to it over time as I produce new content.
Enjoy!
Testing in any ML challenge is super important.
This is especially true for finance!
So let's explore one idea in this bitesize 🧵
The topic - time series K-Fold validation.
Enjoy!
And it's up!
An interactive dash to show the results from an RF model that predicts trends in the SPY.
For those waiting, thank you for your patience!
It's been a busy week/hard to find time to develop around my 9-5.
Enjoy & more info inside the 🧵
1 month ago I decided to take twitter posting seriously
Since then:
• 2000 followers (I started at 120)
• Missed only 2 days of posting
• Video called 4 people
• Made & launched the trend predictor dash
I'm excited to see what I'll say in another month!
Me at the start of my journey: "I'm going to be rich!"
Me now: "Sigh... what mistake did I make this time..."
One thing I have learned is that more experience doesn't necessarily mean fewer mistakes. I still do stupid stuff all the time!
These days I'm simply more cynical of my
Not all % changes are created equally
A 20% move may look large, but how large depends on how the asset typically moves
Here are 3 features I add to try and account for this in my models 🧵:
• % changes/average range
• Include both % change and range
• Include market cap
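A quick sketch of the first two features on synthetic OHLC-style data. The 20-day window and the column names are my own assumptions, and market cap is left out because it needs a separate data source.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
n = 60

# Synthetic closes with a random high/low spread around them.
close = 100 * np.exp(np.cumsum(rng.normal(0, 0.02, n)))
spread = np.abs(rng.normal(0, 0.01, n)) * close
df = pd.DataFrame({"high": close + spread, "low": close - spread, "close": close})

# Plain % change and the day's range as a % of price.
df["pct_change"] = df["close"].pct_change() * 100
df["range_pct"] = (df["high"] - df["low"]) / df["close"] * 100

# Typical range over the previous 20 days (shifted so today is excluded).
df["avg_range_pct"] = df["range_pct"].rolling(20).mean().shift(1)

# Today's move expressed in units of the asset's typical range.
df["pct_change_vs_range"] = df["pct_change"] / df["avg_range_pct"]
print(df.tail())
```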
I've been a bit quieter on twitter lately
Partly due to the effort of live testing my model
Partly to learn about some new things:
• Probability calibration
• Model stacking
• Conformal prediction
The data & trading rabbit hole is deep (but fun)!
More on the above soon.
They say a good picture speaks 1000 words.
In this case, the picture shows 1000 Monte-Carlo simulations.
In one image:
• Median expectancy
• Upper/lower quantiles (to judge volatility)
• Drawdown histogram
All part of a Monte-Carlo 🧵 I'm working on.
Stay tuned!
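A minimal sketch of how those three pieces can be computed, assuming the simulation is a bootstrap (resampling trades with replacement) and that returns are added rather than compounded. The trade results are synthetic.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical per-trade results in % (stand-in for a real backtest).
trade_returns = rng.normal(0.2, 2.0, 250)

n_sims = 1000
# Each simulation is a resampled ordering of the trades, with replacement.
samples = rng.choice(trade_returns, size=(n_sims, trade_returns.size), replace=True)

# Cumulative (non-compounded) equity curves, each starting at 0.
equity = np.hstack([np.zeros((n_sims, 1)), np.cumsum(samples, axis=1)])

# Median curve and 5%/95% quantile bands across simulations.
median_curve = np.median(equity, axis=0)
lower, upper = np.percentile(equity, [5, 95], axis=0)

# Max drawdown per simulation, then a histogram of those drawdowns.
running_peak = np.maximum.accumulate(equity, axis=1)
max_drawdown = (running_peak - equity).max(axis=1)
dd_counts, dd_edges = np.histogram(max_drawdown, bins=20)

print(f"Median final equity: {median_curve[-1]:.1f}%")
print(f"Median max drawdown: {np.median(max_drawdown):.1f}%")
```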
The two data providers I am currently using:
•
• Financial Modelling Prep
Both on their cheapest plan (for now).
Pros and cons of both on the starter plan⬇️
Where do I come up with most of my features for ML in finance?
YouTube videos and 𝕏 posts
Yes, I know it's not a clever answer
But great ML features come straight from intuition about the problem
Traders with this are already 10 steps ahead in financial ML!
Here's a backtest visualisation idea
Plot your win rate as a heatmap!
This way it's easy to see:
• The number of trades per month
• If each month appears consistent
• Are there extremely bad/good months
A useful overview for further analysis!
Code inside the 🧵
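This is not the thread's code, just a small sketch of how the underlying table could be built with pandas. The `exit_date` and `pnl` columns are hypothetical and the trades are random.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
n = 500

# Hypothetical trade log spread over 2021 and 2022.
offsets = pd.to_timedelta(rng.integers(0, 730, n), unit="D")
trades = pd.DataFrame({
    "exit_date": pd.to_datetime("2021-01-01") + offsets,
    "pnl": rng.normal(0.1, 1.0, n),
})
trades["year"] = trades["exit_date"].dt.year
trades["month"] = trades["exit_date"].dt.month
trades["win"] = (trades["pnl"] > 0).astype(int)

# Rows = years, columns = months, values = share of winning trades.
win_rate = trades.pivot_table(index="year", columns="month", values="win", aggfunc="mean")
# Same layout with the number of trades, useful for annotating each cell.
trade_count = trades.pivot_table(index="year", columns="month", values="win", aggfunc="count")
print(win_rate.round(2))
```

The `win_rate` table can then be passed to a heatmap function such as `seaborn.heatmap`, with `trade_count` as the annotations.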
Yesterday I did a thread about how the market monitor (MM) can help predict trends in the SPY.
Today, let's remove the MM data and focus on price action only.
The aim - to start analysing what the key price-action factors are for determining SPY trends. 🧵
Strategy Idea: The Williams %R Strategy on QQQ 🧵
Inside:
• What the Williams %R indicator is
• The strategy rules
• Link to the python backtest
• Parameter optimisation
• Key takeaways
It's a great looking equity curve, but can we trust it completely? 🤔
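For reference, a short sketch of the Williams %R calculation itself, on synthetic prices. The 14-bar lookback is the conventional default and an assumption here. The strategy rules from the thread are not reproduced.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(9)
n = 120

# Synthetic closes with highs/lows placed symmetrically around them.
close = pd.Series(100 * np.exp(np.cumsum(rng.normal(0, 0.015, n))))
wiggle = close * np.abs(rng.normal(0, 0.01, n))
high, low = close + wiggle, close - wiggle

lookback = 14
highest_high = high.rolling(lookback).max()
lowest_low = low.rolling(lookback).min()

# %R runs from -100 (close at the lookback low) to 0 (close at the lookback high).
williams_r = -100 * (highest_high - close) / (highest_high - lowest_low)
print(williams_r.tail())
```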
A dual MACD momentum strategy applied to BTC-USD 🧵
Inside
• The simple strategy rules, created by
@FWisdomTV
• How to backtest in Python
• Statistics from the backtest
Enjoy!
I'm a moron trader.
I struggle with classifying an A+ setup using discretion.
Maybe saying moron is harsh... perhaps learner is more appropriate! I know this work takes time.
Nevertheless, this is why I love to use machine learning to help me.
Here's one idea I try -
Financial ML can be a lot of trial and error
This is especially true for feature engineering
Got an idea for a predictive feature?
Train a model -> feature importance -> find out if you're right
One good way to confirm your hypotheses!
Random forest models are amazing
However, I prefer gradient boosted trees
Here's one advantage - early stopping!
This helps prevent overfitting by cutting training early, just as the model stops generalising to new data
Random forests don't really have this concept.
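A small sketch of early stopping using scikit-learn's `GradientBoostingClassifier`. The data is synthetic and the parameter values are arbitrary choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic classification data (not market data).
X, y = make_classification(n_samples=1000, n_features=10, n_informative=3, random_state=0)

model = GradientBoostingClassifier(
    n_estimators=500,         # upper limit on the number of trees
    learning_rate=0.1,
    max_depth=2,
    validation_fraction=0.2,  # held out internally to monitor the loss
    n_iter_no_change=10,      # stop after 10 rounds without improvement
    random_state=0,
)
model.fit(X, y)

# Number of trees actually fitted before training stopped.
print(f"Trees used: {model.n_estimators_} of 500")
```

Note that `validation_fraction` takes a random slice of the training data. For time series you would probably want to hold out the most recent period yourself instead.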
I like data backed trading because it confirms my discretion
If I find a profitable pattern - I can test it en masse
Machine learning is my next layer
If it can learn something about this pattern, then I have more conviction that a general and repeatable edge is there
Mark Minervini's trend template is very well known in the trading world.
Perhaps what isn't very well known is how easy the template is to code in Python.
Here's a short 🧵 describing how to do this.
Oh, did I mention... it's only 34 lines of code!
Happy holidays everyone!! 🌲
I started posting properly here just under two months ago - now I'm at nearly 4k followers and I'm hugely grateful for everyone's support!
Trading and research are very lonely games. By doing everything in isolation, I believe you cut down on your
@PradeepBonde
If you're a swing trader, then the Market Monitor is a seriously useful market breadth tool.
I'm using it to understand whether the market is favorable to long-based momentum trades.
Basically, I'm looking to trade long when the chart is highlighted green.
Deep learning is sexy, right?
To be honest, this is where I started looking to apply machine learning for finance.
Now I'm finding more edge in the "simpler" algorithms.
Here's a 🧵for an idea I'm playing with - clustering for small cap gappers.
Interested in using machine learning for finance?
My advice - start with tree based models.
Why?
They're simple to work with, and you can use them to gauge feature importance.
Even though deep learning seems sexy, it's much harder to use and far less interpretable.
To traders who are learning Python!
What is it that you would like to learn more about?
I'd love to make more posts about Python, so I want to understand the sort of things you'd like to see.
E.g. more simple backtests? Collecting data? Plots?
Let me know (even if it's a DM)!
Intuition can be your best friend for features in ML
E.g. For trend trading models, extensions from SMAs may be useful
Feature importance scores can help show if you are correct
Sometimes it confirms your genius
Or you may learn something new
One of the cool things about ML!
The SPY predicting trend dashboard has 259 unique viewers since I launched it 3 days ago.
Not massive numbers I know, but more than my PhD thesis got in the first 3 years.
So that's a huge win to me!
Although, to be fair, my thesis wasn't exactly a "thrilling read" 🤷.
Deep learning is amazing stuff - but tree based models are winners for me!
Deep Learning:
• Need a lot of data
• Long training times
• Complex model architectures
Tree based models:
• Can work on smaller datasets
• Quicker training times
• Easier to interpret and explain
Deep learning is amazing - but tree based models are winners for me!
Deep Learning:
• Need a lot of data
• Long training times
• Not very interpretable
Tree based models:
• Can work on smaller datasets
• Quicker training times
• Interpretable
@PradeepBonde
The best thing about tree based models is that you can extract feature importance.
In other words, which columns in Pradeep's data are the most/least important.
Interestingly, the # of stocks up 13% plus in 34 days is most important. Closely followed by the 10 day ratio.
One of my favourite testing techniques:
Time Series Cross Validation
• Helps in preventing data leakage
• Mimics periodic model training
This way, you can test if there is some model instability as you train on "new" data.
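A tiny sketch of how the folds look with scikit-learn's `TimeSeriesSplit`. The 100 "days" and 4 splits are arbitrary.

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

# 100 consecutive "days" of data, in chronological order.
n_days = 100
X = np.arange(n_days).reshape(-1, 1)

tscv = TimeSeriesSplit(n_splits=4)
folds = []
for train_idx, test_idx in tscv.split(X):
    folds.append((train_idx, test_idx))
    # Training always ends before testing begins, so no future data leaks in.
    print(f"train 0..{train_idx[-1]}  test {test_idx[0]}..{test_idx[-1]}")
```

Each fold trains on a longer history and tests on the next unseen block, which is what mimics periodic retraining.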
1. Data + Feature Engineering
To keep it simple, let's get daily SPY data from yfinance - with this we'll get ~20 years of daily data.
The features we'll use are also simple:
• Price change from the SMA
• Price change from the n day max/min
• Price Changes
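A minimal sketch of those three feature types with pandas. It uses a synthetic price series in place of the yfinance download so it runs offline, and the 20-day lookback is an assumed value.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)
n = 300

# Synthetic stand-in for daily SPY closes.
close = pd.Series(100 * np.exp(np.cumsum(rng.normal(0.0003, 0.01, n))), name="close")
df = close.to_frame()

n_days = 20  # hypothetical lookback window

# % distance of price from its simple moving average.
df["pct_from_sma"] = (df["close"] / df["close"].rolling(n_days).mean() - 1) * 100
# % distance from the rolling max (always <= 0) and rolling min (always >= 0).
df["pct_from_max"] = (df["close"] / df["close"].rolling(n_days).max() - 1) * 100
df["pct_from_min"] = (df["close"] / df["close"].rolling(n_days).min() - 1) * 100
# % price change versus n days ago.
df["pct_change_n"] = df["close"].pct_change(n_days) * 100

features = df.dropna()
print(features.tail())
```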
I don't want to waste anyone's time - so here's the code
What does this do?
1. It checks through a csv file with two columns, ticker and Date.
2. It downloads the daily data from yfinance (if it's not delisted)
3. It produces a chart (and optionally saves it as a png)
Wow, 500 followers! 🚀
Yesterday I had 120. +300% growth.
Thank you all! More content to come.
Also, I'm now secretly hoping my trading account grows the same way 😂
Distributions in trading are strange
An example = high-of-the-day time for red day small cap gappers
The mean + median look... well.. off.
Here's a tool I like - Kernel Density Estimation
Inside:
• A quick KDE intro
• How to use it in Python
• Some analysis ideas
Enjoy!
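A short sketch of KDE using `scipy.stats.gaussian_kde`. The high-of-day times are synthetic and deliberately bimodal (one cluster near the open, one near the close), which is an assumption made to show why the mean and median can mislead.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(6)

# Hypothetical high-of-day times, in minutes after the open (0 to 390).
hod_minutes = np.concatenate([rng.normal(20, 8, 400), rng.normal(330, 25, 200)])
hod_minutes = np.clip(hod_minutes, 0, 390)

# Fit the KDE and evaluate it on a one-minute grid.
kde = gaussian_kde(hod_minutes)
grid = np.linspace(0, 390, 391)
density = kde(grid)

# Local maxima of the density curve = the "typical" times.
is_peak = (density[1:-1] > density[:-2]) & (density[1:-1] > density[2:])
peak_minutes = grid[1:-1][is_peak]

print(f"mean={hod_minutes.mean():.0f}, median={np.median(hod_minutes):.0f}")
print("density peaks at minutes:", peak_minutes)
```

With data shaped like this, the mean lands in the quiet stretch between the two peaks, where very few highs actually occur.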
Here are the three best places to learn the basics behind how common ML models work:
1. StatQuest by
@joshuastarmer
2. StatQuest by
@joshuastarmer
3. StatQuest by
@joshuastarmer
If you're interested in ML - Josh's YouTube channel is highly recommended!
Wow! I woke up today to 1000 followers 🚀
A week ago I had 500
Just before that 120.
I'm humbled that 1000 people are interested in what I have to say.
Thank you all!
More value to come!
Thank you for the reception to my tweet about the Market Monitor yesterday!
It looks like there is an appetite for making this kind of chart public.
I'm working on knocking up a dashboard, which I can host with
@streamlit
.
Sneak peek on the next tweet.
1. Getting the data
This is the easy part!
Since we are looking at the daily time frame, we can use yfinance to get the data in just two lines of code.
The output will be a pandas dataframe containing ~ 20 yrs of data.
Learning machine learning for finance?
My advice - don't start with finance!
You'll ask two questions when it fails
1. Do I understand ML?
2. Can ML solve my problem?
Start with basic/solvable problems to learn
Apply to finance later
Then you can eliminate one question
Everyone wants a superhuman ML prediction model
Yet there is great power in unsupervised methods too
The difference:
1. Prediction - finding a specific pattern
2. Unsupervised - it finds a pattern for you (thanks ML!)
Sometimes good insights can be gained from unsupervised!
A few weeks to build a model
100s of lines of code
Too many debugging sessions
Countless cups of coffee
The unexpected and annoying problems when live testing on a demo account...
Priceless 😂
Sometimes, a good visualisation can speak 1000 words
However, this one literally does that!
The data = news headlines for gap-up days on mid to large cap stocks
My aim = to upskill in NLP and apply to trading.
Quite a neat view, right? 🙂
2a. Deriving the features
Yes, they're all %
The simple reason is that we want the model to learn general rules.
If we feed it price, it may learn something like "oh this happened when SPY was $200".
% changes keep features generic over time scales
Explainable ML is an awesome area of study.
This is a snapshot from when my simple ML trend model predicted the end of the COVID crash.
Using SHAP, you can break down each feature contribution to this decision.
All part of a 🧵I'm working on for tomorrow - stay tuned!
@PradeepBonde
Anyway, that's all for this one!
If you are interested in applying Python/Data-Science to finance then:
1. Follow me
@DrDanobi
for more of these
2. Check out some other of my threads.
My aim is to quantify as much as possible.
That even goes for my trend lines!
The ones in the video are drawn 100% by my algo.
How does it work, I hear you ask?
Answer: y=mx+c with a sprinkle of mathematical optimisation.
Something we shall explore in a future thread.
It's pretty basic for now - but can always be updated & improved based on feedback.
If this would be useful for you, then let me know by liking/commenting 🙂
Anyway, that's enough of my blabbering for today!
I hope you found it useful 🙂
If you are interested in applying Python/Data-Science to finance then follow me
@DrDanobi
for more threads!
I can't take credit for the idea - that goes to
@chanep
and his team at
@PredictNowAI
They call it "Conditional Parameter Optimisation" (CPO) - the key idea is to use ML to find the optimal parameters based on the current market conditions, i.e. it's adaptive.
Plenty of distributions in trading are not "normal".
Here's one example - monthly percentage changes on the SPY.
What does this mean for you?
Don't use a mean as your average - use a median.
Or better yet, plot a histogram, to see how the distribution of your data looks!
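A quick illustration of the mean vs median point. The numbers are synthetic (mostly small gains plus a handful of large drops) and are not real SPY returns.

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic monthly % changes: many calm months, a few crash months.
calm = rng.normal(1.0, 3.0, 215)
crashes = rng.normal(-15.0, 4.0, 25)
monthly_pct = np.concatenate([calm, crashes])

mean_change = monthly_pct.mean()
median_change = np.median(monthly_pct)

# Histogram counts, to look at the shape rather than a single number.
counts, edges = np.histogram(monthly_pct, bins=30)

print(f"mean={mean_change:.2f}%  median={median_change:.2f}%")
```

The crash months drag the mean down while the median stays close to the typical month.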
To start, let's briefly introduce two fields of ML:
1. Unsupervised - grouping like cases, where the ML is given no direction.
2. Supervised - learning from labelled data (e.g. this is a cat, this is a dog)
We're going to use 1 to find a pattern, and 2 to de-risk it.
The simple moving average crossover strategy.
It's terrible - but a GREAT starting point for algo trading in Python.
In fact, it's only 10 lines of code.
Here's a simple 🧵 you can follow along with.
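Not the thread's exact code, but a rough sketch of what a crossover backtest can look like. It assumes 20/50-day windows and long-only trading, and it ignores costs. Prices are synthetic.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(8)

# Synthetic daily closes (stand-in for downloaded price data).
close = pd.Series(100 * np.exp(np.cumsum(rng.normal(0.0003, 0.01, 500))))

fast = close.rolling(20).mean()
slow = close.rolling(50).mean()

# Long (1) when the fast SMA is above the slow SMA, otherwise flat (0).
# Shift by one bar so today's signal is traded tomorrow (no lookahead).
position = (fast > slow).astype(int).shift(1).fillna(0)

daily_ret = close.pct_change().fillna(0)
strategy_ret = position * daily_ret
equity = (1 + strategy_ret).cumprod()

print(f"Final equity multiple: {equity.iloc[-1]:.3f}")
```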
3. Target variable
This is THE hardest bit IMO - if you give it a crappy target, the model may not be able to learn anything!
For the demo, let's ask the model if it can predict if SPY will be above its 20SMA in 5 days time.
This could be one way of detecting trend reversals.
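One way this target could be built with pandas, sketched on a synthetic price series. The exact construction in the thread may differ.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(10)
n = 300

# Synthetic stand-in for daily SPY closes.
close = pd.Series(100 * np.exp(np.cumsum(rng.normal(0.0003, 0.01, n))))

sma20 = close.rolling(20).mean()

# 1.0 if the close is above its 20SMA, 0.0 if not, NaN while the SMA is undefined.
above_sma = (close > sma20).astype(float).where(sma20.notna())

# Target for day t = the above/below state on day t + 5.
horizon = 5
target = above_sma.shift(-horizon).dropna().astype(int)

print(target.value_counts())
```

The last 5 rows have no target yet (their future is unknown), so they drop out of training.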
@PradeepBonde
With the new features & model, we end up with a cleaner chart again.
What I find quite interesting is seeing how the model is quite unsure in 2023.
Lots of chop for sure - I'm sure many traders can vouch for that personally!
Obviously, more work to be done on the model!
But at least the heatmap gives some signal for bad months where we could look at adding secondary filters.
For those interested in such a plot - I have attached the code I used.
Here's a data analysis tip for trading - plot everything!
It may seem excessive, but sometimes raw numbers don't tell the full story.
Here's an example, the time for the high of the day for red day small cap gappers.
You'd never know there are two peaks without plotting.
2. Deriving the features
The model needs something to learn from, so as a start, let's give it:
• % change between the price and SMA
• % change from the price and rolling max/min
• % change in the price from n days ago
Notice a pattern?
1a. Data + Feature Engineering
We use % changes for each of these for 2 reasons.
The first is that it stops a model learning something like "When SPY was $200, it did this"
The second I'll explain when we get to the unsupervised ML.
@PradeepBonde
TL;DR
The Market Monitor is an exceptional tool for understanding trading environments.
Machine learning can be leveraged with this data to provide an indicator of when to swing trade long.
And be sure to thank Pradeep for the data/free content!
Anyway, that's probably enough for this monster thread.
Let's not lengthen the winter.
I hope you found it useful 🙂
If you are interested in applying Python/Data-Science to finance then follow me
@DrDanobi
for more threads!
@PradeepBonde
Why?
We want the machine learning model to learn general rules.
If we feed in raw figures, it may not understand that the # will change because the universe size changes.
The 5 & 10 day ratio are already normalised, so they are fine too (same for the t2108).
The TL;DR:
• Using ML to predict trends can be done, with a decent degree of accuracy
• We can use ML to gain new trading insights
• Lots of pretty pictures
Anyway, that's all for today - I hope you enjoyed!
2. Train/Test Split
Let's do a split on 2020
Why?
I want to capture a bull/bear market in my test set - it's a fairly rough measure of whether the model can withstand extremes.
Obviously, there is more rigorous testing someone could do.
But better to start simple!
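A minimal sketch of a date-based split, assuming "split on 2020" means training on everything before 1 Jan 2020 and testing on everything after. The frame below is random placeholder data.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(11)

# Placeholder feature/target frame indexed by business day.
dates = pd.bdate_range("2015-01-01", "2023-12-29")
data = pd.DataFrame(
    {"feature": rng.normal(size=len(dates)), "target": rng.integers(0, 2, len(dates))},
    index=dates,
)

split_date = "2020-01-01"  # assumed boundary
train = data.loc[data.index < split_date]
test = data.loc[data.index >= split_date]

print(f"train: {train.index.min().date()} to {train.index.max().date()} ({len(train)} rows)")
print(f"test:  {test.index.min().date()} to {test.index.max().date()} ({len(test)} rows)")
```

No shuffling here: the test set sits strictly after the training set in time.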
1. Trend Prediction
In this 🧵:
• A simple prediction of trend in 5 days time
• Random Forest model
• Basic features & model evaluation
My experience has been that predicting trends is easier than prices - the trick is in how you define "trend" 😉
Hopefully this is a start!
And there we have it! An end-to-end example of a simple machine learning workflow.
Honestly, this is super simple and there is tons more to cover, e.g:
• Hyperparameter optimisation
• AUC/ROC
• Feature Engineering
• Target Improvements
• Different models
And so much more!
9. Visualise on a chart
This is essential for time series classifiers & gives you another way of evaluating model success.
E.g. there are some hiccups in sideways areas - this says maybe better features or a more defined target is required to avoid the chop!
5. Model Training
Honestly, this is the easy part. Some tips!
Tip 1: Keep max depth small
Reason 1: This prevents overfitting, and forces the model to learn "general rules" from the data, rather than exact patterns
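A small sketch of Tip 1 on synthetic, noisy data. It compares a shallow forest with a fully grown one. The depth of 3 and the other settings are my own picks, not the thread's.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data with 20% label noise, so there is something to overfit to.
X, y = make_classification(
    n_samples=1500, n_features=8, n_informative=3, flip_y=0.2, random_state=0
)
X_train, y_train = X[:1000], y[:1000]
X_test, y_test = X[1000:], y[1000:]

shallow = RandomForestClassifier(n_estimators=200, max_depth=3, random_state=0)
deep = RandomForestClassifier(n_estimators=200, max_depth=None, random_state=0)
shallow.fit(X_train, y_train)
deep.fit(X_train, y_train)

# A large gap between train and test accuracy is a sign of memorised noise.
for name, model in [("shallow", shallow), ("deep", deep)]:
    train_acc = model.score(X_train, y_train)
    test_acc = model.score(X_test, y_test)
    print(f"{name}: train={train_acc:.3f} test={test_acc:.3f} gap={train_acc - test_acc:.3f}")
```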
3. Clustering
KMeans clustering is an unsupervised ML approach to find "groups" in data, without any guidance except the number of groups to search for.
E.g., K = 2 -> group the data into two bins.
So the appropriate question to ask here is "well... how many groups?"
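One common heuristic for that question is the silhouette score: try several values of K and see which separates the data best. This is a sketch on synthetic blobs, and it is an assumption that this matches how the thread goes on to pick K.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Four well-separated synthetic groups.
centers = [(-10, -10), (-10, 10), (10, -10), (10, 10)]
X, _ = make_blobs(n_samples=600, centers=centers, cluster_std=1.0, random_state=0)

# Score each candidate K; higher silhouette = tighter, better separated groups.
scores = {}
for k in range(2, 9):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print({k: round(float(s), 3) for k, s in scores.items()})
print("best K:", best_k)
```

Real market data is rarely this clean, so the scores are usually much closer together and worth combining with a look at the clusters themselves.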
Anyway, that's enough rambling for today.
The code has been committed to my git repo, the link is in my bio.
And, if you are interested in applying ML/data-science to finance then follow me
@DrDanobi
for more threads like these!