SkalskiP

@skalskip92

20,558
Followers
973
Following
1,138
Media
6,186
Statuses

Computer Vision @roboflow. Open-source. GPU poor. Dog person. Coffee addict. Dyslexic. | GH: | HF:

Kraków, Polska
Joined February 2014
Pinned Tweet
@skalskip92
SkalskiP
17 days
supervision - a computer vision library I created - just crossed 15,000 stars on GitHub! BBBRRRRRR! link:
50
598
6K
@skalskip92
SkalskiP
13 days
Polish TV is using computer vision to enhance the viewer experience for sports broadcasts: - FIFA-like radar overlays - player recognition - pass distance measurement - ball speed and trajectory tracking during shots
208
1K
14K
@skalskip92
SkalskiP
8 months
RIP image annotation companies Fully automated image labeling with GroundingDINO + SAM + OpenAI Vision API code:
Tweet media one
65
395
3K
@skalskip92
SkalskiP
11 months
supervision-0.13.0 is out! Now you can effortlessly build advanced video analytics. Trackers, Zones, Annotators, and much more. GitHub repository:
36
546
3K
@skalskip92
SkalskiP
6 months
here is the final version of my vehicle speed estimation demo read the thread below to learn how I built it. I will cover: - detection - tracking - perspective transformation - speed calculation - some bonus ideas ↓
75
285
3K
@skalskip92
SkalskiP
5 months
REAL-TIME object detection WITHOUT TRAINING YOLO-World is a new SOTA open-vocabulary object detector that outperforms previous models in terms of both accuracy and speed. 35.4 AP with 52.0 FPS on V100. ↓ read more
35
383
3K
@skalskip92
SkalskiP
3 months
supervision, the open-source library I created a year ago, has crossed 10,000 stars on GitHub this weekend! thank you to everyone who helped me build this project! it took us 2,000+ commits, 500+ PRs and 50+ contributors to do it. repository:
21
313
2K
@skalskip92
SkalskiP
2 months
almost fully functional version of my football AI project today, I added player tracking using ByteTrack and projection of players onto the map code coming soon:
53
210
2K
@skalskip92
SkalskiP
11 months
supervision-0.13.0 is out! We added ByteTrack support! Now you can easily plug in any object detector and use it for tracking. GitHub repository:
46
311
2K
@skalskip92
SkalskiP
5 months
I'm starting to get more and more serious with YOLO-World; trying to solve real-life problems. I wanted to see if YOLO-World could recognize that the holes had been filled out. It was pretty tricky, but I learned a little about prompting. ↓ read more
16
168
1K
@skalskip92
SkalskiP
10 months
The traffic analysis project is growing! The YouTube tutorial will be out this week. Progress: I can now identify that the car is in a specified zone. Next: Match entrance and exit zones for every tracker ID to analyze the traffic flow. GitHub repo:
20
292
1K
@skalskip92
SkalskiP
8 months
Chat with the webcam using @OpenAI vision API
45
179
1K
@skalskip92
SkalskiP
2 months
I'm taking my football/soccer project to the next level today, I worked on detecting players, referees, and the ball and mapping their positions from video frames to positions on the field. ↓ read more
60
120
1K
@skalskip92
SkalskiP
1 month
I fine-tuned my first vision-language model PaliGemma is an open-source VLM released by @GoogleAI last week. I fine-tuned it to detect bone fractures in X-ray images. thanks to @mervenoyann and @__kolesnikov__ for all the help! ↓ read more
Tweet media one
30
196
1K
@skalskip92
SkalskiP
10 months
ball and player 3d pose estimation - easily one of the coolest computer vision projects I have ever made repository:
24
184
1K
@skalskip92
SkalskiP
3 months
detecting AI-generated text researchers studied the impact of ChatGPT on AI conference peer reviews, confirming what we all knew paper: ↓ read more
Tweet media one
32
118
1K
@skalskip92
SkalskiP
7 months
Nov 6th, 2023: We love you guys! Nov 17th, 2023: Sam is fired!
40
158
1K
@skalskip92
SkalskiP
4 months
YOLOv9 is out looks like a new SOTA real-time object detector I'm already working on a custom training tutorial
@_akhaliq
AK
4 months
YOLOv9 Learning What You Want to Learn Using Programmable Gradient Information Today's deep learning methods focus on how to design the most appropriate objective functions so that the prediction results of the model can be closest to the ground truth. Meanwhile, an appropriate
Tweet media one
9
163
794
24
164
1K
@skalskip92
SkalskiP
3 months
manual data labeling is (almost) dead 1,500,000 images auto-annotated within 2 weeks of release. now, we also support automatic segmentation labeling. ↓ read more about open-source models that power this feature
51
136
1K
@skalskip92
SkalskiP
2 months
I need to take a break from football AI for a while. I plan to experiment with PaliGemma, Google's new open-source VLM, over the next few days. but don't worry, I'll be back. In the meantime, the football AI code is slowly making its way to this repo.
38
124
1K
@skalskip92
SkalskiP
4 months
train YOLOv9 on your dataset tutorial - run inference with a pre-trained COCO model - fine-tune model on custom dataset - evaluate the trained model - run inference with a fine-tuned model blogpost: ↓ read more
13
157
1K
@skalskip92
SkalskiP
11 months
supervision-0.13.0 is out! Now you can effortlessly count crops in the fields with a single drone flyby. GitHub repository:
11
153
1K
@skalskip92
SkalskiP
2 months
taking my football/soccer AI to the next level - image embeddings - dimension reduction - player clustering - awesome visualizations code: (code migration in progress...) ↓ read more
33
92
978
@skalskip92
SkalskiP
6 months
what stops you from using supervision today? link:
24
107
935
@skalskip92
SkalskiP
7 months
looking for OpenAI-4V alternatives? - LLaVA - BakLLaVA - CogVLM - Fuyu-8B - Qwen-VL I am working on a short blog post discussing some GPT-4V alternatives. It will probably come out today. links all resources:
Tweet media one
@skalskip92
SkalskiP
7 months
What OpenAI-4V alternatives would you recommend? - LLaVA - BakLLaVA
45
44
488
43
154
927
@skalskip92
SkalskiP
8 months
Automated @NBA match commentary using @OpenAI vision and TTS (with code!) Everyone is bragging about projects that generate automatic video commentary, but no one is showing the code. I did it while waiting for the plane. code:
43
144
912
@skalskip92
SkalskiP
4 months
manual data labeling is almost dead define prompts, tweak the confidence threshold, and make manual adjustments if necessary. this feature is now available to all users, even on free accounts. read more:
12
125
917
@skalskip92
SkalskiP
9 days
Florence-2 is finally out! 1 model; 10+ computer vision tasks! ↓ key takeaways are listed below. see my blog post for details. link:
23
123
889
@skalskip92
SkalskiP
18 days
I spent most of today preparing for CVPR 2024 "Matching Anything by Segmenting Anything" particularly caught my attention. Here are the fast open-vocabulary tracking examples (MASA + YOLO-World). link: ↓ read more
7
122
877
@skalskip92
SkalskiP
5 months
how to calculate the TIME objects spend IN THE ZONE? - that's the topic of my next tutorial. here's a short (and a bit creepy) demo I built a few months ago. do you have ideas for a less creepy use case for this tech? github repository:
56
127
859
@skalskip92
SkalskiP
24 days
supervision 0.21.0 is launching tomorrow this update includes VertexLabelAnnotator, allowing you to annotate skeleton vertices with custom text and color link:
15
131
858
@skalskip92
SkalskiP
4 months
taking traffic analysis to the next level with supervision-0.19.0 speed estimation + 3d road visualization link: ↓ read more
12
109
860
@skalskip92
SkalskiP
7 months
analyzing store traffic to find the most frequently visited areas super demo created by @Hine__Po - member of Supervision community link to repo if you want to build something over the weekend:
13
146
817
@skalskip92
SkalskiP
4 months
The YOLO-World YouTube tutorial is out! please, let us know what you think! - model architecture - processing images and video in Colab - prompt engineering and detection refinement - pros and cons of the model watch here: ↓ more resources
12
136
803
@skalskip92
SkalskiP
3 months
now you can run real-time object detection on multiple streams with 10 lines of code link: ↓ code snippet
13
141
789
@skalskip92
SkalskiP
4 months
YOLOv9 tutorial: train model on custom dataset - running inference with pre-trained COCO weights - fine-tuning the model on a custom dataset - model evaluation - model deployment sorry it took me so long; hope you like it
15
100
752
@skalskip92
SkalskiP
2 months
it took us a while, but the supervision-0.20.0 release will finally add support for key points. what are your thoughts on annotators? so far, we only have EdgeAnnotator and VertexAnnotator. supervision repo:
21
96
748
@skalskip92
SkalskiP
9 months
supervision-0.15.0 is out! This time, we bring highly customizable annotators. We added eight annotators - box, mask, ellipse, label, circle, corner, trace, and blur. But the best part is... you can freely mix them! GitHub repository:
9
126
737
@skalskip92
SkalskiP
1 month
YOLO is the craziest model family. Each version is created by a different organization. "Compared with YOLOv9-C, YOLOv10-B has 46% less latency and 25% fewer parameters for the same performance." I'll try to test it today. ↓ links
Tweet media one
14
82
746
@skalskip92
SkalskiP
5 months
improving object counting logic today I solved an interesting bug that has existed in my library for a loooooong time repository: ↓ WARNING: lots of math in the thread below
7
80
739
@skalskip92
SkalskiP
9 months
Easily one of the most exciting projects built with Supervision! Our community member Vriza Wahyu Saputra built this fantastic ball juggling counting demo using the moving LineZone available in our API.
12
95
717
@skalskip92
SkalskiP
8 months
Am I the last person who didn't know about OpenAI Cookbook? link:
Tweet media one
23
89
705
@skalskip92
SkalskiP
3 months
support for pose estimation and key point detection soon in the supervision you can expect connectors for the most popular models and the first annotators in the next supervision release can't wait to build demos like this with supervision
14
83
705
@skalskip92
SkalskiP
5 months
parking occupancy analysis calculation of percentage occupancy in individual parking zones all this was done with supervision: btw, @UenoLeo is cooking a blog post covering this project, so stay tuned! ↓ read more
13
97
703
@skalskip92
SkalskiP
3 months
I love watching other people build cool demos with the supervision library; traffic analysis examples built by Anant Jaiswal - object tracking - zone counting - heat-map analysis link:
4
94
704
@skalskip92
SkalskiP
4 months
smart self-service checkout powered by YOLOv9 the value of the basket is updated live based on its changing content; what else should I add? demo built with supervision:
14
90
704
@skalskip92
SkalskiP
8 months
What papers should I read to expand my knowledge of Transformers? Please send links in the comments and write why this paper is worth reading. Thanks for your help!
Tweet media one
32
102
685
@skalskip92
SkalskiP
3 months
new YouTube tutorial: compute dwell time using computer vision in live streams (seems easy, yet tricky) - static file vs stream processing - preventing growing latency and frame buffer overflow - efficient stream processing full tutorial: ↓ read more
6
78
681
@skalskip92
SkalskiP
6 months
speed estimation tutorial is finally out! - object detection - multi-object tracking - filtering detections with polygon zone - perspective transformation and speed estimation link: below are some interesting visualizations I created for this video ↓
13
112
673
@skalskip92
SkalskiP
5 months
Qwen-VL-Plus is SCARY good! (better than GPT-4V) here it is casually solving Recaptcha! - You don't have to give any additional instructions other than 'Solve it.' - It can even mark the exact position of the objects it is looking for. ↓ it can do so much more
Tweet media one
24
103
678
@skalskip92
SkalskiP
7 months
Sports Analytics with GPT-4 Vision I wondered whether GPT-4V had the capability to automatically separate players into teams based on the color of their uniforms. It took me a ridiculously long time to create this image, but in the meantime, I learned a lot about GPT-4V.
Tweet media one
@skalskip92
SkalskiP
11 months
supervision-0.13.0 is out! We added ByteTrack support! Now you can easily plug in any object detector and use it for tracking. GitHub repository:
46
311
2K
20
89
669
@skalskip92
SkalskiP
11 months
- Object detection over HTTP? - Easy! We just open-sourced our inference server under Apache 2.0 Left terminal: @roboflow inference Right terminal: video client
6
80
662
@skalskip92
SkalskiP
10 months
The traffic analysis project is done! The YouTube tutorial will be out tomorrow. Stay tuned! Wait till flow counters appear around 0:06. Github repo:
17
100
651
@skalskip92
SkalskiP
8 months
SAM + MetaCLIP + ProPainter produce masks: remove object: I'm working on combo space!
7
102
616
@skalskip92
SkalskiP
1 month
I'm experimenting with PaliGemma tonight a single open-source model allowing you to: - detect car (detection) - answer questions about its color and brand (VQA) - read license plate number (OCR) all that on a single consumer-grade GPU is there any other model that can do it?
Tweet media one
25
81
631
@skalskip92
SkalskiP
5 months
it blows my mind to see things that are created using my code
Tweet media one
14
26
619
@skalskip92
SkalskiP
5 months
It took me ONE HOUR to craft this demo using supervision-0.18.0 - Three new annotators: PercentageBar, RoundedBox, and OrientedBox - Enhanced LineZone feature for improved counting - OBB (oriented bounding boxes) integration ↓ read more repo:
13
94
609
@skalskip92
SkalskiP
4 months
YOLO-World + EfficientSAM + StableDiffusion for language-guided inpainting I was inspired yesterday by the work of @MrDravcan (see attached), and I decided to try to replicate it. SPOILER ALERT: it didn't quite work out for me. ↓ read more
17
96
611
@skalskip92
SkalskiP
3 months
awesome example of using Supervision for the detection, annotation, and counting of coffee seedlings kudos to community member Eric Kimwatan supervision repo: ↓ youtube tutorial and colab
8
87
598
@skalskip92
SkalskiP
3 months
time-in-zone (dwell time) tutorial is coming this is the third time I'm trying to make this video; hopefully, the last one I finally have a good use case - waiting time for service. here is the first iteration. what do you think? link:
11
61
600
@skalskip92
SkalskiP
2 months
always triple-check the correctness of your datasets and data augmentations. today, I found two separate errors that ruined my model training. but finally, we are on the right track ↓ here's where I messed up
12
38
599
@skalskip92
SkalskiP
3 months
detecting small objects is hard I spent some time today writing a short how-to guide on using supervision (in combination with the most popular CV libraries) to detect small objects. btw is that a good idea for a video tutorial? link: ↓ read more
18
59
586
@skalskip92
SkalskiP
5 months
supervision-0.18.0 is almost here! we had planned to release it tomorrow, but we're still putting the finishing touches on the OBB (oriented bounding box) support repository:
Tweet media one
6
67
583
@skalskip92
SkalskiP
7 months
Manually annotate ONE image and let GPT-4V annotate ALL of them. 1. generate boxes for all images with GroundingDINO 2. provide categories for the reference image 3. prompt GPT-4V to map generated boxes to reference categories
Tweet media one
8
86
576
@skalskip92
SkalskiP
10 months
I'm working on a new YouTube tutorial... It's going to be sick! Here is v1 of my custom vehicle detector.
19
61
569
@skalskip92
SkalskiP
4 months
I'm experimenting with a new annotator that zooms in on small detections do you think it is something useful? or am I just wasting my time here? more cool annotators:
40
56
559
@skalskip92
SkalskiP
7 months
me: Find dog. gpt-4 vision: for the past few days I have been working on a library for advanced prompting of LMMs here it is:
Tweet media one
16
77
553
@skalskip92
SkalskiP
4 months
processing documents with Claude 3 - Good OCR capabilities - Process up to 20 images with a single API call - API seems slow and a bit unstable; expect a lot of variance in call execution time - ~2x cheaper than GPT-4V (please check my math) ↓ read more
Tweet media one
Tweet media two
10
47
543
@skalskip92
SkalskiP
9 months
Working on a new tutorial - time in the zone. Detection, tracking, and zones are ready. Time to add timers. GitHub repository:
9
96
536
@skalskip92
SkalskiP
3 months
time analysis with computer vision - blurring faces - detection and tracking - smoothing detections - filtering detections by zone - calculating time let me know if you want me to explain anything else. ;) code: ↓ read more
8
58
523
@skalskip92
SkalskiP
6 months
finally had a little bit of time to work on my upcoming vehicle speed estimation tutorial any improvement ideas? the demo was built with the supervision code will soon land on GitHub:
26
77
496
@skalskip92
SkalskiP
9 months
Is that demo too creepy? Ignore that the one lady who has been sitting in the zone since the beginning is undetected. I am still trying to figure out why... But zone timers work! GitHub repository:
39
80
513
@skalskip92
SkalskiP
4 months
🔴 stream: YOLO-World Q&A + coding in less than 15 minutes, I start my first YT stream; I'll be talking about YOLO-World and answering your questions that you left under my last YT video stop by to say hello link: ↓ some of the topics we will cover
7
73
512
@skalskip92
SkalskiP
1 year
Two months ago, I created a @github repository where I gathered links to the best free AI courses. 🔥 I started with five links, and now there are almost 20. 🚀 The entire repository already has 1200+ ⭐ ⮑ 🔗 GitHub repository: ↓🧵some of the courses
Tweet media one
8
152
495
@skalskip92
SkalskiP
1 month
YOLOv10 is fast and light but is NOT the best choice for detecting small objects in the distance. - YOLOv8 - top-right - YOLOv9 - bottom-left - YOLOv10 - bottom-right YOLOv10 performs worse.
Tweet media one
8
57
511
@skalskip92
SkalskiP
17 days
the moment @karpathy knows your open-source project exists; now I can die in peace 🤯
Tweet media one
@skalskip92
SkalskiP
17 days
supervision - a computer vision library I created - just crossed 15,000 stars on GitHub! BBBRRRRRR! link:
50
598
6K
16
14
508
@skalskip92
SkalskiP
3 months
OpenAI, xAI, and... Supervision top 5 on GitHub trending!
Tweet media one
@skalskip92
SkalskiP
3 months
supervision, the open-source library I created a year ago, has crossed 10,000 stars on GitHub this weekend! thank you to everyone who helped me build this project! it took us 2,000+ commits, 500+ PRs and 50+ contributors to do it. repository:
21
313
2K
19
42
484
@skalskip92
SkalskiP
9 days
Thursday afternoon posters session #CVPR2024 YOLO-World: Real-Time Open-Vocabulary Object Detection [223] TL;DR: YOLO-World enhances the YOLO detector series with open-vocabulary detection capabilities, overcoming limitations of predefined object categories. ↓
7
95
567
@skalskip92
SkalskiP
7 months
What OpenAI-4V alternatives would you recommend? - LLaVA - BakLLaVA
45
44
488
@skalskip92
SkalskiP
7 months
supervision-0.17.0 release is just around the corner - plug in your favorite detection/segmentation model - compose the perfect visualization github:
9
92
487
@skalskip92
SkalskiP
10 months
Traffic Analysis Tutorial is out! I'm sorry for the delay. I didn't expect it to be 23 minutes long. Please let me know what you think. YouTube video:
13
101
488
@skalskip92
SkalskiP
7 months
using GPT-4V to split players into teams blending detections with the same tracker ID allows you to significantly reduce the number of GPT-4V API calls when you process video 1 call / 25 frames kudos to @ikuma_uchida18 for coming up with this strategy read more, it's cool ↓
@skalskip92
SkalskiP
7 months
Sports Analytics with GPT-4 Vision I wondered whether GPT-4V had the capability to automatically separate players into teams based on the color of their uniforms. It took me a ridiculously long time to create this image, but in the meantime, I learned a lot about GPT-4V.
Tweet media one
20
89
669
9
79
470
@skalskip92
SkalskiP
8 months
The second day of work on my SAM + MetaCLIP + ProPainter HF Space - Automated object masking [done] - Automated inpainting using ProPainter [in progress]
13
67
464
@skalskip92
SkalskiP
13 days
I know my football AI isn't quite there yet, but this motivates me to add some advanced features after I get back from CVPR 2024. I'll keep you guys posted! link:
8
26
473
@skalskip92
SkalskiP
8 months
I just added the polygon annotator to the supervision package you can now use masks or polygons to visualize the result of the instance segmentation model polygon annotator will be available in supervision-0.17.0 code:
5
53
455
@skalskip92
SkalskiP
7 months
processing this one-second video exhausted my entire daily quota of 500 GPT-4V requests but if you were wondering, @OpenAI GPT-4V can automatically divide players into teams based on the color of their uniforms
@skalskip92
SkalskiP
7 months
Sports Analytics with GPT-4 Vision I wondered whether GPT-4V had the capability to automatically separate players into teams based on the color of their uniforms. It took me a ridiculously long time to create this image, but in the meantime, I learned a lot about GPT-4V.
Tweet media one
20
89
669
26
62
448
@skalskip92
SkalskiP
5 months
defect detection with computer vision training and deploying manufacturing defect detector step-by-step guide blog post: ↓ read more
Tweet media one
9
71
456
@skalskip92
SkalskiP
1 month
detecting small and distant objects is a major weakness of YOLOv10. here is the comparison of YOLOv8l at 640x640 and YOLOv10l at 640x640: - green: detected by YOLOv8 and YOLOv10 - red: detected only by YOLOv8 - blue: detected only by YOLOv10
@skalskip92
SkalskiP
1 month
YOLOv10 is fast and light but is NOT the best choice for detecting small objects in the distance. - YOLOv8 - top-right - YOLOv9 - bottom-left - YOLOv10 - bottom-right YOLOv10 performs worse.
Tweet media one
8
57
511
9
55
448
@skalskip92
SkalskiP
2 months
estimating traffic density based on the live feed from NYC street cameras. you can find out in real-time which streets are congested. shoutout to @UenoLeo for creating this cool project!
7
50
446
@skalskip92
SkalskiP
8 months
Segment Anything (SAM) + MetaCLIP - unleashing the full power of @Meta open source! I'm having fun with @Gradio today!
@NielsRogge
Niels Rogge
8 months
CLIP by @OpenAI was revolutionary, but its data curation pipeline was never detailed nor open-sourced. @Meta has now released MetaCLIP, a fully open-source replication. Models are on the hub:
17
172
1K
10
73
437
@skalskip92
SkalskiP
11 days
live GPT-4o demo by @rown from OpenAI at #CVPR2024
21
49
435
@skalskip92
SkalskiP
4 months
YOLO (unofficial and incomplete) history who made what? while I wait for my first YOLOv9 model custom dataset fine-tuning to finish, I decided to share with you an incomplete YOLO history with links to papers and code YOLO (2016) Joseph Redmon et al. - paper:
Tweet media one
4
60
424
@skalskip92
SkalskiP
15 days
obviously torchvision is a lot more important than supervision :) still, it’s an awesome feeling to see my tiny library overtaking this freaking giant on GitHub link:
Tweet media one
8
27
427
@skalskip92
SkalskiP
4 months
zero-shot video object detection with YOLO-World as promised, I just updated my @huggingface ; have fun! space:
8
71
413
@skalskip92
SkalskiP
9 months
supervision-0.15.0 will be out tomorrow! This time we bring highly customizable annotators. Just plug in your model and we'll take care of the rest. GitHub repository:
4
62
412
@skalskip92
SkalskiP
8 months
Adding meaningful regions and labels significantly improves GPT-4V's reasoning capabilities.
Tweet media one
18
68
403
@skalskip92
SkalskiP
8 months
the must-have resource for anyone who wants to experiment with and build on the @OpenAI Vision API code:
Tweet media one
7
57
402
@skalskip92
SkalskiP
3 months
zone analysis is awesome; you can use it to calculate an object's precise position in space, determine its movement path, or measure its distance traveled. air traffic monitoring demo by @carlos_melo_py supervision repo: ↓ youtube tutorial and code
5
56
403
@skalskip92
SkalskiP
3 months
whenever I show zone analysis in my tutorials, people ask me how I designed the polygons I decided to spend a few hours and create for you a small tool you can fire up locally to draw zones code:
8
36
398