Ostris

@ostrisai

5,501
Followers
219
Following
210
Media
878
Statuses

AI / ML researcher and developer. Forcing rocks to think since 1998. ML at @heyglif

Denver, CO
Joined August 2023
@ostrisai
Ostris
3 months
Did a lot of testing on my LoRA training script for @bfl_ml FLUX.1 dev model. Amazing model! I think it is finally ready. Running smooth on a single 4090. Posting a guide tomorrow. Special thanks to @araminta_k for helping me test on her amazing original character and artwork.
35
98
688
@ostrisai
Ostris
9 months
I teared up a bit. I am extremely excited, but also feel completely inadequate in literally everything I have ever worked on. Ever. It is absolutely stunning and humbling to watch. I need a drink.
@OpenAI
OpenAI
9 months
Prompt: “A movie trailer featuring the adventures of the 30 year old space man wearing a red wool knitted motorcycle helmet, blue sky, salt desert, cinematic style, shot on 35mm film, vivid colors.”
487
1K
12K
24
30
675
@ostrisai
Ostris
2 months
Testing Flux.1 schnell embedding interpolation. I told claude some details about my life and had it generate 40 prompts from being born -> being dead using the data I gave it. Then interpolated between the embeddings for this video. Uses @araminta_k softpasty LoRA
31
86
649
@ostrisai
Ostris
3 months
IT WORKS!! I trained an IKEA instruction LoRA on a decompressed version of FLUX.1 schnell, and it works on schnell at 4 steps! The choppiness should go away when I get the guidance embedding trained. Still a WIP, but it works! There is hope!
Tweet media one
Tweet media two
22
58
538
@ostrisai
Ostris
1 year
New Stable Diffusion XL LoRA, Ikea Instructions. SDXL does an amazingly hilarious job at coming up with how to make things. Special thanks to @multimodalart and @huggingface for the GPU grant!! HF -> Civitai ->
Tweet media one
Tweet media two
Tweet media three
Tweet media four
13
79
485
@ostrisai
Ostris
3 months
AI-Toolkit now officially supports training LoRAs directly on FLUX.1-schnell. 🥳 How to do it here -> The adapter that makes it possible here ->
14
85
484
@ostrisai
Ostris
2 months
FLUX.1 schnell text embedding interpolation test. Silver balloons floating in the street.
13
41
463
@ostrisai
Ostris
8 days
How did I miss this? Too many releases. @freepik Awesome work! It works out of the box with ai-toolkit. Training first test LoRA now.
11
63
437
@ostrisai
Ostris
11 months
Early alpha demo of my "virtual try on" I have been working on. Load a few photos of a person, a photo of a top, enter a prompt, and instantly render them wearing it in any scene you want. Special thanks to my beautiful wife for letting me use her likeness.
16
49
386
@ostrisai
Ostris
3 months
Working to get out a full tutorial for FLUX.1-dev LoRA training on a 24GB card. For now, I updated the README and added an example config file. Should be enough to get many of you going. Will be updating as I go.
11
32
314
@ostrisai
Ostris
2 months
Soon.. It is still cooking, but I am starting to get extremely excited about it.
Tweet media one
19
16
288
@ostrisai
Ostris
3 months
Testing training a LoRA for FLUX.1 ... 38.5GB VRAM!!
Gradient checkpointing
Rank 256 (Big, I know)
T5 in 8bit
Full BF16
It is a big boy.
Tweet media one
13
35
268
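The memory levers listed in that config (gradient checkpointing plus bf16 everywhere) can be sketched in plain PyTorch. This is a minimal illustration, not the actual trainer: the small Linear stack is a stand-in for a FLUX transformer block, and running autocast on CPU is just to keep the sketch hardware-independent.

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

# Stand-in for one transformer block; the real model is FLUX.1.
block = nn.Sequential(nn.Linear(64, 256), nn.GELU(), nn.Linear(256, 64))

x = torch.randn(1, 64, requires_grad=True)

# Gradient checkpointing recomputes activations in the backward pass
# instead of storing them, trading compute for VRAM. Autocast runs the
# matmuls in bf16, roughly halving activation memory.
with torch.autocast("cpu", dtype=torch.bfloat16):
    y = checkpoint(block, x, use_reentrant=False)

loss = y.float().pow(2).mean()
loss.backward()  # activations for `block` are recomputed here
```

The rank-256 LoRA and 8bit T5 from the tweet are further, orthogonal savings on top of these two.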
@ostrisai
Ostris
4 months
Releasing my 16ch VAE (KL-f8-d16) today (also). MIT license, lighter weight than SD3 VAE (57,266,643 params vs 83,819,683), similar test scores, smaller, faster, opener. I'm currently training adapters for SD 1.5, SDXL, and PixArt to use it (coming soon)
9
45
264
@ostrisai
Ostris
3 months
I take this back. I managed to squeeze LoRA training for FLUX.1-schnell onto a single 4090 with 8bit mixed precision. We will see how well it works. 3s/iter.
Tweet media one
@ostrisai
Ostris
3 months
This can be optimized further. I think you could maybe do mixed precision 8bit quantization on the transformer (maybe). But, no matter how optimized it gets, I don't think it will ever be possible to train on current consumer hardware (<=24gb). Someone please prove me wrong.
3
1
25
13
26
256
@ostrisai
Ostris
2 months
We're cooking up a FLUX.1-dev IP Adapter at @heyglif using SigLip 512 base for the vision encoder. It is still cooking, but it is getting there. Current samples (left input - right output).
Tweet media one
14
28
253
@ostrisai
Ostris
27 days
I haven't spoken much about my ongoing project to de-distill schnell to make a permissive licensed version of flux, but I have been updating it periodically as it trains. I just noticed it is the #2 trending text-to-image model on Hugging Face. Working on aesthetic tuning now.
Tweet media one
13
28
249
@ostrisai
Ostris
3 months
Doing a full transformer finetune of FLUX.1 schnell at bf16 with T5 at 8bit is using ~ 75GB VRAM.
Tweet media one
@ostrisai
Ostris
3 months
Testing training a LoRA for FLUX.1 ... 38.5GB VRAM!!
Gradient checkpointing
Rank 256 (Big, I know)
T5 in 8bit
Full BF16
It is a big boy.
Tweet media one
13
35
268
11
23
243
@ostrisai
Ostris
5 months
I trained a new VAE with 16x depth and 42 channels (kl-f16-d42). I am now training SD1.5 to work with it, which will double the output size of SD1.5 without much additional compute overhead. Every time I train a new latent space, it always starts out inverted. It's so odd.
14
34
234
@ostrisai
Ostris
8 months
I just released a new IP adapter for SD 1.5 I'm calling a Composition Adapter. It transfers the general composition of an image into a model while ignoring the style / content. A special thanks to @peteromallet , it was their idea. Samples in🧵
12
49
221
@ostrisai
Ostris
3 months
Honestly, the best way to use FLUX.1 is on @heyglif . Claude Sonnet 3.5 + FLUX.1 working together is absolutely insane! Made a quick glif that has Claude prompt a website design for FLUX.1-dev to generate, and it is mind blowing! Link to this glif in thread
Tweet media one
Tweet media two
Tweet media three
Tweet media four
4
17
214
@ostrisai
Ostris
8 months
Just added a SDXL version of the IP Composition Adapter, which injects the general composition of an image into the model, while mostly ignoring content and style. It now supports SDXL and SD 1.5. Some samples in 🧵
7
34
204
@ostrisai
Ostris
3 months
Training sample from a training run using a schnell training adapter I have been working on. 1200 steps in. Still hasn't broken down. Trained on and sampled with FLUX.1 schnell.
Tweet media one
10
12
200
@ostrisai
Ostris
2 months
I have been testing this out, and it works amazingly well. Samples from a 2k step run on The Ghoul from the Fallout series.
Tweet media one
Tweet media two
@__TheBen
TheLastBen
2 months
Two layers is all you need. Two layers trained, 128 dim, 9mb flux LoRA
Tweet media one
Tweet media two
Tweet media three
Tweet media four
39
49
614
10
13
180
@ostrisai
Ostris
2 months
It won't be long now. Note: This is not a traditional IP adapter. We are going directly from the SigLip 512 base last hidden state into the attn layers. We are fine tuning the entire vision encoder as well, it is a small encoder but has more than 2x the resolution of CLIP.
Tweet media one
@ostrisai
Ostris
2 months
We're cooking up a FLUX.1-dev IP Adapter at @heyglif using SigLip 512 base for the vision encoder. It is still cooking, but it is getting there. Current samples (left input - right output).
Tweet media one
14
28
253
8
17
182
@ostrisai
Ostris
3 months
Just kicked out a significant bugfix for ai-toolkit that should have a dramatic increase in quality when training Flux, especially on fine details. The artifacts should be gone. You are probably going to want to update to the latest if you are using it.
9
18
174
@ostrisai
Ostris
23 days
I am honestly pretty happy with how OpenFLUX is turning out. I never expected to actually get it to where it is currently, and it still has a long way to go before it is where I want it to be.
Tweet media one
13
7
173
@ostrisai
Ostris
9 days
I added support for training LoRAs on SD3.5 Large at 8bit on 24GB GPU to ai-toolkit. Still doing some testing and will likely make some tweaks to it, but it is there if you early birds want to test it out.
5
30
161
@ostrisai
Ostris
2 months
Experimenting with skipping Flux blocks. First image is all blocks. 2nd image is skipping MMDiT blocks 3, 6, 7, 8, 9, 10, 13. With a little tuning, it would improve further. Prompt: a woman with pink hair standing in a forest, holding a sign that says 'Skipping flux blocks'
Tweet media one
Tweet media two
16
17
153
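Block skipping works because each transformer block sits on a residual path, so dropping one leaves a valid identity route through the network. A toy sketch of the idea (residual MLP blocks standing in for MMDiT blocks; `SkippableStack` is a hypothetical name, not FLUX code):

```python
import torch
import torch.nn as nn

class SkippableStack(nn.Module):
    """Toy residual stack whose blocks can be skipped by index at
    inference -- the same idea as dropping MMDiT blocks in FLUX."""
    def __init__(self, n_blocks=14, dim=32):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.GELU())
            for _ in range(n_blocks)
        )

    def forward(self, x, skip=()):
        for i, block in enumerate(self.blocks):
            if i in skip:
                continue  # residual path only: x passes through unchanged
            x = x + block(x)
        return x
```

Skipping every block degenerates to the identity, which is why removing a handful of mid-network blocks degrades output gracefully rather than breaking it.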
@ostrisai
Ostris
6 months
SD1.5 with a Flan T5 XXL text encoder is cooking 🔥 with parent teacher training. >400k steps in. Most generic concepts are transferred. I am really loving how it is turning out so far.
Tweet media one
15
15
137
@ostrisai
Ostris
2 months
I have had twitter for a year, as of today. Somehow, I have acquired close to 4k followers from almost exclusively posting about ML experiments I am working on. I never expected anyone to be interested in my work. Thank you all for nerding out with me this year #MyXAnniversary
Tweet media one
4
0
113
@ostrisai
Ostris
25 days
Toying with an idea of a living community model for the next step for open flux. Allow the community to fine-tune the model on their own datasets and target identified weak spots using set training configs. Then, once a week/month, all of these are merged into the base model.
22
5
108
@ostrisai
Ostris
1 month
Flux face adapter.... is.. working? ish..
Tweet media one
Tweet media two
16
6
105
@ostrisai
Ostris
5 months
PixArt Sigma is now ranked higher than SD3 on imgsys. We all need to start giving PixArt more love. Plus, it is openrail++.
Tweet media one
6
13
98
@ostrisai
Ostris
4 months
This simple change allows you to use 4090s in a datacenter. Follow me for more life hacks.
Tweet media one
@vikhyatk
vik
4 months
@giffmana I just checked the GeForce license and it looks like they carved out an exception for crypto. So if I find a way to put this on the blockchain…
Tweet media one
3
2
33
1
10
98
@ostrisai
Ostris
7 months
Stable Diffusion 1.5 but with the CLIP Big G text encoder. This was an experiment that I probably dedicated too much compute to. To realize its full potential, it needs some proper fine tuning. Regardless, here it is and it works with 🤗inference api.
4
17
97
@ostrisai
Ostris
9 days
Big day today, but I'm finally getting around to testing out mochi-1-preview from @genmoai . It is pretty amazing for an Apache 2.0 model. Now I just need to find a way to run it locally seeing that I don't exactly have 4 H100 GPUs laying around.
7
5
93
@ostrisai
Ostris
3 months
I have an experimental de-compressed version of Flux-schnell trained. It won't generate well on its own without the guidance embeddings. So I am training those from scratch now. I am also training my first real LoRA test on it (IKEA instructions). 🤞
6
4
87
@ostrisai
Ostris
16 days
I ran a test last night to completely remove the double transformer layers on flux and train only the first 9 single layers, to see if the model could learn to function without the first half of the network. It seems to be working. 🧵
3
12
87
@ostrisai
Ostris
2 months
I added a post with the results of skipping each block in FLUX.1 dev here ->
@ostrisai
Ostris
2 months
Experimenting skipping Flux blocks. First image is all blocks. 2nd image is skipping MMDiT blocks 3, 6, 7, 8, 9, 10, 13. With a little tuning, it would improve farther. Prompt: a woman with pink hair standing in a forest, holding a sign that says 'Skipping flux blocks'
Tweet media one
Tweet media two
16
17
153
9
11
86
@ostrisai
Ostris
1 month
New training best practice, random case dropout.
@multimodalart
apolinario 🌐
1 month
reminder for flux: prompting is case-sensitive 𝙰𝚊 left: Mark Zuckerberg eating pasta right: mark zuckerberg eating pasta same seed
Tweet media one
Tweet media two
27
44
657
1
3
83
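The "random case dropout" idea above is simple enough to sketch as a caption preprocessing step. This is an assumed implementation (the probability, the whole-caption granularity, and the function name are all guesses at the practice, not the author's code):

```python
import random

def random_case_dropout(caption, p=0.3, seed=None):
    """With probability p, lowercase the whole training caption so the
    model sees both capitalizations and does not overfit to one.
    Hypothetical sketch; p=0.3 is an arbitrary choice."""
    rng = random.Random(seed)
    return caption.lower() if rng.random() < p else caption
```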
@ostrisai
Ostris
10 months
TinyLlama is amazing! I have been waiting on a <3B permissive model to come out. Fine tuning small LLMs to do very specific tasks has so much potential. I loaded it up in my prompt upsampler and it works shockingly well. 🧵
4
8
77
@ostrisai
Ostris
1 month
Working on a clothing IP adapter for Flux... Shirt is getting close. Face is.... not....
Tweet media one
Tweet media two
4
5
70
@ostrisai
Ostris
4 months
The SD1.5 version is probably done. Currently at 240k steps . I am trying to cleanup fine detail, but it may have reached the limit of what synthetic data from the parent model can achieve. Will run it through the night on some high res fix images which will hopefully help.
@ostrisai
Ostris
4 months
75k steps in on training the adapter for SDXL. First ~30k steps were just on the new conv_in/conv_out layers. Then I added the LoRA (lin 64, conv 32). It is going to be a while, but it is coming along.
1
5
59
5
3
64
@ostrisai
Ostris
3 months
Be super careful with your FLUX.1 training captions people.
@araminta_k
Mint
3 months
Everyone who said captions don't do anything for Flux is wrong because I captioned a dog as a "cat" and it was ONE picture, and now the model is beautiful except whenever I prompt cat or kitten it gives me dogs. Also shoutout to @comfydeploy for the awesome service.
Tweet media one
8
8
96
2
3
60
@ostrisai
Ostris
2 months
Excellent work @multimodalart !
@multimodalart
apolinario 🌐
2 months
FLUX.1 ai-toolkit now has an official UI 🖼️ with @Gradio With this open source UI you can 💻, locally or any cloud: - Drag and drop images 🖱️ - Caption them ✏️ (or use AI to caption 🤖) - Start training 🏃 No code/yaml needed 😌 Thanks for merging my PR @ostrisai 🔥
Tweet media one
9
96
515
0
2
61
@ostrisai
Ostris
4 months
75k steps in on training the adapter for SDXL. First ~30k steps were just on the new conv_in/conv_out layers. Then I added the LoRA (lin 64, conv 32). It is going to be a while, but it is coming along.
@ostrisai
Ostris
4 months
Releasing my 16ch VAE (KL-f8-d16) today (also). MIT license, lighter weight than SD3 VAE (57,266,643 params vs 83,819,683), similar test scores, smaller, faster, opener. I'm currently training adapters for SD 1.5, SDXL, and PixArt to use it (coming soon)
9
45
264
1
5
59
@ostrisai
Ostris
5 months
Training my first SD3 LoRA. It is hacky and probably won't be able to run on anything other than my trainer for now, but it is cooking. I am sure I am missing some stuff, but we will see.
Tweet media one
5
0
56
@ostrisai
Ostris
4 months
The prompt comprehension is incredible! #auraflow "a cat that is half orange tabby and half black, split down the middle. Holding a martini glass with a ball of yarn in it. He has a monocle on his left eye, and a blue top hat, art nouveau style "
Tweet media one
3
5
54
@ostrisai
Ostris
18 days
I'm still off and on cooking a version of SD1.5 that has a 16ch VAE and T5XL text encoder. It takes forever to learn the fine detail since it is 4x the amount of data in a 16ch VAE. It is frustrating because it is so close, yet still so far.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
9
1
55
@ostrisai
Ostris
5 months
Released a LoRA for SDXL that converts the latent space to the SD1/2 latent space.
@ostrisai
Ostris
6 months
Training samples from a little pet project. SDXL LoRA that converts the SDXL latent space to the SD1/2 latent space. I have been training it off and on for a while and think it is probably close to done.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
3
3
38
3
4
53
@ostrisai
Ostris
11 days
I trained Marty McFly on 6 images for 1500 steps and 6 LoRAs on each image for 250 steps each, and merged them together on inference. In short, no, it does not work as well as training on all images at the same time.
Tweet media one
Tweet media two
@ostrisai
Ostris
11 days
I am curious if you would get similar results from training a LoRA on 10 images as you would training 10 LoRAs on single images with 1/10th the steps, and then merging them together. Has anyone tried anything like this?
12
0
33
9
1
53
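The merge half of this experiment is just a key-wise weighted average of the LoRA state dicts. A minimal sketch, assuming all the LoRAs share keys and shapes (`merge_loras` is a hypothetical helper, not an ai-toolkit function):

```python
import torch

def merge_loras(state_dicts, weights=None):
    """Average several LoRA state dicts key-by-key -- the naive merge
    tried in the experiment above. Assumes identical keys/shapes."""
    n = len(state_dicts)
    weights = weights or [1.0 / n] * n
    merged = {}
    for key in state_dicts[0]:
        merged[key] = sum(w * sd[key] for w, sd in zip(weights, state_dicts))
    return merged
```

Averaging low-rank updates trained on single images is not equivalent to training on the joint dataset, which is consistent with the worse result reported above.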
@ostrisai
Ostris
2 months
A year ago, when I was building the sampling mechanism for ai-toolkit, I accidentally left my name (Jaret) hard coded as the trigger word, buried deep in the code. It really freaked me out and made me question reality when my name started showing up in all the training samples.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
6
1
52
@ostrisai
Ostris
7 months
Cooking a style only IP adapter for SDXL. It still has a ways to go, but it is looking promising.
Tweet media one
3
3
48
@ostrisai
Ostris
4 months
To me, the most exciting thing about Auraflow is that it is actually an open source license, Apache 2.0. CreativeRail++, while being permissive, is not actually an OSI compliant license. I am super excited to sink my GPUs into it this weekend!
2
4
48
@ostrisai
Ostris
26 days
Almost every VLM captions like: "The image appears to possibly feature a man who could be in a mood some would describe as happy" vs how people prompt: "A happy man". And they all seem to ignore my instructions to do otherwise.
20
0
48
@ostrisai
Ostris
4 months
Changing name to LittleDiT since I decided to increase the size a bit. Moved from T5 base to T5 large and going with 20 blocks in DiT vs 10. Still a lot smaller than SD1.5 with everything baked in. Now we cook, for a long time. Current samples attached.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
3
0
48
@ostrisai
Ostris
11 months
Training a Stable Diffusion LoRA that can do 1 step is HARD. You have to get pretty creative with the timestep to keep from going to pure adversarial loss, which I don't want to do. I think I have it now. Just needs to cook for a while. Current train samples of 1 vs 2 step. SD1.5
Tweet media one
Tweet media two
5
4
46
@ostrisai
Ostris
8 months
The IP composition adapter is sitting on #5 on 🤗trending text to image models. Thank you all for the support.
Tweet media one
2
2
46
@ostrisai
Ostris
7 months
Style IP Adapter for SDXL is coming along. I love the impasto style, people. Some content is still coming through. Working on that. I also figured out a novel way to compensate for inference CFG during training to prevent over saturation. Hopefully done tomorrow.
Tweet media one
Tweet media two
2
4
44
@ostrisai
Ostris
3 months
@bfl_ml @araminta_k For personalization I also did a celeb test with Christina Hendricks since the model didn't seem to know her well. It works well on realism personalization. Attached is before and after training a LoRA on FLUX.1-dev on Christina Hendricks.
Tweet media one
Tweet media two
4
2
45
@ostrisai
Ostris
4 months
Tiny DiT: Images are coming through. It is basically going to need a full fine tune though. Debating on committing to that because I REALLY want this. Entire model w/ TE is 646MB. Full fine tune takes 2.4GB VRAM and I can train at a BS of > 240 on a 4090.
Tweet media one
Tweet media two
@ostrisai
Ostris
4 months
Saturday experiment: Retrained xattn layers on PixArt Sigma to take T5 base (much smaller). It works surprisingly well. Merge reduced the number of blocks in the transformer from 28 to 10. Just popped it in the oven (full tune). Now we wait. Who wants a tiny DiT to play with?
2
5
38
4
6
44
@ostrisai
Ostris
7 days
I was curious how difficult it would be to train a guidance LoRA so you can infer without CFG and burn in the negative prompt. Turns out, it is pretty easy to do. My test was with OpenFLUX.1, but should work with any model requiring CFG and will double inference speed.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
7
3
62
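What the guidance LoRA bakes in is standard classifier-free guidance, which normally costs two model calls per step. A sketch of the combination being distilled away, assuming a generic noise-prediction interface `model(x, cond)` (hypothetical signature, not the OpenFLUX.1 API):

```python
import torch

def cfg_output(model, x, cond, uncond, scale=4.0):
    """Classifier-free guidance: two forward passes per step, combined.
    A guidance LoRA aims to fold this (negative prompt included) into a
    single conditional pass, which is where the 2x speedup comes from."""
    eps_cond = model(x, cond)
    eps_uncond = model(x, uncond)
    return eps_uncond + scale * (eps_cond - eps_uncond)
```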
@ostrisai
Ostris
2 months
Testing some flux vision adapter training today. My favorite generations have always been the oddities generated while messing with the cross attn layers.
Tweet media one
Tweet media two
3
0
44
@ostrisai
Ostris
5 months
Training montage for those who enjoy watching models train as much as I do.
@ostrisai
Ostris
5 months
Doing a test of training SD1.5 to use a 16 channel/16 depth VAE so it will generate natively at 1024 with same compute requirements of 512. ~ 300k steps in so far. It is working but taking FOREVER.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
4
1
33
1
0
42
@ostrisai
Ostris
1 year
New Stable Diffusion XL LoRA - "Super Cereal". Turn anything into a cereal box. Special thanks to @multimodalart and @huggingface for the compute! HF -> Civitai ->
Tweet media one
Tweet media two
Tweet media three
Tweet media four
2
6
43
@ostrisai
Ostris
3 months
Support for these LoRAs works out of the box with Diffusers. I also finished the weight mapping for ComfyUI and have a PR for that here That will enable full support for ComfyUI. You can check out my fork in the interim.
1
3
43
@ostrisai
Ostris
9 months
Everyone with an AI video startup right now
1
4
43
@ostrisai
Ostris
8 days
First LoRA test I tried was with live action Cruella because in my pruning attempts, anatomy broke down. Plus that hair and her clothing is hard to learn. It worked out well. Final step training samples attached.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
@ostrisai
Ostris
8 days
How did I miss this? Too many releases. @freepik Awesome work! It works out of the box with ai-toolkit. Training first test LoRA now.
11
63
437
5
0
47
@ostrisai
Ostris
7 months
I logged into Civitai for the first time in a long time. There is just blatant unchecked child porn everywhere. I took all of my models off of there. I do not support nor want to be associated with that in any way shape or form.
9
1
42
@ostrisai
Ostris
4 months
SDXL 16ch VAE adapter is still coming along, but it still has a long way to go.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
1
1
42
@ostrisai
Ostris
3 months
I generated a few hundred regularization images for training with FLUX.1-dev, and it kept generating Donald Trump completely unprompted. These prompts were: "a man in a suit and tie talking to reporters" "a man with a blue tie and a black jacket" "a man in a suit and green tie"
Tweet media one
Tweet media two
Tweet media three
10
1
40
@ostrisai
Ostris
1 year
It has taken dozens of iterations and keyboard smashing to figure out all of the math. Hours to curate and modify the 3k guidance image pairs. But it is working!! I will fix stable diffusion hands! Training sample from my private SDXL realism model. Step 0 and 600
Tweet media one
6
1
39
@ostrisai
Ostris
6 months
Training samples from a little pet project. SDXL LoRA that converts the SDXL latent space to the SD1/2 latent space. I have been training it off and on for a while and think it is probably close to done.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
3
3
38
@ostrisai
Ostris
4 months
16ch SDXL VAE adapter sample image. Prompt: "woman playing the guitar, on stage, singing a song, laser lights, punk rocker". In many ways, the 4ch VAE made training easier because the VAE made up most of the fine details. Now, the unet has to learn it. Needs to cook more.
Tweet media one
4
3
39
@ostrisai
Ostris
4 months
Saturday experiment: Retrained xattn layers on PixArt Sigma to take T5 base (much smaller). It works surprisingly well. Merge reduced the number of blocks in the transformer from 28 to 10. Just popped it in the oven (full tune). Now we wait. Who wants a tiny DiT to play with?
2
5
38
@ostrisai
Ostris
3 months
@McintireTristan Dataset is thousands of images generated by Flux Schnell. Fine tuning is not possible because it is distilled. What I am doing is intentionally breaking down the step compression to hopefully end up with a training base that we can train LoRAs on that will work with schnell.
8
1
36
@ostrisai
Ostris
2 months
Hm... Learning rate may have been a bit too high.
Tweet media one
Tweet media two
10
1
36
@ostrisai
Ostris
29 days
@DMBTrivia @kohya_tech It was trained on thousands of schnell generated images with a low LR. The goal was to not teach it new data, and only to unlearn the distillation. I tried various tricks at different stages to speed up breaking down the compression, but the one that worked best was training with
1
7
36
@ostrisai
Ostris
6 months
Saturday experiment: Single value adapter. I trained feeding a single -1 to 1 float directly into key/val linear layers that corresponds to eye size in images, and apply that to the cross attn layers. It works. Sample images are -1.0 and 1.0 pairs. Time to add more features.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
8
4
35
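The single-value adapter idea can be sketched as a tiny module that maps one scalar into key/value features for a cross-attention layer. This toy shows only that mapping; how the features are actually wired into SD's attention is not shown in the post, and `SingleValueAdapter` is a hypothetical name:

```python
import torch
import torch.nn as nn

class SingleValueAdapter(nn.Module):
    """Map a single float in [-1, 1] (e.g. eye size) to key/value
    features that would be concatenated into cross-attention."""
    def __init__(self, dim=64):
        super().__init__()
        self.to_k = nn.Linear(1, dim)
        self.to_v = nn.Linear(1, dim)

    def forward(self, value):
        v = torch.tensor([[float(value)]])  # shape (1, 1): one "token"
        return self.to_k(v), self.to_v(v)
```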
@ostrisai
Ostris
2 months
Having some issues keeping up with direct messages and tags on here and discord. If I haven’t gotten back to you yet, I am sorry. A little overwhelmed with messages at the moment.
1
0
34
@ostrisai
Ostris
3 months
OMG! LFG!
@bfl_ml
Black Forest Labs
3 months
Today we release the FLUX.1 suite of models that push the frontiers of text-to-image synthesis.  read more at
Tweet media one
42
147
820
2
0
34
@ostrisai
Ostris
4 months
I cannot get my 16ch adapter for SD1.5 where I want it without doing a full fine tune. I also cannot get my flan T5xl adapter there either. So I merged them into a single model together and I am doing a full tune of a T5xl- 16ch-SD1.5 model. We will see.
3
0
34
@ostrisai
Ostris
4 months
"You will not use the Stability AI Materials or Derivative Works, or any output .... to create or improve any foundational generative AI model" Is still in there. My understanding was that this is why @HelloCivitai refused to host it in the first place.
@StabilityAI
Stability AI
4 months
At Stability AI, we’re committed to releasing high-quality Generative AI models and sharing them generously with our community of innovators and media creators.  We acknowledge that our latest release, Stable Diffusion 3 Medium, didn’t meet our community’s high expectations, and
Tweet media one
70
167
643
7
3
34
@ostrisai
Ostris
5 months
Doing a test of training SD1.5 to use a 16 channel/16 depth VAE so it will generate natively at 1024 with same compute requirements of 512. ~ 300k steps in so far. It is working but taking FOREVER.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
4
1
33
@ostrisai
Ostris
3 months
Here is an example for the OpenFLUX.1 project that has been training over a week, so the artifacts are much more pronounced. The left image is where it was, and it was driving me crazy. The right image is 250 steps after implementing the fix.
Tweet media one
Tweet media two
5
2
33
@ostrisai
Ostris
4 months
I trained another VAE (kl-f8-d16 -16 ch), and I test trained SD 1.5 to use it. It picked it up very quick, but the fine details need work. Overall, the test worked. Trying to decide if I want to switch to SDXL or PixArt Sigma. Thoughts?
Tweet media one
Tweet media two
Tweet media three
Tweet media four
8
4
33
@ostrisai
Ostris
11 days
I am curious if you would get similar results from training a LoRA on 10 images as you would training 10 LoRAs on single images with 1/10th the steps, and then merging them together. Has anyone tried anything like this?
12
0
33
@ostrisai
Ostris
3 months
Tweet media one
1
0
32
@ostrisai
Ostris
30 days
I absolutely love working at glif and you would too.
@fabianstelzer
fabian
30 days
friends, we're hiring staff level product engineers at Glif. RTs / forwarding much appreciated 💜 Glif is a powerful & fun AI sandbox, where talented creators build, remix and share AI microapps (aka glifs) with hundreds of thousands of players glifs are a new media format for
Tweet media one
80
37
226
2
1
31
@ostrisai
Ostris
3 months
It is interesting and raises questions about synthetic captioning as similar images were likely synthetically captioned as "a man in a suit". It also makes me curious how many of the "not real people" it generates look almost exactly like some real people in the dataset.
8
1
31
@ostrisai
Ostris
4 months
6 years ago I spent days creating a dataset and fine tuning a classifier on Star Wars images so I could use it in deep dream. Instead of dog faces, there are droids, Chewbacca fur, and Boba Fett helmets. This blew people’s minds back then. We have come a long way.
Tweet media one
4
2
30
@ostrisai
Ostris
3 months
Ran a quick LoRA test doing this on SD 1.5. It added a lot of fine detail. Not sure what a longer train run would do as it also increases the contrast. Before and after samples attached
Tweet media one
Tweet media two
Tweet media three
Tweet media four
@ostrisai
Ostris
3 months
Instead of adjusting the scaling factor, one could probably just multiply the noise by 0.934 during training to increase the fine detail. I will test this theory.
1
0
10
3
4
29
@ostrisai
Ostris
2 months
The vision encoder adapter I am testing for Flux dev may not be working right in the traditional sense, but it makes some interesting things.
Tweet media one
Tweet media two
Tweet media three
Tweet media four
2
1
29
@ostrisai
Ostris
1 month
2
2
28
@ostrisai
Ostris
1 year
This is absolutely awesome. Results from my initial tests look good and it generates similar structured images to SDXL with same prompt and noise. Doing a fine tuning run now.
6
1
28
@ostrisai
Ostris
2 months
I want to also note that it will run fine on 24gb with the entire flux model and ip adapter loaded into vram. In fact, I did most of the pretraining on a 4090, so you can fine tune it for specific tasks (style, composition, identity) on one as well.
4
2
26
@ostrisai
Ostris
3 months
I'm training SDXL to use my 16ch VAE, and adjusting the VAE scale factor. The first image is training with a calculated norm scaler. The second is when I adjust it to a lower number, which is clearly closer to the ideal value. Any idea how to calculate the ideal value?
Tweet media one
Tweet media two
5
1
25
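The "calculated norm scaler" here is conventionally derived so that scaled latents have roughly unit variance: encode a batch of training images and take 1/std of the latents (this is how SD's well-known 0.18215 factor was obtained). A minimal sketch with a hypothetical helper name:

```python
import torch

def latent_scale_factor(latents):
    """Estimate a VAE scaling factor as 1 / std over a batch of
    encoded latents, so scaled latents are ~unit variance."""
    return 1.0 / latents.flatten().std().item()
```

That this calculated value works worse than a hand-tuned lower number suggests unit variance is not quite the ideal target for this 16ch VAE, which is the open question in the tweet.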
@ostrisai
Ostris
7 months
@victormustar Add a label if the license is open source compliant or non open source compliant. Hugging Face is full of Fauxpen source models claiming to be "open" and "open source" when they are not.
Tweet media one
2
0
25