Ostris @ostrisai Twitter profile

Last Seen Profiles

@7by7tamaki

@tkminnm

@pho_bitl

@BinorRaja

@yumemiru_u27

@MatsuoKazuhi

@piddleandpurr

@N2Nf9

@SayangHijabers

@weyolll5

@maaakungames

@KlingelLai

@Lars1205

@opposingforce98

@jamesrollings1

@XiaoyiC

@Msre6t0BLKFp8uC

@timajun_BNB

@speedwaydigest

@Youssou59197583

@soo1311_

@MoellerLei77219

@borsanin_gulu

@SoapsSpoilers

@DNZAbass

@stw_pdg

@kinkysirnick

@CFC_Alpha01

@ThePeachyCo_

@SweetNicoli

@rendesu_421

@SeekerOfNights

@tobrut_only

@sotthes

@Schack_REISA

@rachel_peet_

Ostris

@ostrisai

3 months

Did a lot of testing on my LoRA training script for @bfl_ml FLUX.1 dev model. Amazing model! I think it is finally ready. Running smooth on a single 4090. Posting a guide tomorrow. Special thanks to @araminta_k for helping me test on her amazing original character and artwork.

35

98

688

Ostris

@ostrisai

9 months

I teared up a bit. I am extremely excited, but also feel completly inadequate in literally everything I have ever worked on. Ever. It is absolutely stunning and humbling to watch. I need a drink.

OpenAI

@OpenAI

9 months

Prompt: “A movie trailer featuring the adventures of the 30 year old space man wearing a red wool knitted motorcycle helmet, blue sky, salt desert, cinematic style, shot on 35mm film, vivid colors.”

487

1K

12K

24

30

675

Ostris

@ostrisai

2 months

Testing Flux.1 schnell embedding interpolation. I told claude some details about my life and had it generate 40 prompts from being born -> being dead using the data I gave it. Then interpolated between the embeddings for this video. Uses @araminta_k softpasty LoRA

31

86

649

Ostris

@ostrisai

3 months

IT WORKS!! I trained an IKEA instruction LoRA on a decompressed version of FLUX.1 schnell , and it works on schnell at 4 step! The choppiness should go away when I get the guidance embedding trained. Still a WIP, but it works! There is hope!

22

58

538

Ostris

@ostrisai

1 year

New Stable Diffusion XL LoRA, Ikea Instructions. SDXL does an amazingly hilarious job at coming up with how to make things. Special thanks to @multimodalart and @huggingface for the GPU grant!! HF -> Civitai ->

13

79

485

Ostris

@ostrisai

3 months

AI-Toolkit now officially supports training LoRAs directly on FLUX.1-schnell. 🥳 How to do it here -> The adapter that makes it possible here ->

ostris/FLUX.1-schnell-training-adapter · Hugging Face

huggingface.co

14

85

484

Ostris

@ostrisai

2 months

FLUX.1 schnell text embedding interpolation test. Silver balloons floating in the street.

13

41

463

Ostris

@ostrisai

8 days

How did I miss this? Too many releases. @freepik Awesome work! It works out of the box with ai-toolkit. Training first test LoRA now.

Freepik/flux.1-lite-8B-alpha · Hugging Face

huggingface.co

11

63

437

Ostris

@ostrisai

11 months

Early alpha demo of my "virtual try on" I have been working on. Load a few photos of a person, a photo of a top, enter a prompt, and instantly render them wearing it in any scene you want. Special thanks to my beautiful wife for letting me use her likeness.

16

49

386

Ostris

@ostrisai

3 months

Working to get out a full tutorial for FLUX.1-dev LoRA training on a 24GB card. For now, I updated the read me and added an example config file. Should be enough to get many of you going. Will be updating as I go.

11

32

314

Ostris

@ostrisai

2 months

Soon.. It is still cooking, but I am starting to get extremely excited about it.

19

16

288

Ostris

@ostrisai

3 months

Testing training a LoRA for FLUX.1 ... 38.5gb VRAM!! Gradient checkpointing Rank 256 (Big, I know) T5 in 8bit Full BF16 It is a big boy.

13

35

268

Ostris

@ostrisai

4 months

Releasing my 16ch VAE (KL-f8-d16) today (also). MIT license, lighter weight than SD3 VAE (57,266,643 params vs 83,819,683), similar test scores, smaller, faster, opener. I'm currently training adapters for SD 1.5, SDXL, and PixArt to use it (coming soon)

ostris/vae-kl-f8-d16 · Hugging Face

huggingface.co

9

45

264

Ostris

@ostrisai

3 months

I take this back. I managed to squeeze LoRA training for FLUX.1-schnell-train on a single 4090 with 8bit mixed precision. We will see how well it works. 3s/iter.

Ostris

@ostrisai

3 months

This can be optimized further. I think you could maybe do mixed precision 8bit quantization on the transformer (maybe). But, no matter how optimized it gets, I don't think it will ever be possible to train on current consumer hardware (<=24gb). Someone please prove me wrong.

3

1

25

13

26

256

Ostris

@ostrisai

2 months

We're cooking up a FLUX.1-dev IP Adapter at @heyglif using SigLip 512 base for the vision encoder. It is still cooking, but it is getting there. Current samples (left input - right output).

14

28

253

Ostris

@ostrisai

27 days

I haven't spoken much about my ongoing project to de-distill schnell to make a permissive licensed version of flux, but I have been updating it periodically as it trains. I just noticed it is the #2 trending text-to-image model on Hugging Face. Working on aesthetic tuning now.

13

28

249

Ostris

@ostrisai

3 months

Doing a full transformer finetune of FLUX.1 schnell at bf16 with T5 at 8bit is using ~ 75GB VRAM.

Ostris

@ostrisai

3 months

Testing training a LoRA for FLUX.1 ... 38.5gb VRAM!! Gradient checkpointing Rank 256 (Big, I know) T5 in 8bit Full BF16 It is a big boy.

13

35

268

11

23

243

Ostris

@ostrisai

5 months

I trained a new VAE with 16x depth and 42 channels (kl-f16-d42). I am now training SD1.5 to work with it, which will double the output size of SD1.5 without much additional compute overhead. Every time I train a new latent space, it always starts out inverted. It's so odd.

14

34

234

Ostris

@ostrisai

8 months

I just released a new IP adapter for SD 1.5 I'm calling a Composition Adapter. It transfers the general composition of an image into a model while ignoring the style / content. A special thanks to @peteromallet , it was their idea. Samples in🧵

ostris/ip-composition-adapter · Hugging Face

huggingface.co

12

49

221

Ostris

@ostrisai

3 months

Honestly, the best way to use FLUX.1 is on @heyglif . Claude Sonnet 3.5 + FLUX.1 working together is absolutely insane! Made a quick glif that has Claude prompt a website design for FLUX.1-dev to generate, and it is mind blowing! Link to this glif in thread

4

17

214

Ostris

@ostrisai

8 months

Just added a SDXL version of the IP Composition Adapter, which injects the general composition of an image into the model, while mostly ignoring content and style. It now supports SDXL and SD 1.5. Some samples in 🧵

ostris/ip-composition-adapter · Hugging Face

huggingface.co

7

34

204

Ostris

@ostrisai

3 months

Training sample from a training run using a schnell training adapter I have been working on. 1200 steps in. Still hasn't broken down. Trained on and sampled with FLUX.1 schnell.

10

12

200

Ostris

@ostrisai

2 months

I have been testing this out, and it works amazingly well. Samples from a 2k step run on The Ghoul from the Fallout series.

TheLastBen

@__TheBen

2 months

Two layers is all you need. Two layers trained, 128 dim, 9mb flux LoRA

39

49

614

10

13

180

Ostris

@ostrisai

2 months

It won't be long now. Note: This is not a traditional IP adapter. We are going directly from the SigLip 512 base last hidden state into the attn layers. We are fine tuning the entire vision encoder as well, it is a small encoder but has more than 2x the resolution of CLIP.

Ostris

@ostrisai

2 months

We're cooking up a FLUX.1-dev IP Adapter at @heyglif using SigLip 512 base for the vision encoder. It is still cooking, but it is getting there. Current samples (left input - right output).

14

28

253

8

17

182

Ostris

@ostrisai

3 months

Just kicked out a significant bugfix for ai-toolkit that should have a dramatic increase in quality when training Flux, especially on fine details. The artifacts should be gone. You are probably going to want to update to the latest if you are using it.

9

18

174

Ostris

@ostrisai

23 days

I am honestly pretty happy with how OpenFLUX is turning out. I never expected to actually get it to where it is currently, and it still has a long way to go before it is where I want it to be.

13

7

173

Ostris

@ostrisai

9 days

I added support for training LoRAs on SD3.5 Large at 8bit on 24GB GPU to ai-toolkit. Still doing some testing and will likely make some tweaks to it, but it is there if you early birds want to test it out.

Added preliminary support for SD3.5-large lora training · ostris/ai-toolkit@3400882

github.com

5

30

161

Ostris

@ostrisai

2 months

Experimenting skipping Flux blocks. First image is all blocks. 2nd image is skipping MMDiT blocks 3, 6, 7, 8, 9, 10, 13. With a little tuning, it would improve farther. Prompt: a woman with pink hair standing in a forest, holding a sign that says 'Skipping flux blocks'

16

17

153

Ostris

@ostrisai

6 months

SD1.5 with a Flan T5 XXL text encoder is cooking 🔥with parent teacher training. >400k steps in. Most generic concepts are transferred. I am really loving how it is turning out so far.

15

137

Ostris

@ostrisai

2 months

I have had twitter for a year, as of today. Somehow, I have acquired close to 4k followers from almost exclusively posting about ML experiments I am working on. I never expected anyone to be interested in my work. Thank you all for nerding out with me this year #MyXAnniversary

4

0

113

Ostris

@ostrisai

25 days

Toying with an idea of a living community model for the next step for open flux. Allow the community to fine-tune the model on their own datasets and target identified weak spots using set training configs. Then, once a week/month, all of these are merged into the base model.

22

5

108

Ostris

@ostrisai

1 month

Flux face adapter.... is.. working? ish..

16

6

105

Ostris

@ostrisai

5 months

PixArt Sigma is now ranked higher than SD3 on imgsys. We all need to start giving PixArt more love. Plus, it is openrail++.

6

13

98

Ostris

@ostrisai

4 months

This simple change allows you to use 4090s in a datacenter. Follow me for more life hacks.

vik

@vikhyatk

4 months

@giffmana I just checked the GeForce license and it looks like they carved out an exception for crypto. So if I find a way to put this on the blockchain…

3

2

33

1

10

98

Ostris

@ostrisai

7 months

Stable Diffusion 1.5 but with the CLIP Big G text encoder. This was an experiment that I probably dedicated too much compute to. To realize its full potential, it needs some proper fine tuning. Regardless, here it is and it works with 🤗inference api.

ostris/sd15-big-g-alpha · Hugging Face

huggingface.co

4

17

97

Ostris

@ostrisai

9 days

Big day today, but I'm finally getting around to testing out mochi-1-preview from @genmoai . It is pretty amazing for an Apache 2.0 model. Now I just need to find a way to run it locally seeing that I don't exactly have 4 H100 GPUs laying around.

7

5

93

Ostris

@ostrisai

3 months

I have an experimental de-compressed version of Flux-schnell trained. It won't generate well on its own without the guidance embeddings. So I am training those from scratch now. I am also training my first real LoRA test on it (IKEA instructions). 🤞

6

4

87

Ostris

@ostrisai

16 days

I ran a test last night to completely remove the double transformer layers on flux and only training the first 9 single layers to see if it the model could learn to function without the first half of the network. It seems to be working. 🧵

3

12

87

Ostris

@ostrisai

2 months

I added a post with the results of skipping each block in FLUX.1 dev here ->

Skipping FLUX.1-dev Blocks - Ostris

Skipping blocks in FLUX.1-dev. Some blocks can be skipped without affecting the output much. Others will destroy the output if omitted.

ostris.com

Ostris

@ostrisai

2 months

Experimenting skipping Flux blocks. First image is all blocks. 2nd image is skipping MMDiT blocks 3, 6, 7, 8, 9, 10, 13. With a little tuning, it would improve farther. Prompt: a woman with pink hair standing in a forest, holding a sign that says 'Skipping flux blocks'

16

17

153

9

11

86

Ostris

@ostrisai

1 month

New training best practice, random case dropout.

apolinario 🌐

@multimodalart

1 month

reminder for flux: prompting is case-sensitive 𝙰𝚊 left: Mark Zuckerberg eating pasta right: mark zuckerberg eating pasta same seed

27

44

657

1

3

83

Ostris

@ostrisai

10 months

TinyLlama is amazing! I have been waiting on a <3B permissive model to come out. Fine tuning small LLMs to do very specific tasks has so much potential. I loaded it up in my prompt upsampler and it works shockingly well. 🧵

TinyLlama/TinyLlama-1.1B-Chat-v1.0 · Hugging Face

huggingface.co

4

8

77

Ostris

@ostrisai

9 days

Added a blog post with images from skipping each block in SD3.5-large. Hopefully this helps in choosing more efficient blocks to target during fine tuning.

Skipping SD3.5-large Blocks - Ostris

The following images were generated using the same seed and prompt while skipping specific blocks in SD3.5-large. The prompt for all of these was “a woman with pink hair standing in a forest, holding...

ostris.com

10

15

79

Ostris

@ostrisai

1 month

Working on a clothing IP adapter for Flux... Shirt is getting close. Face is.... not....

4

5

70

Ostris

@ostrisai

4 months

The SD1.5 version is probably done. Currently at 240k steps . I am trying to cleanup fine detail, but it may have reached the limit of what synthetic data from the parent model can achieve. Will run it through the night on some high res fix images which will hopefully help.

Ostris

@ostrisai

4 months

75k steps in on training the adapter for SDXL. First ~30k steps were just on the new conv_in/conv_out layers. Then I added the LoRA (lin 64, conv 32). It is going to be a while, but it is coming along.

1

5

59

5

3

64

Ostris

@ostrisai

3 months

Be super careful with your FLUX.1 training captions people.

Mint

@araminta_k

3 months

Everyone who said captions don't do anything for Flux is wrong because I captioned a dog as a "cat" and it was ONE picture, and now the model is beautiful except whenever I prompt cat or kitten it gives me dogs. Also shoutout to @comfydeploy for the awesome service.

8

96

2

3

60

Ostris

@ostrisai

2 months

Excellent work @multimodalart !

apolinario 🌐

@multimodalart

2 months

FLUX.1 ai-toolkit now has an official UI 🖼️ with @Gradio With this open source UI you can 💻, locally or any cloud: - Drag and drop images 🖱️ - Caption them ✏️ (or use AI to caption 🤖) - Start training 🏃 No code/yaml needed 😌 Thanks for merging my PR @ostrisai 🔥

9

96

515

0

2

61

Ostris

@ostrisai

4 months

75k steps in on training the adapter for SDXL. First ~30k steps were just on the new conv_in/conv_out layers. Then I added the LoRA (lin 64, conv 32). It is going to be a while, but it is coming along.

Ostris

@ostrisai

4 months

Releasing my 16ch VAE (KL-f8-d16) today (also). MIT license, lighter weight than SD3 VAE (57,266,643 params vs 83,819,683), similar test scores, smaller, faster, opener. I'm currently training adapters for SD 1.5, SDXL, and PixArt to use it (coming soon)

9

45

264

1

5

59

Ostris

@ostrisai

5 months

Training my first SD3 LoRA. It is hacky and probably won't be able to run on anything other than my trainer for now, but it is cooking. I am sure I am missing some stuff, but we will see.

5

0

56

Ostris

@ostrisai

4 months

The prompt comprehension is incredible! #auraflow "a cat that is half orange tabby and half black, split down the middle. Holding a martini glass with a ball of yarn in it. He has a monocle on his left eye, and a blue top hat, art nouveau style "

3

5

54

Ostris

@ostrisai

18 days

I'm still off and on cooking a version of SD1.5 that has a 16ch VAE and T5XL text encoder. It takes forever to learn the fine detail since it is 4x the amount of data in a 16ch VAE. It is frustrating because it is so close, yet still so far.

9

1

55

Ostris

@ostrisai

5 months

Released a LoRA for SDXL that converts the latent space to the SD1/2 latent space.

ostris/sdxl-sd1-vae-lora · Hugging Face

huggingface.co

Ostris

@ostrisai

6 months

Training samples from a little pet project. SDXL LoRA that converts the SDXL Latent space to the SD1/2 latent space. I have been training it off and on for a while and think it is probably closed to done.

3

38

3

4

53

Ostris

@ostrisai

11 days

I trained Marty McFly on 6 images for 1500 steps and 6 LoRAs on each image for 250 steps each, and merged them together on inference. In short, no, it does not work as well as training on all images at the same time.

Ostris

@ostrisai

11 days

I am curious if you would get similar results from training a LoRA on 10 images as you would training 10 LoRAs on single images with 1/10th the steps, and then merging them together. Has anyone tried anything like this?

12

0

33

9

1

53

Ostris

@ostrisai

2 months

A year ago, when I was building the sampling mechanism for ai-toolkit, I accidentally left my name (Jaret) hard coded as the trigger word, buried deep in the code. It really freaked me out and made me question reality when my name started showing up in all the training samples.

6

1

52

Ostris

@ostrisai

7 months

Cooking a style only IP adapter for SDXL. It still has a ways to go, but it is looking promising.

3

48

Ostris

@ostrisai

4 months

To me, the most exciting thing about Auraflow is that it is actually an open source license, Apache 2.0. CreativeRail++, while being permissive, is not actually an OSI compliant license. I am super excited to sink my GPUs into it this weekend!

2

4

48

Ostris

@ostrisai

26 days

Almost every VLM captions like : "The image appears to possibly feature a man who could be in a mood some would describe as happy" vs how people prompt "A happy man". And they all seem to ignore my instructions do it otherwise.

20

0

48

Ostris

@ostrisai

4 months

Changing name to LittleDiT since I decided to increase the size a bit. Moved from T5 base to T5 large and going with 20 blocks in DiT vs 10. Still a lot smaller than SD1.5 with everything baked in. Now we cook, for a long time. Current samples attached.

3

0

48

Ostris

@ostrisai

11 months

Training a Stable Diffusion LoRA that can do 1 step is HARD. You have to get pretty creative with the timestep to prevent from going pure adversarial loss, which I don't want to do. I think I have it now. Just needs to cook for a while. Current train samples of 1 vs 2 step. SD1.5

5

4

46

Ostris

@ostrisai

8 months

The IP composition adapter is sitting on #5 on 🤗trending text to image models. Thank you all for the support.

2

46

Ostris

@ostrisai

7 months

Style IP Adapter for SDXL is coming along. I love the impasto style people. Some content is still coming through. Working on that. I also figured out a novel way to compensate for inference CFG during training to prevent over saturation. Hopefully done tomorrow.

2

4

44

Ostris

@ostrisai

3 months

@bfl_ml @araminta_k For personalization I also did a celeb test with Christina Hendricks since the model didn't seem to know her well. It works well on realism personalization. Attached is before and after training a LoRA on FLUX.1-dev on Christina Hendricks.

4

2

45

Ostris

@ostrisai

4 months

Tiny DiT: Images are coming through. It is basically going to need a full fine tune though. Debating on committing to that because I REALLY want this. Entire model w/ TE is 646MB. Full fine tune takes 2.4GB VRAM and I can train at a BS of > 240 on a 4090.

Ostris

@ostrisai

4 months

Saturday experiment: Retrained xattn layers on PixArt Sigma to take T5 base (much smaller). It works surprisingly well. Merge reduced the number of blocks in the transformer from 28 to 10. Just popped it in the oven (full tune). Now we wait. Who wants a tiny DiT to play with?

2

5

38

4

6

44

Ostris

@ostrisai

7 days

I was curious how difficult it would be to train a guidance LoRA so you can infer without CFG and burn in the negative prompt. Turns out, it is pretty easy to do. My test was with OpenFLUX.1, but should work with any model requiring CFG and will double inference speed.

7

3

62

Ostris

@ostrisai

2 months

Testing some flux vision adapter training today. My favorite generations have always been the oddities generated while messing with the cross attn layers.

3

0

44

Ostris

@ostrisai

5 months

Training montage for those who enjoy watching models train as much as I do.

Ostris

@ostrisai

5 months

Doing a test of training SD1.5 to use a 16 channel/16 depth VAE so it will generate natively at 1024 with same compute requirements of 512. ~ 300k steps in so far. It is working but taking FOREVER.

4

1

33

1

0

42

Ostris

@ostrisai

1 year

New Stable Diffusion XL LoRA - "Super Cereal". Turn anything into a cereal box. Special thanks to @multimodalart and @huggingface for the compute! HF -> Civitai ->

2

6

43

Ostris

@ostrisai

3 months

Support for these LoRAs will work out of the box with Diffusers, I also finished the weight mapping for ComfyUI and have a PR for that here That will enable full support for Comfy UI. You can checkout my fork in the iterum.

FLUX: Added full Diffusers mapping for FLUX.1 schnell and dev. by jaretburkett · Pull Request #4302...

This PR will add full LoRA support for Diffusers based LoRAs. The only weights that do not match up correctly are. ("guidance_in.out_layer.bias", "time_text_embed.guidanc...

github.com

1

3

43

Ostris

@ostrisai

9 months

Everyone with a AI video startup right now

1

4

43

Ostris

@ostrisai

8 days

First LoRA test I tried was with live action Cruella because in my pruning attempts, anatomy broke down. Plus that hair and her clothing is hard to learn. It worked out well. Final step training samples attached.

Ostris

@ostrisai

8 days

How did I miss this? Too many releases. @freepik Awesome work! It works out of the box with ai-toolkit. Training first test LoRA now.

11

63

437

5

0

47

Ostris

@ostrisai

7 months

I logged into Civitai for the first time in a long time. There is just blatant unchecked child porn everywhere. I took all of my models off of there. I do not support nor want to be associated with that in any way shape or form.

9

1

42

Ostris

@ostrisai

4 months

SDXL 16ch VAE adapter is still coming along, but it still has a long way to go.

1

42

Ostris

@ostrisai

3 months

I generated a few hundred regularization images for training with FLUX.1-dev, and it kept generating Donald Trump completly unprompted. These prompts were: "a man in a suit and tie talking to reporters" "a man with a blue tie and a black jacket" "a man in a suit and green tie"

10

1

40

Ostris

@ostrisai

1 year

It has taken dozens of iterations and keyboard smashing to figure out all of the math. Hours to curiate and modify the 3k guidance image pairs. But it is working!! I will fix stable diffusion hands! Training sample from my private SDXL realism model. Step 0 and 600

6

1

39

Ostris

@ostrisai

6 months

Training samples from a little pet project. SDXL LoRA that converts the SDXL Latent space to the SD1/2 latent space. I have been training it off and on for a while and think it is probably closed to done.

3

38

Ostris

@ostrisai

4 months

16ch SDXL VAE adapter sample image. Prompt: "woman playing the guitar, on stage, singing a song, laser lights, punk rocker". In many ways, the 4ch VAE made training easier because the VAE made up most of the fine details. Now, the unet has to learn it. Needs to cook more.

4

3

39

Ostris

@ostrisai

4 months

Saturday experiment: Retrained xattn layers on PixArt Sigma to take T5 base (much smaller). It works surprisingly well. Merge reduced the number of blocks in the transformer from 28 to 10. Just popped it in the oven (full tune). Now we wait. Who wants a tiny DiT to play with?

2

5

38

Ostris

@ostrisai

3 months

@McintireTristan Dataset is thousands of images generated by Flux Schnell. Fine tuning is not possible because it is distilled. What is I doing is intentionally breaking down the step compression to hopefully end up with a training base that we can train LoRAs on that will work with schnell.

8

1

36

Ostris

@ostrisai

2 months

Hm... Learning rate may have been a bit too high.

10

1

36

Ostris

@ostrisai

29 days

@DMBTrivia @kohya_tech It was trained on thousands of schnell generated images with a low LR. The goal was to not teach it new data, and only to unlearn the distillation. I tried various tricks at different stages to speed up breaking down the compression, but the one that worked best was training with

1

7

36

Ostris

@ostrisai

6 months

Saturday experiment: Single value adapter. I trained feeding a single -1 to 1 float directly into key/val linear layers that corresponds to eye size in images, and apply that to the cross attn layers. It works. Sample images are -1.0 and 1.0 pairs. Time to add more features.

8

4

35

Ostris

@ostrisai

2 months

Having some issues keeping up with direct messages and tags on here and discord. If I haven’t gotten back to you yet, I am sorry. A little overwhelmed with messages at the moment.

1

0

34

Ostris

@ostrisai

3 months

OMG! LFG!

Black Forest Labs

@bfl_ml

3 months

Today we release the FLUX.1 suite of models that push the frontiers of text-to-image synthesis. read more at

42

147

820

2

0

34

Ostris

@ostrisai

4 months

I cannot get my 16ch adapter for SD1.5 where I want it without doing a full fine tune. I also cannot get my flan T5xl adapter there either. So I merged them into a single model together and I am doing a full tune of a T5xl- 16ch-SD1.5 model. We will see.

3

0

34

Ostris

@ostrisai

4 months

"You will not use the Stability AI Materials or Derivative Works, or any output .... to create or improve any foundational generative AI model" Is still in there. My understanding was that this is why @HelloCivitai refused to host it in the first place.

Stability AI

@StabilityAI

4 months

At Stability AI, we’re committed to releasing high-quality Generative AI models and sharing them generously with our community of innovators and media creators. We acknowledge that our latest release, Stable Diffusion 3 Medium, didn’t meet our community’s high expectations, and

70

167

643

7

3

34

Ostris

@ostrisai

5 months

Doing a test of training SD1.5 to use a 16 channel/16 depth VAE so it will generate natively at 1024 with same compute requirements of 512. ~ 300k steps in so far. It is working but taking FOREVER.

4

1

33

Ostris

@ostrisai

3 months

Here is an example for the OpenFLUX.1 project that has been training over a week, so the artifacts are much more pronounced. The left image is where it was, and it was driving me crazy. The right image is 250 steps after implementing the fix.

5

2

33

Ostris

@ostrisai

4 months

I trained another VAE (kl-f8-d16 -16 ch), and I test trained SD 1.5 to use it. It picked it up very quick, but the fine details need work. Overall, the test worked. Trying to decide if I want to switch to SDXL or PixArt Sigma. Thoughts?

8

4

33

Ostris

@ostrisai

11 days

I am curious if you would get similar results from training a LoRA on 10 images as you would training 10 LoRAs on single images with 1/10th the steps, and then merging them together. Has anyone tried anything like this?

12

0

33

Ostris

@ostrisai

3 months

@multimodalart

1

0

32

Ostris

@ostrisai

30 days

I absolutely love working at glif and you would too.

fabian

@fabianstelzer

30 days

friends, we're hiring staff level product engineers at Glif. RTs / forwarding much appreciated 💜 Glif is a powerful & fun AI sandbox, where talented creators build, remix and share AI microapps (aka glifs) with hundreds of thousands of players glifs are a new media format for

80

37

226

2

1

31

Ostris

@ostrisai

3 months

It is interesting and raises questions about synthetic captioning as similar images were likely synthetically captioned as "a man in a suit". It also makes me curious how many of the "not real people" it generates look almost exactly like some real people in the dataset.

8

1

31

Ostris

@ostrisai

4 months

6 years ago I spent days creating a dataset and fine tuning a classifier on Star Wars images so I could use it in deep dream. Instead of dog faces, there are droids, Chewbacca fur, and Boba Fett helmets. This blew people’s minds back then. We have come a long way.

4

2

30

Ostris

@ostrisai

3 months

Ran a quick lora test doing this on SD 1.5. It added a lot of fine detail. Not sure what a longer train run would do as it also increases the contrast. Before and after samples attached

Ostris

@ostrisai

3 months

Instead of adjusting the scaling factor, one could probably just multiply the noise by 0.934 during detail to increase the fine detail. I will test this theory.

1

0

10

3

4

29

Ostris

@ostrisai

2 months

The vision encoder adapter I am testing for Flux dev may not be working right in the traditional sense, but it makes some interesting things.

2

1

29

Ostris

@ostrisai

1 month

2

28

Ostris

@ostrisai

1 year

This is absolutely awesome. Results from my initial tests look good and it generates similar structured images to SDXL with same prompt and noise. Doing a fine tuning run now.

segmind/SSD-1B · Hugging Face

huggingface.co

6

1

28

Ostris

@ostrisai

2 months

The LoRA used is here.

alvdansen/softpasty-flux-dev · Hugging Face

huggingface.co

0

4

27

Ostris

@ostrisai

2 months

I want to also note that it will run fine on 24gb with the entire flux model and ip adapter loaded into vram. In fact, I did most of the pretraining on a 4090, so you can fine tune it for specific tasks (style, composition, identity) on one as well.

4

2

26

Ostris

@ostrisai

3 months

I'm training SDXL to use my 16ch VAE, and adjusting the VAE scale factor. The first image is training with a calculated norm scaler. The second is when I adjust it to a lower number, which is clearly closer to the ideal value. Any idea how to calculate the ideal value?

5

1

25

Ostris

@ostrisai

7 months

@victormustar Add a label if the license is open source compliant or non open source compliant. Hugging Face is full of Fauxpen source models claiming to be "open" and "open source" when they are not.

2

0

25