At Stability AI, we’re committed to releasing high-quality Generative AI models and sharing them generously with our community of innovators and media creators.
We acknowledge that our latest release, Stable Diffusion 3 Medium, didn’t meet our community’s high expectations, and
#SD3 Pushing the limit of prompt understanding
#Prompt: a stack of four books, the titles are "Harry Potter", "SD for Dummies", "Final Fantasy VII Rebirth", "How to Train Your Phoenix"
These have been generated with a #SDXL Turbo finetune in 8 steps. Perfect prompt following and perfect hands 90% of the time. It shows how a good architecture is more important than early results. Remember that for #SD3: take the base model and project how good finetunes will be
The “weight” is nearly over! Today, at @ComputexTaipei, our Co-CEO, @chrlaf, officially announced the open release date of Stable Diffusion 3 Medium for June 12th.
🔗Sign up to the waitlist to be the first to know when the model releases:
#SD3 #Prompt: tokyo gal with glasses is looking up and to the window with reflections, holding a cup of coffee with white steam. rainy night, in a cafe shop with beautiful interior design beside a street with wet grounds. through the cafe's watery window
photo of three people, a man holding the sign with the text "S", a girl with the sign "D" and another man with the sign with the text "3". Indoor scene, colored signs
For reference, SD3 2B has roughly the same size, but it's MMDiT (which is far superior to UNet), uses 3 text encoders, and has a 16ch VAE. You can't get this level of detail in XL without clever upscaling and downscaling. This is generated with SD3 Medium (2B) at native res
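A rough way to see what the 16ch VAE buys in terms of detail is to count how many latent values each model keeps per image pixel. The sketch below is only illustrative arithmetic. It assumes a 1024x1024 image, 8x spatial downsampling for both VAEs, and a 4-channel latent for SDXL, none of which is stated in the post. The helper names are hypothetical.

```python
def latent_shape(height, width, channels, downsample=8):
    """Return (channels, h, w) of the latent a VAE would produce for an image."""
    if height % downsample or width % downsample:
        raise ValueError("image size must be divisible by the downsample factor")
    return (channels, height // downsample, width // downsample)


def values_per_pixel(height, width, channels, downsample=8):
    """Latent values kept per image pixel (higher means less compression)."""
    c, h, w = latent_shape(height, width, channels, downsample)
    return (c * h * w) / (height * width)


# Assumption: SDXL uses a 4-channel latent; the post only states 16ch for SD3.
sdxl_latent = latent_shape(1024, 1024, channels=4)
sd3_latent = latent_shape(1024, 1024, channels=16)

print("SDXL latent:", sdxl_latent, values_per_pixel(1024, 1024, 4))
print("SD3 latent: ", sd3_latent, values_per_pixel(1024, 1024, 16))
```

Under these assumptions the SD3 latent carries four times as many values for the same image, which is consistent with the claim about fine detail at native resolution. It says nothing about how well either model uses that capacity.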
Just to set the record straight: I didn't work on any safety or structure on SD3 Medium, I merely did aesthetic tuning and generated pretty images that I shared of my own volition.
photo of three people, a wizard holding the sign with the text "Magic", a witch with the sign "Hex" and another alchemist with the sign with the text "potions". Indoor scene, colored signs
To be clear
Things I did NOT work on:
- The architecture of SD3 (existed before I joined)
- The pretraining of SD3 (done before I knew it existed)
- Anything regarding safety
- The license
Things I DID work on:
- Fixing aesthetic
- Fixing style alignment
- Posting pretty pics I like
The Stability AI Developer Platform now offers a comprehensive suite of API services, setting new standards in image generation, upscaling, and editing, with additional services forthcoming.
Learn more here: (1/4)
Playing with #SD3 8B
`an old apothecary. On the counter there are three old potions: a blue potion with the handwritten label "Mana" a green potion with the label "Health", a red potion with the label "Poison"`
(yes, someone swapped the labels for sure)