Join my mailing list for the latest A.I. insights 🤖. No spam, just fascinating updates and breakthroughs in artificial intelligence.
Subscribe here and let's explore the future together: 💡
2D to 3D is coming! 2024 will be the year of 3D with AI.
DreamCraft3D looks promising (code "soon"). Even if the render isn't perfect yet, that's not an issue. It'll get better once fine-tuned and as new research builds on it!
And more importantly, I already know how I'll use it! 🧶
Stability just announced the private beta for their 3D model (Stable 3D).
More info about it in the thread!
1) A video demo. For a first beta, it looks very impressive!
Shocking Crossover: Harry Potter x Star Wars by Pixar, Holiday Special (Not Yet on Disney+)
✍️ Scenario: Human + GPT
🖼️ Image: SDXL / ComfyUI
🎥 Video:
@pika_labs
🎙️ Voice:
@coqui_ai
Consistent Characters: IPAdapter & Prompt
Interested in discovering the detailed process?
Breaking News - Oops I did it again
I just released 4 new ControlNet models for Stable Diffusion 2.X
- Normal Map (BAE) 🎨
- Segmentation (ADE20K) 🧩
- LineArt 🖼️
- Openpose with hands 🖐️ and face 🙂
Examples and useful info in the thread ⏬
Important to read it until the end!
BREAKING NEWS
Perpetual ♾ Diffusion ☀🌑 models are available.
Have fun with them.
Based on the awesome SD2.1-768 by
@StabilityAI
~250 hours to create the dataset
~50 000 images
~300 hours of training
42 epochs for the final versions
1/3
For the last dozen days, I've been working on a simple but efficient upscaling workflow.
I've shared many results, and many of you asked me to share the workflow itself.
So I'm happy to announce today: my tutorial and workflow are available.
Links in comments.
i don't use caps anymore but... WHAT THE F***
zero-shot vid2vid!
papers ✅
code ✅
it uses sd1.5 and ebsynth under the hood + their intra/inter-frame algorithm, and it's compatible with controlnet and lora!
more examples below
It's a major breakthrough!
Still very early but I can "image-prompt" clothes now.
Need a bigger dataset and much longer training.
For now, I get the overall vibe of the clothes; the next step is to match them more closely!
Comparison of 4 lipsync methods on 3 different videos (Zoom matters!)
Top left: wav2lip (mouth only)
Top right: wav2lip (full)
Bottom left: Video retalking
Bottom right: SadTalker Video
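A side-by-side grid like this 2x2 lipsync comparison is easy to assemble programmatically. Here's a minimal sketch using Pillow, with solid-color placeholder frames standing in for the four method outputs (file names and frame size are assumptions for illustration):

```python
from PIL import Image

def make_grid(frames, cols=2):
    """Paste equally sized frames into a grid, row by row."""
    w, h = frames[0].size
    rows = (len(frames) + cols - 1) // cols
    grid = Image.new("RGB", (cols * w, rows * h))
    for i, frame in enumerate(frames):
        grid.paste(frame, ((i % cols) * w, (i // cols) * h))
    return grid

# Placeholder frames standing in for the four lipsync outputs
# (order: top-left, top-right, bottom-left, bottom-right)
frames = [Image.new("RGB", (320, 240), c) for c in ("red", "green", "blue", "gray")]
grid = make_grid(frames)
grid.save("lipsync_comparison.png")
```

Swapping the placeholders for real extracted video frames (e.g. via `ffmpeg`) gives the same layout as the comparison above.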
Google just released a wonderful paper AND code!
Consistent style generation withOUT fine-tuning.
It could be a challenger to IPAdapter.
Can't wait to try it in ComfyUI!
More info in thread!
Zero-shot styling! 🤯🤯🤯
Just stumbled upon a tech wizardry in a research paper!
Imagine turning the Eiffel Tower 🗼into a giant cactus 🌵 without training!
Can't wait to play with it. (code not released yet)
It will be so useful to ensure consistency for A.I. Movies!
Thanks
@replicate
and
@chigozienri
for adapting my AutoCaption tool and offering a free web service to use it!
That's why open-sourcing tools is important!
(link in first comment)
i have access to
#sd3
so let's go for a community generation thread
* comment with your prompt
* i'll generate images (maybe not all of them, and not instantly; i'll do my best)
* share the first message to spread the word
you can use short or very long prompts, and you can use
a new tool for talking character: EMO.
results look very good even on non-realistic images.
code is not available yet.
i hope someone smart will modify the code to use video as input too!
Watch my 2min. tutorial in which I explain (and share!) my workflow using IP Adapter Plus with SDXL in Comfy UI:
(link to the tutorial in the first comment)
Progress of ControlNet - Depth V2
- 100% of the dataset downloaded
- 75% of the preprocessing done
- Script for training done
- The training will start in 9 hours 🤞
Next: write the annotator for the Automatic1111 ControlNet extension and test it!
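The preprocessing step above (turning each training image into an image/depth-map pair) can be sketched as a simple batch loop. This is a hypothetical illustration, not the actual training pipeline: the `estimate_depth` stub stands in for a real annotator like ZoeDepth/MiDaS, and all paths, sizes, and file layouts are assumptions:

```python
from pathlib import Path
from PIL import Image

def estimate_depth(img):
    """Stub annotator: a real run would call a depth model
    (ZoeDepth/MiDaS) here. Grayscale conversion just stands
    in for a depth map so the sketch is runnable."""
    return img.convert("L")

def preprocess(src_dir, dst_dir, size=(512, 512)):
    """Build (image, depth) training pairs for a depth ControlNet."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    count = 0
    for path in sorted(Path(src_dir).glob("*.png")):
        img = Image.open(path).convert("RGB").resize(size)
        img.save(dst / path.name)                      # conditioning target
        estimate_depth(img).save(dst / f"depth_{path.name}")  # control signal
        count += 1
    return count
```

The real run replaces the stub with a GPU depth model and writes the pairs in whatever layout the training script expects.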
We can't know yet if
New paper "Deep Geometrized Cartoon Line Inbetweening".
Demos are pretty good. Code and weight are available.
The real question now is how we can leverage it to do nice things for StableDiffusion (ControlNet? AnimateDiff?) poke
new model by stability: stable cascade
(cascade because it's 3 different models used one after another)
results are announced as better and faster than sdxl
can't wait to try it in comfyui (once ipadapter is available)
SDXL Turbo is fast but not good for 1024x1024
SDXL DPO achieves better quality but is slow.
SDXL DPO TURBO is amazing!
Want a link to download this merge? 🥰 (just look in the first comment)
<- DPO TURBO ........ TURBO ->
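A merge like DPO TURBO is typically a linear interpolation of the two checkpoints' matching tensors. Here's a minimal sketch of that idea with plain floats standing in for weight tensors; the key names and the 0.5 ratio are assumptions for illustration, not the actual recipe behind this merge:

```python
def merge_checkpoints(sd_a, sd_b, alpha=0.5):
    """Linear merge: alpha * A + (1 - alpha) * B for every shared key."""
    assert sd_a.keys() == sd_b.keys(), "checkpoints must share the same architecture"
    return {k: alpha * sd_a[k] + (1 - alpha) * sd_b[k] for k in sd_a}

# Toy state dicts standing in for SDXL DPO and SDXL Turbo weights
dpo   = {"unet.w": 1.0, "unet.b": 0.0}
turbo = {"unet.w": 0.0, "unet.b": 2.0}
merged = merge_checkpoints(dpo, turbo, alpha=0.5)
print(merged)  # {'unet.w': 0.5, 'unet.b': 1.0}
```

With real checkpoints the same formula is applied per tensor (e.g. torch tensors loaded from safetensors), and `alpha` is the slider you tune between quality and speed.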
Ever heard of screenshot-to-code? It's a game-changer for web development!
This tool transforms any screenshot into HTML/Tailwind/JS code using GPT-4 Vision and DALL-E 3.
And with CogVLM and Stable Diffusion, we can do a full open-source version of it now!
Dear
@chrlaf
, CTO and interim co-CEO of Stability,
On behalf of the StableDiffusion user community, we are writing to express our concerns over the departure of Emad, who was a driving force behind Stability's open-source model training initiatives.
Emad's commitment to
Amazing Dress Workflow Update!
🌟 Good News: I'm sharing it soon.
❌Bad News: No tutorial about it this week.
🚨 Must-Read: Exclusive beta workflow drops Friday for my mailing list subscribers!
Don't miss out, sign up now!
Link below 👇
🌌🤖 Spectacular AI Adventure: "Milo & Ziggy: Embark on a Cosmic Journey" - A Heartwarming Tale of Friendship Beyond the Stars
✍️ Scenario: 100% Human Creativity
🖼️ Visuals: Crafted with 💖 / SDXL / ComfyUI
🎥 Production:
@pika_labs
🎙️ Voices:
@coqui_ai
Curious about how we
Today, I launch the crowdfunding campaign for my Pir-AI-tes. 🎉
The very first artbook about Pirates with 100% of the artworks made with A.I. tools only.
You can register for the launch on Kickstarter. Link in comment.
Quote RT / RT VERY appreciated!
StoryDiffusion looks very promising, and it's open-source (license unknown).
Can't wait to see how we can fine-tune it and use it with other tools like IPAdapter & ComfyUI.
More examples below👇
✅Dreambooth running on windows
✅Lora running on windows
✅with SD 2.1 512
✅and SD 2.1 768-v
a rough install guide is available on my github - (link in comments)
Comparison of different "face technologies".
You can download and play with the workflow (link in comment): many settings to test, combinations of models, changing weights, etc.
More tests will be done when InstantID has better integration.
for now, my favorite is a
Pika 1.0 is incredible.
This is from prompt only, first generation, no cherry-picking.
But the poor dog gets a hat when sleeping, watch it in the thread.
And choose what'll happen to it next!
some people asked me if i'll train controlnets for SD3.
tl;dr: i don't know.
i trained all the main controlnets for SD2.1.
for SDXL, i was in touch with stability (i gave them my methodology, training scripts, etc.); in parallel, i trained and released OpenPose (i trained a few
i just updated the upscaler workflow.
the new version fixes the color-washing issue.
thanks to everyone here and on discord who showed me results with this node.
(link below)
Let me introduce you to the (un)dress workflow.
Now I have your attention. Thanks.
I've been asked frequently about sharing my (re)dress workflow. While I understand the interest, I've decided not to disclose it due to obvious reasons... but for some people, reasons were not
it seems to be one of the missing parts to get real AI actors.
i think in 2024 we'll be able to:
- generate our 3D characters
- have very good tools like this one for lipsync & emotions
- get better emotions for AI voices
then:
- nice 3D environment generated with AI and with
🚨 breaking hypothesis 🚨
facts:
1) sd3 uses similar tech to sora
2) sora can make both video and images
conclusion:
if stability gets more gpus, they may train stablevideo based on sd3 and achieve sora level. 🤯
Prediction: MidJourney will be integrated into X.
Why?
1. The biggest clue is this screenshot (and it's not the first time Elon has talked about MJ)
2. Meta, Google and MSFT are adding generative AI to their platforms
3. MJ seems to be losing users since the release of Dalle3, so they
Discover ZipLoRA: a groundbreaking approach to generate any subject in any style by effectively merging style and subject LoRAs.
It offers high fidelity in both aspects.
By Google Research; one of the authors is
@natanielruizg
, same as DreamBooth!⏬
ControlNet - ZoeDepth is available for SD2.X
Cooking is over, Muffins are ready!
Imgs: 1) Original 2) ZoeDepth 3) DepthV1
You can use depth_leres as the preprocessor, but it works better with ZoeDepth (my PR hasn't been accepted yet, but you can use my fork).
links below
Quick movie (made in less than 15 minutes) to celebrate the access to
@pika_labs
new model
All clips were generated directly via text2video.
Voice with
@coqui_ai
Ho ho ho! A free caption tool for Xmas!
The more videos I create, the more I find myself in need of adding attractive captions. Unfortunately, I couldn't find a repository that made this process easy.
Time constraints kept me from developing a tool myself. So, I decided to take
The key for SD3 seems to be: very long prompts.
(see the short version below... or don't, up to you after all)
An intimate, soulful portrait photograph of a majestic golden retriever, capturing the essence of this beloved canine breed known for their intelligence, loyalty, and
I just released the LoRA version of the OpenPose ControlNet XL.
Available on the same repo:
Results are a bit different from the full model, but I haven't tested enough to tell if it's better or worse.