Here's how to get Stability AI's new model, Stable Cascade, running locally in Automatic1111 or Forge:
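(If you'd rather give Cascade a quick spin outside the WebUIs first, here's a minimal diffusers sketch of its two-stage setup. The model IDs are from Stability's Hugging Face release; the steps and guidance values are just the documented defaults, so tweak to taste.)

```python
import torch
from diffusers import StableCascadePriorPipeline, StableCascadeDecoderPipeline

# Two-stage pipeline: the prior produces image embeddings,
# the decoder turns them into pixels.
prior = StableCascadePriorPipeline.from_pretrained(
    "stabilityai/stable-cascade-prior", variant="bf16", torch_dtype=torch.bfloat16
)
decoder = StableCascadeDecoderPipeline.from_pretrained(
    "stabilityai/stable-cascade", variant="bf16", torch_dtype=torch.float16
)
prior.enable_model_cpu_offload()    # helps on cards with limited VRAM
decoder.enable_model_cpu_offload()

prompt = "a cinematic photo of an astronaut riding a horse"
prior_output = prior(
    prompt=prompt, height=1024, width=1024,
    guidance_scale=4.0, num_inference_steps=20,
)
image = decoder(
    image_embeddings=prior_output.image_embeddings.to(torch.float16),
    prompt=prompt, guidance_scale=0.0, num_inference_steps=10,
).images[0]
image.save("cascade.png")
```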

 
I've been trying out Forge and it's pretty darn great - it has all of the features and functions of Auto1111 and more, plus it is better optimised and runs quite a bit faster. It can even do "high res fix" and upscale base images 3x in one go.

I'd recommend you do the manual install instead of the 1-click installer. Here's the guide to get it up and running:

 
Stable Diffusion 3 announced:
  • Stable Diffusion 3: Stability AI announces the preview release of Stable Diffusion 3, which shows significantly improved overall generation quality in early demos.
  • Improved performance: Specifically, Stability AI promises better handling of multi-part, complex prompts, improved image quality, and stronger text-writing capabilities. The model is not yet generally available, but there is a waiting list you can sign up for.
  • Safety precautions: Stability AI says it has taken numerous safety precautions to prevent the model from being misused by malicious actors, starting with training and continuing through testing, evaluation, and deployment.
  • Other models: Stability AI has recently released several new models, including Stable Cascade, a very fast text-to-image model, Stable Video Diffusion, a generative video model, and Stable Zero123, a model for generating 3D objects from single images.
 
Another super useful IP Adapter has been released; this one references the composition of an image and then generates similar images based on it. As explained in the video below, this is different from existing ControlNets such as Canny or OpenPose, which copy an exact outline or pose: the composition IP Adapter instead makes an image that replicates the placement and lighting of your subjects in a new picture.
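For anyone scripting this instead of using a WebUI extension, IP Adapters all load the same way in diffusers. A rough sketch below uses the standard h94/IP-Adapter SDXL weights as a stand-in; the composition adapter from the video should drop in the same way once you point weight_name at its .safetensors file.

```python
import torch
from diffusers import AutoPipelineForText2Image
from diffusers.utils import load_image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
# Standard IP-Adapter weights as a placeholder; swap in the composition
# adapter's weight file once you've downloaded it.
pipe.load_ip_adapter("h94/IP-Adapter", subfolder="sdxl_models",
                     weight_name="ip-adapter_sdxl.bin")
pipe.set_ip_adapter_scale(0.6)  # how strongly to follow the reference

ref = load_image("composition_reference.png")  # your layout/lighting reference
image = pipe(
    prompt="a knight walking through a neon city at night",
    ip_adapter_image=ref,
    num_inference_steps=30,
).images[0]
image.save("composed.png")
```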

 
I really didn't want to dig too deep into ComfyUI as it looks daunting to learn, but it turns out that the Forge repo is essentially being discontinued, and it seems everyone and their dog is using ComfyUI for stable diffusion these days. I guess I'll share my experience with Comfy if all goes well.
 
Forge has a second branch that is apparently still getting updates. I'm also put off by the Comfy noodle monster, so I'm still using A1111. I also have StableSwarm installed but haven't used it yet; apparently it will be SD3-ready at launch in two days' time.
 
Here's a screenshot.
f3d8574891b91d4f07d3c703eea8d1ff.jpg
 
After an hour or two, I think I've gotten to grips with the basics of Comfy. I'll still keep my Forge setup, as it's familiar and easier to use for ControlNet and inpainting-related stuff, but the potential of ComfyUI is huge, especially for video.
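One thing that helped win me over: Comfy runs a small HTTP API by default, so you can queue workflows from a script instead of clicking through the graph. A minimal sketch, assuming the default server on 127.0.0.1:8188 and a workflow exported from the UI via "Save (API Format)" (dev mode needs to be enabled in settings for that option):

```python
import json
import urllib.request

# Load a workflow previously exported from ComfyUI in API format.
with open("workflow_api.json") as f:
    workflow = json.load(f)

# Queue it against the local ComfyUI server.
payload = json.dumps({"prompt": workflow}).encode("utf-8")
req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",
    data=payload,
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.loads(resp.read()))  # response includes the queued prompt_id
```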
 
Running inside Krita (free drawing / image editing app)

Been using it for a day. The developer added region support just yesterday. Pretty good. I can't see why professional artists wouldn't start using this going forward. It's a very good aid.
There are some UI issues that need to be worked out, but it's very good. And the LCM "live preview" is pretty quick, at least on a GTX 1060; I can imagine it being almost instant on newer cards.
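For context on why the live preview is so quick: LCM trades sampling steps for speed, getting usable images in around 4 steps instead of the usual 25-50. A rough diffusers equivalent of what the plugin is doing under the hood (LCM-LoRA on top of a stock SD 1.5 checkpoint; the prompt is just a placeholder):

```python
import torch
from diffusers import StableDiffusionPipeline, LCMScheduler

# Load a base SD 1.5 model and attach the LCM-LoRA for few-step sampling.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.load_lora_weights("latent-consistency/lcm-lora-sdv1-5")

# 4 steps and low guidance instead of the usual 25-50 steps at CFG ~7.
image = pipe(
    "a girl with an alice band, anime style",
    num_inference_steps=4,
    guidance_scale=1.0,
).images[0]
image.save("lcm_preview.png")
```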

This was started with simple line work for the road and buildings; the character had a little more in order to establish position, arm placement and so on.

1718174035216.jpeg


Then over time, around 3 hours last night, I slowly changed and added detail, which it then tries to interpret. The hair, for example, was drawn by me to that shape, with the AI filling in and adding to it. As you can see, it's not perfect. Trying to add an Alice band required a redraw of the hair around that part, which has affected the background, but that's down to unclean layering and can be quickly fixed. Colour changes were also added manually, and it understands those well.

1718174054288.jpeg
 
I have Stable Diffusion 3 up and running in ComfyUI. First impressions are that it is somewhat better at adhering to the specifics of prompts (though it did ignore my negative prompt for watermarks), but image-quality-wise there are countless fine-tuned 1.5 and SDXL models that look way better than what SD3 is outputting. Maybe it's a matter of updating my prompting style and getting to grips with ComfyUI a bit more, but so far I'd give it a solid 5 out of 10. The model is pretty darn huge at 15.8 GB for SD3 medium with the CLIP models included.

ComfyUI_temp_uyllo_00007_.jpg
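For reference, this is roughly what running SD3 medium outside Comfy looks like via diffusers (you have to accept the license on Hugging Face first; the prompt here is just an example, not what generated the image above):

```python
import torch
from diffusers import StableDiffusion3Pipeline

# SD3 medium: the bundled text encoders are most of the 15.8 GB download.
pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    "a photo of a cat holding a sign that says hello world",
    negative_prompt="watermark",
    num_inference_steps=28,
    guidance_scale=7.0,
).images[0]
image.save("sd3.png")
```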
 
I trained a new LoRA last night, mainly because it's been a while since I last made one and I wanted to see whether the workflow I used before still works. It seems like it does. You can grab it here if you want some zombies in various styles for your generated images... here's one made with the "cartoon me" model.

00030-743221606.png
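If you're wondering how to actually apply a LoRA like this from a script rather than a WebUI, here's a rough diffusers sketch (the filename, prompt, and strength below are placeholders for whatever your own download or training run produced):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
# Load the LoRA weights from the current directory (placeholder filename).
pipe.load_lora_weights(".", weight_name="zombie_style.safetensors")

image = pipe(
    "zombie, cartoon style portrait",
    num_inference_steps=30,
    cross_attention_kwargs={"scale": 0.8},  # LoRA strength
).images[0]
image.save("zombie.png")
```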
 