Stable Diffusion AI Image Generator

Sort of defeats the purpose of QR codes if they don't reliably work across devices. Style over substance I suppose :D
I'm sure either the fad will pass or the technique will improve to the point where it'll work consistently, in which case it would be nice to see more creative and artistic QR codes.
 
These extensions are absolutely fantastic for generating photorealistic images - portraits specifically


I used some of the techniques and extensions in this video and got this pretty cool photorealistic render of an alien from a custom LoRA I made a few months ago.


Alien01.jpg
 
Stability AI has officially released SD XL 0.9 after a leaked version got out. It should work on Vlad Diffusion if you follow the setup steps in the following video:

 
Proof of concept showing that SD doesn't, in general, need huge amounts of memory to generate images. It also shows that extreme memory reductions can be a viable tradeoff against speed.
It runs on a Pi with 512MB of RAM.


Model / Library             | 1st run              | 2nd run              | 3rd run
FP16 UNET / OnnxStream      | 0.133 GB - 18.2 secs | 0.133 GB - 18.7 secs | 0.133 GB - 19.8 secs
FP16 UNET / OnnxRuntime     | 5.085 GB - 12.8 secs | 7.353 GB - 7.28 secs | 7.353 GB - 7.96 secs
FP32 Text Enc / OnnxStream  | 0.147 GB - 1.26 secs | 0.147 GB - 1.19 secs | 0.147 GB - 1.19 secs
FP32 Text Enc / OnnxRuntime | 0.641 GB - 1.02 secs | 0.641 GB - 0.06 secs | 0.641 GB - 0.07 secs
FP32 VAE Dec / OnnxStream   | 1.004 GB - 20.9 secs | 1.004 GB - 20.6 secs | 1.004 GB - 21.2 secs
FP32 VAE Dec / OnnxRuntime  | 1.330 GB - 11.2 secs | 2.026 GB - 10.1 secs | 2.026 GB - 11.1 secs
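
To put the tradeoff in numbers, here is a quick sketch that works out the ratios for the FP16 UNET rows. It assumes the GB figure is memory use and the secs figure is run time for each run; the helper names are just made up for illustration.

```python
# FP16 UNET figures copied from the benchmark above: (memory in GB, time in secs).
unet = {
    "OnnxStream": [(0.133, 18.2), (0.133, 18.7), (0.133, 19.8)],
    "OnnxRuntime": [(5.085, 12.8), (7.353, 7.28), (7.353, 7.96)],
}


def peak_memory(runs):
    """Largest memory figure seen across the runs."""
    return max(mem for mem, _ in runs)


def mean_time(runs):
    """Average run time across the runs."""
    return sum(secs for _, secs in runs) / len(runs)


# How much less memory OnnxStream needs, and how much longer it takes.
memory_ratio = peak_memory(unet["OnnxRuntime"]) / peak_memory(unet["OnnxStream"])
time_ratio = mean_time(unet["OnnxStream"]) / mean_time(unet["OnnxRuntime"])

print(f"OnnxStream peak memory is about {memory_ratio:.0f}x lower")
print(f"OnnxStream is about {time_ratio:.1f}x slower on average")
```

So for the UNET that works out to roughly 55x less memory for about 2x the run time, going by these three runs only.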
 
I've found that the QR code model for ControlNet is very good at creating images where you can integrate logos or text into a scene in a subtle way (depending on your weights and settings).

Sometimes you can't quite make it out, but the "hidden" logo or text becomes clearer if you zoom out or move further away from the image.

Apple.png

The 100 Most Famous Logos Of All-Time 3.png




Windows.jpg

Microsoft.jpg
 
I built myself a pretty dang nifty prompt-maker using Python and a lot of help from ChatGPT. It gives me a lot of granular control but randomly generates the adjectives, artists, "trending on" tags, fluff and technical terms that are tedious to come up with when trying to think of good prompts.

Screenshot 2023-07-25 202416.png
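
For anyone wondering how a tool like this might work, here is a minimal sketch of the idea. It is not the actual tool from the screenshot: the word pools, the `make_prompt` function and its parameters are all hypothetical, and the artist names are placeholders.

```python
import random

# Hypothetical word pools; the real tool's lists are not shown in the post.
ADJECTIVES = ["moody", "vibrant", "intricate", "ethereal"]
ARTISTS = ["Artist A", "Artist B", "Artist C"]  # placeholder names
TRENDING = ["ArtStation", "DeviantArt"]
TECHNICAL = ["4k", "bokeh", "sharp focus", "volumetric lighting"]


def make_prompt(subject, rng, n_adjectives=2, n_technical=2):
    """Keep the subject fixed and fill in the rest with random picks."""
    parts = rng.sample(ADJECTIVES, n_adjectives) + [subject]
    parts.append(f"by {rng.choice(ARTISTS)}")
    parts.append(f"trending on {rng.choice(TRENDING)}")
    parts += rng.sample(TECHNICAL, n_technical)
    return ", ".join(parts)


# A seeded generator makes a given prompt reproducible.
rng = random.Random(42)
print(make_prompt("elemental shadow dog", rng))
```

Passing in the random generator keeps the granular control: the subject and the counts are chosen by hand, and only the filler is randomised.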
 
Gentlemen, there's hope.

View attachment 1562760
I'm not sure what "runs fine" means to this person, but I have XL1.0 up and running on Auto1111 and it is super slow with an RTX 3060 12GB. And I have to disable all my extensions before it actually renders something. Yeah, the model seems to produce some nice images, but I think it'll only be worthwhile once it plays nicer with the various extensions. For now, if I REAAALLY want an SD XL image, it seems more productive to generate it on the Clipdrop site and then use it as a base for whatever I want to do with it in my other local tools.

I haven't gotten SD XL to actually work with Vlad diffusion (SD Next) or Invoke AI yet.
 
I wonder if he's running at 512x512
 
Well, I got SDXL1 to run in InvokeAI. It took about three and a half minutes to render this one crappy 512x512 picture without the refiner.
5b2c074c-f763-4c25-97b8-a8c5874c9f93-png.1563140

So then I ran the same prompt, steps and seed again, this time with the refiner enabled, and that took 05:07 to generate this one (crappy) 512x512 image.
2da3645e-4cc5-4bce-89e8-c9a42990e0bc.png

So yeah, it seems to work a bit better in Auto1111. I got much better-looking results there, though it is also dog slow. In my opinion it might be better to just use one of the finetuned 1.5 models and then upscale the result, which will probably be faster than SDXL1's initial image output.
In Vlad Diffusion I just get black square outputs, but it's quite similar to Auto1111, so I can't really be bothered with fiddling with it there.
Hopefully it'll get optimised and improved performance-wise. As is, I don't think I will use it much in a local install.



Edit - it seems SDXL is heavily affected (slowed down) by the newer Nvidia drivers. I downgraded to the 531.79 drivers and performance doubled, maybe even tripled. I knew there was some supposed slowdown in the new drivers but I hadn't really noticed it, and downgrading doesn't seem to speed up 1.5 models in Auto1111 much either. Oh well, I guess I'll upgrade for gaming / downgrade for AI'ing until Nvidia fixes it somehow.
 

I was wondering.
It was pretty quick on my 1060 6GB, but I was using older drivers (531.79).

Not THAT much slower than sd1.5 models.

However, I used --medvram and some others and only did 1 image at a time. I assume you read the notes and you're using --no-half-vae, which is required in Auto1111.

Loading the model for the first time though takes an age. ~7 minutes just to load the model.

I'm currently using ComfyUI with SDXLbase though.

1 image at 512x512 takes 33 seconds.

1 image at 1024x1024 takes 1m47s

1690461567497.png
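
As a rough sanity check on those two ComfyUI timings, this sketch compares how the time scales with the pixel count. The `seconds_per_megapixel` helper is only an illustrative name, and two data points are obviously not a proper benchmark.

```python
# The two ComfyUI timings reported above, in seconds.
timings = {
    (512, 512): 33,
    (1024, 1024): 1 * 60 + 47,  # 1m47s = 107 seconds
}


def seconds_per_megapixel(size, seconds):
    """Render time normalised by image area."""
    width, height = size
    return seconds / (width * height / 1_000_000)


for size, seconds in timings.items():
    print(size, round(seconds_per_megapixel(size, seconds), 1), "s/MP")

# Compare the growth in pixels with the growth in render time.
pixel_ratio = (1024 * 1024) / (512 * 512)
time_ratio = timings[(1024, 1024)] / timings[(512, 512)]
print(f"{pixel_ratio:.0f}x the pixels for {time_ratio:.1f}x the time")
```

In other words, the 1024x1024 image has four times the pixels but took only about 3.2 times as long, so the larger size is a bit cheaper per pixel on these numbers.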
 
Yeah, I used no-half-vae, otherwise it doesn't work at all. It takes maybe 1-3 minutes to load the model on my side. No-half-VAE was also the key to getting it to work in InvokeAI: there's a setting within the GUI where you switch between fp16/fp32 for VAE precision (that, and only being able to use SDXL in the TXT2IMG or node workspace and not the unified canvas). I haven't messed with Comfy yet, though I have downloaded it and will probably try it out sometime.
 
This SDXLbase doesn't seem to need much prompting to get good results.

Prompt: Pokémon, elemental shadow dog, 4k, realistic, bokeh
Negative: text, watermark, cartoon

Euler, 20 steps

1690464102485.png
1690464111581.png
 