Well there are 2 types of models, CLIP and diffusion models. With VoC, Disco, et... | Hacker News


glenneroo on Aug 15, 2022 | on: Open-source rival for OpenAI’s DALL-E runs on your...

Well, there are two types of models: CLIP models and diffusion models. With latent diffusion tools such as VoC, Disco, etc., you pick multiple CLIP models and a single diffusion model. The CLIP models are the big, gigabyte-sized ones like ViT and RN. You can use CLIP search engines that search the LAION datasets to get a rough idea of what will happen when you use those words in your prompts: https://rom1504.github.io/clip-retrieval

I will otherwise refer you to the "Bible" of latent diffusion: https://sweet-hall-e72.notion.site/A-Traveler-s-Guide-to-the...

Whatever isn't covered in there is probably in the Disco Diffusion cheatsheet: https://botbox.dev/disco-diffusion-cheatsheet/

There are tons of resources out there, and it's a nonstop process of learning and experimenting to achieve what you want.
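
As a rough illustration of what a CLIP search engine does with your prompt words, here is a minimal sketch in plain Python. It assumes the prompt and every caption have already been turned into embedding vectors. The 3-number vectors, the captions, and the names `cosine`, `rank_captions`, and `TOY_INDEX` are all made up for the example; this is not the clip-retrieval API, and real CLIP embeddings have hundreds of dimensions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def rank_captions(query_vec, index):
    """Return the captions in `index`, most similar to the query first."""
    return sorted(index, key=lambda cap: cosine(query_vec, index[cap]), reverse=True)


# Hypothetical toy "embeddings" (assumption: invented numbers, not CLIP output).
TOY_INDEX = {
    "oil painting of a lighthouse": [0.9, 0.1, 0.0],
    "photo of a cat": [0.0, 0.2, 0.9],
    "watercolor seascape": [0.7, 0.6, 0.1],
}

# Stand-in for the embedding of a text prompt.
query = [1.0, 0.2, 0.0]

for caption in rank_captions(query, TOY_INDEX):
    print(caption)
```

The captions that land at the top are the kind of images the model associates with your words, which is why browsing the search results can hint at what a prompt will produce.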

Geee on Aug 16, 2022

Thanks again. Now I got my first image out and it ended up being a complete failure. :) I'll keep experimenting / learning.

glenneroo on Aug 16, 2022

Welcome to the party! My first image was also a total failure, it can only get better from here ;) Prepare to spend a lot of time reading before you start to make sense of things.
