How Stable Diffusion Art-Generating AI Works

[Jay Alammar] has put together an illustrated guide to how Stable Diffusion works, and the principles it covers apply just as well to understanding how similar systems like OpenAI's DALL-E or Google's Imagen work under the hood. These systems are probably best known for their amazing ability to turn a text prompt (e.g., "paradise cosmic beach") into a corresponding image. Sometimes. Well, usually, anyway.

"System" is an apt term, because Stable Diffusion (and similar systems) are actually made up of many separate components working together to make the magic happen. [Jay's] illustrated guide really shines here, as it starts at a very high level with just three components (each with its own neural network) and digs deeper as needed to explain what's going on at a deeper level and how it fits into the whole.

Spot any similar shapes and contours between the image and the noise that preceded it? That's because the image is the result of removing noise from a random visual mess, not of creating it from scratch the way a human artist would.

It may surprise some to learn that the image-creation part doesn't work the way a human does. That is, it doesn't start with a blank canvas and build an image up bit by bit from scratch. It starts with a seed: a bunch of random noise. Noise is then subtracted in a series of steps, leaving a result that looks less like noise and more like an aesthetically pleasing and (ideally) coherent image. Combine that with the ability to guide the denoising in a way that promotes conformance to a text prompt, and you have the bones of a text-to-image generator. There's a lot more to it than that, of course, and [Jay] goes into detail for those interested.
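That iterative denoising is what the image-information creator in the sketch above is doing. A simplified loop might look like the following; the `noise_predictor` callable stands in for Stable Diffusion's trained noise-prediction network, and the step arithmetic is deliberately cruder than a real sampler's.

```python
import numpy as np

def denoise(noise_predictor, prompt_embedding, steps=50, shape=(64, 64, 4), seed=0):
    """Illustrative denoising loop: start from pure noise and repeatedly
    subtract the noise the model thinks is present. `noise_predictor` is a
    stand-in for the trained network; the step math is simplified."""
    rng = np.random.default_rng(seed)
    latent = rng.standard_normal(shape)            # the "seed": pure random noise
    for step in reversed(range(steps)):
        predicted_noise = noise_predictor(latent, prompt_embedding, step)
        latent = latent - predicted_noise / steps  # peel away a little noise each step
    return latent  # a real system then decodes this latent into pixels

```

Passing the prompt embedding into the noise predictor at every step is what steers the denoising toward an image that matches the text rather than toward just any plausible image.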

If you're unfamiliar with Stable Diffusion or art-generating AI in general, this is one of those areas that's evolving so quickly it sometimes seems impossible to keep up. Fortunately, our very own [Matthew Carlson] explains all about what it is and why it matters.

Stable Diffusion can be run locally. There's a fantastic open source web UI out there, so there's no better time to dive in and start experimenting!
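If you'd rather drive it from a script than a web UI, a minimal local run with the Hugging Face diffusers library looks roughly like this. The checkpoint name and parameters below are just examples, and the float16 setting assumes a CUDA-capable GPU.

```python
# pip install diffusers transformers torch
import torch
from diffusers import StableDiffusionPipeline

# Example checkpoint name; any compatible Stable Diffusion weights work here.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    torch_dtype=torch.float16,  # half precision to save GPU memory
)
pipe = pipe.to("cuda")  # drop this (and the float16 setting) to run on CPU, much more slowly

image = pipe(
    "paradise cosmic beach",   # the text prompt
    num_inference_steps=30,    # how many denoising steps to run
    guidance_scale=7.5,        # how strongly to follow the prompt
).images[0]
image.save("paradise_cosmic_beach.png")
```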
