Image generated using Craiyon, formerly known as DALL-E mini.

Since this technology’s fairly recent public debut, users and onlookers have raised questions over its potential nefarious uses. The easier it is for a person to create false images designed to make people believe something that isn’t true, the greater the potential for real damage.

“You can see why this is in some ways a much more powerful technology from the point of view of creativity, but also manipulation,” compared to models that preceded these ones, said Hany Farid, a professor of computer science at the University of California, Berkeley.

Bad actors using technology to create false or misleading content is nothing new - the fact is “we’ve always been distorting reality,” said Farid, noting that Soviet dictator Joseph Stalin notoriously manipulated photographs to his benefit.

WATCH: Advances in artificial intelligence raise new ethics concerns

But there’s no doubt that the new models are a game changer. Soon enough, we’ll have similar tools able to create novel videos and audio.

Amid the potential for wrongdoing - including copyright concerns - some experts note that there are practical ways image generators could be put to positive use, and others are pushing to develop a digital infrastructure that will help internet users verify what they see online.

Generative AI, meaning AI that is capable of creating new content, is a rapidly advancing field. Here are the basics of how it works, and some of the ways it could be used in the not-so-distant future.

How do text-to-image AI models work?

An earlier iteration of AI-generated images relied on generative adversarial networks (GANs). Though they’ve somewhat been eclipsed by newer technology, Farid said they were “all the rage five years ago,” which he noted highlights how quickly this technology is evolving.

You can picture GANs as having a double-faced head, like the ancient Roman god Janus. On the left, there’s the generator, and on the right, the discriminator. The generator is tasked with creating an image that doesn’t actually exist - for example, a “photo” of a human face. Farid explained that it then hands that image over to the discriminator, which has access to millions of images of real human faces so it can act as a sort of fact-checker. If the image the generator made is merely an approximation of a face - as in, the discriminator can tell it’s not the real thing - the two continue in a loop until the discriminator doesn’t note a difference between the AI-generated face and the real ones it’s seen before, Farid said. (You can check out the website for some examples of the end product.)

These GANs can create a convincing fake human face, but with limitations. The faces are always from the neck up, and users can only change a few details, such as hair and skin tone. They are also restricted to a single category, Farid explained, so you can’t place a human in the image beside a cat - GANs can only generate one or the other, based on the source images they have access to.

Newer models take a completely different approach to image creation called diffusion. Before they generate a novel image based on a user’s command, they’re trained on hundreds of millions of different images, each paired with a caption that describes it in words.

Farid explained that training involves starting with each image, breaking it down to visual noise - random pixels that don’t represent anything specific, kind of like static on an old television - and inverting the process so that the model can go from noise back to the original image. “What the system is learning is how to start with a text prompt, a noise pattern and go back to a full image because it’s done that now a billion times,” he added.

The point of this training isn’t to give the model countless images that it can directly use to create new ones when a user gives it a text prompt. Instead, Farid explained, the training images serve as a kind of background instruction that allows the model to infer concepts like color, objects and artistic style. That’s why the models can create novel images that are “semantically consistent” with the prompt, Farid said. In other words, the resulting visual matches the keywords that inspired it.
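The generate-score-adjust loop that Farid describes for GANs can be illustrated with a deliberately tiny toy. Everything here is an assumption for illustration only: the “generator” proposes a single number rather than an image, and the “discriminator” is a fixed closeness score rather than a trained network (a real GAN trains both sides against each other). What the sketch does capture is the feedback loop - generate, get scored against real data, adjust, repeat until the discriminator can no longer tell the difference.

```python
import random

# Hypothetical toy, not a real GAN: the "real data" cluster around one
# value, and the generator's only job is to produce a sample the
# discriminator scores as indistinguishable from them.

REAL_MEAN = 10.0  # the "real data" cluster around this value


def discriminator(sample, real_samples):
    """Score how 'real' a sample looks: 1.0 means indistinguishable."""
    mean = sum(real_samples) / len(real_samples)
    return 1.0 / (1.0 + abs(sample - mean))


def train_generator(steps=5000, lr=0.2):
    real_samples = [REAL_MEAN + random.uniform(-0.5, 0.5) for _ in range(100)]
    fake = 0.0  # the generator's first attempt is far from realistic
    for _ in range(steps):
        if discriminator(fake, real_samples) > 0.9:
            break  # the discriminator no longer notes a difference
        # Nudge the fake toward a higher score (finite-difference gradient).
        eps = 1e-3
        grad = (discriminator(fake + eps, real_samples)
                - discriminator(fake - eps, real_samples)) / (2 * eps)
        fake += lr * grad
    return fake
```

In a real GAN both networks update: the discriminator also learns from its mistakes, which is why the two are pictured as adversaries rather than a fixed judge and a student.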
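The noising half of diffusion training that Farid describes - breaking an image down into television-static noise - can be sketched directly. This is a minimal sketch under the assumption of a toy one-dimensional “image” (a plain list of pixel values); the function name and parameters are illustrative, not from any real library. The reverse, denoising direction is the part a diffusion model actually learns, and it is not shown here.

```python
import math
import random

def add_noise(pixels, steps=1000, beta=0.02):
    """Gradually destroy a signal by mixing in a little Gaussian noise per step."""
    x = list(pixels)
    for _ in range(steps):
        keep = math.sqrt(1.0 - beta)  # fraction of the current signal that survives
        x = [keep * v + math.sqrt(beta) * random.gauss(0.0, 1.0) for v in x]
    return x
```

With `beta=0.02` and 1,000 steps, only about (0.98)^500, roughly 4 in 100,000, of the original signal survives: a flat “image” of identical pixels comes out statistically indistinguishable from pure static, with mean near 0 and variance near 1.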