Google’s text-to-image generators turn written prompts into high-quality images. The company already has one such generator, Imagen, and it has now revealed a second: Parti, short for Pathways Autoregressive Text-to-Image. Like Imagen, Parti aims for photorealism, but it gets there using a different family of generative models.
To convert text into an image, Parti uses an autoregressive model, while Imagen uses diffusion, which learns to turn a pattern of random dots into an image. Parti first converts a collection of images into sequences of codes that can be compared to puzzle pieces. The user’s text prompt is then translated into these codes, and a new image is created from them. This approach builds on existing research and infrastructure for large language models such as PaLM, which is critical for handling long and complex prompts and for producing high-quality images.
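The "puzzle piece" idea can be illustrated with a toy sketch. This is not Parti's code, which Google has not released; the function names, the codebook size, the grid size and the weighting rule below are all made up for illustration. The only point it shows is the autoregressive loop: each new code is chosen based on the prompt and on every code picked before it.

```python
import random

CODEBOOK_SIZE = 8  # hypothetical number of distinct image codes ("puzzle pieces")
GRID = 4           # hypothetical 4x4 grid of codes per image


def toy_next_code_weights(prompt, codes_so_far):
    """Stand-in for a trained model (an assumption, not Parti's network).

    Returns one positive weight per codebook entry. The weights depend on
    the prompt and on all codes generated so far.
    """
    state = sum(ord(ch) for ch in prompt)
    state += sum((i + 1) * code for i, code in enumerate(codes_so_far))
    return [1 + (state * (k + 3)) % 7 for k in range(CODEBOOK_SIZE)]


def generate_codes(prompt, length=GRID * GRID, seed=0):
    """Generate image codes one at a time, left to right."""
    rng = random.Random(seed)
    codes = []
    for _ in range(length):
        weights = toy_next_code_weights(prompt, codes)
        # Sample the next code in proportion to its weight.
        next_code = rng.choices(range(CODEBOOK_SIZE), weights=weights, k=1)[0]
        codes.append(next_code)
    return codes


def as_grid(codes, width=GRID):
    """Arrange the flat code sequence into rows, like laying out puzzle pieces."""
    return [codes[i:i + width] for i in range(0, len(codes), width)]
```

Calling `as_grid(generate_codes("a cat wearing a hat"))` returns a 4x4 grid of code numbers. In a real system, a separate decoder would turn such a grid back into pixels; that step is left out here.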
Google also found that Parti can handle long, complex prompts that reflect world knowledge, follow a specific format or style, and combine many components with fine-grained details and interactions. For safety reasons, Google has chosen not to release Parti’s models, data or code until further safeguards are in place. To prevent credit theft, all images produced by the generator are watermarked with the name Parti in the bottom right corner.
However, models like Parti are trained on datasets that contain biases about people of different backgrounds, occupations and so on, so the models tend to reproduce those biases. When a prompt concerns something that is heavily stereotyped, the result will be stereotypical as well. Take a wedding, for example: if a user enters a prompt for a wedding, the resulting image will likely reflect Western conventions.