CM3leon by Meta: state-of-the-art generative model for text and images

AllaEddine Guettaf
Jul 17, 2023

On July 14, Meta announced its new generative AI model, CM3leon (pronounced like "chameleon"), a single foundation model that handles both text-to-image and image-to-text generation.

AI-generated images are certainly not a new concept. I'm guessing most of you have used or heard of DALL-E or Midjourney. With CM3leon, Meta has decided to join the battle, but in a slightly different way.

CM3leon achieves state-of-the-art performance for text-to-image generation, despite being trained with five times less compute than previous transformer-based methods. CM3leon has the versatility and effectiveness of autoregressive models, while maintaining low training costs and inference efficiency.

This model can generate sequences of text and images conditioned on arbitrary sequences of other image and text content. This greatly expands the functionality of previous models that were either only text-to-image or only image-to-text.
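To make the idea of "arbitrary sequences of image and text content" more concrete, here is a minimal Python sketch. It does not call CM3leon or any Meta API. The `Segment` class and the helper functions are hypothetical names invented for this illustration. It only shows how one model could treat different tasks as different orderings of text and image segments, with the last segment being the part to generate.

```python
from dataclasses import dataclass
from typing import List, Literal

Modality = Literal["text", "image"]


@dataclass(frozen=True)
class Segment:
    """One piece of a mixed text/image sequence (hypothetical structure)."""
    modality: Modality
    content: str  # the text itself, or a placeholder name for an image


def text_to_image(prompt: str) -> List[Segment]:
    # Text in, image out.
    return [Segment("text", prompt), Segment("image", "<to generate>")]


def caption_image(image_name: str, instruction: str) -> List[Segment]:
    # Image plus an instruction in, text out.
    return [
        Segment("image", image_name),
        Segment("text", instruction),
        Segment("text", "<to generate>"),
    ]


def edit_image(image_name: str, instruction: str) -> List[Segment]:
    # Image plus an instruction in, a new image out.
    return [
        Segment("image", image_name),
        Segment("text", instruction),
        Segment("image", "<to generate>"),
    ]


def describe(sequence: List[Segment]) -> str:
    """Summarize a sequence as 'conditioning modalities -> output modality'."""
    *conditioning, target = sequence
    inputs = " + ".join(segment.modality for segment in conditioning)
    return f"{inputs} -> {target.modality}"


if __name__ == "__main__":
    tasks = {
        "text-to-image": text_to_image("A small cactus wearing a straw hat"),
        "captioning": caption_image("cat.png", "Describe the given image."),
        "editing": edit_image("cat.png", "Make the background brighter."),
    }
    for name, sequence in tasks.items():
        print(f"{name}: {describe(sequence)}")
```

In this sketch the three tasks differ only in the order and type of their segments, which is the intuition behind using one model for all of them. The file name `cat.png` and the prompts are placeholders.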

How CM3leon performs across tasks

Text-to-image

As with other tools, the user enters a text prompt. The more compositionally structured the prompt, the more likely the model is to generate a coherent image that follows it.

We can see this in the following examples, which were created from these prompts: (1) A small cactus wearing a straw hat and neon sunglasses in the Sahara desert. (2) A close-up photo of a human hand, hand model. High quality. (3) A raccoon main character in an Anime preparing for an epic battle with a samurai sword. Battle stance. Fantasy, Illustration. (4) A stop sign in a Fantasy style with the text “1991.”

Text-guided image editing

You can edit an image by providing it together with a text prompt, and the changes will be applied according to the text you've entered.

Text tasks

This model can also provide descriptions and captions, and answer questions about an image.

Example:

Prompt Question: What is the cat carrying?

Model Generation: Cigarette

Prompt: Describe the given image in very fine detail.

Model Generation: In this image, there is a cat wearing a suit and holding a cigarette in its mouth. There is a dark background. A fantasy image.

Expectations

What I'm waiting to see is how CM3leon will generate text within an image, something that all AI image generators struggle with. If this AI can rise to the challenge and generate images with text written and positioned correctly, it will become an excellent tool for marketers and content creators.

I'm also curious to see how this AI will help designers generate different types of mock-ups, and how it will blend the artistic with the professional, so that we get something that can be used for advertising campaigns, social media or websites.

It's not yet clear when Meta will make this technology available to the public. But given what AI is already showing in the research phase, I have no doubt that it will live up to and perhaps even exceed our expectations.
