What Are Imagen 3 and Veo 2, and How Do You Use Them?

> Google has recently launched Imagen 3 and Veo 2, cutting-edge AI models designed for image and video generation, respectively. These models are available through Google's Vertex AI platform, offering businesses powerful tools for content creation.

Verified by Essa Mamdani

Imagen 3: High-Quality Image Generation

Imagen 3 is Google's highest-quality text-to-image model. It generates detailed, photorealistic images and improves on previous Imagen versions in detail, lighting, and artifact reduction. Key improvements include better color balance, the ability to render diverse art styles, and high-fidelity detail.

How to Use Imagen 3:

The exact steps depend on your access level, but the general process is the same: you provide a text prompt describing the image you want, and the model generates an image from it. For example, the prompt "A close-up, macro photography stock photo of a strawberry intricately sculpted into the shape of a hummingbird in mid-flight" produces a detailed image matching that description.

Google also offers tools to edit and customize images generated by Imagen 3, so you can refine them to fit your needs. These include text-based editing, mask-based editing (for changing only part of an image), and upscaling.

To get started, see the Vertex AI documentation. Access may require joining an allowlist; details are available on the Google Cloud blog.
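As a rough illustration of the prompt-to-image flow, here is a minimal Python sketch that calls a Vertex AI prediction endpoint over REST using only the standard library. The model ID `imagen-3.0-generate-001`, the endpoint path, and the request and response field names (`instances`, `prompt`, `sampleCount`, `bytesBase64Encoded`) are assumptions that do not come from this article, so check them against the current Vertex AI documentation before relying on them. The function names are hypothetical helpers.

```python
import base64
import json
import urllib.request

# Assumption: this model ID is not stated in the article. Use the ID you have access to.
MODEL_ID = "imagen-3.0-generate-001"


def build_imagen_request(project_id, location, prompt, sample_count=1, model_id=MODEL_ID):
    """Return (url, body) for a text-to-image request (assumed endpoint layout)."""
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}"
        f"/locations/{location}/publishers/google/models/{model_id}:predict"
    )
    body = {
        "instances": [{"prompt": prompt}],
        "parameters": {"sampleCount": sample_count},
    }
    return url, body


def decode_predictions(payload):
    """Extract raw image bytes from a response, assuming base64-encoded predictions."""
    return [
        base64.b64decode(prediction["bytesBase64Encoded"])
        for prediction in payload.get("predictions", [])
    ]


def generate_images(project_id, location, prompt, access_token, sample_count=1):
    """Send the request and return the generated images as a list of bytes objects."""
    url, body = build_imagen_request(project_id, location, prompt, sample_count)
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return decode_predictions(payload)
```

To try it, you would pass your own project ID and region, plus an OAuth access token (for example, the output of `gcloud auth print-access-token`), and write each returned bytes object to a file. Building the request separately from sending it makes it easy to inspect the payload before any network call is made.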

Veo 2: Advanced Video Generation

Veo 2 is Google DeepMind's advanced AI video generation model. It builds on the original Veo, with significant improvements in resolution (up to 4K) and video quality. Veo 2 generates high-quality videos from text or image prompts while maintaining continuity and realistic movement of objects and characters. It also offers improved camera control, letting users specify camera angles and shots.

How to Use Veo 2:

Veo 2 currently powers VideoFX, Google Labs' video generation tool. It supports two modes: text-to-video, where you provide only a detailed text description of the video you want, and image-to-video, where you upload an image and add a text prompt describing the desired video. Veo 2 then generates the video from your input. VideoFX is currently available in the U.S. only.
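VideoFX itself is a web tool, so there is no code involved in using it. For readers who reach Veo 2 through Vertex AI instead, the two input modes can be sketched as a request payload. Everything specific here is an assumption not taken from this article: the model ID `veo-2.0-generate-001`, the `predictLongRunning` endpoint, and the field names `prompt`, `image`, `bytesBase64Encoded`, `mimeType`, and `sampleCount`. The helper names are hypothetical. Treat it as an outline to verify against the Vertex AI documentation.

```python
import base64

# Assumption: this model ID is not stated in the article.
VEO_MODEL_ID = "veo-2.0-generate-001"


def build_veo_instance(prompt, image_bytes=None, mime_type="image/png"):
    """Build one request instance for text-to-video or image-to-video.

    This sketch requires a text prompt in both modes; the image is optional.
    """
    if not prompt or not prompt.strip():
        raise ValueError("a text prompt is required")
    instance = {"prompt": prompt.strip()}
    if image_bytes is not None:
        # Image-to-video: attach the source image as base64 text (assumed field names).
        instance["image"] = {
            "bytesBase64Encoded": base64.b64encode(image_bytes).decode("ascii"),
            "mimeType": mime_type,
        }
    return instance


def build_veo_request(project_id, location, prompt, image_bytes=None, model_id=VEO_MODEL_ID):
    """Return (url, body) for a long-running video generation request (assumed layout)."""
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}"
        f"/locations/{location}/publishers/google/models/{model_id}:predictLongRunning"
    )
    body = {
        "instances": [build_veo_instance(prompt, image_bytes)],
        "parameters": {"sampleCount": 1},
    }
    return url, body
```

Because camera direction is expressed in the prompt text, a prompt such as "Low-angle tracking shot of a train arriving at a station" would go in the `prompt` field unchanged. Video generation takes longer than a single request, so a call like this would return an operation to poll rather than the finished video.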

Image of train station generated by Google DeepMind's Imagen 3 AI model.

Note: Both Imagen 3 and Veo 2 incorporate safety features like digital watermarking and safety filters to mitigate the creation of harmful content. They also adhere to Google's Responsible AI Principles.