Google text into video – A neural network AI generator

Services that convert text descriptions into images have advanced rapidly in recent years. Google has gone a step further and introduced Imagen Video, a neural network that generates video from text. The resulting video has a resolution of 1280 by 768 pixels and runs for 5 seconds.


For now, however, Google is reluctant to release Imagen Video to the public, because it is concerned that the tool could be used to produce indecent videos.

According to Ars Technica, the neural network can work in several stylistic modes. For example, it can imitate the work of famous artists or create rotating 3D objects that keep their shape as they rotate.

To learn how to convert text to video, Imagen Video was trained on a public database of tens of millions of photos, videos, and their text descriptions.

Based on the request (for example, “the bear is washing the dishes”), the network first creates an initial video prototype of 16 frames at a resolution of 24×48 pixels and a rate of 3 frames per second. The algorithms then upscale the prototype into an HD video 5 seconds long at 24 frames per second.
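The figures above can be checked with a little arithmetic. The sketch below is only an illustration built from the numbers quoted in this article. It assumes, which the article does not state, that upscaling leaves the clip duration unchanged. The variable names are made up for this example and are not part of any Google API.

```python
from fractions import Fraction

# Prototype figures as quoted in the article.
proto_frames = 16
proto_fps = 3
proto_pixels = 24 * 48        # pixels per prototype frame

# Final video figures as quoted in the article.
final_fps = 24
final_pixels = 1280 * 768     # pixels per final frame

# Duration of the prototype clip in seconds (kept exact as a fraction).
duration_s = Fraction(proto_frames, proto_fps)

# Assumption: upscaling keeps the duration unchanged, so the final
# frame count is simply duration multiplied by the final frame rate.
final_frames = duration_s * final_fps

# How much detail the upscaling stages have to add.
temporal_factor = Fraction(final_fps, proto_fps)
pixel_factor = Fraction(final_pixels, proto_pixels)

print(f"clip duration: {float(duration_s):.2f} s")
print(f"final frame count: {final_frames}")
print(f"frame-rate increase: {temporal_factor}x")
print(f"pixels-per-frame increase: {float(pixel_factor):.0f}x")
```

Under that assumption the clip lasts about 5.33 seconds, which matches the rounded figure of 5 seconds, and the final video contains 128 frames. Each of those frames holds roughly 853 times as many pixels as a prototype frame.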

Examples of Imagen Video's output can be viewed on the project website. They include videos generated from prompts such as “panda drives a car,” “sheep to the right of a glass of wine,” “astronaut on horseback,” and even “flying through a battle of pirate ships in a raging ocean.”

Google does not want to publish the source code of the neural network because it fears the appearance of “unacceptable content.” The company has tried to filter problematic material out of the training data but still believes that Imagen Video could generate sexually explicit, violent, or hateful content. As a result, the neural network is not yet available for public testing.