Video | Mona Lisa raps...but how?

2024-04-25T12:29:04Z
ندى ماهر عبدربه
Content creator

ArabiaWeather - A team of scientists at Microsoft Research Asia has developed a new artificial intelligence model called VASA-1, which turns a still image of a person's face and an audio clip into a video with accurately synchronized lip movements, realistic facial expressions, and natural head movements.

In a research paper, the team said it is presenting the VASA framework, which generates lifelike talking faces with appealing visual affective skills from a single image and a speech audio clip. The first model, VASA-1, can produce lip movements that are precisely synchronized with the audio, and it also captures a wide range of facial nuances and natural head movements that contribute to the authenticity and liveliness of the video.

The team claims that its method not only delivers high video quality with realistic face and head dynamics, but also supports online generation of 512 x 512 videos at up to 40 frames per second with negligible latency.

A singing Mona Lisa and fears of impersonation

VASA stands for “Visual Affective Skills Animator.” The framework can create realistic videos that closely mimic human conversational behavior.

The VASA model produces “realistic talking faces” that mirror conversational behavior through natural facial gestures and eye and head movements, all starting from a single static image of a face.

The team used the VoxCeleb2 dataset, which includes videos of thousands of real-life celebrities, to train their model.

The model can also handle diverse inputs outside its training domain, such as artistic images and non-English speech.

While the model's capabilities raise impersonation concerns, the scientists stress that their goal is to develop visual affective skills for virtual characters, not to impersonate anyone in the real world.

Microsoft says it currently has no plans to release the code behind the model, and that it intends to use the technology responsibly and in accordance with appropriate regulations.

Sources:

Interesting Engineering

This article was written originally in Arabic and is translated using a 3rd party automated service. ArabiaWeather is not responsible for any grammatical errors whatsoever.