A team of AI researchers at Alibaba Group's Institute for Intelligent Computing has built an AI tool that animates a still image of a person so that it appears to speak or sing along to an audio track. Earlier AI experiments have animated still pictures of people, but Alibaba's tool also adds audio to the animated result. The most interesting aspect of the work is that the researchers did not rely on 3D models or facial landmarks to make it work.
Instead, the researchers used diffusion modeling, in which the AI is trained on a large collection of audio and video files and gradually learns from it. The team trained the model on more than 250 hours of such data to bring the app to its complete form. The app is named Emote Portrait Alive, or EMO for short.
The app converts an image into a video with sound. The team also worked to train the AI to capture the facial emotions and gestures that make the animated image look like a living, breathing person. The videos accurately reproduce the mouth shapes needed to pronounce the words, along with the facial gestures associated with them. The researchers have shared many videos that show how smoothly the AI turns a still image into a video of a person speaking. Compared with other AI apps, EMO captures realism and expression better. The team also warns that the app could be used for unethical purposes, so its use should be restricted or monitored.