Thursday, April 25, 2024

Google used YouTube videos from the Mannequin Challenge to train its AI


In the future, the results could be used in AR or for adding effects to videos.


What you need to know

  • Google is training its AI to create depth maps that isolate human subjects in a scene using only a single camera.
  • As a starting point, Google used 2000 YouTube videos from the Mannequin Challenge to train the AI.
  • The results will lead to the ability to add effects to videos, such as portrait mode, and be used for Augmented Reality.

In a recent blog post, Google detailed how it has been working on depth perception in videos where both the camera and subject are moving. As a starting point, the study needed access to a vast amount of data to train the AI, and the first logical step was training it to detect people in a scene where the camera was moving but the people were static.

As it turns out, Google had the perfect resource for this data in the form of YouTube videos that were filmed for the Mannequin Challenge. In this challenge, a person or group of people would stand completely still as a camera panned around their position. Google used 2000 videos from the challenge to help train its AI to detect human figures in a variety of different scenes.

Something that makes this study even more interesting is that Google is teaching its AI to create depth maps from footage shot with only one camera. Typically, multiple cameras are needed to sense depth information in a scene.
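To see why single-camera depth is the harder problem, it helps to look at how two cameras make it easy: with a calibrated stereo pair, depth follows directly from the disparity (pixel shift) of a point between the two views via standard triangulation. The numbers below are hypothetical, purely for illustration:

```python
# Stereo triangulation: depth = focal_length * baseline / disparity.
# With only one camera there is no baseline and no disparity, which is
# why Google's AI must instead *learn* depth from image content alone.
# All values here are made up for illustration.

def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Return depth in meters from a pixel disparity between two views."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# Example: 1000 px focal length, 10 cm camera baseline, 20 px disparity
depth = depth_from_disparity(1000, 0.10, 20)
print(depth)  # 5.0 (meters)
```

A single camera provides no second viewpoint, so none of these quantities exist; the learned model has to infer depth from cues like perspective, object size, and motion instead.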

Video: https://www.youtube.com/watch?v=fj_fK74y5_0

Google already uses a similar technique on still images to create the portrait mode effect on its Pixel phones. The new method extends this to video, training the AI to create a depth map even when both the camera and subject are moving within a scene.

Branching out into video opens the door to future features such as adding bokeh to video scenes, similar to portrait mode on your phone. Another benefit of this research will be improved results in augmented reality, such as the Playmojis in Google's Playground.

Another possibility is generating 3D images from 2D scenes. While camera hardware has always been essential to photography and videography, Google's software work over the years shows that algorithms will be just as important in the future, enabling new experiences.

