Google is developing a new deep learning system that can pick out a single voice from a crowd of people — here is how the technology works.

TechTimes reports that Google is working on a new deep learning system that will be capable of singling out one person’s voice in a crowd of people. The system does this by analyzing speakers’ faces while they are talking. Researchers first trained the system to recognize the voice of a single individual talking, which gave the system a baseline sound to focus on. They then added virtual noises mimicking a crowd, all playing at the same time, to teach the system to separate the combined audio into its individual parts so it could learn to differentiate between each sound.
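The training setup described above, a clean voice overlaid with simulated crowd noise, can be illustrated with a small Python sketch. This is not Google's code: the function names `mix_at_snr` and `make_training_pair`, the signal-to-noise parameter, and the stand-in signals (a sine tone for the speaker, random noise for the crowd) are all assumptions made for illustration, and the sketch leaves out the face-analysis side of the system entirely.

```python
import numpy as np

def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so the clean-to-noise power ratio equals `snr_db`, then add it."""
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    target_noise_power = clean_power / (10 ** (snr_db / 10))
    scaled_noise = noise * np.sqrt(target_noise_power / noise_power)
    return clean + scaled_noise

def make_training_pair(clean, background_clips, snr_db=0.0):
    """Build one (input, target) example: a noisy mixture and the clean voice."""
    # Sum the background clips so they all play at the same time.
    crowd = np.sum(background_clips, axis=0)
    mixture = mix_at_snr(clean, crowd, snr_db)
    return mixture, clean

# Hypothetical stand-in signals: one second of audio at 16 kHz.
rng = np.random.default_rng(0)
sample_rate = 16_000
t = np.arange(sample_rate) / sample_rate
clean = np.sin(2 * np.pi * 220 * t)              # stand-in for a single speaker
background = rng.normal(size=(3, sample_rate))   # stand-ins for three crowd clips

mixture, target = make_training_pair(clean, background, snr_db=5.0)
```

A separation model would be shown `mixture` and asked to reproduce `target`. Because the clean voice is known in advance, the error between the model's output and `target` can be measured directly during training.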

In a video posted to YouTube, the deep learning system can be seen analyzing the speech of two comedians and differentiating between the two, even when their voices overlap: