New York, April 20 (IANS) Does watching a dubbed foreign film often make you grimace because the facial expressions of the actors do not match the words put in their mouths?

This is now likely to change with a new method that analyses an actor’s speech motions and suggests new words to reduce the subtle discrepancies between the spoken words and facial expressions.
“This approach, based on ‘dynamic visemes,’ or facial movements associated with speech sounds, produces far more alternative word sequences than approaches that use conventional visemes, which are static lip shapes associated with sounds,” said Sarah Taylor from Disney Research in Pittsburgh.
“The method using dynamic visemes produces many more plausible alternative word sequences that are perceivably better than those produced using a static viseme approach,” Taylor said.
With the new method, the researchers found that the facial movements an actor makes when saying “clean swatches,” for instance, are the same as those for such phrases as “likes swats,” “then swine,” or “need no pots.”
Speech redubbing, such as translating movies, television shows and video games into another language, or removing offensive language from a TV show, typically involves meticulous scripting to select words that match lip motions, followed by re-recording by a skilled actor. Automatic speech redubbing, the focus of this study, is a largely unexplored area of research.
With conventional static visemes, each lip shape is assumed to represent a small number of different sounds, and the mapping of those units to distinct sounds is incomplete, which the researchers found limits the number of alternative word sequences that can be generated automatically.
Dynamic visemes represent the set of speech-related lip motions and their mapping to sequences of spoken sounds. The researchers exploited this more general mapping to identify a large number of different word sequences for a given facial movement.
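To make the idea concrete, here is a minimal sketch in Python of how such a many-to-many mapping could be used to enumerate alternative word sequences. It is an illustration of the general concept only, not the researchers’ actual system: the viseme labels, phoneme strings and the tiny pronunciation dictionary are all invented for this example.

# Minimal sketch: each dynamic viseme (a unit of lip motion) maps to
# several plausible phoneme sequences, so one facial-movement sequence
# can be decoded into many phoneme strings. Any string found in a
# pronunciation dictionary yields an alternative word the actor could
# plausibly have said. All entries below are hypothetical.

from itertools import product

# Hypothetical many-to-many mapping: dynamic viseme -> candidate phoneme strings.
DYNAMIC_VISEME_TO_PHONES = {
    "DV_closure_spread": ["k l", "g l"],    # lip motion consistent with /kl/ or /gl/
    "DV_spread_close":   ["iy n", "iy m"],  # lip motion consistent with /een/ or /eem/
}

# Hypothetical pronunciation dictionary: full phoneme string -> word.
PRONUNCIATIONS = {
    "k l iy n": "clean",
    "g l iy n": "glean",
    "g l iy m": "gleam",
}

def alternative_words(viseme_sequence):
    """Enumerate every phoneme reading of the viseme sequence and keep
    the readings that correspond to dictionary words."""
    options = [DYNAMIC_VISEME_TO_PHONES[v] for v in viseme_sequence]
    words = []
    for reading in product(*options):  # every combination of phoneme choices
        phones = " ".join(reading)
        if phones in PRONUNCIATIONS:
            words.append(PRONUNCIATIONS[phones])
    return words

print(alternative_words(["DV_closure_spread", "DV_spread_close"]))
# -> ['clean', 'glean', 'gleam']

A static viseme approach, by contrast, would pin each lip shape to a single sound unit, collapsing the candidate lists above to one entry each and sharply reducing the number of alternatives that can be found.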
The findings are scheduled to be presented at ICASSP 2015, the IEEE International Conference on Acoustics, Speech and Signal Processing in Brisbane, Australia, on April 23.
