Voice conversion aims to transform speech into a target voice using just a few example recordings of the target speaker. Recent methods produce convincing conversions, but at the cost of increased complexity, making results difficult to reproduce and build on. In this talk, I will instead go over some of our lab’s recent research that keeps voice conversion simple by using just the plain k-nearest neighbours algorithm. Despite its simplicity, this approach is competitive with the largest and most complex existing models. I’ll also show how it can be used to perform diverse kinds of conversion, such as converting cross-lingually or between human and animal sounds.
Matthew Baas is an Electronic Engineering PhD student in the Stellenbosch University MediaLab under Herman Kamper. His research focuses on speech synthesis and generative speech models.
30 August 2023