Microsoft has introduced an artificial intelligence (AI) model capable of emulating the human voice. Dubbed VALL-E 2, this text-to-speech generator can replicate a voice using only a brief snippet of audio.
VALL-E 2 is a zero-shot model: it can synthesize speech in the voice of a speaker it never encountered during training, given only a short sample of that voice. Microsoft Research developers state that VALL-E 2 can produce authentic, lifelike speech resembling the original speaker’s voice, reaching human-like performance levels. The model can also handle intricate sentences as well as short phrases.
The tool relies on two key features outlined by Microsoft researchers: Repetition Aware Sampling and Grouped Code Modeling. Repetition Aware Sampling deals with repeated tokens, the smallest units of data a language model processes. By taking into account how often a token has already recurred in the output, it keeps the model from getting stuck on the same sounds or phrases during decoding.
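The repetition handling described above can be illustrated with a minimal sketch. This is not Microsoft's code: the function names, the `window` and `max_ratio` parameters, and the fallback rule (resample from the full distribution when a token dominates the recent output) are assumptions chosen for illustration.

```python
import random


def nucleus_sample(probs, top_p, rng):
    """Sample a token index from the smallest set of tokens whose mass reaches top_p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


def repetition_aware_sample(probs, history, rng, top_p=0.9, window=10, max_ratio=0.5):
    """Pick the next token, avoiding one that already dominates recent output.

    probs     -- next-token probabilities from the model (hypothetical input)
    history   -- token indices generated so far
    window    -- how many recent tokens to inspect (illustrative value)
    max_ratio -- repetition share that triggers the fallback (illustrative value)
    """
    candidate = nucleus_sample(probs, top_p, rng)
    recent = history[-window:]
    if recent:
        ratio = recent.count(candidate) / len(recent)
        if ratio > max_ratio:
            # The candidate is over-represented in recent output, so draw
            # again from the full distribution to give other tokens a chance.
            candidate = rng.choices(range(len(probs)), weights=probs, k=1)[0]
    return candidate
```

In a decoding loop, each returned token would be appended to `history` before the next call, so a run of identical tokens raises the repetition share and eventually triggers the fallback.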
Repetition Aware Sampling thus makes the system’s speech more varied and natural. Grouped Code Modeling, for its part, streamlines the model’s token processing to yield faster results.
Microsoft asserts that VALL-E 2 marks a breakthrough in mimicking the human voice. However, there has been a surge in “vishing,” a scam in which perpetrators impersonate trusted individuals over the phone, which poses potential risks, including threats to national security. In light of these concerns, Microsoft has clarified that VALL-E 2 remains a research endeavour and will not be publicly available in the foreseeable future.
The company says it does not intend to incorporate VALL-E 2 into products or extend access to the public, citing the potential risks associated with its deployment. Microsoft has faced heightened scrutiny regarding its AI practices, particularly concerning antitrust and data privacy issues.
Regulators have expressed unease over Microsoft’s significant investment in OpenAI, which they fear gives the tech giant substantial influence over the startup. Against this backdrop, Microsoft intends to maintain VALL-E 2 strictly as a research initiative.