A few weeks ago, I read about how easy it has become to clone the voices of ordinary people, pop stars, or politicians. From my point of view, one of the more appealing uses was generating songs that sounded better than the pop stars' own originals. Less welcome uses included scams targeting the elderly, such as the grandchild trick, in which a caller poses as a relative in urgent need of money.
This piqued my curiosity, and I wanted to find out how easy it really is to clone a voice. And yes, cloning your own voice on your own computer has become quite simple. In this article, I am once again using one of the freely available Text-to-Speech (TTS) models from Hugging Face, in this case XTTS-v2 (https://huggingface.co/coqui/XTTS-v2).
I want to describe, as simply as possible, how anyone interested can run this model on their own computer.
The aim is to give readers the knowledge and tools they need to start experimenting with TTS models. By working through the technical details and laying out concrete steps, the article should make it easier to integrate TTS models into personal projects or professional workflows.
Whether you are a computer scientist, a developer, or simply someone fascinated by the potential of AI, this guide should give you useful insight into running AI models on your personal computer.