Your voice is an integral part of your identity. Not being able to express yourself and your ideas would be incredibly frustrating and isolating. Luckily, technology exists to help people who lose their voice due to cancer, ALS, and other conditions. US AI MVP Charles Elwood has helped multiple people regain their voice through an app he created that utilizes Microsoft’s Custom Neural Voice (CNV) and AI. The app converts text to speech in a customized, natural-sounding synthetic voice.
Former Radio DJ & manager, Chris Martin, lost his ability to speak due to throat cancer and the surgical removal of his larynx. Since Chris had been a DJ, there were existing recordings of his voice that Charles was able to use with his app that allows Chris to speak again through AI.
Charles created a “bank” of Chris’s voice that allows him to type in sentences he wants his computer or phone to say for him. This text-to-speech service allows Chris to share his humor and life stories and to have deep conversations with his wife and family members. He can even still call play-by-play action for sports, something he loved to do when he was a DJ. Using the app, Chris was even able to interview Charles on the radio airwaves. Chris’s wife told Charles, “You took him home, Charlie.”
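For readers curious how such a typed-text-to-spoken-voice flow might look in code, below is a minimal sketch using the Azure Speech SDK for Python with a Custom Neural Voice deployment. The key, region, voice name, and deployment ID are placeholders, and the snippet only illustrates the underlying Azure service, not the implementation of Charles’s app.

```python
# Minimal sketch: speak typed text with an Azure Custom Neural Voice.
# The key, region, voice name, and deployment ID below are placeholders,
# not details from Charles's app.
import azure.cognitiveservices.speech as speechsdk

speech_config = speechsdk.SpeechConfig(
    subscription="YOUR_SPEECH_KEY",  # Azure Speech resource key (placeholder)
    region="YOUR_REGION",            # e.g. "eastus" (placeholder)
)

# Point the service at the deployed Custom Neural Voice model.
speech_config.endpoint_id = "YOUR_CNV_DEPLOYMENT_ID"                # placeholder
speech_config.speech_synthesis_voice_name = "YourCustomVoiceName"   # placeholder

# Default audio output plays the synthesized speech through the speaker.
synthesizer = speechsdk.SpeechSynthesizer(speech_config=speech_config)

text = input("Type a sentence to speak: ")
result = synthesizer.speak_text_async(text).get()

if result.reason == speechsdk.ResultReason.SynthesizingAudioCompleted:
    print("Spoken in the custom voice.")
else:
    print(f"Synthesis did not complete: {result.reason}")
```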
Q&A with Charles on merging his technical expertise with a passion for the greater good to help Chris, and others speak, in their own voice:
MVP Program: Share with us the background on your recent projects using AI and Microsoft Custom Neural Voice:
Charles: I am helping people with cerebral palsy, amyotrophic lateral sclerosis (ALS), throat cancer, congenital diseases, and people on the autism spectrum who are nonverbal. These are terrible conditions that cut off communication even though the brain is functioning and alive inside. I want to enable the people impacted to have a voice and be included in our society again. It’s possible with Custom Neural Voice, Azure AI Custom Vision, and other Azure technologies.
MVP Program: In what ways do you envision this technology evolving, and how might it further assist individuals with speech impairments in the future?
Charles: I really think that for people who need mobility, smaller models will be developed that run on the edge, giving people the freedom to move around, better privacy and security, and lower energy usage than the large language models. I am watching the development of Phi-3 and GPT-4o mini with excitement! Fine-tuning, RAG (Retrieval-Augmented Generation), and SLMs (Small Language Models) are converging and becoming easier to connect together. Pricing and models are improving to accommodate these cases. It is interesting to note that this is all very new technology, and it has already advanced so much in just two years.
MVP Program: How do you ensure the ethical use of someone's voice, and what measures are in place to protect users’ privacy?
Charles: I leave ownership of the voice, and decisions about it, to the person who originated the voice. I am having discussions with them about ownership in the future: do they pass the voice on in a will and give ownership to a spouse or children? I explain to them that I use Azure to host their voice and that I create logins in Azure with MFA (Multifactor Authentication), so they are the only people in the world who can access their voice. I also hear from many people whose parents have passed and who just want to hear their parents’ voice again. Sometimes it is to complete the grieving process and help with closure, and sometimes it is to help their children hear their parents’ voice. All of these discoveries help me see where ethical issues may occur, help me educate people about what we need to be aware of, and may even change some of the rules and social constructs around ethics.
For more information on Charles Elwood's projects, please visit: