Forum Discussion
Andy_LoCascio
Feb 28, 2024, Copper Contributor
Create Custom Video Avatar
I am attempting to build a real-time text-to-speech application using a custom video avatar. I am struggling to determine how to actually create the avatar. I have located the following two reso...
Ayan_Chawla
Apr 24, 2024, Microsoft
Hi Andy,
While this service is in preview, Microsoft trains the custom avatar model manually. You'll be notified once the model has been trained successfully.
To request training, fill out this form: https://customervoice.microsoft.com/Pages/ResponsePage.aspx?id=v4j5cvGGr0GRqy180BHbR7en2Ais5pxKtso_Pz4b1_xURFZNMk5NQzVHNFNQVzJIWDVWTDZVVVEzMSQlQCN0PWcu
The basic steps are:
1. Record the consent video, reading the statement provided here: https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/sampledata/customavatar/verbal-statement-all-locales.txt
2. Prepare the training data: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/text-to-speech-avatar/custom-avatar-record-video-samples
3. Train the model. As of April 24, 2024, this is done by Microsoft; please fill out the form above.
4. Deploy and use the avatar: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/text-to-speech-avatar/custom-avatar-endpoint
Reference link: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/text-to-speech-avatar/what-is-custom-text-to-speech-avatar#how-does-it-work
For real-time synthesis, install the Speech SDK and use it as described here: https://learn.microsoft.com/en-us/azure/ai-services/speech-service/text-to-speech-avatar/real-time-synthesis-avatar
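For step 4, if you end up using batch synthesis instead of real-time synthesis, the request body for a custom avatar might look roughly like the sketch below. This is only an illustration, not the official client: the field names (`inputKind`, `inputs`, `synthesisConfig`, `avatarConfig`, `customized`, `talkingAvatarCharacter`) are how I recall them from the public batch avatar synthesis samples and should be checked against the current REST API reference, and the helper name, voice, and avatar name are placeholders. It only builds the JSON body; it does not call the service.

```python
import json


def build_custom_avatar_batch_request(text, voice, avatar_character):
    """Build a JSON-serializable body for an avatar batch synthesis job.

    Hypothetical helper: field names are assumed from the public samples
    and may differ between API versions.
    """
    if not text.strip():
        raise ValueError("text must not be empty")
    return {
        "inputKind": "PlainText",
        "inputs": [{"content": text}],
        "synthesisConfig": {"voice": voice},
        "avatarConfig": {
            # Assumed flag that marks the character as a custom
            # (not prebuilt) avatar.
            "customized": True,
            "talkingAvatarCharacter": avatar_character,
            "videoFormat": "mp4",
            "videoCodec": "h264",
        },
    }


body = build_custom_avatar_batch_request(
    text="Hello from my custom avatar.",
    voice="en-US-JennyNeural",            # placeholder voice name
    avatar_character="my-custom-avatar",  # hypothetical deployed avatar name
)
print(json.dumps(body, indent=2))
```

The body would then be sent to the batch synthesis endpoint of your Speech resource with your subscription key; the exact URL and API version are in the docs linked above.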
** Please mark the answer as solved
- user089 (Jun 19, 2024, Copper Contributor): Apart from the upfront custom avatar generation cost and the synthesis cost, are there any other costs involved? We are planning to occasionally use the custom avatar for batch synthesis.