Workflow: audio to caption, caption text editing, text to speech to generate a new audio track

paolob0208 · ‎Oct 10 2021

I need to produce video clips for IT training from existing video recordings, in which I want to correct the spoken words and dub the result with a synthetic voice.

The workflow should then be:

1. extract caption files from the video,
2. edit the text in the caption file (e.g. to correct inaccurate words),
3. apply a text to speech function to the caption file (to have a homogeneous, standard voice on all clips),
4. generate a new audio track,
5. apply simple video editing functions (cut and paste clips, leveraging the standardized voice) to produce the final clip.

Steps 1, 2 and 5 are available in Stream. How could I implement and integrate the remaining steps 3 and 4 to get an integrated video content management solution for training?

Marc Mroz · ‎Feb 19 2022

Microsoft acquired Clipchamp, a company that builds a web-based video editor. It has a cool feature that generates text to speech audio tracks.

https://clipchamp.com/en/features/ai-voice-over-generator/

Clipchamp isn't part of M365 enterprise yet, but we are working on rebuilding it into M365. For the time being, you could use their current product. I think text to speech is a paid feature.
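
If you would rather script steps 3 and 4 yourself, here is a rough sketch in Python of how the edited caption file could drive a text to speech engine. It assumes the captions are exported in WebVTT format. The names `Cue`, `parse_vtt` and `build_narration_plan` are made up for this illustration, and `synthesize` is a placeholder for whatever text to speech service you choose; it is not a Stream or Clipchamp API.

```python
import re
from dataclasses import dataclass


@dataclass
class Cue:
    start: float  # cue start time in seconds
    end: float    # cue end time in seconds
    text: str     # caption text, joined into a single line


# Matches a WebVTT timing line such as "00:00:01.000 --> 00:00:03.500".
# The hours part is optional, as it is in WebVTT.
_TIMING = re.compile(
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})\s+-->\s+"
    r"(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})"
)


def _seconds(hours, minutes, secs, millis):
    return int(hours or 0) * 3600 + int(minutes) * 60 + int(secs) + int(millis) / 1000


def parse_vtt(text):
    """Return the cues found in a WebVTT document, in file order."""
    cues = []
    normalized = text.replace("\r\n", "\n").strip()
    # Blocks are separated by blank lines; blocks without a timing line
    # (the WEBVTT header, notes) are skipped.
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        for i, line in enumerate(lines):
            match = _TIMING.match(line.strip())
            if match:
                groups = match.groups()
                body = " ".join(part.strip() for part in lines[i + 1:] if part.strip())
                cues.append(Cue(_seconds(*groups[:4]), _seconds(*groups[4:]), body))
                break
    return cues


def build_narration_plan(cues, synthesize):
    """Pair each cue's start time with the audio produced for its text.

    `synthesize` is a hypothetical callable (text -> audio bytes) that you
    would supply by wrapping the text to speech engine of your choice.
    """
    return [(cue.start, synthesize(cue.text)) for cue in cues]
```

The resulting list of (start time, audio) pairs would still need to be mixed into a single audio track, for example by placing each audio segment at its start time in an audio or video editor.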
