Forum Discussion
How to extract audio from mp4
In a Teams chatbot, there's no direct option to accept audio input in personal chats. Users cannot record and send audio directly as a message.
However, Teams provides a "Record video clip" feature that users can use to record themselves speaking; the clip is sent as an MP4 file. To process it with a speech-to-text service, we need to extract the audio from the MP4 and convert it to MP3. Once converted, we can pass it to Azure Speech Service to transcribe the audio.
Would that be a workable way to support audio input in a personal chatbot?
3 Replies
- Lakshmi_145
Iron Contributor
This approach won't work for our chatbot, as it's deployed in an Azure Web App. We can't rely on installing any executable locally, nor can we save files to a specific folder in the web app's environment.
What we need is a solution that doesn't depend on any local installation or file system storage.
Ideally, we should be able to extract the audio from the video (MP4) directly into a memory stream (e.g., as MP3 or WAV), and then send that stream to Azure Speech Service for transcription, completely in memory and without creating physical files.
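One way to sketch the "no physical files" half of this requirement is to stream the clip through ffmpeg's `pipe:` protocol, so the MP4 comes in on stdin and the WAV goes out on stdout. This is only a sketch: it still assumes an ffmpeg binary is reachable by the app (for example, shipped alongside the deployment), which nothing in this thread confirms, and the helper name `mp4_to_wav_stream` is hypothetical.

```bash
# Hypothetical helper: reads MP4 bytes on stdin, writes 16 kHz mono WAV bytes on stdout.
# Assumes an ffmpeg binary is on PATH; no temporary files are created.
mp4_to_wav_stream() {
  # -i pipe:0   read the input container from stdin
  # -vn         drop the video stream
  # -ac 1       downmix to mono
  # -ar 16000   resample to 16 kHz
  # -f wav      the output format must be named, since a pipe has no file extension
  # pipe:1      write the result to stdout
  ffmpeg -hide_banner -loglevel error -i pipe:0 -vn -ac 1 -ar 16000 -f wav pipe:1
}

# Usage (shown as a comment so the block only defines the helper):
#   mp4_to_wav_stream < input.mp4 | some_consumer
```

Two caveats apply. ffmpeg cannot seek on piped input, so an MP4 whose index (the moov atom) sits at the end of the file may fail to parse. And because the output pipe is not seekable either, the length fields in the WAV header cannot be finalized, which some consumers may reject.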
- KARAN_SHEWALE0525
Microsoft
In Teams personal chats, there’s no direct option to send an audio message. But Teams does have a “Record video clip” feature, where users can just speak and send a short video (which is actually an MP4 file).
Now if we want to use that voice for speech-to-text (like with Azure Speech Service), we’ll need to extract the audio from the MP4 first.
The best way is to convert that MP4 into an MP3 or WAV file. For Azure, WAV is better, ideally mono with a 16 kHz sample rate.
You can use ffmpeg to do that:
```bash
ffmpeg -i input.mp4 -ac 1 -ar 16000 output.wav
```
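Before sending the file on, it can be worth checking that it really is mono at 16 kHz. A minimal sketch using ffprobe (which ships with ffmpeg) follows; the helper name `check_wav_format` is hypothetical.

```bash
# Hypothetical helper: prints codec, channel count and sample rate of the
# first audio stream in the given file. Assumes ffprobe is on PATH.
check_wav_format() {
  ffprobe -v error -select_streams a:0 \
    -show_entries stream=codec_name,channels,sample_rate \
    -of default=noprint_wrappers=1 "$1"
}

# Usage (shown as a comment so the block only defines the helper):
#   check_wav_format output.wav
```

For a file produced by the command above, the report should show `channels=1` and `sample_rate=16000`, with a PCM codec (typically `pcm_s16le` with ffmpeg's WAV defaults).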
Once you’ve got the .wav file, just send it to the Azure Speech API and get the transcription.
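As a rough sketch of that last step, the WAV can be posted to the Speech service's REST endpoint for short audio with curl. The helper name `transcribe_wav` and the environment variables `SPEECH_KEY` and `SPEECH_REGION` are assumptions for illustration; they stand in for the key and region of your own Speech resource.

```bash
# Hypothetical helper: reads WAV bytes on stdin and posts them to the
# Azure Speech-to-text REST API for short audio.
# Assumes SPEECH_KEY and SPEECH_REGION are set in the environment.
# Optional first argument: recognition language (defaults to en-US).
transcribe_wav() {
  curl --silent --show-error --fail \
    --request POST \
    "https://${SPEECH_REGION}.stt.speech.microsoft.com/speech/recognition/conversation/cognitiveservices/v1?language=${1:-en-US}" \
    --header "Ocp-Apim-Subscription-Key: ${SPEECH_KEY}" \
    --header "Content-Type: audio/wav; codecs=audio/pcm; samplerate=16000" \
    --header "Accept: application/json" \
    --data-binary @-
}

# Usage (shown as a comment so the block only defines the helper):
#   transcribe_wav < output.wav
```

The response is JSON containing the recognized text. As far as I know, this endpoint is meant for short clips (roughly up to a minute), so longer recordings would likely need the Speech SDK or batch transcription instead.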
So yeah, even though Teams doesn’t support direct audio messages in personal chats, this is a pretty solid workaround. Users just record a quick video, we extract the audio, and we’re good to go.
- Sayali-MSFT
Microsoft
Hello Lakshmi_145, could you please confirm whether your issue has been resolved with the above suggestion, or whether you are still looking for help?