Audio problems for professionally created videos
Thought I was going crazy. Spent way too much time testing this. I have edited videos in both ClipChamp and DaVinci Resolve and the same issue persisted, so I got tired of not knowing and did proper testing.
TLDR: It's not me, it's them. My best practice so far: export 48 kHz AAC, then convert to 44.1 kHz for Stream upload. That gives the fewest pops and clicks, but it is still broken. For training videos at work, use a different video hosting service.
Audio Pops During Playback in Stream on SharePoint (HLS Segment Boundary Defect) – Downloaded File Clean
Problem Summary:
I am experiencing repeatable audio pops/clicks during streaming playback of MP4 videos uploaded to Stream on SharePoint. These pops:
- Are NOT present in the original exported file
- Are NOT present in the downloaded Stream copy
- ONLY occur during web playback in Edge, Chrome, and Firefox
- Occur at consistent timestamps aligned with HLS segment boundaries
I have conducted extensive controlled testing to isolate the issue. All tests confirm the cause is specific to the Stream on SharePoint HLS audio playback pipeline, not the source media.
Key Symptoms:
- Pops consistently appear at specific timestamps, especially around:
  - 00:14
  - 00:29.9
  - 00:49
  - 01:04
  - 01:09
  - 01:24
  - 01:29
- Pops occur:
  - Even when the source file is perfectly clean
  - Regardless of browser (Chrome/Edge/Firefox)
  - With hardware acceleration ON or OFF
  - With different output devices
  - Even using InPrivate/Incognito
- Pops do not exist when:
  - The file is played locally
  - The file is downloaded from Stream
  - The same file is played in DaVinci Resolve or other media players
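For anyone who wants to check the boundary claim on their own upload, here is a minimal sketch of the comparison. It assumes you can copy the text of an HLS media playlist out of the browser's network tab, which this post does not confirm is possible for Stream. The playlist below is hypothetical (made-up 5-second segments and file names), and `segment_boundaries` and `nearest_boundary_offsets` are illustrative helper names, not part of any tool.

```python
import re

def segment_boundaries(playlist_text):
    """Return cumulative segment end times (seconds) from #EXTINF lines."""
    boundaries = []
    elapsed = 0.0
    for line in playlist_text.splitlines():
        match = re.match(r"#EXTINF:([0-9.]+)", line.strip())
        if match:
            elapsed += float(match.group(1))
            boundaries.append(round(elapsed, 3))
    return boundaries

def nearest_boundary_offsets(pop_times, boundaries):
    """For each pop, return (pop, nearest boundary, signed offset in seconds)."""
    results = []
    for pop in pop_times:
        nearest = min(boundaries, key=lambda b: abs(b - pop))
        results.append((pop, nearest, round(pop - nearest, 3)))
    return results

# Hypothetical playlist for illustration only, NOT captured from Stream.
SAMPLE_PLAYLIST = """#EXTM3U
#EXT-X-TARGETDURATION:5
#EXTINF:5.0,
seg0.aac
#EXTINF:5.0,
seg1.aac
#EXTINF:5.0,
seg2.aac
#EXT-X-ENDLIST
"""

if __name__ == "__main__":
    bounds = segment_boundaries(SAMPLE_PLAYLIST)
    # 14.0 is the first pop timestamp reported above.
    for pop, boundary, offset in nearest_boundary_offsets([14.0], bounds):
        print(f"pop at {pop}s, nearest boundary {boundary}s, offset {offset}s")
```

If the offsets for a real playlist are all small and roughly constant, that supports the segment-boundary explanation. If they are scattered, the pops are probably coming from somewhere else.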
Tests I Conducted (all repeatable):
A. Audio Bitrate / Codec Tests
- AAC-LC CBR 192 kbps, 256 kbps, 320 kbps
- Stereo and mono versions
- FLAC → AAC transcode
- Result: pops remain at the same timestamps
B. Sample Rate Tests
- 48 kHz stereo → pops
- 48 kHz mono → same pops
- 44.1 kHz stereo → fewer pops
- 44.1 kHz mono → fewer pops
- Result: the sample rate (and therefore the resampler path) changes severity, but pops still occur
C. Video Encoding Tests
- H.264 @ 5–12 Mbps
- GOP = 60 (2-second)
- GOP = 30 (1-second)
- Native encoder vs hardware encoder
- Network Optimization ON/OFF
- H.265
- MKV → MP4 remux
- Result: no combination eliminates the pops
D. Silence Padding Tests
- Added 2 ms silence / fade between all clips (no change)
- Added 0.5 seconds of digital silence at start and end: pops shifted later (first pop moves to ~00:29.9), which points to boundary sensitivity
- With that padding, the earlier pops at 9 s, 14 s, and 20 s disappeared
- Result: confirms segment-edge interaction
Diagnosis (technical):
The issue appears to be an HLS audio segment boundary defect in Stream on SharePoint:
- AAC priming, padding, and fractional-sample offsets are being trimmed or re‑aligned inconsistently
- Browsers apply different resampler behaviors, causing audible ticks
- Pops occur at exact points where the Stream HLS player transitions between segments
- Mono/stereo/bitrate/GOP/container changes do not eliminate the issue → not a source-encoding problem
- Only the Stream playback path introduces artifacts
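Some arithmetic shows why segment edges are a plausible trouble spot. An AAC-LC frame carries 1024 samples, so a segment cut at a whole number of seconds generally does not contain a whole number of audio frames. The sketch below computes the leftover for 2-second and 6-second segments. Those durations are assumptions taken from the 2–6 s pattern I noted, not values read from Stream, and the helper names are illustrative.

```python
from fractions import Fraction

AAC_FRAME_SAMPLES = 1024  # samples per AAC-LC frame

def frames_per_segment(sample_rate, segment_seconds):
    """Exact number of AAC frames that fit in a segment (may be fractional)."""
    return Fraction(sample_rate * segment_seconds, AAC_FRAME_SAMPLES)

def leftover_samples(sample_rate, segment_seconds):
    """Samples left over after the last whole AAC frame in the segment."""
    return (sample_rate * segment_seconds) % AAC_FRAME_SAMPLES

if __name__ == "__main__":
    for rate in (48000, 44100):
        for seconds in (2, 6):  # assumed segment lengths, not measured
            frames = frames_per_segment(rate, seconds)
            extra = leftover_samples(rate, seconds)
            print(f"{rate} Hz, {seconds} s: {float(frames):.2f} frames, "
                  f"{extra} leftover samples")
```

In every case here the leftover is non-zero, so a packager has to either cut segments on audio frame boundaries or trim and pad at the edges. If that trimming is inconsistent, a click at each boundary is what you would expect to hear.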
Files used:
- Original master (clean)
- Stereo 48k AAC versions (192/256/320 kbps)
- Stereo 44.1k AAC versions
- Mono 48k and mono 44.1k versions
- Remuxed version
- Stream-on-SharePoint URLs for each uploaded test file (in separate folders to avoid stale HLS manifests)
Request:
Please escalate this issue to the Stream on SharePoint Media Transcoding and Playback team.
I am requesting:
- Verification of HLS segment construction for AAC content
- Verification of audio timestamp alignment and priming compensation
- Investigation into audio boundary resampling behavior across browsers
- Confirmation whether this is a known regression or newly introduced issue
- Any recommended encoding workaround until a fix is published
This issue affects training videos where audio quality is critical.
Thank you.
Almost forgot, other things I have tried:
- [ ] Exported 1080p at 5/8/12 Mbps; tested 720p control
- [ ] GOP/keyframe every 2 s (constant)
- [ ] AAC‑LC **CBR** at 192/256/320 kbps **vs** VBR
- [ ] Sample rate **48 kHz** **and** **44.1 kHz**
- [ ] True‑peak ceiling (limiter on Bus1) set to **−2 dBTP**
- [ ] 2–3 ms micro‑fades across all edits
- [ ] Added **0.5–1 s** silence padding head & tail
- [ ] **Hidden camera‑audio track removed/disabled**; bus routing verified: one video and one audio track only, compounded later
- [ ] Export contains **one** audio stream - verified
- [ ] Tested Edge/Chrome/Firefox; hardware acceleration ON/OFF
- [ ] Noted segment‑interval pattern (2–6 s) vs random
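The "one audio stream" check in the list above can be reproduced with ffprobe, assuming ffmpeg/ffprobe is installed. Something like `ffprobe -v error -select_streams a -show_entries stream=codec_name,sample_rate,channels -of json export.mp4` prints JSON describing the audio streams. The sketch below parses that JSON; the sample text is hand-written for illustration, not output from my files, and `summarize_audio_streams` is a hypothetical helper.

```python
import json

def summarize_audio_streams(ffprobe_json):
    """Return (codec, sample_rate, channels) for each stream in ffprobe JSON."""
    data = json.loads(ffprobe_json)
    return [
        (s.get("codec_name"), int(s.get("sample_rate", 0)), s.get("channels"))
        for s in data.get("streams", [])
    ]

# Hand-written example of the JSON shape, NOT real output from the test files.
SAMPLE_FFPROBE_JSON = """
{"streams": [{"codec_name": "aac", "sample_rate": "48000", "channels": 2}]}
"""

if __name__ == "__main__":
    streams = summarize_audio_streams(SAMPLE_FFPROBE_JSON)
    print(f"{len(streams)} audio stream(s): {streams}")
```

More than one entry in the result would mean a second audio track (for example leftover camera audio) made it into the export.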