Native staged/canary rollout for Office Add-in manifests (Centralized Deployment/Integrated Apps)
Summary
There is no supported way to roll out an Office Add-in manifest change to a small percentage or ring of users first, monitor health, and automatically roll back on failure. A manifest-level regression reaches 100% of a tenant at once, and rollback is a slow manual re-deploy. We request native staged/canary rollout for add-in manifests, matching capabilities Microsoft already ships for other products.
Category
Feature request / platform capability gap — Add-in deployment & management (Centralized Deployment / Integrated Apps in the M365 admin center; also relevant to Microsoft Marketplace). This is a deployment-platform gap, not an Office.js client-API defect.
Applies to
- Outlook add-in using event-based activation (OnMessageSend / Smart Alerts).
- Both the add-in-only (XML) manifest and the unified manifest for Microsoft 365.
- Distribution via Centralized Deployment / Integrated Apps and via Marketplace.
Motivation (real scenario)
We ship a security/compliance add-in (event-based, OnMessageSend Smart Alerts) to enterprises with thousands of mailboxes. Today, any manifest-level change — a new/changed event, requirement set, add-in command, endpoint, permission, or scope — is delivered to the entire tenant simultaneously. There is no supported mechanism to:
- expose the new manifest version to a small % or pilot ring first,
- watch activation/error telemetry, and
- automatically halt and roll back if health degrades.
Blast radius is the whole tenant; rollback means manually re-deploying the previous manifest, which propagates slowly (24–72h ribbon propagation; event configuration is cached locally and syncs asynchronously). Microsoft's own known-issues page (https://learn.microsoft.com/office/dev/add-ins/resources/resources-office-add-in-known-issues) describes a case where an event-based-activation change caused Smart Alerts add-ins to block users from sending mail, requiring multiple restarts to recover. That incident illustrates exactly the blast radius and recovery latency that a staged rollout would contain.
What's missing (specifics)
- No percentage-based or ring-based rollout for a manifest version.
- Centralized Deployment assignment is manual and coarse — Everyone / specific users / groups only (https://learn.microsoft.com/microsoft-365/admin/manage/manage-deployment-of-add-ins).
- No telemetry-gated automatic promotion between stages.
- No one-click halt / automatic rollback to the previous manifest version.
- Changes to events, permissions, or scopes force admin re-consent and block users until consent is granted (https://learn.microsoft.com/office/dev/add-ins/testing/testing-and-troubleshooting#add-in-wont-upgrade), so even manual group-phasing is disruptive for event-based add-ins.
- Marketplace updates are all-or-nothing and auto-propagate.
What we've tried, and why it's insufficient
- Manual group-based phased assignment (Microsoft's documented "recommended rollout strategy"): no percentages, no automated health gating, no automated rollback; still forces re-consent for event/permission changes.
- Two parallel add-in registrations (pilot vs. prod): produces duplicate ribbon UI and doubles admin overhead, and for event-based add-ins both registrations fire on the same event — unacceptable.
- Web-layer feature flags on our own CDN: works well for code-only changes, but cannot canary anything that lives in the manifest (events, requirement sets, commands, endpoints, permissions).
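For reference, the web-layer flag approach in the last bullet can be sketched roughly as follows. This is a minimal illustration of deterministic percentage bucketing, not our production code: the function names (`fnv1a`, `bucketFor`, `isInCanary`) and the flag key format are hypothetical, and the choice of FNV-1a as the hash is an assumption made only to keep the example self-contained.

```typescript
// Hypothetical sketch: deterministic percentage bucketing for a web-layer
// feature flag. None of these names are part of Office.js.

/** FNV-1a 32-bit hash of a string, returned as an unsigned integer. */
function fnv1a(input: string): number {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  return hash >>> 0; // force unsigned
}

/** Maps a user to a stable bucket in [0, 100) for a given flag. */
function bucketFor(userId: string, flagKey: string): number {
  return fnv1a(`${flagKey}:${userId.toLowerCase()}`) % 100;
}

/** True when the user falls inside the first `percent` buckets. */
function isInCanary(userId: string, flagKey: string, percent: number): boolean {
  return bucketFor(userId, flagKey) < percent;
}
```

In an add-in, the `userId` would typically come from something like `Office.context.mailbox.userProfile.emailAddress`. The limitation is visible in the sketch itself: this check runs inside code that the manifest has already caused to load, so it can gate behavior but cannot gate which events, commands, or permissions the manifest declares.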
Requested capability
Native staged rollout for add-in manifest versions in Centralized Deployment / Integrated Apps:
- Define rings or percentages (or a pilot group) for a new manifest version.
- Promote or expand scope in steps, with promotion gated on automatic health signals (activation failures, JS errors, event-handler failures).
- One-click halt and automatic rollback to the previous manifest version.
- Programmatic support (PowerShell / Microsoft Graph) so this integrates with CI/CD.
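To make the request concrete, here is a rough sketch of the promotion logic we have in mind. Everything in it is hypothetical: the `Stage`, `HealthSample`, `GatePolicy`, and `Decision` types and the `decide` function do not correspond to any existing Microsoft Graph, PowerShell, or admin center API. They only illustrate the shape of a telemetry-gated rollout.

```typescript
// Hypothetical model of the requested capability. None of these types or
// functions exist in Microsoft Graph, PowerShell, or the admin center today.

interface Stage {
  name: string;    // e.g. "pilot", "early", "broad"
  percent: number; // share of the tenant receiving the new manifest
}

interface HealthSample {
  activations: number;        // add-in activations observed in this stage
  activationFailures: number; // activations that failed
  handlerFailures: number;    // event-handler failures (e.g. OnMessageSend)
}

interface GatePolicy {
  minActivations: number; // do not judge health on too little data
  maxFailureRate: number; // e.g. 0.01 means 1%
}

type Decision =
  | { action: "hold" }
  | { action: "promote"; next: Stage }
  | { action: "complete" }
  | { action: "rollback"; reason: string };

function decide(
  stages: Stage[],
  currentIndex: number,
  health: HealthSample,
  policy: GatePolicy
): Decision {
  // Not enough data yet: stay at the current stage.
  if (health.activations === 0 || health.activations < policy.minActivations) {
    return { action: "hold" };
  }
  const failures = health.activationFailures + health.handlerFailures;
  const rate = failures / health.activations;
  if (rate > policy.maxFailureRate) {
    return {
      action: "rollback",
      reason: `failure rate ${(rate * 100).toFixed(2)}% exceeds threshold`,
    };
  }
  // Healthy: move to the next stage, or finish if this was the last one.
  const next = stages[currentIndex + 1];
  return next ? { action: "promote", next } : { action: "complete" };
}
```

A "rollback" decision here would mean reverting the affected users to the previous manifest version without admin intervention, which is the part that has no equivalent today.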
Precedent — Microsoft already ships this pattern elsewhere
Office Add-ins are the conspicuous gap. Comparable staged/canary rollout already exists for:
- Microsoft 365 Copilot connectors — "staged rollout" to a limited audience (up to 100 users / 15 groups): https://learn.microsoft.com/microsoft-365/copilot/connectors/staged-rollout
- Microsoft Store apps — "gradual package rollout" (percentage-based, with halt/finalize): https://learn.microsoft.com/windows/apps/publish/gradual-package-rollout
- Microsoft Entra ID — cloud-auth "Staged Rollout" (group-scoped): https://learn.microsoft.com/entra/identity/hybrid/connect/how-to-connect-staged-rollout
- OneDrive sync app — multi-ring rollout with telemetry-gated suspend: https://learn.microsoft.com/sharepoint/sync-client-update-process
Impact / who benefits
Every ISV shipping security/compliance/productivity add-ins, and every enterprise admin using Centralized Deployment. Reduces tenant-wide outages from add-in updates and enables safe CI/CD for add-ins.
Environment
- Clients: Outlook on Windows (classic + new), Outlook on the web, Outlook on Mac.
- Requirement set: Mailbox 1.15.
Questions for Microsoft
- Is native staged/canary rollout for add-in manifests on the roadmap?
- Is there any supported way today to (a) roll out a manifest change to a subset of users with automated rollback, and (b) canary an event-based / OnMessageSend manifest change without an org-wide re-consent that blocks users?
- If not supported today, please treat this as a feature request and point us to the correct intake (aka.ms/m365dev-suggestions or other).