Microsoft delivers the latest Windows security and user experiences updates monthly. Updates are modular meaning that, regardless of which update you currently have installed, you only need the most recent quality update to get your machine up to date.
With the fast pace of Windows security and quality fixes, distributing this large amount of updated content takes up substantial bandwidth. Reducing this network transfer is critical for a great experience. Moreover, users on slower networks can struggle to keep their machines up to date with the latest security fixes if they cannot download the package.
“More than 20 million Americans lack access to high-speed broadband – a gap that is exacerbating the prosperity and opportunity gaps across our country, especially between rural and urban areas. As the economy, the workplace, and our daily lives become more digital, it’s critical we close this divide. We simply cannot afford to leave entire communities behind.” - Brad Smith, President, Microsoft
The problem with a large Windows update size
In addition to rural communities with limited access to high-speed broadband, hybrid and remote work have increased the complexity of update distribution for many corporations. Without high-speed internal networks to distribute updates to devices, corporations must rely on their virtual private networks and their remote workers' home internet connections to provide update distribution. Minimizing network traffic increases the velocity of security patches, keeping a remote workforce protected wherever they are.
In this post, we will describe how Microsoft mitigated some of these issues with new compression technology, reducing the size of quality updates in Windows 11 by 40%.
We set out to reduce the size of Windows 11 updates with the following goals:
Reduce the size of the network downloads.
Not regress install time.
Maintain compatibility with all distribution channels without any configuration changes needed by IT professionals.
Insight into Windows servicing
Since Windows 10, version 1809, Windows servicing has used paired forward and reverse differential compression. By utilizing the forward and reverse differentials, the OS can revert to its base version as an intermediate state in servicing. While the forward and reverse differentials are symmetric in their function, their content is largely disjoint, which means that a bidirectional delta containing the shared and disjoint content is not significantly smaller than a pair of forward and reverse differentials.
The delta pairs used in Windows Updates. The endpoints that have the base version of the file (V0) hydrate the target revision (VN) by applying a delta.
Why a bidirectional delta approach is not viable:
Binary deltas utilize transforms and patching instructions to transform a file from its base version to a target version. While some patching and transforms that add data to a file are non-destructive, transforms and patches may delete data needed for a reverse delta. For this reason, a bidirectional delta would need to store the content added in the forward delta and the content deleted during the forward apply. Because the data in forward and reverse deltas are largely disjoint, little efficiency is gained from a bidirectional delta over paired forward and reverse deltas.
We discovered that these transforms and patches can be “observed” by the delta apply step, and efficiently re-encoded into a reverse delta (n->0), obviating the need to distribute reverse deltas in a paired delta approach.
An example of a Reverse Update Data Generation delta being created during the application of an insert/delete instruction. This example captures how reverse update data generation would create a reverse delta during forward apply for a rudimentary insert/delete patching instruction.
Mapping of assembly virtual addresses
Architecturally enlightened delta algorithms such as Microsoft’s MSDelta remap virtual addresses when a function’s address changes. This is important because even basic patches in assembly code shift the function address of subsequent functions in the binary program. Without remapping the virtual addresses, a single line assembly code change could result in tens of thousands of function calls needing adjusted virtual addresses.
The mapping works by running a byte-by-byte disassembly of the program’s assembly code and identifying the virtual addresses. Virtual addresses correspond logically to entry points for assembly code functions and shift when the assembly code is updated with a fix. These shifts are observed by the delta engine and are captured by a mapping table. The mapping process on delta apply normalizes the addresses of these changes and is a large part of the reason of why modern architecturally enlightened delta algorithms are so efficient.
Shows example x86 assembly call instructions being shifted by the addition of a mov instruction at address 0x18000097D3 (line 17)
Reverse mapping of assembly virtual addresses
Much like the basic patching instructions, these transforms can be “observed” and reversed. There is a slight overhead as not all mappings are 1:1, and where forward mapping conflicts with its observed reverse mapping, additional patch instruction must be used to align the mapping. This can be done in-place, and the reverse mapping will provide nearly the same performance as a reverse delta with a direct mapping from a delta generation done on the server.
For versioned data systems requiring forward and reverse delta pairs, “reverse update data generation” provides a way of efficiently distributing the forward delta to the machine and having the machine maintain a path back to its original state. Microsoft has successfully employed this approach in Windows 11, providing a 40% reduction in update size. This benefits our customer base who will need to download less to remain up-to-date and secure.