We aim to make Windows 10 one of the most secure operating systems for our customers and to do that we are investing in a multitude of security features. Our philosophy is to build features that mitigate broad classes of vulnerabilities, ideally without having the app change its code. In other words, getting an updated version of Windows 10 should make the customer and the app safer to use. This comprehensive MSDN document shows all of the security focused technologies we have built into Windows over the years and how it keeps our customers safe. Here is another presentation by Matt Miller and David Weston that goes deeper into our security philosophy for further reading.
We are now exploring security features with deep hardware integration to further raise the bar against attacks. By integrating Windows and its kernel deeply with hardware, we make it difficult and expensive for attackers to mount large scale attacks.
ROP (Return Oriented Programming) based control flow attacks have become a common form of attack based on our own and the external research community’s investigations (Evolution of CFI attacks, Joe Bialek). Hence, they are the next logical point of focus for proactive, built-in Windows security mitigation technologies. In this post, we will describe our efforts to harden control flow integrity in Windows 10 through Hardware-enforced stack protection.
Memory safety vulnerabilities
The most common class of vulnerability found in systems software is memory safety vulnerabilities. This class includes buffer overruns, dangling pointers, uninitialized variables, and others.
A canonical example of a stack buffer overrun is copying data from one buffer to another without bound checking (i.e. strcpy). If an attacker replaces the data and size from the source buffer, the destination buffer and other important components of the stack can be corrupted (i.e. return addresses) to point to attacker desired code.
Dangling pointers occur when memory referenced by a pointer is de-allocated but a pointer to that memory still exists. In use-after-free exploits, the attacker can read/write through the dangling pointer that now points to memory the programmer did not intend to.
Uninitialized variables exist in some languages where variables can be declared without value, memory in this case is initialized with junk data. If an attacker can read or write to these contents, this will also lead to unintended program behavior.
These are popular techniques attackers can utilize to gain control and run arbitrary native code on target machines.
Arbitrary Code Execution
We frame our strategy for mitigating arbitrary code execution in the form of four pillars:
Code Integrity Guard (CIG) prevents arbitrary code generation by enforcing signature requirements for loading binaries.
Arbitrary Code Guard (ACG) ensures signed pages are immutable and dynamic code cannot be generated, thus guaranteeing the integrity of binaries loaded.
With the introduction of CIG/ACG, attackers increasingly resort to control flow hijacking via indirect calls and returns, known as call/jump oriented programming (COP/JOP) and return oriented programming (ROP).
We shipped Control Flow Guard (CFG) in Windows 10 to enforce integrity on indirect calls (forward-edge CFI). Hardware-enforced Stack Protection will enforce integrity on return addresses on the stack (backward-edge CFI), via Shadow Stacks.
The ROP problem
In systems software, if an attacker finds a memory safety vulnerability in code, the return address can be hijacked to target an attacker defined address. It is difficult from here to directly execute a malicious payload in Windows thanks to existing mitigations including Data Execution Prevention (DEP) and Address Space Layout Randomization (ASLR), but control can be transferred to snippets of code (gadgets) in executable memory. Attackers can find gadgets that end with the RET instruction (or other branches), and chain multiple gadgets to perform a malicious action (turn off a mitigation), with the end goal of running arbitrary native code.
Hardware-enforced stack protection in Windows 10
Keep in mind, Hardware-enforced stack protection will only work on chipsets with support for hardware shadow stacks, Intel’s Control-flow Enforcement Technology (CET) or AMD shadow stacks. Here is an Intel whitepaper with more information on CET.
In this post, we will describe only the relevant parts of the Windows 10 implementation. This technology provides parity with program call stacks, by keeping a record of all the return addresses via a Shadow Stack. On every CALL instruction, return addresses are pushed onto both the call stack and shadow stack, and on RET instructions, a comparison is made to ensure integrity is not compromised.
If the addresses do not match, the processor issues a control protection (#CP) exception. This traps into the kernel and we terminate the process to guarantee security.
Shadow stacks store only return addresses, which helps minimize the additional memory overhead.
Control-flow Enforcement Technology (CET) Shadow Stacks
Shadow stack compliant hardware provides extensions to the architecture by adding instructions to manage shadow stacks and hardware protection of shadow stack pages.
Hardware will have a new register SSP, which holds the Shadow Stack Pointer address. The hardware will also have page table extensions to identify shadow stack pages and protect those pages against attacks.
New instructions are added for management of shadow stack pages, including:
- INCSSP – increment SSP (i.e. to unwind shadow stack)
- RDSSP – read SSP into general purpose register
- SAVEPREVSSP/RSTORSSP – save/restore shadow stack (i.e. thread switching)
The full hardware implementation is documented in Intel’s CET manual.
Compiling for Hardware-enforced Stack Protection
In order to receive Hardware-enforced stack protection on your application, there is a new linker flag which sets a bit in the PE header to request protection from the kernel for the executable.
If the application sets this bit and is running on a supported Windows build and shadow stack-compliant hardware, the Kernel will maintain shadow stacks throughout the runtime of the program. If your Windows version or the hardware does not support shadow stacks, then the PE bit is ignored.
By making this an opt-in feature of Windows, we are allowing developers to first validate and test their app with hardware-enforced stack protection, before releasing their app.
Hardware-enforced Stack Protection feature is under development and an early preview is available in Windows 10 Insider previews builds (fast ring). If you have Intel CET capable hardware, you can enable the above linker flag on your application to test with the latest Windows 10 insider builds.
Hardware-enforced Stack Protection offers robust protection against ROP exploits since it maintains a record of the intended execution flow of a program. To ensure smooth ecosystem adoption and application compatibility, Windows will offer this protection as an opt-in model, so developers can receive this protection, at your own pace.
We will provide ongoing guidance on how to re-build your application to be shadow stacks compliant. In our next post, we will dig deeper into best practices, as well as provide technical documentation. This protection will be a major step forward in our continuous efforts to make Windows 10 one of the most secure operating system for our customers.
Kernel protection team - Jin Lin, Jason Lin, Niraj Majmudar and Greg Colombo