Understanding Hardware-enforced Stack Protection
Published Mar 24 2020 08:54 AM 192K Views
Microsoft

We aim to make Windows 10 one of the most secure operating systems for our customers, and to do that we are investing in a multitude of security features. Our philosophy is to build features that mitigate broad classes of vulnerabilities, ideally without requiring apps to change their code. In other words, getting an updated version of Windows 10 should make both the customer and the app safer. This comprehensive MSDN document shows all of the security-focused technologies we have built into Windows over the years and how they keep our customers safe. For further reading, here is another presentation by Matt Miller and David Weston that goes deeper into our security philosophy.

 

We are now exploring security features with deep hardware integration to further raise the bar against attacks. By integrating Windows and its kernel deeply with hardware, we make it difficult and expensive for attackers to mount large scale attacks.

 

Our own investigations and those of the external research community (Evolution of CFI attacks, Joe Bialek) show that control flow attacks based on Return Oriented Programming (ROP) have become a common form of attack. Hence, they are the next logical point of focus for proactive, built-in Windows security mitigation technologies. In this post, we will describe our efforts to harden control flow integrity in Windows 10 through Hardware-enforced stack protection.

 

Memory safety vulnerabilities

 

The most common class of vulnerability found in systems software is memory safety vulnerabilities. This class includes buffer overruns, dangling pointers, uninitialized variables, and others.

 

A canonical example of a stack buffer overrun is copying data from one buffer to another without bounds checking (e.g. with strcpy). If an attacker controls the data and size of the source buffer, the destination buffer and other important contents of the stack (e.g. return addresses) can be corrupted to point to attacker-chosen code.

 

Buffer Overrun.PNG
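
As a minimal sketch of the bounds check that an unchecked strcpy-style copy is missing, the C helper below refuses to copy a source string that does not fit in the destination. The name copy_bounded and its reject-on-overflow behavior are assumptions made for illustration; they are not part of any Windows API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst only if it fits, including the
 * terminating NUL. On failure dst is left as an empty string and the
 * function returns false instead of writing past the end of dst. */
static bool copy_bounded(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);

    size_t needed = strlen(src) + 1;   /* bytes required, with the NUL */
    if (needed > dst_size) {
        dst[0] = '\0';                 /* reject rather than overrun */
        return false;
    }
    memcpy(dst, src, needed);
    return true;
}
```

With an 8-byte destination, a 7-character string is accepted and an 8-character string is rejected, so adjacent stack contents such as a return address are never overwritten by the copy.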

 

Dangling pointers occur when memory referenced by a pointer is deallocated but a pointer to that memory still exists. In use-after-free exploits, the attacker can read or write through the dangling pointer, which now points to memory the programmer did not intend it to reference.

 

Uninitialized variables exist in some languages where variables can be declared without a value; in this case the memory holds whatever junk data was there before. If an attacker can read or write these contents, this can also lead to unintended program behavior.

 

These are popular techniques attackers can utilize to gain control and run arbitrary native code on target machines.

 

Arbitrary Code Execution

 

We frame our strategy for mitigating arbitrary code execution in the form of four pillars:

 

Arbitrary Code Execution Strategy.jpg

 

Code Integrity Guard (CIG) prevents arbitrary code generation by enforcing signature requirements for loading binaries.

 

Arbitrary Code Guard (ACG) ensures signed pages are immutable and dynamic code cannot be generated, thus guaranteeing the integrity of binaries loaded.

 

With the introduction of CIG/ACG, attackers increasingly resort to control flow hijacking via indirect calls and returns, known as call/jump oriented programming (COP/JOP) and return oriented programming (ROP).

 

We shipped Control Flow Guard (CFG) in Windows 10 to enforce integrity on indirect calls (forward-edge CFI). Hardware-enforced Stack Protection will enforce integrity on return addresses on the stack (backward-edge CFI), via Shadow Stacks.

 

The ROP problem

 

In systems software, if an attacker finds a memory safety vulnerability in code, the return address can be hijacked to target an attacker-defined address. From there it is difficult to directly execute a malicious payload in Windows thanks to existing mitigations including Data Execution Prevention (DEP) and Address Space Layout Randomization (ASLR), but control can be transferred to snippets of code (gadgets) in executable memory. Attackers can find gadgets that end with the RET instruction (or other branches) and chain multiple gadgets to perform a malicious action (e.g. turning off a mitigation), with the end goal of running arbitrary native code.

 

Return Oriented Programming.PNG

 

Hardware-enforced stack protection in Windows 10 

 

Keep in mind that Hardware-enforced stack protection will only work on chipsets with support for hardware shadow stacks, such as Intel’s Control-flow Enforcement Technology (CET) or AMD shadow stacks. Here is an Intel whitepaper with more information on CET.

 

In this post, we will describe only the relevant parts of the Windows 10 implementation. This technology provides parity with program call stacks by keeping a record of all return addresses on a Shadow Stack. On every CALL instruction, the return address is pushed onto both the call stack and the shadow stack; on every RET instruction, the two copies are compared to ensure integrity is not compromised.

 

If the addresses do not match, the processor issues a control protection (#CP) exception. This traps into the kernel and we terminate the process to guarantee security.

 

Shadow Stacks.PNG

 

Shadow stacks store only return addresses, which helps minimize the additional memory overhead.
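
The CALL/RET behavior described above can be illustrated with a small software model in C. This is only a sketch: real shadow stacks are maintained by the processor in protected pages, and the names model_cpu, model_call and model_ret are hypothetical. A false return from model_ret stands in for the point where hardware would raise the #CP exception.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DEPTH 64

/* Toy model only: both stacks are plain arrays here, whereas the real
 * shadow stack lives in memory that ordinary stores cannot modify. */
typedef struct {
    uintptr_t call_stack[MAX_DEPTH];    /* return addresses on the normal stack */
    uintptr_t shadow_stack[MAX_DEPTH];  /* protected copies of the same addresses */
    size_t depth;                       /* number of active frames */
} model_cpu;

/* Models CALL: the return address is pushed onto both stacks. */
static void model_call(model_cpu *cpu, uintptr_t return_address)
{
    assert(cpu->depth < MAX_DEPTH);
    cpu->call_stack[cpu->depth] = return_address;
    cpu->shadow_stack[cpu->depth] = return_address;
    cpu->depth++;
}

/* Models RET: pops both copies and compares them. Returns false, leaving
 * *target untouched, where hardware would raise #CP on a mismatch. */
static bool model_ret(model_cpu *cpu, uintptr_t *target)
{
    assert(cpu->depth > 0);
    cpu->depth--;
    uintptr_t from_stack = cpu->call_stack[cpu->depth];
    uintptr_t from_shadow = cpu->shadow_stack[cpu->depth];
    if (from_stack != from_shadow) {
        return false;
    }
    *target = from_stack;
    return true;
}
```

In this model, overwriting an entry in call_stack (as a stack buffer overrun would) makes the next model_ret for that frame fail, while untouched frames return normally.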

 

Control-flow Enforcement Technology (CET) Shadow Stacks

 

Shadow stack compliant hardware provides extensions to the architecture by adding instructions to manage shadow stacks and hardware protection of shadow stack pages.

 

Hardware will have a new register SSP, which holds the Shadow Stack Pointer address. The hardware will also have page table extensions to identify shadow stack pages and protect those pages against attacks.

 

New instructions are added for management of shadow stack pages, including:

  • INCSSP – increment SSP (e.g. to unwind the shadow stack)
  • RDSSP – read SSP into a general purpose register
  • SAVEPREVSSP/RSTORSSP – save/restore shadow stack (e.g. for thread switching)

The full hardware implementation is documented in Intel’s CET manual.
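
To make the role of SSP, RDSSP and INCSSP more concrete, here is a rough software model in C. It assumes a stack that grows toward lower addresses, so incrementing SSP discards the newest entries. All names (model_shadow_stack, model_rdssp, model_incssp, model_unwind_to) are hypothetical, and SSP is represented as a slot index rather than a real address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SHADOW_SLOTS 16

/* Toy model: ssp is an index into slots, not a hardware register. The
 * stack grows toward lower indices, so a larger ssp means fewer entries. */
typedef struct {
    uint64_t slots[SHADOW_SLOTS];
    size_t ssp;   /* index of the newest entry; SHADOW_SLOTS when empty */
} model_shadow_stack;

static void model_init(model_shadow_stack *s)
{
    s->ssp = SHADOW_SLOTS;
}

/* What a CALL does to the shadow stack: move ssp down, store the address. */
static void model_push(model_shadow_stack *s, uint64_t return_address)
{
    assert(s->ssp > 0);
    s->slots[--s->ssp] = return_address;
}

/* Stand-in for RDSSP: read the current shadow stack pointer. */
static size_t model_rdssp(const model_shadow_stack *s)
{
    return s->ssp;
}

/* Stand-in for INCSSP: discard `count` entries without returning through them. */
static void model_incssp(model_shadow_stack *s, size_t count)
{
    assert(count <= SHADOW_SLOTS - s->ssp);
    s->ssp += count;
}

/* Unwind to an ssp value saved earlier, as code that skips frames
 * (an assumption here: something like a non-local exit) would need to. */
static void model_unwind_to(model_shadow_stack *s, size_t saved_ssp)
{
    assert(saved_ssp >= s->ssp && saved_ssp <= SHADOW_SLOTS);
    model_incssp(s, saved_ssp - s->ssp);
}
```

The design point this illustrates is that software never writes return addresses into the shadow stack directly; it can only read the pointer and move it past entries it no longer needs.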

 

Compiling for Hardware-enforced Stack Protection

 

To enable Hardware-enforced stack protection for your application, there is a new linker flag which sets a bit in the PE header to request protection from the kernel for the executable.

 

If the application sets this bit and is running on a supported Windows build and shadow stack-compliant hardware, the kernel will maintain shadow stacks throughout the runtime of the program. If your Windows version or the hardware does not support shadow stacks, then the PE bit is ignored.
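
The opt-in rule above amounts to a three-way AND, sketched below in C. The flag value, type and function names are hypothetical and do not reflect the real PE layout or kernel code. For reference, the MSVC linker documents its option for marking an image as shadow stack compatible as /CETCOMPAT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag value for illustration; not the real PE bit layout. */
#define MODEL_IMAGE_FLAG_SHADOW_STACK_OPT_IN 0x1u

/* Hypothetical description of what the running system supports. */
typedef struct {
    bool os_supports_shadow_stacks;
    bool cpu_supports_shadow_stacks;
} model_platform;

/* Shadow stacks are used only when the image opted in AND both the OS
 * build and the hardware support them; otherwise the bit is ignored. */
static bool model_shadow_stacks_enabled(uint32_t image_flags,
                                        const model_platform *platform)
{
    assert(platform != NULL);
    bool opted_in = (image_flags & MODEL_IMAGE_FLAG_SHADOW_STACK_OPT_IN) != 0;
    return opted_in
        && platform->os_supports_shadow_stacks
        && platform->cpu_supports_shadow_stacks;
}
```

Because a missing OS or hardware capability simply yields false rather than an error, the same binary can ship to machines with and without shadow stack support.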

 

By making this an opt-in feature of Windows, we are allowing developers to first validate and test their app with hardware-enforced stack protection, before releasing their app.  

 

The Hardware-enforced Stack Protection feature is under development, and an early preview is available in Windows 10 Insider preview builds (fast ring). If you have Intel CET capable hardware, you can enable the above linker flag on your application to test with the latest Windows 10 Insider builds.

 

Conclusion

 

Hardware-enforced Stack Protection offers robust protection against ROP exploits since it maintains a record of the intended execution flow of a program. To ensure smooth ecosystem adoption and application compatibility, Windows will offer this protection as an opt-in model, so developers can adopt it at their own pace.

 

We will provide ongoing guidance on how to rebuild your application to be shadow stack compliant. In our next post, we will dig deeper into best practices, as well as provide technical documentation. This protection will be a major step forward in our continuous efforts to make Windows 10 one of the most secure operating systems for our customers.

 

Kernel protection team - Jin Lin, Jason Lin, Niraj Majmudar and Greg Colombo

 

9 Comments
Brass Contributor

Thanks for the article. This type of feature is really needed for all consumers.

But I want to know which processors support this feature. Are there any restrictions? Currently, I am using an Asus laptop with an Intel i5 7200U Kaby Lake processor with microcode B4 and Intel TPM 2.0.

Copper Contributor

@RAJUMATHEMATICSMSC- you can use the Intel ARK to search by feature set, and find out anything you need to know about Intel processors.

 

https://ark.intel.com/content/www/us/en/ark.html

Copper Contributor

It seems that the built in variable "cwd" (documented here: http://dtrace.org/guide/chp-variables.html#chp-variables) isn't available. Is that intentional or an oversight?

string cwd

The name of the current working directory of the process associated with the current thread.


That is a pretty minor nit.  Other than that, it seems great.

Copper Contributor

Does it make sense to compile DLLs with this bit set? For example a COM component that is loaded into the process space of another application?

Does the main application control the overall behaviour or is the shadow stack controlled per DLL?

Copper Contributor

Hello everyone.

I am curious about how to write code in the kernel driver to detect whether the Hardware-enforced Stack Protection function is enabled.

Can any handsome guy provide some clues or examples?

Copper Contributor

Intel’s CET manual link is dead

Brass Contributor

Is this information compatible with how Windows 11 does this?

Last update: Dec 12 2022 11:08 AM