The release of 20.07 brought along a range of security enhancements and changes to Azure Sphere. As head of the Operating System Platform (OSP) Security team, I want to provide more insights into the efforts to keep Azure Sphere secure as a platform while being as transparent as possible about all the improvements made since our 20.04 release.
The OSP Security team I run for Azure Sphere worked with a diverse group of people and companies to run three separate red team events on the platform over the last few months: an internal Microsoft team, Trail of Bits, and the currently active Azure Sphere Security Research Challenge (ASSRC). These efforts are on top of the continued work done by the OSP team over the last three months to harden and further the security of the platform.
Trail of Bits performed a private red team exercise on the system and identified a number of risks that have been fixed for 20.07:
Although the ASSRC is still ongoing, it has already produced a range of great findings from the participants, some of which overlapped with the findings from Trail of Bits (ToB), such as a writable /proc/self/mem. The Linux kernel related issues identified by ToB were not fixed in 20.05 or 20.06 because of the massive Linux kernel upgrade from 4.9 to 5.4; this oversight will be handled better in the future.
Cisco Talos reported the first two findings that are fixed in 20.07: ptrace being used to bypass the unsigned code execution protections, and the Linux kernel message ring buffer being user accessible, which allowed information leakage. Cisco Talos also reported the /proc/self/mem finding and found a double free in the azspio Linux kernel driver; both have been fixed. Cisco has a blog post up detailing their efforts so far for the ASSRC.
As an excellent example of findings from the ASSRC effort, I would like to describe a specific attack chain that McAfee Advanced Threat Research found for the device that has been fixed for 20.07. This attack chain did require physical access to a device and could not be done remotely due to the steps involved.
McAfee ATR did a fantastic job putting together this attack chain and finding a 0-day in the core Linux kernel itself to make it work. The attack chain exposed a weakness in the cloud and multiple weaknesses on the device, including a previously unknown Linux kernel vulnerability.
While the above changes were done as a result of external red team findings, the Operating System Platform team continued improving the security of Azure Sphere.
One effort we've been working on is removing the ability to use ptrace unless the device is in development mode. Ptrace is needed by gdb to properly provide debug information, but normal customer applications do not need it. Leaving ptrace available to customer applications allows an attacker to ptrace the process being attacked and inject unsigned code into memory for execution. 20.07 brings along a Linux kernel change where ptrace is no longer possible unless the device is in development mode. The change also brings a few extra enhancements as a side effect, the largest being that /proc no longer shows the pid of any other process and further restricts what a process can learn about itself.
Another security enhancement is the move to wolfSSL 4.4.0, which brings along additional hardening against side channel attacks. Along with the wolfSSL upgrade, we have begun exposing access to supported wolfSSL functionality; the first set of functions allows customers to call wolfSSL directly to establish TLS client connections.
We have added more fuzzing across 5 different components and additional static code analysis tools, including extra static analysis on every pull request into our repositories. If the static analysis fails, the PR cannot be completed. This further strengthens the system by making it more difficult to check in easy-to-abuse coding flaws. As we add features and functionality, more fuzzers are built for the parts of the system being updated. The new static analysis tool detected an off-by-one calculation in DHCP message handling that allowed reading an extra byte of data past the end of the buffer; this was corrected in 20.07.
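To illustrate the class of bug involved, here is a minimal sketch of a bounds-checked DHCP option lookup. It is not the Azure Sphere code: the function name `find_dhcp_option` and its signature are hypothetical, and the only assumption about the format is the standard tag, length, value layout of DHCP options (0 is a pad byte, 255 marks the end). The comment marks where an off-by-one in the length check would let a read go one byte past the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper for illustration only.
 * Searches a DHCP options buffer for the option with tag `wanted`.
 * Returns a pointer to the option value and stores its length in *out_len,
 * or returns NULL if the option is absent or the buffer is malformed. */
const uint8_t *find_dhcp_option(const uint8_t *opts, size_t opts_len,
                                uint8_t wanted, uint8_t *out_len)
{
    assert(opts != NULL && out_len != NULL);

    size_t i = 0;
    while (i < opts_len) {
        uint8_t tag = opts[i];
        if (tag == 0) {         /* pad option: a single byte, no length */
            i++;
            continue;
        }
        if (tag == 255) {       /* end option: stop scanning */
            break;
        }
        /* The length byte itself must be inside the buffer. */
        if (opts_len - i < 2) {
            return NULL;
        }
        uint8_t len = opts[i + 1];
        /* The whole value must fit in what remains after tag and length.
         * Writing `opts_len - i - 1` here instead would be an off-by-one
         * that accepts a value ending one byte past the buffer. */
        if (len > opts_len - i - 2) {
            return NULL;
        }
        if (tag == wanted) {
            *out_len = len;
            return &opts[i + 2];
        }
        i += 2 + (size_t)len;   /* skip tag, length, and value */
    }
    return NULL;
}
```

A fuzzer or static analyzer tends to catch this kind of mistake quickly because a single truncated option at the end of a packet is enough to trigger it.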
You may have noticed that our last couple of quality releases did not have a Linux kernel patch bump. This time was used to allow the Linux kernel team to upgrade the Linux kernel from 4.9 to 5.4.44. By doing so we capture the Linux kernel security enhancements made between the versions and keep up to date with the latest changes.
String manipulation functions are a very common way to leak the stack cookie, and to overwrite it, when string buffers are not properly null terminated. glibc helps limit string buffer attacks by forcing the first byte of the stack cookie in memory to be 0; however, we use musl on the device for libc. musl initializes all bytes in the stack cookie instead of leaving the first byte in memory 0, which allows for potential stack cookie leaks and abuses. Our version of musl in 20.07 sets the first byte to 0, and the patch was provided to the maintainer in case they wish to add this security measure to musl.
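The effect of the zero byte can be sketched with a simulated stack frame. This is an illustration only, not musl or Azure Sphere code: `struct fake_frame`, `fill_frame`, and `leaked_cookie_bytes` are hypothetical names, and the frame is modeled as a single array (an unterminated string buffer, then the cookie, then one terminating byte) so that the over-read stays well defined.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE    8
#define COOKIE_SIZE 8

/* Simulated frame: string buffer, then the cookie, then one NUL byte that
 * stands in for whatever eventually terminates the read in real memory. */
struct fake_frame {
    char bytes[BUF_SIZE + COOKIE_SIZE + 1];
};

/* Fill the buffer completely (no NUL terminator) and place the cookie
 * directly after it, as it would sit next to a local char array. */
void fill_frame(struct fake_frame *f, const char cookie[COOKIE_SIZE])
{
    assert(f != NULL && cookie != NULL);
    memset(f->bytes, 'A', BUF_SIZE);
    memcpy(f->bytes + BUF_SIZE, cookie, COOKIE_SIZE);
    f->bytes[BUF_SIZE + COOKIE_SIZE] = '\0';
}

/* Number of cookie bytes a string function would walk over when it reads
 * the unterminated buffer as a C string. */
size_t leaked_cookie_bytes(const struct fake_frame *f)
{
    assert(f != NULL);
    size_t n = strlen(f->bytes);
    return n > BUF_SIZE ? n - BUF_SIZE : 0;
}
```

With a cookie whose bytes are all nonzero, a string read runs straight through the cookie; with the first byte in memory set to 0, the read stops before exposing any of it. The trade-off is that one byte of the cookie is no longer random.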
On top of our own changes, MediaTek provided a new version of the firmware for the WiFi subsystem of the MT3620, which is now being used on the platform to address a range of issues.
As you can see, a wide range of security improvements have been made to the platform as we continue to strive to be the best in the field. We will continue to be transparent about our efforts and are devoted to being the most secure platform for IoT.
Jewell Seay
Azure Sphere OSP Security Team Lead