New Levels of Security via Machine Learning & Combined Data Sets
First published on CloudBlogs on Oct, 01 2014
I have always understood the value of bringing different data sets together, but it’s been only recently that I’ve fully appreciated the power of it. Looking back to high school, when I was asked about a future profession, I would always say that I was going to be a CPA because I loved to work with numbers. The teenage me would be thrilled to see the things I’ve been able to do here at Microsoft over the last few years. The work I’ve been doing in our Cloud services for the past few years has really emphasized the value and power of data, and this has become an area where I am now constantly pushing my teams to do more, more, more.
I really began to understand the value of this when I was running the Windows Update team in 2007. Windows/Microsoft Update is one of the most impressive and powerful things I have ever worked on; it is a service that, every month, updates about 1 billion PCs around the globe. Because we want to learn and improve the experience we are delivering with this update, we log the telemetry we get back from every PC that is updated – we look at factors like, “Were all the updates successful?”, “If not, why?” and “What elements are operating best/worst?”
Once all this data was gathered, I would review all of it with my team and compare it to the data from the previous months. Each month we would learn and get better. It was one of the meetings I looked forward to the most – I just loved looking at data and learning from it.
When the anti-malware team was moved to my organization in 2010, my love of data was introduced to a giant buffet of information to consume. The work we are doing in our anti-malware teams is beyond a doubt one of the most interesting things I have ever been a part of (I wrote about this in detail a couple weeks ago). Each month with the updates we include the Microsoft Malicious Software Removal Tool (MSRT). MSRT is able to remove some of the most challenging and sophisticated malware out there, and we get telemetry back from each and every one of those PCs. Our anti-malware solutions (Windows Defender and Microsoft Security Essentials) are protecting hundreds of millions of PCs and they are sending back telemetry. As this information is gathered, researchers based all over the world analyze this constant stream of data (more than 1 million pieces of malware reported each day) and we then publish updates to the service three times per day.
The malware profession is cold, brutal, hand-to-hand combat with organized crime and state governments on a global scale.
Consider for a moment the sheer volume of all the data that is collected, and then let your imagination run for a bit on how that information can be used to protect the world, a company, or a single device. The data alone is valuable, and when we bring vast data sets together the value increases exponentially. This value that’s extracted from these data sets is exactly what’s delivered via the services we’ve made available in Azure – and it would be impossible to produce without our ability to constantly collect telemetry from around the world.
This has all been a long introduction to get to what I think is most valuable to you: how we bring all this to bear on the challenges organizations all over the world face when trying to protect themselves. Included below are two examples of how we use
to strengthen your security, discover attacks on your organization, and help you block these attacks.
Spotting a Compromised Account
Our recently introduced Azure Machine learning is a
. With the Enterprise Mobility Suite we have the ability to (quite literally) learn the work access patterns of everyone using a service that authenticates against Azure Active Directory (AAD). From this, Azure Machine Learning can learn patterns such as when a user starts work, when they go home, whether they usually come back online in the evening to check on things, and where (which city) they normally work. Let’s walk through some of these capabilities with the actual reports from EMS.
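To make the idea of a learned work pattern a little more concrete, here is a minimal sketch in Python. It is only an illustration and not how EMS or Azure Machine Learning actually models users: it simply counts how often each user has signed in during each hour of the day and treats an hour as unusual when it makes up a very small share of that history. The function names, the sample user, and the 5% threshold are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def build_hour_profile(sign_ins):
    """Count sign-ins per user per hour of day (0-23).

    `sign_ins` is an iterable of (user, hour) pairs; this input shape is
    an assumption for the sketch, not a real EMS data format.
    """
    profile = defaultdict(Counter)
    for user, hour in sign_ins:
        profile[user][hour] += 1
    return profile


def is_unusual_hour(profile, user, hour, min_share=0.05):
    """Return True when `hour` is rare (or unseen) in the user's history.

    `min_share` is a hypothetical threshold: an hour is unusual if it
    accounts for less than this fraction of the user's past sign-ins.
    """
    counts = profile.get(user)
    if not counts:
        return True  # no history at all, so nothing counts as normal yet
    total = sum(counts.values())
    return counts[hour] / total < min_share


# Twenty days of a hypothetical user signing in at 8am, noon, 5pm and 10pm.
history = [("brad", hour) for _ in range(20) for hour in (8, 12, 17, 22)]
profile = build_hour_profile(history)

late_check_in_is_unusual = is_unusual_hour(profile, "brad", 22)
midnight_is_unusual = is_unusual_hour(profile, "brad", 0)
```

With this history, the 10:00pm check-in is part of the normal pattern, while a sign-in at midnight is not. A real system would of course weigh many more signals than the hour alone.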
Take me as an example: I am in the office at 8:00am, work until 6:00pm, get home to have dinner with my family, then come back online for a few minutes at 10:00pm – and I do this from Redmond, Washington. Now imagine if my user account has been compromised and someone is using my valid username and password to authenticate and access corporate information – at midnight, from Russia. Without Azure Machine Learning your infrastructure is likely unable to identify the change in work habits and recognize that it’s impossible for me to log in from Redmond around 10:30pm and then start “working” from Moscow about an hour and a half later (see screenshots below). This is exactly what the Enterprise Mobility Suite is able to do.
In the screenshots below you’ll see the same scenario of someone logging in from New York City and the Netherlands within the same minute – and this is flagged because the system knows that those two areas are separated by nine hours of travel time.
Not only does Azure Machine Learning understand the work habits of users, we have also fed travel data into the system. The Machine Learning in EMS would be alerted by my unusual work pattern, my abnormal location, and the impossibility of moving between Redmond and Moscow in under two hours (unless you’re Superman). When it spots these things it flags the situation to IT so action can be taken.
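The “impossible travel” part of this check can be sketched in a few lines of Python. This is a simplified illustration, not the EMS implementation: it computes the great-circle distance between two sign-in locations with the haversine formula and flags the pair when the implied speed is faster than an assumed limit. The coordinates are approximate, the 1,000 km/h limit and the 50 km jitter allowance are assumptions for this example, and both timestamps are assumed to be expressed in the same time zone.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0
MAX_PLAUSIBLE_SPEED_KMH = 1000.0  # assumed limit, roughly airliner speed
MIN_DISTANCE_KM = 50.0            # assumed allowance for geolocation jitter


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_impossible_travel(first, second,
                         max_speed_kmh=MAX_PLAUSIBLE_SPEED_KMH,
                         min_distance_km=MIN_DISTANCE_KM):
    """Flag two sign-ins whose implied travel speed is not plausible.

    `first` and `second` are (timestamp, latitude, longitude) tuples in
    chronological order; this shape is an assumption for the sketch.
    """
    t1, lat1, lon1 = first
    t2, lat2, lon2 = second
    distance = haversine_km(lat1, lon1, lat2, lon2)
    if distance <= min_distance_km:
        return False  # effectively the same place
    hours = (t2 - t1).total_seconds() / 3600
    if hours <= 0:
        return True   # far apart at the same instant
    return distance / hours > max_speed_kmh


# Approximate coordinates for the two cities in the example above.
redmond = (datetime(2014, 9, 9, 22, 30), 47.67, -122.12)
moscow = (datetime(2014, 9, 10, 0, 0), 55.76, 37.62)

redmond_to_moscow_km = haversine_km(47.67, -122.12, 55.76, 37.62)
flagged = is_impossible_travel(redmond, moscow)
```

Redmond and Moscow are thousands of kilometres apart, so covering that distance in an hour and a half implies a speed far above the assumed limit and the pair is flagged. The same function also flags two sign-ins that are far apart and carry the same timestamp.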
Once this hits the desk of someone in IT, there are a number of actions they can take. They could force the user to change their password (the bad guys would then no longer have a valid password), or they could issue a two-factor authentication challenge (requiring the user to input a PIN that had been texted to their cell phone). I’m sure you’ve seen this kind of thing when you access Facebook from a new PC, or when you get a phone call from American Express while you’re at an event in Las Vegas. In both cases it’s machine learning flagging an unusual access attempt or behavior. I, for one, am grateful when I see these things in place, and can take some confidence that my identity and finances are being protected.
Here’s what it looks like in action:
This is the starting point for looking at these reports – the highlighted entry is “User with Anomalous Sign In Activity,” which then opens into a view of all of the users that our machine learning has identified as having some kind of strange authentication.
Let’s look at Clifford – our Azure machine learning has found all kinds of issues with Clifford. :)
When we open the specific “Signed in from geo-distant locations” report, we see the following:
In this case Clifford signed in from New York on 9/9/2014 at 2:34:08pm and then from the Netherlands on 9/9/2014 at 2:34:08pm. How can Clifford be in these two different geographies in such a short period of time? This is likely a compromised account and the organization is being attacked through Clifford’s account.
This kind of protection and peace of mind is what we deliver to your organizations with AAD. It really bears emphasizing that absolutely no one else is doing this. Microsoft is unique in the telemetry that we acquire, and we are unique in our ability to deliver this kind of consistent value.
We Know Where the Criminals Operate
One of the things that I have always found interesting about the anti-malware work that we do is that we are able to identify the addresses where much of the world’s malware is created. We also make significant investments around fraud detection and the ability to detect attempts to use things like stolen credit card numbers when purchasing our services. We are constantly updating our known “suspicious” addresses as we learn more about them (and learn about new ones) every day.
Now imagine taking this data and applying it to the authentication attempts of users and devices in your organization. You would be able to see devices that are communicating with known suspicious addresses and the users that are using those devices, and have a view into which users may have been compromised. This is an order of magnitude beyond simple threat detection – this is intelligent threat avoidance and protection.
In the example above you can see a user that is working on a PC that is communicating with an address that we know is suspicious. How do we know this? Simple: we are taking all the things we are learning from across our services and bringing them together to enable this kind of report. We know IP addresses that are trying to use stolen credit cards to purchase Microsoft services. We know of IP addresses that are known for creating malware. As we bring all this information together we can flag issues like this where known suspicious addresses are communicating with devices your users are working on. Again – this user has likely been compromised or the device itself may be participating in something like a botnet.
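At its core, this kind of report is a join between two data sets: the connections your devices make, and a list of addresses already known to be suspicious. The Python sketch below illustrates that join using only the standard `ipaddress` module. It is an assumption-laden example rather than the actual service logic: the record layout, the function name, and the sample addresses (taken from the ranges reserved for documentation) are all hypothetical.

```python
import ipaddress


def flag_suspicious_connections(connections, suspicious_networks):
    """Return the connections whose remote address is on the suspicious list.

    `connections` is an iterable of (user, device, remote_ip) tuples and
    `suspicious_networks` is an iterable of IP addresses or CIDR ranges.
    Both shapes are assumptions for this sketch.
    """
    networks = [ipaddress.ip_network(net) for net in suspicious_networks]
    flagged = []
    for user, device, remote_ip in connections:
        address = ipaddress.ip_address(remote_ip)
        # An address matches if it falls inside any listed range.
        if any(address in network for network in networks):
            flagged.append((user, device, remote_ip))
    return flagged


# Hypothetical data, using address ranges reserved for documentation.
suspicious = ["203.0.113.0/24", "198.51.100.7"]
observed = [
    ("clifford", "PC-01", "203.0.113.45"),
    ("alice", "PC-02", "192.0.2.10"),
]

report = flag_suspicious_connections(observed, suspicious)
```

Here only the first connection lands in the report, because its remote address falls inside one of the listed ranges. The value described above comes from the quality and breadth of the suspicious list itself, which this sketch simply takes as given.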
Again, this is an example of the value of big data when combined with additional data and then applied to a problem. This is value that only Microsoft can deliver. If you are using an MDM solution from AirWatch, MobileIron, or Good you cannot and will not get this value or any promise of it. They simply do not have the data from other services to apply to this problem.
The same structure is in place for identifying and isolating infected devices.
These are just two examples of what we are doing today with the Machine Learning and data available in Azure – and ready to help you solve some of your most persistent problems. This kind of value can only be provided by a vendor that is operating multiple SaaS solutions at global scale, collecting the appropriate data/telemetry, and making huge capital investments in applying that data in new and interesting ways.
I write a
about our POV that
should be delivered as a cloud service, and Machine Learning is one of the reasons why. Another reason is that, with a cloud-based system, we are able to constantly update our services and add new reports like this as we think of them. Having built on-premises products for most of my career, I could previously only dream about this kind of agility. Marrying the two together – where we are able to constantly update the service (EMS and Intune) and then have the on-premises console (SCCM) be updated with it – brings together the best of both worlds.
To learn more about Azure Machine Learning (and I really recommend it), check out these resources: