FastTrack for Azure

Six Security Considerations for Machine Learning Solutions

KateB

Microsoft

Jan 18, 2023

From Azure Architecture

Why spend time on security of Machine Learning solutions?

Machine learning (ML) technology exposes a new type of attack surface and continues to be an active area of research. As machine learning becomes more embedded in day-to-day activities (think health care, financial transactions, mobile phones, cars, and home security systems), we can expect it to become an increasingly attractive target for attackers.

Like any software component, machine learning can be vulnerable to cyberattacks. While these types of attacks are not common today, that is likely to change as the technology becomes more pervasive and embedded in critical business processes. The purpose of this post is to raise awareness of the potential risks to ML solutions and to encourage teams to proactively consider security risks and mitigations.

How is security for machine learning different?

Machine learning solutions can be more difficult and complex to implement than traditional software solutions. Some of the reasons include:

Complexity of workflows: Developing, deploying, and operationalizing an ML service requires a large and diverse set of teams, tools, skills, and frameworks. Teams and processes that traditionally have not been integrated now have to work together. End-to-end implementation can be onerous and error prone.
Data: ML solutions are data intensive and consume massive amounts of data. The ingestion, storage, curation, and featurization of data are primary machine learning functions, and each carries inherent security risk. Adding to the risk, many machine learning solutions include sensitive data. Protecting data throughout the pipeline and production process is a primary security challenge.
Processes: Machine learning processes and solutions are new to software engineering methodologies and may not be integrated into a team’s DevOps or DevSecOps practice. A new discipline around Machine Learning Operations (MLOps) is attempting to address some of the gaps.
ML packages/libraries: Open-source packages and libraries are foundational to machine learning development. This environment changes rapidly, which makes it difficult for users to achieve a consistent understanding of the capabilities, limitations, and risks of a package. Adding to this challenge, complex dependency chains make it difficult for users to understand all the prerequisites and can lead them to inadvertently install compromised packages.
Security tools: Tools for security reviews or threat models or threat detection may not yet incorporate the capabilities needed to protect machine learning assets and resources.
Machine learning models: Machine learning models are a unique type of software artifact. The technologies used to produce and deploy them are often not familiar to IT deployment and operations teams.
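One hedge against compromised packages or tampered model artifacts is to pin a known-good digest and check it before anything is installed or loaded. The sketch below is a minimal illustration using only the Python standard library; the function names are hypothetical, and it assumes the expected SHA-256 digest comes from a trusted source such as a signed release manifest. For Python dependencies, pip's hash-checking mode (`--require-hashes`) applies the same idea at install time.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path: Path, expected_sha256: str) -> None:
    """Raise ValueError if the file does not match the pinned digest."""
    actual = sha256_of_file(path)
    # compare_digest does a constant-time comparison of the two strings.
    if not hmac.compare_digest(actual, expected_sha256.lower()):
        raise ValueError(f"Digest mismatch for {path.name}: refusing to load")
```

A loader would call `verify_artifact(model_path, pinned_digest)` before deserializing the model, so a swapped or corrupted file fails loudly instead of being loaded.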

The processes, workflows, tools, and data used to produce a machine learning solution are complex and diverse. When it comes to the security of machine learning solutions, consider the complexity of the entire system.

What are some specific machine learning threats?

Machine learning solutions can be vulnerable to traditional software application risks as well as risks unique to the machine learning domain. Types of security threats include:

Data Poisoning: Data poisoning happens when the data an ML model uses to produce an outcome is tampered with and the model produces an incorrect result.

An early example of data poisoning is the attack on Tay, the AI chatbot that Microsoft launched in 2016 and took offline within 24 hours of launch. The model used Twitter interactions to learn and respond to users. Adversaries exploited the model by combining a ‘repeat after me’ function with racist and offensive language, forcing Tay to repeat whatever was said to it. More info here: Learning from Tay’s introduction.

Model Theft: Because models represent a significant investment in intellectual property, they can be a valuable target for theft. Like other software assets, they are concrete artifacts that can be stolen. Model theft happens when a model is taken outright from a storage location or re-created through deliberate query manipulation.

An example of this type of attack was demonstrated by a research team at UC Berkeley, who used public endpoints to re-create language models with near-production, state-of-the-art translation quality. The researchers were then able to degrade the performance and erode the integrity of the original machine learning model using data input techniques (see Data Poisoning above). Another documented example of model theft is the replication of the GPT-2 model by Brown University researchers, using information released by OpenAI and open-source ML artifacts.

Inversion: An inversion attack happens when data is retrieved from a model, or from the related workflows, and combined in ways that result in a data privacy leak. This type of attack can occur when a model outputs sensitive information, when a vulnerability in a solution pipeline allows access to data storage, or when verbose error messages expose sensitive data.

Examples of inversion attacks include an attack in which confidential data used to train a model is recovered. See: Is Tricking a Robot Hacking? Additional research examples of inversion attacks are documented here: Model Inversion Attacks.

Extraction/Evasion: Also sometimes referred to as Adversarial Perturbation, this type of attack happens when a model is tricked into misclassifying input and returns an incorrect or unintended result.

Examples of this type of attack include:

A facial recognition solution is compromised when an attacker gains access to the storage account containing training data and modifies the images to disguise individuals from facial recognition software. See https://atlas.mitre.org/studies/AML.CS0006 and https://atlas.mitre.org/studies/AML.CS0011/

A research project automates the manipulation of a target image so that an ML model misclassifies it. See Microsoft Edge AI Evasion.

A physical domain attack in which self-driving cars were tricked into reading a stop sign as a speed limit sign.

An authentication system is compromised when an attacker maximizes the confidence output, allowing an invalid authorization attempt to pass the authorization check.
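To make the idea of adversarial perturbation concrete, here is a toy sketch in Python. It is not taken from any of the attacks above: the linear classifier, its weights, and the input are all hypothetical values chosen for illustration. For a linear model the gradient of the score with respect to the input is simply the weight vector, so nudging every feature by a small, bounded amount against the sign of its weight (a fast-gradient-sign style step) is enough to flip the prediction.

```python
def score(weights, bias, features):
    """Linear decision score: positive means class 1, otherwise class 0."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def classify(weights, bias, features):
    return 1 if score(weights, bias, features) > 0 else 0


def sign(value):
    return (value > 0) - (value < 0)


def perturb(weights, features, epsilon):
    """Shift each feature by at most epsilon in the direction that lowers the score."""
    return [x - epsilon * sign(w) for w, x in zip(weights, features)]


# Hypothetical toy model and input, chosen only for illustration.
WEIGHTS = [1.0, -2.0, 0.5]
BIAS = 0.1
original = [0.5, 0.1, 0.2]
adversarial = perturb(WEIGHTS, original, epsilon=0.2)

print(classify(WEIGHTS, BIAS, original))     # class for the clean input
print(classify(WEIGHTS, BIAS, adversarial))  # class after the small perturbation
```

No single feature moves by more than 0.2, yet the predicted class changes. Real models are nonlinear and real attacks are more involved, but the principle of small input changes producing large output changes is the same.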

Traditional Software Attacks: Attackers will also use traditional software exploits to compromise machine learning components.

Examples include:

Open-source tools are used to extract deep learning model binaries from a mobile phone application. A backdoor is inserted into the model to subvert its behavior, and the compromised model is then repackaged to appear to be the original application. An example is the neural payload injection described here: Backdoor Attack on Deep learning Models in Mobile Apps.

Credential theft or storage service misconfiguration is used to exfiltrate or compromise sensitive training data.

Verbose logging or error messages are used to reverse engineer the attributes of a data set.

Data privacy violations occur when sensitive data is used unprotected in test environments where data encryption or access control mechanisms are not enabled.

A vulnerable Application Programming Interface (API), library, or third-party model is included in the build process and the software supply chain becomes compromised.
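The verbose error message risk in this list has a simple mitigation pattern: keep full diagnostic detail in server-side logs and return only a generic message plus an opaque reference to the caller. The following Python sketch shows one way to do that; `safe_predict` and the response shape are hypothetical names used for illustration, not part of any particular serving framework.

```python
import logging
import uuid

logger = logging.getLogger("inference")


def safe_predict(predict_fn, payload):
    """Call predict_fn and return a response that never echoes internal details."""
    try:
        return {"ok": True, "result": predict_fn(payload)}
    except Exception:
        # Full detail (stack trace, column names) goes to the server-side log,
        # keyed by an opaque ID that support staff can look up.
        error_id = uuid.uuid4().hex
        logger.exception("Inference failed (error_id=%s)", error_id)
        return {
            "ok": False,
            "error": "Request could not be processed.",
            "error_id": error_id,
        }
```

The caller learns that the request failed and gets an ID to quote, but nothing about schema, feature names, or data values that could help reverse engineer the data set.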

A recommended source for learning about adversarial threats against machine learning solutions is the MITRE ATLAS framework. The tactics and techniques used in AI/ML attacks can be explored by clicking through the matrix published with the framework.

What guidance is there for mitigating security threats?

#1: Build awareness of where and how the business incorporates machine learning components

Identify and inventory machine learning assets (models, workspaces, pipelines, endpoints, and data sets). Learn where the business leverages machine learning components. Understand where external and third-party dependencies exist in the solution: APIs, data sets, third-party ML models, and open-source packages.

#2: Early in the project lifecycle, learn about security risks and vulnerabilities to machine learning solutions

Work with your organization’s security team early on to be informed about the security and compliance requirements and responsible AI practices that apply to the solution. Use news stories, blogs, training, and security bug reports to learn about the types of attacks that are impacting your industry and machine learning solutions. Some suggestions:

Review the Azure AI Risk Assessment whitepaper
On attack and failure scenarios for machine learning: Failure Modes in Machine Learning
For a reference on machine learning threats: AI/ML Pivots to the Security Development Lifecycle Bug Bar
Read through Empowering impactful responsible AI practices and consider how the solution aligns with practices for fairness, privacy, and ethics.

#3: Include machine learning solutions in a threat modeling practice

Today, reports of ML cyberattacks are not common, so the perception may be that the risk is low. Keep in mind that even if the likelihood appears low, the impact of a cyberattack could be significant. Do not underestimate the importance of threat modeling a machine learning solution. For guidance:

Threat modeling is a core element of any secure development lifecycle. There are several different threat model approaches, and a Microsoft approach is described here: Security Development Lifecycle and Integrating threat modeling with DevOps.
Threat model guidance specific to machine learning solutions can be found here: Threat Modeling AI/ML systems and dependencies.

#4: Protect data throughout the pipeline

Machine learning products and services tend to store and consume a staggering amount and variety of data. Because data is at the heart of a machine learning solution, the attack surface expands as data is moved, transformed, curated, and stored throughout the pipeline process. The security techniques and controls that are important with traditional software solutions also apply to machine learning pipelines: adopt recommended practices for access management, data encryption, data classification, and monitoring. For guidance:

Azure Data security and encryption best practices

Certain types of machine learning scenarios may require personal and sensitive information like health data, financial data, user-specific IP addresses, employee data, or physical/email addresses. Consider data security, privacy, and compliance requirements and keep in mind that sensitive data may need to be obfuscated or scrubbed or anonymized for training, testing, and inferencing.
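As a rough illustration of obfuscating sensitive fields before data reaches a training or test environment, the Python sketch below replaces an email address with a keyed hash and zeroes the host portion of an IP address. The field names (`email`, `ip`) and helper names are assumptions made for this example, and the secret key is assumed to live in a secret store rather than in code. Note that keyed hashing is pseudonymization, not full anonymization, so whether it satisfies a given compliance requirement needs to be confirmed with your security and privacy teams.

```python
import hashlib
import hmac
import ipaddress


def pseudonymize(value: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash so records stay joinable but not readable."""
    return hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256).hexdigest()


def truncate_ip(address: str) -> str:
    """Zero the low-order bits of an IP address (keep /24 for IPv4, /48 for IPv6)."""
    ip = ipaddress.ip_address(address)
    prefix = 24 if ip.version == 4 else 48
    network = ipaddress.ip_network(f"{address}/{prefix}", strict=False)
    return str(network.network_address)


def scrub_record(record: dict, secret_key: bytes) -> dict:
    """Return a copy of the record with the sensitive fields obfuscated."""
    cleaned = dict(record)
    cleaned["email"] = pseudonymize(record["email"].strip().lower(), secret_key)
    cleaned["ip"] = truncate_ip(record["ip"])
    return cleaned
```

Because the hash is keyed and deterministic, the same email always maps to the same token within one key, which preserves joins and grouping for feature engineering while keeping the raw value out of the data set.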

#5: Adopt recommended security practices across machine learning workflows

Follow recommended guidance for well architected, designed, developed, deployed, and operational solutions. Adopt as many recommended security practices as early in the project as possible. Consider:

A landing zone architecture that centralizes security capabilities and policies
Network isolation for components that do not need to be exposed to the internet
Role-based access for notebooks, jobs, shared workspaces, storage accounts, pipelines, etc.
Compute targets that are appropriately isolated and locked down
Secure storage for passwords, connection strings, tokens, and keys
Auditing and monitoring of the solution components
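For the secure storage item in the list above, the most common mistake in notebooks and training scripts is pasting a key or connection string directly into code. A minimal sketch of the alternative, assuming the platform injects secrets into the process environment at runtime from a managed secret store, might look like this in Python. The helper and exception names are hypothetical.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret is not present in the environment."""


def get_secret(name: str) -> str:
    """Read a secret injected at runtime instead of hardcoding it in a notebook or script."""
    value = os.environ.get(name)
    if not value:
        # Name the missing variable, but never log or echo a secret value.
        raise MissingSecretError(f"Required secret {name!r} is not set")
    return value
```

Failing fast on a missing secret avoids silent fallbacks to default or empty credentials, and keeping values out of source means they do not end up in version control or shared notebook output.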

For detailed guidance:

The Security pillar of the Azure Well-Architected Framework covers security principles for design, build, and deployment practices.
Use Azure Secure Score to assess and track how services comply with recommended security baselines: Assess and track your secure score.

#6: Invest in and build security monitoring, detection, and response processes

As the adage goes, an ounce of prevention is worth a pound of cure. Monitoring, detection, and response is about being proactive: anticipating a security compromise and being operationally ready for it. As the frequency and types of attacks on machine learning solutions increase, so does the need to proactively detect and respond to these specific threats.

A useful starting point is to identify the potential threats to a machine learning solution (output from the threat model) and characterize the behaviors and activities that could indicate a compromise. Some behaviors to consider:

Exfiltration of raw, curated, or training data sets
Unauthorized access and modification of training data that could indicate data poisoning
Detected drops in model performance and confidence over time, given consistent input
Vulnerabilities in software components
Unusual, excessive, or suspicious requests and inferencing patterns
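As one illustration of spotting excessive inferencing patterns, which can be a sign of a model extraction attempt, the Python sketch below counts requests per client over a sliding time window. The class name and thresholds are hypothetical, and a production system would more likely derive this from gateway or platform logs than from in-process state.

```python
from collections import defaultdict, deque


class RequestRateMonitor:
    """Flag clients that exceed max_requests within a sliding time window."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._timestamps = defaultdict(deque)

    def record(self, client_id: str, timestamp: float) -> bool:
        """Record one request; return True if the client is now over the limit."""
        window = self._timestamps[client_id]
        window.append(timestamp)
        # Drop requests that have aged out of the window.
        while window and window[0] <= timestamp - self.window_seconds:
            window.popleft()
        return len(window) > self.max_requests
```

For example, `RequestRateMonitor(max_requests=1000, window_seconds=60)` would flag any client that sends more than 1,000 scoring requests in a minute; the right threshold depends on what normal traffic looks like for the endpoint.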

Use tools such as Counterfit, an automation tool for security testing AI systems, to routinely monitor and scan for AI-specific vulnerabilities.

Audit logs need to be enabled to capture the relevant metrics and events so alerts can be triggered on suspicious behavior and activity.
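To show how a logged metric can drive an alert, here is a small Python sketch that watches prediction confidence and emits a warning when the rolling average drops too far below an expected baseline. The class name, baseline, and thresholds are assumptions for illustration; in practice the warning would be routed to whatever monitoring and alerting service the team already uses.

```python
import logging
from collections import deque
from statistics import fmean

logger = logging.getLogger("ml.audit")


class ConfidenceDriftAlert:
    """Warn when the rolling mean confidence falls below a baseline by a set margin."""

    def __init__(self, baseline: float, max_drop: float, window: int):
        self.baseline = baseline
        self.max_drop = max_drop
        self._recent = deque(maxlen=window)

    def observe(self, confidence: float) -> bool:
        """Record one prediction confidence; return True if an alert was raised."""
        self._recent.append(confidence)
        if len(self._recent) < self._recent.maxlen:
            return False  # Not enough data for a stable average yet.
        rolling_mean = fmean(self._recent)
        if self.baseline - rolling_mean > self.max_drop:
            logger.warning(
                "Confidence drift: rolling_mean=%.3f baseline=%.3f",
                rolling_mean,
                self.baseline,
            )
            return True
        return False
```

A sustained drop in confidence on consistent input does not prove an attack, but it is a cheap signal that something about the data or the model has changed and deserves a look.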

And finally, given the chaos that can occur with a security breach, the time to put a response plan in place is before it is needed. Include threats to machine learning solutions in security response and escalation procedures.

Wrap-up:

When it comes to security, machine learning solutions deserve special consideration. This post begins a series of articles to help you learn about and understand the security risks, considerations, and available resources. The next article will provide guidance on recommended secure coding practices for machine learning solutions.

Please leave a comment and let me know if this is helpful or there are related topics you are interested in. Thank you for reading!

FastTrack for Azure: Move to Azure efficiently with customized guidance from Azure engineering. FastTrack for Azure – Benefits and FAQ | Microsoft Azure

Updated Jan 18, 2023

Version 2.0
