Handling Cloud Posture Tasks Overload
Cybersecurity risks pose a significant threat to organizations of all sizes. As a result, security teams must be diligent in protecting their networks and data from potential breaches. However, with the increasing complexity of the digital environment and the expanding attack surface, security teams face a growing number of tasks, both to improve the organization’s posture and to investigate potential incidents. This can lead to critical security risks being overlooked or delayed, leaving organizations vulnerable to cyber-attacks. It becomes increasingly important to estimate the risk created by security issues in the environment’s configuration and to prioritize their mitigation correctly.
Prioritized cyber risks allow security teams to focus their efforts and resources on the most critical threats, ensuring that they are addressed promptly and effectively, which ultimately helps to reduce the organization's overall risk profile.
Basic prioritization systems assign a static severity rating to each issue, a rating that is determined at the issue’s definition stage. This rating is based on an assumed or estimated potential impact of the threat. Resources affected by the same vulnerability are rated with the same severity, whether the vulnerable resource is exposed and of high business criticality, or is an isolated, insignificant resource on which the vulnerability cannot be exploited. That is to say, the specific details of an issue’s instance are not considered, which is a significant shortcoming.
A well-established concept in risk assessment involves the dependence of risk on two critical components: the likelihood of a successful attack and the impact that such an attack can have. The relationship is frequently represented by the formula Risk = Likelihood × Impact.
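To make the formula concrete, the short sketch below compares two instances of the same vulnerability. The 0 to 1 likelihood scale, the 1 to 10 impact scale, and all the numbers are illustrative assumptions, not values taken from any product.

```python
def risk(likelihood: float, impact: float) -> float:
    """Risk = Likelihood x Impact. Both scales are assumed for illustration."""
    if not 0.0 <= likelihood <= 1.0:
        raise ValueError("likelihood must be between 0 and 1")
    return likelihood * impact


# The same vulnerability on two different resources (hypothetical values).
exposed_critical_server = risk(likelihood=0.8, impact=10)
isolated_test_vm = risk(likelihood=0.1, impact=2)

# A static severity rating would treat both instances alike;
# the formula separates them.
print(exposed_critical_server > isolated_test_vm)  # True
```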
More sophisticated and informed prioritization systems look beyond the issue type and also consider the contextual information of each manifestation. This includes the criticality of the impacted resource, the likelihood of exploitation, and other factors needed to estimate the risk posed by an issue correctly. In this blog post, we introduce a new framework for methodical estimation and scoring of risk.
The framework is based on contextual information, and we demonstrate how this risk estimation is used to prioritize the mitigation of security issues found in the environment. In the risk calculation process, we consider both likelihood and impact, recognizing their significance in evaluating and managing risks effectively. By thoroughly assessing these factors, we aim to make informed decisions and develop strategies to mitigate potential threats and vulnerabilities.
Microsoft Defender for Cloud has recently introduced a new feature for Defender CSPM helping customers to rank the security issues in their environment configuration and fix them accordingly. This feature is based on the presented framework and enhances the risk prioritization capabilities of Defender CSPM.
From Concept to Reality: Prioritization Concepts
Overview
The framework enhances the basic severity score of an issue by factoring in the likelihood of a successful attack and its potential impact, giving security teams a more accurate and effective task prioritization. Its high customizability and extensibility make it suitable for any product that needs to create prioritized task lists, whether security-related or not. It allows the incorporation of scenarios and contextual information specific to each product's area of expertise. End-users can define the value of assets and behaviors, which gives them the freedom to adjust the prioritization according to their preferences. This approach enables organizations to enhance their overall cybersecurity posture and effectively secure their resources.
Prioritization is accomplished in several steps:
- Security issues are enriched with contextual information.
- The security issue is scored using a new method, described below.
- Issues of the same risk type and security score form a group.
- Groups are prioritized in descending score order.
- Within each group items are sorted more precisely. This process is discussed later in this post.
In the following sections, we will explain in detail the core concepts of the new framework: security issues, contextual information, contextual security issue (CSI) and the contextual security matrix (CSM).
Security issue (SI)
The term “security issue” refers to the type of fault that requires attention and may indicate an increased likelihood of a successful attack, an increased impact, or both. In Microsoft Defender for Cloud, issues are surfaced as recommendations, which are actions that should be taken to resolve the fault and improve the overall posture and security of the environment. An issue may be realized in multiple ways and aspects. As a result, several recommendations can indicate the same issue. For example, both “Machines should have vulnerability findings resolved” and “SQL databases should have vulnerability findings resolved” point to the same security issue: a resource is vulnerable (and therefore should be patched). Issues are not limited to the cloud: an on-premises computer with a vulnerability is afflicted by the same issue. Some of the key issues in cloud environments are “unnecessary internet reachability”, “excessive permissions”, and “vulnerability”.
Contextual Information
Contextual information is any information that affects the estimation of the hazard caused by a security issue. It could concern the methods and feasibility of exploiting the issue, the level and type of risk to the organization if a resource is compromised, or indications of immediate intent, such as previous or current attempts to exploit the issue.
Like a security issue, contextual information could be related to increased likelihood or impact. For example, the use of a common username for authentication creates a risk, as it could be a target for a brute-force password guessing attack. The impact of such an issue is much greater when the user has access to the organization’s intellectual property than when the user can only access an isolated virtual machine. A resource’s exposure to the internet increases the risk that a vulnerability on that resource will be exploited, and is therefore crucial information for evaluating the urgency of patching the vulnerability. Similarly, it might be more urgent to handle a configuration issue on a running VM than one on a shut-down VM, or to prioritize a resource that is known to have been targeted by past attacks.
Contextual Security Issue
A Contextual Security Issue (CSI) is a security issue enriched with contextual information.
Consider, for example, a vulnerable VM. The vulnerable VM poses a “Security Issue” that requires remediation. If the VM is exposed to the internet, the risk of exploitation by an external attacker increases, and therefore the internet exposure is considered contextual information. Similarly, if the VM stores credentials to a critical asset, the stored credentials are also considered contextual information, as they can be used to move laterally and compromise the critical asset. When the issue (vulnerability) is enriched with contextual information, it forms the CSI: "Internet exposed vulnerable VM containing sensitive data".
In certain scenarios, the internet exposure itself may be a security issue, and the stored credentials certainly are one. Indeed, an issue can serve as context for the evaluation of another issue’s risk. To summarize, a CSI expresses how the affected resource can be reached, the risk in compromising the resource itself, how the issue can be exploited and used to compromise further resources, and the damage that may be caused.
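One way to picture the relationship between a security issue and a CSI is as a small data structure: the issue, plus a set of attached contexts. The sketch below is only an illustration; the class names, field names, and context labels are hypothetical and do not come from Microsoft Defender for Cloud.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class SecurityIssue:
    issue_type: str  # e.g. "vulnerability"
    resource: str    # the affected resource (hypothetical identifier)


@dataclass(frozen=True)
class ContextualSecurityIssue:
    issue: SecurityIssue
    contexts: FrozenSet[str] = frozenset()

    def enrich(self, context: str) -> "ContextualSecurityIssue":
        """Return a new CSI with one more piece of contextual information."""
        return ContextualSecurityIssue(self.issue, self.contexts | {context})


si = SecurityIssue(issue_type="vulnerability", resource="vm-01")
csi = (
    ContextualSecurityIssue(si)
    .enrich("internet exposure")
    .enrich("contains sensitive data")
)
print(sorted(csi.contexts))  # ['contains sensitive data', 'internet exposure']
```

Keeping the contexts as a set reflects the idea that a CSI is defined by which contexts apply, not by the order in which they were discovered.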
The Contextual Security Matrix and CSI Score
So far, we have explained the significance of the context in which the issue exists. Let us examine how contextual information influences our risk estimation. Consider a resource vulnerable to remote code execution. We want to estimate the risk to the resource and environment and prioritize it against other issues in the queue. The contextual information that the resource is currently running indicates that the vulnerability is at increased risk of exploitation, which leads to an elevated risk evaluation. Now consider a resource that is not encrypted at rest. The risk created by the lack of encryption does not vary much whether the resource is running or not. We see that the same context may be insignificant to some issues and highly significant to others. The CSM is a framework for quantifying the CSI security risk based on that understanding. The matrix reflects the degree to which the risk of a security issue increases or decreases when it occurs within a specific context.
The CSM consists of columns representing the different security issues, each assigned a "base score." The rows correspond to potential contexts. The matrix's cells indicate the differential score for a specific issue (column) considering the given context(s) (rows). To calculate the CSI score, the base score of the issue is summed with the applicable contextual differential scores.
The following example demonstrates the calculation of a CSI score for a vulnerability on an exposed VM that has permission to a critical asset. To calculate a CSI’s security score, we refer to the issue’s column in the matrix and sum the base score and the values of the contexts included in the CSI. The score for the issue (i.e., the vulnerability) given the applicable contexts (exposure and access) is calculated to be 5 + (3 + 5) = 13.
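The calculation above can be sketched as a lookup and a sum. The dictionary layout, the context names, and the assignment of 3 to exposure and 5 to access are assumptions made for illustration; only the base score of 5 and the total of 13 come from the example.

```python
# Hypothetical slice of a contextual security matrix (CSM):
# one column ("vulnerability") with its base score and two context rows.
BASE_SCORES = {"vulnerability": 5}
CONTEXT_DIFFERENTIALS = {
    "vulnerability": {
        "internet_exposure": 3,         # assumed mapping of the value 3
        "access_to_critical_asset": 5,  # assumed mapping of the value 5
    },
}


def csi_score(issue, contexts):
    """Base score of the issue plus the differential of each applicable context."""
    column = CONTEXT_DIFFERENTIALS[issue]
    # Contexts with no cell in this column contribute nothing.
    return BASE_SCORES[issue] + sum(column.get(c, 0) for c in contexts)


score = csi_score("vulnerability", ["internet_exposure", "access_to_critical_asset"])
print(score)  # 13
```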
The framework is extensible and adjustable. Organizations can add columns to the matrix to account for issues that are not currently considered, or replace an issue with multiple, more specific issues. Contexts can also be added to account for additional information when it is available. Microsoft Defender for Cloud (MDC) prioritization is criticality-biased; that is, it prioritizes protection of a critical asset over multiple non-critical assets. By adjusting the matrix weights, organizations can create a preference or bias that better fits their needs. Contexts may also be added to the matrix to highlight or suppress certain resources. This can be achieved by adding a flag context that increases the score for resources of interest, or one that reduces it for uninteresting resources, such as a honeypot or a dev device that is deliberately left weakened.
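As a rough illustration of this extensibility, the sketch below models the matrix as a small class to which columns and context rows can be added, including a flag context with a negative differential. The class, its method names, and every score value are hypothetical.

```python
from collections import defaultdict


class ContextualSecurityMatrix:
    """Toy CSM: one base score per issue plus per-context differentials."""

    def __init__(self):
        self.base_scores = {}
        self.differentials = defaultdict(dict)  # issue -> {context: delta}

    def add_issue(self, issue, base_score):
        """Add a new column (issue) to the matrix."""
        self.base_scores[issue] = base_score

    def set_differential(self, issue, context, delta):
        """Set the cell for a context row in an issue column; delta may be negative."""
        if issue not in self.base_scores:
            raise KeyError(f"unknown issue: {issue}")
        self.differentials[issue][context] = delta

    def score(self, issue, contexts):
        column = self.differentials[issue]
        return self.base_scores[issue] + sum(column.get(c, 0) for c in contexts)


csm = ContextualSecurityMatrix()
csm.add_issue("vulnerability", 5)
csm.set_differential("vulnerability", "internet_exposure", 3)
# Hypothetical flag context that suppresses deliberately weakened resources.
csm.set_differential("vulnerability", "honeypot_flag", -6)

exposed = csm.score("vulnerability", ["internet_exposure"])
honeypot = csm.score("vulnerability", ["internet_exposure", "honeypot_flag"])
print(exposed, honeypot)  # 8 2
```

Because a flag is just another context row, suppressing or highlighting a resource needs no special case in the scoring logic.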
Primary Group
The previous section explained how a CSI score is calculated. A security issue that is a part of multiple CSIs will be scored for each CSI, i.e., a list of CSI scores is calculated for it. The security issue is finally scored with the maximal CSI score on its list. This score is the issue’s Primary Score. This scoring method is where the emphasis of business criticality over the number of resources is introduced. Issues of the same primary score and issue type are grouped into a Primary Group. The primary groups are prioritized from the highest score to lowest.
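The primary score and grouping described here can be sketched in a few lines. The findings, their identifiers, and the alphabetical tie-break between groups with equal scores are assumptions for illustration only.

```python
from collections import defaultdict

# Hypothetical findings: (issue id, issue type, scores of the CSIs it belongs to).
findings = [
    ("i1", "vulnerability", [13, 8]),
    ("i2", "vulnerability", [13]),
    ("i3", "excessive permissions", [10, 6]),
    ("i4", "vulnerability", [8]),
]


def primary_groups(findings):
    """Group issues by (primary score, issue type), highest score first."""
    groups = defaultdict(list)
    for issue_id, issue_type, csi_scores in findings:
        primary_score = max(csi_scores)  # the maximal CSI score on the list
        groups[(primary_score, issue_type)].append(issue_id)
    # Descending by score; equal scores are ordered by issue type (an assumption).
    return sorted(groups.items(), key=lambda kv: (-kv[0][0], kv[0][1]))


for (score, issue_type), issue_ids in primary_groups(findings):
    print(score, issue_type, issue_ids)
# 13 vulnerability ['i1', 'i2']
# 10 excessive permissions ['i3']
# 8 vulnerability ['i4']
```

Taking the maximum rather than a sum or average is what keeps a single high-scoring CSI from being diluted by many low-scoring ones.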
Secondary Prioritization
The primary prioritization described above is criticality-biased in the sense that it prioritizes security issues that expose sensitive and important resources over those risking multiple less significant resources. We prioritize one crown jewel over a stack of rocks. The complexity of exploitation is also considered: when a misconfigured resource is identified, the primary prioritization looks for other resources that could be “easily compromised”, and may allow up to one “complicated step”.
The secondary prioritization orders issues within a primary group: it prioritizes issues exposing more resources over those that expose fewer, and simple exploitation routes over complex ones. This prioritization cannot result in an issue “escaping” its primary group.
Several parameters are used to determine the secondary prioritization.
- The list of compromised resources: an issue that creates a risk to a greater number of resources will have a higher risk estimation score. Since we value criticality, we promote first based on the number of high-importance resources; if these match, we prefer the issue that affects the greater number of less-important resources.
- If several paths exist that lead to similarly critical resources, the simpler CSIs are promoted over more complicated ones. This is achieved by assigning a decay factor, a value between 0 and 1, to each step that propagates the foreseen attack from one resource or identity to another. The easier it is to transition to the next resource, the higher the factor. For example, using clear-text credentials found on a machine to log in to a resource is easy, so the factor is high, say, 0.9.
Using a vulnerability to stealthily take hold of a resource requires a more skilled attacker, and the factor would be lower, say, 0.4. This factor gives rise to the definitions of “easily compromised” and “complicated step”: steps with a decay factor higher than 0.5 are considered easy, and those with a lower factor are considered difficult.
The score of the CSI is multiplied by the decay factors of its steps, so that longer and more complex paths score lower than shorter and simpler ones.
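The decay mechanism and the ordering inside a primary group can be sketched as follows. The field names, the sample items, and in particular the way the two parameters are combined into a single sort key are assumptions; the post only states that both are considered.

```python
from math import prod

EASY_THRESHOLD = 0.5  # factors above 0.5 are "easy", below are "complicated"


def decayed_score(csi_score, step_factors):
    """Multiply the CSI score by the decay factor of every step on the path."""
    if any(not 0.0 <= f <= 1.0 for f in step_factors):
        raise ValueError("decay factors must be between 0 and 1")
    return csi_score * prod(step_factors)


def complicated_steps(step_factors):
    """Count difficult steps. A factor of exactly 0.5 is not classified by the
    text; here it is not counted as complicated (an assumption)."""
    return sum(1 for f in step_factors if f < EASY_THRESHOLD)


def secondary_key(item):
    """Assumed ordering: more high-importance resources first, then more
    less-important resources, then the higher decayed score."""
    return (
        -item["high_importance"],
        -item["low_importance"],
        -decayed_score(item["score"], item["steps"]),
    )


# Hypothetical issues that share one primary group (same type, same score).
group = [
    {"id": "a", "score": 13, "steps": [0.9], "high_importance": 1, "low_importance": 2},
    {"id": "b", "score": 13, "steps": [0.9, 0.4], "high_importance": 1, "low_importance": 2},
    {"id": "c", "score": 13, "steps": [0.4], "high_importance": 2, "low_importance": 0},
]

print([item["id"] for item in sorted(group, key=secondary_key)])  # ['c', 'a', 'b']
```

Issue "c" leads because it reaches more high-importance resources; "a" precedes "b" because its single easy step decays the score less than the longer path that includes a complicated step.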
Summary
The introduced framework is a new method developed to evaluate risk and prioritize the tasks needed to strengthen the environment’s configuration and posture. Using this framework, issues can be addressed in a manner that best serves the security needs of the organization, allowing for better management and a quicker reduction of the attack surface. The method emphasizes critical assets and prevents a multitude of less significant issues from overshadowing the crucial and immediate threats. The framework is flexible and can easily be adjusted and extended to include specific or new scenarios.
Appendix 1: Issues
Software vulnerability
Description
A resource may be at increased risk of being compromised, attacked, or exploited due to a software flaw.
Examples
A server is vulnerable to the Log4Shell vulnerability (CVE-2021-44228), increasing the risk of exploitation through remote code execution.
Anonymous access
Description
A resource can be accessed without being required to provide identifying information, allowing untrusted actors to access data or compute resources and potentially do harm.
Example
Storage resources open to the public may lead to unauthorized access to sensitive data, as well as data breaches and corruption. Additionally, public storage resources may be used to spread malware.
Excessive permissions
Description
An identity is allowed to access more resources than is necessary for its intended function, providing opportunities for malicious actors to gain access to sensitive data or systems.
Example
A simple user that has permissions on the entire subscription.
A service account that only needs “read” permission to a database but is also granted “delete” permission may be used to corrupt the database, if compromised.
No MFA
Description
Multi-Factor Authentication (MFA) is not enabled for users of a certain type. A user without MFA enabled is at a higher risk of having their account accessed by unauthorized individuals, such as through phishing attacks or password guessing.
Example
A company that only requires employees to use a username and password to access their email accounts.
The Target data breach of November 2013 started when attackers managed to steal a third-party contractor’s password through a phishing attack. Since no MFA was required, the attackers were able to use the stolen credentials to access Target systems.
Default account
Description
A system or application uses a preconfigured user account that has a default username and password. Malicious actors may be able to use guessed or known default usernames and passwords to gain unauthorized access to sensitive data or systems.
Example
Devices are often provided with default usernames such as “root” and “admin”. The Mirai botnet was able to spread by brute-forcing default usernames and passwords on Internet of Things (IoT) devices, disrupting major websites such as Netflix, PayPal, and others.
Usage of local identity services
Description
Authentication and authorization should be managed by the central identity provider since it allows for more granular control over which resources and systems each user can access.
Example
Use Azure Active Directory authentication for Azure Kubernetes Service clusters.
Privilege escalation
Description
The resource is using a risky configuration that might allow privilege escalation. If compromised, an attacker may gain higher privileges and potentially get full control over the resource.
Example
A container that runs in privileged mode can access host resources, including modifying the host filesystem, and can gain elevated privileges.
Unnecessary internet reachability
Description
The resource can be reached from the internet, either directly by having a public IP or due to insufficient network restrictions. This can be a security risk because it exposes the resource to potential attacks and unauthorized access attempts.
Example
A virtual machine that listens on port 3389 (RDP) and is directly accessible from the internet may be targeted and exploited as an entry point to the network.
Unencrypted communication protocols
Description
Encrypted protocols should be used for traffic over the internet. Encryption ensures that only authorized parties can read your data if someone intercepts it as it travels over the network.
Example
Web applications should require HTTPS with TLS 1.2 or higher rather than plain HTTP.
Poor integrity
Description
Software integrity refers to the trustworthiness of software and the confidence that it is genuine and will perform as expected without any malicious or unintended behavior.
Example
Linux virtual machines should use signed boot components. Unsigned boot components are exposed to tampered and backdoored software versions.
Data loss and corruption
Description
Measures should be taken to reduce the risk of accidental data loss or corruption.
Example
Enabling deletion protection for resources reduces the risk of accidental deletion and loss of data.
Sanitation
Description
Refers to the process of deletion or removal of unnecessary resources and accounts to reduce attack surface and prevent unauthorized access.
Example
Blocked accounts with high privileged permissions should be deleted to reduce attack surface.
Appendix 2: Contexts
Issues, by definition, indicate a weakness in the environment, and therefore every issue may also add context to other issues found. Besides the issues themselves, contexts include:
Contains sensitive data
Description
The resource contains sensitive data, such as PII (personally identifiable information), network configuration information, or credentials (usernames, passwords, keys) that, if obtained by unauthorized individuals, could be used to access other systems.
Example
A storage resource contains employees’ personal information, such as social security numbers and credit card numbers.
Resource importance
Description
The resource is of high business criticality. This context may be based on end-user input, allowing them to adjust and promote assets and behavior of interest.
Example
A subscription owner of an Azure subscription, whether an individual or a managed identity, has full administrative access and holds the highest level of authority over the resources and services within that subscription. If the owner account is compromised, it poses significant risks to the security and integrity of the Azure environment. The compromised account can be used to gain unrestricted access to any resource within the subscription, leading to data breaches, account takeover, and resource abuse (such as crypto-mining), allowing the attacker to achieve persistence, and undermining compliance with regulatory requirements and industry standards.
Access to a resource
Description
The resource can authenticate and has permissions to another resource, either by its assigned identity and permissions, or by utilizing credentials found on it.
Example
A VM that can authenticate as a managed identity that has permission to the subscription.
Is running
Description
The resource is currently running, turned ON.
Example
Indicates that a resource, e.g., Kubernetes cluster, is in a running state.
Cross cloud
Description
A resource maintained by one cloud service provider can be used to compromise a resource maintained by another cloud service provider.
Example
An SSH key found on an AWS EC2 instance that can be used to authenticate to an Azure VM.
Highly likely to be under attack
Description
There is an indication of an ongoing attack on the resource.
Example
There are recent alerts for RDP brute-force activity on the VM.
Successfully compromised resource
Description
A resource that has been successfully compromised and is therefore untrusted.
Example
There are recent alerts for RDP brute-force activity on the VM resulting in a successful login.