With over 200 services, monitoring Microsoft Azure efficiently can be challenging for some security teams. For incident response to be successful, the proper tools and logging systems should be in place, but that is usually easier said than done. Organizations should prioritize intrusion prevention while also ensuring that the right configurations are in place to identify the source of any intrusion or incident. Proactive readiness involves taking preventive measures even in the absence of an active threat, which makes the various stages of incident response more efficient. This blog post shares lessons learned from Microsoft experts during forensic investigations in Azure and highlights key configurations that can improve forensic accuracy and completeness.
Forensic readiness in the cloud
Forensic readiness in the cloud refers to an organization’s ability to collect, preserve, and analyze digital evidence in preparation for security incidents.
Forensic readiness is increasingly important as more organizations migrate workloads to the cloud. Achieving an appropriate security posture ensures that organizations are adequately equipped for forensic investigations. This requires more than just the presence of logs; logging and monitoring configurations must be thoughtfully scoped and proactively enabled.
Additionally, the adoption of cloud environments presents unique challenges for forensic investigations. First, capturing the right evidence can be difficult due to the dynamic nature of cloud data. Second, in a shared responsibility model, organizations must work closely with their cloud providers to ensure preparedness for forensic investigations. Azure’s multi-tenant architecture adds another layer of complexity, as data from multiple customers may reside on the same physical hardware. Therefore, strict access controls and robust logging are essential. To maintain forensic readiness, organizations must implement comprehensive monitoring and logging across all cloud services to ensure evidence is available when needed.
Preparing your Azure environment for forensic readiness
When the Azure environment is set up correctly and the right logging is in place, it becomes easier to quickly identify the scope of a security breach, trace the attacker’s actions, and identify the tactics, techniques, and procedures (TTPs) employed by a threat actor. By implementing these measures, organizations can ensure that the data required to support forensic investigations is available, which supports compliance with auditing requirements, improves security, and helps resolve security incidents efficiently. With that granularity of log data in the environment, organizations are better equipped to respond to an incident if it occurs.
Case study: Forensic investigation disrupted due to lack of forensic readiness in Azure
In a recent cybersecurity incident, a large company utilizing Azure experienced a major setback in its forensic investigation. This case study outlines the critical steps and logs that were missed, leading to a disrupted investigation.
Step 1: Initial detection of the compromise
The organization’s Security Operations Centre (SOC) identified anomalous outbound traffic originating from a compromised Azure virtual machine (VM) named “THA-VM.” Unfortunately, the absence of diagnostic settings significantly hindered the investigation. Without access to Guest OS logs and data plane logs, the team was unable to gain deeper visibility into the threat actor’s activities.
The lack of critical telemetry—such as Windows Event Logs, Syslog, Network Security Group (NSG) flow logs, and resource-specific data plane access logs—posed a major challenge in assessing the full scope of the compromise. Had these diagnostic settings been properly configured, the investigation team would have been better equipped to uncover key indicators of compromise, including local account creation, process execution, command-and-control (C2) communications, and potential lateral movement.
Figure 1: Diagnostic settings not configured on the virtual machine resource
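A gap like the one in this step can be looked for before an incident occurs. The following is a minimal Python sketch, not a call to any Azure API: it assumes a resource inventory has already been exported into simple records. The `Resource` class, the `EXPECTED_CATEGORIES` mapping, and the category names are illustrative assumptions, not an authoritative list of Azure log categories.

```python
from dataclasses import dataclass, field

# Hypothetical mapping of resource kind -> log categories an investigator
# would want enabled. Names are illustrative assumptions only.
EXPECTED_CATEGORIES = {
    "virtualMachine": {"WindowsEventLogs", "Syslog"},
    "networkSecurityGroup": {"FlowLogs"},
    "storageAccount": {"StorageRead", "StorageWrite", "StorageDelete"},
}


@dataclass
class Resource:
    """One inventory record: a resource and the log categories it emits."""
    name: str
    kind: str
    enabled_categories: set = field(default_factory=set)


def find_logging_gaps(resources):
    """Return {resource name: sorted list of expected-but-missing categories}."""
    gaps = {}
    for res in resources:
        expected = EXPECTED_CATEGORIES.get(res.kind, set())
        missing = expected - res.enabled_categories
        if missing:
            gaps[res.name] = sorted(missing)
    return gaps


# Hypothetical inventory: the VM from the case study has nothing enabled.
inventory = [
    Resource("THA-VM", "virtualMachine"),
    Resource("example-nsg", "networkSecurityGroup", {"FlowLogs"}),
]

print(find_logging_gaps(inventory))
# -> {'THA-VM': ['Syslog', 'WindowsEventLogs']}
```

Running a check of this kind on a schedule is one possible way to surface unlogged resources early. The expected categories would need to be adapted to each environment.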
Step 2: Evidence collection challenges
During the forensic analysis of the compromised virtual machine, the team attempted to capture a snapshot of the OS disk but discovered that restore points had not been configured and no backups were available. This severely limited their ability to preserve and examine critical disk-based artefacts such as malicious binaries, tampered system files, or unauthorized persistence mechanisms. Restore points, which are not enabled by default in Azure virtual machines, allow for the creation of application-consistent or crash-consistent snapshots of all managed disks, including the OS disk. These snapshots are stored in a restore point collection and serve as a vital tool in forensic investigations, enabling analysts to preserve the exact state of a VM at a specific point in time and maintain evidence integrity throughout the investigation process.
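Once a disk snapshot has been exported for analysis, one common way to support evidence integrity is to record a cryptographic hash at collection time and check it again later. The sketch below assumes the exported snapshot is already available as a local file. The `custody_record` and `verify_evidence` helpers and the record fields are hypothetical names chosen for illustration, not part of any Azure tooling.

```python
import hashlib
from datetime import datetime, timezone


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large disk images do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def custody_record(path, collected_by, source_vm):
    """Build a simple (hypothetical) chain-of-custody entry for one evidence file."""
    return {
        "file": path,
        "sha256": sha256_of_file(path),
        "collected_by": collected_by,
        "source_vm": source_vm,
        "recorded_at_utc": datetime.now(timezone.utc).isoformat(),
    }


def verify_evidence(path, record):
    """Return True if the file still matches the hash stored in the record."""
    return sha256_of_file(path) == record["sha256"]
```

An analyst might call `custody_record("osdisk.vhd", "analyst-1", "THA-VM")` at acquisition and `verify_evidence` before each later examination. The file name and analyst label here are placeholders.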
Step 3: Analysis of the storage blob
The team then turned to storage blobs after identifying unusual files that appeared to be associated with threat actor tool staging, such as scanning utilities and credential dumping tools. However, because diagnostic settings for the storage account had not been enabled, the investigators were unable to access essential data plane logs. These logs could have revealed who uploaded or accessed the blobs and when those actions occurred. Since storage diagnostics are not enabled by default in Azure, this oversight significantly limited visibility into attacker behavior and impeded efforts to reconstruct the timeline and scope of malicious activity, an essential component of any effective forensic investigation.
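To illustrate what data plane logs make possible, the sketch below filters a handful of exported log records to show who wrote or read a given blob and when. The records are hypothetical sample data. The field names (`TimeGenerated`, `OperationName`, `CallerIpAddress`, `ObjectKey`) are modeled on blob storage resource logs and should be treated as assumptions to check against the actual log schema in use.

```python
# Operations treated as uploads in this sketch (an assumption; extend as needed).
UPLOAD_OPERATIONS = {"PutBlob", "PutBlockList"}

# Hypothetical records; the addresses come from documentation-only IP ranges.
SAMPLE_LOGS = [
    {"TimeGenerated": "2024-01-01T10:07:00Z", "OperationName": "GetBlob",
     "CallerIpAddress": "198.51.100.7",
     "ObjectKey": "/exampleaccount/staging/scanner.zip"},
    {"TimeGenerated": "2024-01-01T10:02:00Z", "OperationName": "PutBlob",
     "CallerIpAddress": "203.0.113.10",
     "ObjectKey": "/exampleaccount/staging/scanner.zip"},
    {"TimeGenerated": "2024-01-01T09:00:00Z", "OperationName": "PutBlob",
     "CallerIpAddress": "192.0.2.44",
     "ObjectKey": "/exampleaccount/reports/q4.pdf"},
]


def blob_history(records, blob_path):
    """Return every logged operation on one blob, oldest first."""
    matches = [r for r in records if r["ObjectKey"] == blob_path]
    # Uniformly formatted ISO-8601 UTC strings sort correctly as plain text.
    return sorted(matches, key=lambda r: r["TimeGenerated"])


def uploaders(records, blob_path):
    """Return (time, caller IP) pairs for operations that wrote the blob."""
    return [
        (r["TimeGenerated"], r["CallerIpAddress"])
        for r in blob_history(records, blob_path)
        if r["OperationName"] in UPLOAD_OPERATIONS
    ]


for when, ip in uploaders(SAMPLE_LOGS, "/exampleaccount/staging/scanner.zip"):
    print(when, ip)
# -> 2024-01-01T10:02:00Z 203.0.113.10
```

Without the underlying logs, as in this case study, there is nothing for a query like this to run against.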
Step 4: Slow response and escalation
Without tailored logging and monitoring configurations, the response was delayed, and the full incident response process was not initiated quickly enough to minimize the impact.
Step 5: Recovery and lessons learned
Despite the delays, the team pieced together elements of the story from the data available. However, they could not determine the initial access vector, largely because the necessary diagnostic data wasn’t available. This absence of forensic readiness highlights the importance of configuring diagnostic settings, enabling snapshots, and using centralized logging solutions like Microsoft Sentinel, which brings all this telemetry into a single pane of glass, providing real-time visibility and historical context in one place. This unified view enables faster incident detection, investigation, and response. Its built-in analytics and AI capabilities help surface anomalies that might otherwise go unnoticed, while retaining a searchable history of events for post-incident forensics.
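The value of centralizing telemetry can be shown with a small, tool-agnostic sketch: events from separate sources are normalized and merged into a single ordered timeline. This does not show how Microsoft Sentinel works internally. The `build_timeline` helper, the source names, and the sample events are hypothetical.

```python
from operator import itemgetter


def build_timeline(sources):
    """Merge {source name: [events]} into one list ordered by event time.

    Each event is a dict with a uniformly formatted ISO-8601 UTC "time"
    string and a free-text "detail".
    """
    merged = []
    for source_name, events in sources.items():
        for event in events:
            merged.append({
                "time": event["time"],
                "source": source_name,
                "detail": event["detail"],
            })
    return sorted(merged, key=itemgetter("time"))


# Hypothetical events from three separate log sources.
SAMPLE_SOURCES = {
    "sign_in": [
        {"time": "2024-01-01T09:55:00Z", "detail": "sign-in from unfamiliar location"},
    ],
    "vm_guest": [
        {"time": "2024-01-01T10:01:00Z", "detail": "local account created"},
        {"time": "2024-01-01T10:20:00Z", "detail": "outbound connection to unknown host"},
    ],
    "storage": [
        {"time": "2024-01-01T10:02:00Z", "detail": "blob uploaded"},
    ],
}

for entry in build_timeline(SAMPLE_SOURCES):
    print(entry["time"], entry["source"], entry["detail"])
```

Read in order, the merged entries suggest a sequence that no single source shows on its own, which is the kind of context a unified view is meant to provide.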
Recommended practices for forensic readiness in Azure
The table below outlines key recommendations for deploying and administering workloads securely and effectively in Azure. Each recommendation is categorized by focus area and includes a best practice description, specific action to take, and a reference to supporting documentation or resources to assist with implementation.
| Category | Best Practice | Recommended Action | Resource/Link |
|---|---|---|---|
| Identity and Access | Enable MFA for all users. | [ ] Enable Multi-Factor Authentication (MFA) for all Azure AD users. | |
| | Monitor access reviews and role assignments. | [ ] Regularly review identities (SPNs, managed identities, users), role assignments, and permissions for anomalies. | |
| | Implement RBAC with least privilege. | [ ] Use Role-Based Access Control (RBAC) and assign least-privilege roles to users. | |
| | Configure PIM for privileged roles. | [ ] Configure Privileged Identity Management (PIM) for all privileged roles. Require approval for high-privilege roles. | |
| | Enable sign-in and audit logs. | [ ] Ensure all sign-in activities and audit logs are enabled and logging in Azure AD. | |
| | Conditional Access policies: Protect high-risk resources from unauthorized access. | [ ] Set Conditional Access policies to enforce MFA or access restrictions based on conditions like risk or location. | |
| Logging and Monitoring | Enable Azure Monitor. | [ ] Enable Azure Monitor to collect telemetry data from resources. | |
| | Activate Microsoft Defender for Cloud. | [ ] Activate and configure Microsoft Defender for Cloud for enhanced security monitoring. | |
| | Enable diagnostic logging for VMs and applications. | [ ] Configure diagnostic logging for Azure VMs and other critical resources. | |
| | Centralize logs in a Log Analytics workspace. | [ ] Consolidate all logs into a Log Analytics workspace for centralized querying. | |
| | Set audit log retention to 365+ days. | [ ] Ensure audit logs are retained for a minimum of 365 days to meet forensic needs. | |
| | Enable advanced threat detection. | [ ] Enable Microsoft Defender for Cloud and Sentinel to detect anomalous behavior and security threats in real time. | |
| Data Protection | Ensure data is encrypted at rest and in transit. | [ ] Enable encryption for data at rest and in transit for all Azure resources. | |
| | Use Azure Key Vault for key management. | [ ] Store and manage encryption keys, certificates, and secrets in Azure Key Vault. | |
| | Rotate encryption keys regularly. | [ ] Regularly rotate encryption keys, certificates, and secrets in Azure Key Vault. | |
| | Configure immutable backups. | [ ] Set up immutable backups for critical data to prevent tampering. | |
| | Implement File Integrity Monitoring. | [ ] Enable File Integrity Monitoring in Azure Defender for Storage to detect unauthorized modifications. | |
| Network Security | Configure Network Security Groups (NSGs). | [ ] Set up NSGs to restrict inbound/outbound traffic for VMs and services. | |
| | Enable DDoS Protection. | [ ] Implement DDoS Protection for critical resources to safeguard against distributed denial-of-service attacks. | |
| | Use VPNs or ExpressRoute for secure connectivity. | [ ] Establish VPNs or ExpressRoute for secure, private network connectivity. | |
| Incident Response | Set up alerts for suspicious activities. | [ ] Configure alerts for suspicious activities such as failed login attempts or privilege escalation. | |
| | Automate incident response. | [ ] Automate incident response workflows using Azure Automation or Logic Apps. | |
| | Integrate threat intelligence with Sentinel. | [ ] Integrate external threat intelligence feeds into Microsoft Sentinel to enrich detection capabilities. | |
| | Run advanced KQL queries for incident investigations. | [ ] Use Kusto Query Language (KQL) queries in Sentinel to investigate and correlate incidents. | |
| | Establish an incident response plan. | [ ] Document and formalize your organization’s incident response plan with clear steps and procedures. | |
| Policies and Processes | Define a forensic readiness policy. | [ ] Establish and document a forensic readiness policy that outlines roles, responsibilities, and procedures. | |
| | Conduct administrator training. | [ ] Provide regular training for administrators on security best practices, forensic procedures, and incident response. | |
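Some items in the table lend themselves to automated checks. As one example, the 365-day retention recommendation can be verified against an exported list of retention settings. The sketch below is a minimal illustration: the table names and retention values are hypothetical sample data, and `retention_shortfalls` is an illustrative helper, not an Azure API.

```python
# Minimum retention taken from the recommendation in the table above.
MIN_RETENTION_DAYS = 365


def retention_shortfalls(retention_by_table, minimum_days=MIN_RETENTION_DAYS):
    """Return {table name: days short of the minimum} for non-compliant tables."""
    return {
        table: minimum_days - days
        for table, days in sorted(retention_by_table.items())
        if days < minimum_days
    }


# Hypothetical export of configured retention (in days) per log table.
configured = {
    "SigninLogs": 30,
    "AuditLogs": 90,
    "AzureActivity": 365,
}

print(retention_shortfalls(configured))
# -> {'AuditLogs': 275, 'SigninLogs': 335}
```

A report like this could feed a periodic review, so that retention gaps are found before the logs are needed for an investigation.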
By using Microsoft’s tools and implementing these recommended best practices, organizations can improve their forensic readiness and investigation capabilities in Azure. This approach not only helps in responding effectively to incidents but also enhances an organization’s overall security posture. By staying ahead of potential threats and maintaining forensic readiness, you’ll be better equipped to protect your organization and meet regulatory requirements.
Conclusion
Forensic readiness in Azure is not a one-time effort; it is an ongoing commitment that involves proactive planning, precise configuration, and strong coordination across security, operations, and governance teams. Key practices such as enabling diagnostic logging, centralizing telemetry, enforcing least-privilege access, and developing cloud-tailored incident response playbooks are essential. Together, these measures improve your ability to detect, investigate, and respond to security incidents in a timely and effective manner.