azure storage
SSL/TLS connection issue troubleshooting guide
You may experience exceptions or errors when establishing TLS connections with Azure services. The exceptions vary considerably depending on the client and server types; typical examples include "Could not create SSL/TLS secure channel" and "SSL Handshake Failed". In this article we will discuss common causes of TLS-related issues and troubleshooting steps.
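As a first isolation step, it often helps to confirm which TLS version and cipher suite your client actually negotiates with the service endpoint. The sketch below is a minimal example using Python's standard ssl module; the storage endpoint name is a placeholder.

```python
import socket
import ssl

# Placeholder endpoint - replace with your own storage account blob endpoint.
host = "mystorageaccount.blob.core.windows.net"
port = 443

# A default client context negotiates the highest TLS version supported
# by both the client OS and the service.
context = ssl.create_default_context()

with socket.create_connection((host, port), timeout=10) as sock:
    with context.wrap_socket(sock, server_hostname=host) as tls:
        # If the handshake fails, an ssl.SSLError is raised here instead.
        print("Negotiated protocol:", tls.version())        # e.g. TLSv1.2 or TLSv1.3
        print("Cipher suite:", tls.cipher()[0])
        print("Peer certificate subject:", tls.getpeercert().get("subject"))
```

If this script negotiates a connection but your application still fails, the problem is more likely in the application's TLS configuration (for example, an outdated protocol version pinned by the framework) than in network reachability.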
Azure Data Factory trigger is not initiated when uploading a file using Java SDK
Uploading a file using the Java SDK class DataLakeFileClient does not initiate an ADF trigger, even though the trigger is configured correctly to fire once a new file is created. This happens only when the trigger is configured to ignore 0-byte blobs.
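The article above concerns the Java SDK, but the equivalent Python client exposes the same create/append/flush pattern, shown here only to illustrate why a zero-byte blob exists before the data is flushed. The account, file system, path and credential below are placeholders.

```python
from azure.storage.filedatalake import DataLakeFileClient

# Illustrative values only - substitute your own account, file system, and credential.
file_client = DataLakeFileClient(
    account_url="https://mystorageaccount.dfs.core.windows.net",
    file_system_name="myfilesystem",
    file_path="incoming/sample.csv",
    credential="<account-key-or-token>",
)

data = b"col1,col2\n1,2\n"

# Step 1: create_file() writes a zero-byte file first ...
file_client.create_file()

# Step 2: ... the content only exists after the data is appended and flushed.
file_client.append_data(data, offset=0, length=len(data))
file_client.flush_data(len(data))
```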
Set Up Endpoint DLP Evidence Collection on your Azure Blob Storage

Endpoint Data Loss Prevention (Endpoint DLP) is part of the Microsoft Purview Data Loss Prevention (DLP) suite of features you can use to discover and protect sensitive items across Microsoft 365 services. Microsoft Endpoint DLP allows you to detect and protect sensitive content across onboarded Windows 10, Windows 11 and macOS devices. Learn more about all of Microsoft's DLP offerings. Before you start setting up the storage, you should review Get started with collecting files that match data loss prevention policies from devices | Microsoft Learn to understand the licensing, permissions, device onboarding and your requirements.

Prerequisites
Before you begin, ensure the following prerequisites are met:
- You have an active Azure subscription.
- You have the necessary permissions to create and configure resources in Azure.
- You have set up an endpoint Data Loss Prevention policy on your devices.

Configure the Azure Blob Storage
You can follow these steps to create an Azure Blob Storage account using the Azure portal. For other methods, refer to Create a storage account - Azure Storage | Microsoft Learn.
1. Sign in to Azure Storage Accounts with your account credentials.
2. Click on + Create.
3. On the Basics tab, provide the essential information for your storage account. After you complete the Basics tab, you can choose to further customize your new storage account, or accept the default options and proceed. Learn more about Azure storage account properties.
4. Once you have provided all the information, click on the Networking tab. In network access, select Enable public access from all networks while creating the storage account.
5. Click on Review + create to validate the settings. Once the validation passes, click on Create to create the storage account.
6. Wait for deployment of the resource to complete and then click on Go to resource.
7. Once the newly created Blob Storage is opened, on the left panel click on Data Storage -> Containers.
8. Click on + Container. Provide the name and other details and then click on Create.
9. Once your container is successfully created, click on it.

Assign relevant permissions to the Azure Blob Storage
Once the container is created, using Microsoft Entra authorization, you must configure two sets of permissions (role groups) on it:
- One for the administrators and investigators so they can view and manage evidence
- One for users who need to upload items to Azure from their devices
Best practice is to enforce least privilege for all users, regardless of role. By enforcing least privilege, you ensure that user permissions are limited to only those necessary for their role. We will use the portal to create these custom roles. Learn more about custom roles in Azure RBAC.
1. Open the container and in the left panel click on Access Control (IAM).
2. Click on the Roles tab. It will open a list of all available roles.
3. Open the context menu of the Owner role using the ellipsis button (…) and click on Clone. Now you can create a custom role. Click on Start from scratch.
4. We have to create two new custom roles. Based on the role you are creating, enter basic details like name and description and then click on the JSON tab. The JSON tab shows the details of the custom role, including the permissions added to that role.
5. Now edit these permissions and replace them with the permissions required for the role:
   - Investigator Role: Copy the permissions available at Permissions on Azure blob for administrators and investigators and paste them in the JSON section.
   - User Role: Copy the permissions available at Permissions on Azure blob for users and paste them in the JSON section.
Once you have created these two new roles, assign them to the relevant users:
1. Click on the Role Assignments tab, then on Add + and on Add role assignment.
2. Search for the role and click on it. Then click on the Members tab.
3. Click on + Select Members. Add the users or user groups you want to add for that role and click on Select.
   - Investigator role – Assign this role to users who are administrators and investigators so they can view and manage evidence.
   - User role – Assign this role to users who will be under the scope of the DLP policy and from whose devices items will be uploaded to the storage.
4. Once you have added the users, click on Review + Assign to save the changes.
Now we can add this storage to the DLP policy. For more information on configuring Azure Blob Storage access, refer to these articles: How to authorize access to blob data in the Azure portal; Assign share-level permissions.

Configure storage in your DLP policy
Once you have configured the required permissions on the Azure Blob Storage, we will add the storage to the DLP endpoint settings. Learn more about configuring DLP policy.
1. Open the storage account you want to use. In the left panel click on Data Storage -> Containers, then select the container you want to add to DLP settings.
2. Click on the Context Menu (… button) and then Container Properties. Copy the URL.
3. Open the Data Loss Prevention settings. Click on Endpoint Settings and then on Setup evidence collection for file activities on devices.
4. Select the Customer Managed Storage option and then click on Add Storage.
5. Give the storage a name and paste the container URL you copied, then click on Save. The storage will be added to the list for use in the policy configuration. You can add up to 10 URLs.
6. Now open the DLP endpoint policy configuration for which you want to collect the evidence. Configure your policy using these settings:
   - Make sure that Devices is selected in the location.
   - In Incident reports, toggle Send an alert to admins when a rule match occurs to On.
   - In Incident reports, select Collect original file as evidence for all selected file activities on Endpoint.
   - Select the storage account you want to collect the evidence in for that rule using the dropdown menu. The dropdown menu shows the list of storages configured in the endpoint DLP settings.
   - Select the activities for which you want to copy matched items to Azure storage.
   - Save the changes.
Please reach out to the support team if you face any issues. We hope this guide is helpful and we look forward to your feedback. Thank you, Microsoft Purview Data Loss Prevention Team
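As a quick aside to the role-assignment steps above: if you want to confirm that an investigator account can actually enumerate evidence items in the container, a minimal sketch using the Azure SDK for Python is shown below. The container URL is a placeholder, and DefaultAzureCredential is just one possible way to authenticate.

```python
from azure.identity import DefaultAzureCredential
from azure.storage.blob import ContainerClient

# Placeholder URL - use the container you created for evidence collection.
container_url = "https://mystorageaccount.blob.core.windows.net/evidence"

# DefaultAzureCredential resolves to the signed-in user, a managed identity,
# or environment credentials, depending on where this runs.
container = ContainerClient.from_container_url(
    container_url, credential=DefaultAzureCredential()
)

# If the custom investigator role was assigned correctly, listing succeeds;
# otherwise the call fails with an authorization error.
for blob in container.list_blobs():
    print(blob.name, blob.size)
```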
How to calculate Container Level Statistics in Azure Blob Storage with Azure Databricks
This article describes how to get container-level stats in Azure Blob Storage and how to work with the information provided by blob inventory. The approach presented here uses Azure Databricks and is best suited to storage accounts with a huge amount of data, on the order of terabytes (TB) or more.
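As a rough illustration of the approach the linked article describes, the PySpark sketch below reads a blob inventory report (assumed here to be Parquet output with Name and Content-Length fields, and with the container name as the first path segment of Name — adjust to match your own inventory rule) and aggregates blob count and total size per container.

```python
from pyspark.sql import functions as F

# Assumed inventory output path - adjust to your storage account and rule destination.
inventory_path = (
    "abfss://inventory@mystorageaccount.dfs.core.windows.net/2024/01/01/*.parquet"
)

# `spark` is the ambient SparkSession available in a Databricks notebook.
df = spark.read.parquet(inventory_path)

stats = (
    # Assumes the Name field looks like "mycontainer/folder/file.txt".
    df.withColumn("container", F.split(F.col("Name"), "/").getItem(0))
      .groupBy("container")
      .agg(
          F.count("*").alias("blob_count"),
          (F.sum("Content-Length") / (1024 ** 3)).alias("total_size_gb"),
      )
      .orderBy(F.col("total_size_gb").desc())
)

stats.show(truncate=False)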
Lifecycle Management of Blobs (Deletion) using Automation Tasks

Background:
We often encounter scenarios where we need to delete blobs that have been idle in a storage account for an extended period. For a small number of blobs, deletion can be handled easily using the Azure Portal, Storage Explorer, or inline scripts such as PowerShell or Azure CLI. However, in most cases we deal with a large volume of blobs, making manual deletion impractical. In such situations, it's essential to leverage automation tools to streamline the deletion process. One effective option is using Automation Tasks, which can help schedule and manage blob deletions efficiently.
Note: Behind the scenes, an automation task is actually a logic app resource that runs a workflow, so the Consumption pricing model of Logic Apps applies to automation tasks.

Scenarios where "Automation Tasks" are helpful:
- You have a requirement to automate deletion of blobs which are older than a specific time, in days, weeks or months.
- You don't want to put in much manual effort and would rather use simple UI-based steps to achieve your goal.
- You have system containers and want to act on them. LCM (Lifecycle Management) can also be leveraged to automate deletion of older blobs; however, LCM cannot be used to delete blobs from system containers.
- You have to work on page blobs.

Setup "Automation Tasks":
Let's walk through how to achieve our goal.
1. Navigate to the desired storage account, scroll down to the "Automation" section, select the "Tasks" blade and then click on "Add Task" from the top or bottom panel.
2. On the next page click "Select".
3. On the page that opens there isn't anything we need to change, so just click on "Next : Configure" and move to the next screen.
4. Fill in the new page as per your requirement. In the sample used here, 'sample' is a folder inside the container '$web'. The "Expiration Age" field means that blobs older than this number of days will be deleted; in the sample, blobs older than 180 days would be deleted. Similarly, we can configure values in weeks or months as well.
5. Once we are through with the steps, proceed with creation of the task.
6. Once the task is created, you can click on "View Run" to see the run history. In case you want to modify the task, click on your task's name (for example, "mytask") and re-configure it.

Now this isn't sufficient. We will update some of the steps in the logic app that backs the task, so we need to edit those steps and save them before re-running the app.
a) Go to the logic app and navigate to the "Logic App Designer" blade.
b) Click on the "+" sign and then "Add an Action".
c) Once the new page opens, search for "List Blobs (V2)" and select it.
d) Choose "Enter custom value" and enter your storage account name.
e) Review that the values are set as required.
f) Now navigate to the "For Each" condition.
g) Delete the "Delete blob" action and replace it with "Delete blob (V2)".
h) Configure the "Delete Blob (V2)" action.
i) With all steps ready, save the logic app and click on "Run". You should observe the run passing successfully.

Impact due to Firewall:
The above steps work when your storage account is configured for public access.
However, when the firewall is enabled, you need to provide the necessary permissions, otherwise you will encounter 403 "Authorization Failure" errors. There is no issue creating the task, but you will see failures when you check the runs.
To overcome this limitation, you need to navigate to your logic app, generate a managed identity for the app, and grant that identity the "Storage Blob Data Contributor" role.

Step 1. Enable Managed Identity:
- In the Azure Portal, go to your Logic App resource.
- Under Settings, select Identity.
- In the Identity pane, under System assigned, select On and Save. This step registers the system-assigned identity with Microsoft Entra ID, represented by an object ID.

Step 2. Assign Necessary Role:
- Open the Azure Storage Account in the Azure Portal.
- Select Access control (IAM) > Add > Add role assignment.
- Assign a role like 'Storage Blob Data Contributor', which includes write access for blobs in an Azure Storage container, to the managed identity.
- Under Assign access to, select Managed identity > Add members, and choose your Logic App's identity.
- Save and refresh, and you will see the new role configured on your storage account.

Remember that if the storage account and logic app are in different regions, you should add another step in the firewall of the storage account: whitelist the logic app instance in the "Resource instances" list.

Conclusion:
Multiple ways to act on blobs are provided for your convenience. Your requirements, feasibility, and other factors like comfort with the feature or pricing will certainly influence your decision. However, in case you want to act upon system containers like $logs or $web, "Automation Tasks" are one of the most helpful features you can use to achieve your goal.
Note: At the time of writing this blog, this feature is still in preview, so check whether there are any limitations which might impact you before implementing it in your production environment.

References:
Create automation tasks to manage and monitor Azure resources - Azure Logic Apps | Microsoft Learn
Optimize costs by automatically managing the data lifecycle - Azure Blob Storage | Microsoft Learn
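As a side note to the automation task above: the retention rule it applies (delete blobs older than a given age) and the managed-identity authentication just described can be pictured with a short SDK sketch. This is only an illustration under assumed names (the 'sample' folder inside '$web', a 180-day window); it is not what the Logic App executes internally.

```python
from datetime import datetime, timedelta, timezone

from azure.identity import DefaultAzureCredential
from azure.storage.blob import ContainerClient

# Illustrative values - match these to your own account, container, and retention window.
container = ContainerClient(
    account_url="https://mystorageaccount.blob.core.windows.net",
    container_name="$web",
    credential=DefaultAzureCredential(),  # resolves to a managed identity when run in Azure
)

cutoff = datetime.now(timezone.utc) - timedelta(days=180)

# Only look at blobs under the 'sample' folder, mirroring the task configuration above.
for blob in container.list_blobs(name_starts_with="sample/"):
    # last_modified is timezone-aware, so it compares cleanly against the UTC cutoff.
    if blob.last_modified < cutoff:
        print(f"Deleting {blob.name} (last modified {blob.last_modified})")
        container.delete_blob(blob.name)
```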
Optimizing ETL Workflows: A Guide to Azure Integration and Authentication with Batch and Storage
Unlock the Power of Azure: Transform Your ETL Pipelines
Dive into the world of data transformation and discover how to build a solid foundation for your ETL pipelines with Azure's powerful trio: Data Factory, Batch, and Storage. Learn to navigate the complexities of data authentication and harness the full potential of Synapse Pipeline for seamless integration and advanced data processing. Ready to revolutionize your data strategy? This guide is your key to mastering Azure's services and optimizing your workflows.
Troubleshooting connectivity to Azure Storage over SFTP via Windows or Linux machine

Azure Storage supports the Secure File Transfer Protocol (SFTP) on Azure Storage accounts. We can use an SFTP client to securely connect to the Blob Storage endpoint of your Azure Storage account and then perform upload and download operations against the account. Please note that SFTP support is only available for hierarchical namespace (ADLS Gen2) enabled accounts. In this article, we will discuss how to troubleshoot and isolate connectivity issues to an SFTP-enabled storage account from your machine, to understand whether the problem is due to port blockage, firewall issues, connectivity over a private endpoint, or incompatibility of the client being used due to unsupported algorithms, whether from a Windows or Linux machine. Let's look at some of the steps/actions you can perform from your side for isolation.

From Windows Machine
For Windows machines, we can make use of PowerShell, OpenSSH or WinSCP to connect to the storage account via SFTP. In the demo below, the authentication mechanism used is an SSH key. For authentication mechanisms supported for SFTP, you can refer to the link: Connect to Azure Blob Storage using SFTP - Azure Storage | Microsoft Learn

Scenario 1: Verifying the connectivity to Port 22
SFTP requires outgoing connections over port 22 to be allowed. You can check whether port 22 is open by running the command below from a PowerShell console on the Windows machine (the storage endpoint is a placeholder):
Test-NetConnection -ComputerName <storageaccount>.blob.core.windows.net -Port 22 -InformationLevel "Detailed"
If port 22 is blocked, you will get connectivity issues; in this scenario you typically see a "connection reset" error message.

Scenario 2: Storage account has firewall or VNET restrictions enabled
If the storage account is behind a firewall or VNet and you are trying to connect to the storage account over SFTP, the connection will fail. You can check the failed request ID in the Diagnostic Logging, which will point to IPAuthorizationFailure. As a mitigation, please ensure that the connection from the VM you are accessing the storage account from is allowed in the storage account firewall rules.

Scenario 3: Connectivity over Private Endpoint
If the storage account is behind a private endpoint, please ensure that you are using the correct endpoint to connect. The connection is made using the connection string below:
myaccount.myuser@myaccount.privatelink.blob.core.windows.net
If a home directory hasn't been specified for the user, the connection string is defined as myaccount.mycontainer.myuser@myaccount.privatelink.blob.core.windows.net.
To verify there is connectivity between the storage account and the VM, you can also perform "nslookup" on the storage account endpoint. You should see the private IP of the storage account as the result of the resolution. If you observe a public IP in the response, it means that the connection is not going via the private endpoint of the storage account. If the resolution is intact, you should be able to connect to SFTP successfully.

Scenario 4: Unsupported client due to incompatible algorithms
If you have validated port blockers, firewall and VNET configurations, and are still facing connectivity issues with your SFTP client, it is highly possible that the client is not offering supported algorithms.
You can use any SFTP client; however, it must use the algorithms discussed in the link below:
https://learn.microsoft.com/en-us/azure/storage/blobs/secure-file-transfer-protocol-support#supported-algorithms
If you try to connect using an unsupported algorithm, the connection will fail. If you know which algorithms the client uses underneath, you can verify them against the above document. If not, you can take a network packet capture and check the algorithms being passed during the negotiation, both from client to server and from server to client.

From Linux Machine
The above section talked about executing commands from a Windows machine for isolation. In case you are using a Linux machine/client, you can do the same isolation there as well. For this blog, we have used the Linux distribution RHEL 8.6. We will demonstrate connecting to the Azure Storage account using SFTP via OpenSSH or curl commands from the Linux machine and check for isolation.

Scenario 1: Verifying the connectivity to Port 22
Before proceeding with the commands, we need to test connectivity to port 22, for which we can use the telnet command against the storage endpoint:
telnet <host_storage_account_name> <port_number>

Scenario 2: Connect to the Storage Account using OpenSSH or curl commands
You can make use of the curl command to upload to the Azure Storage account from Linux. Use the command below for the upload operation:
curl -T <filename> -u <account>.<user>:<password> sftp://<account>.blob.core.windows.net/~/<filename>
Here, the parameter "-T" stands for the file path on your local machine that you want to upload to the storage account. With the parameters filled in, the command becomes:
curl -T /home/shxxx/sample.yaml -u "<Account Name>.<Local User Name>:<SSH-Key>" -k "sftp://<Account Name>.blob.core.windows.net/~/sample.yaml"

At present, the SFTP feature has certain limitations on the Azure Storage account. For more details on the SFTP feature and its limitations, you can refer to the links below:
SFTP support for Azure Blob Storage - Azure Storage | Microsoft Learn
Limitations & known issues with SFTP in Azure Blob Storage - Azure Storage | Microsoft Learn
Hope the article was helpful, and do share your views on the same! If you have reviewed these checks but are still facing connectivity issues, you can reach out to Microsoft Support.
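If you prefer to script the SFTP isolation test instead of using WinSCP, OpenSSH or curl, the sketch below uses the paramiko library (an assumption, not something the article requires) to attempt an SFTP login and list the home directory. The account name, local user and key path are placeholders.

```python
import paramiko

# Placeholder values - replace with your storage account, SFTP local user, and key file.
host = "mystorageaccount.blob.core.windows.net"
username = "mystorageaccount.myuser"          # format: <account>.<localuser>
key = paramiko.RSAKey.from_private_key_file("/home/me/.ssh/id_rsa")

transport = paramiko.Transport((host, 22))
try:
    # A failure here usually means port 22 is blocked, the firewall rules reject
    # the client IP, or no mutually supported algorithm was negotiated.
    transport.connect(username=username, pkey=key)
    sftp = paramiko.SFTPClient.from_transport(transport)
    print("Connected. Home directory listing:", sftp.listdir("."))
    sftp.close()
finally:
    transport.close()
```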
Step-by-Step Guide: Setting up Custom Domain for Azure Storage Account with HTTPS Only Enabled

If you are using Azure Storage to host your website, you might want to enable HTTPS Only to ensure secure communication between the client and the server. However, setting up a custom domain with HTTPS Only enabled can be a bit tricky. In this blog, we will guide you through the step-by-step process of setting up a custom domain for your Azure Storage account with HTTPS Only enabled.
Firstly, I would like to briefly explain that currently, configuring a custom domain directly on Azure Storage is only supported for HTTP. If HTTP access is allowed, you can follow this official documentation for configuration: https://learn.microsoft.com/en-us/azure/storage/blobs/storage-custom-domain-name?tabs=azure-portal#map-a-custom-domain-with-only-http-enabled
Otherwise, if we need to use a custom domain with HTTPS Only enabled, we need to leverage other Azure resources to do request forwarding. We will explore three options for setting up a custom domain in this scenario. These three options have their own pros and cons, and a brief comparison is provided at the end of the blog. You can choose whichever fits your specific situation:
- Azure CDN
- Application Gateway
- Azure Front Door
Please note that if you use any of these request forwarders, you no longer need to manually add the custom domain to storage. If you add the custom domain in this case, you may get a "cannot verify" error.

Option 1: Using Azure CDN
Official Document Reference: https://learn.microsoft.com/en-us/azure/storage/blobs/storage-custom-domain-name?tabs=azure-portal#using-azure-cdn
If you already have your own DNS provider, you can ignore the first four steps and start with Step 5.
[Step 1] First of all, you should get a domain. On Azure, you can deploy an "App Service Domain", which will create a DNS zone for you automatically.
[Step 2] You need to purchase a certificate and create a key vault (if you have any existing KVs, you can use an existing one). After the Key Vault is created, go to the Access configuration of the KV and update it to "Vault access policy".
[Step 3] Go to the certificate that you just created, click Certificate Configuration, and complete the checks one by one:
a. Select the key vault you just created to link with the certificate.
b. Verify the domain you created earlier. This step may take a few minutes to take effect after the domain verification is checked with the certificate issuer.
[Step 4] Now you can add a CNAME record in your DNS zone pointing to the CDN endpoint.
[Step 5] Create a new CDN profile on Azure if you don't have one already. This step is within your storage account, where you can navigate to "Front door and CDN" in the left-hand menu.
[Step 6] Then, you should be able to add a custom domain to your CDN endpoint.
[Step 7] Enable the HTTPS feature for the custom domain: click into the custom domain you just added in Step 6 and enable HTTPS. (This process may take up to a few hours to finish.)
To sum up, the traffic goes from the external hostname (the CNAME record name you added in Step 4; in my example, xxxxx.zoeylanms.com) to the CDN (xxxx.azureedge.net), and finally reaches the storage account (static website).

Option 2: Using Application Gateway
This option is similar to the first one, but instead of using Azure CDN to forward requests, we will use Application Gateway to act as a web traffic load balancer and provide advanced security features.
Here are the steps to set up Application Gateway (Steps 1-3 in Option 1 are still needed if you don't have your own DNS provider):
[Step 4] You need an Application Gateway resource on Azure. You can follow our official document to create an Application Gateway in the Azure Portal: https://learn.microsoft.com/en-us/azure/application-gateway/quick-create-portal Note: The backend protocol used in the document example is HTTP. We will update it to HTTPS in a later step.
[Step 5] First, check the public IP of the Application Gateway you just created. Then go back to the DNS zone you created earlier and add a new A record to map traffic to the Application Gateway using the public IP.
[Step 6] Open the Application Gateway, go to Backend Pools, and click the backend pool created just now. For the Target Type, pick "IP address or FQDN" and then put the static website endpoint in the Target. Format: xxxxx.z7.web.core.windows.net
[Step 7] In the Backend Settings, make sure the Backend protocol is updated to "HTTPS".
[Step 8] Then, create a custom health probe using the HTTPS protocol to monitor the health of the resources in the backend pool.
[Step 9] Before you save the new health probe, a test shows the connection result. You can also check the backend health later.

Option 3: Using Azure Front Door
Configuring request forwarding using Front Door is very similar to Azure CDN. You can follow our official documentation to complete the deployment: https://learn.microsoft.com/en-us/azure/storage/blobs/storage-custom-domain-name?tabs=azure-portal#using-azure-front-door

Comparison

Behavior in custom domain setting
- Azure CDN: Azure CDN does not have a built-in option to force HTTPS redirection. In this scenario, when accessing the website through the custom domain using HTTP, you will receive a 400 error with the error message 'AccountRequiresHttps'. However, everything works normally when accessing the webpage through the custom domain link using HTTPS.
- Application Gateway: The scenarios that can be achieved with Application Gateway are more flexible because we can configure and combine the listener's protocol and the backend setting's protocol separately. Therefore, whether the client connects over HTTP or HTTPS, by configuring the backend setting we can forward requests to the backend storage account using HTTPS.
- Azure Front Door: Azure Front Door does have a built-in option to force HTTPS redirection. This can be configured in the Front Door's routing rules by selecting "Redirect all traffic to use HTTPS". When a user tries to access the website using HTTP, Front Door will automatically redirect the request to HTTPS.

Pricing
- Azure CDN: Outbound data transfers are billed based on the node location from where the transfers are served, not the end user's location. Reference: Understanding Azure CDN billing | Microsoft Learn; Pricing - Content Delivery Network (CDN) | Microsoft Azure
- Application Gateway: The cost of Application Gateway is mainly composed of fixed and variable charges. The fixed charge is based on a fixed hourly rate (which may vary by region), while the variable charge depends on the number of capacity units, with an hourly price for each capacity unit. Reference: Understanding pricing - Azure Application Gateway | Microsoft Learn; Application Gateway Pricing | Microsoft Azure
- Azure Front Door: Each part of the request process is billed separately: the number of requests from client to Front Door, data transfer from Front Door edge to origin, data transfer from origin to Front Door (non-billable), and data transfer from Front Door to client. Reference: Understand Azure Front Door billing | Microsoft Learn; Pricing - Front Door | Microsoft Azure

Performance
- Azure CDN: Azure CDN offers global content delivery with low latency and high throughput. It stores cached content on edge server POP locations that are close to consuming users, thereby minimizing network latency. Reference: What is a content delivery network (CDN)? - Azure | Microsoft Learn
- Application Gateway: Application Gateway provides layer 7 load balancing and web application firewall (WAF) capabilities, and offers advanced traffic management features such as session affinity, redirection, and URL-based routing. It provides centralized management and monitoring of multiple backend servers. Reference: What is Azure Application Gateway | Microsoft Learn; Azure Application Gateway features | Microsoft Learn
- Azure Front Door: Azure Front Door provides global load balancing and routing of traffic to different backend services. It unifies static and dynamic delivery in a single tier to accelerate and scale your application through caching, SSL offload, and layer 3-4 DDoS protection. Reference: Azure Front Door | Microsoft Learn

To summarize the blog, the most important thing to remember is that Azure Storage currently does not support the HTTPS Only + custom domain combination directly. Therefore, we need to leverage a resource that can perform request forwarding to achieve our goal. In the previous sections, we introduced three of the most commonly used resources: Azure CDN, Application Gateway, and Azure Front Door.
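Whichever of the three options you choose, you can quickly confirm how the custom domain behaves over plain HTTP versus HTTPS. Below is a minimal sketch using Python's requests library; the domain is a placeholder, and the expected status codes depend on the forwarding option and redirect settings you configured.

```python
import requests

# Placeholder custom domain - replace with the CNAME/A record you configured.
domain = "www.contoso.com"

for scheme in ("http", "https"):
    url = f"{scheme}://{domain}/index.html"
    # allow_redirects=False makes a forced-HTTPS redirect visible as a 301/302/307,
    # while a CDN without redirection surfaces the 400 'AccountRequiresHttps' response.
    response = requests.get(url, allow_redirects=False, timeout=10)
    print(f"{url} -> {response.status_code}", response.headers.get("Location", ""))
```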
Mount ADLS Gen2 or Blob Storage in Azure Databricks
Azure Databricks offers many of the same features as the Databricks platform, such as a web-based workspace for managing Spark clusters, notebooks, and data pipelines, along with Spark-based analytics and machine learning tools. It is fully integrated with Azure cloud services, providing native access to Azure Blob Storage, Azure Data Lake Storage, Azure SQL Database, and other Azure services. This blog shows examples of mounting Azure Blob Storage or Azure Data Lake Storage in the Databricks File System (DBFS), with two authentication methods for the mount: access key and SAS token.
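As a quick illustration of what the linked article walks through, the snippet below mounts a Blob Storage container in DBFS using an account access key. It is a minimal sketch with placeholder names; dbutils is only available inside a Databricks notebook, and in practice the key should come from a secret scope rather than being hard-coded.

```python
# Placeholder values - substitute your own storage account, container, and secret scope.
storage_account = "mystorageaccount"
container = "mycontainer"
mount_point = f"/mnt/{container}"

# Retrieve the access key from a Databricks secret scope instead of hard-coding it.
access_key = dbutils.secrets.get(scope="my-scope", key="storage-account-key")

# Mount only if the path is not already mounted.
if not any(m.mountPoint == mount_point for m in dbutils.fs.mounts()):
    dbutils.fs.mount(
        source=f"wasbs://{container}@{storage_account}.blob.core.windows.net",
        mount_point=mount_point,
        extra_configs={
            f"fs.azure.account.key.{storage_account}.blob.core.windows.net": access_key
        },
    )

# The mounted container now appears like a regular path under /mnt.
display(dbutils.fs.ls(mount_point))
```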