concurrency
3 Topics

Solution: Handling Concurrency in Azure Data Factory with Marker Files and Web Activities
Hi everyone, I wanted to share a concurrency issue we encountered in Azure Data Factory (ADF) and how we resolved it using a small but effective enhancement, one that might be useful if you're working with shared Blob Storage across multiple environments (like Dev, Test, and Prod).

Background: Shared Blob Storage & Marker Files

In our ADF pipelines, we extract data from various sources (e.g., SharePoint, Oracle) and store it in Azure Blob Storage. That Blob container is shared across multiple environments. To prevent duplicate extractions, we use marker files:
- started.marker → created when a copy begins
- completed.marker → created when the copy finishes successfully

If both markers exist, pipelines reuse the existing file (caching logic). This mechanism was already in place and worked well under normal conditions.

The Issue: Race Conditions

We observed that simultaneous executions from multiple environments sometimes led to:
- Overlapping attempts to create the same started.marker
- Duplicate copy activities
- Corrupted Blob files

This became a serious concern because the Blob file was later loaded into Azure SQL Server, and any corruption led to failed loads.

The Fix: Web Activity + REST API

To solve this, we modified only the creation of started.marker by:
- Replacing Copy Activity with a Web Activity that calls the Azure Storage REST API
- The API uses Azure Blob Storage's conditional header If-None-Match: * to safely create the file only if it doesn't exist
- If the file already exists, the API returns "BlobAlreadyExists", which the pipeline handles by skipping.

The Copy Activity is still used to copy the data and create the completed.marker, so no changes were needed there.
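For readers who want to see the conditional create outside of ADF, here is a minimal Python sketch of the same idea: a Put Blob request carrying If-None-Match: *, where only one caller can succeed. It is not the pipeline's actual Web Activity definition. The function names, the bearer-token authentication, the x-ms-version value, and the choice to treat both 409 and 412 as "marker already exists" are assumptions made for illustration.

```python
import urllib.error
import urllib.request


def build_marker_request(blob_url, bearer_token):
    """Build a Put Blob request that should only succeed if the blob is absent.

    blob_url and bearer_token are hypothetical inputs; the post does not say
    how the Web Activity authenticates.
    """
    headers = {
        "x-ms-version": "2021-08-06",      # assumed service version
        "x-ms-blob-type": "BlockBlob",     # required by Put Blob
        "If-None-Match": "*",              # create only if no blob exists yet
        "Authorization": f"Bearer {bearer_token}",
        "Content-Length": "0",             # the marker is an empty blob
    }
    return urllib.request.Request(blob_url, data=b"", headers=headers, method="PUT")


def try_claim_marker(blob_url, bearer_token, opener=urllib.request.urlopen):
    """Return True if this caller created the marker, False if it already existed."""
    request = build_marker_request(blob_url, bearer_token)
    try:
        with opener(request) as response:
            return response.status == 201  # 201 Created: this run owns the marker
    except urllib.error.HTTPError as err:
        # Assumption: 409 (BlobAlreadyExists) or 412 means another run got there first.
        if err.code in (409, 412):
            return False
        raise  # anything else is a real failure and should surface
```

A caller would proceed with the copy only when try_claim_marker returns True, and skip or retry otherwise. In the pipeline itself, the same headers would presumably go on the Web Activity's PUT request.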
Updated Flow

1. Check marker files:
   - If both exist (started and completed) → use cached file
   - If only started.marker → wait and retry
   - If none → continue to step 2
2. Web Activity calls REST API to create started.marker
   - Success → proceed with copy in step 3
   - Failure → another run already started → skip/retry
3. Copy Activity performs the data extract
4. Copy Activity creates completed.marker

Benefits

- Atomic creation of started.marker → no race conditions
- Minimal change to existing pipeline logic with marker files
- Reliable downstream loads into Azure SQL Server
- Preserves existing architecture (no full redesign)

Would love to hear:
- Have you used similar marker-based patterns in ADF?
- Any other approaches to concurrency control that worked for your team?

Thanks for reading! Hope this helps someone facing similar issues.

40 Views, 0 likes, 0 Comments

Windows Server 2022 or 2025 Data Centre edition - concurrent editing of Microsoft Office documents
Does Windows Server 2022 or 2025 Data Centre edition provide real-time collaboration or concurrent editing (through workspaces etc.) by multiple users on Microsoft Office documents hosted on it locally? In other words, purely for the sake of concurrent editing of Office documents, can Windows Server 2022 or 2025 Data Centre edition be an alternative to SharePoint Server hosted locally or on premises?

Solved

195 Views, 0 likes, 2 Comments

Random issues with PnP PowerShell in Azure Runbooks
Hi everyone, we're facing random issues when running PnP PowerShell from Azure runbooks when several instances of the same runbook run concurrently.

The scenario

We have several runbooks. They all use PnP PowerShell, and each of them performs different actions, such as enabling the site collection app catalog, scripting, external sharing, creating a list, etc. Those runbooks get triggered in two different ways:
- one is triggered after a new site is created, through a site script that triggers a logic app, and from there we trigger the runbook
- the others are triggered via webhook (HTTP POST from a function attached to a queue)

The issues

When the same runbook is triggered more than once at the same time, the instances fail in different ways:
- we see the logs being logged more than once, then the runbooks get suspended
- PnP randomly fails: it does not enable the site catalog, or enable scripting, etc.
- if we put our code inside a try/catch block, we see weird errors (null reference, invalid connection when using Connect-PnPOnline, etc.)

Some errors are:

Set-PnPTenantSite : Object reference not set to an instance of an object.
At line:41 char:9

Connect-PnPOnline : Token request failed.
At line:31 char:5

Set-PnPTenantSite : No connection, please connect first with Connect-PnPOnline
At line:41 char:9

Add-PnPSiteCollectionAppCatalog : Object reference not set to an instance of an object.

Background info: The runbooks were never executed at the same time against the same site. When we trigger them concurrently, we only do it for 10 sites, and the runbooks are pretty simple (connect to the tenant, enable the site collection app catalog, and then disconnect).

Any ideas? Thanks

5.8K Views, 0 likes, 1 Comment