Azure Integration Runtime
What Synapse Serverless SQL pool authentication type for an ADF Linked Service?
Hi, I'm relatively new to Azure Data Factory and need your guidance on how to successfully create and test a Linked Service to an Azure Synapse Analytics Serverless SQL pool.

In the past, I successfully created a Linked Service to a third-party (outside our domain) on-premises SQL Server by creating a self-hosted integration runtime on their box and then creating a Linked Service that uses it. I entered the server name, database name, Windows authentication, and the username and password configured by the third party into the Linked Service configuration, and it all tested successfully. That third-party data was extracted and imported, via ADF pipelines, into an Azure SQL Server database within our domain.

Now I need to extract data from our own (hosted in our domain) Azure Synapse Analytics Serverless SQL pool database. My attempt is this, and it fails:

1) I create an 'Azure Synapse Analytics' data store Linked Service.
2) I select 'AutoResolveIntegrationRuntime' as the runtime to use - I believe this is correct because the Synapse source is within our domain (we're fully Microsoft cloud based).
3) I select 'Enter manually' under 'Account selection method'.
4) I place the Azure Synapse Analytics serverless SQL endpoint into the 'Fully qualified domain name' field.
5) I enter the SQL database name found under the 'SQL database' node/section on the Data >> Workspace screen in Synapse.
6) I choose 'System-assigned managed identity' as the authentication type - this is a guess; I was hoping it would recognise the account I'm building the Linked Service with, as that account can also query Synapse and so has Synapse access.
7) I tick the 'Trust server certificate' box. Everything else is left at its default.

When I click Test connection, it fails with the following message:

"Cannot connect to SQL Database. Please contact SQL server team for further support. Server: 'xxxxxxxxxxxx-ondemand.sql.azuresynapse.net', Database: 'Synapse_Dynamics_data', User: ''. Check the linked service configuration is correct, and make sure the SQL Database firewall allows the integration runtime to access. Login failed for user '<token-identified principal>'."

I've reached out to our I.T. team (who are novices with Synapse, ADF, etc., even though they installed them in our domain) and they don't know how to help me, so I'm hoping you can:

1) Is 'Azure Synapse Analytics' the correct data store to choose when looking to extract data from an Azure Synapse serverless SQL pool database?
2) Is using the AutoResolveIntegrationRuntime correct if Synapse is held within our domain? I've previously confirmed this runtime works (and still does), as I had to use it when loading the third-party data into our Azure SQL Server database.
3) Have I populated the 'Fully qualified domain name' and 'Database name' fields correctly by entering the Azure Synapse Analytics serverless SQL endpoint and the SQL database name, respectively?
4) Is 'System-assigned managed identity' the correct authentication type? I'm guessing this could be the issue. I selected it because it was the authentication type used (and it works) when loading the third-party data into the Azure SQL Server database within our domain, so I assumed it would somehow recognise the logged-in user and, through the magic of cloud authentication, allow the Linked Service to work (I.T. say I should have the correct privileges).

Any guidance you can provide will be much appreciated. Thanks.
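A "Login failed for user '<token-identified principal>'" error usually means the caller authenticated to Entra ID but has no user in the target database - i.e. the factory's system-assigned managed identity has never been granted access in the serverless SQL database. Below is a minimal sketch, assuming you can connect to that database as an Entra admin and that the data factory is named `my-adf` (a placeholder, as are the server and database names); it creates a database user for the managed identity and gives it read access, which is what the 'System-assigned managed identity' option relies on.

```python
import struct
import pyodbc
from azure.identity import DefaultAzureCredential

# Acquire an Entra ID token valid for Azure SQL / Synapse SQL endpoints.
credential = DefaultAzureCredential()
token = credential.get_token("https://database.windows.net/.default").token
token_bytes = token.encode("utf-16-le")
access_token = struct.pack(f"<I{len(token_bytes)}s", len(token_bytes), token_bytes)

SQL_COPT_SS_ACCESS_TOKEN = 1256  # connection attribute understood by the MS ODBC driver

# Connect to the serverless SQL endpoint as an Entra admin (placeholder names).
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=xxxxxxxxxxxx-ondemand.sql.azuresynapse.net;"
    "DATABASE=Synapse_Dynamics_data;Encrypt=yes;",
    attrs_before={SQL_COPT_SS_ACCESS_TOKEN: access_token},
    autocommit=True,
)

# Grant the factory's system-assigned managed identity read access.
# 'my-adf' is a placeholder for the Data Factory resource name.
cur = conn.cursor()
cur.execute("CREATE USER [my-adf] FROM EXTERNAL PROVIDER;")
cur.execute("ALTER ROLE db_datareader ADD MEMBER [my-adf];")
conn.close()
```

Once the managed identity has a database user, retesting the linked service with 'System-assigned managed identity' should no longer fail with the token-identified principal error; the AutoResolveIntegrationRuntime and the serverless endpoint/database values from the question are otherwise the expected inputs.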
Failure of Azure Data Factory integration runtime with VNet enabled

I had been using Data Factory's integration runtime with VNet successfully, but it recently stopped connecting to Cosmos DB with the MongoDB API (which is also within a VNet). After setting up a new integration runtime with VNet enabled and 'Auto Resolve' as the region, the pipeline ran successfully with the new runtime.

Could you help me understand why the previous integration runtime - configured with VNet enabled and the region set to match that of Azure Data Factory - worked for over a month and then suddenly failed? The new integration runtime with VNet and the 'Auto Resolve' region worked, but I'm uncertain whether the 'Auto Resolve' region contributed to the success or something else allowed it to connect.

Error: Failure happened on 'Source' side. ErrorCode=MongoDbConnectionTimeout,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=>Connection to MongoDB server is timeout.,Source=Microsoft.DataTransfer.Runtime.MongoDbAtlasConnector,''Type=System.TimeoutException,Message=A timeout occured after 30000ms selecting a server using CompositeServerSelector{ Selectors = MongoDB.Driver.MongoClient+AreSessionsSupportedServerSelector, LatencyLimitingServerSelector{ AllowedLatencyRange = 00:00:00.0150000 } }. Client view of cluster state is { ClusterId : "1", ConnectionMode : "ReplicaSet", Type : "ReplicaSet", State : "Disconnected", Servers : [{ ServerId: "{ ClusterId : 1, EndPoint : "Unspecified/cosmontiv01u.mongo.cosmos.azure.com:10255" }", EndPoint:
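To tell an integration-runtime problem apart from a networking/DNS problem, it can help to probe the same Cosmos DB for MongoDB endpoint from a VM or container inside the same VNet, with the same 30-second server-selection budget the copy activity uses. A minimal sketch, assuming the account name from the error and a placeholder primary key:

```python
# Minimal connectivity probe, assuming it runs from inside the same VNet/subnet
# the managed IR uses; the account name and key below are placeholders.
from pymongo import MongoClient
from pymongo.errors import ServerSelectionTimeoutError

ACCOUNT = "cosmontiv01u"              # placeholder Cosmos DB for MongoDB account
PRIMARY_KEY = "<account-primary-key>"

uri = (
    f"mongodb://{ACCOUNT}:{PRIMARY_KEY}@{ACCOUNT}.mongo.cosmos.azure.com:10255/"
    "?ssl=true&replicaSet=globaldb&retrywrites=false"
)

client = MongoClient(uri, serverSelectionTimeoutMS=30000)  # same 30s budget as the IR
try:
    # 'ping' forces server selection, so it fails the same way the copy activity does.
    client.admin.command("ping")
    print("Reachable: DNS, private endpoint/firewall and credentials all look fine.")
except ServerSelectionTimeoutError as exc:
    print(f"Timed out selecting a server - likely DNS or the network path, not ADF: {exc}")
```

If this probe also times out from inside the VNet, the cause is more likely the private endpoint, DNS resolution, or firewall configuration than the IR's region setting; if it succeeds, attention shifts back to the original integration runtime itself.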
'Cannot connect to SQL Database' error - please help

Hi, our organisation is new to Azure Data Factory (ADF) and we're facing an intermittent error with our first pipeline. Being intermittent adds that little bit more complexity to resolving it.

The pipeline has two activities:
1) A Script activity which deletes the contents of the target Azure SQL Server database table located within our Azure cloud instance.
2) A Copy data activity which simply copies the entire contents of the external (outside of our domain) third-party source SQL view and loads it into our target Azure SQL Server database table.

With the source being external to our domain, we use a self-hosted integration runtime. The pipeline executes once per 24 hours, at 3am each morning. I have been informed that this timing shouldn't affect, or be affected by, any other Azure processes we have.

For the first nine days the pipeline completed successfully. For the next nine days it completed successfully only four times. Now it seems to fail every other run. The same error message is received on each failure (I've replaced our sensitive internal names with Xs):

Operation on target scr__Delete stg__XXXXXXXXXX contents failed: Failed to execute script. Exception: ''Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Cannot connect to SQL Database. Please contact SQL server team for further support. Server: 'XX-azure-sql-server.database.windows.net', Database: 'XX_XXXXXXXXXX_XXXXXXXXXX', User: ''. Check the linked service configuration is correct, and make sure the SQL Database firewall allows the integration runtime to access.,Source=Microsoft.DataTransfer.Connectors.MSSQL,''Type=Microsoft.Data.SqlClient.SqlException,Message=Server provided routing information, but timeout already expired.,Source=Framework Microsoft SqlClient Data Provider,''

To me, if the pipeline were incorrectly configured it would never have completed successfully, not even once. The fact that it is intermittent, but becoming more frequent, suggests the cause is something other than its configuration - but I could be wrong, hence requesting help from you. Please can someone advise on what is causing the error and what I can do to verify and resolve it? Thanks.
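"Server provided routing information, but timeout already expired" generally indicates that the gateway redirect to the database completed only after the login timeout had already been consumed, which fits an intermittent pattern. The usual mitigations are a longer connection timeout on the linked service and a Retry count on the activity. Below is a hedged sketch, to be run on the self-hosted IR machine, that mimics the linked service connection with a longer login timeout plus simple retries; the server/database names and SQL credentials are placeholders, and it only checks whether a bigger timeout makes the failures disappear.

```python
# Sketch: reproduce the linked service connection with a longer login timeout
# and retries. Server, database and credentials are placeholders.
import time
import pyodbc

CONN_STR = (
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=XX-azure-sql-server.database.windows.net;"
    "DATABASE=XX_database;"            # placeholder
    "UID=<sql-user>;PWD=<sql-password>;"
    "Encrypt=yes;Connect Timeout=60;"  # default is 30s; the redirect sometimes needs more
)

for attempt in range(1, 4):
    try:
        with pyodbc.connect(CONN_STR) as conn:
            conn.cursor().execute("SELECT 1;")
        print(f"Attempt {attempt}: connected fine.")
        break
    except pyodbc.Error as exc:
        print(f"Attempt {attempt} failed: {exc}")
        time.sleep(10 * attempt)       # back off, as an activity Retry interval would
```

If the longer timeout eliminates the intermittent error, setting the equivalent connection timeout in the linked service connection properties and a Retry count on the Script and Copy activities is typically enough.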
Some questions on ADF and Azure SQL Server

Hi, my company is looking to implement a data integration method. The project has been assigned to me, but I'm not a data engineer, so I would like your guidance on the recommended approach.

I need to ingest several (only twelve at present) third-party data sources into our domain so the data can be reported on. These external data sources are simple RDBMSs (most likely all MS SQL Server) and, because the third party is creating a view for me, the volume is only going to be around 20 columns and 20,000 rows per data source. It's all structured data.

My intention is to use Azure Data Factory (ADF) as the integration tool. We are entirely Microsoft cloud based, and I see ADF as the most suitable (simple, robust, cheap) MS cloud-based integration tool available - although you may inform me otherwise.

I need to decide on the storage to hold the external data. I've had very brief experience with the Synapse Serverless pool, as it was the recommended substitute for the Data Export Service (DES) (we use Dynamics 365 as our transactional system), and I found it limiting in terms of SQL compatibility. Many of the SQL views I had written on DES weren't compatible in Synapse - I guess due to Synapse being written in Spark. For this reason I am reluctant to use Synapse as the data storage, and for the same reason I am reluctant to use the ADF Storage Account, as I believe it too is written in Spark.

Please can you advise on the questions below:
1) Is the ADF Storage Account written in Spark and thus prone to the same incompatibility as the Synapse Serverless pool?
2) What are the benefits of using the ADF Storage Account over Azure SQL Server, and vice versa?
3) I know this question is configuration-specific, but I'll ask anyway: which is cheaper for our basic use case - the ADF Storage Account or Azure SQL Server? I have trouble understanding the online pricing calculators.
4) I understand that to execute activities/pipelines between Azure storage sources (ADF Storage Account, Azure SQL Server, and other Azure products) an Azure integration runtime is needed, and that to extract data from an on-premises SQL Server database a self-hosted integration runtime is required. Is this correct, and where does the self-hosted integration runtime need to be installed (on the box running the on-premises SQL Server)?

I think that's all my questions for now. Thanks for your help.
How to handle the Azure Data Factory Lookup activity with more than 5,000 records

Hello experts,

The Data Flow activity successfully copies data from an Azure Blob Storage .csv file to a Dataverse table. However, an error occurs when performing a Lookup on Dataverse due to the amount of data. This is in line with the documentation, which states that the Lookup activity has a limit of 5,000 rows and a maximum size of 4 MB.

A workaround is also mentioned (Microsoft documentation): design a two-level pipeline where the outer pipeline iterates over an inner pipeline, which retrieves data that doesn't exceed the maximum rows or size.

How can I do this? Is there a way to define an offset (e.g. only read 1,000 rows)?

Thanks,
-Sri
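A minimal sketch of the paging logic behind the documented two-level workaround, assuming the rows can be read from a source that supports OFFSET/FETCH (for example an Azure SQL staging table - table, column and connection details are placeholders; a Dataverse source would page via FetchXML instead). The outer loop plays the role of the outer pipeline iterating page numbers, and the inner function plays the role of the inner pipeline whose Lookup reads at most 1,000 rows per call.

```python
# Sketch of the two-level Lookup workaround: outer loop = outer pipeline over
# page numbers, inner function = inner pipeline's bounded Lookup.
import pyodbc

PAGE_SIZE = 1000          # stay well under the 5,000-row / 4 MB Lookup limit
CONN_STR = (
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=<server>.database.windows.net;DATABASE=<staging-db>;"
    "UID=<user>;PWD=<password>;Encrypt=yes;"
)

def lookup_page(cursor, page: int) -> list:
    """Equivalent of the inner pipeline's Lookup: one bounded page of rows."""
    cursor.execute(
        """
        SELECT id, name
        FROM dbo.StagingTable          -- placeholder table
        ORDER BY id                    -- a stable ORDER BY is required for OFFSET/FETCH
        OFFSET ? ROWS FETCH NEXT ? ROWS ONLY;
        """,
        page * PAGE_SIZE, PAGE_SIZE,
    )
    return cursor.fetchall()

with pyodbc.connect(CONN_STR) as conn:
    cursor = conn.cursor()
    page = 0
    while True:                        # equivalent of the outer pipeline's Until loop
        rows = lookup_page(cursor, page)
        if not rows:
            break
        print(f"Page {page}: {len(rows)} rows")  # the inner pipeline would process them here
        page += 1
```

In ADF itself the same pattern is expressed by passing the offset (or page number) and page size as parameters from the outer pipeline to the inner pipeline, and building them into the inner Lookup's query.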
Azure SSIS Integration Runtime Won't Start

I have an Azure SQL Managed Instance with a public endpoint. I've been struggling to get the SSIS integration runtime to start using standard VNet injection. Our goal is to try out the lift-and-shift method to run our existing local packages in Azure Data Factory.

I found the table of required rules in the documentation, and the highlighted outbound NSG rules seem not to work if I use AzureCloud as the destination service tag, but they do work if I use DataFactoryManagement, which the page says is only needed for express injection. Can anyone clarify this and explain why, with standard injection, the IR fails to start unless I open 443/TCP to DataFactoryManagement in my NSG?

Here's the NSG outbound rule for AzureCloud that does not help start the integration runtime, and the 443/TCP/DataFactoryManagement rule that, once added, lets the IR start (screenshots omitted).
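Not an explanation of the AzureCloud vs. DataFactoryManagement behaviour, but for reference, a hedged sketch of adding the outbound rule programmatically with the azure-mgmt-network SDK, assuming the NSG is the one attached to the SSIS IR's subnet; the subscription, resource group, NSG name, rule name and priority are all placeholders.

```python
# Sketch: create the outbound 443/TCP rule to the DataFactoryManagement service
# tag on the SSIS subnet's NSG. All resource identifiers are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.network import NetworkManagementClient
from azure.mgmt.network.models import SecurityRule

SUBSCRIPTION_ID = "<subscription-id>"
RESOURCE_GROUP = "<resource-group>"
NSG_NAME = "<ssis-subnet-nsg>"

client = NetworkManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

rule = SecurityRule(
    protocol="Tcp",
    source_address_prefix="*",
    source_port_range="*",
    destination_address_prefix="DataFactoryManagement",  # service tag
    destination_port_range="443",
    access="Allow",
    direction="Outbound",
    priority=200,                                          # placeholder priority
)

poller = client.security_rules.begin_create_or_update(
    RESOURCE_GROUP, NSG_NAME, "Allow-443-DataFactoryManagement", rule
)
print(poller.result().provisioning_state)
```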
Call an unbound custom action (Dynamics) from ADF

Hello experts,

I have a global (unbound) action in Dynamics and need to initiate a call from Azure Data Factory that passes input parameters and retrieves output parameters. Are there any methods available to invoke an unbound custom action from ADF? Kindly provide recommendations.

Thanks,
-Sri
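An unbound action is exposed through the Dataverse Web API as a POST to /api/data/v9.2/<ActionName>, which is the same call an ADF Web activity (or an Azure Function invoked from ADF) would make. A minimal sketch, assuming a hypothetical action named new_CalculateDiscount with one input and one output parameter, and a service principal with Dataverse access; the org URL, tenant/client credentials, and action/parameter names are placeholders.

```python
# Hedged sketch: invoke a hypothetical unbound action 'new_CalculateDiscount'
# via the Dataverse Web API, as an ADF Web activity or Azure Function would.
import requests
from azure.identity import ClientSecretCredential

ORG_URL = "https://yourorg.crm.dynamics.com"            # placeholder org URL
credential = ClientSecretCredential(
    tenant_id="<tenant-id>",
    client_id="<app-registration-client-id>",
    client_secret="<client-secret>",
)
token = credential.get_token(f"{ORG_URL}/.default").token

response = requests.post(
    f"{ORG_URL}/api/data/v9.2/new_CalculateDiscount",    # unbound action => no entity path
    headers={
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
        "OData-MaxVersion": "4.0",
        "OData-Version": "4.0",
        "Accept": "application/json",
    },
    json={"OrderTotal": 250.0},        # input parameters go in the request body
    timeout=30,
)
response.raise_for_status()
print(response.json())                 # output parameters come back in the JSON body
```

In ADF this maps onto a Web activity with method POST, the action URL, the same headers, and the input parameters as the body; the output parameters are then available through the activity's output for downstream activities.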
Tuning infrastructure performance with a Managed VNet Integration Runtime

Hi, I set up jobs to run in a managed VNet IR, as we are using VNets to secure our databases. I have a lot of jobs and see that this generates a lot of queue time for the pipeline copy activities. Does anyone have any guidance on tuning performance to reduce the queue time? It appears to add up to a lot of billed cluster time.

Most tables are small, but there are many of them - roughly 900, extracted five times from the same application source databases. Any ideas?

Peter
How to load data from on-premises to Snowflake using ADF in a better way

Hi,

My use case is as follows. Our data source is an on-premises SQL Server, which serves as our production database. We are building reports in Power BI and utilising Snowflake as our data warehouse. I want to extract 10 to 15 tables for my Power BI reporting into Snowflake, specifically as an SCD Type 1 pipeline, with no need to retain historical data.

To facilitate the data transfer from on-premises to Snowflake we are leveraging Azure Data Factory, with blob storage set up in Azure, and we already have a self-hosted integration runtime in place connected to Data Factory. Currently I use a ForEach activity to copy the 15 tables from on-premises to Snowflake. The pipeline is scheduled to run daily, and each time it executes it truncates all 15 tables before loading the data. This process is time-consuming due to the volume of data, especially since our on-premises SQL Server is a legacy database.

My questions are as follows:
1) Is there a more efficient approach within Data Factory for this task?
2) What are the best practices recommended for this type of case study?
3) Data flows are not functioning with the self-hosted runtime; how can I use a data flow against an on-premises database?

I would greatly appreciate any advice on the correct solution, or pointers to relevant documentation or blogs.
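One common alternative to truncate-and-reload is to copy only the rows changed since the last run (for example by a watermark column) into a Snowflake staging table and then MERGE them into the target, which is effectively SCD Type 1. A minimal sketch with the snowflake-connector-python package, assuming a hypothetical CUSTOMER table with an ID key and a LAST_MODIFIED watermark; all account, table and column names are placeholders.

```python
# Hedged sketch of an SCD Type 1 upsert in Snowflake: ADF copies only rows
# changed since the last watermark into STG_CUSTOMER, then this MERGE folds
# them into the target. Account, table and column names are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="<account-identifier>",
    user="<user>",
    password="<password>",
    warehouse="<warehouse>",
    database="<database>",
    schema="<schema>",
)

MERGE_SQL = """
MERGE INTO CUSTOMER AS tgt
USING STG_CUSTOMER AS src
    ON tgt.ID = src.ID
WHEN MATCHED THEN UPDATE SET        -- SCD Type 1: overwrite in place, no history kept
    NAME = src.NAME,
    CITY = src.CITY,
    LAST_MODIFIED = src.LAST_MODIFIED
WHEN NOT MATCHED THEN INSERT (ID, NAME, CITY, LAST_MODIFIED)
    VALUES (src.ID, src.NAME, src.CITY, src.LAST_MODIFIED);
"""

with conn.cursor() as cur:
    cur.execute(MERGE_SQL)
    print(f"Rows merged: {cur.rowcount}")
conn.close()
```

In ADF the watermark is typically kept in a small control table and passed into the source query of the Copy activity, and the MERGE then runs as a Script or stored-procedure step after the copy, so only the delta crosses the self-hosted integration runtime each night.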
Import choices into Dataverse using Azure Data Factory

Hello experts,

I have a .csv file in an Azure Blob Storage account and want to transfer the CSV data to Dataverse. What is the procedure for importing choice-type data into Dynamics using Azure Data Factory? Sample choice-field data for the source and destination is below. Suggestions that don't require altering the .csv file would be highly appreciated.

Source (.csv)    Destination (Dataverse)
Option A         Open
Option B         Close

Thanks,
-Sri
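Dataverse choice columns store integer option values rather than labels, so the usual approach is to map each incoming label to the target option's value during the copy (for example with a derived column in a mapping data flow), leaving the .csv untouched. A minimal sketch of that mapping logic in Python, assuming hypothetical option values 100000000 and 100000001 behind 'Open' and 'Close'; the real values should be read from the choice column's metadata in your environment.

```python
# Hedged sketch of the label-to-option-value mapping a mapping data flow's
# derived column (or an intermediate transform) would perform. The integer
# option values are hypothetical placeholders.
import csv
import io

LABEL_TO_OPTION_VALUE = {
    "Option A": 100000000,   # assumed value behind the 'Open' option
    "Option B": 100000001,   # assumed value behind the 'Close' option
}

source_csv = "status\nOption A\nOption B\n"   # stand-in for the blob .csv content

reader = csv.DictReader(io.StringIO(source_csv))
for row in reader:
    option_value = LABEL_TO_OPTION_VALUE[row["status"]]
    # The copy/data flow sink would write this integer into the Dataverse choice column.
    print(row["status"], "->", option_value)
```

The same lookup can be expressed directly as a derived-column expression in the data flow, so the source file itself never needs to change.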