Azure Integration Runtime
How to handle Azure Data Factory Lookup activity with more than 5,000 records
Hello Experts,

The Data Flow activity successfully copies data from an Azure Blob Storage .csv file to Dataverse Table Storage. However, an error occurs when performing a Lookup on the Dataverse because of the volume of data. This is in line with the documentation, which states that the Lookup activity is limited to 5,000 rows and a maximum size of 4 MB. The Microsoft documentation also mentions a workaround: design a two-level pipeline where the outer pipeline iterates over an inner pipeline, which retrieves data that doesn't exceed the maximum rows or size.

How can I do this? Is there a way to define an offset (e.g. only read 1,000 rows at a time)?

Thanks, -Sri
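For illustration, here is a minimal sketch of the documented two-level pattern, assuming a hypothetical inner pipeline named InnerDataversePage that accepts pageNumber and pageSize parameters and performs the page-limited read; the page count of 10 and the page size of 1,000 are only placeholders:

```json
{
    "name": "OuterDataverseLookup",
    "properties": {
        "activities": [
            {
                "name": "ForEachPage",
                "type": "ForEach",
                "description": "Iterate over page numbers so no single Lookup exceeds 5,000 rows / 4 MB.",
                "typeProperties": {
                    "isSequential": true,
                    "items": { "value": "@range(0, 10)", "type": "Expression" },
                    "activities": [
                        {
                            "name": "RunInnerPage",
                            "type": "ExecutePipeline",
                            "description": "Hypothetical inner pipeline that reads only one page of Dataverse rows.",
                            "typeProperties": {
                                "pipeline": { "referenceName": "InnerDataversePage", "type": "PipelineReference" },
                                "waitOnCompletion": true,
                                "parameters": {
                                    "pageNumber": "@item()",
                                    "pageSize": 1000
                                }
                            }
                        }
                    ]
                }
            }
        ]
    }
}
```

Inside the inner pipeline, the pageNumber and pageSize parameters would constrain the Dataverse query (effectively an offset), so each individual Lookup stays under the 5,000-row / 4 MB limits.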
Connection failed - new linked service in Azure Data Factory

Hi, I am having trouble creating a new linked service in Azure to connect to MySQL. I am getting an odd error:

Connection failed. The value of the property '' is invalid: 'Format of the initialization string does not conform to specification starting at index 115.'. Format of the initialization string does not conform to specification starting at index 115.

Is anyone aware of this error?
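For reference, a MySQL linked service definition looks roughly like the sketch below; the server, database, and credentials are placeholders, and the exact connection-string keys depend on the connector version. An "initialization string" error of this kind usually means the connection string itself could not be parsed; one common cause is a password containing an unescaped semicolon or quote, in which case enclosing the value in quotes, or supplying the password separately (for example from Azure Key Vault), tends to help.

```json
{
    "name": "MySqlLinkedService",
    "properties": {
        "type": "MySql",
        "description": "Sketch only: key names and required properties vary by connector version.",
        "typeProperties": {
            "connectionString": "Server=myserver.example.com;Port=3306;Database=mydb;UID=myuser;PWD='p@ss;word'"
        }
    }
}
```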
ADF not working for ADLS with already created private endpoint

I have the following components: ADF with a private endpoint for the data factory, ADLS with a private endpoint, and an Azure VM (for the SHIR). I created a VNet with a subnet and, in that subnet, a VM with a private IP and no public IP. On this VM I have hosted a Self-Hosted Integration Runtime (SHIR), and the SHIR is up and running in ADF. On the ADLS account, I have given ADF Blob Reader and Blob Contributor access.

Scenario 1 (working): I created a delimited text file on the VM. Using the SHIR as the source in an ADF Copy activity, I am able to read the content, and with the ADLS public endpoint the Copy activity works with the SHIR for both source and sink.

Scenario 2 (not working): Same delimited text file on the VM, read through the SHIR as the source, but the sink cannot write to ADLS (with private endpoint) using the SHIR. The error is:

(1) It's possible because the service principal or managed identity don't have enough permission to access the data. (2) It's possible because the IP address of the self-hosted integration runtime machines are not allowed by your Azure Storage firewall settings. (3) If the self-hosted integration runtime uses a proxy server, it's possible because the IP address of the proxy server is not allowed by your Azure Storage firewall settings. Account: 'accountrajaadfadls'. FileSystem: 'rajavmtoadls'. Path: 'VMfile.txt'. ErrorCode: 'AuthorizationFailure'. Message: 'This request is not authorized to perform this operation.'

Scenario 3 (working): I then created an Azure Integration Runtime. In the Copy activity I used the SHIR as the source and the Azure IR as the sink, and I am able to write to ADLS (public endpoint).

Scenario 4 (not working): I then created a managed private endpoint for ADLS from the Azure Integration Runtime and got it approved on the ADLS account. In the Copy activity I used the SHIR as the source and the Azure IR as the sink; it fails with the same AuthorizationFailure error as Scenario 2.

Please help me understand why Scenario 2 and Scenario 4 (ADLS with private endpoint) are not working.
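For reference, a minimal sketch of what the ADLS Gen2 sink linked service might look like when bound to a specific integration runtime through connectVia, assuming managed identity authentication (the account URL comes from the error above; the IR name is a placeholder):

```json
{
    "name": "AdlsGen2Sink",
    "properties": {
        "type": "AzureBlobFS",
        "description": "Sketch only: relies on the factory's managed identity, so no credential appears in typeProperties.",
        "typeProperties": {
            "url": "https://accountrajaadfadls.dfs.core.windows.net"
        },
        "connectVia": {
            "referenceName": "SelfHostedIR",
            "type": "IntegrationRuntimeReference"
        }
    }
}
```

Whichever runtime is referenced here is the one whose network path the storage firewall evaluates, which is why the same write can succeed or fail depending on the runtime used.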
How to load data from on-prem to Snowflake using ADF in a better way

Hi, my use case is as follows. Our data source is an on-premises SQL Server, which serves as our production database. We build reports in Power BI and use Snowflake as our data warehouse. I want to extract 10 to 15 tables into Snowflake for my Power BI reporting, specifically as an SCD Type 1 pipeline, with no need to retain historical data. To move the data from on-premises to Snowflake we are using Azure Data Factory, with Blob storage set up in Azure, and we already have a self-hosted integration runtime connected to Data Factory.

Currently I use a ForEach activity to copy the 15 tables from on-premises to Snowflake. The pipeline runs daily and, on each run, truncates all 15 tables before loading the data. This is time-consuming because of the data volume, especially since our on-premises SQL Server is a legacy database.

My questions are:
1. Is there a more efficient approach within Data Factory for this task?
2. What are the recommended best practices for this type of case?
3. Data flows are not working with the self-hosted runtime; how can I use a data flow against an on-premises database?

I would greatly appreciate any advice on the correct solution, or pointers to relevant documentation or blogs.
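A minimal sketch of the pattern described above: a ForEach over a list of tables, where each iteration runs a Copy activity from the on-premises SQL Server (through the self-hosted IR) into Snowflake, staged through the Blob storage account. All names, the tableList parameter, and the truncate-then-load preCopyScript are placeholders for illustration:

```json
{
    "name": "ForEachTable",
    "type": "ForEach",
    "typeProperties": {
        "items": { "value": "@pipeline().parameters.tableList", "type": "Expression" },
        "activities": [
            {
                "name": "CopyTableToSnowflake",
                "type": "Copy",
                "description": "Sketch: full reload of one table per iteration, matching the current truncate-and-load pattern.",
                "typeProperties": {
                    "source": {
                        "type": "SqlServerSource",
                        "sqlReaderQuery": "SELECT * FROM @{item().schemaName}.@{item().tableName}"
                    },
                    "sink": {
                        "type": "SnowflakeSink",
                        "preCopyScript": "TRUNCATE TABLE @{item().tableName}"
                    },
                    "enableStaging": true,
                    "stagingSettings": {
                        "linkedServiceName": { "referenceName": "AzureBlobStaging", "type": "LinkedServiceReference" }
                    }
                },
                "inputs": [ { "referenceName": "OnPremSqlTable", "type": "DatasetReference" } ],
                "outputs": [ { "referenceName": "SnowflakeTable", "type": "DatasetReference" } ]
            }
        ]
    }
}
```

As a design note, replacing the SELECT * with a filter on a per-table watermark column and merging into the target instead of truncating would usually cut the copied volume considerably, but it assumes the source tables expose a usable last-modified or change-tracking column.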
SSIS Integration Runtime private endpoint

Hi, our Azure SQL server has a private endpoint and we have set "Deny public network access" to Yes, so all traffic to and from the SQL Server databases must go over the private endpoint. We have now set up an Azure SSIS IR linked to the same VNet/subnet, and we use this SSIS IR to get data from an on-premises SAP system using SSIS and Theobald. The VNet is the same one used for the SQL Server private endpoints, and it is linked to the on-premises network with a VPN.

When we execute the Data Factory pipeline that receives the data through SSIS and the SSIS IR, the process always fails with an error message saying we need to set "Deny public network access" to No.

Why is the transfer we do over the SSIS IR not travelling over the private endpoint to the destination SQL server? And how can we fix this so that we can keep "Deny public network access" set to Yes?
Data Factory setup to use a private network

Hi. I have a pipeline that executes an SSIS package (through the SSIS IR). This package uses a C# script to call a web service to get data and transfer it into an Azure SQL database. I have a requirement to execute this on a private, secure network. I have researched solutions and found information about a private link IR (sorry, that may not be the exact terminology) and having my SSIS IR reference that private link IR.

Is this the proper solution? Will it give me a secure/private "tunnel" for the data received from the API back to the Azure SQL database? Is there a way to set up the subscription to use only a private network for all incoming and outgoing traffic, so that if data is transferred without using the SSIS IR it is not exposed to public addresses?

I appreciate suggestions and recommendations, or links to KB articles about using private networks from Azure cloud resources. Thank you.
Copy Activity in ADF

I have a Copy activity in ADF that copies data from one Azure SQL DB to another Azure SQL DB. The databases are on different servers and are part of an elastic pool. The source is a SQL script with some data transformations. I changed the DIUs to 32, but when I execute it the run always shows it is using 4 DIUs, and the throughput is very low, around 400 KB/s. Any ideas of what I could check? Thanks
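For reference, the DIU setting lives in the Copy activity's typeProperties; a minimal sketch with placeholder dataset names and query is below. The value is an upper bound, and the run output reports the DIUs the service actually used, which can be lower depending on the source-sink pair and the data pattern.

```json
{
    "name": "CopyBetweenAzureSqlDbs",
    "type": "Copy",
    "typeProperties": {
        "source": { "type": "AzureSqlSource", "sqlReaderQuery": "SELECT ..." },
        "sink": { "type": "AzureSqlSink" },
        "dataIntegrationUnits": 32
    },
    "inputs": [ { "referenceName": "SourceAzureSqlDataset", "type": "DatasetReference" } ],
    "outputs": [ { "referenceName": "SinkAzureSqlDataset", "type": "DatasetReference" } ]
}
```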
Best practice to get data from on-premises SQL Server

Hi, how do I get data from an on-premises SQL Server that is connected to my Azure environment with a site-to-site VPN? I have tried a 'direct' connection in the linked service (pointing it at the on-premises SQL Server's IP), but it fails to connect. The error message is:

Cannot connect to SQL Database: x.x.x.x\instance_name, Database: db1, User: user01. Check the linked service configuration is correct, and make sure the SQL Database firewall allows the integration runtime to access. A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: SQL Network Interfaces, error: 26 - Error Locating Server/Instance Specified), SqlErrorNumber=-1,Class=20,State=0, Activity ID: 7c20c940-0415-4b19-9c19-d30692aeeb16.

I tested the same configuration with DBeaver (a SQL client tool) from a VM in my Azure environment and connected successfully. I have also checked that the on-premises firewall is not blocking any IPs from Azure. So what is the best practice to get data from an on-premises SQL Server? Is using a self-hosted integration runtime the only way to achieve this? Thanks in advance.
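For comparison, this is roughly what the self-hosted IR option mentioned above would look like as a SQL Server linked service; the server, database, user, password, and IR name are placeholders (the instance-name backslash is escaped for JSON):

```json
{
    "name": "OnPremSqlServer",
    "properties": {
        "type": "SqlServer",
        "description": "Sketch only: connection is routed through a self-hosted integration runtime via connectVia.",
        "typeProperties": {
            "connectionString": "Data Source=x.x.x.x\\instance_name;Initial Catalog=db1;Integrated Security=False;User ID=user01;Password=<password>"
        },
        "connectVia": {
            "referenceName": "SelfHostedIR",
            "type": "IntegrationRuntimeReference"
        }
    }
}
```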
'Cannot connect to SQL Database' error - please help

Hi, our organisation is new to Azure Data Factory (ADF) and we are facing an intermittent error with our first pipeline. Being intermittent adds that little bit more complexity to resolving it.

The pipeline has two activities:
1) A Script activity which deletes the contents of the target Azure SQL Server database table located within our Azure cloud instance.
2) A Copy data activity which copies the entire contents of the external (outside of our domain) third-party source SQL view and loads it into our target Azure SQL Server database table.

Because the source is external to our domain, we use a self-hosted integration runtime. The pipeline executes once per 24 hours, at 3am each morning, and I have been informed that this timing shouldn't affect, or be affected by, any other Azure processes we have.

For the first nine days the pipeline completed its executions successfully. Over the following nine days it completed successfully only four times, and now it seems to fail every other run. The same error message is received on each failure (I've replaced our sensitive internal names with Xs):

Operation on target scr__Delete stg__XXXXXXXXXX contents failed: Failed to execute script. Exception: ''Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Cannot connect to SQL Database. Please contact SQL server team for further support. Server: 'XX-azure-sql-server.database.windows.net', Database: 'XX_XXXXXXXXXX_XXXXXXXXXX', User: ''. Check the linked service configuration is correct, and make sure the SQL Database firewall allows the integration runtime to access.,Source=Microsoft.DataTransfer.Connectors.MSSQL,''Type=Microsoft.Data.SqlClient.SqlException,Message=Server provided routing information, but timeout already expired.,Source=Framework Microsoft SqlClient Data Provider,''

To me, if this pipeline were incorrectly configured it would never have completed successfully, not even once. The fact that the failure is intermittent, but becoming more frequent, suggests it is caused by something other than its configuration, but I could be wrong - hence requesting help from you. Please can someone advise on what is causing the error and what I can do to verify/resolve it? Thanks.
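For context, every ADF activity carries a policy block that controls its timeout and retry behaviour, and transient connection errors like the routing/timeout one above are commonly absorbed by a retry. A minimal sketch of the delete-script activity with retries configured; the activity name, linked service, table, and retry values are placeholders only:

```json
{
    "name": "DeleteStagingTableContents",
    "type": "Script",
    "policy": {
        "timeout": "0.01:00:00",
        "retry": 2,
        "retryIntervalInSeconds": 120
    },
    "linkedServiceName": { "referenceName": "TargetAzureSqlDb", "type": "LinkedServiceReference" },
    "typeProperties": {
        "scripts": [
            { "type": "NonQuery", "text": "DELETE FROM dbo.StagingTable;" }
        ]
    }
}
```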
Linked service going down in ADF with error InvalidLinkedServiceReference, impacting pipelines

An SAP linked service keeps going down in ADF with the error InvalidLinkedServiceReference, which is impacting pipeline runs. Can someone help fix this, and advise what needs to be checked? Any documentation links would be appreciated.