Azure Data Integration
General availability of SAP CDC capabilities for Azure Data Factory and Azure Synapse Analytics
Customers use SAP systems for their business-critical operations. Today, customers want to be able to combine their SAP data with non-SAP data for their analytics needs. Azure Data Factory (ADF) is an industry-leading data integration service that enables customers to ingest data from diverse data sources (e.g., multi-cloud, SaaS, on-premises), transform data at scale, and combine and prepare it at cloud scale. Customers are using ADF to ingest data from different SAP data sources (e.g., SAP ECC, SAP HANA, SAP Table, SAP BW Open Hub, SAP BW via MDX, SAP Cloud for Customers) and to combine it with data from other operational stores (e.g., Cosmos DB, the Azure SQL family, and more), giving them deep insights from both SAP and non-SAP data. Today, we are excited to announce the General Availability of SAP CDC support in Azure Data Factory and Azure Synapse Analytics.

What Synapse Serverless SQL pool authentication type for ADF Linked Service?
Hi,

I'm relatively new to Azure Data Factory and require your guidance on how to successfully create and test a Linked Service to the Azure Synapse Analytics Serverless SQL pool.

In the past, I've successfully created a Linked Service to a third-party (outside our domain) on-premises SQL Server by creating a self-hosted integration runtime on their box and then creating a Linked Service to use it. The server name, database name, Windows authentication, and my username and password, all configured by the third party, are what I entered into the Linked Service configuration boxes. All tested successfully. This third-party data was extracted and imported, via ADF Pipelines, into an Azure SQL Server database within our domain.

Now I need to extract data from our own (hosted in our domain) Azure Synapse Analytics Serverless SQL pool database. My attempt is this, and it fails:

1) I create an 'Azure Synapse Analytics' Data Store Linked Service.
2) I select the 'AutoResolveIntegrationRuntime' as the runtime to use - I'm thinking this is correct as the Synapse source is within our domain (we're fully MS cloud based).
3) I select 'Enter manually' under the 'Account selection method'.
4) I place the Azure Synapse Analytics Serverless SQL endpoint into the 'Fully qualified domain name' field.
5) I enter the SQL database name found under the 'SQL database' node/section on the Data >> Workspace screen in Synapse.
6) I choose 'System-assigned managed identity' as the Authentication type - this is a guess; I was hoping it would recognise the username/account I am building the Linked Service with, as that account can also query Synapse and so has Synapse access.
7) I check the 'Trust server certificate' box.

All else is default. When I click test connection, it fails with the following message:

"Cannot connect to SQL Database. Please contact SQL server team for further support. Server: 'xxxxxxxxxxxx-ondemand.sql.azuresynapse.net', Database: 'Synapse_Dynamics_data', User: ''. Check the linked service configuration is correct, and make sure the SQL Database firewall allows the integration runtime to access. Login failed for user '<token-identified principal>'."

I've reached out to our I.T. (who are novices with Synapse, ADF, etc., even though they did install them in our domain) and they don't know how to help me. I'm hoping you can help.

1) Is 'Azure Synapse Analytics' the correct Data Store to choose when looking to extract data from an Azure Synapse Serverless SQL pool SQL database?
2) Is using the AutoResolveIntegrationRuntime correct if Synapse is held within our domain? I've previously confirmed this runtime works (and still does), as I had to use it when loading the third-party data into our Azure SQL Server database.
3) Have I populated the correct values for the 'Fully qualified domain name' and 'Database name' fields by entering the Azure Synapse Analytics Serverless SQL endpoint and the SQL database name, respectively?
4) Is choosing 'System-assigned managed identity' as the Authentication type correct? I'm guessing this could be the issue. I selected it because it was the authentication type used (and it works) when loading the mentioned third-party data into the Azure SQL Server database within our domain, so I assumed it would somehow recognise the logged-in user and, through the magic of cloud authentication, allow the Linked Service to work (I should have the correct privileges, or so say I.T.).

Any guidance you can provide will be much appreciated. Thanks.
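For context, a minimal sketch of the JSON that ADF stores for such a linked service when 'System-assigned managed identity' is selected is shown below. The linked service name is a placeholder, the endpoint and database are taken from the post, and the 'AzureSqlDW' connector type is an assumption to verify against what ADF Studio's code ({}) view actually generates. With managed identity authentication the connection string carries no credentials; the factory's own identity is used instead, which normally also requires creating a user for that identity inside the target database (CREATE USER [your-factory-name] FROM EXTERNAL PROVIDER, plus read permissions) - a "Login failed for user '<token-identified principal>'" error usually points at that missing grant rather than at the linked service fields themselves.

```json
{
    "name": "SynapseServerlessSqlPool",
    "properties": {
        "description": "Synapse serverless SQL built-in endpoint, authenticated with the factory's system-assigned managed identity (no credentials in the connection string).",
        "type": "AzureSqlDW",
        "typeProperties": {
            "connectionString": "Server=tcp:xxxxxxxxxxxx-ondemand.sql.azuresynapse.net,1433;Database=Synapse_Dynamics_data;Encrypt=true;Connection Timeout=30;"
        },
        "connectVia": {
            "referenceName": "AutoResolveIntegrationRuntime",
            "type": "IntegrationRuntimeReference"
        }
    }
}
```

If the managed identity route keeps failing, the same definition can be tried with SQL authentication (a SQL login in the connection string, password kept in Key Vault) to separate firewall issues from identity issues.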
Process your data in seconds with new ADF real-time CDC
In January, we announced that we've elevated our Change Data Capture features front-and-center in ADF. Up until today, the lowest latency we allowed for CDC processing was 15 minutes. But today, I am super excited to announce that we have enabled the real-time option!

Data Factory Increases Maximum Activities Per Pipeline to 80
This week we have doubled the limit on the number of activities you may define in a pipeline, from 40 to 80. With more freedom to develop, we want to empower you to create more powerful, versatile, and resilient data pipelines for all your business needs. We are excited to see what you come up with, harnessing the power of 40 more activities per pipeline!

Securing outbound traffic with Azure Data Factory's outbound network rules
The Outbound Rules feature in Azure Data Factory allows organizations to exercise granular control over outbound traffic, thereby strengthening network security. By integrating with Azure Policy, this feature also improves overall governance.

Announcing the Public Preview of a new top-level CDC resource in ADF
Azure Data Factory (ADF) has recently added many new CDC-enabled connectors to process change data from SQL, Storage, Cosmos DB, and many other sources. Much of the feedback we received from our users has been centered around making it easy to configure and to continuously detect changes at the source. We heard your feedback and are super excited to announce the immediate release of a new top-level ADF resource that is now available in public preview in your ADF resource explorer!

'Cannot connect to SQL Database' error - please help
Hi,

Our organisation is new to Azure Data Factory (ADF) and we're facing an intermittent error with our first Pipeline. Being intermittent adds that little bit more complexity to resolving the error.

The Pipeline has two activities:

1) A Script activity which deletes the contents of the target Azure SQL Server database table located within our Azure cloud instance.
2) A Copy data activity which simply copies the entire contents of the external (outside of our domain) third-party source SQL View and loads it into our target Azure SQL Server database table.

With the source being external to our domain, we have used a Self-Hosted Integration Runtime. The Pipeline executes once per 24 hours, at 3am each morning. I have been informed that this timing shouldn't affect, or be affected by, any other Azure processes we have.

For the first nine days of executions, the Pipeline completed successfully. Then for the next nine days it only completed successfully four times. Now it seems to fail every other time. The same error message is received on each failure (I've replaced our sensitive internal names with Xs):

Operation on target scr__Delete stg__XXXXXXXXXX contents failed: Failed to execute script. Exception: ''Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Cannot connect to SQL Database. Please contact SQL server team for further support. Server: 'XX-azure-sql-server.database.windows.net', Database: 'XX_XXXXXXXXXX_XXXXXXXXXX', User: ''. Check the linked service configuration is correct, and make sure the SQL Database firewall allows the integration runtime to access.,Source=Microsoft.DataTransfer.Connectors.MSSQL,''Type=Microsoft.Data.SqlClient.SqlException,Message=Server provided routing information, but timeout already expired.,Source=Framework Microsoft SqlClient Data Provider,''

To me, if this Pipeline were incorrectly configured it would never have completed successfully, not even once. The fact that the failures are intermittent, but becoming more frequent, suggests something other than the configuration is the cause - but I could be wrong, hence requesting help from you.

Please can someone advise on what is causing the error and what I can do to verify/resolve it? Thanks.
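The inner exception ("Server provided routing information, but timeout already expired") is a connection-level timeout rather than a query failure, so one low-effort mitigation is letting the failing activity retry on transient connection errors. Below is a minimal sketch of what the Script activity's JSON could look like with a retry policy added; the activity name, linked service name, and SQL text are placeholders, not taken from the real pipeline.

```json
{
    "name": "scr__Delete stg__TargetTable contents",
    "description": "Clears the staging table; retries up to 3 times, 2 minutes apart, if the SQL connection fails.",
    "type": "Script",
    "policy": {
        "timeout": "0.00:30:00",
        "retry": 3,
        "retryIntervalInSeconds": 120
    },
    "linkedServiceName": {
        "referenceName": "AzureSqlDatabaseLinkedService",
        "type": "LinkedServiceReference"
    },
    "typeProperties": {
        "scripts": [
            {
                "type": "NonQuery",
                "text": "TRUNCATE TABLE dbo.stg_TargetTable;"
            }
        ]
    }
}
```

Raising the 'Connection Timeout' value in the linked service's connection string is another common adjustment for this particular error, and it is worth checking whether the Azure SQL firewall or a serverless/auto-pause database tier is occasionally rejecting the very first connection attempt at 3am.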
Can an ADF Pipeline trigger upon source table update?
Hi, Is it possible for an Azure Data Factory Pipeline to be triggered each time the source table changes? Let's say I have a 'copy data' activity in a pipeline. The activity copies data from TableA to TableB. Can the pipeline be configured to execute whenever source TableA is updated (a record deleted, changed, a new record inserted, etc.)? Thanks.

How to save Azure Data Factory work (objects)?
Hi,

I'm new to Azure Data Factory (ADF). I need to learn it in order to ingest external third-party data into our domain. I shall be using ADF Pipelines to retrieve the data and then load it into an Azure SQL Server database.

I currently develop Power BI reports and write SQL scripts to feed the Power BI reporting. These reports and scripts are saved on a backed-up drive, so if anything disappears I can always use the back-ups to reinstall the work. The target SQL database scripts, for the tables the ADF Pipelines will load to, will be backed up following the same method.

How do I save the ADF Pipelines work and any other ADF objects that I may create? (I don't know exactly what will be created, as I'm yet to develop anything in ADF.)

I've read about the CI/CD process but I don't think it's applicable to me. We are not using multiple environments (i.e. Dev, Test, UAT, Prod); I am using a Production environment only. Each data source that needs to be imported will have its own Pipeline, so breaking a Pipeline should not affect other Pipelines, and that's why I feel a single environment is sufficient. I am the only developer working within ADF, so I have no need to collaborate with peers or support joint coding ventures.

Does ADF back up its Data Factories by default? If it does, can I trust that, should our instance of ADF be deleted, I can retrieve the backed-up version to reinstall or roll back? Is there a process/software which saves the ADF objects so I can reinstall them if I need to (by the way, I'm not sure how to reinstall them, so I'll have to learn that)?

Thanks.
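One way to keep every ADF object as a plain file you can back up and restore is to connect the factory to a Git repository: once connected, each pipeline, dataset, linked service, and trigger is stored as its own JSON file in the repo, which works fine with a single Production environment and a single developer. Below is a hedged sketch of the repoConfiguration block on the factory's ARM resource for an Azure DevOps (VSTS) repo; the organisation, project, repository, branch, and folder names are placeholders, and the property names should be checked against the current Microsoft.DataFactory ARM schema. The same link-up can also be done interactively in ADF Studio's Manage hub.

```json
{
    "type": "Microsoft.DataFactory/factories",
    "apiVersion": "2018-06-01",
    "name": "my-production-factory",
    "location": "uksouth",
    "properties": {
        "repoConfiguration": {
            "type": "FactoryVSTSConfiguration",
            "accountName": "my-devops-organisation",
            "projectName": "DataPlatform",
            "repositoryName": "adf-objects",
            "collaborationBranch": "main",
            "rootFolder": "/"
        }
    }
}
```

Without Git, the factory definitions live only in the ADF service itself rather than in anything you control; as a point-in-time fallback, the whole factory can also be exported as an ARM template from ADF Studio and kept with your other backed-up scripts.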