Database Platform CSE – SQL
Key Considerations to avoid Implicit Conversion issues in Oracle to Azure SQL Modernization
Overview

This blog dives into the mechanics of implicit data type conversions and their impact during the post-migration performance optimization phase of heterogeneous database migrations. Drawing on engineering field patterns observed across diverse application architectures, it explores why certain platforms (ORMs, JDBC drivers, and cross-platform data models) are more prone to implicit conversions than others, and how they cause performance issues, break index usage, or trigger query regressions. You'll gain actionable strategies to detect, mitigate, and design around these pitfalls to ensure a successful and performant data migration to platforms like Azure SQL.

Understanding Implicit Conversions in Database Migrations

When an application communicates with a database, whether through JDBC, ODBC, ADO.NET, or any other data access API, it sends parameters and query values in a specific data type format. If the type of the value provided by the application does not exactly match the data type of the target column in the database, the database engine attempts to reconcile the mismatch automatically. This automatic adjustment is known as implicit conversion.

For instance, a value passed as a string from the application may be compared against a numeric or date column in the database. This occurs because many front-end systems and APIs transmit values as strings by default, even if the underlying business logic expects numbers or dates. Unless the application explicitly parses or casts these values to match the expected types, the database engine must decide how to handle the type mismatch during query execution. In such cases, the engine applies type conversion internally, either to the parameter or to the column, based on its own rules. While this feature can simplify application development by allowing flexible data handling, it introduces engine-specific behavior that becomes more visible during cross-engine database migrations, where assumptions built into one system may not hold true in another.

Impact of Implicit Conversions

Implicit conversions can adversely affect database performance and functionality in several ways, some of which are discussed below:

Performance Degradation: When a database performs an implicit conversion, it may bypass indexes, resulting in slower query execution. For example, comparing a VARCHAR column to an INT value in SQL Server can trigger a table scan instead of an index seek, significantly increasing query time.

Migration-Specific Issues and Data Integrity Risks: Implicit conversions can cause data loss or incorrect results. One such example: when a column defined as VARCHAR2 in Oracle, which can store Unicode characters, is mapped to a VARCHAR column in SQL Server, non-ASCII characters such as Chinese, Russian, or Korean may be silently replaced with incorrect characters or symbols (see the sketch after this list). One scenario where this can happen: Oracle VARCHAR2 stores Unicode when the database character set is UTF-8 (AL32UTF8), which is common in modern Oracle installations, while SQL Server VARCHAR is ANSI/code-page based, so non-ASCII characters are stored differently unless the column is explicitly declared as NVARCHAR. In the real world this can happen with other data types as well.

Maintenance Challenges: Queries relying on implicit conversions are harder to debug and optimize, as these conversions are not explicitly visible in the code and may only surface during performance regressions. These queries force the optimizer to compile an execution plan containing scans of large clustered indexes or tables instead of a seek, resulting in degraded performance.

Execution Overhead and Resource Consumption: Implicit conversions increase execution times for both queries and API calls, as the engine must perform runtime casting operations. This can lead to higher CPU usage, increased logical reads, and memory pressure.
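A minimal sketch of the data-loss risk mentioned above, using hypothetical demo tables (the table names and the sample string are illustrative only; the behavior assumes a typical non-UTF-8 database collation/code page):

-- Hypothetical demo tables: VARCHAR silently degrades non-ASCII characters, NVARCHAR preserves them
CREATE TABLE dbo.VarcharDemo  (Name VARCHAR(50));
CREATE TABLE dbo.NvarcharDemo (Name NVARCHAR(50));

INSERT INTO dbo.VarcharDemo  (Name) VALUES (N'数据库');  -- Unicode literal implicitly converted to VARCHAR
INSERT INTO dbo.NvarcharDemo (Name) VALUES (N'数据库');  -- Unicode preserved

SELECT 'varchar target'  AS [Target], Name FROM dbo.VarcharDemo   -- typically returns '???' on a non-CJK code page
UNION ALL
SELECT 'nvarchar target' AS [Target], Name FROM dbo.NvarcharDemo; -- returns the original characters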
Detection Methods

Detecting implicit conversions is crucial for optimizing database performance post-migration. The following methods can be employed:

Query Store (QDS): Use Query Store post-migration during load testing to track expensive queries based on cost and to surface performance regressions caused by type mismatches. Review the execution plans captured in Query Store for conversion-related patterns. You can also use a custom script like the one below to query Query Store:

USE [<Replace_with_actual_DB_name>]  -- Replace with the actual database name
GO
SELECT TOP (100)
    DB_NAME() AS [Database],
    qt.query_sql_text AS [Query Text],
    rs.last_execution_time AS [Last Execution Time],
    rs.avg_cpu_time AS [Avg Worker Time],
    rs.max_cpu_time AS [Max Worker Time],
    rs.avg_duration AS [Avg Elapsed Time],
    rs.max_duration AS [Max Elapsed Time],
    rs.avg_logical_io_reads AS [Avg Logical Reads],
    rs.max_logical_io_reads AS [Max Logical Reads],
    rs.count_executions AS [Execution Count],
    q.last_compile_start_time AS [Creation Time],
    CAST(p.query_plan AS XML) AS [Query Plan]
FROM sys.query_store_query_text AS qt
JOIN sys.query_store_query AS q ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan AS p ON q.query_id = p.query_id
JOIN sys.query_store_runtime_stats AS rs ON p.plan_id = rs.plan_id
WHERE CAST(p.query_plan AS NVARCHAR(MAX)) LIKE '%CONVERT_IMPLICIT%'
  AND qt.query_sql_text NOT LIKE '%sys.query_store%'
ORDER BY rs.avg_cpu_time DESC;

Execution Plans: For the expensive queries, hover over operators such as Index Scan in SSMS to inspect the Predicate. If an implicit conversion exists, the plan includes something like "CONVERT_IMPLICIT(<data_type>, ..".

XML Plan: To confirm the above, review the underlying XML execution plan, which shows whether an implicit conversion is occurring and on which side of the comparison. This technique is particularly valuable when working with parameterized queries or when graphical plan warnings are insufficient. Look for elements like the following in the XML plan:

<Warnings>
  <PlanAffectingConvert ConvertIssue="Seek Plan" Expression="CONVERT_IMPLICIT(..
</Warnings>

Plan Cache Inspection: Custom scripts can scan the Azure SQL plan cache for instances of CONVERT_IMPLICIT operations. Below is one such script that can be used:
SELECT TOP (100)
    DB_NAME(B.[dbid]) AS [Database],
    B.[text] AS [SQL_text],
    A.total_worker_time AS [Total Worker Time],
    A.total_worker_time / A.execution_count AS [Avg Worker Time],
    A.max_worker_time AS [Max Worker Time],
    A.total_elapsed_time / A.execution_count AS [Avg Elapsed Time],
    A.max_elapsed_time AS [Max Elapsed Time],
    A.total_logical_reads / A.execution_count AS [Avg Logical Reads],
    A.max_logical_reads AS [Max Logical Reads],
    A.execution_count AS [Execution Count],
    A.creation_time AS [Creation Time],
    C.query_plan AS [Query Plan]
FROM sys.dm_exec_query_stats AS A WITH (NOLOCK)
CROSS APPLY sys.dm_exec_sql_text(A.plan_handle) AS B
CROSS APPLY sys.dm_exec_query_plan(A.plan_handle) AS C
WHERE CAST(C.query_plan AS NVARCHAR(MAX)) LIKE '%CONVERT_IMPLICIT%'
  AND B.[dbid] = DB_ID()
  AND B.[text] NOT LIKE '%sys.dm_exec_sql_text%'
ORDER BY A.total_worker_time DESC;

XE Event: Extended Events (XE) is valuable in support scenarios when Query Store or telemetry data alone can't pinpoint issues like implicit conversions, especially if plans aren't cached or historical data lacks detail. XE provides real-time capture of plan-affecting convert events, offering granular insight into query behavior that Query Store might miss during short-lived or dynamic workloads. However, use it sparingly due to its overhead, as a targeted diagnostic tool rather than a broad solution. You can use the script below to create and start the session. Remember to stop and drop the event session immediately when you are done collecting.

IF EXISTS (SELECT * FROM sys.server_event_sessions WHERE name = 'Detect_Conversion_Performance_Issues')
    DROP EVENT SESSION [Detect_Conversion_Performance_Issues] ON SERVER;
GO
CREATE EVENT SESSION [Detect_Conversion_Performance_Issues] ON SERVER
ADD EVENT sqlserver.plan_affecting_convert(
    ACTION(sqlserver.database_name, sqlserver.sql_text)
    WHERE ([sqlserver].[database_name] = N'<Replace_with_your_DB_name>') -- Replace with your DB name
)
ADD TARGET package0.ring_buffer
WITH (
    MAX_MEMORY = 4096 KB,
    EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS,
    MAX_DISPATCH_LATENCY = 30 SECONDS,
    MEMORY_PARTITION_MODE = NONE,
    TRACK_CAUSALITY = OFF,
    STARTUP_STATE = OFF
);
GO
ALTER EVENT SESSION [Detect_Conversion_Performance_Issues] ON SERVER STATE = START;

-- View the raw Extended Events buffer
SELECT s.name AS session_name,
       t.target_name,
       CAST(t.target_data AS XML) AS raw_buffer_xml
FROM sys.dm_xe_sessions s
JOIN sys.dm_xe_session_targets t ON s.address = t.event_session_address
WHERE s.name = 'Detect_Conversion_Performance_Issues';

Documentation Reference: The Microsoft documentation on data type conversion and data type precedence explains the engine behavior and mappings behind implicit conversion triggers. Reviewing it together with the application developers during the schema/code conversion phase helps with understanding and mitigation.

Implicit Conversion: Real-World Example

To evaluate the impact of implicit conversions in Azure SQL during post-migration scenarios, we created a synthetic workload example using a table named dbo.Customers. It contains one million rows and includes columns such as AccountNumber, CustomerName, PhoneNumber, and JoinDate. The AccountNumber, CustomerName, and PhoneNumber columns were initially defined as VARCHAR, and nonclustered indexes were created on these fields to enable efficient lookups. From the application layer, parameters were passed as NVARCHAR, which mirrors typical real-world ORM behavior, particularly in Java-based applications or when migrating from Oracle, where VARCHAR2 frequently stores Unicode characters.
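A minimal sketch of that synthetic setup, assuming illustrative column sizes (the one-million-row data load is omitted):

CREATE TABLE dbo.Customers
(
    CustomerID    INT IDENTITY(1,1) PRIMARY KEY,
    AccountNumber VARCHAR(20),
    CustomerName  VARCHAR(100),
    PhoneNumber   VARCHAR(20),
    JoinDate      DATE
);

-- Nonclustered indexes to support lookups on the VARCHAR columns
CREATE NONCLUSTERED INDEX IX_AccountNumber ON dbo.Customers (AccountNumber);
CREATE NONCLUSTERED INDEX IX_CustomerName  ON dbo.Customers (CustomerName);
CREATE NONCLUSTERED INDEX IX_PhoneNumber   ON dbo.Customers (PhoneNumber);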
This deliberate mismatch allows us to study the real performance consequences of implicit conversions in Azure SQL's query execution engine. Although enabling SET STATISTICS XML ON can expose implicit conversions during query execution, our approach reflects how these issues are usually uncovered in real-world scenarios where customers are less aware of the issue. In this case, we used Query Store and execution plan XML inspection.

Problem: Implicit Conversion Due to Type Mismatch

An NVARCHAR parameter from the application is compared against a VARCHAR column in the database. This scenario highlights a silent performance regression that can go unnoticed post-migration without detailed plan inspection.

Query Used:

DECLARE @ACC NVARCHAR(20) = N'ACC000500';
SELECT CustomerID, AccountNumber, CustomerName, PhoneNumber
FROM dbo.Customers
WHERE AccountNumber = @ACC;

Execution Plan Behavior (Fig 1):

SQL Server applies an implicit CONVERT_IMPLICIT(nvarchar, AccountNumber) on the column side, as shown in Fig 1 when you hover over the Index Scan and inspect the Predicate.
This disables the use of the nonclustered index on AccountNumber, leading to an Index Scan.
The XML plan includes a <Warning> tag under <PlanAffectingConvert>.
Extended Events monitoring consistently shows plan_affecting_convert warnings, indicating suboptimal query plans caused by these conversions.

What's Missing:
Type alignment between the query parameter and the column.
Awareness that even matching string lengths won't help if encoding mismatches exist.

Impact:
The Index Seek is lost, and full scans are triggered.
Higher execution times and overall costs are observed.

Mitigation via Explicit CAST – Matching the Column's Type

In some cases, especially during post-migration tuning, application teams may not be able to change the database schema, but developers can update the query to explicitly align data types. This scenario simulates such a mitigation, where the NVARCHAR parameter is explicitly cast to VARCHAR to match the column's data type and avoid implicit conversions.

Query Used:

DECLARE @ACC NVARCHAR(20) = N'ACC000500';
SELECT CustomerID, AccountNumber, CustomerName, PhoneNumber
FROM dbo.Customers
WHERE AccountNumber = CAST(@ACC AS VARCHAR(20)); -- Explicit use of CAST

Execution Plan Behavior (Fig 2):

The CAST operation ensures that the parameter side matches the VARCHAR column type.
SQL Server performs an Index Seek on the IX_AccountNumber index.
The Seek Predicates shown in Fig 2 confirm this, showing "Scalar Operator(CONVERT..".
No <Warning> tag appears in the XML execution plan, indicating the absence of implicit conversions.

What's Fixed:
The type mismatch is resolved on the query side without altering the database schema.
The query is now SARGable, enabling index usage.

Impact:
The Index Seek is restored, avoiding full scans.
Lower execution times and overall costs are observed.

What's Still Missing:
This still creates a long-term maintainability concern, especially when many queries or columns are affected.
Developers must remember to manually CAST in every affected query, increasing code complexity and the chance of inconsistency.
Missed CASTs in other queries can still cause implicit conversions, so the issue isn't eliminated, just patched locally.

Fix at the DB End – Schema Column Type Aligned with Parameter Usage

This fix involves altering the column type to NVARCHAR, aligning it with the NVARCHAR parameter passed from the application. It eliminates implicit conversions and enables index seeks, improving performance.
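A hedged sketch of this schema-side change on the example table (it assumes the index must be dropped and recreated around the column alteration, and that AccountNumber has no other dependencies such as constraints):

DROP INDEX IX_AccountNumber ON dbo.Customers;

ALTER TABLE dbo.Customers
ALTER COLUMN AccountNumber NVARCHAR(20);   -- add NOT NULL here if the original column requires it

CREATE NONCLUSTERED INDEX IX_AccountNumber ON dbo.Customers (AccountNumber);

With the column now NVARCHAR, the query can simply use WHERE AccountNumber = @ACC with no CAST.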
However, since it's a database-side adjustment, the ideal long-term fix lies in ensuring the application sends parameters matching the original column type.

Query Used:

DECLARE @ACC NVARCHAR(20) = N'ACC000500';
SELECT CustomerID, AccountNumber, CustomerName, PhoneNumber
FROM dbo.Customers
WHERE AccountNumber = @ACC;

Execution Plan Behavior (Fig 3):

The parameter type now matches the NVARCHAR column type, so no conversion is needed on either side.
As seen in Fig 3, an Index Seek is performed on the updated IX_AccountNumber index.
The Seek Predicates confirm this, showing "Scalar Operator(..".
No <Warning> tag appears in the XML execution plan, indicating the absence of implicit conversions.

What's Fixed:
The fix is schema-driven and works universally, ensuring consistent performance across tools and interfaces.
Encoding alignment between the parameter and the column removes conversion logic entirely, making query plans stable and predictable.

Impact:
Indexes remain fully usable without manual intervention in queries.
Application code stays clean; no casts or workarounds are needed.
This is the most sustainable fix, but it may require coordination between the application and DB teams.

What's Still Missing:
Data type changes have implications of their own, such as additional storage: NVARCHAR consumes 2 bytes per character, increasing storage compared to VARCHAR.

Implicit vs Explicit vs Aligned: Execution Plan Behavior Comparison

Scenario | Predicate Expression in Exec Plan | Implicit Conversion | Index Seek Used | XML Plan Warning Shown
Data Type Mismatch | CONVERT_IMPLICIT(nvarchar, AccountNumber) = @ACC | Yes | No (results in scan) | PlanAffectingConvert
Explicit Cast in Query | AccountNumber = CONVERT(varchar, @ACC) | No | Yes | No
Matching Data Types (NVARCHAR) | AccountNumber = @ACC | No | Yes | No

Best Practices for Managing Implicit Conversions

Refactoring the Application: Legacy systems, especially those using dynamic SQL or lacking strict type enforcement, are prone to implicit conversion issues. Refactor application code to use strongly typed variables and parameter declarations, ensuring data type consistency at the source and minimizing implicit conversions during query execution.

Explicit Data Type Casting: Use CAST or CONVERT functions to explicitly define conversions, reducing reliance on implicit behavior. In our example we used CAST, but a CONVERT function would have worked equally well; both approaches explicitly align the parameter type to the column and avoid implicit conversions, enabling an index seek.

Data Type Alignment: When performing heterogeneous migrations involving different database engines, ensure data types are consistent between the source and target engines. Check the official documentation thoroughly to understand the nuances of your data and application convertibility, and know the implications (such as additional storage and collation changes) that can negatively affect your business.

Indexing: Create indexes on columns frequently involved in WHERE filters and JOIN predicates with matching data types, to avoid implicit conversions that degrade index seeks into scans and to ensure optimal index utilization by the optimizer.

Early Testing: Conduct thorough post-migration testing using Query Store to identify expensive queries, then drill down into execution plans and performance metrics to identify and resolve conversion-related issues. Early collaboration between developer and DBA teams is crucial.
Tools and Scripts: Use SQL Server Migration Assistant (SSMA) for Oracle to identify and adjust data type mappings early, based on your application's needs. Additionally, you can use custom scripts or third-party tools if necessary to detect implicit conversions in the plan cache.

References

https://learn.microsoft.com/sql/t-sql/data-types/data-type-conversion-database-engine
https://learn.microsoft.com/sql/t-sql/data-types/data-type-precedence-transact-sql

Final Thoughts

We hope this post has helped you gain actionable strategies to detect, mitigate, and design around implicit conversions in order to ensure a successful and performant data migration to platforms such as SQL Server or Azure SQL. If you have feedback or suggestions for improving this post, please contact the Azure Databases SQL Customer Success Engineering Team. Thanks for your support!

Enforcing SQL PaaS backup retention with Azure Policy
Implementation for SQL DB PITR using the portal

Azure Policy covers much more than SQL; here we are using a small portion of its capabilities. The pieces we are using are:
A policy definition: what the policy checks for, and what to do about issues
A policy assignment: the scope to check the definition across, and the parameter values
A remediation task: makes the required changes

The requirement in this example is to ensure that all Azure SQL Databases have a short-term (PITR) backup retention of at least 9 days:
Any database created without specifying the retention period will have this added.
Any update made with a shorter period will have that modified to be 9 days.
Modifications or database creations that explicitly set the retention period to more than 9 days will have that value honoured.

All of these are built under "Policy" in the portal.

The definition

Open Policy | Authoring | Definitions, and on that blade use "+ Policy definition" to create a new one.
Definition Location: the subscription to hold this (there's a pull-down menu of valid items)
Name: for example, "Enforce SQL DB PITR"
Category: for example, "Backup"
Role Definitions: "Contributor" for this example, but in general this should be the minimum needed for the updates that the definition will make
Policy rule:

{
  "mode": "All",
  "policyRule": {
    "if": {
      "anyOf": [
        {
          "field": "Microsoft.Sql/servers/databases/backupShortTermRetentionPolicies/retentionDays",
          "exists": false
        },
        {
          "field": "Microsoft.Sql/servers/databases/backupShortTermRetentionPolicies/retentionDays",
          "less": "[parameters('Minimum_PITR')]"
        }
      ]
    },
    "then": {
      "effect": "modify",
      "details": {
        "roleDefinitionIds": [
          "/providers/Microsoft.Authorization/roleDefinitions/b24988ac-6180-42a0-ab88-20f7382dd24c"
        ],
        "operations": [
          {
            "operation": "addOrReplace",
            "field": "Microsoft.Sql/servers/databases/backupShortTermRetentionPolicies/retentionDays",
            "value": "[parameters('Minimum_PITR')]"
          }
        ]
      }
    }
  },
  "parameters": {
    "Minimum_PITR": {
      "type": "Integer",
      "metadata": {
        "displayName": "Min PITR",
        "description": "Min PITR retention days"
      }
    }
  }
}

In this code:

"field" is what we want to check and/or change; get the list of field names using PowerShell:

$aliases = Get-AzPolicyAlias -ListAvailable -NamespaceMatch 'Microsoft.Sql' | where ResourceType -like '*retentionpol*' | Select-Object -ExpandProperty 'Aliases'
$aliases | select Name

For the list of fields that can be modified/updated, look at the Modifiable attribute:

$aliases | Where-Object { $_.DefaultMetadata.Attributes -eq 'Modifiable' } | select Name

Minimum_PITR is the name of the parameter the assignment (next step) will pass in; you choose the name of the parameter.

roleDefinitionIds are the full GUID path of the roles that the update needs. The policy remediation docs talk about this, but we can get the GUID with PowerShell:

Get-AzRoleDefinition -Name 'contributor'   # replace contributor with the role needed

This definition is saying that if the PITR retention isn't set, or is less than the parameter value, then make it (via addOrReplace) the parameter value.
The Assignment

Once you save the definition, use "Assign policy" on the screen that appears. There are several tabs:

Basics:
Scope and exclusions let you work on less than the entire subscription.
Enable "policy enforcement".

Parameters:
Enter 9 for Min PITR (to have the policy apply 9 days as the minimum).

Remediation:
Tick "create remediation task".
The default is to use a system-assigned managed identity.

Then create this assignment.

Initial Remediation

Once the assignment is created, look at the compliance blade to see it; Azure Policy is asynchronous, so for a newly created assignment it takes a little while before it begins checking resources in its scope. Similarly, "Remediation tasks" on the Remediation blade shows the task as pending to begin with. Once the initial remediation scan completes, you can look at the backup retention policies (in Data Management | Backups) on the logical server(s) and see that the PITR retention periods have been increased to a minimum of 9 days.

Ongoing operation

With the initial remediation complete, the policy will now intercept non-compliant changes and correct them on the fly. For example, if we use PowerShell to set the retention to 2 days:

$DB_PITR = Get-AzSqlDatabaseBackupShortTermRetentionPolicy -ResourceGroupName mylittlestarter-rg -ServerName mylittlesql -DatabaseName oppo
$DB_PITR | Set-AzSqlDatabaseBackupShortTermRetentionPolicy -RetentionDays 2

ResourceGroupName         : mylittlestarter-rg
ServerName                : mylittlesql
DatabaseName              : oppo
RetentionDays             : 9
DiffBackupIntervalInHours : 12

The update completes, but the output shows that the retention stays at 9 days. The experience in the portal is the same: we can change the retention to 1 day in the GUI, and the operation succeeds, but the retention remains at 9 days. In the activity log of either the logical server or the database, this shows up as a modify operation, with the JSON detail of the modify showing the policy name and the effect.

Tricky bits

A few challenges that can cause delays:
Retention policies are separate resources – both short-term and long-term backup retention aren't direct attributes of the database resource. Instead, they exist as their own resources (e.g., with retentionDays) tied to the database.
Keep policies simple – focusing each policy on a single resource (like SQL DB PITR) proved more effective than trying to create one large, all-encompassing policy.
Case sensitivity matters – the policy definition code is case-sensitive, which can easily trip you up if not handled carefully.
The roleDefinitionId is just the GUID of the security role that the policy is going to need, not anything to do with the identity that's created for the remediation task; but the GUID is potentially different for each subscription, hence the PowerShell to figure out this GUID.
Writing the definitions in PowerShell means that they are just plain text and don't have any syntax helpers; syntax issues in the definition tend to appear as strange "error converting to JSON" messages.
Waiting patiently for the initial policy remediation cycle to finish; I haven't found any "make it so" options.

References

The posts mentioned in the introduction are:
Automatically Enable LTR and PITR Policy upon a Database creation on Azure SQL Managed Instance | Microsoft Community Hub, which uses audits and runbooks
Azure custom policy to prevent backup retention period to be below X number - Azure SQL | Microsoft Community Hub, which uses 'Deny' to fail attempts that don't meet the requirements
Expanding this using PowerShell

With a working example for SQL DB PITR, we now want to expand this to have policies that cover both short-term and long-term retention for both SQL DB and SQL MI. The code below isn't exhaustive and, being a sample, doesn't have error checking. Note that the code uses "less" for the policy test, but operators like "equals" and "greater" (https://learn.microsoft.com/en-us/azure/governance/policy/concepts/definition-structure-policy-rule#conditions) are available to build more complex tests, depending on the policy requirements. The document Programmatically create policies - Azure Policy | Microsoft Learn covers using PowerShell with Azure Policy.

Other wrinkles that this sample doesn't explicitly cater for include:
LTR retentions are held in ISO 8601 format (e.g., 'P8D' for 8 days), so it's not trivial to do less-than tests; in theory ARM template functions could be used to convert these into a number of days, but this example just does an equality check and enforces the policy, without any understanding that P4W is a longer period than P20D.
LTR isn't available for serverless databases with auto-pause enabled (https://learn.microsoft.com/en-us/azure/azure-sql/database/serverless-tier-overview?view=azuresql&tabs=general-purpose#auto-pause); this would need some form of scope control, potentially either using resource groups or a more complex test in the policy definition to look at the database attributes.
A few service levels, for example the Basic database SLO, have different limits for their short-term retention.
PITR for databases that could be offline (stopped managed instances, auto-paused serverless databases, etc.) hasn't been explicitly tested.
Remediation tasks just run to completion, with no rescheduling; to ensure that all existing databases are made compliant, this could be expanded with a loop that checks the count of resources needing remediation and starts a task once the relevant existing ones are complete.

<#
/***This Artifact belongs to the Data SQL Ninja Engineering Team***/

Name:    Enforce_SQL_PaaS_backup_retention.ps1
Author:  Databases SQL CSE/Ninja, Microsoft Corporation
Date:    August 2025
Version: 1.0

Purpose: This is a sample to create the Azure Policy definitions, assignment and remediation tasks
         to enforce organisational policies for minimum short-term (PITR) and weekly long-term (LTR)
         backup retention.

Prerequisites:
- Connect to your Azure environment using Connect-AzAccount
- Register the resource provider (may already be done in your environment) using
  Register-AzResourceProvider -ProviderNamespace 'Microsoft.PolicyInsights'
- If you need to modify/update this script, this can be used to find field names:
  Get-AzPolicyAlias -ListAvailable -NamespaceMatch 'Microsoft.Sql' | where ResourceType -like '*retentionpol*' | Select-Object -ExpandProperty 'Aliases' | select Name

Warranty: This script is provided on an "AS IS" basis and there are no warranties, express or implied,
          including, but not limited to, implied warranties of merchantability or fitness for a
          particular purpose. USE AT YOUR OWN RISK.
Feedback: Please provide comments and feedback to the author at datasqlninja@microsoft.com
#>

# parameters to modify
$Location = 'EastUS'                                    # the region to create the managed identities used by the remediation tasks
$subscriptionID = (Get-AzContext).Subscription.Id       # by default use the current subscription as the scope; change if needed

# the policies to create; PITR can do a less-than comparison, but LTR has dates, so uses string equalities
[array]$policies = @()
$policies += @{type = 'DB'; backups = 'PITR'; name = 'Enforce SQL DB PITR retention'; ParameterName = 'Minimum_PITR'; ParameterValue = 9; Role = 'contributor'; Category = 'Backup'}
$policies += @{type = 'MI'; backups = 'PITR'; name = 'Enforce SQL MI PITR retention'; ParameterName = 'Minimum_PITR'; ParameterValue = 9; Role = 'contributor'; Category = 'Backup'}
# LTR retention is in ISO 8601 format, e.g. P2W = 2 weeks, P70D = 70 days; 'PT0S' = no retention
$policies += @{type = 'DB'; backups = 'LTR'; name = 'Enforce SQL DB LTR retention'; Weekly = 'P4W'; Monthly = 'PT0S'; Yearly = 'PT0S'; WeekofYear = 1; Role = 'contributor'; Category = 'Backup'}
$policies += @{type = 'MI'; backups = 'LTR'; name = 'Enforce SQL MI LTR retention'; Weekly = 'P4W'; Monthly = 'PT0S'; Yearly = 'PT0S'; WeekofYear = 1; Role = 'contributor'; Category = 'Backup'}

# templates for the policy definition code; these have placeholders that are replaced in the loop
$Policy_definition_template_PITR = @'
{
  "mode": "All",
  "policyRule": {
    "if": {
      "anyOf": [
        { "field": "Microsoft.Sql/<Type>/databases/backupShortTermRetentionPolicies/retentionDays", "exists": false },
        { "field": "Microsoft.Sql/<Type>/databases/backupShortTermRetentionPolicies/retentionDays", "less": "[parameters('<ParameterName>')]" }
      ]
    },
    "then": {
      "effect": "modify",
      "details": {
        "roleDefinitionIds": [ "/providers/Microsoft.Authorization/roleDefinitions/<RoleGUID>" ],
        "operations": [
          { "operation": "addOrReplace", "field": "Microsoft.Sql/<Type>/databases/backupShortTermRetentionPolicies/retentionDays", "value": "[parameters('<ParameterName>')]" }
        ]
      }
    }
  },
  "parameters": {
    "<ParameterName>": { "type": "Integer" }
  }
}
'@

# LTR: look for any of the weekly/monthly/yearly retention settings not matching
$Policy_definition_template_LTR = @'
{
  "mode": "All",
  "policyRule": {
    "if": {
      "anyOf": [
        { "field": "Microsoft.Sql/<Type>/databases/backupLongTermRetentionPolicies/weeklyRetention", "exists": false },
        { "field": "Microsoft.Sql/<Type>/databases/backupLongTermRetentionPolicies/weeklyRetention", "notEquals": "[parameters('Weekly_retention')]" },
        { "field": "Microsoft.Sql/<Type>/databases/backupLongTermRetentionPolicies/monthlyRetention", "notEquals": "[parameters('Monthly_retention')]" },
        { "field": "Microsoft.Sql/<Type>/databases/backupLongTermRetentionPolicies/yearlyRetention", "notEquals": "[parameters('Yearly_retention')]" }
      ]
    },
    "then": {
      "effect": "modify",
      "details": {
        "roleDefinitionIds": [ "/providers/Microsoft.Authorization/roleDefinitions/<RoleGUID>" ],
        "operations": [
          { "operation": "addOrReplace", "field": "Microsoft.Sql/<Type>/databases/backupLongTermRetentionPolicies/weeklyRetention", "value": "[parameters('Weekly_retention')]" },
          { "operation": "addOrReplace", "field": "Microsoft.Sql/<Type>/databases/backupLongTermRetentionPolicies/monthlyRetention", "value": "[parameters('Monthly_retention')]" },
          { "operation": "addOrReplace", "field": "Microsoft.Sql/<Type>/databases/backupLongTermRetentionPolicies/yearlyRetention", "value": "[parameters('Yearly_retention')]" },
          { "operation": "addOrReplace", "field": "Microsoft.Sql/<Type>/databases/backupLongTermRetentionPolicies/weekOfYear", "value": "[parameters('WeekofYear')]" }
        ]
      }
    }
  },
  "parameters": {
    "Weekly_retention": { "type": "String" },
    "Monthly_retention": { "type": "String" },
    "Yearly_retention": { "type": "String" },
    "WeekofYear": { "type": "Integer" }
  }
}
'@
"Microsoft.Sql/<Type>/databases/backupLongTermRetentionPolicies/weekOfYear", "value": "[parameters('WeekofYear')]" } ] } } }, "parameters": { "Weekly_retention": { "type": "String" }, "Monthly_retention": { "type": "String" }, "Yearly_retention": { "type": "String" }, "WeekofYear": { "type": "Integer" } } } '@ # main loop foreach ($policy in $policies) { # translate the Role name into its GUID $Role = Get-AzRoleDefinition -name $($policy.Role) $type = $policy.type -replace 'MI','managedInstances' -replace 'DB','servers' $template = if ($policy.backups -eq 'PITR') {$Policy_definition_template_PITR} else {$Policy_definition_template_LTR} # generate the definition code for this policy $policy_definition = $template -replace '<Type>',$type -replace '<RoleGUID>',$($Role.Id) -replace '<ParameterName>',$policy.ParameterName # create the policy definition $PolicyDefinition = new-AzPolicyDefinition -Name $($policy.name) -Policy $policy_definition -Metadata "{'category':'$($policy.Category)'}" # create the assignment if ($policy.backups -eq 'PITR') { $PolicyParameters = @{$($policy.ParameterName)=($($policy.ParameterValue))} } else { $PolicyParameters = @{"Weekly_retention"=($($policy.Weekly)); "Monthly_retention"=($($policy.Monthly)); "Yearly_retention"=($($policy.Yearly)); "WeekofYear"=($($policy.WeekofYear));} } $PolicyAssignment = New-AzPolicyAssignment -Name $($policy.name) -PolicyDefinition $PolicyDefinition -PolicyParameterObject $PolicyParameters -IdentityType 'SystemAssigned' -Location $Location # now follow the docs page to wait for the ID to be created, and assign the roles required to it; https://learn.microsoft.com/en-us/azure/governance/policy/how-to/remediate-resources?tabs=azure-powershell # include a loop to wait until the managed identity created as part of the assignment creation is available do { $ManagedIdentity = Get-AzADServicePrincipal -ObjectId $PolicyAssignment.IdentityPrincipalId -erroraction SilentlyContinue if (!($ManagedIdentity)) {start-sleep -Seconds 1} # wait for a bit... } until ($ManagedIdentity) $roleDefinitionIds = $PolicyDefinition.PolicyRule.then.details.roleDefinitionIds if ($roleDefinitionIds.Count -gt 0) { $roleDefinitionIds | ForEach-Object { $roleDefId = $_.Split("/") | Select-Object -Last 1 $roleAssigned = New-AzRoleAssignment -ObjectId $PolicyAssignment.IdentityPrincipalId -RoleDefinitionId $roleDefId -Scope "/subscriptions/$($subscriptionID)" } } # lastly create the remediation task $RemediationTask = Start-AzPolicyRemediation -Name $($policy.name) -PolicyAssignmentId $PolicyAssignment.Id } # confirm that the policies have been set up Get-AzPolicyDefinition | where name -In $policies.name | format-table Name, PolicyType Get-AzPolicyAssignment | where name -In $policies.name | format-table Name, Parameter Feedback and suggestions If you have feedback or suggestions for improving this data migration asset, please contact the Databases SQL Customer Success Engineering (Ninja) Team (datasqlninja@microsoft.com). Thanks for your support! Note: For additional information about migrating various source databases to Azure, see the Azure Database Migration Guide.SAP Sybase ASE to Azure SQL Migration using SSMA and BCP Overview
Introduction

Enterprises upgrading legacy databases often face challenges in migrating complex schemas and efficiently transferring large volumes of data. Transitioning from SAP ASE (Sybase ASE) to Azure SQL Database is a common strategy to take advantage of enhanced features, improved scalability, and seamless integration with Microsoft services. With business growth, the limitations of the legacy system become apparent: performance bottlenecks, high maintenance costs, and difficulty in integrating with modern cloud solutions.

SQL Server Migration Assistant for SAP Adaptive Server Enterprise (SSMA) automates migration from SAP ASE to SQL Server, Azure SQL Database, and Azure SQL Managed Instance. While SSMA provides a complete end-to-end migration solution, the custom BCP script (ASEtoSQLdataloadusingbcp.sh) enhances this process by enabling parallel data transfers, making it especially effective for migrating large databases with minimal downtime.

Script Workflow

One of the most common challenges we hear from customers migrating from Sybase ASE to SQL Server is: "How can we speed up data transfer for large tables without overwhelming the system?" When you are dealing with hundreds of tables or millions of rows, serial data loads can quickly become a bottleneck. To tackle this, we created a script called ASEtoSQLdataloadusingbcp.sh that automates and accelerates the data migration process using parallelism.

The script starts by reading configuration settings from external files and retrieves a list of tables, either from the source database or from a user-provided file. For each table, it checks whether the table meets the criteria for chunking based on the available indexes. If it does, the table is split into multiple views, and each view is processed in parallel using BCP, significantly reducing the overall transfer time. If chunking is not possible, the script performs a standard full-table transfer. Throughout the entire process, detailed logging ensures everything is traceable and easy to monitor. This approach gives users both speed and control, helping migrations finish faster without sacrificing reliability.

Prerequisites

Before running the script, ensure the following prerequisites are met:
The database schema is converted and deployed using SQL Server Migration Assistant (SSMA).
Both the source (SAP ASE) and target (Azure SQL DB) databases are accessible from the host system running the script.
The source ASE database should be hosted on Unix or Linux.
The target SQL Server can be hosted on Windows, Linux, or as an Azure SQL Database.

Configuration Files

The configuration aspect of the solution is designed for clarity and reuse. All operational parameters are defined in external files; the script uses the following external config files:

bcp_config.env: The primary configuration file, bcp_config.env, contains connection settings and control flags. The screenshot below shows the format of the file.
chunking_config.txt: The chunking_config.txt file defines the tables to be partitioned, identifies the primary key column for chunking, and specifies the number of chunks into which the data should be divided.
table_list.txt: Use table_list.txt as the input if you want to migrate a specific list of tables.

Steps to run the script

Script Execution Log: The script log records the tables copied, timestamps, and process stages.

Performance Baseline: A test was run on a 32-core system with a 10 GB table (2,621,440 rows) for both ASE and SQL. Migration using SSMA took about 3 minutes.
Using the BCP script with 10 chunks, the entire export and import finished in 1 minute 7 seconds. This demonstrates how parallelism and chunk-based processing greatly boost efficiency for large datasets.

Disclaimer: These results are for illustration purposes only. Actual performance will vary depending on system hardware (CPU cores, memory, disk I/O), database configuration, network latency, and table structures. We recommend validating performance in dev/test to establish a baseline.

General Recommendations

Larger batch sizes (e.g., 10K-50K) can boost throughput if disk IOPS and memory are sufficient, as they lower commit overhead.
More chunks increase parallelism and throughput if CPU resources are available; otherwise, they may cause contention when CPU usage is high.
Monitor the system's CPU and IOPS: when the system has high idle CPU and low I/O wait, increasing both the number of chunks and the batch size is appropriate. If CPU load or I/O wait is high, reduce the batch size or chunk count to avoid exhausting resources.
This method aligns BCP operations with your system's existing capacity and performance characteristics.

Steps to Download the Script

Please send an email to the alias datasqlninja@microsoft.com and we will send you the download link with instructions.

What's Next: Upcoming Enhancements to the Script

Smart Chunking for Tables Without Unique Clustered Indexes: Enable chunk-based export using any unique key column, even if the table lacks a unique clustered index. This will extend chunking capabilities to a broader range of tables, ensuring better parallelization.

Multi-Table Parallel BCP with Intelligent Chunking: Introduce full parallel execution across multiple tables. If a table qualifies for chunking, its export/import will also run in parallel internally, delivering two-tier parallelism: across and within tables.

LOB Column Handling (TEXT, IMAGE, BINARY): Add robust support for large object data types, including optimized handling strategies for exporting and importing tables with TEXT, IMAGE, or BINARY columns, ensuring data fidelity and avoiding performance bottlenecks.

Feedback and Suggestions

If you have feedback or suggestions for improving this asset, please contact the Data SQL Ninja Team (datasqlninja@microsoft.com).

Note: For additional information about migrating various source databases to Azure, see the Azure Database Migration Guide. Thank you for your support!

Seamless Homogeneous SQL Family Migration via Azure Data Factory
Migrating data across SQL platforms, be it SQL Server, Azure SQL Database, SQL Managed Instance, or SQL Server on IaaS, often involves operational complexity and potential downtime. Azure Data Factory (ADF) removes those barriers by enabling seamless, logical data movement across these services in either direction. Whether using SQL Change Data Capture (CDC) for near-zero downtime or traditional batch-based strategies, ADF ensures data consistency and operational continuity throughout the process. While physical data migration strategies remain valuable in many scenarios, this blog focuses on how ADF delivers a unified, scalable approach to logical database migration for modernizing database environments with minimal downtime.

Prerequisites

NOTE: Please make sure to review the limitations of CDC, as this blog doesn't cover them: SQL CDC Limitations, Known Issues with CDC.

Before proceeding, please ensure you have the following prerequisites:
An Azure subscription.
Access to Azure Data Factory.
Source and target databases, such as SQL Server, Azure SQL Database, Azure SQL MI, etc.
Change Data Capture (CDC) enabled on the source database for online migration. CDC captures changes such as insert, update, and delete (DML) operations in the source database, allowing near real-time replication to a target database with minimal latency. To enable CDC, run:

-- Enable CDC on the database
EXEC sys.sp_cdc_enable_db;

-- Enable CDC on the source table
EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'SourceTable',
    @role_name     = NULL;

Azure Data Factory provisioning: ADF should be provisioned to provide a runtime environment for executing the pipelines.
Self-hosted Integration Runtime (SHIR): SHIR is required to connect to a data source or destination that is not natively reachable by Azure (e.g., on-premises, private VNET, behind a firewall).
Linked Services: These should be created to connect to the source and target.
Datasets: Datasets identify data within different data stores, such as tables, files, folders, and documents.
Performance Optimization: To speed up the process, primary keys, nonclustered indexes, and constraints should be dropped on the target to reduce blocking/deadlocks and minimize resource contention.

Script Components

Adf_source.sql: This script should be deployed on the source SQL Server. It populates the dbo.data_extraction_config_adf table used by the Change Data Capture (CDC) and initial load pipelines.
Adf_target.sql: This script should be deployed on the target SQL Server. It creates stored procedures that help merge CDC changes, along with the objects necessary for running the pipelines smoothly.

Master tables:
dbo.cdc_watermark_adf contains the last watermark information for the CDC tables.
dbo.data_extraction_config_adf contains information about the heap tables for the initial load and the CDC tables.
dbo.sqlqueries contains information about the clustered tables for the initial load.

Let's deep dive into the pipelines that handle the different scenarios.

Pipeline 1: ClusteredTableMigration_Initial

This pipeline migrates data only from clustered tables. The dbo.sqlqueries table is automatically populated with clustered table information by the pipeline (Stored Procedure activity). Ensure the source table schema matches the target table schema. To run the pipeline for specific tables, set the IsActive flag to 0 (inactive) or 1 (active) in the dbo.sqlqueries table, as shown in the sketch below, or add the table name in the Lookup activity.
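For example, a minimal sketch of toggling tables on and off for a run (the TableName column name is an assumption; adjust it to the actual schema created by Adf_source.sql):

-- Include a table in the next pipeline run
UPDATE dbo.sqlqueries
SET IsActive = 1
WHERE TableName = 'dbo.SourceTable';    -- hypothetical table name

-- Exclude a table from the next pipeline run
UPDATE dbo.sqlqueries
SET IsActive = 0
WHERE TableName = 'dbo.ArchiveTable';   -- hypothetical table name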
Pipeline 2: HeapTableMigration_Initial

This pipeline is designated for migrating heap tables. Prior to executing this pipeline, ensure that the heap table information has been added to the dbo.data_extraction_config_adf table. The source table schema should be synchronized with the target table schema. To execute the pipeline for a set of tables, the IsActive flag may be set to 0 (inactive) or 1 (active) in the dbo.data_extraction_config_adf table.

Pipeline 3: CDCTableMigration

This pipeline facilitates the migration of clustered tables with Change Data Capture (CDC) enabled. Prior to execution, please ensure that the relevant information for these clustered tables is entered into the dbo.data_extraction_config_adf table. Ensure the table schema is synchronized with the target schema, and that all tables intended for CDC synchronization possess a primary key and a matching schema definition on the target system (excluding constraints and nonclustered indexes). To execute the pipeline for specific tables, the IsActive flag can be set to 0 (inactive) or 1 (active) in the dbo.data_extraction_config_adf table.

Schedule the Pipeline - For CDC load only

Create a trigger: Create a trigger to schedule the pipeline to run at regular intervals (e.g., every 5-30 minutes, based on application requirements) to capture and apply changes incrementally.
Monitor the pipeline: Monitor the pipeline runs to verify that the data is being migrated and synchronized accurately.

Cutover and cleanup

Once the delta changes are fully synchronized between the source and target databases, cutover can be initiated by setting the source database to read-only and then changing the connection string of the application (or all apps, Agent jobs, etc. that are impacted) to use the new target database. Then perform cleanup: delete the stored procedures in the target database, and stop CDC and remove the tables and stored procedures in the source database.

Conclusion

Using Azure Data Factory allows for both online and offline data migration with minimal downtime, ensuring consistency between source and target databases. Change Data Capture enables near real-time data migration, suitable for environments requiring continuous data synchronization.

Note: To get the ADF pipelines and T-SQL queries mentioned in this blog, please reach out to our team alias: datasqlninja@microsoft.com

Azure SQL Managed Instance – Sync Agent Jobs and Logins in Failover Groups
In an Azure SQL Managed Instance setup with a failover group, the schema, data, and database-level users are always synced between the primary and secondary instances. As of October 2021, there is no sync mechanism for SQL Server Agent jobs or server logins/roles, because the functionality of replicating system databases does not exist. This document provides guidance and scripts to implement a workaround solution to keep the Agent jobs and server logins/roles synced between the primary and secondary instances.
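As a flavor of the approach, here is a minimal sketch of the core idea for SQL logins only (the full guidance also covers server roles and Agent jobs): generate CREATE LOGIN statements on the primary that preserve each login's SID and password hash, then run the generated statements on the secondary so that database users mapped by SID keep working after failover.

-- Run on the primary instance; execute the generated statements on the secondary instance
SELECT 'CREATE LOGIN ' + QUOTENAME(name)
     + ' WITH PASSWORD = '
     + CONVERT(VARCHAR(514), LOGINPROPERTY(name, 'PasswordHash'), 1)
     + ' HASHED, SID = '
     + CONVERT(VARCHAR(514), sid, 1) + ';' AS create_login_stmt
FROM sys.sql_logins
WHERE name NOT LIKE '##%';   -- skip internal logins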
Temporal Table Replication in SQL Server: Common Barriers and Solutions

Introduction

Transactional replication is a SQL Server feature that copies and synchronizes data and database objects across servers. It generally begins with a snapshot of the publication database objects and data. After this initial snapshot, any data changes and schema modifications made at the Publisher are delivered to the Subscriber as they occur, typically in near real time. These data changes are applied to the Subscriber in the same order and within the same transaction boundaries as at the Publisher, maintaining transactional consistency within a publication.

Standard transactional replication in SQL Server does not support system-versioned temporal tables. This constraint presents difficulties for organizations aiming to replicate historical data maintained in temporal period columns, such as ValidFrom and ValidTo. The challenge persists even when system versioning is disabled, yet there remains a requirement to retain the original values in the target database.

Understanding Temporal Tables

System-versioned temporal tables are a specialized form of user table designed to retain a comprehensive record of all data modifications. These tables facilitate point-in-time analysis by automatically recording historical changes. Each temporal table contains two datetime2 period columns that specify the validity duration of each row. In addition to the current table, an associated history table preserves previous versions of rows whenever updates or deletions take place.

Scenario & Challenge

In one of our migration scenarios, the customer faced an issue where system versioning was disabled, but there was still a requirement to replicate data from the ValidFrom and ValidTo columns to the target database without modification. Although temporal tables are commonly used for auditing and historical analysis, replicating them within a transactional replication setup can present specific technical challenges:
System-managed period columns complicate schema compliance.
A mismatch in ValidFrom and ValidTo values across environments can compromise audit reliability.

As transactional replication currently does not support temporal period columns, we devised the following solution to address this requirement.

Common Error Example

When configuring replication for an article that includes a system-versioned temporal table, the setup process may encounter failures due to SQL Server limitations related to system-generated columns. In certain situations where system versioning is disabled, it may still be necessary to replicate the initial values of the ValidFrom and ValidTo period columns on the target system. However, during the configuration of transactional replication, the snapshot application process can fail on these columns with an error. This issue arises because SQL Server considers these columns system-generated and restricts direct inserts, including during replication. The following workaround addresses this situation.

The Workaround

To successfully replicate temporal tables, follow these steps. Note: this approach works in scenarios where a small amount of downtime is acceptable.

1. Predefine the table schema on the target: Ensure that the source table schema exists on the target and matches the source schema.

2. Disable system versioning temporarily: Before configuring replication, disable system versioning on the temporal table. This allows replication to treat it like a regular table.

ALTER TABLE [dbo].[Department] SET (SYSTEM_VERSIONING = OFF);
3. Remove the period for system time: When you set SYSTEM_VERSIONING = OFF and don't drop the SYSTEM_TIME period, the system continues to update the period columns for every insert and update operation. Use the script below to remove the period for system time.

ALTER TABLE dbo.Department DROP PERIOD FOR SYSTEM_TIME;

4. Configure replication: After this step, use the scripts below, step by step, to configure replication.

Replication Setup Steps

Set a replication database option for the specified database. This stored procedure is executed at the Publisher or Subscriber on any database.

use master
GO
exec sp_replicationdboption
    @dbname = N'SourceDBName',
    @optname = N'publish',
    @value = N'true'
GO

Create a transactional publication. This stored procedure is executed at the Publisher on the publication database.

use [SourceDBName]
GO
exec sp_addpublication
    @publication = N'PublicationName',
    @description = N'Transactional Replication publication of database',
    @sync_method = N'concurrent',
    @retention = 0,
    @allow_push = N'true',
    @allow_pull = N'true',
    @allow_anonymous = N'true',
    @enabled_for_internet = N'false',
    @snapshot_in_defaultfolder = N'true',
    @compress_snapshot = N'false',
    @ftp_port = 21,
    @allow_subscription_copy = N'false',
    @add_to_active_directory = N'false',
    @repl_freq = N'continuous',
    @status = N'active',
    @independent_agent = N'true',
    @immediate_sync = N'true',
    @allow_sync_tran = N'false',
    @allow_queued_tran = N'false',
    @allow_dts = N'false',
    @replicate_ddl = 1,
    @allow_initialize_from_backup = N'false',
    @enabled_for_p2p = N'false',
    @enabled_for_het_sub = N'false'
GO

Create the Snapshot Agent for the specified publication. This stored procedure is executed at the Publisher on the publication database.

use [SourceDBName]
GO
exec sp_addpublication_snapshot
    @publication = N'PublicationName',
    @frequency_type = 1,
    @frequency_interval = 1,
    @frequency_relative_interval = 1,
    @frequency_recurrence_factor = 0,
    @frequency_subday = 8,
    @frequency_subday_interval = 1,
    @active_start_time_of_day = 0,
    @active_end_time_of_day = 235959,
    @active_start_date = 0,
    @active_end_date = 0,
    @publisher_security_mode = 0,
    @job_login = N'',
    @job_password = N'',
    @publisher_login = N'',
    @publisher_password = N''

Create an article and add it to the publication. This stored procedure is executed at the Publisher on the publication database.

use [SourceDBName]
GO
exec sp_addarticle
    @publication = N'PublicationName',
    @article = N'ArticleName',
    @source_owner = N'SourceSchemaName',
    @source_object = N'SourceTableName',
    @type = N'logbased',
    @description = null,
    @creation_script = null,
    @pre_creation_cmd = N'truncate',
    @schema_option = 0x000000000803509F,
    @identityrangemanagementoption = N'manual',
    @destination_table = N'DestinationTableName',
    @destination_owner = N'DestinationSchemaName',
    @vertical_partition = N'false',
    @ins_cmd = N'CALL sp_MSins_dboEmployee',
    @del_cmd = N'CALL sp_MSdel_dboEmployee',
    @upd_cmd = N'SCALL sp_MSupd_dboEmployee'
GO

Add a subscription to the publication and set the Subscriber status. This stored procedure is executed at the Publisher on the publication database.

use [SourceDBName]
GO
exec sp_addsubscription
    @publication = N'PublicationName',
    @subscriber = N'AzureSQLDBServerName',
    @destination_db = N'TargetDBName',
    @subscription_type = N'Push',
    @sync_type = N'automatic',
    @article = N'all',
    @update_mode = N'read only',
    @subscriber_type = 0
GO

Add a new scheduled agent job used to synchronize a push subscription to a transactional publication. This stored procedure is executed at the Publisher on the publication database.
use [SourceDBName]
GO
exec sp_addpushsubscription_agent
    @publication = N'PublicationName',
    @subscriber = N'AzureSQLDBServerName',
    @subscriber_db = N'TargetDBName',
    @job_login = N'',
    @job_password = null,
    @subscriber_security_mode = 0,
    @subscriber_login = N'',
    @subscriber_password = null,
    @frequency_type = 64,
    @frequency_interval = 1,
    @frequency_relative_interval = 1,
    @frequency_recurrence_factor = 0,
    @frequency_subday = 4,
    @frequency_subday_interval = 5,
    @active_start_time_of_day = 0,
    @active_end_time_of_day = 235959,
    @active_start_date = 0,
    @active_end_date = 0,
    @dts_package_location = N'Distributor'
GO

5. Re-enable system versioning: Once you have performed all the above steps and completed the data migration to the target database, stop/delete the replication, then add the period for system time back on the target table and enable system versioning.

ALTER TABLE dbo.Department ADD PERIOD FOR SYSTEM_TIME (<ValidFrom>, <ValidTo>);
ALTER TABLE [dbo].[Department] SET (SYSTEM_VERSIONING = ON);

Note: The <ValidFrom> and <ValidTo> columns are datetime2 columns defined as PERIOD FOR SYSTEM_TIME, using GENERATED ALWAYS AS ROW START and ROW END. Refer to the period column names you used when creating the temporal table and use the same names when adding the period columns in the script above.
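After re-enabling system versioning, a quick verification sketch (assuming the same dbo.Department table from the examples above) confirms that the table is system-versioned again and shows its associated history table:

SELECT t.name               AS table_name,
       t.temporal_type_desc AS temporal_type,   -- expect SYSTEM_VERSIONED_TEMPORAL_TABLE
       h.name               AS history_table
FROM sys.tables AS t
LEFT JOIN sys.tables AS h ON t.history_table_id = h.object_id
WHERE t.name = 'Department'
  AND SCHEMA_NAME(t.schema_id) = 'dbo';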
Conclusion

Migrating temporal tables within a transactional replication environment involves managing system-versioned features appropriately. Temporarily disabling system versioning and removing the SYSTEM_TIME period allows for adherence to schema requirements and facilitates data replication. After completing replication on the target platform, re-enabling system versioning reinstates temporal table functionality while maintaining data integrity. This workaround ensures that your replication strategy remains robust while preserving the audit trail and historical insights offered by temporal tables.

Oracle to Azure SQL Post Migration Optimization: Handling Oracle Function-Based Indexes in Azure SQL

Why Post-Migration Optimization Matters

For customers migrating from Oracle to Azure SQL databases, understanding Azure SQL optimization techniques can make a significant difference in overall database performance. During the application stress-testing phase, it is essential to identify query patterns, evaluate index usage, and refine queries to prevent performance bottlenecks and ensure system efficiency. While Azure SQL Database's Intelligent Query Processing (IQP) can significantly improve performance with minimal intervention, certain complex scenarios may still require manual query rewrites to achieve the desired level of performance.

Leveraging SARGable Queries in SQL Server

One such optimization technique involves ensuring that queries in SQL Server are SARGable (Search ARGument-able). SARGable queries enable the SQL engine to efficiently leverage indexes, significantly reducing resource consumption and execution time. For instance, small adjustments such as replacing column transformations with range-based conditions, creating computed columns, and adding the required indexes on computed columns can have a significant impact on query performance.

Metric | Non-SARGable Query | SARGable Query
Execution Plan | Index Scan (usually less efficient) | Index Seek (usually more efficient)
IO Cost | High (reads many pages) | Low (reads fewer pages)
CPU Time | Higher (function evaluation for rows) | Lower (direct use of index)
Logical Reads | High (entire index/table scanned) | Low (only relevant index pages accessed)

Handling Function-Based Indexes: Oracle vs SQL Server

Oracle: In Oracle, function-based indexes allow indexing on the result of a function or expression. This enables efficient query execution when functions or expressions are used in WHERE or JOIN conditions. The TRUNC(APPLN_START_DT) expression is precomputed and stored in the index, so queries can get the value of the expression from the index instead of computing it. The following is an example of this in practice:

-- Create a function-based index on the TRUNC(APPLN_START_DT) expression
CREATE INDEX idx_appln_start_dt ON appln_data ( trunc(appln_start_dt) );

-- Filter rows where the year of APPLN_START_DT matches the current year
SELECT *
FROM appln_data
WHERE trunc(appln_start_dt, 'YYYY') = trunc(sysdate, 'YYYY');

SQL Server: SQL Server takes a slightly different approach to indexing and query optimization, particularly when working with large datasets or tables containing millions of rows. The SQL optimizer excels when queries are SARGable, meaning the predicates are written to allow direct index utilization without applying transformations to columns. Let's analyze two versions of a query on a table [APPLN_DATA] with 50 million rows to understand how query formulation impacts performance, resource utilization, and execution efficiency.

Comparing SARGable and Non-SARGable Queries

Non-SARGable Query:

-- Filter rows where the year of APPLN_START_DT matches the current year
SELECT *
FROM [dbo].[APPLN_DATA]
WHERE YEAR(APPLN_START_DT) = YEAR(GETDATE());

Execution plan: Index Scan
Logical Reads: 928,804
CPU Time: 7,562 ms
Physical Reads: 2
Elapsed Time: 63,130 ms

Why It's Inefficient: The YEAR() function transforms the APPLN_START_DT column, which prevents the optimizer from performing an Index Seek and forces an Index Scan.
SQL Server: SQL Server takes a slightly different approach to index usage and query optimization, particularly when working with large tables containing millions of rows. The optimizer excels when queries are SARGable (Search ARGument-able), meaning the predicates are written so the index can be used directly, without applying transformations to the column. Let's analyze two versions of a query against a table [APPLN_DATA] with 50 million rows to see how query formulation affects performance, resource utilization, and execution efficiency.

Comparing SARGable and Non-SARGable Queries

Non-SARGable Query:

-- Filter rows where the year of APPLN_START_DT matches the current year.
SELECT *
FROM [dbo].[APPLN_DATA]
WHERE YEAR(APPLN_START_DT) = YEAR(GETDATE());

Execution Plan: Index Scan
Logical Reads: 928,804
Physical Reads: 2
CPU Time: 7,562 ms
Elapsed Time: 63,130 ms

Why It's Inefficient: The YEAR() function transforms the APPLN_START_DT column, which prevents the optimizer from performing an Index Seek and forces an Index Scan. The execution plan confirms this: SQL Server reads through all rows instead of seeking directly to the relevant ones using the index. This increases logical reads, CPU usage, I/O cost, and overall query execution time.

SARGable Query:

SELECT *
FROM [dbo].[APPLN_DATA]
WHERE APPLN_START_DT >= CAST(YEAR(GETDATE()) AS CHAR(4)) + '-01-01'
  AND APPLN_START_DT <  CAST(YEAR(GETDATE()) + 1 AS CHAR(4)) + '-01-01';

Execution Plan: Index Seek
Logical Reads: 173,041
CPU Time: 3,547 ms
Elapsed Time: 52,092 ms

Why It's Better: The query uses range-based conditions (>= and <) directly on the APPLN_START_DT column without applying any functions. This makes the query SARGable, allowing SQL Server to perform an Index Seek, as shown in the execution plan. An Index Seek retrieves only the relevant rows, significantly reducing logical reads, CPU usage, and overall query execution time.

Understanding the Key Differences and Impact of SARGable and Non-SARGable Queries

Aspect | Non-SARGable | SARGable
Query Example | WHERE YEAR(APPLN_START_DT) = YEAR(GETDATE()) | WHERE APPLN_START_DT >= '2024-01-01'
Index Usage | Index Scan (due to the YEAR() function - inefficient) | Index Seek (efficient)
Logical Reads | High (entire index/table scanned) | Low (only relevant index pages accessed)
CPU Time | High | Low
Elapsed Time | Longer | Shorter
Query Optimization | Applies a function to the column | Uses range conditions without transformations

Additional Optimization Techniques:

A computed column with an index was not used here because the query retrieves rows for an entire year, making range-based conditions more efficient. Indexed computed columns are better suited to scenarios where a specific transformation is applied to a smaller subset of data, such as calculations or filtering for a particular month or day.

In this example we use SELECT * for simplicity, but real-world workloads typically retrieve specific columns based on application requirements. Selecting only the required columns reduces I/O and improves query performance. Additionally, covering indexes (indexes that include all queried columns) should be considered to eliminate Key Lookups for frequently accessed queries.

If the table has billions of rows, even efficient Index Seeks may take considerable time. Filtered indexes, indexed views, and partitioning the table on a logical key (for example, date ranges) can improve query performance further. Sketches of a covering index and a filtered index follow the key takeaways below.

Key Takeaways for a Successful Heterogeneous Migration:

- Oracle's function-based indexes can be adapted in SQL Server using range-based conditions, computed columns with indexes, indexed views, or application-level optimizations.
- Aim to write SARGable queries to leverage SQL Server's indexing capabilities effectively, and avoid applying functions in query predicates.
- Evaluate and consider the use of indexed views and filtered indexes.
- During the application stress testing phase, iteratively update index statistics, evaluate index access patterns, review histograms, monitor query plans and fragmentation, and rebuild indexes as needed to achieve optimal performance.
- Collaborate with application developers to refactor query logic when needed.
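The covering index and filtered index mentioned above could look like the following. These are minimal sketches only: the included columns (APPLN_ID, APPLN_STATUS) and the filter boundary are assumptions and should be adjusted to the actual query workload.

-- Covering index: includes the columns the query actually selects,
-- so the Index Seek does not need Key Lookups back to the base table.
CREATE NONCLUSTERED INDEX IX_APPLN_DATA_START_DT_COVER
    ON dbo.APPLN_DATA (APPLN_START_DT)
    INCLUDE (APPLN_ID, APPLN_STATUS);

-- Filtered index: indexes only the slice of data the hot queries touch.
CREATE NONCLUSTERED INDEX IX_APPLN_DATA_CURRENT
    ON dbo.APPLN_DATA (APPLN_START_DT)
    WHERE APPLN_START_DT >= '2024-01-01';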
References:

- Intelligent query processing details - SQL Server | Microsoft Learn
- Create indexed views - SQL Server | Microsoft Learn
- Create filtered indexes - SQL Server | Microsoft Learn
- Migrating Oracle Databases to SQL Server (OracleToSQL) - SQL Server | Microsoft Learn
- Automatic tuning - SQL Server | Microsoft Learn

Feedback and suggestions

If you have feedback or suggestions for improving this data migration asset, please contact the Databases SQL Ninja Engineering Team (datasqlninja@microsoft.com). Thanks for your support!

Note: For additional information about migrating various source databases to Azure, see the Azure Database Migration Guide.

Ingesting Mainframe File System Data (EBCDIC) into SQL DB on Fabric Using OSS Cobrix
Introduction

Mainframe and midrange data is often stored in fixed-length format, where each record has a predetermined length, or in variable-length format, where each record's length may vary. The data is stored in binary form using Extended Binary Coded Decimal Interchange Code (EBCDIC) encoding, and the metadata for the EBCDIC files is kept in a copybook file. EBCDIC files store each field according to its data type, which is vital for optimal storage and performance on the mainframe file system. However, this presents a challenge when migrating data from mainframe or midrange systems to distributed systems: the data is not directly readable after transfer, because distributed systems understand code pages such as American Standard Code for Information Interchange (ASCII). To make this data readable on a distributed system, an EBCDIC-to-ASCII code page conversion is required. This conversion can be achieved in several ways, including:

- Microsoft Host Integration Server (Host File Client)
- Logic Apps with the IBM Host File connector (our detailed blog about it is here)
- Open-source (OSS) libraries
- Third-party ISV solutions

Microsoft Host Integration Server

Microsoft Host Integration Server (HIS) includes a component named Host File Client (HFC). This component helps convert mainframe EBCDIC files to ASCII using a custom-developed C# solution. More details are provided in the HIS documentation page.

Logic App Converter

If you prefer a cloud-native solution, you can use the Host File connector in Azure Logic Apps. The detailed process is documented in this blog post.

Fabric (with Open-Source Libraries)

Microsoft Fabric is an enterprise-ready, end-to-end analytics platform. It unifies data movement, data processing, ingestion, transformation, real-time event routing, and report building, and supports these capabilities with integrated services like Data Engineering, Data Factory, Data Science, Real-Time Intelligence, Data Warehouse, and Databases. Several open-source solutions can convert mainframe data to ASCII using Fabric, Databricks, or Synapse on Azure. This blog focuses on the OSS option.

Data Ingestion Architecture Using OSS on Fabric

Multiple open-source libraries can be used for this data conversion. In this article we dive deeper into one of them: Cobrix. Cobrix is an open-source library built in Scala that leverages Spark's multithreaded processing framework, which helps it convert files faster than single-threaded approaches. Because it is multithreaded, it needs a pool of compute resources to perform the conversion. Cobrix can run on Spark environments such as Azure Synapse, Databricks, and Microsoft Fabric. Here we walk through setting up Cobrix on Microsoft Fabric.

Download the required Cobrix packages

First, download the required Cobrix packages from the right sources. Because Fabric has a specific runtime dependency, make sure you download the Scala build that matches the Fabric environment you set up. You need two jars: Cobol-Parser_xx.xx.jar and Spark-cobol_xxx.xx.jar.

Set up the Fabric environment

1. Log in to fabric.microsoft.com.
2. Create a Fabric workspace.
3. Create an Environment in the workspace.
4. Open the Environment, click on Custom Libraries, and upload the two jars downloaded earlier. Once uploaded, both jars should appear under the environment's custom libraries.
5. Create a new Lakehouse.
6. Upload the COBOL copybook file and the mainframe data file (in binary) to a location in the Lakehouse. At the end of this step, both files should be visible in the Lakehouse. For each file, copy the Azure Blob File System Secure (ABFSS) path by right-clicking on the file; this path is used to reference the file from the Spark notebook.
7. Create a new Fabric pipeline. The pipeline has two components: a notebook, which calls the Cobrix framework to convert the file from EBCDIC to ASCII, and a Copy activity, which copies the contents of the output file created by the notebook into a SQL database on Fabric.
8. Create a new notebook and attach the environment you created earlier to it. In the notebook cell, you can use the following code:

//Blob access (ABFSS paths copied from the Lakehouse)
val CopyBookName = "abfss://file1.cpy"
val DataFileName = "abfss://file1.dat"
val outputFileName = "abfss://output.txt"

//Cobrix converter execution
val cobolDataframe = spark
  .read
  .format("za.co.absa.cobrix.spark.cobol.source")
  .option("copybook", CopyBookName)
  .load(DataFileName)

//Display the DataFrame to view the conversion results
cobolDataframe.printSchema()
cobolDataframe.show()

//Write the converted (ASCII) output back to the Lakehouse so the Copy activity can pick it up.
//CSV is one convenient choice here; adjust the format and options to your needs.
cobolDataframe
  .write
  .mode("overwrite")
  .option("header", "true")
  .csv(outputFileName)

Once the configuration is set, run the notebook. It converts the file from EBCDIC to ASCII and stores the result in the Lakehouse.

9. Add a Copy activity to the pipeline with the file as source and SQL Server as destination. At this point, the pipeline contains the notebook activity followed by the Copy activity. When you run the pipeline, the mainframe EBCDIC file is converted to ASCII and then loaded into the Fabric native SQL database table (a sketch of a possible destination table follows).
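If you pre-create the destination table in the SQL database on Fabric, it might look like the following. This table is purely hypothetical; the real column names and data types are dictated by the fields defined in your copybook.

-- Hypothetical destination table for the Copy activity; adjust to your copybook layout.
CREATE TABLE dbo.MainframeRecords
(
    RecordId       INT            NOT NULL,
    CustomerName   VARCHAR(100)   NULL,
    AccountBalance DECIMAL(18, 2) NULL,
    LastUpdated    DATE           NULL
);

The Copy activity then maps the columns of the converted output file to the columns of this table.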
Third-party ISV solutions

Many third-party ISV solutions are available for EBCDIC-to-ASCII conversion. Please get in touch with us so we can help you find the right solution for your requirements.

Summary

EBCDIC-to-ASCII conversion is a critical piece of work during a data migration or modernization journey, and being able to do it easily and accurately drives the success of the migration. With this capability enabled in Fabric, new use cases open up, such as mainframe report generation and other data warehouse-driven scenarios.

Feedback and suggestions

If you have feedback or suggestions for improving this data migration asset, please send an email to the Database Platform Engineering Team.

Seamless Cross-Tenant Migration of Azure SQL Databases without Modifying Connection Strings

Transferring an Azure SQL Database between tenants involves several steps, but it can be done efficiently without changing your connection strings. There are several reasons why an organization might need to migrate between Microsoft Entra tenants, including organizational restructuring, mergers and acquisitions, compliance requirements, or the need to consolidate resources for better management and cost efficiency. This guide outlines the process in a clear and concise manner.

High-Level Overview

The migration process involves the following key steps:

1. Create a Temporary Subscription in the Source Tenant: Set up a new subscription in the source tenant.
2. Move Resources to the Temporary Subscription: Use Azure Resource Mover to transfer the Azure SQL Server and Database to the new subscription.
3. Transfer the Subscription to the Target Entra Tenant: Transfer billing ownership of the subscription to the target tenant.
4. Accept the Transfer at the Target Tenant: The target tenant accepts the transfer and completes the necessary verification steps.
5. Move Resources to the Target Subscription in the Target Tenant: Finally, move the resources to the desired subscription within the target tenant.

Detailed Steps

Create a Temporary Subscription in the Source Tenant
- Navigate to the Azure Portal and create a new subscription (for example, Pay-As-You-Go) if necessary.
- Enter your payment information and complete the setup.

Move Resources to the Temporary Subscription
- Open Azure Resource Mover and select "Move across subscription".
- Create a new Resource Group.
- Select Add resources and choose the SQL Server and SQL Database resources you want to move.
- Verify the resources and initiate the move. Note: this can take a few minutes.

Transfer the Subscription to the Target Entra Tenant
- Select the Pay-As-You-Go subscription and transfer billing ownership to the target tenant's administrator by entering the recipient's email address.
- Important: Set the 'Move subscription tenant' toggle to 'Yes'.
- The source administrator will receive an email to confirm the transfer request.

Accept the Transfer at the Target Tenant
- The target tenant's administrator accepts the transfer and completes the verification process.
- Once the transfer is complete, the subscription appears in the target tenant.

Move Resources to the Target Subscription in the Target Tenant
- Use Azure Resource Mover to transfer the Azure SQL Server and Database to the desired subscription within the target tenant.
- Verify the resources and complete the move.

Post-Transfer Tasks

After the migration, a few additional steps and best practices ensure everything is set up correctly:

- Take Ownership of the Server in the New Tenant: Set the desired Entra ID administrator for the SQL Server.
- Configure Security and Access: Set up the necessary security controls and drop any old Entra-based principals (see the sketch below).
- Review Firewall Rules: Ensure that firewall rules and any Azure Key Vault configurations are correct.
- Test the Database: Perform thorough testing to ensure all data and functionality are intact.

Downtime Planning

During the migration process, it is crucial to plan for minimal downtime. In our testing, an application continuously inserted records into the database, and no loss of connection occurred throughout the transfer. Little to no downtime can be achieved, but only when using SQL Authentication: because the Entra tenant is changing, any logins based on that tenant will cease to work once the database is moved.
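To restore Entra-based access after the move, database users that were created from the old tenant's directory typically need to be dropped and recreated against principals in the target tenant. The following is a minimal sketch only; the principal names are placeholders for users or groups in the target tenant.

-- Remove a database user tied to the old tenant (placeholder name).
DROP USER IF EXISTS [app-user@oldtenant.com];

-- Recreate the user from the new tenant's directory and grant the required roles.
CREATE USER [app-user@newtenant.com] FROM EXTERNAL PROVIDER;
ALTER ROLE db_datareader ADD MEMBER [app-user@newtenant.com];
ALTER ROLE db_datawriter ADD MEMBER [app-user@newtenant.com];

SQL Authentication logins and users are unaffected by the tenant change, which is why the near-zero-downtime path described above relies on them.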
Conclusion

Business conditions often change and can require moving Azure SQL Databases across Entra tenants. While this process can seem daunting, it is straightforward if you follow the steps outlined in this guide. Remember to thoroughly test the database after the transfer and update all necessary configurations and security rules to maintain seamless operations.

Feedback

If you have feedback or suggestions for improving this data migration asset, please comment here or contact the Databases SQL Customer Success Engineering Team (datasqlninja@microsoft.com) directly. Thanks for your support!

Note: For additional information about migrating various source databases to Azure, see the Azure Database Migration Guide.

Happy Migrating!