# Database Platform CSE – Oracle
## Diagnosing and Resolving Oracle to Azure PostgreSQL Migration Failures with Ora2Pg
### Introduction

Migrating from Oracle to Azure PostgreSQL is a complex endeavor with multiple coordinated steps. Ora2Pg is a widely used tool to accelerate both schema conversion and data migration. For installing Ora2Pg, see the step-by-step installation guide for Windows and Linux here.

After a successful schema migration, data migration can become complicated and time-consuming, especially when dealing with tables containing millions or billions of rows. The process may also encounter failures due to data errors related to platform, data type, or character set incompatibility. Identifying these errors can be challenging, but by applying the following debugging methods and troubleshooting steps, you can efficiently resolve these issues during large and complex migrations.

### Steps to debug the errors

1. Locate the configuration file: Open your ora2pg.conf file. It is typically found in the directory where you initialized your Ora2Pg project.
2. Find the LOG_ON_ERROR parameter in the Ora2Pg configuration file. If it doesn't exist, you can add it. Set the value of LOG_ON_ERROR to 1 to enable logging of errors. A brief example of how to set this in your ora2pg.conf file is sketched below.
3. Specify the output directory: Ensure that you have specified an output directory where the error logs will be saved. This can be done using the OUTPUT directive in the same configuration file.
4. Proceed with the data import. When Ora2Pg encounters an error in a COPY or INSERT statement from PostgreSQL, it logs the problematic statements and data into a file named TABLENAME_error.log within the output directory, and the migration continues. By default, this setting is disabled, causing the import to abort upon error.
5. Continue with the next batch of data. This approach allows you to troubleshoot and manually reload the error log file after fixing the issues.
6. Review each column of the row to identify the problematic data.
7. If you still can't determine the issue, update the log file by removing the SQL syntax and keeping the data in a tab-delimited format. Use the pgAdmin import utility to load the file and check whether it generates the same error for the specific row.
8. Capture the problematic data rows and attempt to insert them into a test environment with the same table structure, or recreate the table. Run Ora2Pg to import the data and check whether the issue can be reproduced.
9. If you have specific errors that you want to handle differently, you may need to adjust other parameters or review the logged errors to make the necessary corrections before re-running the migration.

### Use case

1. Data migration fails and the debug logs show an error message.
2. Enable the LOG_ON_ERROR parameter in the configuration file. Re-run the data migration using Ora2Pg, which will generate a file containing the data errors in the output directory.
3. Review the log file to identify problematic data. Apply appropriate remedies to resolve these issues. Search for errors and possible solutions here.
4. If no solution is available to fix the data errors, consider one of the following methods or tools to manually insert the data:
   - Use Azure Data Factory.
   - Update the bad data manually in the tablename_error file and import it using pgAdmin.
   - Export the table into a flat file using Oracle SQL Developer and import it using pgAdmin.
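As referenced in the steps above, here is a minimal, illustrative sketch of the relevant ora2pg.conf entries. The path is a placeholder, and the exact name of the output-directory directive should be verified against the comments in your own ora2pg.conf template:

```
# Log failed COPY/INSERT statements and their data to TABLENAME_error.log
# instead of aborting the import (0 = disabled, which is the default)
LOG_ON_ERROR    1

# Directory where exported files and the *_error.log files are written
OUTPUT_DIR      /migration/ora2pg/myproject/data
```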
Let's go through some scenarios to illustrate why data copying might fail with Ora2Pg but succeed with Azure Data Factory or manual insertion using pgAdmin.

#### Scenario 1: Data Type Handling

- Issue: Ora2Pg encounters an error while migrating a table with a NUMBER data type that has precision and scale values unsupported by PostgreSQL.
- Ora2Pg: Throws an error due to the unsupported precision and scale.
- Azure Data Factory: Automatically converts the NUMBER data type to a compatible PostgreSQL data type, such as NUMERIC.
- pgAdmin: Allows manual intervention to adjust the data type during insertion, avoiding the errors caused by the precision mismatch between NUMBER and NUMERIC.

#### Scenario 2: Character Set Issues

- Issue: The table contains special characters or emojis that are not supported by the configured character set in PostgreSQL.
- Ora2Pg: Fails to migrate the data due to character set incompatibility.
- Azure Data Factory: Handles the character set conversion more gracefully, ensuring the special characters are correctly encoded.
- pgAdmin: Allows manual encoding adjustments during data insertion, bypassing the character set issue.

#### Scenario 3: Platform-Specific Incompatibilities

- Issue: The table includes Oracle-specific functions or features that are not directly compatible with PostgreSQL (for example, built-in functions, views, or UDFs).
- Ora2Pg: Encounters errors due to the strict enforcement of platform-specific rules.
- Azure Data Factory: Uses built-in transformations to convert Oracle-specific features to their PostgreSQL equivalents.
- pgAdmin: Allows manual editing of the data or SQL queries to ensure compatibility with PostgreSQL.

#### Scenario 4: Error Handling and Logging

- Issue: A specific row contains invalid data that violates PostgreSQL constraints.
- Ora2Pg: Stops the migration process and logs the error, requiring manual intervention to fix the data.
- Azure Data Factory: Logs the error but continues with the migration, allowing you to address the problematic rows later.
- pgAdmin: Provides immediate feedback during manual insertion, allowing you to correct the data on the spot.

#### Scenario 5: Configuration Differences

- Issue: Different default settings for data transfer and conversion.
- Ora2Pg: Uses strict default settings that might not be optimal for all data types and structures.
- Azure Data Factory: Offers more flexible configuration options that can be adjusted to handle various data scenarios.
- pgAdmin: Allows for manual configuration and adjustments during the data insertion process.
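For scenarios 1 and 2, it is often worth checking whether Ora2Pg's own configuration directives can resolve the mismatch before falling back to Azure Data Factory or pgAdmin. The following is an illustrative sketch only: the table, column, type, and character-set values are placeholders, and the directives should be confirmed against your Ora2Pg version's documentation.

```
# Override the PostgreSQL type generated for specific columns, e.g. an Oracle
# NUMBER whose precision/scale does not map cleanly
MODIFY_TYPE     APPLN_DATA:AMOUNT:numeric(38,10)

# Align source and target character sets when special characters fail to load
NLS_LANG        AMERICAN_AMERICA.AL32UTF8
CLIENT_ENCODING UTF8
```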
### Feedback and suggestions

If you have feedback or suggestions for improving this data migration asset, please send an email to the Azure Databases CSE Engineering Team.

## Optimizing Data Archival with Partitioning in Azure PostgreSQL for Oracle Migrations

### Introduction

As enterprises migrate mission-critical workloads from heterogeneous databases like Oracle to Azure Database for PostgreSQL Flexible Server, managing large datasets while ensuring compliance with strict data retention policies becomes a key priority. Industries such as retail, telecommunications, and transportation and logistics, among others, enforce stringent data retention requirements to safeguard customer information, preserve operational efficiency, and maintain service reliability. Failure to meet these standards can lead to increased risks, including data loss, inefficiencies, and potential non-compliance issues. Implementing a robust data retention and partitioning strategy in PostgreSQL helps organizations efficiently manage and archive historical data while optimizing performance.

Azure Database for PostgreSQL Flexible Server provides powerful partitioning capabilities that allow organizations to manage large volumes of data effectively. By partitioning tables based on time intervals or other logical segments, we can improve query performance, automate archival processes, and ensure efficient data purging, all while maintaining referential integrity across complex schemas.

### Migration Challenges

While both Azure Database for PostgreSQL Flexible Server and Oracle support partitioning, their approaches differ significantly. In Azure Database for PostgreSQL Flexible Server, partitions are not created automatically; each partition must be explicitly defined using the CREATE TABLE statement. This means that when setting up a partitioned table, each partition must be created separately, requiring careful planning and implementation.

This blog explores best practices for implementing range partitioning in Azure Database for PostgreSQL Flexible Server, maintaining referential integrity across multiple levels, and leveraging Azure Blob Storage through the azure_storage extension to efficiently archive partitioned data.

### Implementing Partitioning with an Example

In this example, we demonstrate a partitioning strategy within the test_part schema, where a parent table logs and a child table logs_child are structured using range partitioning with a monthly interval. Depending on specific business requirements, the partitioning strategy can also be adjusted to quarterly or yearly intervals to optimize storage and query performance.
```sql
-- Create and set the schema for the session
CREATE SCHEMA test_part;
SET SEARCH_PATH=test_part;

-- Create a partitioned table
CREATE TABLE logs (
    id integer not null,
    log_date date not null,
    message text
) PARTITION BY RANGE (log_date);

-- Add the primary key constraint on the parent partitioned table
ALTER TABLE ONLY logs ADD primary key (id, log_date);

-- Define a partition for each month
CREATE TABLE logs_2024_01 PARTITION OF logs FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE logs_2024_02 PARTITION OF logs FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');

-- Create a child partitioned table
CREATE TABLE logs_child (
    id integer,
    log_date date,
    message text,
    logs_parent_id integer
) PARTITION BY RANGE (log_date);

-- Add constraints
ALTER TABLE ONLY logs_child ADD primary key (id, log_date);
ALTER TABLE logs_child ADD CONSTRAINT logs_child_fk
    FOREIGN KEY (logs_parent_id, log_date) REFERENCES logs (id, log_date) ON DELETE CASCADE;

-- Define a partition for each month
CREATE TABLE logs_child_2024_01 PARTITION OF logs_child FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE logs_child_2024_02 PARTITION OF logs_child FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');

-- Insert data into the parent partitioned table
INSERT INTO logs (id, log_date, message) VALUES (1, '2024-01-15', 'Log message 1');
INSERT INTO logs (id, log_date, message) VALUES (11, '2024-01-15', 'Log message 1');
INSERT INTO logs (id, log_date, message) VALUES (2, '2024-02-15', 'Log message 2');
INSERT INTO logs (id, log_date, message) VALUES (22, '2024-02-15', 'Log message 2');

-- Insert data into the child partitioned table
INSERT INTO logs_child VALUES (1, '2024-01-15', 'Log message 1', 1);
INSERT INTO logs_child VALUES (2, '2024-01-15', 'Log message 1', 1);
INSERT INTO logs_child VALUES (5, '2024-02-15', 'Log message 2', 22);
INSERT INTO logs_child VALUES (6, '2024-02-15', 'Log message 2', 2);

-- Review the data
SELECT * FROM logs;
SELECT * FROM logs_2024_01;
SELECT * FROM logs_2024_02;
SELECT * FROM logs_child_2024_01;
SELECT * FROM logs_child_2024_02;
```
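Because Flexible Server does not create partitions automatically, each future month needs its own CREATE TABLE ... PARTITION OF statement. As a minimal sketch (assuming the test_part.logs table defined above), the monthly partitions can also be generated in a loop rather than written by hand:

```sql
-- Create the next 12 monthly partitions of test_part.logs, if they do not already exist
DO $$
DECLARE
    start_month date := date_trunc('month', current_date);
    part_start  date;
    part_end    date;
    part_name   text;
BEGIN
    FOR i IN 0..11 LOOP
        part_start := start_month + make_interval(months => i);
        part_end   := start_month + make_interval(months => i + 1);
        part_name  := 'logs_' || to_char(part_start, 'YYYY_MM');
        EXECUTE format(
            'CREATE TABLE IF NOT EXISTS test_part.%I PARTITION OF test_part.logs
             FOR VALUES FROM (%L) TO (%L)',
            part_name, part_start, part_end);
    END LOOP;
END $$;
```

The same pattern can be applied to the child table and scheduled ahead of each month, for example with the pg_cron extension.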
### Detach the partition

When detaching partitions, follow this order: first detach the child table partition, then drop the foreign key constraint, and finally detach the parent table partition.

```sql
-- Detach the partitions
ALTER TABLE logs_child DETACH PARTITION logs_child_2024_02;
ALTER TABLE logs_child_2024_02 DROP CONSTRAINT logs_child_fk;
ALTER TABLE logs DETACH PARTITION logs_2024_02;
```

### Archive the partition table in Azure Blob Storage

The following steps demonstrate how to store the detached partition's data in Azure Blob Storage using Microsoft Entra ID for authorization. Create the azure_storage extension by following the steps provided in the link here.

```sql
-- Create the extension
CREATE EXTENSION azure_storage;
SET search_path=azure_storage;

-- Add the storage account (Microsoft Entra ID or authentication-key steps are provided in the reference document linked here)
SELECT * FROM azure_storage.account_options_managed_identity('shaystorage', 'blob');
SELECT * FROM azure_storage.account_add('{
    "auth_type": "managed-identity",
    "account_name": "shayrgstorage",
    "account_type": "blob"
}');

SET SEARCH_PATH=test_part;

-- Export the detached partition to a .csv file in Blob Storage
COPY test_part.logs_child_2024_02
TO 'https://shayrgstorage.blob.core.windows.net/pgtable/logs_child_2024_02.csv'
WITH (FORMAT 'csv', header);
```

### View or load the archived partitioned table

After truncating the data from the partition, the archived data can be viewed directly from the .csv file in Azure Storage.

When the archived data is needed for read-only purposes:

```sql
TRUNCATE TABLE test_part.logs_child_2024_02;
SELECT * FROM test_part.logs_child_2024_02;

SELECT *
FROM azure_storage.blob_get(
    'shayrgstorage',
    'pgtable',
    'logs_child_2024_02.csv',
    NULL::test_part.logs_child_2024_02,
    options := azure_storage.options_csv_get(delimiter := ',', header := 'true')
);
```

When the data needs to be restored for update purposes, load it from the .csv file back into the Azure Database for PostgreSQL flexible server table:

```sql
TRUNCATE TABLE test_part.logs_child_2024_02;

INSERT INTO test_part.logs_child_2024_02 (id, log_date, message, logs_parent_id)
SELECT *
FROM azure_storage.blob_get(
    'shayrgstorage',
    'pgtable',
    'logs_child_2024_02.csv',
    NULL::test_part.logs_child_2024_02,
    options := azure_storage.options_csv_get(delimiter := ',', header := 'true')
);
```

### Attach the partition table

Attach the partitions back so the data is visible in the partitioned tables for operational purposes:

```sql
ALTER TABLE test_part.logs ATTACH PARTITION test_part.logs_2024_02
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');
ALTER TABLE test_part.logs_child ATTACH PARTITION test_part.logs_child_2024_02
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');
```

### Alternative Data Archival Strategies Based on Business Requirements

- Deploy a lower-SKU Azure Database for PostgreSQL server, such as the Burstable or General Purpose service tier.
- Utilize the postgres_fdw extension to move data between tables residing in different PostgreSQL databases or instances (see the sketch after this list).
- Burstable servers are available with up to 64 TB of storage.
- Automate database start/stop processes to minimize expenses when loading or extracting data.
- If the database size is relatively small, consider removing a partition from a partitioned table using the ALTER TABLE ... DETACH PARTITION command, converting it into a standalone table for easier archival.
- Use long-term retention (LTR) options to retain database backups for up to 10 years, depending on the business requirement, and restore them when needed. For more information, review here.
- Utilize an Azure Data Factory (ADF) pipeline to move data into Azure Storage and restore it as needed using automation scripts.
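As referenced in the list above, the following is a minimal postgres_fdw sketch for pushing a detached partition to a separate (for example, lower-SKU) PostgreSQL server. The server name, database, credentials, and remote table names are placeholders:

```sql
-- Connect the current database to a remote archive server
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

CREATE SERVER archive_server
    FOREIGN DATA WRAPPER postgres_fdw
    OPTIONS (host 'archive-pg.postgres.database.azure.com', port '5432', dbname 'archivedb');

CREATE USER MAPPING FOR CURRENT_USER
    SERVER archive_server
    OPTIONS (user 'archive_user', password '<password>');

-- Expose the remote archive table locally (it must already exist with the same structure)
CREATE FOREIGN TABLE test_part.logs_child_2024_02_archive (
    id integer,
    log_date date,
    message text,
    logs_parent_id integer
) SERVER archive_server OPTIONS (schema_name 'test_part', table_name 'logs_child_2024_02');

-- Push the detached partition's rows to the archive server
INSERT INTO test_part.logs_child_2024_02_archive
SELECT * FROM test_part.logs_child_2024_02;
```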
### Feedback and suggestions

If you have feedback or suggestions for improving this data migration asset, please contact the Databases SQL Customer Success Engineering (Ninja) Team (datasqlninja@microsoft.com). Thanks for your support!

Note: For additional information about migrating various source databases to Azure, see the Azure Database Migration Guide.

## Oracle to Azure SQL Post Migration Optimization: Handling Oracle Function-Based Indexes in Azure SQL

### Why Post-Migration Optimization Matters

For customers migrating from Oracle to Azure SQL Database, understanding Azure SQL optimization techniques can make a significant difference in overall database performance. During the application stress testing phase, it is essential to identify query patterns, evaluate index usage, and refine queries to prevent performance bottlenecks and ensure system efficiency. While Azure SQL Database's Intelligent Query Processing (IQP) can significantly improve performance with minimal intervention, certain complex scenarios may still require manual query rewrites to achieve the desired level of performance.

### Leveraging SARGable Queries in SQL Server

One such optimization technique involves ensuring that queries in SQL Server are SARGable (Search ARGument-able). SARGable queries enable the SQL engine to efficiently leverage indexes, significantly reducing resource consumption and execution time. For instance, small adjustments such as replacing column transformations with range-based conditions, creating computed columns, and adding the required indexes on computed columns can have a significant impact on query performance.

| Metric | Non-SARGable Query | SARGable Query |
|---|---|---|
| Execution Plan | Index Scan (usually less efficient) | Index Seek (usually more efficient) |
| IO Cost | High (reads many pages) | Low (reads fewer pages) |
| CPU Time | Higher (function evaluation for rows) | Lower (direct use of index) |
| Logical Reads | High (entire index/table scanned) | Low (only relevant index pages accessed) |

### Handling Function-Based Indexes: Oracle vs SQL Server

Oracle: In Oracle, function-based indexes allow indexing on the result of a function or expression. This enables efficient query execution when functions or expressions are used in WHERE or JOIN conditions. The TRUNC(APPLN_START_DT) expression is precomputed and stored in the index, so queries can get the value of the expression from the index instead of computing it. The following is an example of this in practice:

```sql
-- Create a function-based index on the TRUNC(APPLN_START_DT) expression
CREATE INDEX idx_appln_start_dt ON appln_data ( trunc(appln_start_dt) );

-- Filter rows where the year of APPLN_START_DT matches the current year
SELECT *
FROM appln_data
WHERE trunc(appln_start_dt, 'YYYY') = trunc(sysdate, 'YYYY');
```

SQL Server: SQL Server takes a slightly different approach to handling indexes and query optimization, particularly when working with large datasets or tables containing millions of rows. The SQL optimizer excels when queries are SARGable, meaning the predicates are written to allow direct index utilization without applying transformations to columns. Let's analyze two versions of a query on a table [APPLN_DATA] with 50 million rows to understand how query formulation impacts performance, resource utilization, and execution efficiency.

### Comparing SARGable and Non-SARGable Queries

Non-SARGable query:

```sql
-- Filter rows where the year of APPLN_START_DT matches the current year
SELECT *
FROM [dbo].[APPLN_DATA]
WHERE YEAR(APPLN_START_DT) = YEAR(GETDATE());
```

- Execution plan: Index Scan
- Logical reads: 928,804
- CPU time: 7,562 ms
- Physical reads: 2
- Elapsed time: 63,130 ms

(Execution plan screenshot showing the Index Scan.)

Why it's inefficient: The YEAR() function transforms the APPLN_START_DT column, which prevents the optimizer from performing an Index Seek and forces an Index Scan.
The execution plan confirms this by showing an Index Scan, where SQL Server reads through all rows instead of directly seeking the relevant ones using the index. This behavior increases resource consumption, including logical reads, CPU usage, I/O costs, and overall query execution time.

SARGable query:

```sql
SELECT *
FROM [dbo].[APPLN_DATA]
WHERE APPLN_START_DT >= CAST(YEAR(GETDATE()) AS CHAR(4)) + '-01-01'
  AND APPLN_START_DT <  CAST(YEAR(GETDATE()) + 1 AS CHAR(4)) + '-01-01';
```

- Execution plan: Index Seek
- Logical reads: 173,041
- CPU time: 3,547 ms
- Elapsed time: 52,092 ms

(Execution plan screenshot showing the Index Seek.)

Why it's better: The query uses range-based conditions (>= and <) directly on the APPLN_START_DT column without applying any functions. This makes the query SARGable, allowing SQL Server to efficiently perform an Index Seek, as shown in the execution plan. An Index Seek ensures that SQL Server retrieves only the relevant rows, significantly reducing logical reads, CPU usage, and overall query execution time.

### Understanding the Key Differences and Impact of SARGable and Non-SARGable Queries

| Aspect | Non-SARGable | SARGable |
|---|---|---|
| Query example | WHERE YEAR(APPLN_START_DT) = YEAR(GETDATE()) | WHERE APPLN_START_DT >= '2024-01-01' |
| Index usage | Index Scan (due to the YEAR() function; inefficient) | Index Seek (efficient) |
| Logical reads | High (entire index/table scanned) | Low (only relevant index pages accessed) |
| CPU time | High | Low |
| Elapsed time | Longer | Shorter |
| Query optimization | Applies a function to the column | Uses range conditions without transformations |

### Additional Optimization Techniques

- A computed column with an index was not used here because the query retrieves rows for an entire year, making range-based conditions more efficient. Computed-column indexes are more suitable for scenarios with specific transformations on smaller subsets of data, such as calculations and filtering for a particular month or day.
- In our example, we use SELECT * for simplicity, but in real-world workloads, queries typically retrieve specific columns based on application requirements. Selecting only the required columns reduces I/O and improves query performance. Additionally, covering indexes (indexes that include all queried columns) should be considered to eliminate key lookups, further enhancing performance for frequently accessed queries.
- If your table has billions of rows, even efficient Index Seeks may take considerable time. Filtered indexes, indexed views, and partitioning the table based on a logical key (e.g., date ranges) can help improve query performance further.

### Key Takeaways for a Successful Heterogeneous Migration

- Oracle's function-based indexes can be adapted in SQL Server using range-based conditions, computed columns with indexes, indexed views, or application-level optimizations (a computed-column sketch follows this list).
- Aim to write SARGable queries to leverage SQL Server's indexing capabilities effectively. Avoid using functions in query predicates.
- Evaluate and consider the use of indexed views and filtered indexes.
- It is crucial to iteratively update index statistics, evaluate index access patterns, review histograms, monitor query plans and fragmentation, and rebuild indexes during the application stress testing phase to achieve optimal performance.
- Collaborate with application developers to refactor query logic when needed.
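As referenced in the takeaways above, here is a minimal sketch of adapting an Oracle function-based index with a persisted computed column and an index on it. It reuses the APPLN_DATA example from this article; the computed column and index names are illustrative, and it assumes APPLN_START_DT is a date or datetime column. This approach suits predicates on a smaller, specific slice such as a single month:

```sql
-- Persist the month-truncated date so it can be indexed
ALTER TABLE dbo.APPLN_DATA
    ADD APPLN_START_MONTH AS DATEFROMPARTS(YEAR(APPLN_START_DT), MONTH(APPLN_START_DT), 1) PERSISTED;

CREATE NONCLUSTERED INDEX IX_APPLN_DATA_START_MONTH
    ON dbo.APPLN_DATA (APPLN_START_MONTH);

-- The optimizer can now seek on the computed column instead of scanning the table
SELECT *
FROM dbo.APPLN_DATA
WHERE APPLN_START_MONTH = DATEFROMPARTS(YEAR(GETDATE()), MONTH(GETDATE()), 1);
```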
### References

- Intelligent query processing details - SQL Server | Microsoft Learn
- Create indexed views - SQL Server | Microsoft Learn
- Create filtered indexes - SQL Server | Microsoft Learn
- Migrating Oracle Databases to SQL Server (OracleToSQL) - SQL Server | Microsoft Learn
- Automatic tuning - SQL Server | Microsoft Learn

### Feedback and suggestions

If you have feedback or suggestions for improving this data migration asset, please contact the Databases SQL Ninja Engineering Team (datasqlninja@microsoft.com). Thanks for your support!

Note: For additional information about migrating various source databases to Azure, see the Azure Database Migration Guide.

## Resolving ADF Copy Upsert Failures in SQL DB for XML Columns
Customers modernizing their Oracle, mainframe/midrange Db2, Sybase, and other databases to Azure SQL often use Azure Data Factory (ADF) for data transfer. For large datasets, an initial copy is made a few days prior, followed by incremental data copies during cutover. The ADF Copy activity's Upsert method is common for these incremental transfers, but this method fails for tables that contain XML columns. This blog explains how to use a stored procedure as the sink behavior in the Copy activity to perform the upsert into Azure SQL Database. The same method can be used even when the target is Azure SQL MI or Azure SQL VM.
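As an illustrative sketch of the stored-procedure sink approach (the table, table type, and column names below are hypothetical), the Copy activity passes the incremental rows to the procedure as a table-valued parameter, and the procedure performs the upsert by key, so the XML column is only written, never compared:

```sql
-- Hypothetical target table dbo.Orders(OrderID INT PK, CustomerID INT, OrderDetails XML)
CREATE TYPE dbo.OrdersType AS TABLE
(
    OrderID      INT NOT NULL,
    CustomerID   INT NULL,
    OrderDetails XML NULL
);
GO
CREATE OR ALTER PROCEDURE dbo.usp_UpsertOrders
    @Orders dbo.OrdersType READONLY
AS
BEGIN
    SET NOCOUNT ON;

    -- Update existing rows by key; the XML column is simply overwritten
    UPDATE t
       SET t.CustomerID   = s.CustomerID,
           t.OrderDetails = s.OrderDetails
      FROM dbo.Orders AS t
      JOIN @Orders    AS s ON s.OrderID = t.OrderID;

    -- Insert rows that do not exist yet
    INSERT INTO dbo.Orders (OrderID, CustomerID, OrderDetails)
    SELECT s.OrderID, s.CustomerID, s.OrderDetails
      FROM @Orders AS s
     WHERE NOT EXISTS (SELECT 1 FROM dbo.Orders AS t WHERE t.OrderID = s.OrderID);
END;
```

In the Copy activity sink settings, the stored procedure and its table type are then referenced in place of the built-in Upsert option.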
## Oracle to Azure PostgreSQL - Flexible Server Incremental Load Using ADF Data Flows - Part 1

This blog post provides a step-by-step guide for incremental data load from Oracle to Azure Database for PostgreSQL – Flexible Server using Azure Data Factory data flows. With this solution, no additional setup is required in the source Oracle database, as is typically needed with LogMiner or a traditional change data capture (CDC) tool.

## Accelerating Oracle Estate Assessment for Migrations to SQL Server
Collecting information to perform migration assessments for many servers is a painful manual effort. To alleviate some of this pain, we have come up with PowerShell scripts that run our PreSSMA quick assessment script or perform a full SSMA project collection from a spreadsheet-driven list of Oracle servers. Included is a SQL script that provides a rough, high-level estimate of the conversion effort using the collected PreSSMA information.

## Key Considerations to avoid Implicit Conversion issues in Oracle to Azure SQL Modernization
### Overview

This blog dives into the mechanics of implicit data type conversions and their impact during the post-migration performance optimization phase of heterogeneous database migrations. Drawing from engineering field patterns observed across diverse application architectures, it explores why certain platforms such as ORMs, JDBC drivers, and cross-platform data models are more prone to implicit conversions than others, and how these conversions result in performance issues, broken indexes, or query regressions. You'll gain actionable strategies to detect, mitigate, and design around these pitfalls to ensure a successful and performant data migration to platforms like Azure SQL.

### Understanding Implicit Conversions in Database Migrations

When an application communicates with a database, whether through JDBC, ODBC, ADO.NET, or any other data access API, it sends parameters and query values in a specific data type format. However, if the type of the value provided by the application does not exactly match the data type of the target column in the database, the database engine attempts to reconcile the mismatch automatically. This automatic adjustment is known as implicit conversion.

For instance, a value passed as a string from the application may be compared against a numeric or date column in the database. This occurs because many front-end systems and APIs transmit values as strings by default, even if the underlying business logic expects numbers or dates. Unless the application explicitly parses or casts these values to match the expected types, the database engine must decide how to handle the type mismatch during query execution. In such cases, the engine applies type conversion internally, either to the parameter or the column, based on its own rules. While this feature can simplify application development by allowing flexible data handling, it often introduces engine-specific behavior that becomes more visible during cross-engine database migrations, where assumptions built into one system may not hold true in another.

### Impact of Implicit Conversions

Implicit conversions can adversely affect database performance and functionality in several ways, some of which are discussed below:

- Performance degradation: When a database performs an implicit conversion, it may bypass indexes, resulting in slower query execution. For example, comparing a VARCHAR column to an INT value in SQL Server can trigger a table scan instead of an index seek, significantly increasing query time.
- Migration-specific issues and data integrity risks: Implicit conversions can cause data loss or incorrect results in some cases. One example is when a column defined as VARCHAR2 in Oracle, which can store Unicode characters by default, is mapped to a VARCHAR column in SQL Server: non-ASCII characters such as Chinese, Russian, or Korean may be silently replaced with incorrect characters or symbols. This can happen because Oracle VARCHAR2 stores Unicode if the database character set is UTF-8 (AL32UTF8), which is common in modern Oracle installations, while SQL Server VARCHAR is ANSI/code-page based, so non-ASCII characters are stored differently unless the column is explicitly declared as NVARCHAR. In the real world, this can happen with other data types as well.
- Maintenance challenges: Queries relying on implicit conversions are harder to debug and optimize, as these conversions are not explicitly visible in the code and may only surface during performance regressions.
- Execution overhead and resource consumption: These queries force the optimizer to compile execution plans containing scans of large clustered indexes or tables instead of seeks, resulting in degraded performance. Implicit conversions also increase execution times for both queries and API calls, as the engine must perform runtime casting operations. This can lead to higher CPU usage, increased logical reads, and memory pressure.

### Detection Methods

Detecting implicit conversions is crucial for optimizing database performance post-migration. The following methods can be employed:

- Query Store (QDS): Use QDS post-migration during load testing to track expensive queries based on cost and to surface performance regressions caused by type mismatches. Review execution plans captured in QDS for conversion-related patterns. You can also use a custom script like the one below to query QDS:

```sql
USE <[Replace_with_actual_DB_name]> -- Replace with the actual database name
GO
SELECT TOP (100)
    DB_NAME() AS [Database],
    qt.query_sql_text AS [Consulta],
    rs.last_execution_time AS [Last Execution Time],
    rs.avg_cpu_time AS [Avg Worker Time],
    rs.max_cpu_time AS [Max Worker Time],
    rs.avg_duration AS [Avg Elapsed Time],
    rs.max_duration AS [Max Elapsed Time],
    rs.avg_logical_io_reads AS [Avg Logical Reads],
    rs.max_logical_io_reads AS [Max Logical Reads],
    rs.count_executions AS [Execution Count],
    q.last_compile_start_time AS [Creation Time],
    CAST(p.query_plan AS XML) AS [Query Plan]
FROM sys.query_store_query_text AS qt
JOIN sys.query_store_query AS q ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan AS p ON q.query_id = p.query_id
JOIN sys.query_store_runtime_stats AS rs ON p.plan_id = rs.plan_id
WHERE CAST(p.query_plan AS NVARCHAR(MAX)) LIKE '%CONVERT_IMPLICIT%'
  AND qt.query_sql_text NOT LIKE '%sys.query_store%'
ORDER BY rs.avg_cpu_time DESC;
```

- Execution plans: For the expensive queries, in SSMS, hover over operators like Index Scan to inspect the Predicate. If an implicit conversion exists, the plan includes something like `CONVERT_IMPLICIT(<data_type>, ..`.
- XML plan: To confirm the above, reviewing the underlying XML execution plan shows whether implicit conversion is occurring and on which side of the comparison. This technique is particularly valuable when working with parameterized queries or when graphical plan warnings are insufficient. Look for elements like the following in the XML plan:

```xml
<Warnings>
  <PlanAffectingConvert ConvertIssue="Seek Plan" Expression="CONVERT_IMPLICIT(.. />
</Warnings>
```

- Plan cache inspection: Custom scripts can be written to scan the Azure SQL plan cache for any instances of CONVERT_IMPLICIT operations. Below is one such script.
```sql
SELECT TOP (100)
    DB_NAME(B.[dbid]) AS [Database],
    B.[text] AS [SQL_text],
    A.total_worker_time AS [Total Worker Time],
    A.total_worker_time / A.execution_count AS [Avg Worker Time],
    A.max_worker_time AS [Max Worker Time],
    A.total_elapsed_time / A.execution_count AS [Avg Elapsed Time],
    A.max_elapsed_time AS [Max Elapsed Time],
    A.total_logical_reads / A.execution_count AS [Avg Logical Reads],
    A.max_logical_reads AS [Max Logical Reads],
    A.execution_count AS [Execution Count],
    A.creation_time AS [Creation Time],
    C.query_plan AS [Query Plan]
FROM sys.dm_exec_query_stats AS A WITH (NOLOCK)
CROSS APPLY sys.dm_exec_sql_text(A.plan_handle) AS B
CROSS APPLY sys.dm_exec_query_plan(A.plan_handle) AS C
WHERE CAST(C.query_plan AS NVARCHAR(MAX)) LIKE '%CONVERT_IMPLICIT%'
  AND B.[dbid] = DB_ID()
  AND B.[text] NOT LIKE '%sys.dm_exec_sql_text%'
ORDER BY A.total_worker_time DESC;
```

- Extended Events (XE): Extended Events are valuable in support scenarios when Query Store or telemetry data alone can't pinpoint issues like implicit conversions, especially if plans aren't cached or historical data lacks detail. XE provides real-time capture of plan-affecting convert events, offering granular insight into query behavior that Query Store might miss during short-lived or dynamic workloads. However, use it sparingly due to overhead, as a targeted diagnostic tool rather than a broad solution. You can use the script below to set up the session; remember to stop and drop the event session as soon as you are done collecting.

```sql
IF EXISTS (SELECT * FROM sys.server_event_sessions WHERE name = 'Detect_Conversion_Performance_Issues')
    DROP EVENT SESSION [Detect_Conversion_Performance_Issues] ON SERVER;
GO
CREATE EVENT SESSION [Detect_Conversion_Performance_Issues] ON SERVER
ADD EVENT sqlserver.plan_affecting_convert(
    ACTION(sqlserver.database_name, sqlserver.sql_text)
    WHERE ([sqlserver].[database_name] = N'<Replace_with_your_DB_name>') -- Replace with your DB name
)
ADD TARGET package0.ring_buffer
WITH (
    MAX_MEMORY = 4096 KB,
    EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS,
    MAX_DISPATCH_LATENCY = 30 SECONDS,
    MEMORY_PARTITION_MODE = NONE,
    TRACK_CAUSALITY = OFF,
    STARTUP_STATE = OFF
);
GO
ALTER EVENT SESSION [Detect_Conversion_Performance_Issues] ON SERVER STATE = START;

-- View the raw Extended Events buffer
SELECT
    s.name AS session_name,
    t.target_name,
    CAST(t.target_data AS XML) AS raw_buffer_xml
FROM sys.dm_xe_sessions s
JOIN sys.dm_xe_session_targets t ON s.address = t.event_session_address
WHERE s.name = 'Detect_Conversion_Performance_Issues';
```
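The ring-buffer query above returns the raw XML. As a small follow-on sketch (assuming the same session name), the captured events can be shredded into rows for easier review; the data field names used here should be verified against sys.dm_xe_object_columns for the plan_affecting_convert event:

```sql
-- Shred the ring buffer captured by the session above into one row per event
SELECT
    xe.event_xml.value('(@timestamp)[1]', 'datetime2')                           AS event_time,
    xe.event_xml.value('(action[@name="database_name"]/value)[1]', 'nvarchar(128)') AS database_name,
    xe.event_xml.value('(data[@name="convert_issue"]/text)[1]', 'nvarchar(60)')  AS convert_issue,
    xe.event_xml.value('(data[@name="expression"]/value)[1]', 'nvarchar(max)')   AS expression,
    xe.event_xml.value('(action[@name="sql_text"]/value)[1]', 'nvarchar(max)')   AS sql_text
FROM (
    SELECT CAST(t.target_data AS XML) AS raw_buffer_xml
    FROM sys.dm_xe_sessions s
    JOIN sys.dm_xe_session_targets t ON s.address = t.event_session_address
    WHERE s.name = 'Detect_Conversion_Performance_Issues'
) AS buf
CROSS APPLY buf.raw_buffer_xml.nodes('/RingBufferTarget/event') AS xe(event_xml);
```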
- Documentation reference: Microsoft docs on conversion and precedence help explain engine behavior and mappings around implicit conversion triggers. Reviewing them closely together with application developers during the schema/code conversion phase aids understanding and mitigation.

### Implicit Conversion: Real-World Example

To evaluate the impact of implicit conversions in Azure SQL during post-migration scenarios, we created a synthetic workload using a table named dbo.Customers. It contains one million rows and includes columns such as AccountNumber, CustomerName, PhoneNumber, and JoinDate. The AccountNumber, CustomerName, and PhoneNumber columns were initially defined as VARCHAR, and nonclustered indexes were created on these fields to enable efficient lookups. From the application layer, parameters were passed as NVARCHAR, which mirrors typical real-world ORM behavior, particularly in Java-based applications or when migrating from Oracle, where VARCHAR2 frequently stores Unicode characters.

This deliberate mismatch allows us to study the real performance consequences of implicit conversions in Azure SQL's query execution engine. Although enabling SET STATISTICS XML ON can expose implicit conversions during query execution, our approach reflects how these issues are usually uncovered in real-world scenarios where customers are less aware of the problem. In this case, we used Query Store and execution plan XML inspection.

#### Problem: Implicit Conversion Due to Type Mismatch

An NVARCHAR parameter from the application is compared against a VARCHAR column in the database. This scenario highlights a silent performance regression that can go unnoticed post-migration without detailed plan inspection.

Query used:

```sql
DECLARE @acc NVARCHAR(20) = N'ACC000500';

SELECT CustomerID, AccountNumber, CustomerName, PhoneNumber
FROM dbo.Customers
WHERE AccountNumber = @acc;
```

Execution plan behavior (Fig 1: execution plan screenshot):

- SQL Server applies an implicit CONVERT_IMPLICIT(nvarchar, AccountNumber) on the column side, as can be seen in Fig 1 when you hover over the Index Scan and inspect the Predicate.
- This disables the use of the nonclustered index on AccountNumber, leading to an Index Scan.
- The XML plan includes a <Warning> tag under <PlanAffectingConvert>.
- Extended Events monitoring consistently shows "plan_affecting_convert" warnings, indicating suboptimal query plans caused by these conversions.

What's missing:

- Type alignment between the query parameter and the column.
- Awareness that even matching string lengths won't help if encoding mismatches exist.

Impact:

- The Index Seek is lost, and full scans are triggered.
- Higher execution times and overall costs are observed.

#### Mitigation via Explicit CAST – Matching the Column's Type

In some cases, especially during post-migration tuning, application teams may not be able to change the database schema, but developers can update the query to explicitly align data types. This scenario simulates such a mitigation, where an NVARCHAR parameter is explicitly cast to VARCHAR to match the column's data type and avoid implicit conversions.

Query used:

```sql
DECLARE @acc NVARCHAR(20) = N'ACC000500';

SELECT CustomerID, AccountNumber, CustomerName, PhoneNumber
FROM dbo.Customers
WHERE AccountNumber = CAST(@acc AS VARCHAR(20)); -- Explicit use of CAST
```

Execution plan behavior (Fig 2: execution plan screenshot):

- The CAST operation ensures that the parameter side matches the VARCHAR column type.
- SQL Server performs an Index Seek on the IX_AccountNumber index. The Seek Predicates shown in Fig 2 confirm this, showing "Scalar Operator(CONVERT..".
- No <Warning> tag appears in the XML execution plan, indicating the absence of implicit conversions.

What's fixed:

- The type mismatch is resolved on the query side without altering the database schema.
- The query is now SARGable, enabling index usage.

Impact:

- The Index Seek is restored and full scans are avoided.
- Lower execution times and overall costs are observed.

What's still missing:

- This still creates a long-term maintainability concern, especially when many queries or columns are affected.
- Developers must remember to manually CAST in every affected query, increasing code complexity and the chance of inconsistency.
- Missed CASTs in other queries can still cause implicit conversions, so the issue isn't eliminated, just patched locally.

#### Fix at the DB End – Schema Column Type Aligned with Parameter Usage

This fix involves altering the column type to NVARCHAR, aligning it with the NVARCHAR parameter passed from the application. It eliminates implicit conversions and enables index seeks, improving performance.
However, this is a database-side adjustment; the ideal long-term fix lies in ensuring the application sends parameters that match the original column type.

Query used (with the column now altered to NVARCHAR, the parameter can be compared directly, without a CAST):

```sql
DECLARE @acc NVARCHAR(20) = N'ACC000500';

SELECT CustomerID, AccountNumber, CustomerName, PhoneNumber
FROM dbo.Customers
WHERE AccountNumber = @acc;
```

Execution plan behavior (Fig 3: execution plan screenshot):

- With the column and parameter types aligned, no conversion is applied on the column side.
- As seen in Fig 3, an Index Seek is performed on the updated IX_AccountNumber index. The Seek Predicates confirm this, showing "Scalar Operator(..".
- No <Warning> tag appears in the XML execution plan, indicating the absence of implicit conversions.

What's fixed:

- The fix is schema-driven and works universally, ensuring consistent performance across tools and interfaces.
- Encoding alignment between the parameter and column removes the conversion logic entirely, making query plans stable and predictable.

Impact:

- Indexes remain fully usable without manual intervention in queries.
- Application code stays clean; no casts or workarounds are needed.
- This is the most sustainable fix but may require coordination between application and DB teams.

What's still missing:

- Data type changes have implications of their own, chiefly additional storage: NVARCHAR consumes 2 bytes per character, increasing storage compared to VARCHAR.
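A minimal sketch of the schema-side change described above, assuming the dbo.Customers table and IX_AccountNumber index used in this example; verify the column length and NOT NULL setting against your own schema before applying it:

```sql
-- The index must be dropped and recreated around the type change
DROP INDEX IX_AccountNumber ON dbo.Customers;

ALTER TABLE dbo.Customers
    ALTER COLUMN AccountNumber NVARCHAR(20) NOT NULL;  -- keep the column's original nullability

CREATE NONCLUSTERED INDEX IX_AccountNumber
    ON dbo.Customers (AccountNumber);
```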
### Implicit vs Explicit vs Aligned: Execution Plan Behavior Comparison

| Scenario | Predicate expression in exec plan | Implicit conversion | Index Seek used | XML plan warning shown |
|---|---|---|---|---|
| Data type mismatch | CONVERT_IMPLICIT(nvarchar, AccountNumber) = @acc | Yes | No (results in a scan) | PlanAffectingConvert |
| Explicit CAST in query | AccountNumber = CONVERT(varchar, @acc) | No | Yes | No |
| Matching data types (NVARCHAR) | AccountNumber = @acc | No | Yes | No |

### Best Practices for Managing Implicit Conversions

- Refactor the application: Legacy systems, especially those using dynamic SQL or lacking strict type enforcement, are prone to implicit conversion issues. Refactor your application code to leverage strongly typed variables and parameter declarations to ensure data type consistency at the source, minimizing implicit conversions during query execution.
- Explicit data type casting: Use CAST or CONVERT functions to explicitly define conversions, reducing reliance on implicit behavior. In our example we used CAST, but a CONVERT function would have worked equally well. Both approaches explicitly align the parameter type to the column and avoid implicit conversions, enabling an index seek.
- Data type alignment: When performing heterogeneous migrations that involve different database engines, ensure data types are consistent between the source and target engines. Check the official documentation thoroughly to understand the nuances of your data and application convertibility, and know the implications, such as additional storage and collation changes, that can negatively affect your business.
- Indexing: Create indexes on columns frequently involved in WHERE filters and JOIN predicates with matching data types. This avoids implicit conversions that would cause index seeks to degrade into scans and ensures optimal index utilization by the optimizer.
- Early testing: Conduct thorough post-migration testing using QDS to identify expensive queries, then drill down into execution plans and performance metrics to identify and resolve conversion-related issues. Early collaboration between developer and DBA teams is crucial.
- Tools and scripts: Utilize SQL Server Migration Assistant (SSMA) for Oracle to identify and change type mappings early, when you know your application's needs. Additionally, you can use custom scripts or third-party tools if necessary to detect implicit conversions in the plan cache.

### References

- https://learn.microsoft.com/sql/t-sql/data-types/data-type-conversion-database-engine
- https://learn.microsoft.com/sql/t-sql/data-types/data-type-precedence-transact-sql

### Final Thoughts

We hope this post has helped you gain actionable strategies to detect, mitigate, and design around implicit conversions in order to ensure a successful and performant data migration to platforms such as SQL Server or Azure SQL. If you have feedback or suggestions for improving this post, please contact the Azure Databases SQL Customer Success Engineering Team. Thanks for your support!