Forum Discussion

Roelemanski's avatar
Roelemanski
Copper Contributor
Apr 30, 2024

Delta Query Performance in Azure Synapse Serverless: Issues with Response Times using Serverless SQL

We are encountering significant performance issues with Delta query execution in Azure Synapse Serverless across multiple client setups. Specifically, we utilize views built on Delta tables within Synapse Serverless to deliver data via a Synapse on-demand SQL endpoint. When end users execute queries involving simple joins at this endpoint, we observe that the response times are 7-8 times slower compared to equivalent views created over Parquet files.


Efforts to optimize have included optimizing & vacuuming the Delta tables and explicitly defining data types during view creation, which resulted in only marginal improvements. We suspect the issue might be related to Synapse utilizing an outdated version of the Delta reader.

 

For comparison, using a Databricks SQL Warehouse dramatically enhances the query response times. However, this approach does not support querying via SQL Server Management Studio, which is a significant limitation for our end users.

 

Has anyone else experienced similar performance challenges with Delta in Synapse Serverless? Any recommendations or workarounds would be greatly appreciated. Also, maintaining duplicate datasets in both Delta and Parquet formats solely to support SQL endpoints feels inefficient.

 

Thoughts on this?

 

2 Replies

    • Roelemanski's avatar
      Roelemanski
      Copper Contributor

      BrightSpark1225 thank you for your response! I'm quite interested in hearing about your findings. We're contemplating creating an additional Parquet and using the overwrite method to store it in a '_parquet' subfolder within the Delta table. This way, the folder would be overlooked by Delta but could still be utilized to function as views.

Resources