Serverless query went from taking 19 minutes to 6+ hours


I have a weird issue where a simple query that used to take 19 minutes now runs for over 6 hours, so long, in fact, that it never completes successfully. It was running successfully until about a week ago. There were no changes to the underlying data, and the query is identical. I also haven't made any changes to settings in the workspace or on the blob storage.

 

The underlying data is 21 TB of Parquet files stored in blob storage in the same region. Each file is no larger than 500 MB.

 

The query structure is similar to the one below.

 

SELECT
    *
FROM
    OPENROWSET(
        BULK 'mypath/**',
        FORMAT = 'PARQUET'
    ) AS [result]
WHERE id IN
(
    SELECT
        DISTINCT id
    FROM
        OPENROWSET(
            BULK 'mypath/**',
            FORMAT = 'PARQUET'
        ) AS [result]
    -- BETWEEN expects the lower bound first, so -105 comes before -104
    WHERE number BETWEEN 38 AND 40 AND number2 BETWEEN -105 AND -104
)

 

I have tried running the subquery on its own (it never completes) and switching the WHERE condition to a text-based column, which also never completes. I'm pulling my hair out at this point because the query had been running successfully for several weeks.
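For reference, the standalone subquery test looks roughly like the sketch below. It uses the same placeholder path and column names as the query above, and the BETWEEN bounds are written in ascending order, since T-SQL's BETWEEN matches no rows when the first bound is larger than the second.

```sql
-- Inner query run on its own, to check whether the filter scan alone is the bottleneck.
-- 'mypath/**', id, number and number2 are placeholders, as in the full query.
SELECT DISTINCT
    id
FROM
    OPENROWSET(
        BULK 'mypath/**',
        FORMAT = 'PARQUET'
    ) AS [result]
WHERE number BETWEEN 38 AND 40
  AND number2 BETWEEN -105 AND -104;
```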

 

 

 
