SOLVED

CTAS against external table running EXTREMELY slow

Copper Contributor

I have two tables of similar size (5 GB each) sitting in 120 (60 files for each table) parquet.snappy files in a storage account.

 

I can create an external table against both of these, however, when I run a CTAS statement, the external table with 20 columns takes about 5 minutes to complete.

 

The external table with 165 columns has been running for over 5 hours, and has yet to complete.

 

I'm using an Azure Synapse Dedicated SQL pool and my CTAS looks like this:

 

CREATE TABLE [MySchema_20211122].[MyTable_rehydrated_from_cold_storage]
WITH
(
DISTRIBUTION = HASH ( [PrimaryID] ),
CLUSTERED COLUMNSTORE INDEX
)

AS SELECT * FROM MySchema_20211122MyTable_cold_storage_archive ;

 

Could it be the columnstore index creation that is taking forever?

1 Reply
best response confirmed by ctech1320 (Copper Contributor)
Solution

@ctech1320 - I changed the CLUSTERED COLUMNSTORE INDEX to CLUSTERED INDEX([CustomerID]) and it ran in 15 minutes.  I think that was it.

1 best response

Accepted Solutions
best response confirmed by ctech1320 (Copper Contributor)
Solution

@ctech1320 - I changed the CLUSTERED COLUMNSTORE INDEX to CLUSTERED INDEX([CustomerID]) and it ran in 15 minutes.  I think that was it.

View solution in original post