Forum Discussion
CTAS against external table running EXTREMELY slow
I have two tables of similar size (5 GB each) sitting in 120 (60 files for each table) parquet.snappy files in a storage account.
I can create an external table against both of these, however, when I run a CTAS statement, the external table with 20 columns takes about 5 minutes to complete.
The external table with 165 columns has been running for over 5 hours, and has yet to complete.
I'm using an Azure Synapse Dedicated SQL pool and my CTAS looks like this:
CREATE TABLE [MySchema_20211122].[MyTable_rehydrated_from_cold_storage]
WITH
(
DISTRIBUTION = HASH ( [PrimaryID] ),
CLUSTERED COLUMNSTORE INDEX
)
AS SELECT * FROM MySchema_20211122MyTable_cold_storage_archive ;
Could it be the columnstore index creation that is taking forever?
ctech1320 - I changed the CLUSTERED COLUMNSTORE INDEX to CLUSTERED INDEX([CustomerID]) and it ran in 15 minutes. I think that was it.