Forum Discussion

jasleen13
Copper Contributor
Nov 25, 2020

Spark pool taking more than 10 minutes to start

I have created 3 notebooks with PySpark selected as the language.

The Spark pool takes more than 10 minutes to start when a notebook begins executing.

When I run these notebooks sequentially, the Spark pool starts and stops each time, even though the setting to pause the Spark pool is 60 minutes.

Please let me know if there is any workaround to speed up the Spark pool startup, and whether the same Spark pool can be used by all 3 notebooks without stopping in between.

5 Replies

  • azdelta2022
    Copper Contributor
    I don't have a direct answer for this, because we used Databricks for Spark. Can you create a Databricks cluster, invoke the job from a Synapse pipeline, and check? In Databricks you can use the "cluster pool" concept. Check and let us know.
    • kashif_m
      Copper Contributor
      Hi azdelta2022,

      If we call Databricks from Synapse, then there is no benefit to using the Synapse Spark pool, or even Synapse itself.
      • azdelta2022
        Copper Contributor
        Hi kashif_m, can you try using Databricks first and see whether you are able to "warm" the instance?
  • Meister1867
    Copper Contributor

    I am wondering the same thing as kashif_m. I know Azure Synapse Spark pools do not work like Databricks does, but this can get painful real quick.
