Forum Discussion

dannybhar
Copper Contributor
Oct 14, 2021

geospatial data

Hi

I have an application, written by a third party, that works with geospatial data, and I need to rearchitect it to scale on Azure. It is written in Python with GeoPandas and Dask. A full-data run takes more than 10 hours on a box with around 100 CPUs. I would like to rearchitect it as PaaS. What are the best options for me? I know Azure Batch is one option. What other options do I have to make it autoscalable?

I read somewhere about running Dask on a Kubernetes cluster, so could AKS be an option too?

Has anybody tried it? I'd like to know the challenges.
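To make the workload concrete, here is a minimal, stdlib-only sketch of the pattern being scaled: split a large area (the 5×5 km square mentioned below) into tiles and fan the CPU-heavy per-tile work out across all local cores. The names `tile_area` and `process_tile` and the stand-in computation are hypothetical; the real application does this with GeoPandas, and Dask distributes the same pattern across a cluster (e.g. on AKS) instead of one machine.

```python
from concurrent.futures import ProcessPoolExecutor
from itertools import product

TILE_M = 1000  # tile size in metres (1 km tiles over a 5x5 km square)

def tile_area(width_m: int, height_m: int, tile_m: int = TILE_M):
    """Yield (x0, y0, x1, y1) bounding boxes covering the area."""
    for x0, y0 in product(range(0, width_m, tile_m),
                          range(0, height_m, tile_m)):
        yield (x0, y0, min(x0 + tile_m, width_m), min(y0 + tile_m, height_m))

def process_tile(bbox):
    """Stand-in for the CPU-heavy per-tile computation (e.g. raster maths)."""
    x0, y0, x1, y1 = bbox
    return (x1 - x0) * (y1 - y0)  # here: just the tile's area in m^2

def run(width_m: int = 5000, height_m: int = 5000):
    tiles = list(tile_area(width_m, height_m))
    # Each tile is independent, so they parallelise trivially —
    # locally via processes, or on a cluster via Dask futures.
    with ProcessPoolExecutor() as pool:
        return sum(pool.map(process_tile, tiles))

if __name__ == "__main__":
    print(run())
```

Because the tiles are independent, any backend that can run many workers (Batch, AKS with dask-kubernetes, VM Scale Sets) can scale this shape of job.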

4 Replies

  • fnealon
    Copper Contributor
    Can you move the geospatial data to Azure Cosmos DB (Gremlin API)?
    • dannybhar
      Copper Contributor

      What advantage would that give? The application is more of a scientific-computing workload, which is primarily CPU intensive. For example, it processes data at centimetre resolution, and you need to compute over a few square kilometres, say a 5×5 km square.

      • fnealon
        Copper Contributor

        dannybhar 

        Scalability for your data. If your application can be hosted on a web platform, use App Service. Azure Kubernetes Service, and Kubernetes in general, has a fairly steep learning curve. You could also use VM Scale Sets (VMSS) to get up and running quickly.
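        As a rough sketch of the VMSS route, something like the following creates a scale set and attaches a CPU-based autoscale profile. Resource names are hypothetical, and the flag spellings are from my recollection of the az CLI; verify against `az vmss create --help` and the Azure Monitor autoscale docs before running.

        ```shell
        # Create the scale set (2 initial instances).
        az vmss create \
          --resource-group geo-rg \
          --name geo-vmss \
          --image Ubuntu2204 \
          --vm-sku Standard_D16s_v5 \
          --instance-count 2

        # Attach an autoscale setting (2..10 instances).
        az monitor autoscale create \
          --resource-group geo-rg \
          --resource geo-vmss \
          --resource-type Microsoft.Compute/virtualMachineScaleSets \
          --name geo-vmss-autoscale \
          --min-count 2 --max-count 10 --count 2

        # Scale out by 2 when average CPU exceeds 70% over 5 minutes.
        az monitor autoscale rule create \
          --resource-group geo-rg \
          --autoscale-name geo-vmss-autoscale \
          --condition "Percentage CPU > 70 avg 5m" \
          --scale out 2
        ```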

