geospatial data


I have an application, written by a third party, that processes geospatial data, and I need to rearchitect it to scale on Azure. It is written in Python with GeoPandas and Dask. A full data run takes more than 10 hours on a machine with around 100 CPUs. I would like to rearchitect it as PaaS. What are the best options for me? I know Azure Batch is one option. What other options do I have to make it autoscalable?

I read somewhere about running Dask on a Kubernetes cluster, so it could be AKS. Is that an option too?

Has anybody tried it? I would like to know the challenges.
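For reference, one common way to stand up Dask on AKS is the official Dask Helm chart. A minimal sketch, assuming you already have an AKS cluster and `helm` configured against it; the release name and worker sizing below are placeholder assumptions to adjust:

```shell
# Add the official Dask Helm repository and refresh the index
helm repo add dask https://helm.dask.org
helm repo update

# Install a Dask scheduler plus workers into the AKS cluster.
# "my-dask" and the worker counts/limits are placeholders; size them
# to match the workload (here: 8 workers, 2 CPUs / 8 GB each).
helm install my-dask dask/dask \
  --set worker.replicas=8 \
  --set worker.resources.limits.cpu=2 \
  --set worker.resources.limits.memory=8G
```

From there the application would connect a `dask.distributed.Client` to the deployed scheduler service instead of a local cluster.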

4 Replies
Can you move the geospatial data to Cosmos DB (Gremlin API)?

What advantage would that give? The application is more of a scientific programming application, which is mostly CPU intensive. For example, it processes data at centimetre resolution, and you may need to run the calculation over a few square kilometres, say a 5 x 5 km square.
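A workload like this parallelizes well because the area can be cut into independent tiles. A hypothetical sketch (the helper name, tile size, and area are illustrative assumptions, not from the application) of splitting a 5 x 5 km square into tiles that separate workers, Batch tasks, or pods could each process:

```python
def make_tiles(xmin, ymin, xmax, ymax, tile_size):
    """Split a bounding box (in metres) into square tiles of tile_size metres.

    Returns a list of (xmin, ymin, xmax, ymax) tuples; each tile is an
    independent unit of work that can be fanned out to a Dask task,
    an Azure Batch task, or a Kubernetes pod.
    """
    tiles = []
    y = ymin
    while y < ymax:
        x = xmin
        while x < xmax:
            # Clamp the last row/column so tiles never exceed the box.
            tiles.append((x, y, min(x + tile_size, xmax), min(y + tile_size, ymax)))
            x += tile_size
        y += tile_size
    return tiles

# A 5 x 5 km area cut into 500 m tiles -> a 10 x 10 grid of 100 tasks.
tiles = make_tiles(0, 0, 5000, 5000, 500)
print(len(tiles))  # 100
```

Each tile's computation is CPU-bound and independent, which is exactly the shape of work that Azure Batch pools or a Dask cluster can scale out horizontally.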


Scalability for your data. If your application can be hosted within a web platform, use App Service. Azure Kubernetes Service, and Kubernetes in general, has a fairly steep learning curve. You could also use VM Scale Sets (VMSS) to get up and running quickly.


As I mentioned, it is more of a data-processing type of app; that is the reason I am thinking of Azure Batch first.