Infra in Azure for Developers - The What

Andreas Helland · Jan 10 2024

A week ago I gave a quick intro to why developers should look into infrastructure but conveniently (and intentionally) skipped going into details on how to solve things. A few details remain before implementation instructions - let's take a look at the "what" part of the problem.

What kind of mess is this?

Of course, when you start to think about cloud infra these days the term infrastructure-as-code pops up. Which is what we should strive to use, but we don't want to get ahead of ourselves. You have choices to make before writing that code, and you have an understanding that needs to be in place as well.

Maybe this post feels more like a talk on architecture than on coding; it's unavoidable. It is however an integral part of becoming a cloud native developer (in my subjective interpretation). I will try to keep it actionable though.

Platform Engineering

Reading through this, someone will think "isn't this basically platform engineering?" or "isn't this why we have platform teams?", so let's get that out of the way. Well… I've seen teams building impressive platforms and I love good solutions in that area. For this series of posts I do not really have an opinion on the matter. Investing in a platform is just that - investing. It will cost time and money and you need to see some return on that. You will always need some sort of infra, but if you have ten developers you're probably better off spending on the product rather than the platform. If you have five hundred developers it's something else entirely.

Most things I touch upon will apply regardless of your platform strategy, but naturally there are details you would do differently if building a dedicated platform for cloud usage within your company. Having platform engineers do some of the lifting doesn't exclude there being value in developers trying to go cloud native as a separate exercise.

Control plane vs data plane

Many are already aware of the separation of duties in Azure, but not everyone, so here is a quick explanation of permissions. Azure has a fairly rich permission model, and the most common implementation is some variant of Role-Based Access Control (RBAC). Just having an "admin role" and a "user role" is usually too coarse-grained for your needs, so there are a number of pre-defined roles to choose from. Azure splits this further along a control plane and a data plane. The control plane is "outside access" to the resource. For instance, control plane access to a virtual machine would mean being able to start and stop the VM, attach a firewall rule (on the network level) and similar actions. The data plane is "inside access", which would mean things like logging in to the VM and handling things in the operating system. (Let's disregard Windows permissions for the moment.) Similarly, you could be allowed to administrate a SQL Server resource without any knowledge of what's inside the databases.

In a dev environment it's not uncommon for developers to have access on both planes, but in other environments you should consider separate roles (provided you have enough of a head count). Unless you put other guardrails in place it can be possible to escalate privileges - if you have full control plane access you can grant yourself data plane access. The other way around, not so much. The details of this are out of scope here, but it is important to be aware of the differences. If another developer asks about being granted permissions you need to know what kind of access they are talking about. (The value of this will be especially apparent when diving into app permissions, which we'll get to later.)
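To make the split a bit more concrete, here is a minimal sketch of how it shows up in a role definition. Azure role definitions keep control plane operations (actions) and data plane operations (data actions) in separate lists. The three roles below are hypothetical and heavily simplified, and the escalation check is an assumption for illustration: real Azure also evaluates wildcards, NotActions and scopes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleDefinition:
    """Simplified, hypothetical stand-in for an Azure role definition."""
    name: str
    actions: tuple[str, ...] = ()       # control plane operations
    data_actions: tuple[str, ...] = ()  # data plane operations


def planes(role: RoleDefinition) -> set[str]:
    """Return which planes the role grants any access to."""
    result = set()
    if role.actions:
        result.add("control")
    if role.data_actions:
        result.add("data")
    return result


def can_self_escalate(role: RoleDefinition) -> bool:
    """True if the role is allowed to create role assignments.

    Simplified check: only exact matches are considered.
    """
    escalating = {"*", "Microsoft.Authorization/roleAssignments/write"}
    return any(action in escalating for action in role.actions)


# Hypothetical roles for illustration, not copies of built-in roles.
VM_OPERATOR = RoleDefinition(
    name="vm-operator",
    actions=(
        "Microsoft.Compute/virtualMachines/start/action",
        "Microsoft.Compute/virtualMachines/deallocate/action",
    ),
)
VM_LOGIN = RoleDefinition(
    name="vm-login",
    data_actions=("Microsoft.Compute/virtualMachines/login/action",),
)
FULL_CONTROL = RoleDefinition(name="full-control", actions=("*",))
```

The operator role can start and stop the VM but never log in, the login role is the reverse, and only the full control role could hand itself more access.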

Microsoft frameworks - CAF and WAF

If you've never worked with Azure before it can be daunting when you're starting out. There are hundreds of services. Some services seem to have hundreds of settings you can tweak. You quickly learn that one person cannot be an expert on all things Azure so you need to get more people up to speed. And you need to make sure you deploy according to best practices. You get the picture.

You don't want to start from scratch and re-invent the wheel so you look for documentation. Maybe there's some architectural guidance. Maybe there are samples.

Two common terms that will be thrown out in this context are the Cloud Adoption Framework, shortened to CAF, and the Well-Architected Framework, or WAF. CAF as a term has been abused and to an extent lost some meaning along the way. (I've seen some pretty bad implementations claiming to be "based on CAF".) This isn't a deep dive on either framework - Microsoft already has tons of material on that. I'll focus on the salient points for you as a developer.

CAF covers things like getting your organization to understand "the cloud", how to structure teams, and other things that don't necessarily help you with the actual hands-on coding tasks. (You still need those other things in place though, so don't take this as an instruction to ignore it altogether.) What is useful for the hands-on work is how it teaches you to think about your cloud architecture in layers. Take for instance running a web app in a container.

To be able to run a container you need a container host (which can be covered by different Azure services). To be able to run a container host you need a network to attach it to. To be able to deploy a network you need policies and governance. You basically nest yourself down through the stack. And these different services are then deployed as logical levels:

Level-1: Azure Policy, Entra ID, Log Analytics
Level-2: Virtual networks, DNS zones, Azure Firewall
Level-3: Azure Kubernetes Service, Azure Container Apps, SQL Server
Level-4: Containers, SQL Databases

This is an example and I'm not saying this is the only true and acceptable interpretation. You may have interim levels like 2.5. You may have services appear on multiple levels - for instance logging infra and logging apps work in different places in the stack. Don't come running with pitchforks in the comments section if you happen to have a different take on this. (Do come if you have helpful insights on what I've missed.)
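As a rough sketch of what the levels buy you, the example list above can be written down as data and used to derive a deployment order and a simple layering check. The function names are made up for illustration, and the rule "only depend on the same or a lower level" is my reading of the layering idea, not something CAF prescribes in this form.

```python
# Levels copied from the example list above; lower numbers deploy first.
LEVELS = {
    1: ["Azure Policy", "Entra ID", "Log Analytics"],
    2: ["Virtual networks", "DNS zones", "Azure Firewall"],
    3: ["Azure Kubernetes Service", "Azure Container Apps", "SQL Server"],
    4: ["Containers", "SQL Databases"],
}


def level_of(service: str) -> int:
    """Look up which level a service belongs to."""
    for level, services in LEVELS.items():
        if service in services:
            return level
    raise KeyError(f"unknown service: {service}")


def deployment_order() -> list[str]:
    """Flatten the levels into one list, lowest level first."""
    return [service for level in sorted(LEVELS) for service in LEVELS[level]]


def respects_layering(service: str, depends_on: str) -> bool:
    """A service may only depend on something at the same or a lower level."""
    return level_of(depends_on) <= level_of(service)
```

A container depending on Azure Kubernetes Service passes the check, while a virtual network depending on a container does not, which is usually a sign that something sits in the wrong layer.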

Subscriptions vs resource groups

Azure allows you to group resources on multiple levels. A resource group is a logical container for resources. A subscription is a logical container for resource groups, and a management group is a logical container for subscriptions.
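The nesting is visible in the ID of every resource, which spells out the subscription and the resource group it lives in. (Management groups are not part of the resource ID.) Below is a minimal sketch of a parser, assuming a top-level resource only; nested resources such as a database under a server have longer IDs and are rejected here. The function name and the sample ID are made up.

```python
def parse_resource_id(resource_id: str) -> dict[str, str]:
    """Split a top-level Azure resource ID into its containers.

    Expected shape:
    /subscriptions/{id}/resourceGroups/{rg}/providers/{namespace}/{type}/{name}
    """
    parts = resource_id.strip("/").split("/")
    if (
        len(parts) != 8
        or parts[0].lower() != "subscriptions"
        or parts[2].lower() != "resourcegroups"
        or parts[4].lower() != "providers"
    ):
        raise ValueError(f"not a top-level resource ID: {resource_id}")
    return {
        "subscription": parts[1],
        "resource_group": parts[3],
        "type": f"{parts[5]}/{parts[6]}",
        "name": parts[7],
    }


# Made-up ID for illustration.
EXAMPLE_ID = (
    "/subscriptions/00000000-0000-0000-0000-000000000000"
    "/resourceGroups/rg-web/providers/Microsoft.Web/sites/my-app"
)
```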

Management groups are not that relevant on a daily basis for developers, but resource groups are very much so. And it is a concept that I have seen so many troublesome usage patterns for.

A resource group is a logical boundary of resources sharing the same life cycle.

Your web app and your database probably do not share a life cycle. You want to be able to delete your web app and move your code to a different hosting platform, or for that matter rewrite it in a different language, without even touching your database. The people who should be allowed to do backup and restore of your database are possibly not the same people you want changing the version of the .NET runtime for your app.

Resource groups are free. Create them according to your logical architecture. You don't win anything by having resource groups containing large clusters of unrelated resources.
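One way to act on this is to tag each resource with its life cycle first and let the resource groups fall out of that. The sketch below assumes a naming pattern of rg-{workload}-{lifecycle}-{environment}; that pattern, the labels and the resource names are all made up for illustration, so substitute whatever convention your organization uses.

```python
from collections import defaultdict


def plan_resource_groups(
    resources: dict[str, str], workload: str, environment: str
) -> dict[str, list[str]]:
    """Group resources by life cycle label into resource group names.

    `resources` maps a resource name to its life cycle label.
    """
    groups: dict[str, list[str]] = defaultdict(list)
    for name, lifecycle in resources.items():
        groups[f"rg-{workload}-{lifecycle}-{environment}"].append(name)
    return {rg: sorted(names) for rg, names in sorted(groups.items())}


# Hypothetical workload: the web app and the database live and die separately.
SHOP = {
    "webapp": "app",
    "app-service-plan": "app",
    "sql-server": "data",
    "sql-database": "data",
}
```

With this input the web app and its plan end up in one group and the SQL resources in another, so deleting the app group leaves the data alone.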

Also keep in mind that resource groups are not a security boundary per se. Do not share a subscription among teams or customers based on giving each of them a resource group and trusting they behave. (Yes, I know you can assign access on resource groups.) Multiple teams equals multiple subscriptions. Accordingly, dev and prod environments go into separate subscriptions. What if you have a frontend team and a backend team - shouldn't they be allowed to deploy to the same Kubernetes cluster? The term "team" can carry different meanings across organizations, so there could be a higher level of abstraction deciding if cluster sharing is ok or not, and that will in turn impact how you set up your subscriptions. (Like Product_A_Frontend + Product_A_Backend as one team vs Product_X_Y as other teams.)

If you follow CAF you will have "special" subscriptions like one for connectivity between on-prem and the cloud, one for centralized logging, etc. Let's disregard that. (If your organization is at that stage it's bigger than you as a single developer.) There are use cases where you need to share subscriptions and/or resource groups. The advice above is the general recommendation, and has a direct impact when you start creating your infra code which is why it needs to be mentioned.

https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/ready/landing-zone/

An example of a use case where you can separate by resource group is if you have a "sandbox" subscription where developers can create resources to become familiar with the settings of a resource type, do experiments, etc. Resource groups with individual developers as naming suffixes are ok in a sandbox.

At first sight it might not be apparent why this even matters. This is like the infrastructure version of the monolithic apps vs distributed apps/microservices discussion. A monolithic MVC web app is a totally acceptable choice for a simple web site. And a no-levels deployment is fine for a one-week proof-of-concept. However, a monolith is bad if you want independently scalable and deployable components. If you try to scale infra across many teams and different types of development projects a monolithic block of infra is equally challenging.

WAF is more on the practical side. This consists of best practices and recommendations along the following pillars: Reliability, Security, Cost Optimization, Operational Excellence, and Performance Efficiency.

An example of reliability would be to have two instances of a service in case one goes down. Cost optimization would be that you can probably save money by downscaling a virtual machine where memory usage is ten percent on average. Some of these are obvious, others might be pointed out in the documentation. But the best part is that most of these will be automatically flagged in the Azure Portal either on the resource blade or through the Azure Advisor feature. (Security advisories might be pointed out in Defender for Cloud as well.) Deploy a service, watch out for improvement suggestions and adjust your infra code to match. Microsoft really is helping you along here.
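The two examples above are simple enough to express as checks of your own. This is only a sketch of the idea: the class, the field names and the 20 percent threshold are assumptions for illustration and have nothing to do with how Azure Advisor actually evaluates resources.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    """Hypothetical summary of a deployed service."""
    name: str
    instance_count: int
    avg_memory_percent: float


def review(deployment: Deployment, low_memory_threshold: float = 20.0) -> list[str]:
    """Return findings for the reliability and cost examples in the text."""
    findings = []
    if deployment.instance_count < 2:
        findings.append("reliability: single instance, nothing takes over if it goes down")
    if deployment.avg_memory_percent <= low_memory_threshold:
        findings.append("cost: low average memory usage, consider a smaller size")
    return findings


# Made-up example matching the text: one instance, ten percent memory usage.
API = Deployment(name="api", instance_count=1, avg_memory_percent=10.0)
```

Running review on the example returns both findings, and the fix for each belongs in your infra code rather than in a portal click.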

Managed identity

A classic conundrum while coding is where to hide your secrets. You need a connection string for a database. You need a client secret for an API. Some of these values can be stored in an Azure Key Vault. It is however even better if you can avoid using the values altogether. If your app runs as a container in Azure and needs to store JSON documents in Azure Cosmos DB you do not need a connection string. You need a managed identity assigned to the app, which in turn has permissions for the Cosmos account/collection.

Key to the kingdom

Read up on what managed identity is. Try to understand when you should use system-assigned versus user-assigned identities. Apply it through infra-as-code and attempt to use it for every service you have in Azure :)
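A small sanity check can help while you migrate. With a managed identity the app only needs to know the endpoint, so nothing secret should be left in its settings. The sketch below flags settings that still look like they carry a secret; the marker list is an assumption and far from exhaustive, and the setting names are made up.

```python
# Substrings that commonly appear in connection strings carrying a secret.
# Assumption for illustration, not a complete list.
SECRET_MARKERS = ("accountkey=", "password=", "sharedaccesskey=")


def settings_with_secrets(settings: dict[str, str]) -> list[str]:
    """Return the names of settings whose name or value suggests a secret."""
    flagged = []
    for key, value in settings.items():
        value_hit = any(marker in value.lower() for marker in SECRET_MARKERS)
        key_hit = "secret" in key.lower()
        if value_hit or key_hit:
            flagged.append(key)
    return sorted(flagged)


# Before: a Cosmos DB connection string with the account key embedded.
WITH_CONNECTION_STRING = {
    "COSMOS_CONNECTION": (
        "AccountEndpoint=https://example.documents.azure.com:443/;"
        "AccountKey=<redacted>;"
    ),
}

# After: only the endpoint; the managed identity supplies the credentials.
WITH_MANAGED_IDENTITY = {
    "COSMOS_ENDPOINT": "https://example.documents.azure.com:443/",
}
```

The first configuration gets flagged and the second comes back clean, which is the state you want for every service that supports managed identity.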

Terraform vs Bicep vs Pulumi vs whatever

As a developer you've probably been part of a language war. Is Java or C# best for backends? Client-side or server-side JavaScript? Or something similar. This of course extends to infra-code - Terraform or Bicep? Or the flexibility of Pulumi, and if so which language to use with Pulumi.

I like how Terraform plan works (the what-if of Bicep is less reliable), but I'm not a fan of how the state files blur the line between control plane and data plane. Try both if you have the time and do your evaluations. For this series I am going with Bicep, but you are of course free to go down the path that makes the most sense for you. (The why and what of IaC languages can probably also kick off some good wars, but this is not that post.)

Anything else?

Of course this is just scratching the surface of the "what" that is needed if you want to rearchitect fifty on-premises solutions to run on Azure. These are just a few bullet points to get you started thinking before you dive into wizards in the Azure Portal.

The why and what can be crossed off the list, so next time we will try to get our hands dirty with the how of infra in Azure for devs.
