When you deploy a server, or any other part of your infrastructure, manually, how long does it take you? Can you do a manual deployment end to end without any mistakes? Now, how do you scale that? This is where automation comes in, and more specifically Infrastructure as Code (IaC).
In many of the companies I've worked for, it would take days for a server to be deployed. Why? Because there was a ‘process’ and a physical paper checklist that had to be followed, signed off, and checked again. Each person had to complete their task(s) and get them signed off. To get a server deployed, you'd have to configure the VM and the host (networking, storage, etc.), deploy an image, patch the OS, harden the deployment, and then install and configure an application. Once that was all done, the server was ready for sign-off and was handed over to the customer. That took 3 days.
In some of the environments I managed, I could automate most of a complete server/infrastructure deployment and finish it in a few hours, but it was still a very manual process and mistakes were often made. This is when I discovered Infrastructure as Code. Many ask: where do I begin? With all the various choices available, which tool is best for the job?
Let’s begin by defining Infrastructure as Code. Infrastructure as Code (IaC) is the management of infrastructure (networks, virtual machines, load balancers, and connection topology) in a descriptive model, using version control to store the files. You can also watch this awesome one-minute video from the great Abel Wang, What is Infrastructure as Code?
There are a huge number of benefits to using IaC, to name just a few:
Declarative vs Imperative Methods
When writing your infrastructure as code, it is important to understand the difference between these two methods, because it determines the types of templates you can write and the way in which you will write them.
Declarative languages define the desired state of the target, and the system works out what needs to happen to achieve that state. Effectively, you define the end state of the infrastructure by adding the resources that you need along with their configuration, and the IaC tool figures out the rest.
Imperative languages define the specific commands that must be executed, and the specific order in which they must run, to achieve the desired state.
A declarative example would be: ‘Can I have a cup of coffee on my desk after lunch?’
Whereas an imperative example would be: ‘Go to the coffee machine, add 1 scoop of freshly ground beans and 400ml of water into the correct reservoir, press the start button, allow the coffee to fill the cup. Add in 50ml of fresh 2% milk to the cup and then deliver to my desk at precisely 1pm...’ You get the idea.
An imperative language requires more specific input and can fail during the process if one of the steps is not fulfilled properly for any reason.
A declarative style is great when you need to update your infrastructure or make changes to it. The imperative style is good for a deploy-and-forget model, but that isn’t always ideal if you’re looking to be an agile organization or have a changing infrastructure. The choice really comes down to personal preference and which approach best fits your team’s situation.
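To make the declarative side concrete, here is a minimal sketch in Terraform's HCL (one of the tools covered later in this post). The resource names, region, and address range are illustrative assumptions, not taken from a real environment. The file only describes the end state; nothing in it says which resource to create first.

```hcl
# Declarative sketch: describe WHAT should exist, not HOW to create it.
# All names and values below are hypothetical placeholders.
provider "azurerm" {
  features {}
}

resource "azurerm_resource_group" "demo" {
  name     = "rg-iac-demo"
  location = "eastus"
}

# The references to the resource group tell the tool that the group must
# exist before the network; no explicit ordering is written anywhere.
resource "azurerm_virtual_network" "demo" {
  name                = "vnet-iac-demo"
  address_space       = ["10.0.0.0/16"]
  location            = azurerm_resource_group.demo.location
  resource_group_name = azurerm_resource_group.demo.name
}
```

Changing the infrastructure later means editing this description (for example, the address range) and letting the tool work out the difference, rather than writing a new sequence of commands.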
IaC Tooling: So many Choices!
There are numerous tools that can be used for IaC. Here are some questions that I would ask yourself and your team:
I’ve listed some of the tools below. I’ll go through each one and describe some pros and cons, hopefully leading you to pick the one that suits you and your team best.
Azure Resource Manager (ARM) Templates
ARM Templates are designed specifically for deployments into Microsoft Azure. If you are looking for a tool for on-premises environments or multiple cloud providers, this isn’t it. ARM is the native IaC templating option for Azure. You can deploy a resource in Azure using the Azure Portal, then download your template so that you can do it again and repeat the process. That is an easier way to get started, but there are some drawbacks.
First, you need to learn JSON, which could be your first hurdle. Also, an exported ARM template comes with quite a bit of boilerplate code. ARM, for many people, can be difficult to learn. There isn’t really a way to know whether what you’re deploying is what will get deployed (there isn’t a ‘what-if’ usage or ‘plan’ output that shows you what is about to be deployed). ARM has other limitations when it comes to writing IaC: for example, validation and syntax errors can be painful to troubleshoot. ARM templates can also grow very large and sometimes unwieldy, which can cause issues in an environment that needs repeatability and scalability.
On the other hand, there are some great learning resources for ARM templates if that is the path you choose:
Microsoft Learn – Create and deploy ARM templates
Pros:
Cons:
Bicep
Bicep is a domain-specific language (DSL) that allows for declarative deployment of Azure resources, so yes, this is an IaC tool that is native to Azure. Anything that you can do with an ARM template you can do with Bicep (and more!). As soon as a new resource is added into Azure, it is immediately supported by Bicep. Bicep requires a lot less syntax than ARM templates; you can compare the template syntax differences here.
Bicep allows for the use of modules, which means you create a module for each grouping of resources, creating much more manageable and readable files. It keeps your IaC from getting too big and unruly. Bicep is integrated into the Azure CLI, making the Azure deployment experience really seamless.
One of my favorite features of Bicep is the ‘What-if’ operation. When you run it, it compares your template against your current deployment and shows what changes would be applied, allowing you to confirm those changes before anything is touched. Knowing what you’re about to deploy before you push the button is a great way to validate your results without having to deploy first.
Pros:
Cons:
Great learning resources with Bicep:
Write your first Bicep Module with Microsoft Learn (and other free learning paths around Bicep)
Barbara Forbes' Blog for Bicep Learnings
Terraform
Terraform is an open-source tool, written in Go, that uses HCL (HashiCorp Configuration Language), which many people find to be one of the most easily learned IaC languages. Terraform comes with a lot of benefits that make it a popular choice.
Terraform can be used with any cloud and with on-prem resources. While each environment requires its own template, you can use the same language and formatting to deliver IaC to any of them. The reality is that most organizations are multi-cloud and configured in a hybrid model, and this is where Terraform shines. For example, the following Terraform configuration deploys an Azure virtual machine scale set behind a load balancer, along with a jumpbox VM:
```hcl
terraform {
  required_version = ">=0.12"

  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~>2.0"
    }
  }
}

provider "azurerm" {
  features {}
}

resource "azurerm_resource_group" "vmss" {
  name     = var.resource_group_name
  location = var.location
  tags     = var.tags
}

resource "random_string" "fqdn" {
  length  = 6
  special = false
  upper   = false
  number  = false
}

resource "azurerm_virtual_network" "vmss" {
  name                = "vmss-vnet"
  address_space       = ["10.0.0.0/16"]
  location            = var.location
  resource_group_name = azurerm_resource_group.vmss.name
  tags                = var.tags
}

resource "azurerm_subnet" "vmss" {
  name                 = "vmss-subnet"
  resource_group_name  = azurerm_resource_group.vmss.name
  virtual_network_name = azurerm_virtual_network.vmss.name
  address_prefixes     = ["10.0.2.0/24"]
}

resource "azurerm_public_ip" "vmss" {
  name                = "vmss-public-ip"
  location            = var.location
  resource_group_name = azurerm_resource_group.vmss.name
  allocation_method   = "Static"
  domain_name_label   = random_string.fqdn.result
  tags                = var.tags
}

resource "azurerm_lb" "vmss" {
  name                = "vmss-lb"
  location            = var.location
  resource_group_name = azurerm_resource_group.vmss.name

  frontend_ip_configuration {
    name                 = "PublicIPAddress"
    public_ip_address_id = azurerm_public_ip.vmss.id
  }

  tags = var.tags
}

resource "azurerm_lb_backend_address_pool" "bpepool" {
  loadbalancer_id = azurerm_lb.vmss.id
  name            = "BackEndAddressPool"
}

resource "azurerm_lb_probe" "vmss" {
  resource_group_name = azurerm_resource_group.vmss.name
  loadbalancer_id     = azurerm_lb.vmss.id
  name                = "ssh-running-probe"
  port                = var.application_port
}

resource "azurerm_lb_rule" "lbnatrule" {
  resource_group_name            = azurerm_resource_group.vmss.name
  loadbalancer_id                = azurerm_lb.vmss.id
  name                           = "http"
  protocol                       = "Tcp"
  frontend_port                  = var.application_port
  backend_port                   = var.application_port
  backend_address_pool_id        = azurerm_lb_backend_address_pool.bpepool.id
  frontend_ip_configuration_name = "PublicIPAddress"
  probe_id                       = azurerm_lb_probe.vmss.id
}

resource "azurerm_virtual_machine_scale_set" "vmss" {
  name                = "vmscaleset"
  location            = var.location
  resource_group_name = azurerm_resource_group.vmss.name
  upgrade_policy_mode = "Manual"

  sku {
    name     = "Standard_DS1_v2"
    tier     = "Standard"
    capacity = 2
  }

  storage_profile_image_reference {
    publisher = "Canonical"
    offer     = "UbuntuServer"
    sku       = "16.04-LTS"
    version   = "latest"
  }

  storage_profile_os_disk {
    name              = ""
    caching           = "ReadWrite"
    create_option     = "FromImage"
    managed_disk_type = "Standard_LRS"
  }

  storage_profile_data_disk {
    lun           = 0
    caching       = "ReadWrite"
    create_option = "Empty"
    disk_size_gb  = 10
  }

  os_profile {
    computer_name_prefix = "vmlab"
    admin_username       = var.admin_user
    admin_password       = var.admin_password
    custom_data          = file("web.conf")
  }

  os_profile_linux_config {
    disable_password_authentication = false
  }

  network_profile {
    name    = "terraformnetworkprofile"
    primary = true

    ip_configuration {
      name                                   = "IPConfiguration"
      subnet_id                              = azurerm_subnet.vmss.id
      load_balancer_backend_address_pool_ids = [azurerm_lb_backend_address_pool.bpepool.id]
      primary                                = true
    }
  }

  tags = var.tags
}

resource "azurerm_public_ip" "jumpbox" {
  name                = "jumpbox-public-ip"
  location            = var.location
  resource_group_name = azurerm_resource_group.vmss.name
  allocation_method   = "Static"
  domain_name_label   = "${random_string.fqdn.result}-ssh"
  tags                = var.tags
}

resource "azurerm_network_interface" "jumpbox" {
  name                = "jumpbox-nic"
  location            = var.location
  resource_group_name = azurerm_resource_group.vmss.name

  ip_configuration {
    name                          = "IPConfiguration"
    subnet_id                     = azurerm_subnet.vmss.id
    private_ip_address_allocation = "dynamic"
    public_ip_address_id          = azurerm_public_ip.jumpbox.id
  }

  tags = var.tags
}

resource "azurerm_virtual_machine" "jumpbox" {
  name                  = "jumpbox"
  location              = var.location
  resource_group_name   = azurerm_resource_group.vmss.name
  network_interface_ids = [azurerm_network_interface.jumpbox.id]
  vm_size               = "Standard_DS1_v2"

  storage_image_reference {
    publisher = "Canonical"
    offer     = "UbuntuServer"
    sku       = "16.04-LTS"
    version   = "latest"
  }

  storage_os_disk {
    name              = "jumpbox-osdisk"
    caching           = "ReadWrite"
    create_option     = "FromImage"
    managed_disk_type = "Standard_LRS"
  }

  os_profile {
    computer_name  = "jumpbox"
    admin_username = var.admin_user
    admin_password = var.admin_password
  }

  os_profile_linux_config {
    disable_password_authentication = false
  }

  tags = var.tags
}
```
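The configuration above refers to several input variables (resource_group_name, location, tags, application_port, admin_user, and admin_password) whose declarations are not shown. A minimal sketch of what a matching variables file might look like follows; the descriptions and default values are assumptions for illustration, not the values used in any real deployment.

```hcl
# Hypothetical variables.tf for the configuration above.
# Every description and default here is an illustrative assumption.
variable "resource_group_name" {
  description = "Name of the resource group that holds all resources"
  type        = string
  default     = "vmss-demo-rg"
}

variable "location" {
  description = "Azure region to deploy into"
  type        = string
  default     = "eastus"
}

variable "tags" {
  description = "Tags applied to every resource"
  type        = map(string)
  default = {
    environment = "demo"
  }
}

variable "application_port" {
  description = "Port exposed on the load balancer and used by its probe"
  type        = number
  default     = 80
}

variable "admin_user" {
  description = "Administrator user name for the VMs"
  type        = string
  default     = "azureuser"
}

# No default on purpose: supply the password at plan/apply time
# instead of committing it to version control.
variable "admin_password" {
  description = "Administrator password for the VMs"
  type        = string
}
```

With no default set, Terraform prompts for admin_password, or it can be supplied through the TF_VAR_admin_password environment variable so that it never lands in the repository.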
Terraform builds resources, makes changes, and can reference existing resources using a state file. Terraform is easily readable and uses modules to organize your code and call your resources. While Terraform is declarative, it relies on the state file to keep track of what it has deployed, so that it can work out what needs to change. Managing the state file does introduce other topics (security, access, etc.), but these are well covered by the documentation in place. Learn more about Terraform state files here.
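By default, the state file lives on the machine that ran Terraform. One common way to address the security and access concerns is to keep it in a remote backend instead. The sketch below assumes an Azure Storage account and container already exist; the resource group, storage account, container, and key names are all hypothetical placeholders.

```hcl
# Sketch: store Terraform state in an Azure Storage container.
# All four values are hypothetical; the storage account and container
# must be created before this backend can be used.
terraform {
  backend "azurerm" {
    resource_group_name  = "tfstate-rg"
    storage_account_name = "tfstatedemo12345"
    container_name       = "tfstate"
    key                  = "vmss.terraform.tfstate"
  }
}
```

After adding or changing a backend block, run terraform init so that Terraform can connect to the backend and offer to migrate any existing state. Backend blocks cannot reference variables, which is why the values are written out literally here.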
Terraform has great built-in features: it can validate your code, run a ‘plan’ so you know exactly which elements are going to change before they change, and provide traceability of what was deployed. Terraform shines when you want to continuously deploy your infrastructure, and it can even deploy to different environments using workspaces.
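As a rough sketch of the workspace idea, the built-in terraform.workspace value can be used to vary names and sizes per environment from a single set of files. The workspace names (dev, prod), the capacity numbers, and the resource names are assumptions made up for this example.

```hcl
# Sketch: one configuration, different settings per workspace.
# Workspace names, capacities, and resource names are hypothetical.
provider "azurerm" {
  features {}
}

locals {
  # Number of instances to run in each workspace.
  capacity_by_workspace = {
    dev  = 1
    prod = 3
  }

  # Fall back to 1 for any workspace not listed (including "default").
  capacity = lookup(local.capacity_by_workspace, terraform.workspace, 1)
}

# The resource group name includes the workspace name, so each
# environment gets its own group.
resource "azurerm_resource_group" "env" {
  name     = "vmss-${terraform.workspace}-rg"
  location = "eastus"

  tags = {
    environment = terraform.workspace
  }
}

output "capacity" {
  value = local.capacity
}
```

A typical flow would be to run terraform workspace new dev (or terraform workspace select prod), then terraform validate and terraform plan to review the changes for that environment before applying them.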
Pros:
Cons:
Terraform on Azure Blog - covering the basics into modules and state files
Generate your first Terraform template with NubesGen
Terraform on Azure Documentation
Pulumi
Pulumi is another IaC tool that uses a declarative format to deploy your infrastructure. The biggest differentiator with Pulumi is that it allows you to write your IaC in the language that your organization or team knows best. Pulumi supports TypeScript, JavaScript, Python, Go and C#, which means that you write your templates in the language that you are comfortable with.
As an added bonus, you can use the testing tools native to that language to test your code. We not only want to deploy our infrastructure as code to automate tasks and increase our velocity, but we also need to reduce human error. This is why testing is a crucial part of the development and deployment lifecycle.
Pulumi, like Terraform, supports ANY cloud. It has another huge benefit: it can coexist with your existing templates from Terraform, ARM, Helm/YAML, etc., or convert them into Pulumi.
Pros:
Cons:
Video on deploying to Azure using Pulumi
Ansible
Ansible is an imperative IaC tool that not only provisions your infrastructure but also manages the configuration of your services. The other tools above do not; another third-party tool would be required. Ansible relies heavily on YAML files, in the form of Ansible Playbooks, to define your infrastructure, and is itself written in Python. Playbooks describe your automation tasks from deployment to ongoing state, making it an all-in-one solution.
Ansible does not maintain state, and it does not keep track of dependencies. Ansible is fairly easy to get started with but does have less of a community feel when looking for troubleshooting tips or self-help.
Pros:
Cons:
Chef
Chef is an open-source IaC tool that can run on multiple platforms (Windows, Linux, AWS, Azure, etc.) and uses cookbooks and recipes to define not only your deployment templates, but also the configuration of your environment. Chef uses a Ruby-based DSL, so learning it requires a dedicated set of programming skills. Chef requires an infrastructure to run on, so there is a licensing and infrastructure cost to consider. This also means that Chef runs in a dedicated environment and requires an agent on every machine that you are deploying to.
Because Chef involves many considerations beyond the capability of the product itself, I am only going to list the pros and cons here; choosing it requires thinking well beyond infrastructure as code.
Pros:
Cons:
Puppet
Puppet and Chef often get roped together when comparing IaC tools, as they’ve both been around for some time. Puppet uses its own declarative language, the Puppet DSL, to deploy and maintain system configuration through manifests and modules.
Puppet also requires an infrastructure to run on, with agents deployed on every machine that you are deploying and managing. As Puppet also involves many considerations beyond the capability of the product itself, it’s not as popular in Azure, where there are more cost-effective options.
Pros:
Cons:
In Summary
Choosing an Infrastructure as Code tool is a decision that requires thought, and every organization needs to weigh the pros and cons for itself. There is no one-size-fits-all solution for any person or any company. Take your time, read through the options and find the best solution for you. Once you choose your preferred IaC tool, make sure you start looking at how to automate not only your infrastructure, but also your delivery process with a solid continuous integration/continuous delivery (CI/CD) tool.
Happy coding!