Setup Of Virtual Machines On Azure Using Terraform

Introduction

Microsoft recently announced increased investment in integrating Terraform with Azure (Aug 2017). This continues Microsoft's push into the cloud-agnostic, multi-cloud arena, where it is doing whatever it takes to help developers succeed in the cloud. It used to be the case that Azure was only for Microsoft developers - well, no more. The crew in Redmond and in offices around the globe are really pushing open source and the 'Azure for everyone' opportunity. Openness can only be good for everyone in the long run. You can read more detail about Terraform on Azure on the official Microsoft Terraform page.

Anyway, back to the task at hand - progressing this DevOps/infrastructure-focused series. In the first article in this series, I introduced Terraform, explaining what it is and how to use it. The second article discussed how to use variables, interpolation and resource count to make multiple copies of resources and avoid duplicating config/code. This article assumes you have read the other two, and explains how to set up, or 'provision', the remote virtual machines with your particular configuration once they have been deployed.

Background

When you create an instance of a virtual machine in Azure, AWS or elsewhere, you cannot be sure that it is completely up to date and patched. Even if it is, you may need to configure ('provision') a setup that is quite specific to your needs and supports the solution you want to build the infrastructure for. This article shows you some of the different options for provisioning machines using Terraform on Azure.

Provisioners

When we create virtual machines, we then need to run things on them - this might be for setup, updates, security hardening, etc. 'Provisioners' are the construct we use to declare what we want to provision against a particular resource. This might be uploading files, installing software, or running some custom script.

Code Placement

You will recall from the earlier articles in this series that we put together basic building blocks that define our infrastructure in '.tf' configuration files. Building blocks represent the different parts of our infrastructure, for example virtual networks, public IP addresses, virtual network cards, disks, etc. Provisioners are generally placed inside the resource block they are targeted at. So in this case, we are going to create a provisioner block inside our virtual machine creation block.

  # outer virtual machine resource definition
  resource "azurerm_virtual_machine" "workerNode" {
      name                  = "workerNode"
      location              = "North Europe"
      resource_group_name   = "${azurerm_resource_group.Alias_RG.name}"
      network_interface_ids = [???]   # IDs of the network interface(s) to attach
      vm_size               = "Standard_A0"
      ... etc

      # inner definition of a provisioner
      provisioner XXX TYPE XXX {
          XXX specific settings XXX
      }
  }

Lifecycle events

Terraform lets us control when a provisioner is triggered during the resource life-cycle: on create, on destroy, and what happens on failure. By default, a provisioner is assumed to run on a create event; however, you can flag the provisioner to run only when the resource it is attached to is being destroyed. Here are three examples showing the default, the default with an explicit create (the same thing) and destroy. Note the keyword 'when', which tells Terraform when to do the provisioning in this block.

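  # default: runs when the resource is created
  provisioner XXX TYPE XXX {
      XXX specific settings XXX
  }

  # explicit create (same as the default)
  provisioner XXX TYPE XXX {
      when = "create"
  }

  # runs only when the resource is destroyed
  provisioner XXX TYPE XXX {
      when = "destroy"
  }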

Now, one gotcha - at the time of writing (Aug 2017), there seems to be some instability in this particular setting, so if it is flaky for you, raise a ticket on GitHub!
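
To make the create/destroy distinction concrete, here is a minimal sketch that uses a null_resource (covered later in this article) with two local-exec provisioners. The resource name 'lifecycle_demo' and the echoed messages are placeholders of my own. The quoted 'when' value follows the 2017-era syntax used throughout this article; newer Terraform releases expect the bare keyword (when = destroy).

```hcl
# Hypothetical demo resource: the name and messages are illustrative only
resource "null_resource" "lifecycle_demo" {
  # No 'when' given, so this runs when the resource is created
  provisioner "local-exec" {
    command = "echo lifecycle_demo created"
  }

  # Runs only when the resource is destroyed
  # (newer Terraform releases write this as: when = destroy)
  provisioner "local-exec" {
    when    = "destroy"
    command = "echo lifecycle_demo destroyed"
  }
}
```

With this in place, 'terraform apply' should run the first command and 'terraform destroy' the second.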

So, we know what the basic structure looks like, but how can we actually provision things to our remote server? To do this we can use one of three types of provisioning code block (that is the 'XXX TYPE XXX' placeholder I have above!).

File provisioning

Uploading files is carried out using the 'file' provisioner. We can use it to upload both single files and the contents of entire folders. Terraform works out whether it is dealing with a file or a folder from what you give it as the 'source', and then copies that file or folder up to the 'destination' location on the remote system. The only thing you must be careful of is that you upload to a location that exists, and that you have write permission on.

  # Copies all files and folders in apps/app1 to D:/IIS/webapp1
  provisioner "file" {
      source      = "apps/app1/"
      destination = "D:/IIS/webapp1"
  }

  # Copies a single file "MyTextFile.txt" to the d:\data\files folder on the remote machine
  # (forward slashes avoid having to escape backslashes inside the strings)
  provisioner "file" {
      source      = "c:/data/MyTextFile.txt"
      destination = "d:/data/files/MyTextFile.txt"
  }

You can also get Terraform to create a file and pass in what the contents of that file should be. This is useful if you are generating dynamic file contents, for example. The trick here is that we don't use the 'source' attribute, but a 'content' attribute instead.

  # Copies the string in "content" into d:\data\someNotes.txt
  provisioner "file" {
      content     = "my notes: ${var.Notes}"
      destination = "d:/data/someNotes.txt"
  }

EXEC provisioning

Aside from uploading files and content to our newly built remote machines, we will most likely also need to run scripts and other forms of setup. This can be achieved with the exec provisioners. There are two of these: one for remote hosts, and one that executes back on the local host that Terraform is running from. As you might expect, these are remote-exec and local-exec.

A provisioner of type remote-exec takes the following format:

  provisioner "remote-exec" {
      inline = [                    # argument declaration
          "RunApp.exe someParam"    # one or more commands to execute inline
      ]
  }

Remote-exec accepts one of three mutually exclusive arguments:

  1. inline - This is a list of command strings. They are executed in the order they are provided. This cannot be combined with script or scripts.
  2. script - This is a path (relative or absolute) to a local script that will be copied to the remote resource and then executed. This cannot be combined with inline or scripts.
  3. scripts - This is a list of paths (relative or absolute) to local scripts that will be copied to the remote resource and then executed. They are executed in the order they are provided. This cannot be combined with inline or script.
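
Only the inline form is shown above, so here is a rough sketch of the 'scripts' form, which uploads two local shell scripts and runs them in order. The script paths, the variable 'vm_public_ip' and the resource name 'bootstrap' are assumptions made for illustration. The connection settings mirror the SSH example in the Connections section.

```hcl
# Assumed input: public IP or host name of the machine to provision
variable "vm_public_ip" {}

resource "null_resource" "bootstrap" {
  provisioner "remote-exec" {
    # Hypothetical local scripts, copied to the remote host and run in this order
    scripts = [
      "scripts/update.sh",
      "scripts/install.sh"
    ]

    connection {
      type        = "ssh"
      user        = "testadmin"
      host        = "${var.vm_public_ip}"
      private_key = "${file("id_rsa")}"
    }
  }
}
```

To run a single script, swap 'scripts' for 'script' and give it one path instead of a list.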

Local-exec gets constructed like this:

  provisioner "local-exec" {
      command = "RunApp.exe someParam"   # a single command string, not a list
  }

So where remote-exec had an argument element of 'inline', local-exec uses 'command'. The command given can be provided as a relative path to the current working directory or as an absolute path. It is evaluated in a shell, and can use environment variables or Terraform variables.
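
As a small sketch of a Terraform variable being interpolated into a local command, the block below appends a line to a log file on the machine running Terraform. The variable 'environment', its default value, the resource name and the file name 'deploy.log' are placeholders of mine, not part of the original scripts.

```hcl
# Hypothetical variable used only for this illustration
variable "environment" {
  default = "dev"
}

resource "null_resource" "local_demo" {
  provisioner "local-exec" {
    # Terraform resolves the variable first, then hands the command to the local shell
    command = "echo Provisioned ${var.environment} >> deploy.log"
  }
}
```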

Connections

When we run a provisioner, it is done within the context of a resource being created or destroyed. By default, a provisioner should use the connection Terraform already has to the machine; however, sometimes you need to help things along. To be associated with a resource, connections need to be declared within its context, as the following example demonstrates:

  provisioner "remote-exec" {
      inline = [
          "sudo mkdir /tmp/staging"
      ]

      connection {
          type        = "ssh"
          user        = "testadmin"
          host        = "<<some public IP or server host name>>"
          private_key = "${file("id_rsa")}"
      }
  }

You will notice that in the connection I define the user and a private key, but not a password. This is my choice - if you wish, you can configure remote hosts to use passwords. I have disabled this, and prefer to connect securely using keys. The interpolation shown loads a local file called 'id_rsa' that I have placed in the same folder as terraform.exe.
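
The SSH example suits Linux hosts. For Windows targets, such as the D:/IIS paths used in the file examples earlier, Terraform also supports a 'winrm' connection type, which authenticates with a user name and password. The following is a minimal sketch under that assumption. The variable names and the resource name 'windows_files' are hypothetical, and the target machine must already have WinRM enabled.

```hcl
# Assumed inputs: supply these from your own variables file or the command line
variable "vm_public_ip" {}
variable "admin_password" {}

resource "null_resource" "windows_files" {
  # A connection declared at resource level applies to every provisioner in the resource
  connection {
    type     = "winrm"
    host     = "${var.vm_public_ip}"
    user     = "testadmin"
    password = "${var.admin_password}"
  }

  # Same folder upload as in the file provisioning section
  provisioner "file" {
    source      = "apps/app1/"
    destination = "D:/IIS/webapp1"
  }
}
```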

Timeouts

Sometimes when using Terraform you will find that your script fails with a timeout. A common problem here is that the cloud provider (AWS, Google, Azure) is just taking its jolly old time spinning up the resources you have requested, or perhaps responding down the pipe with feedback to the script. To help with this issue, we can lengthen the 'timeout' on the connection used by the exec provisioners. The value is a duration string such as "30s" or "5m", and it tells Terraform how long to wait for the connection to become available before calling it quits. The default is five minutes, which is the same as the 300s shown below, so raise the value as your environment requires.

  provisioner "remote-exec" {
      inline = [
          "sudo mkdir /tmp/staging"
      ]

      connection {
          type        = "ssh"
          user        = "testadmin"
          host        = "<<some public IP or server host name>>"
          private_key = "${file("id_rsa")}"
          timeout     = "300s"
      }
  }

Null resources

It's worth mentioning here a useful thing called a 'null resource'. This is a type of resource that you can use to wrap more generic provisioners that you wish to carry out on the overall infrastructure, not just on a particular machine. Here is an example that echoes a message on the local machine you are running Terraform on when any instance of a cluster changes:

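  resource "null_resource" "cluster" {
      # Changes to any instance of the cluster require re-provisioning
      triggers = {
          cluster_instance_ids = "${join(",", aws_instance.cluster.*.id)}"
      }
      # Echo a message on the machine running Terraform (the message text is illustrative)
      provisioner "local-exec" {
          command = "echo Cluster instances changed"
      }
  }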

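The 'triggers' setting is a map of values that Terraform watches: when any of them changes, the null resource is replaced and its provisioners run again. That is the event that kicks off the change in the dependency graph.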

When things go wrong....

Despite our best efforts, sometimes things just don't work out as we would like. From a Terraform point of view, this might mean that the cloud provider does not complete a request, a request fails for some reason, etc. The way we handle these gracefully is to use the 'on_failure' setting. In the example below, let's say that we made a typo when naming the variable and put in 'NotesN' instead of just 'Notes'. We don't have a variable called 'NotesN', and even if we did, and it pointed to, say, a local file that didn't exist, this provisioner would fail. The default behaviour of Terraform in this case is to 'taint' the resource being created (that is, mark it as not completed and in need of rebuilding) and report an error. Setting 'on_failure' to 'fail' gives exactly this default behaviour. If, on the other hand, the failure is not a problem, you can set 'on_failure' to 'continue' and the provisioning will proceed regardless.

  # Copies the string in "content" into d:\data\someNotes.txt
  provisioner "file" {
      content     = "my notes: ${var.NotesN}"
      destination = "d:/data/someNotes.txt"
      on_failure  = "continue"   # or on_failure = "fail"
      ...
  }

Wrap-up

OK, that's the basics of provisioning. If you need further details and more options, you can read all about it on the Terraform provisioners documentation page.

I have attached an example script - just remember to fill in your own details! Happy coding, and don't forget to give the article a vote if it was useful!

In the next article in this series we will take a look at the next step, setting up a Kubernetes/Docker cluster inside the virtual network of machines we have created using Terraform.

Finally, if you want to get a broader view of things, this video about Terraform on Azure is worth a look.