Terraform, null_resources & Azure ASM API

Recently, I was trying to bring up virtual machines in Microsoft Azure but ran into this interesting & annoying problem of not being able to upload SSH keys via the terraform DSL. There is a provision to provide a ssh_key_thumbprint but sadly no way to upload what you would call a KeyPair in AWS jargon.

While terraform does not support this operation via its DSL, It is possible to achieve this using some less-explored features of terraform.


Solution

I am using OS X, so my code samples might include some OS X specific commands. However it should be fairly easy to carry out these operations on other operating systems too.

First, the azure cli must be installed. Easiest way to do that is using brew:

$: brew install azure-cli

Post installation you will have to authenticate the azure cli. But that’s fairly easy. All you have to do is $: azure login and subsequent instructions on the screen will handhold you through the process.

Next, generate a SSL certificate that meets the following requirements:

  • The certificate must contain a private key.
  • The certificate must be created for key exchange, exportable to a Personal Information Exchange (.pfx) file.
  • The certificate must use a minimum of 2048-bit encryption.

A SSH keypair requires to be associated with an azure service. So you can create a service.json with the following contents:

Here’s how you can generate a certificate, a .pfx file and upload it to Azure portal.

$service="domain-gamma"
openssl req -x509 \
  -key $service-deployer.key \
  -nodes \
  -days 1365 -newkey rsa:2048 \
  -out /tmp/$service-deployer.pem \
  -subj ‘/CN=domain.com/O=Domain Inc./C=US’
openssl x509 \
  -outform der \
  -in /tmp/$service-deployer.pem \
  -out /tmp/$service-deployer.pfx
azure service cert create $service /tmp/$service-deployer.pfx

Azure API also provides a way to fetch the list of all certificates uploaded and attached to it’s services.

piyush:azure master λ azure service cert list
info: Executing command service cert list
+ Getting cloud services
+ Getting cloud service certificates
data: Service Name Thumbprint Algorithm
data: domain-gamma 4F2AUA9ADF39830CDEHAJAND553DEANAJNAD8C8F sha1
info: service cert list command OK

The recently uploaded certificate has started showing up with a corresponding thumbprint, that can be used to provision new Azure machines.


Catch

So while the above example works well, it does not yet have an automatic essence to it. I am still responsible for the grunt work of checking if the certificate has been uploaded and if not, create one key pair, upload the .pfx and then save the thumbnail corresponding to that service, and all of this before running the terraform plan. Thing can be definitely be done better.


Explanation

You mainly have to observe these four things in the above example:

  • depends_on
  • null_resource.ssh_key
  • ssh_key_thumbprint: ${file(“./ssl/ssh_thumbprint”)}
  • ssl/cert.sh

depends_on

While most dependencies in Terraform are implicit; i.e Terraform is able to infer dependencies based on usage of attributes of other resources, Sometime you need to specify explicit dependencies. You can do this with the depends_on parameter which is available on any resource.

I recommend reading more about Terraform dependencies here.

By injecting a depends_on we can defer the responsibility of assurance of a thumbprint to another resource, but that should be done before an Instance is created.

Note (FAQ): Using a local-exec provisioner approach will not work here, because local-exec is done AFTER the resource has been created and not before. Also local-exec provisioner on any previous operation doesn’t guarantee re-run if the resource itself does not change.

Read on, for the solution.

null_resource

The null_resource is a resource that allows you to configure provisioners that are not directly associated with a single existing resource.

null_resource is like a dummy stub that you can use to insert a node that encapsulates provisioners between two existing stages of the graph. The position is determined by refering to this resource via a depends_on from the child resource. In this case, null_resource will be called from the azure_instance resource.

You can read more about terraform’s null_resource here.

Say we delegate all the duties to a standalone Bash script, we can invoke the script as a local-exec provisioner from the null_resource.

triggers

But what if someone deletes the ssh_thumbprint file? Every subsequent terraform run would panic and crash. Solution lies in triggers attribute of a null_resource. triggers is amapping of values which should trigger a rerun of this set of provisioners. Values are meant to be interpolated references to variables or attributes of other resources.

In this case it’s a file that is being read from the filesystem. So any changes forces the resource to be re-trigerred eventually forcing a re-converge on the instances that depend on this null_resource.

ssl/cert.sh

Putting together the bash script, which accepts the service name and tries to locate an existing uploaded certificate for that service. If not, it generates a new .pfx using the above mentioned techniques, fetches the ssh_key_thumbprint and saves it to a common file where terraform instance resource can read it from.


Now, you should be able to provision a SSH only VM and use the generated .pem file to login to your freshly created Virtual Machine. Yay!

Enjoyed our content? Subscribe to receive our latest articles right in your inbox:
(no spam, promise!)