When using Terraform, there’s no way to know when the EC2 instances in an AutoScaling Group are completely ready unless you put them behind an ELB. By completely ready, I mean the instances have finished running their cloud-init process (AKA userdata). See this GitHub issue for more details. But if we can reach the instances inside the VPC, there’s a workaround!

To do this, you’ll create your AutoScaling Group as usual:

resource "aws_autoscaling_group" "cluster" {
  desired_capacity     = var.ec2-instance-count
  launch_configuration = aws_launch_configuration.cluster.id
  max_size             = var.ec2-instance-count
  min_size             = 1
  name                 = "cluster"
  vpc_zone_identifier  = var.subnet-ids

  tag {
    key                 = "Name"
    value               = "cluster"
    propagate_at_launch = true
  }
}
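
For context, the launch configuration referenced above is where the userdata comes from. The post does not show it, so the following is only a minimal sketch: var.ami-id, var.instance-type, var.key-name and the userdata.sh file are hypothetical names, and security groups (which must allow SSH from wherever Terraform runs) are left out.

```hcl
# Hypothetical launch configuration; the variable names and userdata.sh
# are assumptions for illustration, not taken from the original setup.
resource "aws_launch_configuration" "cluster" {
  name_prefix   = "cluster-"
  image_id      = var.ami-id
  instance_type = var.instance-type
  key_name      = var.key-name

  # cloud-init runs this script on first boot; it is the work we wait on later.
  user_data = file("${path.module}/userdata.sh")

  lifecycle {
    create_before_destroy = true
  }
}
```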

Then, we’ll create a data source that looks up the group’s instances once the AutoScaling Group has been created:

data "aws_instances" "cluster" {
  filter {
    # AWS adds this tag automatically to every instance launched by the group.
    name   = "tag:aws:autoscaling:groupName"
    values = [aws_autoscaling_group.cluster.name]
  }
}

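Because the filter references the group’s name, Terraform reads this data source only after the AutoScaling Group has been created, and by default the aws_autoscaling_group resource waits for its instances to come into service before it counts as created. Now for the trick…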

Create a Terraform module and pass the instance IPs from the data source into it. The module’s variable should accept a list of strings:

variable "cluster-ips" {
  type = list(string)
}
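
The module also reads two other variables further down (the instance count and the key location), so it needs declarations for those as well. A minimal sketch, assuming the types and descriptions below, which are not taken from the original module:

```hcl
# Assumed declarations for the other inputs the module reads.
variable "ec2-instance-count" {
  type        = number
  description = "Number of instances expected in the AutoScaling Group"
}

variable "ec2-key-location" {
  type        = string
  description = "Path to the private key used to SSH to the instances"
}
```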

Calling the module should look something like this:

module "cluster-wait-for" {
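  source = "./cluster-wait-for"

  # Pass the list of private IPs, not the whole data source object.
  cluster-ips        = data.aws_instances.cluster.private_ips
  ec2-instance-count = var.ec2-instance-count
  # The module's connection block reads var.ec2-key-location, so pass it through.
  ec2-key-location   = var.ec2-key-location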
}

And now, inside that module, we create a null_resource that connects to each node over SSH and waits for cloud-init to finish:

resource "null_resource" "wait-for-cloud-init-cluster" {
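  # count uses the known instance count rather than length(var.cluster-ips),
  # because the IP list is not known until the AutoScaling Group exists.
  count = var.ec2-instance-count

  connection {
    type        = "ssh"
    user        = "ec2-user"
    host        = var.cluster-ips[count.index]
    private_key = file(var.ec2-key-location)
  }

  provisioner "remote-exec" {
    inline = [
      # Blocks until cloud-init (including userdata) has finished.
      "sudo cloud-init status --wait"
    ]
  }

  # Re-run the wait whenever the set of instance IPs changes.
  triggers = {
    workers = join(",", sort(var.cluster-ips))
  }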
}

Now, whenever the list of instance IPs changes, for example after the AutoScaling Group is updated, the trigger recreates this resource, which connects to each instance over SSH and waits until cloud-init has finished running.
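
Anything that must run only after the instances are ready can then depend on the module. A minimal sketch, assuming Terraform 0.13 or later (where depends_on can reference a whole module); the resource name and the echo command are placeholders for whatever step comes next:

```hcl
# Hypothetical downstream step; runs only after every
# wait-for-cloud-init-cluster resource in the module has completed.
resource "null_resource" "after-cluster-ready" {
  depends_on = [module.cluster-wait-for]

  provisioner "local-exec" {
    command = "echo 'cloud-init has finished on all cluster instances'"
  }
}
```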