DevOps in 2019 (Part 4)

Given the speed with which technology is changing, we thought it would be interesting to ask IT professionals to share their thoughts on their predictions for 2019. Here’s more of what they are thinking about DevOps:

Stefano Bellasio, CEO, Cloud Academy

Original Link

Ad-Hoc YAML DSLs and Productivity

One of my frustrations with the DevOps and cloud infrastructure tools is that most of them are badly designed DSLs that eschew all features of modern programming languages. Things like modules, data structures, functions, imperative control flow constructs, debuggers, linters, standard versioning/deployment practices, and rich library ecosystems are all missing. Of course, it is hard to do any real work without these features so the folks using these tools at some point come to the same conclusion and re-invent non-standard analogs to get by. The re-invention usually ends up being some kind of templating system built with a real language. Two obvious examples I can think of are Ansible with its Jinja templating and Terraform with its own ad-hoc variable interpolation mechanism that I presume is built on top of Go’s templating features. Oh and I almost forgot Kubernetes and Helm.

The arguments the tool designers bring up for why they made yet another DSL are usually some variation of "YAML or FooBarLang is declarative and so it reduces complexity". On the surface, this seems to make sense because declarative solutions, in theory, reduce complexity by hiding more details but when you start actually trying to solve problems the shortcomings become obvious. When real-world use cases are brought up along with the shortcomings of the tool to address them the response ends up being some variation of "You’re using the tool wrong". Again, this kinda makes sense until you dig deeper and realize that it’s not really an answer. Tools must always be subordinate to human intentions. If a tool can not accomplish a goal or requires extraordinary workarounds then it’s not the user’s fault and the logical conclusion is that the tool is badly designed. If I want to write a loop and can’t for some reason then that’s a lack of foresight on the tool designer’s part and not my problem. There could be several valid reasons I’d want to use a loop (or recursion) but because DSLs are not really programming languages I don’t have any real recourse other than to figure out how to work around the limitation.
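The templating workarounds described above are easy to illustrate. The sketch below is a hypothetical Ansible task (the file name and variable list are made up): since YAML itself has no loop construct, iteration gets pushed into a Jinja template bolted onto the DSL, exactly the kind of re-invented non-standard analog the author is criticizing.

```yaml
# Hypothetical example. YAML has no loops, so the iteration lives in a
# Jinja template (upstream.conf.j2 would contain something like:
#   {% for h in backend_hosts %}server {{ h }};{% endfor %})
- name: Render an nginx upstream block for every backend host
  template:
    src: upstream.conf.j2
    dest: /etc/nginx/conf.d/upstream.conf
  vars:
    backend_hosts:
      - app1.internal
      - app2.internal
```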

Original Link

The DevOps Road Map — A Guide for Programmers

DevOps is really hot at the moment, and many of the friends, colleagues, and senior developers I know are working hard to become DevOps engineers and project themselves as DevOps champions in their organizations.

While I truly acknowledge the benefits of DevOps, which are directly linked to improved software development and deployment, from my limited experience I can say that it’s not an easy job. It’s very difficult to choose the right path amid so many tools and practices.

Original Link

10 Easy-to-Use Modules in Ansible

Ansible is all about using modules in its playbooks. This article covers my ten most-used modules.

Before reading this blog, I would like to explain some of the terminology used below.
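The original article's module list is not reproduced in this excerpt; as a hedged illustration, a playbook exercising a few of the most commonly used core modules (apt, copy, service — the host group and file names are hypothetical) might look like:

```yaml
# Hypothetical playbook using three widely used core Ansible modules.
- hosts: webservers
  become: true
  tasks:
    - name: Install nginx           # apt module manages Debian/Ubuntu packages
      apt:
        name: nginx
        state: present
    - name: Deploy a static page    # copy module pushes a local file to the host
      copy:
        src: index.html
        dest: /var/www/html/index.html
    - name: Ensure nginx is running # service module manages system services
      service:
        name: nginx
        state: started
```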

Original Link

Installing CI/CD Tools With Ansible: Everything You Need to Know

When setting up a CI/CD pipeline, you need to choose the right tools. Things you should take into consideration include your specific needs, abilities and ease of use. In my last post, I recommended different tools for building a CI/CD pipeline. Today, I will explain how to install and configure them with Ansible. In the end, I will provide resources that explain how to configure them to work together.
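The article's own tool list is not included in this excerpt, so as a hedged stand-in, installing one common CI choice (Jenkins) with Ansible might be sketched like this; the host group is hypothetical, and the Jenkins apt repository is assumed to have been configured beforehand:

```yaml
# Hypothetical sketch: installing Jenkins with Ansible on Ubuntu.
# Assumes the Jenkins apt repository and key were already added.
- hosts: ci-server
  become: true
  tasks:
    - name: Install Java prerequisite
      apt:
        name: openjdk-8-jdk
        state: present
    - name: Install Jenkins
      apt:
        name: jenkins
        state: present
        update_cache: yes
    - name: Start Jenkins on boot
      service:
        name: jenkins
        state: started
        enabled: yes
```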

The tools we will show are

Original Link

Getting Started With Ansible Tower

Today’s scaling industries aim to provide large productivity gains, but they have to deal with a wide variety of automation challenges, which are overcome by tools such as Ansible. Let’s start with what Ansible Tower is.

What Is Ansible Tower?

Ansible Tower is Ansible at an enterprise level. It is a web-based solution for managing your organization with a very easy user interface that provides a dashboard with state summaries of all the hosts, allows quick deployments, and monitors all configurations. Tower allows you to share SSH credentials without exposing them, log all jobs, manage inventories graphically, and sync them with a wide variety of cloud providers.

Original Link

Ansible: An Effective IT Automation Tool

Ansible is an effective IT automation tool that is used to automate application deployment, cloud provisioning, configuration management, and more. It is specifically designed for multi-tier deployments and models IT infrastructure by describing how systems interrelate, instead of managing each of them individually. One of the best aspects of this engine is that it doesn’t employ any agents or additional custom security infrastructure, making it effortless to deploy. It further uses a simple language that allows the system admin to describe automation tasks in plain English. By default, Ansible represents the nodes it manages using simple files that organize the managed hosts into custom groups. Instead of managing individual nodes one at a time, the tool configures every node in the labeled group.

How Does Ansible Work?

Now that we know what Ansible is, let’s delve into how it works to provide the utmost system efficiency on a server. Ansible hosts are well-structured and described in an inventory file.
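For reference, a minimal inventory in Ansible's YAML format (the host names here are hypothetical) groups hosts so that plays can target them collectively:

```yaml
# Hypothetical YAML-format inventory; host names are illustrative.
all:
  children:
    webservers:
      hosts:
        web1.example.com:
        web2.example.com:
    dbservers:
      hosts:
        db1.example.com:
          ansible_user: admin   # per-host variables can be set inline
```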

Original Link

Using Ansible Galaxy Roles in Ansible Playbook Bundles

The Open Service Broker API standard aims to standardize how services (cloud, third-party, on-premise, legacy, etc.) are delivered to applications running on cloud platforms like OpenShift. This allows applications to consume services the exact same way no matter which cloud platform they are deployed on. The service broker’s pluggable architecture enables admins to add third-party brokers to the platform in order to make third-party and cloud services available to application developers directly from the OpenShift service catalog. Examples include the AWS Service Broker, created jointly by Amazon and Red Hat; the Azure Service Broker, created by Microsoft; and the Helm Service Broker, created by Google, which allow consumption of AWS services, Azure services, and Helm charts on Kubernetes and OpenShift. Furthermore, admins can create their own brokers in order to make custom services, like provisioning an Oracle database on their internal Oracle RAC, available to developers through the service catalog.

OpenShift Automation Broker is a service broker that is included in OpenShift out-of-the-box and leverages a lightweight, container-based application definition called an Ansible Playbook Bundle (APB) to automate service provisioning using Ansible. The playbooks can perform actions on OpenShift platform, as well as off platform such as provisioning a virtual machine on VMware and installing a database on it.
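An APB is defined by an apb.yml metadata file alongside Ansible playbooks for each action (provision, deprovision, etc.). The following is only a sketch: the field names follow the APB spec as I understand it, but the service name, description, and plan details are made up for illustration.

```yaml
# Hypothetical apb.yml sketch; name and plan contents are illustrative.
version: 1.0
name: example-db-apb
description: Provisions an example database service
bindable: true
async: optional
plans:
  - name: default
    description: Default plan
    free: true
    parameters: []
```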

Original Link

Thrashing Code Sessions, Week 34 [Video]

In this session, I talk through the issues I had in a previous session (but I give context, so there’s no need to review it) with the environment variable problems I was having. I detail how JetBrains Rider deals with environment variables for the user session, specifically versus how those are dealt with if you’re using something like Visual Studio Code and its built-in terminal. After that, I move on to doing some refactoring and moving toward some feature additions.

In this session, my friend Jeremy joins me to set up PXE boot, net install, Debian loads, configuration, Ansible, and a whole lot more to set up a network and cluster of machines for use. It’s long, but he covers a LOT of ground with plenty of tips and tricks to pick up. NOTE: the sound is rough in some places. I’ll try to fix that in post-production editing. No promises just yet though; just crank it to 11 in the meantime.

Original Link

DevOps Anonymous — Lessons Learned From Building Cloud Infrastructure From Scratch


I was sitting at yet another DevOps Anonymous (DA) meeting. I was not actually in DevOps (yes, I know that having a DevOps role misses the point of the DevOps approach, but I’m trying to tell a story here, so be quiet). I was a software engineer at that time, mostly working with Java. But I enjoyed listening to DevOps people having problems. They all loved me because they thought I also received alerts in the middle of the night; I didn’t.

It was Bob’s turn: "I was pushing to automate things from the start, but nobody listened. Everyone was making ad hoc changes, solving the problem at hand. Nobody was documenting anything. Until one day we had to replicate the environment for a different customer. It took us three months to untie everything and set it up again." -Bob, eyes glazed with tears.

Original Link

ELK Clusters on AWS With Ansible

In the previous post, we built a virtual ELK cluster with Vagrant and Ansible, where the individual VMs comprising the cluster were carved out of a single host. While that allowed for self-contained development and testing of all the necessary artifacts, it is not a real-world scenario. The components of the ELK stack are usually on separate, possibly dedicated, hosts. Fortunately, this does not mean that we are back at square one in our efforts to put up an ELK cluster in these cases. Having used Ansible roles for each of the software components earlier, we already have an idempotent and reproducible means to deliver software to hosts. What changes is the provisioning of the hosts and the targeting of sub-groups among them for different roles, as we switch the provisioner from VirtualBox to something else. Here we choose AWS as the host provisioner and devote the bulk of this blog to the mechanics of building the ELK cluster on AWS with Ansible. At the end, we touch upon the small modifications needed to our earlier playbook for delivering software to these hosts. You can download the build-out from GitHub.

1. The Cluster

We prepare a YAML file with some information on the type and number of hosts for each group in the ELK cluster, along with some tags that allow us to pull out a specific group of hosts by tag later for software delivery.
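The referenced spec file is not reproduced in this excerpt; a hypothetical sketch of such a host-group spec (the keys, instance types, counts, and tags are all made up for illustration) might look like:

```yaml
# Hypothetical host-group spec for the AWS build-out;
# instance types, counts, and tag names are illustrative.
es-master-nodes:
  instance_type: t2.medium
  count: 1
  tags:
    elk_role: es-master
es-data-nodes:
  instance_type: t2.large
  count: 2
  tags:
    elk_role: es-data
logstash-nodes:
  instance_type: t2.medium
  count: 1
  tags:
    elk_role: logstash
```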

Original Link

ELK Stack With Vagrant and Ansible

I had been playing with ELK on a routine basis, so, for what I thought to be a quick win, I decided to add to my earlier blog post on Building Elasticsearch clusters with Vagrant. Well, it did not quite turn out that way and I had to cover a good bit of ground and publish code to other repos in order for this blog to be useful.

To recap, that post used (a) VirtualBox as the means to build the VMs for the cluster, and (b) a shell script to orchestrate the installation and configuration of an Elasticsearch cluster on those VMs. In this post, we will still use VirtualBox for giving us the VMs, but enhance the provisioning in two ways.

  1. We will build a full ELK stack where application logs are shipped by Beats to a Logstash host for grokking and posting to an ES cluster hooked to Kibana for querying and dashboards. Here is a schematic.
  2. The provisioning (install and config) of the software for each of E (Elasticsearch), L (Logstash), K (Kibana), and the Filebeat plugin is done via Ansible playbooks. Why? While provisioning with shell scripts is very handy, it is programmatic and can make building complex coupled software systems across a cluster of hosts too complicated. Ansible hides much of that and instead presents a more or less declarative way (playbooks!) of orchestrating the provisioning. While there are alternatives, Ansible has become insanely popular lately in the DevOps world.

You can download the code from GitHub to play along with the build out.

1. The Inventory

We need 7 VMs: 2 for applications with Filebeat, 1 ES master node, 2 ES data nodes, and 1 each for Logstash and Kibana. The names and IP addresses for these VMs will be needed both by Vagrant for creating them and, later, by Ansible for provisioning. So we prepare a single inventory file and use it with both Vagrant and Ansible. Further, this file rations the CPU/memory resources of my 8-core, 16GB-memory laptop across these 7 VMs. The file is simply YAML that is processed in Ruby by Vagrant and in Python by Ansible. Our inventory.yml file looks like:

es-master-nodes:
  hosts:
    es-master-1:      # hostname
      ansible_host:   # ip address
      ansible_user: vagrant
      memory: 2048    # ram to be assigned in MB
      ansible_ssh_private_key_file: .vagrant/machines/es-master-1/virtualbox/private_key
es-data-nodes:
  hosts:
    es-data-1:
      ansible_host:
      ansible_user: vagrant
      memory: 2048
      ansible_ssh_private_key_file: .vagrant/machines/es-data-1/virtualbox/private_key
    es-data-2:
      ansible_host:
      ansible_user: vagrant
      memory: 2048
      ansible_ssh_private_key_file: .vagrant/machines/es-data-2/virtualbox/private_key
kibana-nodes:
  hosts:
    kibana-1:
      ansible_host:
      ansible_user: vagrant
      memory: 512
      ansible_ssh_private_key_file: .vagrant/machines/kibana-1/virtualbox/private_key
logstash-nodes:
  hosts:
    logstash-1:
      ansible_host:
      ansible_user: vagrant
      memory: 1536
      ansible_ssh_private_key_file: .vagrant/machines/logstash-1/virtualbox/private_key
filebeat-nodes:
  hosts:
    filebeat-1:
      ansible_host:
      ansible_user: vagrant
      memory: 512
      ansible_ssh_private_key_file: .vagrant/machines/filebeat-1/virtualbox/private_key
    filebeat-2:
      ansible_host:
      ansible_user: vagrant
      memory: 512
      ansible_ssh_private_key_file: .vagrant/machines/filebeat-2/virtualbox/private_key

2. The Vagrantfile

The Vagrantfile below builds each of the 7 VMs as per the specs in the inventory.

require 'rbconfig'
require 'yaml'

DEFAULT_BASE_BOX = "bento/ubuntu-16.04"
cpuCap = 10 # Limit to 10% of the cpu
inventory = YAML.load_file("inventory.yml") # Get the names & ip addresses for the guest hosts
VAGRANTFILE_API_VERSION = '2'

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  config.vbguest.auto_update = false
  inventory.each do |group, groupHosts|
    next if (group == "justLocal")
    groupHosts['hosts'].each do |hostName, hostInfo|
      config.vm.define hostName do |node|
        node.vm.box = hostInfo['box'] ||= DEFAULT_BASE_BOX
        node.vm.hostname = hostName # Set the hostname
        node.vm.network :private_network, ip: hostInfo['ansible_host'] # Set the IP address
        ram = hostInfo['memory'] # Set the memory
        node.vm.provider :virtualbox do |vb|
          vb.name = hostName
          vb.customize ["modifyvm", :id, "--cpuexecutioncap", cpuCap, "--memory", ram.to_s]
        end
      end
    end
  end
end

The VMs are created simply with the vagrant up --no-provision command and the cluster is provisioned with Ansible.

3. The Playbook

The main playbook will simply delegate the specific app provisioning to roles while overriding some defaults as needed. We override the port variables in the main playbook so we can see they match up as per our schematic for the cluster. Some other variables are overridden in group_vars/* files to keep them from cluttering the main playbook. The cluster is provisioned with ansible-playbook -i inventory.yml elk.yml where elk.yml is the file below. 

- hosts: es-master-nodes
  become: true
  roles:
    - { role: elastic.elasticsearch, cluster_http_port: 9201, cluster_transport_tcp_port: 9301 }
- hosts: es-data-nodes
  become: true
  roles:
    - { role: elastic.elasticsearch, cluster_http_port: 9201, cluster_transport_tcp_port: 9301 }
- hosts: kibana-nodes
  become: true
  roles:
    - { role: ashokc.kibana, kibana_server_port: 5601, cluster_http_port: 9201 }
- hosts: logstash-nodes
  become: true
  roles:
    - { role: ashokc.logstash, cluster_http_port: 9201, filebeat_2_logstash_port: 5044 }
- hosts: filebeat-nodes
  become: true
  roles:
    - { role: ashokc.filebeat, filebeat_2_logstash_port: 5044 }

The directory layout shows a glimpse of all that is under the hood.

├── elk.yml
├── group_vars
│   ├── all.yml
│   ├── es-data-nodes.json
│   ├── es-master-nodes.json
│   ├── filebeat-nodes.yml
│   ├── kibana-nodes.yml
│   └── logstash-nodes.yml
├── inventory.yml
├── roles
│   ├── ashokc.filebeat
│   ├── ashokc.kibana
│   ├── ashokc.logstash
│   └── elastic.elasticsearch
└── Vagrantfile

Common variables for all the host groups are specified in group_vars/all.yml. The variable 'public_iface' can vary depending on the VM provider. For Vagrant here, it is "eth1." We use it to pull out the IP address of a host from ansible_facts whenever it's required in the playbook. The file group_vars/all.yml, in our case, will be:

public_iface: eth1 # For Vagrant Provider
elk_version: 5.6.1
es_major_version: 5.x
es_version: "{{ elk_version }}"
es_apt_url: deb https://artifacts.elastic.co/packages/{{ es_major_version }}/apt stable main

3.1 Elasticsearch

The provisioning of Elasticsearch on master and data nodes is delegated to the excellent role elastic.elasticsearch published by Elastic. As the role allows for multiple instances of ES on a host, we name the instances "{{cluster_http_port}}_{{cluster_transport_tcp_port}}", which serves as a unique identifier. The ES cluster itself is taken to be defined by this pair of ports, which are used by all the master/data members of the cluster. If we rerun the playbook with a separate pair, say 9202 and 9302, we will get a second cluster, '9202_9302' (in addition to the '9201_9301' cluster that we get here on the first run), on the same set of hosts, and all would work fine.
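The second-cluster scenario described above can be sketched as a rerun of the same plays with the alternate port pair (this mirrors the elk.yml play structure; only the ports change):

```yaml
# Sketch: rerunning the elasticsearch role with a second port pair
# stands up an additional, independent '9202_9302' cluster instance
# on the same hosts.
- hosts: es-master-nodes
  become: true
  roles:
    - { role: elastic.elasticsearch, cluster_http_port: 9202, cluster_transport_tcp_port: 9302 }
- hosts: es-data-nodes
  become: true
  roles:
    - { role: elastic.elasticsearch, cluster_http_port: 9202, cluster_transport_tcp_port: 9302 }
```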

The master node configuration variables are in the file group_vars/es-master-nodes.json shown below. The key lines here are 5, 13, and 14, where we derive the "network.host" and "discovery.zen.ping.unicast.hosts" settings for Elasticsearch from the information in the inventory file.

{
  "es_java_install": true,
  "es_api_port": "{{cluster_http_port}}",
  "es_instance_name": "{{cluster_http_port}}_{{cluster_transport_tcp_port}}",
  "masterHosts_transport": "{% for host in groups['es-master-nodes'] %} {{hostvars[host]['ansible_'+public_iface]['ipv4']['address'] }}:{{cluster_transport_tcp_port}}{% endfor %}",
  "es_config": {
    "node.name": "{{es_instance_name}}",
    "http.port": "{{cluster_http_port}}",
    "transport.tcp.port": "{{cluster_transport_tcp_port}}",
    "node.master": true,
    "node.data": false,
    "network.host": ["{{ hostvars[inventory_hostname]['ansible_' + public_iface]['ipv4']['address'] }}", "_local_"],
    "discovery.zen.ping.unicast.hosts": "{{ masterHosts_transport.split() }}"
  }
}

The data node configuration variables are very similar in the file group_vars/es-data-nodes.json below. The lines 2, 12, and 13 show the only changes.

{
  "es_data_dirs": "/opt/elasticsearch",
  "es_java_install": true,
  "es_api_port": "{{cluster_http_port}}",
  "es_instance_name": "{{cluster_http_port}}_{{cluster_transport_tcp_port}}",
  "masterHosts_transport": "{% for host in groups['es-master-nodes'] %} {{hostvars[host]['ansible_'+public_iface]['ipv4']['address'] }}:{{cluster_transport_tcp_port}}{% endfor %}",
  "es_config": {
    "node.name": "{{es_instance_name}}",
    "http.port": "{{cluster_http_port}}",
    "transport.tcp.port": "{{cluster_transport_tcp_port}}",
    "node.master": false,
    "node.data": true,
    "network.host": ["{{ hostvars[inventory_hostname]['ansible_' + public_iface]['ipv4']['address'] }}", "_local_"],
    "discovery.zen.ping.unicast.hosts": "{{ masterHosts_transport.split() }}"
  }
}

3.2 Logstash

Logstash is provisioned with the role ashokc.logstash. The default variables for this role are overridden with group_vars/logstash-nodes.yml. Lines 4-5 specify the user and group that own this instance of Logstash. Lines 9 and 10 derive the Elasticsearch URLs from the inventory file; these are used to configure the Elasticsearch output section.


es_java_install: True
update_java: False
logstash_version: "{{ elk_version }}"
logstash_user: logstashUser
logstash_group: logstashGroup
logstash_enabled_on_boot: yes
logstash_install_plugins:
  - logstash-input-beats
esMasterHosts: "{% for host in groups['es-master-nodes'] %} http://{{hostvars[host]['ansible_'+public_iface]['ipv4']['address'] }}:{{cluster_http_port}}{% endfor %}"
logstash_es_urls: "{{ esMasterHosts.split() }}"

A simple Elasticsearch output config and filebeat input config are enabled with the following.


output {
  elasticsearch {
    hosts => {{ logstash_es_urls | to_json }}
  }
}


input {
  beats {
    port => {{filebeat_2_logstash_port}}
  }
}

3.3 Kibana

Kibana is provisioned with the role ashokc.kibana. The default variables for this role are again overridden with group_vars/kibana-nodes.yml. Unlike Logstash, it is quite common to run multiple Kibana servers on a single host, with each instance targeting a separate ES cluster. This role allows for that and identifies a Kibana instance by the port it runs on (line 7). Lines 2 and 3 specify the owner/group for the instance.


kibana_version: "{{ elk_version }}"
kibana_user: kibanaUser
kibana_group: kibanaGroup
kibana_enabled_on_boot: yes
kibana_elasticsearch_url : http://{{hostvars[groups['es-master-nodes'][0]]['ansible_'+public_iface]['ipv4']['address'] }}:{{cluster_http_port}}
kibana_instance: "{{kibana_server_port}}"

The template file for kibana.yml below picks up the correct Elasticsearch cluster URL from the variables above:


server.port: {{ kibana_server_port }}
server.host: {{ kibana_server_host }}
elasticsearch.url: {{ kibana_elasticsearch_url }}
pid.file: {{ kibana_pid_file }}
logging.dest: {{ kibana_log_file }}

3.4 Filebeat

Filebeat is provisioned with the role ashokc.filebeat. The default variables are overridden in group_vars/filebeat-nodes.yml below. Lines 5 and 7 figure out the Logstash connection to use.

filebeat_version: "{{ elk_version }}"
filebeat_enabled_on_boot: yes
filebeat_user: filebeatUser
filebeat_group: filebeatGroup
logstashHostsList: "{% for host in groups['logstash-nodes'] %} {{hostvars[host]['ansible_'+public_iface]['ipv4']['address'] }}:{{filebeat_2_logstash_port}}{% endfor %}"
filebeat_logstash_hosts: "{{ logstashHostsList.split() }}"

Line #14 in the template for the sample filebeat.yml below configures the output to our logstash host at the right port.

- type: log
  enabled: true
  paths:
    - /tmp/custom.log
  fields:
    log_type: custom
    type: {{ansible_hostname}}
    from: beats
  multiline.pattern: '^\s[+]{2}\scontinuing .*'
  multiline.match: after
output.logstash:
  hosts: {{ filebeat_logstash_hosts | to_nice_yaml }}

4. Logs

The last step would be to run an application on the filebeat nodes and watch the logs flow into Kibana. Our application would simply be a Perl script that writes the log file /tmp/custom.log. We log in to each of the filebeat hosts and run the following Perl script.

#!/usr/bin/perl -w
use strict ;
no warnings 'once';
my @codes = qw (fatal error warning info debug trace) ;
open(my $fh, ">>", "/tmp/custom.log") ;
my $now = time();
for my $i (1 .. 100) {
    my $message0 = "Type: CustomLog: This is a generic message # $i for testing ELK" ;
    my $nDays = int(rand(5)) ;
    my $nHrs = int(rand(24)) ;
    my $nMins = int(rand(60)) ;
    my $nSecs = int(rand(60)) ;
    my $timeValue = $now - $nDays * 86400 - $nHrs * 3600 - $nMins * 60 - $nSecs ;
    my $now1 = localtime($timeValue) ;
    my $nMulti = int(rand(10)) ;
    my $message = "$now1 $nDays:$nHrs:$nMins:$nSecs $nMulti:$codes[int(rand($#codes))] $message0" ;
    if ($nMulti > 0) {
        for my $line (1 .. $nMulti) {
            $message = $message . "\n ++ continuing the previous line for this log error..." ;
        }
    }
    print $fh "$message\n" ;
}
close $fh ;

The corresponding sample logstash config file for processing this log would be placed at roles/ashokc.logstash/files/custom-filter.conf

filter {
  if [fields][log_type] == "custom" {
    grok {
      match => [ "message", "(?<matched-timestamp>\w{3}\s+\w{3}\s+\d{1,2}\s+\d{1,2}:\d{1,2}:\d{1,2}\s+\d{4})\s+(?<nDays>\d{1,3}):(?<nHrs>\d{1,2}):(?<nMins>\d{1,2}):(?<nSecs>\d{1,2})\s+(?<nLines>\d{1,2}):(?<code>\w+) Type: (?<given-type>\w+):[^#]+# (?<messageId>\d+)\s+%{GREEDYDATA}" ]
      add_tag => ["grokked"]
      add_field => { "foo_%{nDays}" => "Hello world, from %{nHrs}" }
    }
    mutate {
      gsub => ["message", "ELK", "BULK"]
    }
    date {
      match => [ "timestamp" , "EEE MMM d H:m:s Y", "EEE MMM d H:m:s Y" ]
      add_tag => ["dated"]
    }
  }
}


By placing appropriate filter files for Logstash at roles/ashokc.logstash/files and a prospector config file for Filebeat at roles/ashokc.filebeat/templates/filebeat.yml.j2, one can use this ELK stack to analyze application logs. A variety of extensions are possible, for example enabling X-Pack login/security, supporting other distributions and versions in the 'ashokc' roles, automated testing, etc. But then, there is always more to be done, isn't there?

Original Link

Running Ansible on a Windows System

At my last conference talk (it was about Ansible and Docker at DevOpsCon in Berlin), I was asked about the best way to run Ansible on a Windows system. Ansible itself requires a Linux-based system as the control machine. When I have to develop on a Windows machine, I install a Linux-based virtual machine and run the Ansible playbooks inside it. I set up the virtual machine with VirtualBox and Vagrant. These tools allow me to share the playbooks easily between the host and the virtual machine, so I can develop the playbooks on the Windows system while the virtual machine has a headless setup. The next section shows you how to set up this toolchain.

Toolchain Setup

First, install VirtualBox and Vagrant on your machine. I additionally use Babun, a Windows shell based on Cygwin and oh-my-zsh, for a better shell experience on Windows, but this isn't necessary. Then, go to the directory (let's call it ansible-workspace) where your Ansible playbooks are located, and create a Vagrant configuration file there with the command vagrant init:

├── demo-app-ansible-deploy-1.0-SNAPSHOT.war
├── deploy-demo.yml
├── inventories
│   ├── production
│   └── test
├── roles
│   ├── deploy-on-tomcat
│   │   ├── defaults
│   │   │   └── main.yml
│   │   └── tasks
│   │       ├── cleanup-webapp.yml
│   │       ├── deploy-webapp.yml
│   │       ├── main.yml
│   │       ├── start-tomcat.yml
│   │       └── stop-tomcat.yml
│   ├── jdk
│   │   └── tasks
│   │       └── main.yml
│   └── tomcat8
│       ├── defaults
│       │   └── main.yml
│       ├── files
│       │   └── init.d
│       │       └── tomcat
│       ├── tasks
│       │   └── main.yml
│       └── templates
│           └──
├── setup-app-roles.yml
├── setup-app.yml
└── Vagrantfile

Now, we have to choose a so-called Vagrant box on Vagrant Cloud. A box is the package format for a Vagrant environment. It depends on the provider and the operating system that you choose to use. In our case, it is a VirtualBox VM image based on a minimal Ubuntu 18.04 system (the box name is bento/ubuntu-18.04). This box is configured in our Vagrantfile:

Vagrant.configure("2") do |config|
  config.vm.box = "bento/ubuntu-18.04"
end

The next step is to ensure that Ansible will be installed in the box. Thus, we use the shell provisioner of Vagrant. The Vagrantfile is extended by the provisioning code:

Vagrant.configure("2") do |config|
  # ... other Vagrant configuration
  config.vm.provision "shell", inline: <<-SHELL
    sudo apt-get update -y
    sudo apt-get install -y software-properties-common
    sudo apt-add-repository ppa:ansible/ansible
    sudo apt-get update -y
    sudo apt-get install -y ansible
    # ... other Vagrant provision steps
  SHELL
end

The last step is to copy the SSH credentials into the Vagrant box. Thus, we mark the SSH credential folder of the host system as a shared folder, so that we can copy the credentials to the SSH config folder inside the box.

Vagrant.configure("2") do |config|
  # ... other Vagrant configuration
  config.vm.synced_folder ".", "/vagrant"
  config.vm.synced_folder "path to your ssh config", "/home/vagrant/ssh-host"
  # ... other Vagrant configuration
  config.vm.provision "shell", inline: <<-SHELL
    # ... other Vagrant provision steps
    cp /home/vagrant/ssh-host/* /home/vagrant/.ssh/.
  SHELL
end


After setting up the toolchain, let's look at how to work with it. I write my Ansible playbooks on the Windows system and run them from the Linux guest system against the remote hosts. To run the Ansible playbooks, we have to start the Vagrant box.

> cd ansible-workspace
> vagrant up

When the Vagrant box is ready to use, we can jump into the box with:

> vagrant ssh

You can find the Ansible playbooks inside the box in the folder /vagrant. In this folder, run Ansible:

> cd /vagrant
> ansible-playbook -i inventories/test -u tekkie setup-db.yml


On Windows 10, it may be possible to run Ansible natively thanks to the Windows Subsystem for Linux, but I haven't tried it. Some Docker fans would prefer a container instead of a virtual machine. But remember that before Windows 10, Docker itself ran on Windows inside a virtual machine, so I don't see a benefit to using Docker instead of a virtual machine there. With Windows 10's native container support, though, a Docker-based setup is a good alternative if Ansible doesn't run on the Linux subsystem.

Do you have another idea or approach? Let me know and write a comment.


Original Link

Provision an AWS VPC Using Ansible

This tutorial explains how to manually provision an AWS Virtual Private Cloud (VPC) using Ansible. Before you start, you should be familiar with the following concepts:

The best way to get started is to install Ansible and run playbooks manually on your local machine to provision a VPC. The first section of this tutorial explains how to do that. However, manual execution isn't the most efficient way to run Ansible playbooks, so we will take a look at the challenges and learn how to automate this workflow with a step-by-step tutorial.

Follow the steps below in order to provision your VPC.

Step 1: Prep Your Machine

You can run your Ansible scripts manually on your local machine to provision a VPC. This is the best way to get started with this task.

Step 2: Prepare Ansible Playbook

Ansible uses a convention for folder structure that looks something like this:

  • ansible.cfg holds configuration info
  • inventory has the inventory of artifacts
  • variables.yml has the vars that you need for your scripts to make it more reusable
  • vpc_prov_playbook.yml is the playbook which has a list of tasks to execute
├── ansible.cfg
├── inventory
├── variables.yml
├── vpc_prov_playbook.yml

If you do not have your own ansible playbook, please feel free to clone our sample playbook here.

In our scenario, the important files are:

  • vpc_prov_playbook.yml, which is the playbook config containing the tasks that should be run as part of this playbook.
  • variables.yml, which contains wildcard settings for the playbook.

It is important to note the following:

  • vpc_prov_playbook.yml scripts have some wildcards, which Ansible replaces by reading values from variables.yml.
  • Since we want to create a reusable playbook, we have not hardcoded values in variables.yml, but left it up to the user to replace them when needed. This will be done in a later step, just before running the playbook.
  • Replace the wildcards in variables.yml with your desired values: ${vpc_region} ${vpc_name} ${vpc_cidr_block} ${vpc_access_from_ip_range} ${vpc_public_subnet_1_cidr}
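Filled in, a variables.yml for the wildcards listed above might look like this; the values are purely illustrative, not taken from the sample repo:

```yaml
# Hypothetical values replacing the wildcards listed above.
vpc_region: us-east-1
vpc_name: demo-vpc
vpc_cidr_block: 10.0.0.0/16
vpc_access_from_ip_range: 0.0.0.0/0   # open to the world; restrict in practice
vpc_public_subnet_1_cidr: 10.0.1.0/24
```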

Step 3: Run Your Playbook!

  • Execute the following command to run the Ansible playbook from the directory that contains the playbook.
$ ansible-playbook -v vpc_prov_playbook.yml 
  • Verify on AWS that the VPC was created successfully.

Challenges with Running Ansible Playbooks Manually

While manual execution is great while getting started, you’ll run into some challenges if you continue doing this manually.

Reduced Reusability

The vpc_prov_playbook.yml is a reusable playbook, i.e. it has wildcards for settings like region, name, and CIDR blocks. This means that as long as you inject the right values using variables.yml, the playbook can be used to provision multiple VPCs. However, this also means that you need to be very careful to use the right variables.yml each time, and the number of these files will multiply over time. This defeats the reusability of your playbook. The right way to do this is to have a programmatic way to inject the right values based on context.

Security Concerns

The machine you use to run your playbook needs to be authenticated to the AWS account. If you want to provision using different credentials, you'll need to keep switching accounts or use different machines. The machines also need to be secure, since your AWS credentials will be accessible on the machine unless you clean up after every execution.

In a nutshell, if you want to achieve frictionless execution of Ansible playbooks with modular, reusable playbooks, you need to templatize your playbooks and automate the workflow used to execute them.

Automated Provisioning of AWS EC2 VMs using Ansible 

To show you how to automate the provisioning of your AWS infrastructure, we have designed a step-by-step tutorial in our documentation.

Original Link

Unit Testing and DevOps

I gave a small talk about this a few years ago at the LA DevOps meetup. The slides are here.

I should start with some background on why I did this talk in the first place. Unit testing (and integration and end-to-end testing) is familiar to even a junior developer. It's definitely an extensive and hotly debated topic: anything from TDD to the quality of tests to the significance of coverage to prioritizing integration over unit tests, and so on. These are interesting subjects, but they are for a different post.

Unfortunately, unit testing is often a foreign concept in many DevOps/syseng/ops teams. The way I look at systems engineering or DevOps (I hate the term when used as a job description, though I think I lost that battle) is that in a modern organization, they are fundamentally software development teams with software engineers. It just so happens that their domain of expertise is systems/observability/scalability, etc and their product is often the tooling or the glue or the platform that makes all of this happen.

But if your job is primarily writing software and you take an infrastructure-as-code approach, then tests are absolutely mandatory. My goal with this talk was to give a quick intro to testing and lay out a few options available to folks in this world. A lot of the details are in the slides, which I won't rehash here, but some things that can be useful:

  • moto – an excellent library for mocking out boto and many AWS services. Having Python scripts and tools is fairly common in ops, and the boto library is quite good. I often use moto for testing in addition to unittest and others.
  • bats, roundup, and shunit2 – bash scripts can also use some love and testing.

For config management tools, there are plenty of testing frameworks available as well. My teams tended to settle on Ansible so I am more intimately familiar with it, but a lot of these work across the board. The tricky part is deciding where to draw the line and what can be reasonably tested. A few general things that will typically be important to test are:

  • Complex conditionals – sometimes these are unavoidable. These often come paired with somewhat complex data structures that are being evaluated. You want to make sure you check your logic.
  • Logic in templating engines (Jinja2 in Ansible) – Something very complicated there is often a code smell, but there are exceptions to every rule.
  • Variable inheritance – In Ansible, you can define variables in 24 places and the precedence is sometimes unclear. When the same variable is being set in multiple places, you need to test that the precedence is correct.
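As a sketch of why precedence matters: in Ansible's documented precedence order, play-level vars override inventory group_vars, so the same variable set in both places resolves to the play value (the names and values below are hypothetical):

```yaml
# group_vars/webservers.yml -- hypothetical inventory-level default
http_port: 80

# playbook.yml -- the play-level var takes precedence over group_vars
- hosts: webservers
  vars:
    http_port: 8080
  tasks:
    - debug:
        msg: "Serving on {{ http_port }}"   # renders 8080, not 80
```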

More specifically, with Ansible, you can use a combination of:

As a reference example, this is a workflow that a few of my teams have used in the past. It’s an Ansible example, but other config management systems will be reasonably similar. A typical role folder/file structure will look like this:

role_name/
├── .bumpversion.cfg
├── molecule.yml
├── playbook.yml
├── defaults/
│ ├── main.yml
├── files/
│ ├── goss/
│ │ ├── test.yml
├── handlers/
│ ├── main.yml
├── tasks/
│ ├── main.yml
├── templates/
│ ├── main.yml
├── tests/
│ ├── test.yml
├── vars/
│ ├── main.yml

A majority of the directory structure is the same as you would see with ansible-galaxy init. The rest is:

  • .bumpversion.cfg – used for auto-versioning roles by the build system.
  • molecule.yml – has the Molecule configuration. This is where we'd specify the Docker container, test sequences, and the tool used for testing, among other things.
  • files/goss/test.yml – this will have a set of tests for a particular role. It will generally include some combination of these tests if using goss and/or additional custom testing code.
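For a sense of what such tests look like, here is a minimal goss test file for a hypothetical nginx role; the service name and paths are my assumptions for illustration, not from the original talk:

```yaml
# files/goss/test.yml -- goss checks for a hypothetical nginx role
service:
  nginx:
    enabled: true
    running: true
port:
  tcp:80:
    listening: true
file:
  /etc/nginx/nginx.conf:
    exists: true
```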

When an engineer commits changes to the role, the build system triggers a "build," executes the tests defined for it, and increments the version if the tests pass. This is especially helpful when dealing with complex roles that handle many scenarios. Someone might be making a small modification whose overall impact is unclear. Assuming the tests are well written and cover the main use cases, this will help the engineer catch errors early.

No organization is going to be perfect and have 100% test coverage that captures every corner case. But adopting some structure and process, and setting aspirational goals, will go a long way, especially for larger teams. What's not acceptable is ignoring testing altogether or saying that it can't be done.

Original Link

Gaining a Systemic View of Immutable Infrastructure Tooling

I put together a few starter notes on things you should delve into and understand before working with infrastructure-related tooling like Ansible, Terraform, or similar tools. If you think I've missed any, do ping me @Adron and let me know your thoughts on other additions. The first starter items I'd list are the following.

  • Computer Networking – A basic understanding of what an IP address is, how addresses are assigned via DHCP vs. statically, and related basic TCP/IP knowledge. It is also helpful, though not necessary, to understand the idea behind CIDR ranges and related network addressing concepts.
  • Basic Linux or Unix CLI Commands – Commands like ls, mv, rm, cp, mkdir, rmdir, wget, curl, ping, traceroute, ifconfig, nslookup, vi, kill, chown, whoami, su, chmod, chroot, cat, env, echo, and tail. Not all of these commands are used in the course, but they're fundamental for building and troubleshooting any infrastructure and related environments and systems in the future.
  • Basic Scripting and Language Constructs – Things like if-then-else, while loops, and related programming constructs for use in scripting languages like Bash or PowerShell.

With those basics covered, here are some of the things that will help shape a good understanding of what Terraform, Ansible, Chef, Puppet, and related tooling are.

  • Ansible – Ansible is a tool that helps provision, deploy, and apply compliance to infrastructure through configuration management. It was purchased by Red Hat and is now in their stable of tools.
  • Puppet – Puppet provides discovery, management of infrastructure, and mutable updates and changes to that infrastructure. It's also important to note that Puppet is made up of a number of individual tools.
  • Chef – Chef is one of the leading companies that helped start and push the DevOps narrative in the industry. Its tool provides a programmatic DSL, configuration, and related ways to create and manage one's infrastructure. Chef, similarly to Puppet, is also made up of a number of individual tools, and the company continues to develop tools within the space.
  • CloudFormation – AWS CloudFormation provides a way to describe and provision infrastructure resources in AWS. When used, it serves as the single source of truth for all cloud resources.
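For comparison, a minimal CloudFormation template (which is itself YAML) describing a single VPC might look like this; the CIDR block and names are arbitrary examples:

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Description: Minimal VPC example
Resources:
  DemoVPC:
    Type: AWS::EC2::VPC
    Properties:
      CidrBlock: 10.0.0.0/16   # arbitrary example range
      Tags:
        - Key: Name
          Value: demo-vpc
```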

I talk about and use Terraform a LOT, so let's compare these tools with Terraform for a moment. Terraform compares with configuration management tools like Chef and Puppet in its abstract use of provisioners to set up resources such as the network, instances, and other things within the cloud platform being deployed to. Terraform operates at a higher level of abstraction over where the platform's resources are deployed. Because of this, Terraform can be used alongside, or even in place of, Chef and Puppet. These tools can complement each other or work independently.

In comparison to CloudFormation, the differences start with Terraform's cross-platform and multi-cloud capabilities versus CloudFormation being limited to AWS. Terraform also draws out a more stage-based approach for each execution. This gives more control over where, when, and why Terraform can replace, or be used in conjunction with, CloudFormation or any of the tooling mentioned.

These Tools, Immutable Infrastructure?

With these tools, one key ideal among them is the pattern of immutable infrastructure. Immutable infrastructure, in short, is the idea that you create and operate your infrastructure using the programming concept of immutability: once you instantiate something, it is never changed. To make changes or ensure proper behavior, it is replaced with another instance.

The effect of immutable infrastructure is that it provides stability, efficiency, and fidelity to your applications through automation and the use of successful patterns from software development.

  • Increasing operational complexity can be dealt with. With the complexity of distributed systems like Cassandra, Kubernetes, and others, immutable infrastructure practices have become the de facto way to manage this extra complexity and volume of resources.
  • Snowflake components are removed. Mutable maintenance methods enable this idea of snowflake components like bespoke, handcrafted, and artisanal servers and infrastructure. Immutable infrastructure eliminates the possibility of these outliers and increases repeatability within systems and infrastructure.
  • Increased threat identification and mitigation of harm. With immutable infrastructure, the number of knowns, at the start and completion of infrastructure usage, increases as needed. Need debugging information? A single way to gather it can be provided, set by policy, and enforced with immutable infrastructure in a way that simply can't be with artisanal infrastructure.
  • Integration and seamless continuous integration and deployment. With immutable infrastructure, the enforcement and enablement of integration and deployment of solutions is largely self-fulfilling. Deploying through automation requires integration and deployment; immutable infrastructure requires automation for integration and deployment; each, in turn, enables the other.
  • Reproducibility is inherent in the system. With immutable infrastructure, an environment and all of its nuances can be recreated flawlessly, even among complex orchestrations. This gives us the ability to set audit points and to track and log data across the entire environment.

That’s my wrap up for getting into and gaining a more holistic idea of what immutable infrastructure is and what tools work to help attain this level of infrastructure development and usage. Until next time, feel free to throw ideas at me via @Adron!

Original Link

Provision AWS EC2 Virtual Machines With Ansible

This tutorial explains how to manually provision an AWS EC2 virtual machine using Ansible. Before you start, you should be familiar with the following concepts:

The best way to get started is to install Ansible and run playbooks manually on your local machine to provision a VM. The first section of this tutorial explains how to do that. However, manual execution isn't the most efficient way to run Ansible playbooks, so we will take a look at the challenges and learn how to automate this workflow.

Step-by-Step Instructions

Follow the steps below in order to provision your EC2 machine.

Step 1: Prep Your Machine

Step 2: Prepare Ansible Playbook

  • Ansible uses a folder structure that looks like this:
    • ansible.cfg holds configuration info.
    • inventory has the inventory of artifacts.
    • variables.yml has the variables that are used to replace wildcards in your playbooks to make them more reusable.
    • ec2_prov_playbook.yml is the playbook which has a list of tasks to provision an EC2 instance.
    • ec2_term_playbook.yml is the playbook which has a list of tasks to terminate an EC2 instance.
├── ansible.cfg
├── inventory
│ ├── base
│ ├── ec2.ini
│ ├──
│ ├── static_hosts
├── variables.yml
├── ec2_prov_playbook.yml
├── ec2_term_playbook.yml
  • If you do not have your own Ansible playbooks, please feel free to clone our sample playbook here.
  • In our scenario, the important files are:
  • It is important to note the following:
    • ec2_prov_playbook.yml and ec2_term_playbook.yml have some wildcards, which Ansible replaces with values from variables.yml.
    • Since we want to create a reusable playbook, we have not hardcoded values in variables.yml but left it up to the user to replace them when needed. This will be done in a later step, just before running the playbook.
  • In ansible.cfg, replace ${AWS_EC2_PEM_KEYPATH} with the path to the PEM key that should be used to provision the machine.
  • In variables.yml, replace these wildcards with your desired values: ${ec2_region} ${ec2_tag_Type} ${ec2_image} ${ec2_keypair} ${ec2_volume_size} ${ec2_count} ${security_group_id} ${public_subnet_id} ${ec2_tag_Type} ${ec2_tag_Role}.
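As a hedged example, a filled-in variables.yml could look like the following; every value below is a placeholder and must be replaced with IDs from your own AWS account:

```yaml
# variables.yml -- placeholder values; replace with your own
ec2_region: us-east-1
ec2_tag_Type: demo
ec2_tag_Role: webserver
ec2_image: ami-0abcdef1234567890     # placeholder AMI ID
ec2_keypair: demo-keypair
ec2_volume_size: 30
ec2_count: 1
security_group_id: sg-0123456789abcdef0   # placeholder
public_subnet_id: subnet-0123456789abcdef0  # placeholder
```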

Step 3: Run Your Playbook!

  • Execute the following command to run the Ansible playbook from the directory that contains the playbook.
ansible-playbook -v ec2_prov_playbook.yml
  • Verify on AWS that the EC2 machine was provisioned.
  • You can terminate the instance by running the command below:
ansible-playbook -v ec2_term_playbook.yml

Challenges With Manual Execution of Ansible Playbooks

There are a few challenges with manual execution of Ansible playbooks:

  • Ansible playbook templates can be reused since they have wildcards. However, you need a programmatic way to replace wildcards at runtime. Creating static variables files is an option, but reduces reusability.
  • Automating provisioning for different environments and creating a dependency tree of all applications deployed into an environment is tedious to achieve with manual steps. You need an automated workflow to effectively transfer information like subnet_id and security_group_id to downstream activities, e.g. EC2 provisioners.
  • Security with RBAC is a problem. The machine used to provision is authenticated to an AWS account (even in the case of service accounts). This means that the only way you can implement RBAC across multiple projects/teams is to use multiple accounts on the machine. This is messy and painful to maintain at scale.
  • The machine has to be prepped with the right version of the CLI. If multiple teams are deploying and they have a need to use different versions of the CLI, you will need different deployment machines for each team.

In a nutshell, if you want to achieve frictionless execution of Ansible playbooks with modular, reusable playbooks, you need to templatize your playbooks and automate the workflow used to execute them.

Automated Provisioning of AWS EC2 VMs Using Ansible 

To show you how to automate the provisioning of your AWS infrastructure, we have designed a step-by-step tutorial in our documentation:

Automate provisioning of AWS EC2 using Ansible

Original Link

Setting Up a Cassandra Cluster Through Ansible

In this post, we will use Ansible to set up an Apache Cassandra database cluster. We will use AWS EC2 instances as the nodes of the cluster. Creating a cluster manually is a tedious task: each node must be correctly configured before starting the cluster. With Ansible, we can automate the task and let Ansible handle the configuration management for us.

First of all, create a directory for storing the files and folders related to the playbook. It helps keep our work organized and saves us from the confusion that may arise from relative and absolute path references when passing variables in our playbook. The following is the structure of my directory containing the playbook and the roles:

Steps to Follow While Using AWS

  • Create two or three instances of AWS EC2 that will serve as the nodes in a cluster.
  • Create a security group to allow all connections and add the nodes to that security group.
  • Create an inventory that has the IP addresses of the nodes.
  • Add the inventory file to the Ansible configuration file, ansible.cfg.
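A static inventory for the steps above could be written in Ansible's YAML inventory format, matching the aws-webservers group used in the playbook (the IP addresses are documentation placeholders):

```yaml
# inventory/hosts.yml -- placeholder IPs; use your EC2 public IPs
all:
  children:
    aws-webservers:
      hosts:
        203.0.113.10:
        203.0.113.11:
        203.0.113.12:
```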

Now, we create a playbook to set up the nodes for us. The following is the playbook:

- hosts: aws-webservers
  gather_facts: yes
  remote_user: ec2-user
  become: yes
  vars:
    cluster_name: Test_Cluster
    seeds:
  roles:
    - installation

Then, we define the roles we have created. In the installation role, the following tasks are performed:

  • Installing a JRE.
  • Adding and unpacking the Apache Cassandra tar.
  • Replacing the default cassandra.yaml with a cassandra.yaml containing our own configuration, whose details are given below.
  • Ensuring Cassandra is started.

The following is the main.yml file from the role:

- name: Copy Java RPM file
  copy:
    src: jdk-8_linux-x64_bin.rpm
    dest: /tmp/jdk-8_linux-x64_bin.rpm

- name: Install JDK via RPM file with yum
  yum:
    name: /tmp/jdk-8_linux-x64_bin.rpm
    state: present

- name: Copy Cassandra tar
  copy:
    src: apache-cassandra-3.11.2-bin.tar.gz
    dest: /tmp/apache-cassandra-3.11.2-bin.tar.gz

- name: Extract Cassandra
  command: tar -xvf /tmp/apache-cassandra-3.11.2-bin.tar.gz

- name: Override cassandra.yaml file
  template: src=cassandra.yaml dest=apache-cassandra-3.11.2/conf/

- name: Run Cassandra from bin folder
  command: ./cassandra -fR
  args:
    chdir: /home/ec2-user/apache-cassandra-3.11.2/bin/

The cassandra.yaml contains most of the Cassandra configuration such as ports used, file locations, and seed node IP addresses. We need to edit this file on each node, so I have created a template for the file. The template cassandra.yaml uses the following variables:

  • cluster_name: '{{ cluster_name }}' can be anything chosen by you to describe the name of the cluster.
  • seeds: "{{ seeds }}" are the IP addresses of the cluster's seed servers. Seed nodes are used as known places where cluster information (such as the list of nodes in the cluster) can be obtained.
  • listen_address: {{ aws-webservers }} is the IP address that Cassandra listens on for internal (Cassandra-to-Cassandra) communication.
  • rpc_address: {{ aws-webservers }} is the IP address that Cassandra listens on for client-based communication.
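Putting those variables together, the templated portion of cassandra.yaml might look roughly like this sketch; the seed_provider block follows the stock Cassandra 3.x config layout, and using the ansible_default_ipv4 fact for the per-node addresses is my assumption rather than the author's exact template:

```yaml
# templates/cassandra.yaml (excerpt) -- sketch, not the author's exact file
cluster_name: '{{ cluster_name }}'
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      - seeds: "{{ seeds }}"
# per-node addresses, assumed to come from gathered facts
listen_address: "{{ ansible_default_ipv4.address }}"
rpc_address: "{{ ansible_default_ipv4.address }}"
```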

Now, we can run the playbook and our cluster will be up and running. We can add more nodes by simply adding them to the host list; Ansible will ensure that Cassandra is installed and that the nodes are connected to the cluster and started.

Points to Remember

  • The host IP should be the public IP of a node.

  • Put the Java rpm packages and Cassandra tar file in the files directory of the role created.

  • Use Java 8, as Cassandra is not supported on higher versions of Java. It will throw the following error:

[0.000s][warning][gc] -Xloggc is deprecated. Will use -Xlog:gc:/home/mmatak/monero/apache-cassandra-3.11.1/logs/gc.log instead.
intx ThreadPriorityPolicy=42 is outside the allowed range [ 0 ... 1 ]
Improperly specified VM option 'ThreadPriorityPolicy=42'
Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.

Thus, Ansible makes it very easy to install distributed systems like Cassandra; the thought of doing it manually is very disheartening. The full source code, including templates and the directory structure, is here.

This article was first published on the Knoldus blog.

Original Link

Ansible, Part IV: Putting It All Together

Roles are the most complicated and yet simplest aspect of Ansible to learn.

I’ve mentioned before that Ansible’s ad-hoc mode often is overlooked as just a way to learn how to use Ansible. I couldn’t disagree with that mentality any more fervently than I already do. Ad-hoc mode is actually what I tend to use most often on a day-to-day basis. That said, using playbooks and roles is a very powerful way to utilize Ansible’s abilities. In fact, when most people think of Ansible, they tend to think of the roles feature, because it’s the way most Ansible code is shared. So first, it’s important to understand the relationship between ad-hoc mode, playbooks and roles.

Ad-hoc Mode

This is a bit of a review, but it’s easy to forget once you start creating playbooks. Ad-hoc mode is simply a one-liner that uses an Ansible module to accomplish a given task on a set of computers. Something like:

ansible cadlab -b -m yum -a "name=vim state=latest"

will install vim on every computer in the cadlab group. The -b signals to elevate privilege (“become” root), the -m means to use the yum module, and the -a says what actions to take. In this case, it’s installing the latest version of vim.

Usually when I use ad-hoc mode to install packages, I’ll follow up with something like this:

ansible cadlab -b -m service -a "name=httpd state=started enabled=yes"

That one-liner will make sure that the httpd service is running and set to start on boot automatically (the latter is what “enabled” means). Like I said at the beginning, I most often use Ansible’s ad-hoc mode on a day-to-day basis. When a new rollout or upgrade needs to happen though, that’s when it makes sense to create a playbook, which is a text file that contains a bunch of Ansible commands.

Playbook Mode

I described playbooks in my last article. They are YAML-formatted ("YAML Ain't Markup Language") text files that contain a list of things for Ansible to accomplish. For example, to install Apache on a lab full of computers, you'd create a file something like this:

---
- hosts: cadlab
  tasks:
    - name: install apache2 on CentOS
      yum: name=httpd state=latest
      notify: start httpd
      ignore_errors: yes
    - name: install apache2 on Ubuntu
      apt: update_cache=yes name=apache2 state=latest
      notify: start apache2
      ignore_errors: yes
  handlers:
    - name: start httpd
      service: name=httpd enabled=yes state=started
    - name: start apache2
      service: name=apache2 enabled=yes state=started

Mind you, this isn’t the most elegant playbook. It contains a single play that tries to install httpd with yum and apache2 with apt. If the lab is a mix of CentOS and Ubuntu machines, one or the other of the installation methods will fail. That’s why the ignore_errors command is in each task. Otherwise, Ansible would quit when it encountered an error. Again, this method works, but it’s not pretty. It would be much better to create conditional statements that would allow for a graceful exit on incompatible platforms. In fact, playbooks that are more complex and do more things tend to evolve into a “role” in Ansible.
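A sketch of that conditional approach, keying off Ansible's ansible_os_family fact instead of ignore_errors, might look like this:

```yaml
---
- hosts: cadlab
  tasks:
    # only runs on RHEL-family hosts (CentOS, Fedora, etc.)
    - name: install apache on CentOS
      yum: name=httpd state=latest
      when: ansible_os_family == "RedHat"
    # only runs on Debian-family hosts (Ubuntu, Debian)
    - name: install apache on Ubuntu
      apt: update_cache=yes name=apache2 state=latest
      when: ansible_os_family == "Debian"
```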


Roles

Roles aren’t really a mode of operation. Actually, roles are an integral part of playbooks. Just like a playbook can have tasks, variables and handlers, it can also have roles. Quite simply, roles are just a way to organize the various components referenced in playbooks. It starts with a folder layout:

roles/
├── webserver/
│ ├── tasks/
│ │ ├── main.yml
│ ├── handlers/
│ │ ├── main.yml
│ ├── vars/
│ │ ├── main.yml
│ ├── templates/
│ │ ├── index.html.j2
│ │ ├── httpd.conf.j2
│ ├── files/
│ │ ├── ntp.conf

Ansible looks for a roles folder in the current directory, but also in a system-wide location like /etc/ansible/roles, so you can store your roles to keep them organized and out of your home folder. The advantage of using roles is that your playbooks can look as simple as this:

---
- hosts: cadlab
  roles:
    - webserver

And then the “webserver” role will be applied to the group “cadlab” without needing to type any more information inside your playbook. When a role is specified, Ansible looks for a folder matching the name “webserver” inside your roles folder (in the current directory or the system-wide directory). It then will execute the tasks inside webserver/tasks/main.yml. Any handlers mentioned in that playbook will be searched for automatically in webserver/handlers/main.yml. Also, any time files are referenced by a template module or file/copy module, the path doesn’t need to be specified. Ansible automatically will look inside webserver/files/ or webserver/templates/ for the files.
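For instance, a task in webserver/tasks/main.yml can reference a template by bare filename; the file names below are illustrative:

```yaml
# roles/webserver/tasks/main.yml (excerpt) -- illustrative file names
- name: deploy httpd config
  template:
    src: httpd.conf.j2            # found automatically in webserver/templates/
    dest: /etc/httpd/conf/httpd.conf
  notify: restart httpd            # handler resolved from webserver/handlers/main.yml
```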

Basically, using roles will save you lots of path declarations and include statements. That might seem like a simple thing, but the organization creates a standard that not only makes it easy to figure out what a role does, but also makes it easy to share your code with others. If you always know any files must be stored in roles/rolename/files/, it means you can share a “role” with others and they’ll know exactly what to do with it—namely, just plop it in their own roles folder and start using it.

Sharing Roles: Ansible Galaxy

One of the best aspects of current DevOps tools like Chef, Puppet and Ansible is that there is a community of people willing to share their hard work. On a small scale, roles are a great way to share with your coworkers, especially if you have roles that are customized specifically for your environment. Since many environments are similar, roles can be shared with an even wider audience—and that’s where Ansible Galaxy comes into play.

I’ll be honest, part of the draw for me with Ansible is the sci-fi theme in the naming convention. I know I’m a bit silly in that regard, but just naming something Ansible or Ansible Galaxy gets my attention. This might be one of those “built by nerds, for nerds” sort of things. I’m completely okay with that. If you head over to the Galaxy site, you’ll find the online repository for shared roles—and there are a ton.

For simply downloading and using other people’s roles, you don’t need any sort of account on Ansible Galaxy. You can search on the website by going to Galaxy and clicking “Browse Roles” on the left side of the page (Figure 1). There are more than 13,000 roles currently uploaded to Ansible Galaxy, so I highly recommend taking advantage of the search feature! In Figure 2, you’ll see I’ve searched for “apache” and sorted by “downloads” in order to find the most popular roles.

Figure 1. Click that link to browse and search for roles.

Figure 2. Jeff Geerling’s roles are always worth checking out.

Many of the standard roles you’ll find that are very popular are written by Jeff Geerling, whose user name is geerlingguy. He’s an Ansible developer who has written at least one Ansible book that I’ve read and possibly others. He shares his roles, and I encourage you to check them out—not only for using them, but also for seeing how he codes around issues like conditionally choosing the correct module for a given distribution and things like that. You can click on the role name and see all the code involved. You might notice that if you want to examine the code, you need to click on the GitHub link. That’s one of the genius moves of Ansible Galaxy—all roles are stored on a user’s GitHub page as opposed to an Ansible Galaxy server. Since most developers keep their code on GitHub, they don’t need to remember to upload to Ansible Galaxy as well.

Incidentally, if you ever desire to share your own Ansible roles, you’ll need to use a GitHub user name to upload them, because again, roles are all stored on GitHub. But that’s getting ahead of things; first you need to learn how to use roles in your environment.

Using ansible-galaxy to Install Roles

It’s certainly possible to download an entire repository and then unzip the contents into your roles folder. Since they’re just text files and structured folders, there’s not really anything wrong with doing it that way. It’s just far less convenient than using the tools built in to Ansible.

There is a search mechanism on the Ansible command line for searching the Ansible Galaxy site, but in order to find a role I want to use, I generally go to the website and find it, then use the command-line tools to download and install it. Here’s an example of Jeff Geerling’s “apache” role. In order to use Ansible to download a role, you need to do this:

sudo ansible-galaxy install geerlingguy.apache

Notice two things. First, you need to execute this command with root privilege. That’s because the ansible-galaxy command will install roles in your system-wide roles folder, which isn’t writable (by default) by your regular user account. Second, take note of the format of roles named on Ansible Galaxy. The format is username.rolename, so in this case, geerlingguy.apache, which is also how you reference the role inside your playbooks.
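If you install several roles, ansible-galaxy can also read them from a requirements file and install everything in one pass; a sketch (the version pin is hypothetical):

```yaml
# requirements.yml -- install with: ansible-galaxy install -r requirements.yml
- src: geerlingguy.apache
- src: geerlingguy.mysql
  version: "3.3.0"   # hypothetical version pin
```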

If you want to see roles listed with the correct format, you can use ansible-galaxy's search command, but like I said, I find it less than useful because it doesn't sort by popularity. In fact, I can't figure out what it sorts by at all. The only time I use the command-line search feature is if I also use grep to narrow down roles by a single person. Anyway, Figure 3 shows what the results of ansible-galaxy search look like. Notice the username.rolename format.

Figure 3. I love the command line, but these search results are frustrating.

Once you install a role, it is immediately available for you to use in your own playbooks, because it’s installed in the system-wide roles folder. In my case, that’s /etc/ansible/roles (Figure 4). So now, if I create a playbook like this:

- hosts: cadlab
  roles:
    - geerlingguy.apache

Apache will be installed on all my cadlab computers, regardless of what distribution they’re using. If you want to see how the role (which is just a bunch of tasks, handlers and so forth) works, just pick through the folder structure inside /etc/ansible/roles/geerlingguy.apache/. It’s all right there for you to use or modify.

Figure 4. Easy Peasy, Lemon Squeezy

Creating Your Own Roles

There’s really no magic here, since you easily can create a roles folder and then create your own roles manually inside it, but ansible-galaxy does give you a shortcut by creating a skeleton role for you. Make sure you have a roles folder, then just type:

ansible-galaxy init roles/rolename

and you’ll end up with a nicely created folder structure for your new role. It doesn’t do anything magical, but as someone who has misspelled “Templates” before, I can tell you it will save you a lot of frustration if you have clumsy fingers like me.

Sharing Your Roles

If you get to the point where you want to share your roles on Ansible Galaxy, it’s fairly easy to do. Make sure you have your role on GitHub (using git is beyond the scope of this article, but using git and GitHub is a great way to keep track of your code anyway). Once you have your roles on GitHub, you can use ansible-galaxy to “import” them into the publicly searchable Ansible Galaxy site. You first need to authenticate:

ansible-galaxy login

Before you try to log in with the command-line tool, be sure you’ve visited the Ansible Galaxy website and logged in with your GitHub account. You can see in Figure 5 that at first I was unable to log in. Then I logged in on the website, and after that, I was able to log in with the command-line tool successfully.

Figure 5. It drove me nuts trying to figure out why I couldn’t authenticate.

Once you’re logged in, you can add your role by typing:

ansible-galaxy import githubusername githubreponame

The process takes a while, so you can add the --no-wait option if you want, and the role will be imported in the background. I really don’t recommend doing this until you have created roles worth sharing. Keep in mind, there are more than 13,000 roles on Ansible Galaxy, so there are many “re-inventions of the wheel” happening.

From Here?

Well, it’s taken me four articles, but I think if you’ve been following along, you should be at the point where you can take it from here. Playbooks and roles are usually where people focus their attention in Ansible, but I also encourage you to take advantage of ad-hoc mode for day-to-day maintenance tasks. Ansible in some ways is just another DevOps configuration management tool, but for me, it feels the most like the traditional problem-solving I accomplished with Bash scripts for decades. Perhaps I just like Ansible because it thinks the same way I do. Regardless of your motivation, I encourage you to learn Ansible well enough that you can determine whether it fits into your workflow as well as it fits into mine.


Here are links to the first three articles in this series by Shawn Powers:


Running Ansible at Scale

I’ve used plenty of automation solutions over the years. Chef, Puppet, Fabric, SaltStack, Capistrano, custom scripts, etc. All of them work well to varying degrees, but only one tool has stood the test of time for me. That tool is Ansible.

In my humble opinion, no other tool combines the same level of functionality, ease of use, maintainability, portability, extensibility, and security as Ansible. I’ve used it for everything from simple tasks, like checking the time on a fleet of AWS EC2 instances, to complex orchestration operations, like a zero-downtime blue/green deployment.

In this article I’ll give a brief overview of Ansible, and then quickly jump into some examples of how I’ve used it in the past for automation activities in a cloud environment.

The Basics

Ansible uses an agentless approach, making it a perfect fit for dynamic cloud environments. Since it uses SSH to communicate with remote hosts, there’s no additional infrastructure required.

Ansible has 3 primary concepts:

  • Host Inventory: A set of named groupings of hosts. It can be a static map, or it can be dynamic when dealing with constantly changing cloud infrastructure.
  • Playbooks: Indicates to Ansible which set of hosts should have what tasks performed on them. For example, there may be a web server farm which should all have Nginx installed.
  • Roles: A grouping of tasks run on a single host. For example, there may be a role for installing, configuring, and starting Nginx. A role has no concept of which host it will apply to.

We’ll look at each of these in more detail in the following sections.

Host Inventory

Static Host Inventory

The simplest approach to defining host groupings is with a static inventory file.


[webservers]
web1.example.com
web2.example.com
web3.example.com

[dbservers]
db1.example.com
db2.example.com

This file defines 3 “webserver” hosts, and 2 “dbserver” hosts. While this may be fine when dealing with a traditional data center, it becomes nearly impossible to manage in a dynamic cloud environment. What we want is for the list of hosts to be dynamically constructed based on metadata. This is what the dynamic inventory feature of Ansible provides.

Dynamic Host Inventory

Dynamic inventory is what makes Ansible such a great fit in a cloud environment. As servers come and go, Ansible can dynamically build a list of hosts.

The exact mechanism of how this works depends on the cloud provider. In AWS, a script is used to make calls to the EC2 metadata service and group hosts by whatever metadata you choose. For example, you may have a web server farm consisting of a number of identically configured servers running Nginx. You could add EC2 tags for each instance using a key of “Service”, and a value of “Webserver”. Ansible’s dynamic inventory can then be used to discover any of these EC2 instances using the host name tag_Service_Webserver.
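As a sketch (the tag key/value and the role name here are illustrative, not from a real configuration), a playbook can then target those instances using the tag-derived group name:

```yaml
# Targets every EC2 instance tagged Service=Webserver; the group name
# tag_Service_Webserver is generated by the EC2 dynamic inventory script.
- hosts: tag_Service_Webserver
  become: yes
  roles:
    - nginx
```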

Similar dynamic inventory scripts exist for Azure and Google Cloud, and scripts for all supported providers can be found in the Ansible source repository.

To use these scripts, place them in a subdirectory of the inventory folder. For example, this is what my folder structure looks like for AWS.

inventories/aws/
inventories/aws/ec2.ini

To use the inventory, simply pass it to the ansible-playbook command.

ansible-playbook -i inventories/aws playbooks/myplaybook.yml


Playbooks

Now that we know how to define named groups of hosts, we can create a playbook. A playbook is a YAML file which describes the tasks and roles which should be applied to a given set of hosts.

In the example below, we configure any EC2 instance with the tag key-value pair of “service:zeppelin” to run Apache Zeppelin (a fantastic data analytics workbench).


- hosts: tag_service_zeppelin
  become: true
  roles:
    - java8
    - zeppelin

For all matching hosts, this playbook will first apply the java8 role, and then the zeppelin role. It is the responsibility of the role to define what should actually happen.


Roles

Defining roles is where most of the work takes place when setting up an Ansible-based automation solution. The role is where you define what packages to install, any users to create, systemd unit templates, configuration file templates, how to start/stop the service, etc.
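As a rough sketch (the package, file names, and paths here are illustrative, not taken from a real role), the tasks file of such a role might look like:

```yaml
# roles/zeppelin/tasks/main.yml (hypothetical example)
- name: create service user
  user: name=zeppelin state=present

- name: install the package
  apt: name=zeppelin state=present update_cache=yes

- name: install configuration from a template
  template: src=zeppelin-env.sh.j2 dest=/etc/zeppelin/zeppelin-env.sh

- name: ensure the service is enabled and running
  service: name=zeppelin state=started enabled=yes
```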

Fortunately a large and active community can be found at Ansible Galaxy. There you can find roles already built for most common applications.


Variables

We want to reuse our playbooks and roles as much as possible, so we’ll extract any environment-specific values into variables.

With a dynamic inventory, you can easily group variables by host group using the following layout:
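For example (the group and file names here are illustrative), per-group variables can live under group_vars directories named after the inventory groups:

```
inventories/
  aws/
    group_vars/
      tag_service_zeppelin/
        vars.yml     # plain-text variables
        vault.yml    # encrypted variables (Ansible Vault)
```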


vars.yml contains property key-values that don’t need to be encrypted at rest.

---
http_port: 8080

vault.yml uses Ansible Vault to store properties requiring encryption, such as database passwords and private keys. Start by creating a plain-text properties file called vault.yml as follows:

---
db_password: some_complex_password

To encrypt the file:
ansible-vault encrypt vault.yml

To use the encrypted values during a playbook run, you need to supply the vault password. One way is to prompt for it with the --ask-vault-pass flag:
ansible-playbook --ask-vault-pass -i inventories/aws playbooks/myplaybook.yml
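Once decrypted at runtime, vaulted values are referenced like any other variable. A minimal sketch (the template name, destination path, and application are illustrative assumptions):

```yaml
- name: write database credentials
  template:
    src: db.conf.j2           # hypothetical template that references {{ db_password }}
    dest: /etc/myapp/db.conf  # hypothetical destination
    mode: '0600'              # restrict permissions on the rendered secret
```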

Directory Layout

Ansible has a recommended directory layout, but I’ve found that having all the playbooks at the root level adds clutter.

This is the structure that has worked well for me:

inventories/
  aws/
    group_vars/          # variables for groups
playbooks/
  setup-kafka.yml        # playbook to setup a Kafka cluster
  deploy-myapp.yml       # playbook to deploy 'myapp'
roles/
  common/
  kafka/
  java8/
  myapp/

Putting It All Together

The title of this article is “Running Ansible at Scale”. But we haven’t yet addressed how all of this should work when dealing with multiple teams, prod/uat/dev environments, and how to meet the normal enterprise requirements of least privilege and separation of duties.

In order for Ansible to work we need a control server somewhere. I’ve found it works best to have separate control servers as dictated by security requirements. For example, you might have a locked down server for production automation, and a separate one for uat automation. This allows you to limit what each Ansible master can do. Network isolation, security groups, and separate SSH keys all contribute to keeping things locked down.

You also should limit who is allowed to run Ansible. My preferred approach here is to restrict ssh access to the Ansible control hosts by using named-user accounts along with MFA. See here for details on how to do this.

And finally, you need to ensure you have a full audit trail. All of your Ansible code should be stored in a version control system. Each Ansible playbook run should write its log output to a centralized logging system. While I prefer using syslog along with a log shipping system such as logstash, there are plenty of other logging options detailed here.

An excellent option for integrating Ansible into an enterprise is to use the commercial Ansible Tower product, or the open-source upstream AWX project.


Time Checks

Ansible can be used to run ad-hoc commands across a set of hosts. Basically we forgo the use of a playbook, and instead execute a single command.

The example below shows how to run an arbitrary command across a set of servers. In this case, we want to check the time and date of all EC2 instances with a tag key of Role and value of PROD_apigateway, PROD_serviceA, or PROD_serviceB. This particular command is useful to check for any servers with excessive clock drift due to ntp issues.

ansible "tag_Role_PROD_apigateway,tag_Role_PROD_serviceA,tag_Role_PROD_serviceB" -i inventories/aws -a "date"

Rolling AWS Deployment

The real value of playbooks can be seen when a complex orchestration of operations needs to be performed across a fleet of servers.

Let’s assume we have a fairly basic web application architecture: a fronting web server farm, an application server cluster, and a backend MySQL database. We also assume that our applications can tolerate different versions running simultaneously across tiers.

At a high level, our deployment pipeline requires the following tasks to be orchestrated by Ansible (all running from the Ansible control host in our AWS devzone):

  1. Record start of deployment process in release tracking tool
  2. Perform database schema upgrade
  3. For each tier (webserver, appserver):
    1. Disable monitoring
    2. Remove server from ELB pool
    3. Shut down application
    4. Update application
    5. Start application
    6. Enable monitoring
    7. Add server to ELB pool
    8. Wait for service to pass health checks
  4. Record deployment complete in release tracking tool

Here is a sample playbook showing the process:

#################
# Send a slack notification that the deployment is starting
#################
- hosts: tag_Role_PROD_webserver
  tasks:
    - name: Send slack notification
      slack:
        token: "{{ slack_token }}"
        msg: "Starting production deployment..."
        color: warning
        icon_url: ''
      run_once: true
      delegate_to: localhost

#################
# Run database schema update
#################
- hosts: tag_Role_PROD_db
  roles:
    - database.liquibase

#################
# Rolling deployment: web server farm
#################
# Roll out updates to the webserver farm 2 nodes at a time
- hosts: tag_Role_PROD_webserver
  become: yes
  serial: 2
  # These are the tasks to run before applying updates:
  pre_tasks:
    - name: Gather EC2 facts
      action: ec2_facts
    - name: disable the server in the loadbalancer
      local_action: ec2_elb
      args:
        instance_id: "{{ ansible_ec2_instance_id }}"
        region: "us-east-1"
        ec2_elbs: "{{ aws_elb_webserver }}"
        state: 'absent'
    - name: Disable service monitor
      service: name='zabbix-agent' state=stopped
  # Execute the deployment
  roles:
    - application.webserver.deploy
  # These tasks run after the roles:
  post_tasks:
    - name: Enable service monitor
      service: name='zabbix-agent' state=started
    - name: Add instance to ELB... will wait up to 5 minutes for health checks to pass
      local_action: ec2_elb
      args:
        instance_id: "{{ ansible_ec2_instance_id }}"
        region: "us-east-1"
        ec2_elbs: "{{ aws_elb_webserver }}"
        wait_timeout: 300
        state: 'present'

#################
# Rolling deployment: application server cluster
#################
# Roll out updates to the app server cluster 2 nodes at a time
- hosts: tag_Role_PROD_appserver
  become: yes
  serial: 2
  # These are the tasks to run before applying updates:
  pre_tasks:
    - name: Gather EC2 facts
      action: ec2_facts
    - name: disable the server in the loadbalancer
      local_action: ec2_elb
      args:
        instance_id: "{{ ansible_ec2_instance_id }}"
        region: "us-east-1"
        ec2_elbs: "{{ aws_elb_appserver }}"
        state: 'absent'
    - name: Disable service monitor
      service: name='zabbix-agent' state=stopped
  # Execute the deployment
  roles:
    - application.appserver.deploy
  # These tasks run after the roles:
  post_tasks:
    - name: Enable service monitor
      service: name='zabbix-agent' state=started
    - name: Add instance to ELB... will wait up to 5 minutes for health checks to pass
      local_action: ec2_elb
      args:
        instance_id: "{{ ansible_ec2_instance_id }}"
        region: "us-east-1"
        ec2_elbs: "{{ aws_elb_appserver }}"
        wait_timeout: 300
        state: 'present'

#################
# Send a slack notification that the deployment is complete
#################
- hosts: tag_Role_PROD_webserver
  tasks:
    - name: Send slack notification
      slack:
        token: "{{ slack_token }}"
        msg: "Production deployment complete."
        color: good
        icon_url: ''
      run_once: true
      delegate_to: localhost


In this post I’ve briefly outlined some of the concepts and approaches to using Ansible for configuration management and orchestration. With a mature product, active community, and a focus on simplicity, Ansible is a great tooling choice to manage your cloud infrastructure and applications.

If you’d like to see working examples of some of these concepts, feel free to visit my GitHub repo!


Ansible, Part III: Playbooks

Playbooks make Ansible even more powerful than before.

To be quite honest, if Ansible had nothing but its ad-hoc mode, it still would be a powerful and useful tool for automating large numbers of computers. In fact, if it weren’t for a few features, I might consider sticking with ad-hoc mode and adding a bunch of those ad-hoc commands to a Bash script and be done with learning. Those few additional features, however, make the continued effort well worth it.

Tame the Beast with YAML

Ansible goes out of its way to use an easy-to-read configuration file for making “playbooks”, which are files full of separate Ansible “tasks”. A task is basically an ad-hoc command written out in a configuration file that makes it more organized and easy to expand. The configuration files use YAML, which stands for “YAML Ain’t Markup Language”. It’s an easy-to-read markup language, but it does rely on whitespace, which isn’t terribly common with most config files. A simple playbook looks something like this:

---
- hosts: webservers
  become: yes
  tasks:
    - name: this installs a package
      apt: name=apache2 update_cache=yes state=latest
    - name: this restarts the apache service
      service: name=apache2 enabled=yes state=restarted

The contents should be fairly easy to identify. It’s basically two ad-hoc commands broken up into a YAML configuration file. There are a few important things to notice. First, every filename ends with .yaml (or .yml), and every YAML file should begin with three hyphen characters. Also, as mentioned above, whitespace matters. Finally, it’s often confusing when a hyphen should precede a section and when the section should just be indented appropriately. Basically, every new section needs to start with a - character, but it’s often hard to tell what should be its own section. Nevertheless, it starts to feel natural as you create more and more playbooks.

The above playbook would be executed by typing:

ansible-playbook filename.yaml

And that is the equivalent of these two commands:

ansible webservers -b -m apt -a "name=apache2 update_cache=yes state=latest"
ansible webservers -b -m service -a "name=apache2 enabled=yes state=restarted"

Handling Your Handlers

But a bit of organization is really only the beginning of why playbooks are so powerful. First off, there’s the idea of “Handlers”, which are tasks that are executed only when “notified” that a task has made a change. How does that work exactly? Let’s rewrite the above YAML file to make the second task a handler:

---
- hosts: webservers
  become: yes
  tasks:
    - name: this installs a package
      apt: name=apache2 update_cache=yes state=latest
      notify: enable apache
  handlers:
    - name: enable apache
      service: name=apache2 enabled=yes state=started

On the surface, this looks very similar to just executing multiple tasks. When the first task (installing Apache) executes, if a change is made, it notifies the “enable apache” handler, which makes sure Apache is enabled on boot and currently running. The significance is that if Apache is already installed, and no changes are made, the handler never is called. That makes the code much more efficient, but it also means no unnecessary interruption of the already running Apache process.

There are other subtle time-saving issues with handlers too—for example, multiple tasks can call a handler, but it executes only a single time regardless of how many times it’s called. But the really significant thing to remember is that handlers are executed (notified) only when an Ansible task makes a change on the remote system.
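For instance (the package and template names here are illustrative), both tasks below notify the same handler, yet it runs at most once, at the end of the play:

```yaml
---
- hosts: webservers
  become: yes
  tasks:
    - name: install apache
      apt: name=apache2 state=latest
      notify: restart apache
    - name: install custom config
      template: src=apache2.conf.j2 dest=/etc/apache2/apache2.conf
      notify: restart apache
  handlers:
    # Runs once even if both tasks above report a change
    - name: restart apache
      service: name=apache2 state=restarted
```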

Just the Facts, Ma’am

Variable substitution works quite simply inside a playbook. Here’s a simple example:

---
- hosts: webservers
  become: yes
  vars:
    package_name: apache2
  tasks:
    - name: this installs a package
      apt: "name={{ package_name }} update_cache=yes state=latest"
      notify: enable apache
  handlers:
    - name: enable apache
      service: "name={{ package_name }} enabled=yes state=started"

It should be fairly easy to understand what’s happening above. Note that I did put the entire module action section in quotes. It’s not always required, but sometimes Ansible is funny about unquoted variable substitutions, so I always try to put things in quotes when variables are involved.

The really interesting thing about variables, however, is the “Gathered Facts” about every host. You might notice when executing a playbook that the first thing Ansible does is “Gathering Facts…”, which completes without error, but doesn’t actually seem to do anything. What’s really happening is that system information is getting populated into variables that can be used inside a playbook. To see the entire list of “Gathered Facts”, you can execute an ad-hoc command:

ansible webservers -m setup

You’ll get a huge list of facts generated from the individual hosts. Some of them are particularly useful. For example, ansible_os_family will return something like “RedHat” or “Debian” depending on which distribution you’re using. Ubuntu and Debian systems both return “Debian”, while Red Hat and CentOS will return “RedHat”. Although that’s certainly interesting information, it’s really useful when different distros use different tools—for example, apt vs. yum.

Getting Verbose

One of the frustrations of moving from Ansible ad-hoc commands to playbooks is that in playbook mode, Ansible tends to keep fairly quiet with regard to output. With ad-hoc mode, you often can see what is going on, but with a playbook, you know only if it finished okay, and if a change was made. There are two easy ways to change that. The first is just to add the -v flag when executing ansible-playbook. That adds verbosity and provides lots of feedback when things are executed. Unfortunately, it often gives so much information, that usefulness gets lost in the mix. Still, in a pinch, just adding the -v flag helps.

If you’re creating a playbook and want to be notified of things along the way, the debug module is really your friend. In ad-hoc mode, the debug module doesn’t make much sense to use, but in a playbook, it can act as a “reporting” tool about what is going on. For example:

---
- hosts: webservers
  tasks:
    - name: describe hosts
      debug: msg="Computer {{ ansible_hostname }} is running {{ ansible_os_family }} or equivalent"

The above will show you something like Figure 1, which is incredibly useful when you’re trying to figure out the sort of systems you’re using. The nice thing about the debug module is that it can display anything you want, so if a value changes, you can have it displayed on the screen so you can troubleshoot a playbook that isn’t working like you expect it to work. It is important to note that the debug module doesn’t do anything other than display information on the screen for you. It’s not a logging system; rather, it’s just a way to have information (customized information, unlike the verbose flag) displayed during execution. Still, it can be invaluable as your playbooks become more complex.

Figure 1. Debug mode is the best way to get some information on what’s happening inside your playbooks.

If This Then That

Conditionals are a part of pretty much every programming language. Ansible YAML files also can take advantage of conditional execution, but the format is a little wacky. Normally the condition comes first, and then if it evaluates as true, the following code executes. With Ansible, it’s a little backward. The task is completely spelled out, then a when statement is added at the end. It makes the code very readable, but as someone who’s been using if/then mentality his entire career, it feels funny. Here’s a slightly more complicated playbook. See if you can parse out what would happen in an environment with both Debian/Ubuntu and Red Hat/CentOS systems:

---
- hosts: webservers
  become: yes
  tasks:
    - name: install apache this way
      apt: name=apache2 update_cache=yes state=latest
      notify: start apache2
      when: ansible_os_family == "Debian"
    - name: install apache that way
      yum: name=httpd state=latest
      notify: start httpd
      when: ansible_os_family == "RedHat"
  handlers:
    - name: start apache2
      service: name=apache2 enabled=yes state=started
    - name: start httpd
      service: name=httpd enabled=yes state=started

Hopefully the YAML format makes that fairly easy to read. Basically, it’s a playbook that will install Apache on hosts using either yum or apt based on which type of distro they have installed. Then handlers make sure the newly installed packages are enabled and running.

It’s easy to see how useful a combination of gathered facts and conditional statements can be. Thankfully, Ansible doesn’t stop there. As with other configuration management systems, it includes most features of programming and scripting languages. For example, there are loops.

Play It Again, Sam

If there is one thing Ansible does well, it’s loops. Quite frankly, it supports so many different sorts of loops, I can’t cover them all here. The best way to figure out the perfect sort of loop for your situation is to read the Ansible documentation directly.

For simple lists, playbooks use a familiar, easy-to-read method for doing multiple tasks. For example:

---
- hosts: webservers
  become: yes
  tasks:
    - name: install a bunch of stuff
      apt: "name={{ item }} state=latest update_cache=yes"
      with_items:
        - apache2
        - vim
        - chromium-browser

This simple playbook will install multiple packages using the apt module. Note the special variable named item, which is replaced with the items one at a time in the with_items section. Again, this is pretty easy to understand and utilize in your own playbooks. Other loops work in similar ways, but they’re formatted differently. Just check out the documentation for the wide variety of ways Ansible can repeat similar tasks.
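As an aside (assuming Ansible 2.5 or newer, where the keyword was introduced), the same list can also be written with loop, which behaves identically here:

```yaml
- name: install a bunch of stuff
  apt:
    name: "{{ item }}"
    state: latest
    update_cache: yes
  loop:               # equivalent to with_items for a simple list
    - apache2
    - vim
    - chromium-browser
```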


Templates

One last module I find myself using often is the template module. If you’ve ever used mail merge in a word processor, templating works similarly. Basically, you create a text file and then use variable substitution to create a custom version on the fly. I most often do this for creating HTML files or config files. Ansible uses the Jinja2 templating language, which is conveniently similar to standard variable substitution in playbooks themselves. The example I almost always use is a custom HTML file that can be installed on a remote batch of web servers. Let’s look at a fairly complex playbook and an accompanying HTML template file.

Here’s the playbook:

---
- hosts: webservers
  become: yes
  tasks:
    - name: install apache2
      apt: name=apache2 state=latest update_cache=yes
      when: ansible_os_family == "Debian"
    - name: install httpd
      yum: name=httpd state=latest
      when: ansible_os_family == "RedHat"
    - name: start apache2
      service: name=apache2 state=started enabled=yes
      when: ansible_os_family == "Debian"
    - name: start httpd
      service: name=httpd state=started enabled=yes
      when: ansible_os_family == "RedHat"
    - name: install index
      template:
        src: index.html.j2
        dest: /var/www/html/index.html

Here’s the template file, which must end in .j2 (it’s the file referenced in the last task above):

<h1>This computer is running {{ ansible_os_family }},
and its hostname is:</h1>
<h3>{{ ansible_hostname }}</h3>
{# this is a comment, which won't be copied to the index.html file #}

This also should be fairly easy to understand. The playbook takes a few different things it learned and installs Apache on the remote systems, regardless of whether they are Red Hat- or Debian-based. Then, it starts the web servers and makes sure the web server starts on system boot. Finally, the playbook takes the template file, index.html.j2, and substitutes the variables while copying the file to the remote system. Note the {# #} format for making comments. Those comments are completely erased on the remote system and are visible only in the .j2 file on the Ansible machine.

The Sky Is the Limit!

I’ll finish up this series in my next article, where I plan to cover how to build on your playbook knowledge to create entire roles and take advantage of the community contributions available. Ansible is a very powerful tool that is surprisingly simple to understand and use. If you’ve been experimenting with ad-hoc commands, I encourage you to create playbooks that will allow you to do multiple tasks on a multitude of computers with minimal effort. At the very least, play around with the “Facts” gathered by the ansible-playbook app, because those are things unavailable to the ad-hoc mode of Ansible. Until next time, learn, experiment, play and have fun!


Part I: “Ansible: the Automation Framework That Thinks Like a Sysadmin” by Shawn Powers

Part II: “Ansible: Making Things Happen” by Shawn Powers


Caylent’s 12-Step Guide to Managing Docker Swarm Using Ansible

Want to standardize and automate your whole Docker Swarm management? Feeling inspired by our article Creating a High-Availability Docker Swarm on AWS and now wish to reduce the repetition every time you need a swarm? Well, we’ve got you covered with the help of Ansible. In the following article, we’ll lead you through an Ansible playbook that installs a fresh version of Docker, establishes a node as your Docker Swarm manager, and adds additional managers and workers to your swarm until you have a high-availability cluster. Furthermore, the process creates a default Docker network to enable the nodes to communicate properly.

Caylent’s Ansible/Docker Swarm Playbook Guide

1. Setup Prerequisites

Here’s a link to the hosts file and playbook on GitHub which you need for Ansible to build out a Docker Swarm: Caylent/Ansible-Docker-Swarm

On top of this, for the sake of this article, you’ll need to have a few other things already in place:

  • A network of connected machines (i.e. a virtual private cloud (VPC))
  • At least 2 public subnets
  • 5 EC2 instances on AWS with an elastic load balancer (ELB)

Set these ingress rules on your EC2 security groups:

  • HTTP port 80 from (anywhere)
  • SSH port 22 from (for increased security, replace this with your own IP address)

Once your machines are configured correctly we can begin. We’re using Ubuntu 14.04 LTS, though the process will work similarly on other Linux-based operating systems too.

2. Assign Nodes

Before diving into the rest of the tasks that will install Docker and start up the swarm, it’s necessary to detail the specifications of the nodes outlined on AWS. The hosts file below specifies the nodes needed to create managers and workers, and the role each node will undertake. Fill in the IP addresses accordingly to assign your nodes, replacing {{ manager->ip }} with your intended manager node IPs and {{ worker->ip }} with your intended worker ones. If you prefer to have 5 managers, you can fill those in following the format shown for the first 3, and the same goes for additional workers. As we’ve mentioned before, it’s important to always create an odd number of managers in your swarm, as a ‘majority’ vote is needed between managers to elect the lead instance. This works in accordance with Docker’s Raft consensus implementation.

[docker-manager-first]
manager1 ansible_host="{{manager1->ip}}"

[docker-managers]
manager2 ansible_host="{{manager2->ip}}"
manager3 ansible_host="{{manager3->ip}}"

[docker-workers]
worker1 ansible_host="{{worker1->ip}}"
worker2 ansible_host="{{worker2->ip}}"

[docker-api]
manager1
manager2
manager3

[docker-cloud]
manager1
manager2
manager3

3. Customize Variables

You will also need to customize the group variables in group_vars/all.yml to reflect your own SSH username and the path to your private key.

ansible_ssh_user: {{ssh-username}}
ansible_ssh_private_key_file: "{{~/path/to/your/ssh_private_key}}"
ansible_host_key_checking: false

Now that you’ve supplied Ansible with the necessary information to access your nodes, you can run the playbook. Each set of commands will automatically loop until completed, meaning very little input is required on your part—with the exception of some copying and pasting.

To run the playbook, enter the following command in the root folder of the cloned repo:

$ ansible-playbook docker-ce.yaml -i hosts.ini

4. Install Docker on Ubuntu

The following code operates on all hosts; both managers and workers.

- hosts: all
  remote_user: root
  become: yes
  become_method: sudo
  tasks:

The following tasks check that no previous installs of Docker exist on your nodes.

- name: "add docker repository"
  apt_repository: repo='deb [arch=amd64] {{ ansible_distribution_release }} stable' state=present
  when: ansible_distribution == "Ubuntu"

- name: "ensure old versions of docker are purged 1"
  apt: name=lxc-docker state=absent purge=yes
  when: ansible_distribution == "Ubuntu"

- name: "ensure old versions of docker are purged 2"
  apt: name=docker state=absent purge=yes
  when: ansible_distribution == "Ubuntu"

- name: "ensure old versions of docker are purged 3"
  apt: name=docker-engine state=absent purge=yes
  when: ansible_distribution == "Ubuntu"

- name: "ensure old versions of docker are purged 4"
  apt: state=absent purge=yes
  when: ansible_distribution == "Ubuntu"

The following tasks check the current kernel version and then download dependencies for Ubuntu 14.04.

- name: "get kernel version"
  shell: uname -r
  register: kernel

- name: "install 14.04 pre-req 1"
  apt:
    name: linux-image-extra-{{ kernel.stdout }}
    state: present
    update_cache: yes
    install_recommends: yes
  when: ansible_distribution == "Ubuntu"
  retries: 3
  delay: 20

- name: "install 14.04 pre-req 2"
  apt: name=linux-image-extra-virtual state=present update_cache=yes install_recommends=yes
  when: ansible_distribution == "Ubuntu"
  retries: 3
  delay: 20

The following task installs your preferred Docker version. Our example input is ‘17.06.2*’.

- name: "install docker"
  apt: name=docker-ce=17.06.2* state=present update_cache=yes install_recommends=yes allow_unauthenticated=yes
  when: ansible_distribution == "Ubuntu"
  retries: 3
  delay: 20

5. Create Docker Group

The following tasks create a Docker group, add Ubuntu as a user, and restart the service so that we don’t need to use sudo every time we use a Docker command.

- name: "add docker group"
  group: name=docker state=present

- name: "add ubuntu to docker group"
  user: name=ubuntu groups=docker append=yes

- name: "restart docker service"
  service: name=docker state=started

- name: "get docker info"
  shell: docker info
  register: docker_info
  changed_when: false

6. Initiate the Swarm

The following tasks will run on the first manager as specified above in the hosts file.

- hosts: docker-manager-first
  remote_user: root
  become: yes
  become_method: sudo
  tasks:

The following tasks initiate a Docker Swarm and then save the master and worker tokens so we can add more hosts to the cluster.

- name: "create primary swarm manager"
  shell: docker swarm init --advertise-addr {{ ansible_eth0['ipv4']['address'] }}
  when: "docker_info.stdout.find('Swarm: inactive') != -1"

- name: "get docker swarm manager token"
  shell: docker swarm join-token -q manager
  register: manager_token

- name: "get docker swarm worker token"
  shell: docker swarm join-token -q worker
  register: worker_token

7. Add Managers to the Swarm

The following tasks run on all nodes designated as ‘docker-managers’ in the hosts file, adding each to the swarm as managers.

- hosts: docker-managers
  remote_user: root
  become: yes
  become_method: sudo
  tasks:
    - name: "join as a manager"
      shell: "docker swarm join --token {{ hostvars['manager1']['manager_token']['stdout'] }} {{ hostvars['manager1']['ansible_eth0']['ipv4']['address'] }}:2377"
      when: docker_info.stdout.find("Swarm{{':'}} inactive") != -1
      retries: 3
      delay: 20

8. Add Workers to the Swarm

The following tasks add all nodes designated as ‘docker-workers’ in the hosts file to your swarm as workers.

- hosts: docker-workers
  remote_user: root
  become: yes
  become_method: sudo
  tasks:
    - name: "join as a worker"
      shell: "docker swarm join --token {{ hostvars['manager1']['worker_token']['stdout'] }} {{ hostvars['manager1']['ansible_eth0']['ipv4']['address'] }}:2377"
      when: "docker_info.stdout.find('Swarm: inactive') != -1"
      retries: 3
      delay: 20

9. Expose Docker API

The following tasks will run on all manager nodes that were previously designated in the hosts file under ‘docker-api’.


The commands confirm that the API is running on these nodes and is exposed. If not, the task stops Docker, exposes it, and restarts the service.

- hosts: docker-api
  remote_user: root
  become: yes
  become_method: sudo
  tasks:
    - name: "confirm service exists"
      stat: path=/etc/init.d/docker
      register: service_wrapper

    - name: "check whether api already exposed"
      command: "grep 'DOCKER_OPTS=\"-D -H tcp:// -H unix:///var/run/docker.sock\"' /etc/default/docker"
      register: check_has_api
      always_run: True
      ignore_errors: True
      changed_when: False

    - name: "stop docker"
      service: name=docker state=stopped
      when: service_wrapper.stat.exists and check_has_api.stdout == ""
      register: service_stopped

    - name: "expose docker api"
      lineinfile: "dest=/etc/default/docker state=present regexp='#DOCKER_OPTS=' line='DOCKER_OPTS=\"-H tcp:// -H unix:///var/run/docker.sock\"'"
      when: service_stopped and check_has_api.stdout == ""

    - name: "restart docker service"
      service: name=docker state=started
      when: service_wrapper.stat.exists and check_has_api.stdout == ""

10. Create Daemon File

The following tasks will run on all the manager nodes designated in the hosts file under ‘docker-cloud’ and confirm that there is a daemon file on each. If not, they stop Docker, create a daemon file, and restart the service.

- hosts: docker-cloud
  remote_user: root
  become: yes
  become_method: sudo
  tasks:
    - name: "confirm service exists"
      stat: path=/etc/init.d/docker
      register: service_wrapper

    - name: "check for daemon file"
      stat: path=/etc/docker/daemon.json
      register: daemon_file

    - name: "stop docker"
      service: name=docker state=stopped
      when: service_wrapper.stat.exists and not daemon_file.stat.exists
      register: service_stopped

    - name: "create daemon file"
      template: src=templates/daemon.j2 dest=/etc/docker/daemon.json
      when: not daemon_file.stat.exists

    - name: "restart docker service"
      service: name=docker state=started
      when: service_wrapper.stat.exists

11. List Networks

The following task lists the networks on your Docker manager. This output will determine if the default network still needs to be created.

- hosts: docker-manager-first
  remote_user: root
  become: yes
  become_method: sudo
  tasks:
    - name: list networks
      shell: docker network ls
      register: docker_networks

12. Expand Networks

After checking if the default network exists, Ansible will work through the list of networks provided and create each one with the specified subnet, gateway, and appropriate name. This network allows containers within this service to communicate properly.

- name: create network when not there
  shell: docker network create --driver overlay --subnet {{ item.subnet }} --gateway {{ item.gateway }} {{ item.name }}
  with_items:
    - {name: 'caylent-default', subnet: '', gateway: ''}
  when: docker_networks.stdout.find(item.name) == -1

Congratulations, you’re all finished and the entire process is now automated! As always, we’d love your feedback and suggestions for future articles.

Original Link

Ansible: Making Things Happen

Finally, an automation framework that thinks like a sysadmin. Ansible, you’re hired.

In my last article, I described how to configure your server and clients so you could connect to each client from the server. Ansible is a push-based automation tool, so the connection is initiated from your “server”, which is usually just a workstation or a server you ssh in to from your workstation. In this article, I explain how modules work and how you can use Ansible in ad-hoc mode from the command line.

Ansible is supposed to make your job easier, so the first thing you need to learn is how to do familiar tasks. For most sysadmins, that means some simple command-line work. Ansible has a few quirks when it comes to command-line utilities, but it’s worth learning the nuances, because it makes for a powerful system.

Command Module

This is the safest module to execute remote commands on the client machine. As with most Ansible modules, it requires Python to be installed on the client, but that’s it. When Ansible executes commands using the Command Module, it does not process those commands through the user’s shell. This means some variables like $HOME are not available. It also means stream functions (redirects, pipes) don’t work. If you don’t need to redirect output or to reference the user’s home directory as a shell variable, the Command Module is what you want to use. To invoke the Command Module in ad-hoc mode, do something like this:

ansible host_or_groupname -m command -a "whoami"

Your output should show SUCCESS for each host referenced and then return the username used to log in. You’ll notice that the user is not root, unless that’s the user you used to connect to the client computer.

If you want to see the elevated user, you’ll add another argument to the ansible command. You can add -b in order to “become” the elevated user (or the sudo user). So, if you were to run the same command as above with a “-b” flag:

ansible host_or_groupname -b -m command -a "whoami"

you should see a similar result, but the whoami results should say root instead of the user you used to connect. That flag is important to use, especially if you try to run remote commands that require root access!

Shell Module

There’s nothing wrong with using the Shell Module to execute remote commands. It’s just important to know that since it uses the remote user’s environment, if there’s something goofy with the user’s account, it might cause problems that the Command Module avoids. If you use the Shell Module, however, you’re able to use redirects and pipes. You can use the whoami example to see the difference. This command:

ansible host_or_groupname -m command -a "whoami > myname.txt"

should result in an error about > not being a valid argument. Since the Command Module doesn’t run inside any shell, it interprets the greater-than character as something you’re trying to pass to the whoami command. If you use the Shell Module, however, you have no problems:

ansible host_or_groupname -m shell -a "whoami > myname.txt"

This should execute and give you a SUCCESS message for each host, but there should be nothing returned as output. On the remote machine, however, there should be a file called myname.txt in the user’s home directory that contains the name of the user. My personal policy is to use the Command Module whenever possible and to use the Shell Module if needed.
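The difference really comes down to whether a shell interprets the line. You can see the same effect locally, without Ansible at all (a simplified illustration, not how Ansible is implemented internally):

```shell
# No shell involved: ">" is passed to echo as a literal argument.
env echo hello ">" out.txt      # prints: hello > out.txt  (no file is created)

# A shell involved: the redirect is performed and the file is written.
sh -c 'echo hello > out.txt'
cat out.txt                     # prints: hello
```

The Command Module behaves like the first invocation; the Shell Module behaves like the second.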

The Raw Module

Functionally, the Raw Module works like the Shell Module. The key difference is that Ansible doesn’t do any error checking, and STDERR, STDOUT and the return code are all returned. Other than that, Ansible has no idea what happens, because it just executes the command over SSH directly. So while the Shell Module will use /bin/sh by default, the Raw Module just uses whatever the user’s personal default shell might be.

Why would a person decide to use the Raw Module? It doesn’t require Python on the remote computer—at all. Although it’s true that most servers have Python installed by default, or easily could have it installed, many embedded devices don’t and can’t have Python installed. For most configuration management tools, not having an agent program installed means the remote device can’t be managed. With Ansible, if all you have is SSH, you still can execute remote commands using the Raw Module. I’ve used the Raw Module to manage Bitcoin miners that have a very minimal embedded environment. It’s a powerful tool, and when you need it, it’s invaluable!

Copy Module

Although it’s certainly possible to do file and folder manipulation with the Command and Shell Modules, Ansible includes a module specifically for copying files to the server. Even though it requires learning a new syntax for copying files, I like to use it because Ansible will check to see whether a file exists, and whether it’s the same file. That means it copies the file only if it needs to, saving time and bandwidth. It even will make backups of existing files! I can’t tell you how many times I’ve used scp and sshpass in a Bash FOR loop and dumped files on servers, even if they didn’t need them. Ansible makes it easy and doesn’t require FOR loops and IP iterations.

The syntax is a little more complicated than with Command, Shell or Raw. Thankfully, as with most things in the Ansible world, it’s easy to understand—for example:

ansible host_or_groupname -b -m copy \ -a "src=./updated.conf dest=/etc/ntp.conf \ owner=root group=root mode=0644 backup=yes"

This will look in the current directory (on the Ansible server/workstation) for a file called updated.conf and then copy it to each host. On the remote system, the file will be put in /etc/ntp.conf, and if a file already exists, and it’s different, the original will be backed up with a date extension. If the files are the same, Ansible won’t make any changes.

I tend to use the Copy Module when updating configuration files. It would be perfect for updating configuration files on Bitcoin miners, but unfortunately, the Copy Module does require that the remote machine has Python installed. Nevertheless, it’s a great way to update common files on many remote machines with one simple command. It’s also important to note that the Copy Module supports copying remote files to other locations on the remote filesystem using the remote_src=true directive.

File Module

The File Module has a lot in common with the Copy Module, but if you try to use the File Module to copy a file, it doesn’t work as expected. The File Module does all its actions on the remote machine, so src and dest are all references to the remote filesystem. The File Module often is used for creating directories, creating links or deleting remote files and folders. The following will simply create a folder named /etc/newfolder on the remote servers and set the mode:

ansible host_or_groupname -b -m file \ -a "path=/etc/newfolder state=directory mode=0755"

You can, of course, set the owner and group, along with a bunch of other options, which you can learn about on the Ansible doc site. I find I most often will either create a folder or symbolically link a file using the File Module. To create a symlink:

ansible host_or_groupname -b -m file \ -a "src=/etc/ntp.conf dest=/home/user/ntp.conf \ owner=user group=user state=link"

Notice that the state directive is how you inform Ansible what you actually want to do. There are several state options:

  • link — create symlink.

  • directory — create directory.

  • hard — create hardlink.

  • touch — create empty file.

  • absent — delete file or directory recursively.

This might seem a bit complicated, especially when you easily could do the same with a Command or Shell Module command, but the clarity of using the appropriate module makes it more difficult to make mistakes. Plus, learning these commands in ad-hoc mode will make playbooks, which consist of many commands, easier to understand (I plan to cover this in my next article).

Package Management

Anyone who manages multiple distributions knows it can be tricky to handle the various package managers. Ansible handles this in a couple ways. There are specific modules for apt and yum, but there’s also a generic module called “package” that will install on the remote computer regardless of whether it’s Red Hat- or Debian/Ubuntu-based.

Unfortunately, while Ansible usually can detect the type of package manager it needs to use, it doesn’t have a way to fix packages with different names. One prime example is Apache. On Red Hat-based systems, the package is “httpd”, but on Debian/Ubuntu systems, it’s “apache2”. That means some more complex things need to happen in order to install the correct package automatically. The individual modules, however, are very easy to use. I find myself just using apt or yum as appropriate, just like when I manually manage servers. Here’s an apt example:

ansible host_or_groupname -b -m apt \ -a "update_cache=yes name=apache2 state=latest"

With this one simple line, all the host machines will run apt-get update (that’s the update_cache directive at work), then install apache2’s latest version including any dependencies required. Much like the File Module, the state directive has a few options:

  • latest — get the latest version, upgrading existing if needed.

  • absent — remove package if installed.

  • present — make sure package is installed, but don’t upgrade existing.

The Yum Module works similarly to the Apt Module, but I generally don’t bother with the update_cache directive, because yum updates automatically. Although very similar, installing Apache on a Red Hat-based system looks like this:

ansible host_or_groupname -b -m yum \ -a "name=httpd state=present"

The difference with this example is that if Apache is already installed, it won’t update, even if an update is available. Sometimes updating to the latest version isn’t what you want, so this stops that from accidentally happening.

Just the Facts, Ma’am

One frustrating thing about using Ansible in ad-hoc mode is that you don’t have access to the “facts” about the remote systems. In my next article, where I plan to explore creating playbooks full of various tasks, you’ll see how you can reference the facts Ansible learns about the systems. It makes Ansible far more powerful, but again, it can be utilized only in playbook mode. Nevertheless, it’s possible to use ad-hoc mode to peek at the sorts of information Ansible gathers. If you run the setup module, it will show you all the details from a remote system:

ansible host_or_groupname -b -m setup

That command will spew a ton of variables on your screen. You can scroll through them all to see the vast amount of information Ansible pulls from the host machines. In fact, it shows so much information, it can be overwhelming. You can filter the results:

ansible host_or_groupname -b -m setup -a "filter=*family*"

That should just return a single variable, ansible_os_family, which likely will be Debian or Red Hat. When you start building more complex Ansible setups with playbooks, it’s possible to insert some logic and conditionals in order to use yum where appropriate and apt where the system is Debian-based. Really, the facts variables are incredibly useful and make building playbooks that much more exciting.
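As a small taste of that, a playbook fragment branching on ansible_os_family might look something like this (a hypothetical sketch, not from this article, to show how the httpd/apache2 naming difference could be handled):

```yaml
# Hypothetical playbook tasks: pick the right Apache package per OS family.
- name: install Apache on Debian/Ubuntu systems
  apt:
    name: apache2
    state: latest
    update_cache: yes
  when: ansible_os_family == "Debian"

- name: install Apache on Red Hat-based systems
  yum:
    name: httpd
    state: present
  when: ansible_os_family == "RedHat"
```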

But, that’s for another article, because you’ve come to the end of the second installment. Your assignment for now is to get comfortable using Ansible in ad-hoc mode, doing one thing at a time. Most people think ad-hoc mode is just a stepping stone to more complex Ansible setups, but I disagree. The ability to configure hundreds of servers consistently and reliably with a single command is nothing to scoff at. I love making elaborate playbooks, but just as often, I’ll use an ad-hoc command in a situation that used to require me to ssh in to a bunch of servers to do simple tasks. Have fun with Ansible; it just gets more interesting from here!

Original Link

Ansible Run Analysis

Ansible can be used for a lot of things, and it has grown pretty popular for managing servers and their configuration. Today, Ansible is heavily used to deploy or test through Continuous Integration (CI).

In the world of automated continuous integration, it’s not uncommon to have hundreds, if not thousands of jobs running every day for testing, building, compiling, deploying, and so on.


Well, you cannot call it a “problem,” but Ansible runs generate quite a large amount of console data. And yes, it continues to grow with each flag (Remember -v & -vvv?) that you add to the run!

Keeping up with such a large amount of Ansible runs and their outcome, not just in the context of CI, is challenging.

Due to this mess, one tends to feel the need for something that will present this verbose output in a way that is easily readable, graphical, tabular, and more representative of the job status and debug information.

That something is ARA.

Ansible Run Analysis (ARA) Tool

ARA records Ansible playbook runs and makes the recorded data available and intuitive for users and systems. ARA organizes recorded playbook data in a way to make it intuitive for you to search and find what you’re interested in as fast and as easily as possible.


It provides summaries of task results per host or per playbook.


It allows you to filter task results by playbook, play, host, task, or by the status of the task.


With ARA, you’re able to easily drill down from the summary view for the results you’re interested in, whether it’s a particular host or a specific task.


Beyond browsing a single ansible-playbook run, ARA supports recording and viewing multiple runs in the same database.



There are two ways in which you can install ARA in your system.

  • Using Ansible Role hosted on my GitHub Account

    • Clone the repo and run:

       ansible-playbook Playbook.yml

    • If the Playbook run is successful, you will get:

    TASK [ara : Display ara UI URL] ************************
    ok: [localhost] => {
        "msg": "Access playbook records at http://YOUR_IP:9191"
    }
    • Note: It picks the IP address from ansible_default_ipv4 fact gathered by Ansible. If there is no such fact gathered, replace it with your IP in main.yml file present in the roles/ara/tasks/ folder.

  • ARA is an open source project available on GitHub under the Apache v2 license. Installation instructions are present under the Quickstart chapter.


Ever since I came across this tool, it’s been a useful resource for me to get more out of Ansible run logs and outputs. I would highly recommend it for all Ansible Ninjas out there.

Feel free to share this with others and do let me know your experience with using ARA.

Original Link

Automating Your Infrastructure Code Tests

Quality must be everyone’s responsibility, and all teams must be aware of it. Assuring the quality of all the code we produce is not the kind of task we should delegate. We must take ownership of our work and deliver it with quality.

Infrastructure test automation, alongside application test automation, is an important part of the process of delivering code. Every change you make in your Ansible playbook, or in any file of your infrastructure project, must be followed by a test of the entire project.

The tests can be done either manually or automatically. The advantage of automating the tests is obviously to save time and make them reproducible at any moment. Although you have to invest some time in developing the automation, you get rid of manually repeating the tests. With automation, it becomes a simple matter of clicking a button.

You can use a tool like Molecule to test your Ansible playbooks, or simply use shell scripts. The file below is an example of using a shell script to automate the tests of an entire Ansible project. You can find more about the project in the article Dealing With the Same Configuration File With Different Content in Different Environments. You can also clone the Codeyourinfra repository and take a look at the same_cfgfile_diff_content directory.

tmpfile=$(mktemp)

teardown()
{
  vagrant destroy -f
  rm -rf .vagrant/ *.retry "$tmpfile"
}

. ../common/

# turn on the environment
vagrant up

# check the solution playbook syntax
checkPlaybookSyntax playbook.yml hosts

# execute the solution
ansible-playbook playbook.yml -i hosts | tee ${tmpfile}
assertEquals 3 $(tail -5 ${tmpfile} | grep -c "failed=0")

# validate the solution
ansible qa -i hosts -m shell -a "cat /etc/conf" | tee ${tmpfile}
assertEquals "prop1=Aprop2=B" $(awk '/qa1/ {for(i=1; i<=2; i++) {getline; printf}}' ${tmpfile})
assertEquals "prop1=Cprop2=D" $(awk '/qa2/ {for(i=1; i<=2; i++) {getline; printf}}' ${tmpfile})
assertEquals "prop1=Eprop2=F" $(awk '/qa3/ {for(i=1; i<=2; i++) {getline; printf}}' ${tmpfile})

# turn off the environment and exit
teardown
exit 0

The script is quite simple. It basically turns on the environment required for testing, does the tests, and turns off the environment. If everything goes as expected, the script exits with code 0. Otherwise, the exit code is 1. (Here is a great article about exit codes.)

The environment for testing is managed by Vagrant. The up command turns the environment on, while the destroy command turns it off. Vagrant can manage both local virtual machines and AWS EC2 instances. When the test is done in the cloud, there’s an additional step of gathering the IP addresses from AWS. Ansible requires these IPv4 addresses in order to connect to the remote hosts through SSH. If you want more details, please take a look at the previous article Bringing Ansible Development to the Cloud.

Notice that the environment is turned off and all the auxiliary files are removed in the teardown function. Other functions are also used within the script, loaded from the common file. They are as follows:

  • checkPlaybookSyntax – uses the --syntax-check option of the ansible-playbook command in order to validate the playbook YML file;
  • assertEquals – compares an expected value with the actual one in order to validate what was supposed to happen;
  • assertFileExists – checks if a required file exists.
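The helpers themselves aren’t shown here; a minimal sketch of what an assertEquals implementation could look like (hypothetical — the real one lives in the common file of the repository) is:

```shell
# Hypothetical assertEquals helper: compare expected ($1) with actual ($2)
# and abort the test run with exit code 1 on mismatch.
assertEquals()
{
  if [ "$1" != "$2" ]; then
    echo "FAIL: expected '$1' but got '$2'"
    exit 1
  fi
  echo "OK"
}

assertEquals "prop1=Aprop2=B" "prop1=Aprop2=B"   # prints: OK
```

Because a failing assertion exits with code 1, any wrapping CI job immediately knows the infrastructure test failed.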

The script also creates a temporary file, to which the tee command writes the output of each ansible-playbook execution. Right after each execution, some assertions are made in order to check that everything went fine. The example below shows the output of the playbook.yml execution.

ansible-playbook playbook.yml -i hosts

PLAY [qa] **************************************************************

TASK [Gathering Facts] *************************************************
ok: [qa1]
ok: [qa2]
ok: [qa3]

TASK [Set the configuration file content] ******************************
changed: [qa1] => (item={'key': u'prop1', 'value': u'A'})
changed: [qa3] => (item={'key': u'prop1', 'value': u'E'})
changed: [qa2] => (item={'key': u'prop1', 'value': u'C'})
changed: [qa1] => (item={'key': u'prop2', 'value': u'B'})
changed: [qa2] => (item={'key': u'prop2', 'value': u'D'})
changed: [qa3] => (item={'key': u'prop2', 'value': u'F'})

PLAY RECAP *************************************************************
qa1 : ok=2 changed=1 unreachable=0 failed=0
qa2 : ok=2 changed=1 unreachable=0 failed=0
qa3 : ok=2 changed=1 unreachable=0 failed=0

The tail command gets the last five (-5) lines of the temporary file, and the grep command counts (-c) how many of them contain “failed=0”. Ansible outputs the result at the end, and success (failed=0) is expected in the performing of the tasks on all three target hosts (3).
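You can replay that counting logic in isolation against a captured recap (the sample text below is made up to mirror the output above):

```shell
# Count "failed=0" occurrences in the last lines of a sample PLAY RECAP.
recap='PLAY RECAP *********
qa1 : ok=2 changed=1 unreachable=0 failed=0
qa2 : ok=2 changed=1 unreachable=0 failed=0
qa3 : ok=2 changed=1 unreachable=0 failed=0'

echo "$recap" | tail -5 | grep -c "failed=0"   # prints: 3
```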

In a single execution, Ansible is able to perform tasks on multiple hosts. The Ansible ad-hoc command below executes the command cat /etc/conf on each of the hosts that belong to the test environment (qa1, qa2 and qa3). The goal is to validate the prior playbook execution: the content of the configuration file on each host must be as defined in the config.json file.

ansible qa -i hosts -m shell -a "cat /etc/conf"

qa2 | SUCCESS | rc=0 >>
prop1=C
prop2=D

qa3 | SUCCESS | rc=0 >>
prop1=E
prop2=F

qa1 | SUCCESS | rc=0 >>
prop1=A
prop2=B

The awk command finds a specific line by a pattern (/hostname/) and joins the two lines below it into a single line. This way it is possible to compare the configuration file content obtained from each host with the expected content.
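Here is the same awk trick replayed against a made-up sample of the output above, with an explicit printf format added for clarity:

```shell
# Join the two lines following the host line into one string.
sample='qa1 | SUCCESS | rc=0 >>
prop1=A
prop2=B'

echo "$sample" | awk '/qa1/ {for(i=1; i<=2; i++) {getline; printf "%s", $0}}'
# prints: prop1=Aprop2=B
```

The joined string is what assertEquals compares against the expected value in the test script.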

Every Codeyourinfra project solution has its own automated tests. You can check them out by navigating through the repository directories. The file in each folder does the job, including those in the aws subdirectories. In that case, the test environment is turned on in an AWS region of your choice.

Shell scripting is just one example of how you can implement your infrastructure test automation. You can also use Docker containers instead of virtual machines managed by Vagrant. The important thing is having a consistent and reproducible way to guarantee the quality of your infrastructure code.

The next step is to create a continuous integration process for developing your infrastructure, but that’s the subject of the next article. Stay tuned!

Before I forget, I must reinforce it: the purpose of the Codeyourinfra project is to help you. So, don’t hesitate to tell us about the problems you face as a sysadmin.

Original Link

Cloudy Morning: Docker, Serverless, and Cloud Dev

I hope everyone has had a great weekend! Now that Monday is here, grab your java (puns: I’ve got them), settle in, and let’s get caught up with what’s making news in the cloud-o-sphere. The big themes this week are Docker (namely whether the company itself is dying as well as a ton of tools you can use in container-based projects), some tips for serverless computing (including how Java works with AWS Lambda), and some general tips for developing in the cloud (like bringing Ansible to it as well as a newly launched Go 2.0 SDK for AWS).

So, without further ado, let’s brave the storm.

Looking Cloudy Out

  1. Docker (Inc.) Is Dead, by Chris Short. 2017 was a great year for containers, but the future is grim for the company that made them popular. See what happened to Docker last year and what’s in store for 2018 — and debate on whether you agree!
  2. 50+ Useful Docker Tools, by JP La Torre. As containers take root, dozens of tools have sprung up to support them. Check out your options for orchestration, CI/CD, logging, monitoring, and more!
  3. 4 Advantages of Using Java With AWS Lambda, by Serhat Can. Sure, cold starts hurt Java usage with AWS Lambda, but there are plenty of benefits that the language and platform bring to serverless computing.
  4. The Why and How of Kubernetes Namespaces, by Ajitesh Kumar. This Kubernetes primer will get you situated with namespaces and how to create them for different environments, how to assign contexts to them, and how to delete them.
  5. Bringing Ansible Development to the Cloud, by Gustavo Carmo. Wish you could have your dev environment provisioned and set up in the cloud? Well, here’s how to get Ansible working on AWS to do just that.

By the way, if you’re interested in writing for your fellow DZoners, feel free to check out our Writers’ Zone, where you can also find some current hot topics and our Bounty Board, which has writing prompts coupled with prizes.

Going Stormchasing

A Golang 2.0 SDK for AWS Users

Are you a Go fan? With Go 2.0 hopefully right around the corner, you might be happy to know that AWS just released an SDK to help users get ready. More to the point, this is a preview of what’s likely in store, and AWS is asking developers to give the SDK a spin and tell them what you like and don’t like. Feedback is an important tool, so use it!

“Pluggable” Clouds: It’s a Trap!

With multi-cloud environments gaining traction, the push toward “pluggable” clouds is on the rise once again. If you’re not familiar with them, pluggable clouds supposedly make things easy by letting you swap out public or private clouds without major changes to your apps’ dependencies. But here’s why that’s not a great model for multi-cloud — or at least what you need to be aware of before you pull the trigger.

GCP and GPUs

One of the most interesting developments (to me at least) in recent times is the work that’s been done to bring GPUs to the cloud for devs to use. Well, Google Cloud recently announced the launch of GPUs attached to preemptible VMs. Check out the article to see the ideal use cases for these GPUs as well as the caveats you need to be aware of before you take the plunge.

Diving Deeper Into Cloud

  1. Refcard of the Week: Getting Started With Docker
  2. Guide: Orchestrating and Deploying Containers

Original Link

Ansible: the Automation Framework That Thinks Like a Sysadmin

I’ve written about and trained folks on various DevOps tools through the years, and although they’re awesome, it’s obvious that most of them are designed from the mind of a developer. There’s nothing wrong with that, because approaching configuration management programmatically is the whole point. Still, it wasn’t until I started playing with Ansible that I felt like it was something a sysadmin quickly would appreciate.

Part of that appreciation comes from the way Ansible communicates with its client computers—namely, via SSH. As sysadmins, you’re all very familiar with connecting to computers via SSH, so right from the word “go”, you have a better understanding of Ansible than the other alternatives.

With that in mind, I’m planning to write a few articles exploring how to take advantage of Ansible. It’s a great system, but when I was first exposed to it, it wasn’t clear how to start. It’s not that the learning curve is steep. In fact, if anything, the problem was that I didn’t really have that much to learn before starting to use Ansible, and that made it confusing. For example, if you don’t have to install an agent program (Ansible doesn’t have any software installed on the client computers), how do you start?

Getting to the Starting Line

The reason Ansible was so difficult for me at first is because it’s so flexible with how to configure the server/client relationship, I didn’t know what I was supposed to do. The truth is that Ansible doesn’t really care how you set up the SSH system; it will utilize whatever configuration you have. There are just a couple things to consider:

  1. Ansible needs to connect to the client computer via SSH.

  2. Once connected, Ansible needs to elevate privilege so it can configure the system, install packages and so on.

Unfortunately, those two considerations really open a can of worms. Connecting to a remote computer and elevating privilege is a scary thing to allow. For some reason, it feels less vulnerable when you simply install an agent on the remote computer and let Chef or Puppet handle privilege escalation. It’s not that Ansible is any less secure, but rather, it puts the security decisions in your hands.

Next I’m going to list a bunch of potential configurations, along with the pros and cons of each. This isn’t an exhaustive list, but it should get you thinking along the right lines for what will be ideal in your environment. I also should note that I’m not going to mention systems like Vagrant, because although Vagrant is wonderful for building a quick infrastructure for testing and developing, it’s so very different from a bunch of servers that the considerations are too dissimilar really to compare.

Some SSH Scenarios

1) SSHing into remote computer as root with password in Ansible config.

I started with a terrible idea. The “pros” of this setup are that it eliminates the need for privilege escalation and that no other user accounts are required on the remote server. But, the cost for such convenience isn’t worth it. First, most systems won’t let you SSH in as root without changing the default configuration. Those default configurations are there because, quite frankly, it’s just a bad idea to allow the root user to connect remotely. Second, putting a root password in a plain-text configuration file on the Ansible machine is mortifying. Really, I mentioned this possibility because it is a possibility, but it’s one that should be avoided. Remember, Ansible allows you to configure the connection yourself, and it will let you do really dumb things. Please don’t.

2) SSHing into a remote computer as a regular user, using a password stored in the Ansible config.

An advantage of this scenario is that it doesn’t require much configuration of the clients. Most users are able to SSH in by default, so Ansible should be able to use credentials and log in fine. I personally dislike the idea of a password being stored in plain text in a configuration file, but at least it isn’t the root password. If you use this method, be sure to consider how privilege escalation will take place on the remote server. I know I haven’t talked about escalating privilege yet, but if you have a password in the config file, that same password likely will be used to gain sudo access. So with one slip, you’ve compromised not only the remote user’s account, but also potentially the entire system.

3) SSHing into a remote computer as a regular user, authenticating with a key pair that has an empty passphrase.

This eliminates storing passwords in a configuration file, at least for the logging in part of the process. Key pairs without passphrases aren’t ideal, but it’s something I often do in an environment like my house. On my internal network, I typically use a key pair without a passphrase to automate many things like cron jobs that require authentication. This isn’t the most secure option, because a compromised private key means unrestricted access to the remote user’s account, but I like it better than a password in a config file.
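For reference, creating such a passphrase-less key pair takes one command; the file name below is a placeholder for illustration:

```shell
# Generate an Ed25519 key pair with an empty passphrase (-N '').
# ./automation_key is a placeholder path for this example.
ssh-keygen -t ed25519 -N '' -f ./automation_key -q

# The private key must stay readable only by its owner.
chmod 600 ./automation_key
ls -l ./automation_key ./automation_key.pub
```

Because nothing protects the private key itself, treat the file with the same care you would a plain-text password.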

4) SSHing into a remote computer as a regular user, authenticating with a key pair that is secured by a passphrase.

This is a very secure way of handling remote access, because it requires two different authentication factors: 1) the private key and 2) the passphrase to decrypt it. If you’re just running Ansible interactively, this might be the ideal setup. When you run a command, Ansible should prompt you for the private key’s passphrase, and then it’ll use the key pair to log in to the remote system. Yes, the same could be done by just using a standard password login and not specifying the password in the configuration file, but if you’re going to be typing a password on the command line anyway, why not add the layer of protection a key pair offers?

5) SSHing with a passphrase-protected key pair, but using ssh-agent to “unlock” the private key.

This doesn’t perfectly answer the question of unattended, automated Ansible commands, but it does make a fairly secure setup convenient as well. The ssh-agent program authenticates the passphrase one time and then uses that authentication to make future connections. When I’m using Ansible, this is what I think I’d like to be doing. If I’m completely honest, I still usually use key pairs without passphrases, but that’s typically because I’m working on my home servers, not something prone to attack.
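A minimal sketch of that workflow follows; the file names and passphrase (demo_key, askpass.sh, demo-pass) are placeholders, and the non-interactive askpass trick assumes OpenSSH 8.4 or newer:

```shell
# Demo setup: create a passphrase-protected key. In practice you'd
# already have this key; names and passphrase are placeholders.
ssh-keygen -t ed25519 -f ./demo_key -N 'demo-pass' -q

# Start an agent for this shell; eval exports SSH_AUTH_SOCK for later commands.
eval "$(ssh-agent -s)" > /dev/null

# Interactively you would just run `ssh-add ./demo_key` and type the
# passphrase once. To keep this demo non-interactive, a throwaway askpass
# helper supplies it (SSH_ASKPASS_REQUIRE needs OpenSSH 8.4+).
printf '#!/bin/sh\necho demo-pass\n' > ./askpass.sh
chmod +x ./askpass.sh
SSH_ASKPASS=./askpass.sh SSH_ASKPASS_REQUIRE=force ssh-add ./demo_key

# The agent now holds the unlocked key; subsequent ssh/ansible runs from
# this shell authenticate without prompting.
ssh-add -l
```

When you're done, `ssh-agent -k` drops the unlocked key again.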

There are some other considerations to keep in mind when configuring your SSH environment. Perhaps you’re able to restrict the Ansible user (which is often your local user name) so it can log in only from a specific IP address. Perhaps your Ansible server can live in a different subnet, behind a strong firewall so its private keys are more difficult to access remotely. Maybe the Ansible server doesn’t have an SSH server installed on itself so there’s no incoming access at all. Again, one of the strengths of Ansible is that it uses the SSH protocol for communication, and it’s a protocol you’ve all had years to tweak into a system that works best in your environment. I’m not a big fan of proclaiming what the “best practice” is, because in reality, the best practice is to consider your environment and choose the setup that fits your situation the best.
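As one hedged example of such a restriction, an authorized_keys entry on a managed client can pin the Ansible key to the Ansible server's address; the IP address, key material, and comment below are placeholders:

```
# ~/.ssh/authorized_keys on the managed client.
# This key is only accepted from the Ansible server's IP, with
# forwarding disabled:
from="10.0.0.5",no-agent-forwarding,no-port-forwarding,no-X11-forwarding ssh-ed25519 AAAAC3...truncated ansible@controller
```

Even if the private key leaks, it is then useless from any other address.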

Privilege Escalation

Once your Ansible server connects to its clients via SSH, it needs to be able to escalate privilege. If you chose option 1 above, you’re already root, and this is a moot point. But since no one chose option 1 (right?), you need to consider how a regular user on the client computer gains access. Ansible supports a wide variety of escalation systems, but in Linux, the most common options are sudo and su. As with SSH, there are a few situations to consider, although there are certainly other options.

1) Escalate privilege with su.

For Red Hat/CentOS users, the instinct might be to use su in order to gain system access. By default, those systems configure the root password during install, and to gain privileged access, you need to type it in. The problem with using su is that although it gives you total access to the remote system, it also gives you total access to the remote system. (Yes, that was sarcasm.) Also, the su program doesn’t have the ability to authenticate with key pairs, so the password either must be interactively typed or stored in the configuration file. And since it’s literally the root password, storing it in the config file should sound like a horrible idea, because it is.

2) Escalate privilege with sudo.

This is how Debian/Ubuntu systems are configured. A user in the correct group has access to sudo a command and execute it with root privileges. Out of the box, this still has the problem of password storage or interactive typing. Since storing the user’s password in the configuration file seems a little less horrible, I guess this is a step up from using su, but it still gives complete access to a system if the password is compromised. (After all, typing sudo su - will allow users to become root just as if they had the root password.)

3) Escalate privilege with sudo and configure NOPASSWD in the sudoers file.

Again, in my local environment, this is what I do. It’s not perfect, because it gives unrestricted root access to the user account and doesn’t require any passwords. But when I do this and use SSH key pairs without passphrases, it allows me to automate Ansible commands easily. I’ll note again that although it is convenient, it is not a terribly secure idea.

4) Escalate privilege with sudo and configure NOPASSWD on specific executables.

This idea might be the best compromise of security and convenience. Basically, if you know what you plan to do with Ansible, you can give NOPASSWD privilege to the remote user for just those applications it will need to use. It might get a little confusing, since Ansible uses Python for lots of things, but with enough trial and error, you should be able to figure things out. It is more work, but does eliminate some of the glaring security holes.
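A sketch of what such a sudoers entry might look like; the "ansible" user name and the command paths are hypothetical, and the real list depends on what your playbooks actually run:

```
# /etc/sudoers.d/ansible -- edit with: sudo visudo -f /etc/sudoers.d/ansible
# Allow the "ansible" user to run only these commands as root, passwordless:
ansible ALL=(root) NOPASSWD: /usr/bin/apt-get, /usr/bin/systemctl
```

Be aware that most Ansible modules execute through the remote Python interpreter, so granting NOPASSWD on the interpreter itself would effectively grant full root access. That is exactly why narrowing the list takes the trial and error mentioned above.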

Implementing Your Plan

Once you decide how you’re going to handle Ansible authentication and privilege escalation, you need to set it up. After you become well versed at Ansible, you might be able to use the tool itself to help “bootstrap” new clients, but at first, it’s important to configure clients manually so you know what’s happening. It’s far better to automate a process you’re familiar with than to start with automation from the beginning.

I’ve written about SSH key pairs in the past, and there are countless articles online for setting it up. The short version, from your Ansible computer, looks something like this:

# ssh-keygen
# ssh-copy-id -i ~/.ssh/id_rsa.pub remote_user@remote_host
# ssh remote_user@remote_host

If you’ve chosen to use no passphrase when creating your key pairs, that last step should get you into the remote computer without typing a password or passphrase.

In order to set up privilege escalation in sudo, you’ll need to edit the sudoers file. You shouldn’t edit the file directly, but rather use:

# sudo visudo

This will open the sudoers file and allow you to make changes safely (it error-checks when you save, so you don’t accidentally lock yourself out with a typo). There are examples in the file, so you should be able to figure out how to assign the exact privileges you want.

Once it’s all configured, you should test it manually before bringing Ansible into the picture. Try SSHing to the remote client, and then try escalating privilege using whatever methods you’ve chosen. Once you have configured the way you’ll connect, it’s time to install Ansible.

Installing Ansible

Since the Ansible program gets installed only on the single computer, it’s not a big chore to get going. Red Hat/Ubuntu systems do package installs a bit differently, but neither is difficult.

In Red Hat/CentOS, first enable the EPEL repository:

sudo yum install epel-release

Then install Ansible:

sudo yum install ansible

In Ubuntu, first enable the Ansible PPA:

sudo apt-add-repository ppa:ansible/ansible
(press ENTER to accept the key and add the repo)

Then install Ansible:

sudo apt-get update
sudo apt-get install ansible

Configuring Ansible Hosts File

The Ansible system has no way of knowing which clients you want it to control unless you give it a list of computers. That list is very simple, and it looks something like this:

# file /etc/ansible/hosts
[webservers]
blogserver ansible_host=192.168.1.5
wikiserver ansible_host=192.168.1.10

[dbservers]
mysql_1 ansible_host=192.168.1.22
pgsql_1 ansible_host=192.168.1.23

The bracketed sections specify groups. Individual hosts can be listed in multiple groups, and Ansible can refer either to individual hosts or to groups. This is also the configuration file where things like plain-text passwords would be stored, if that’s the sort of setup you’ve planned. Each line in the configuration file configures a single host, and you can add multiple declarations after the ansible_host statement. Some useful options are ansible_port, ansible_user, ansible_ssh_pass, and ansible_ssh_private_key_file.
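A hedged example combining a few of those per-host settings; the addresses, port, user name, and key path are placeholders:

```
# file /etc/ansible/hosts
[webservers]
blogserver ansible_host=192.168.1.5 ansible_port=2222
wikiserver ansible_host=192.168.1.10 ansible_user=deploy

[dbservers]
mysql_1 ansible_host=192.168.1.22 ansible_ssh_private_key_file=~/.ssh/ansible_key
```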


The Ansible Vault

I also should note that although the setup is more complex, and not something you’ll likely do during your first foray into the world of Ansible, the program does offer a way to encrypt passwords in a vault. Once you’re familiar with Ansible and you want to put it into production, storing those passwords in an encrypted Ansible vault is ideal. But in the spirit of learning to crawl before you walk, I recommend starting in a non-production environment and using passwordless methods at first.

Testing Your System

Finally, you should test your system to make sure your clients are connecting. The ping test will make sure the Ansible computer can reach and log in to each host:

ansible -m ping all

After running, you should see a ping: pong message for each defined host if the connection succeeded. Keep in mind that Ansible’s ping module is not an ICMP ping: it logs in over SSH and verifies it can run Python on the client, so it already exercises authentication. It doesn’t test privilege escalation or ordinary command execution, though. Try this to test running a real command:

ansible -m shell -a 'uptime' webservers

You should see the results of the uptime command for each host in the webservers group.

In a future article, I plan to start digging into Ansible’s ability to manage remote computers. I’ll look at various modules and how you can use ad-hoc mode to accomplish in a few keystrokes what would take a long time to handle individually on the command line. If you didn’t get the results you expected from the sample Ansible commands above, take this time to make sure authentication is working. Check out the Ansible docs for more help if you get stuck.

Original Link

Bringing Ansible Development to the Cloud

It’s very important that you, as a sysadmin, have your own environment where you can develop and test Ansible playbooks. Like any developer’s environment, it must be under your total control, because you will certainly need to recreate it from scratch many times. The environment must not be shared, either, so there is no risk of finding it in an unexpected state after someone else intervenes.

Vagrant is an excellent tool to manage Ansible development environments. Its default integration with VirtualBox, amongst other hypervisors, allows you to have virtual machines in your own host. Through its simple command-line interface, you are able to create and destroy them, over and over, at your will.

Vagrant uses specific images known as boxes. You can find most of them in Vagrant Cloud. There are Vagrant boxes for several providers, like VirtualBox. In addition, there are boxes of all sorts of Linux distributions, as well as with other open source software installed. You too can provision your local virtual machine with software and configuration, package it as a Vagrant box, and share it in Vagrant Cloud as explained in the article Choosing between baked and fried provisioning.

Besides handling your local virtual machines, Vagrant can also manage EC2 instances in AWS. If you have hardware limitations, why not benefit from the cloud? If your environment demands more resources than what’s available on your host, it’s a great idea to bring your Ansible development environment to AWS. Don’t forget that AWS charges you if you are not eligible for the AWS Free Tier or if you have exceeded your free tier usage limit.

AWS Setup

In order to bring your Ansible development environment to AWS, you must follow some steps:

  1. First of all, you must have an AWS account.
  2. Then, you must create a user with full access to EC2 resources (e.g., the AmazonEC2FullAccess policy) through the IAM (Identity and Access Management) console.
  3. After that, you must create an access key in the Security credentials tab. Set the local environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY with the respective values of the access key ID and the secret access key generated during access key creation.
  4. Alternatively, install the AWS CLI (Command Line Interface) tool and configure your local environment through the command aws configure. This command prompts you for information about the access key and the default profile’s region, storing them inside the .aws directory of your home folder.
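For reference, the files that aws configure writes under ~/.aws look roughly like this; the key values below are the placeholder examples from AWS's own documentation, not real credentials:

```
# ~/.aws/credentials
[default]
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY

# ~/.aws/config
[default]
region = sa-east-1
```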

AWS Region Setup

On AWS, you can create EC2 instances in different regions, and you must choose the specific region where your EC2 instances will be created. It can be done by defining the AWS_REGION or the EC2_REGION environment variable, or through the aforementioned aws configure command. The environment variables take precedence when both configurations are present.
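A quick sketch of the environment-variable route; the region value is just an example:

```shell
# Choose the region for this shell session; Ansible's EC2 modules and
# the Vagrant AWS plugin read these variables.
export AWS_REGION=sa-east-1
export EC2_REGION="$AWS_REGION"   # some older tooling reads this name instead
echo "Using region: $AWS_REGION"
```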

Before creating the EC2 instances, you must create a security group and key pairs in the chosen region. You can do it:

  • manually, through the EC2 console.
  • through the command line, by using the AWS CLI tool.
  • programmatically, by using the Amazon EC2 API.
  • automatically, by using Ansible AWS modules!

The playbook-aws-region-configuration.yml file below is a good example of using Ansible to automate the configuration of a specific AWS region. The playbook is responsible for creating the required resources and gathering information from the AWS region for later use by Vagrant. If you want to run the Codeyourinfra project’s solutions on AWS, you must execute the playbook for your chosen AWS region.

- hosts: localhost
  connection: local
  gather_facts: false
  vars_prompt:
    - name: "aws_region"
      prompt: "AWS Region"
      default: "sa-east-1"
      private: no
  tasks:
    - name: Create the AWS directory if it doesn't exist
      file:
        path: '{{aws_region}}'
        state: directory
    - name: Get the VPCs
      ec2_vpc_net_facts:
        region: '{{aws_region}}'
      register: ec2_vpc_net_facts_results
    - name: Create the Vagrant security group
      ec2_group:
        name: vagrant
        description: Security Group for EC2 instances managed by Vagrant
        region: '{{aws_region}}'
        vpc_id: '{{default_vpc.id}}'
        rules:
          - proto: tcp
            ports:
              - 22
              - 80
              - 3000
              - 8080
              - 8086
            cidr_ip: 0.0.0.0/0
      vars:
        default_vpc: '{{(ec2_vpc_net_facts_results|json_query("vpcs[?is_default]"))[0]}}'
      register: ec2_group_result
    - name: Store the security group's data
      copy:
        content: '{{ec2_group_result|to_nice_json}}'
        dest: '{{aws_region}}/security-group.json'
    - name: Get the default VPC's subnets
      ec2_vpc_subnet_facts:
        region: '{{aws_region}}'
        filters:
          vpc-id: '{{ec2_group_result.vpc_id}}'
      register: ec2_vpc_subnet_facts_results
    - name: Store the VPC subnet's data
      copy:
        content: '{{(ec2_vpc_subnet_facts_results.subnets|sort(attribute="availability_zone"))[0]|to_nice_json}}'
        dest: '{{aws_region}}/subnet.json'
    - name: Create the key pairs
      ec2_key:
        name: codeyourinfra-aws-key
        region: '{{aws_region}}'
      register: ec2_key_result
    - name: Store the private key
      copy:
        content: '{{ec2_key_result.key.private_key}}'
        dest: '{{aws_region}}/codeyourinfra-aws-key.pem'
        mode: 0400
      when: ec2_key_result.key.private_key is defined
    - name: Find Ubuntu Server 14.04 LTS AMIs
      ec2_ami_find:
        name: 'ubuntu/images/hvm-ssd/ubuntu-trusty-14.04-amd64-server-*'
        region: '{{aws_region}}'
        owner: 099720109477
        sort: name
        sort_order: descending
        sort_end: 1
      register: ec2_ami_find_result
    - name: Store the Ubuntu AMI's data
      copy:
        content: '{{ec2_ami_find_result.results[0]|to_nice_json}}'
        dest: '{{aws_region}}/ubuntu-ami.json'
      when: ec2_ami_find_result.results[0] is defined

Behind the scenes, the used Ansible modules interact with the Amazon EC2 API. Here are some details about the playbook:

  • The ec2_vpc_net_facts module is used to get the default VPC (Virtual Private Cloud) of the chosen region.
  • The ec2_group module is used to create the required security group, whose data is stored in the security-group.json file for later use by Vagrant.
  • The ec2_vpc_subnet_facts module is used to select a subnet and store its data in the subnet.json file for later use by Vagrant.
  • The ec2_key module is used to create the required key pairs and store the private key in the codeyourinfra-aws-key.pem file for later use by Vagrant.
  • The ec2_ami_find module is used to select an Ubuntu AMI (Amazon Machine Image) and store its data in the ubuntu-ami.json file for later use by Vagrant.

If you haven’t yet cloned the Codeyourinfra project’s repository, do it right now. You will find the playbook-aws-region-configuration.yml file inside the cloud/aws directory. Go to the folder and run the following command, informing your AWS region of your preference when prompted:

ansible-playbook playbook-aws-region-configuration.yml

vagrant up

In order to make Vagrant manage EC2 instances, you must install the AWS plugin. Execute the command vagrant plugin install vagrant-aws. You can find details about the plugin on its GitHub repository.

Every Codeyourinfra project’s solution has an aws subdirectory where there will be the specific Vagrantfile for managing EC2 instances. One example is this Vagrantfile, which creates the Ansible development environment of the solution explained in the article How to unarchive different files in different servers in just one shot on AWS.

Notice that the Vagrantfile handles the environment variables introduced in the Codeyourinfra project’s release 1.4. The APPEND_TIMESTAMP and the PROVISIONING_OPTION environment variables are explained in detail in the blog post Choosing between fried and baked provisioning.

If you initialize the EC2 environment with the baked provisioning option, the São Paulo region (sa-east-1) will be used, because it is the region where the repo server AMI (ami-b86627d4) is available. Otherwise, the AWS region where the EC2 instances will be created is taken from either the AWS_REGION or the EC2_REGION environment variable.

# -*- mode: ruby -*-
# vi: set ft=ruby :
load File.join("..", "..", "common", "timestamp-appender.rb")

provisioning_option = ENV['PROVISIONING_OPTION'] || "fried"
if provisioning_option != "baked" && provisioning_option != "fried"
  puts 'PROVISIONING_OPTION must be \'baked\' or \'fried\'.'
  abort
end
if provisioning_option == "baked"
  aws_region = "sa-east-1"
else
  aws_region = ENV['AWS_REGION'] || ENV['EC2_REGION'] || "sa-east-1"
end

relative_path = File.join("..", "..", "cloud", "aws", aws_region)
security_group = JSON.parse(File.read(File.join(relative_path, "security-group.json")))
subnet = JSON.parse(File.read(File.join(relative_path, "subnet.json")))
ubuntu_ami = JSON.parse(File.read(File.join(relative_path, "ubuntu-ami.json")))
ec2_instances = JSON.parse('[{"name": "repo", "role": "repository"}, {"name": "server1", "role": "server"}, {"name": "server2", "role": "server"}]')

Vagrant.configure("2") do |config|
  config.vm.box = "dummy"
  config.vm.box_url = ""
  config.vm.synced_folder ".", "/vagrant", disabled: true
  ec2_instances.each do |ec2_instance|
    config.vm.define ec2_instance["name"] do |ec2_config|
      ec2_config.vm.provider "aws" do |aws, override|
        aws.region = aws_region
        if ec2_instance["name"] == "repo" && provisioning_option == "baked"
          aws.ami = "ami-b86627d4"
        else
          aws.ami = ubuntu_ami['ami_id']
        end
        aws.instance_type = "t2.micro"
        aws.keypair_name = "codeyourinfra-aws-key"
        aws.security_groups = security_group['group_id']
        aws.subnet_id = subnet['id']
        aws.tags = {"Name" => ec2_instance["name"], "Role" => ec2_instance["role"], "Solution" => "unarchive_from_url_param"}
        override.ssh.username = "ubuntu"
        override.ssh.private_key_path = File.join(relative_path, "codeyourinfra-aws-key.pem")
        override.nfs.functional = false
      end
      if ec2_instance["name"] == "repo" && provisioning_option == "fried"
        ec2_config.vm.provision "ansible" do |ansible|
          ansible.playbook = File.join("..", "playbook-repo.yml")
        end
      end
    end
  end
end

Besides the region, other required AWS provider-specific configuration options are defined:

  • ami – the AMI id to boot. If you initialize the environment with the baked provisioning option, the AMI is the one previously prepared, as mentioned. (If you have installed the AWS CLI tool and would like to know the AMIs provided by the Codeyourinfra project, just execute the command aws ec2 describe-images --owners 334305766942) Otherwise, the AMI is the one selected during the AWS Region Setup phase, obtained from the ubuntu-ami.json file.
  • instance_type – AWS provides a wide range of EC2 instance types, for different use cases. For our testing purposes, the T2 instances are more than sufficient. Besides that, the t2.micro instance type is eligible for the AWS Free Tier.
  • keypair_name – the name of the key pair created during the AWS Region Setup phase, when the playbook-aws-region-configuration.yml was executed. The path of the stored private key file (codeyourinfra-aws-key.pem) is then configured by overriding the default Vagrant ssh.private_key_path configuration.
  • security_groups – the id of the security group created during the AWS Region Setup phase, obtained from the security-group.json file. The security group was created exclusively for EC2 instances managed by Vagrant.
  • subnet_id – the id of the subnet selected during the AWS Region Setup phase, obtained from the subnet.json file. The subnet was selected from the default VPC’s subnets list, ordered by availability zone.
  • tags – a hash of tags to set on the EC2 instance. The tags are very useful for later EC2 instances identification.

Now that you have checked out the Vagrantfile in the unarchive_from_url_param/aws directory of the Codeyourinfra project’s repository, stay there and run the command below in order to see the magic happen!

vagrant up

Ansible Inventory

The Ansible inventory file is where you group your machines. Ansible needs the information placed there in order to connect to the hosts through SSH. It enables the agentless characteristic of the tool and makes it possible for a task to be executed on several servers in a single run.

When you use AWS, each time you create an EC2 instance, it gets a different IP address. Unlike when you create local virtual machines, you are not able to define the IP address in the Vagrantfile. The Vagrant AWS plugin highlights this in its output: “Warning! The AWS provider doesn’t support any of the Vagrant high-level network configurations (config.vm.network). They will be silently ignored.”

For that reason, inside the aws directory below every Codeyourinfra project folder in the repository, you will find two more files: playbook-ec2-instances-inventory.yml and ec2_hosts_template. The Ansible playbook is responsible for discovering the IP addresses of the just-created EC2 instances and, based on the template, generating the ec2_hosts file.

You must execute the playbook right after the EC2 instances have been created. Just run:

ansible-playbook playbook-ec2-instances-inventory.yml

Once you have generated the ec2_hosts file, you can use it as the inventory option (-i) of either the ansible or ansible-playbook commands. For example, run the following Ansible ad-hoc command:

ansible ec2_instances -m ping -i ec2_hosts

It will use the ping module to test the connection with all of the EC2 instances up and running.

Ansible Development

Finally, you have your Ansible development environment on AWS. You can edit any playbook and test it against the created EC2 instances. If something goes wrong, you can recreate the entire environment from scratch. You have the autonomy to do what you want because the environment is yours.

The next step is to automate the test of your infrastructure code. But that’s the subject of the next article. Stay tuned!

Before I forget, I must reinforce this: the purpose of the Codeyourinfra project is to help you. So, don’t hesitate to talk about the problems you face as a sysadmin.

Original Link

Xcode Server + Ansible: Scalable and Programmable iOS CI/CD Infrastructure

One of the complaints about Xcode Server is that it doesn’t scale for use at a large or enterprise level. However, if you have the skills to understand and manage an infrastructure for iOS deployment, then scalability can easily be achieved for small, medium, or large iOS teams using a mixture of iOS and DevOps toolkits. We achieved it in my previous role at Photobox Group using the combination of Ansible + Fastlane + TeamCity. You can read my blog post on the Moonpig Engineering blog here to learn how we achieved this.

In this post, we will see how learning a little bit of DevOps tools like Ansible and Apple’s CI solution Xcode Server can help to manage large-scale iOS CI/CD infrastructure without any pain.


Continuous Integration practices for iOS development aren’t as mature as those in the web world because of some unique challenges specific to iOS development. It’s essential to understand the entire iOS ecosystem to set up and manage CI/CD infrastructure for a team. Some senior/lead iOS developers and iOS tech architects/leads are great at understanding the whole stack of Apple developer tools, technologies, and infrastructure and at implementing it in their projects, but most iOS developers in the industry at the moment are just iOS developers. They are good at writing Swift or Objective-C to build UIs for applications inside Xcode; however, when it comes to understanding the infrastructure, they suck. They don’t properly understand the underlying tools and technologies Xcode uses under the hood, and that’s why they end up incapable of setting up or managing CI/CD practices for iOS apps. As a result, they always have to depend on local Xcode or on someone from the DevOps team to manage the infrastructure stack for them, or CI/CD is handled by highly expensive third-party services. It means developers are not full-stack and do not understand their own tools and technologies. Unfortunately, the world of iOS development is full of these kinds of amateurs and half-arsed developers. Continuous Integration won’t be successful without skilled engineers on the team. In my previous post, I wrote about the top five reasons for CI failures: the wrong choice of CI server, amateur CI engineers, lack of Infrastructure as Code, poor-quality build servers, and lack of support from the team.

iOS Infrastructure Automation Challenges

On top of iOS engineers’ difficulty understanding the entire iOS ecosystem, there are some long-standing challenges in automating that ecosystem. Some are imposed by Apple’s proprietary software, and others are due to the lack of tooling for automating iOS and macOS infrastructure. Some of the common challenges are:

  • Lack of a configuration management tool that can be used to automate the infrastructure. Chef, Ansible, and Puppet are not designed to work with macOS servers.
  • Lack of an official package management system to provision and automate software installation. We have to use third-party solutions like Homebrew.
  • Lack of an official dependency management tool. We have to use Carthage or CocoaPods to manage dependencies; Swift Package Manager is not yet ready for use with iOS.
  • Lack of an official build tool. We have to use tools like Fastlane to automate iOS development tasks.

As a result of these challenges, nobody wanted to spend the time to explore a clean and automated solution to set up iOS infrastructure for smooth Continuous Delivery, which results in the following situations:

  • Manual setup of CI servers, which takes days for engineers to get things working on all build servers.
  • Flaky build automation, which produces inconsistent results.
  • The software configuration on local and build machines is different.
  • Continuous Integration is managed by a third-party provider, which must be contacted in case of build errors.
  • The team loses trust in CI and starts using local machines for testing and releases.

These symptoms are obstacles to the best application development practices. We can solve some of these issues by knowing a little bit about configuration management tools and proper continuous integration tools. Now, we will see how the combination of Xcode Server and Ansible can help to set up large-scale continuous integration and manage it effectively.

Xcode Server

Xcode Server is a native Continuous Integration service developed by Apple. Xcode Server is one of the easiest ways to get started with Continuous Integration for iOS applications. It just needs Xcode installation and permission to access the Apple developer portal on macOS servers. If you use other solutions like Jenkins or TeamCity, you will need to install other dependencies like Java and other related software which have nothing to do with iOS development. Apple still has old documentation for Xcode Server at the time of writing this post, but I wrote the setup instructions in my previous blog post here. In order to set up the Continuous Integration server, we just need to install the following on the build machines:

  • Xcode
  • Xcode Command Line tools
  • macOS default settings
  • Ruby and Homebrew (optional)

We can manually install that software on each machine, but it will be time-consuming if we have multiple macOS machines. We need to make sure that the Xcode and other tool versions are consistent across all the servers. So we need configuration management to manage the servers and the software versions on them.

Provisioning Xcode Server With Ansible

There are various configuration management tools on the market, like Chef, Puppet, and Salt, but Ansible is very easy to learn and use. Although Ansible is designed for Linux, there are enough modules that can be used with macOS as well. This is not an Ansible tutorial, but you need to read a bit of documentation to get started with Ansible here.

In order to provision Xcode Server, we need the following things:

  • Xcode.xip file downloaded from Apple Developer Portal
  • Ansible installed on a controller macOS machine; Ansible will then ssh into the other macOS machines and install all the required software.

Additionally, we can install other tools like Ruby and Homebrew using Ansible, if needed.
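To make the controller aware of the build machines, a minimal Ansible inventory might look like this; the host names, addresses, and user are hypothetical, with ansible_ssh_user matching the variable the role uses:

```
# hosts
[macos_build_servers]
mac-mini-01 ansible_host=10.0.1.21
mac-mini-02 ansible_host=10.0.1.22

[macos_build_servers:vars]
ansible_ssh_user=shashi
```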

Provisioning Xcode Installation

In order to install Xcode, we need to have the .xip file downloaded from Apple Developer Portal and placed on the Ansible controller machine, so it can be copied to the target machines over ssh. There are various other things to be done as part of the Xcode installation:

  • Unarchive the XIP file and move the Xcode app to /Applications
  • Accept the license agreement
  • Install Xcode Command Line Tools and other mobile development packages
  • Enable developer mode and install additional components
  • Set the Xcode developer directory

We can achieve all these tasks using Ansible.

- name: Check if Xcode is already installed
  stat: path=~/Library/Developer/Xcode/
  register: xcode_dir

- name: test whether xcode was uploaded earlier
  stat: path=/Users/{{ ansible_ssh_user }}/{{ xcode_src }}
  register: xcode_dl_path

- name: upload xcode
  copy: src={{ xcode_src }} dest=/Users/{{ ansible_ssh_user }}/
  when: xcode_dl_path.stat.exists == False

- name: check that the xcode archive is valid
  command: "pkgutil --check-signature ~/{{ xcode_src }} | grep \"Status: signed Apple Software\""
  when: xcode_dir.stat.exists == False

- name: Install Xcode from XIP file location
  command: bash -c 'open -FWga "Archive Utility" --args /Users/{{ ansible_ssh_user }}/{{ xcode_src }}'
  when: xcode_dir.stat.exists == False

- name: Move Xcode to Applications
  command: bash -c 'mv ~/Xcode*.app /Applications/'
  when: xcode_dir.stat.exists == False

- name: accept license agreement
  command: bash -c "sudo /Applications/ -license accept"

- name: install mobile device package
  command: bash -c "sudo installer -pkg /Applications/ -target /"

- name: install mobile device development package
  command: bash -c "sudo installer -pkg /Applications/ -target /"

- name: install Xcode System Resources package
  command: bash -c "sudo installer -pkg /Applications/ -target /"

- name: install additional xcode components
  command: bash -c "sudo installer -pkg /Applications/ -target /"

- name: check on the status of developer mode
  command: DevToolsSecurity -status
  become: yes
  become_method: sudo
  register: dev_tools_security_status

- name: enable developer mode
  command: DevToolsSecurity -enable
  become: yes
  become_method: sudo
  when:
    - "'disabled' in dev_tools_security_status.stdout"
    - xcode_dir.stat.exists == False

- name: Set Xcode Developer Directory
  command: bash -c "sudo xcode-select -s /Applications/"

Note that the code above is an example of Ansible tasks; we also have to set up some variables to configure users and directories.

Provisioning Xcode Server

Now we can install Xcode as part of the task mentioned above. The next step is to control Xcode Server using Ansible. Xcode Server ships with a command-line utility, xcscontrol, for managing the server: we can start, stop, reset, and restart Xcode Server with it. We can write an Ansible task like this:

---
- name: Start Xcode Server with given user
  command: bash -c 'sudo xcrun xcscontrol --initialize --build-service-user {{xcode_server_user}}'

- name: Configure Xcode Server Certificate
  command: bash -c 'sudo xcrun xcscontrol --configure-ota-certificate'

- name: Configure Xcode Server timeout
  command: bash -c 'sudo xcrun xcscontrol --configure-integration-timeout 3600'

- name: Check Xcode Server Health
  command: bash -c 'sudo xcrun xcscontrol --health'

With this simple configuration, we can easily manage Xcode Servers on multiple macOS servers.
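The same utility covers the stop, reset, and restart operations mentioned above. For example, a maintenance task to wipe Xcode Server state might look like this (a sketch; the exact xcscontrol flag is an assumption and should be verified against your Xcode version):

```yaml
# Hypothetical maintenance task: reset Xcode Server to get a clean CI state
# (flag assumed; check xcscontrol's usage output on your Xcode version)
- name: Reset Xcode Server
  command: bash -c 'sudo xcrun xcscontrol --reset'
```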

Xcode Server: Ansible Role

I have created an Ansible role for Xcode Server and published it to Ansible Galaxy. You can find the role here.

This role is very configurable and can be used for setting up multiple macOS servers as per your needs. You can find detailed information in the README file.

You can see all the variables used for this role and turn them ON/OFF or set them as per your needs. You can optionally set up Ruby and Homebrew if needed.

vars:
  clean_xcode: yes
  clean_rvm: no
  clean_homebrew: no
  configure_xcode: yes
  configure_xcodeserver: yes
  configure_macos_defaults: yes
  configure_ruby_rvm: no
  configure_homebrew: no
  xcode_src: Xcode_9.1.xip
  xcode_server_user: shashi
  ansible_ssh_user: shashi

How to Use an Xcode Server Ansible Role

Imagine you have a fresh Mac with macOS newly installed. You can set up Xcode Server for CI by creating a playbook for this role, setting the config variables as per your needs. Assuming you have installed Ansible, we can download the role by running this command:

$ ansible-galaxy install Shashikant86.XcodeServer

Now we have to create our own playbook for this role by setting variables. We can use the default variables here. Now make a new directory called xcode_server and create a files directory and xcs_playbook.yml file.

$ mkdir xcode_server
$ mkdir xcode_server/files
$ touch xcode_server/xcs_playbook.yml

Now place your Xcode_9.x.xip file inside the files directory and insert the following into the playbook file:

- hosts: localhost
  connection: local
  vars:
    clean_xcode: yes
    clean_rvm: no
    clean_homebrew: no
    configure_xcode: yes
    configure_xcodeserver: yes
    configure_macos_defaults: yes
    configure_ruby_rvm: no
    configure_homebrew: no
    xcode_src: Xcode_9.1.xip
    xcode_server_user: shashi
    ansible_ssh_user: shashi
  roles:
    - Shashikant86.XcodeServer

Change ansible_ssh_user and xcode_server_user to your username, and feel free to set the other variables as per your needs. You can also specify an inventory to run this playbook on multiple hosts. Now execute the playbook using the following command:

$ ansible-playbook xcs_playbook.yml

Watch your Mac Mini servers getting set up with Xcode Server for iOS Continuous Integration.
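For the multiple-hosts case, an inventory can list all your Mac Mini servers. A minimal sketch (hostnames are placeholders, not from the sample project):

```yaml
# inventory.yml -- hypothetical inventory for two Mac Mini build servers;
# replace the hostnames and user with your own
all:
  hosts:
    macmini-01.example.com:
    macmini-02.example.com:
  vars:
    ansible_ssh_user: shashi
```

You would then change `hosts: localhost` in the playbook to a group or host pattern and run `ansible-playbook -i inventory.yml xcs_playbook.yml`.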

When to Use Xcode Server and Ansible

There are situations when we want to run the Ansible playbook on our servers; some common ones are:

  • New Xcode versions need to be installed
  • Rebuilding the CI environment at regular intervals
  • Cleaning up the environment to fix code signing issues
  • Resetting the server to save disk space

Ideally, we should run the Ansible playbook regularly to check if anything has changed.
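One way to re-run the playbook regularly is to schedule it from the control machine with Ansible's cron module; a sketch (the playbook path and log file are placeholders):

```yaml
# Hypothetical scheduled re-run of the Xcode Server playbook every night at 2 AM,
# using Ansible's cron module on the control machine
- name: Re-apply the Xcode Server playbook nightly
  cron:
    name: "reapply xcode server playbook"
    minute: "0"
    hour: "2"
    job: "ansible-playbook /path/to/xcs_playbook.yml >> /var/log/xcs_playbook.log 2>&1"
```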


Some of the benefits of using configuration management tools with Xcode Server are:

  • Consistent setup all across the servers
  • Infrastructure as code
  • Destroy and build CI servers whenever needed
  • Save the hours engineers would otherwise spend manually setting up CI server machines
  • Eliminate the cost of expensive third-party CI services
  • No need to have dedicated DevOps resources to manage iOS CI/CD infrastructure
  • Understanding of the iOS ecosystem and the confidence to deal with any issues
  • No need to use third-party build automation tools like Fastlane as Xcode Server can handle almost everything needed for CI/CD

You might see several other benefits, along with increased confidence and trust in Continuous Integration.


Using the combination of DevOps tools and Apple’s native CI/CD tools, we can achieve scalable and programmable infrastructure for our iOS apps. The Xcode Server and Ansible combination can work well to achieve Continuous Delivery of iOS apps without any pain. Are you using any configuration management tools for iOS CI/CD infrastructure? What are your experiences with Xcode Server? Please let me know what you think about this approach; any ideas or suggestions would be great.

Original Link

On-Demand Test Environments With Ansible and Shippable

One of the biggest challenges to implementing an end-to-end Continuous Delivery pipeline is making sure adequate test automation is in place. However, even if you have automated across your entire Test Suite, there is a second challenge: How do you manage Test infrastructure and environments without breaking the bank?

If you want to move towards Continuous Delivery, you need to execute a majority of your tests for each code change in a pristine environment that is as close to your production environment as possible. This ensures that code defects are identified immediately and every code change is therefore “shippable.” However, creating these environments and updating them with each new application version or every time the config changes adds a lot of overhead. If you’re testing an application with many tiers or microservices, the complexity increases since each tier might need to be tested independently in its own environment against specific versions of other tiers.

The utopia of Test Automation is the following:

  • Environment definitions are represented by infrastructure-as-code tools like Ansible, Terraform, Puppet, or Chef. The provisioning scripts are committed to source control and versioned, so you can go back to an earlier state if needed.
  • All (or at least a good majority) of your tests are automated and either committed to source control or hosted on services such as Nouvola, Sauce, etc.
  • You have a completely automated deployment pipeline that automatically spins up a production-like Test environment for every code change, triggers your automation, and if all tests succeed, destroys the environment. If tests fail, the right folks are notified and the environment is kept live until someone can debug the failures.

The first step is already happening in most organizations. The DevOps movement encouraged Ops teams to start writing scripts to provision and manage environments and infrastructure, and multiple vendors support this effort quite effectively. The second step is still a challenge in many organizations, but this is really something that needs executive buy-in and a commitment to automation, even if it slows down product development for a while.

This whitepaper presents a method of implementing the third step – spinning up test environments on-demand and destroying them automatically after the automated test suite is executed.

The Scenario

To make things simpler, we’ll skip the CI step which builds and tests the application Docker image and pushes it to Amazon ECR. This can be accomplished by following instructions for CI: Run CI for a sample app.

Our example follows the steps below:

1. A service definition, aka manifest, is created, including the Docker image and some options

2. A test environment is provisioned using Ansible under the cover. Ansible config files are templatized using environment variables defined in Shippable, allowing the Ansible config to become highly reusable to provision multiple test clusters if needed.

3. The manifest is deployed to the Test environment and the functional test suite is triggered.

4. If tests pass, the test environment is destroyed using Ansible and the test owner is notified.

5. If tests fail, the test owner is notified and the environment is not destroyed. The test owner can always destroy the environment manually after extracting the information they need about the failure.

Before We Start

You will need the following to implement this scenario:

  • A GitHub or Bitbucket account that you will use to login to Shippable
  • An AWS account
  • A Docker Hub account (or Amazon ECR/GCR/Quay)
  • Some familiarity with Ansible is desirable, though not required.

If you’re not familiar with Shippable, here are some basic concepts you should know before you start:

  • Configuration: The Assembly Lines configuration for Shippable resides in a shippable.yml file. The repository that contains this config in your source control is called a Sync Repository, aka syncRepo. You add a syncRepo through the Shippable UI to create your Assembly Line.
  • Jobs are executable units of your pipeline and can perform any activity such as CI, provisioning an environment, deploying your application, or running pretty much any custom script. A simple way to think of it is, if something can execute in the shell of your laptop, it can execute as a Job.
  • Resources typically contain information needed for Jobs to execute, such as credentials, a pointer to a cluster on a Container Engine or an image on a Hub, or any key-value pairs. Resources are also used to store information produced by a job, which can then be accessed by downstream jobs.
  • Integrations are used to configure connections to third-party services, such as AWS, Docker Hub, GKE, Artifactory, etc.
  • The Single Pane of Glass view shows a real-time, interactive view of your Assembly Line(s).
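Putting these concepts together, a minimal shippable.yml has the following shape (the resource and job names here are illustrative, not from the sample project):

```yaml
# Minimal sketch of a Shippable Assembly Line configuration
resources:
  - name: my_repo              # a resource that jobs can consume
    type: gitRepo
    integration: my_github     # an integration configured in the Shippable UI
    pointer:
      sourceName: "my-org/my-repo"
      branch: master

jobs:
  - name: my_job               # an executable unit of the pipeline
    type: runSh                # runs arbitrary shell scripts
    steps:
      - IN: my_repo            # consume the resource above
      - TASK:
        - script: echo "running a pipeline step"
```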

How the Sample Application Is Structured

Our sample repositories are on GitHub:

  • The sample application that we will run functional tests on is a voting app that is built using Python Flask and Redis. The source for the front end (Flask) can be found in the vote_fe repository and the backend (Redis) in the vote_be repository. The shippable.yml in these repositories contains the CI configuration to build and deploy their Docker images to their public repositories on Docker Hub.
  • devops-recipes/on_demand_test_environments contains the Shippable configuration and required Ansible playbooks for this scenario. The sections below explain in detail how the Shippable configuration is built.

Step 1: Enabling CI for the Sample Application

  • Fork the vote_fe and the vote_be repositories into your SCM.
  • Login to Shippable with your SCM account and enable CI using these steps.
  • Create a Docker Registry integration using these steps and call it drship_dockerhub. If you use a different integration name, replace drship_dockerhub in the shippable.yml file.
  • Specify your Docker repository and account in the shippable.yml file and commit the file.
  • Trigger CI for these repositories using these steps.

At the end of Step 1, you should have two images published in your Docker registry integration.

Step 2: Create the Service Definition

A. Define the Resource in the shippable.yml File

The shippable.yml file can be committed in one of the app repositories or to a separate repository. We have used a different repository, devops-recipes/on_demand_test_environments, in our sample. The repository containing your jobs and resources ymls is called a Sync repository and represents your workflow configuration.

resources:
  ###---------------------------------------------------------------#
  ###----------------------- BUILD/CI Resources --------------------#
  ###---------------------------------------------------------------#

  # Back-end image
  - name: vote_be_odte
    type: image
    # replace dr-dockerhub with your docker registry integration name
    integration: dr-dockerhub
    pointer:
      # replace devopsrecipes/vote_be with your repository
      sourceName: "devopsrecipes/vote_be"
    seed:
      # specify the latest tag of the image in your docker registry
      versionName: "master.2"

  # Front-end image
  - name: vote_fe_odte
    type: image
    # replace dr-dockerhub with your docker registry integration name
    integration: dr-dockerhub
    pointer:
      # replace devopsrecipes/vote_fe with your repository
      sourceName: "devopsrecipes/vote_fe"
    seed:
      # specify the latest tag of the image in your docker registry
      versionName: "master.3"

  # Docker options to expose port 80 on the front-end container
  # and link the redis container
  - name: vote_fe_options_odte
    type: dockerOptions
    version:
      memory: 128
      portMappings:
        - "80:5000/tcp"
      links:
        - vote_be_odte:redis

B. Define the Jobs in the shippable.yml File

create_app_man_odte is a manifest job that defines all the containers that run in the ECS cluster. This definition is versioned, and each version is immutable.

Add the following to your shippable.yml file and commit it.

jobs:
  #---------------------------------------------------------------#
  #------------------- BUILD/CI with SHIPPABLE CI ----------------#
  #---------------------------------------------------------------#

  # CI job definition. The image that is pushed to Docker hub is specified
  # in an OUT image resource. This image resource becomes an IN to the
  # manifest job and triggers the manifest job whenever a new image
  # version (tag) is created.
  - name: vote_be_runCI
    type: runCI
    steps:
      - OUT: vote_be_odte

  - name: vote_fe_runCI
    type: runCI
    steps:
      - OUT: vote_fe_odte

  # Application service definition
  - name: create_app_man_odte
    type: manifest
    steps:
      - IN: vote_fe_odte
      - IN: vote_fe_options_odte
        applyTo:
          - vote_fe_odte
      - IN: vote_be_odte

Step 3: Provision the Test Environment

We use an Ansible playbook to create the ECS cluster, whose implementation can be found here.

We templatize the Ansible configuration files to make them flexible. The configuration is then defined by Shippable-generated environment variables and resources.
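The placeholder substitution itself is performed with the shipctl replace utility before the playbook runs; a sketch of the relevant runSh task steps (paths abbreviated):

```yaml
# Sketch: rewrite ${...} placeholders in the Ansible config files with values
# from the job's Shippable-generated environment, then run the playbook
- TASK:
  - script: shipctl replace infra/provision-ecs-ansible/group_vars/ecs-cluster-vars.yml
  - script: ansible-playbook -v ansible-ecs-provision.yml
```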

A. The ansible.cfg File

Here we use the SCRIPTS_REPO_ODTE_STATE environment variable to point to the root of the repository when the playbook is run in a Shippable node.

[defaults]
# update, as needed, for your scenario
host_key_checking=false
inventory = ${SCRIPTS_REPO_ODTE_STATE}/infra/provision-ecs-ansible/inventory/

[ssh_connection]
# for running on Ubuntu
control_path=%(directory)s/%%h-%%r

B. Group Variables

All the variables used by Ansible modules to create the cluster are defined as placeholders. These placeholders are replaced at runtime by values defined in a params resource.

ec2_instance_type: "${EC2_INSTANCE_TYPE}"
ec2_image: "${EC2_IMAGE}"
ec2_keypair: "${EC2_KEYPAIR}"
ec2_user_data: "#!/bin/bash \n echo ECS_CLUSTER=\"${ECS_CLUSTER_NAME}\" >> /etc/ecs/ecs.config"
ec2_region: "${EC2_REGION}"
ec2_tag_Role: "${EC2_TAG_ROLE}"
ec2_tag_Type: "${EC2_TAG_TYPE}"
ec2_volume_size: ${EC2_VOLUME_SIZE}
ec2_count: ${EC2_COUNT}
STATE_RES_NAME: "${STATE_RES_NAME}"
ec2_security_group: "${TEST_PUBLIC_SG_ID}"
ec2_subnet_ids: ["${TEST_PUBLIC_SN_01_ID}","${TEST_PUBLIC_SN_02_ID}"]
ec2_tag_Environment: "${ENVIRONMENT}"
ECS_CLUSTER_NAME: "${ECS_CLUSTER_NAME}"

C. Define Ansible Configuration in the shippable.yml File

resources:
  #---------------------------------------------------------------#
  #-------------------- Common INFRA Resources -------------------#
  #---------------------------------------------------------------#

  # Ansible scripts repository
  - name: scripts_repo_odte
    type: gitRepo
    integration: "dr-github"
    pointer:
      sourceName: "devops-recipes/on-demand-test-environments"
      branch: master

  # AWS integration that sets up the AWS CLI environment used by the Ansible playbook
  - name: aws_cli_config_odte
    type: cliConfig
    integration: dr-aws-keys
    pointer:
      region: us-east-1

  # SecOps approved AMI
  - name: ami_sec_approved_odte
    type: params
    version:
      params:
        AMI_ID: "ami-9eb4b1e5"

  #---------------------------------------------------------------#
  #----------------------- TEST VPC Resources --------------------#
  #---------------------------------------------------------------#

  # TEST environment config
  - name: test_conf_odte
    type: params
    version:
      params:
        EC2_REGION: "us-east-1"
        EC2_TAG_ROLE: "dr-on-demand-test-environments"
        EC2_TAG_TYPE: "ecs-container-instance"
        EC2_VOLUME_SIZE: 30
        EC2_COUNT: 1
        STATE_RES_NAME: "test_info_odte"
        ECS_CLUSTER_NAME: "test_env_ecs_odte"
        ENVIRONMENT: "test"
        EC2_INSTANCE_TYPE: "t2.large"
        EC2_IMAGE: "ami-9eb4b1e5"
        EC2_KEYPAIR: "ambarish-useast1"

  # Test VPC Info
  - name: test_vpc_conf_odte
    type: params
    version:
      params:
        TEST_VPC_ID: "vpc-a36912da"
        TEST_PUBLIC_SG_ID: "sg-c30fc8b6"
        TEST_PUBLIC_SN_01_ID: "subnet-34378e50"
        TEST_PUBLIC_SN_02_ID: "subnet-34378e50"
        REGION: "us-east-1"

  # Output of Test ECS Provisioning
  - name: test_info_odte
    type: params
    version:
      params:
        SEED: "initial_version"

  # Reference to ECS Test Cluster
  - name: test_env_ecs_odte
    type: cluster
    integration: "dr-aws-keys"
    pointer:
      sourceName: "test_env_ecs_odte"
      region: "us-east-1"

D. Augment the Ansible Playbook ansible-ecs-provision That Provisions the ECS Cluster

After the cluster is created, we use Shippable platform resources and API to persist important cluster metadata such as the ARN and public IP of the cluster in a params resource test_info_odte and the cluster resource test_env_ecs_odte.

The ansible-ecs-provision playbook calls two roles to provision the ECS cluster.

---
### provision AWS ECS cluster
- hosts: localhost
  connection: local
  gather_facts: false
  user: root
  pre_tasks:
    - include_vars: group_vars/ecs-cluster-vars.yml
  roles:
    - ecs-cluster-provision
    - ec2-container-inst-provision
  post_tasks:
    - name: refresh hosts inventory list
      meta: refresh_inventory
  • ecs-cluster-provision
    ---
    # update Shippable resource state with this job number
    - name: run cmd
      shell: |
        shipctl post_resource_state "" versionName "build-${BUILD_NUMBER}"

    # provision ECS cluster
    - name: Create ECS Cluster
      ecs_cluster:
        name: ""
        state: present
      register: ecs

    # update shippable resource state with provisioned cluster_arn
    - name: run cmd
      shell: |
        shipctl put_resource_state "" CLUSTER_ARN ""
        shipctl put_resource_state "" TEST_ECS_CLUSTER_ID ""
  • ec2-container-inst-provision
    ---
    - name: Provision instances with tag
      local_action:
        module: ec2
        key_name: ""
        group_id: ""
        instance_type: ""
        instance_profile_name: "ecsInstanceRole"
        image: ""
        user_data: ""
        vpc_subnet_id: ""
        region: ""
        instance_tags: '{"Name":"","Role":"","Type":"","Environment":""}'
        assign_public_ip: yes
        wait: true
        exact_count: ""
        count_tag:
          Role: ""
        volumes:
          - device_name: /dev/xvda
            volume_type: gp2
            volume_size: ""
            delete_on_termination: true
      register: ec2

    - add_host:
        name: "{{item.public_ip}}"
        groups: tag_Type_,tag_Environment_
        ec2_region: ""
        ec2_tag_Name: ""
        ec2_tag_Role: ""
        ec2_tag_Type: ""
        ec2_tag_Environment: ""
        ec2_ip_address: "{{item.public_ip}}"
      with_items: ""

    - name: Wait for the instances to boot by checking the ssh port
      wait_for: host={{item.public_ip}} port=22 delay=15 timeout=300 state=started
      with_items: ""

    # update shippable resource state
    - name: run cmd
      shell: |
        shipctl put_resource_state "" "INST_{{item.ami_launch_index}}_PUBLIC_IP" "{{item.public_ip}}"
        shipctl put_resource_state "" "INST_{{item.ami_launch_index}}_ID" "{{}}"
        shipctl put_resource_state "" "REGION" ""
        shipctl put_resource_state "" "INST_{{item.ami_launch_index}}_PUBLIC_IP" "{{item.public_ip}}"
        shipctl put_resource_state "" "INST_{{item.ami_launch_index}}_ID" "{{}}"
      with_items: ""

shipctl provides a comprehensive library of utilities that can be used to extract and persist useful data in a Shippable param or state resource. This data can then be used by jobs downstream.
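For example, a downstream job can read back what the playbook persisted (the resource and key names follow the sample; treat the exact calls as a sketch):

```yaml
# Sketch: consuming state persisted by the provisioning playbook in a later job
- TASK:
  - script: |
      # read the whole params payload of the test_info_odte resource
      PARAMS_JSON=$(shipctl get_resource_version_key test_info_odte params)
      # persist an additional key for jobs further downstream
      shipctl put_resource_state test_info_odte LAST_PROVISION_RUN "build-${BUILD_NUMBER}"
```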

E. Define the Shippable Job That Runs the Ansible Playbook ansible-ecs-provision.yml

Add the following to shippable.yml file and commit it.

---
- name: Provision instances with tag
  local_action:
    module: ec2
    key_name: ""
    group_id: ""
    instance_type: ""
    instance_profile_name: "ecsInstanceRole"
    image: ""
    user_data: ""
    vpc_subnet_id: ""
    region: ""
    instance_tags: '{"Name":"","Role":"","Type":"","Environment":""}'
    assign_public_ip: yes
    wait: true
    exact_count: ""
    count_tag:
      Role: ""
    volumes:
      - device_name: /dev/xvda
        volume_type: gp2
        volume_size: ""
        delete_on_termination: true
  register: ec2

- add_host:
    name: "{{item.public_ip}}"
    groups: tag_Type_,tag_Environment_
    ec2_region: ""
    ec2_tag_Name: ""
    ec2_tag_Role: ""
    ec2_tag_Type: ""
    ec2_tag_Environment: ""
    ec2_ip_address: "{{item.public_ip}}"
  with_items: ""

- name: Wait for the instances to boot by checking the ssh port
  wait_for: host={{item.public_ip}} port=22 delay=15 timeout=300 state=started
  with_items: ""

- name: display ecs cluster
  debug:
    msg: "{{item}}"
  with_items: ""

# update shippable resource state
- name: run cmd
  shell: |
    shipctl put_resource_state "" "INST_{{item.ami_launch_index}}_PUBLIC_IP" "{{item.public_ip}}"
    shipctl put_resource_state "" "INST_{{item.ami_launch_index}}_ID" "{{}}"
    shipctl put_resource_state "" "INST_{{item.ami_launch_index}}_PUBLIC_DNS" "{{item.public_dns_name}}"
    shipctl put_resource_state "" "REGION" ""
    shipctl put_resource_state "" "INST_{{item.ami_launch_index}}_PUBLIC_IP" "{{item.public_ip}}"
    shipctl put_resource_state "" "INST_{{item.ami_launch_index}}_ID" "{{}}"
  with_items: ""

Step 4: Deploy the Application to the Test ECS Environment

deploy_app_test_odte is a deploy job, which creates the service and task definition in the ECS cluster and starts the service. Since it needs to run after the ECS cluster is created in the workflow, prov_test_vpc_odte is specified as an input.

Add the following to the shippable.yml file and commit it.

jobs:
  ##---------------------------------------------------------------#
  ##-------------------- App Release Automation -------------------#
  ##---------------------------------------------------------------#

  # DEPLOY to TEST environment
  - name: deploy_app_test_odte
    type: deploy
    steps:
      - IN: create_app_man_odte
        switch: off
      - IN: prov_test_vpc_odte
      - IN: test_env_ecs_odte
        switch: off
      - TASK: managed

Step 5: Run Functional Tests on the Test Cluster

Add the sit_odte job to your shippable.yml file. This job extracts the public DNS of the ECS cluster from the test_info_odte params resource and passes it to the script that runs tests against that DNS.

It is a runSh job that lets you run any shell script. Since it needs to run after the application is deployed in the workflow, test_env_ecs_odte is specified as an input. In addition, we also provide the manifest job as an input to the job.

jobs:
  # RUN System Integration Testing
  - name: sit_odte
    type: runSh
    steps:
      - IN: scripts_repo_odte
        switch: off
      - IN: deploy_app_test_odte
      - TASK:
        # Run tests
        - script: |
            pushd $(shipctl get_resource_state "scripts_repo_odte")/tests
            PARAMS_JSON=$(shipctl get_resource_version_key test_info_odte params)
            CLUSTER_DNS=$(echo $PARAMS_JSON | jq -r .INST_0_PUBLIC_DNS)
            echo "ECS Cluster DNS: "$CLUSTER_DNS
            ./ $CLUSTER_DNS
            popd
    on_success:
      - script: echo "SUCCESS"
    on_failure:
      - script: echo "FAILURE"

Step 6: Deprovision the Cluster

A. Add the deprov_test_infra_odte Job to Your shippable.yml File

It is a runSh job that lets you run any shell script. Since it needs to run after the system integration tests, sit_odte is specified as an input.

jobs:
  #---------------------------------------------------------------#
  #----------------------- Deprov Test Infra----------------------#
  #---------------------------------------------------------------#

  # DEPROV TEST Infra with Ansible
  - name: deprov_test_infra_odte
    type: runSh
    steps:
      - IN: sit_odte
      - IN: aws_cli_config_odte
        switch: off
      - IN: test_vpc_conf_odte
        switch: off
      - IN: test_conf_odte
        switch: off
      - IN: test_info_odte
        switch: off
      - IN: scripts_repo_odte
        switch: off
      - IN: ami_sec_approved_odte
        switch: off
      - TASK:
        - script: shipctl replace $SCRIPTS_REPO_ODTE_STATE/infra/provision-ecs-ansible/ansible.cfg $SCRIPTS_REPO_ODTE_STATE/infra/provision-ecs-ansible/group_vars/ecs-cluster-vars.yml
        - script: sudo pip install boto3
        - script: |
            cd $SCRIPTS_REPO_ODTE_STATE/infra/provision-ecs-ansible
            ansible-playbook -v ansible-ecs-terminate.yml
    on_success:
      - script: echo "SUCCESS"
    on_failure:
      - script: echo "FAILURE"

B. Commit shippable.yml and Create a Sync Repo in Your Shippable Account

Follow these instructions to import your configuration files into your Shippable account.

Your pipeline should now look like this in the SPOG view.

Different sections of the pipeline expanded.

Step 7: Trigger Your Pipeline

Right-click on create_app_man_odte in the SPOG and click on Build Job. This will trigger the entire pipeline.

Screenshot of the Manifest Job

Screenshot of the Cluster Provision Job

Screenshot of Resources Populated by the Cluster Provision Job

Screenshot of the Deploy Job

Screenshot of the Deprovision Job

Original Link

Red Hat OpenStack Platform 12 Is Here!

Red Hat OpenStack Platform 12 is now Generally Available (GA).

This is Red Hat OpenStack Platform’s 10th release and is based on the upstream OpenStack release, Pike.

Red Hat OpenStack Platform 12 is focused on the operational aspects of deploying OpenStack. OpenStack has established itself as a solid technology choice, and with this release we are working hard to further improve usability and bring OpenStack and operators into harmony.

With operationalization in mind, let’s take a quick look at some of the biggest and most exciting features now available.


Containers

As containers are changing and improving IT operations, it only stands to reason that OpenStack operators can also benefit from this important and useful technology concept. In Red Hat OpenStack Platform we have begun the work of containerizing the control plane. This includes some of the main services that run OpenStack, like Nova and Glance, as well as supporting technologies, such as Red Hat Ceph Storage. All these services can be deployed as containerized applications via Red Hat OpenStack Platform’s lifecycle and deployment tool, director.

Photo by Frank McKenna on Unsplash

Bringing a containerized control plane to OpenStack is important. Through it we can immediately enhance, among other things, stability and security features through isolation. By design, OpenStack services often have complex, overlapping library dependencies that must be accounted for in every upgrade, rollback, and change. For example, if Glance needs a security patch that affects a library shared by Nova, time must be spent to ensure Nova can survive the change; or even more frustratingly, Nova may need to be updated itself. This makes the change effort and resulting change window and impact, much more challenging. Simply put, it’s an operational headache.

However, when we isolate those dependencies into a container we are able to work with services with much more granularity and separation. An urgent upgrade to Glance can be done alongside Nova without affecting it in any way. With this granularity, operators can more easily quantify and test the changes helping to get them to production more quickly.

We are working closely with our vendors, partners, and customers to move to this containerized approach in a way that is minimally disruptive. Upgrading from a non-containerized control plane to one with most services containerized is fully managed by Red Hat OpenStack Platform director. Indeed, when upgrading from Red Hat OpenStack Platform 11 to Red Hat OpenStack Platform 12 the entire move to containerized services is handled “under the hood” by director. With just a few simple preparatory steps director delivers the biggest change to OpenStack in years direct to your running deployment in an almost invisible, simple to run, upgrade. It’s really cool!

Red Hat Ansible

Like containers, it’s pretty much impossible to work in operations and not be aware of, or more likely be actively using, Red Hat Ansible. Red Hat Ansible is known to be easier to use for customising and debugging; most operators are more comfortable with it, and it generally provides an overall nicer experience through a straightforward and easy to read format.

Of course, we at Red Hat are excited to include Ansible as a member of our own family. With Red Hat Ansible we are actively integrating this important technology into more and more of our products.

In Red Hat OpenStack Platform 12, Red Hat Ansible takes center stage.

But first, let’s be clear, we have not dropped Heat; there are very real requirements around backward compatibility and operator familiarity that are delivered with the Heat template model.

But we don’t have to compromise because of this requirement. With Ansible we are offering operator and developer access points independent of the Heat templates. We use the same composable services architecture as we had before; the Heat-level flexibility still works the same, we just translate to Ansible under the hood.

Simplistically speaking, before Ansible, our deployments were mostly managed by Heat templates driving Puppet. Now, we use Heat to drive Ansible by default, and then Ansible drives Puppet and other deployment activities as needed. And with the addition of containerized services, we also have positioned Ansible as a key component of the entire container deployment. By adding a thin layer of Ansible, operators can now interact with a deployment in ways they could not previously.

For instance, take the new openstack overcloud config download command. This command allows an operator to generate all the Ansible playbooks being used for a deployment into a local directory for review. And these aren’t mere interpretations of Heat actions, these are the actual, dynamically generated playbooks being run during the deployment. Combine this with Ansible’s cool dynamic inventory feature, which allows an operator to maintain their Ansible inventory file based on a real-time infrastructure query, and you get an incredibly powerful troubleshooting entry point.
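As an illustration of the dynamic inventory idea, Ansible inventory plugins query infrastructure at load time rather than reading a static hosts file. A minimal config for the generic openstack inventory plugin might look like this (a sketch, assuming the plugin is available and a clouds.yaml is configured; director ships its own inventory tooling):

```yaml
# openstack.yml -- hypothetical dynamic inventory config; hosts are discovered
# by querying the cloud's APIs each time the inventory is loaded
plugin: openstack
expand_hostvars: true
fail_on_errors: false
```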

Check out this short (1:50) video showing Red Hat Ansible and this new exciting command and concept:

Network Composability

Another major new addition for operators is the extension of the composability concept into networks.

As a reminder, when we speak about composability we are talking about enabling operators to create detailed solutions by giving them basic, simple, defined components from which they can build for their own unique, complex topologies.

With composable networks, operators are no longer only limited to using the predefined networks provided by director. Instead, they can now create additional networks to suit their specific needs. For instance, they might create a network just for NFS filer traffic, or a dedicated SSH network for security reasons.
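In director, composable networks are declared in a network data file; a custom network entry might look roughly like the following (field names and values are illustrative; consult the OSP 12 documentation for the exact schema):

```yaml
# Sketch of a custom network definition for director (network_data.yaml)
- name: StorageNFS             # hypothetical network dedicated to NFS filer traffic
  name_lower: storage_nfs
  vip: false
  ip_subnet: '172.18.0.0/24'
  allocation_pools:
    - start: '172.18.0.10'
      end: '172.18.0.250'
```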

Photo by Radek Grzybowski on Unsplash

And as expected, composable networks work with composable roles. Operators can create custom roles and apply multiple, custom networks to them as required. The combinations lead to an incredibly powerful way to build complex enterprise network topologies, including an on-ramp to the popular L3 spine-leaf topology.

And to make it even easier to put together we have added automation in director that verifies that resources and Heat templates for each composable network are automatically generated for all roles. Fewer templates to edit can mean less time to deployment!

Telco Speed

Telcos will be excited to know we are now delivering production ready virtualized fast data path technologies. This release includes Open vSwitch 2.7 and the Data Plane Development Kit (DPDK) 16.11 along with improvements to Neutron and Nova allowing for robust virtualized deployments that include support for large MTU sizing (i.e. jumbo frames) and multiple queues per interface. OVS+DPDK is now a viable option alongside SR-IOV and PCI passthrough in offering more choice for fast data in Infrastructure-as-a-Service (IaaS) solutions.

Operators will be pleased to see that these new features can be more easily deployed thanks to new capabilities within Ironic, which store environmental parameters during introspection. These values are then available to the overcloud deployment providing an accurate view of hardware for ideal tuning. Indeed, operators can further reduce the complexity around tuning NFV deployments by allowing director to use the collected values to dynamically derive the correct parameters resulting in truly dynamic, optimized tuning.

Serious About Security


Helping operators, and the companies they work for, focus on delivering business value instead of worrying about their infrastructure is core to Red Hat’s thinking. And one way we make sure everyone sleeps better at night with OpenStack is through a dedicated focus on security.

Starting with Red Hat OpenStack Platform 12 we have more internal services using encryption than in any previous release. This is an important step for OpenStack as a community to help increase adoption in enterprise datacenters, and we are proud to be squarely at the center of that effort. For instance, in this release even more services now feature internal TLS encryption.

Let’s be realistic, though: focusing on security extends beyond just technical implementation. Starting with Red Hat OpenStack Platform 12 we are also releasing a comprehensive security guide, which provides best practices as well as conceptual information on how to make an OpenStack cloud more secure. Our security stance is firmly rooted in meeting global standards from top international agencies such as FedRAMP (USA), ETSI (Europe), and ANSSI (France). With this guide, we are excited to share these efforts with the broader community.

Do You Even Test?

How many times has someone asked an operations person this question? Too many! “Of course we test,” they will say. And with Red Hat OpenStack Platform 12 we’ve decided to make sure the world knows we do, too.

Through the concept of Distributed Continuous Integration (DCI), we place remote agents on site with customers, partners, and vendors that continuously build our releases at all different stages on all different architectures. By engaging outside resources we are not limited by internal resource restrictions; instead, we gain access to hardware and architecture that could never be tested in any one company’s QA department. With DCI we can fully test our releases to see how they work under an ever-increasing set of environments. We are currently partnered with major industry vendors for this program and are very excited about how it helps us make the entire OpenStack ecosystem better for our customers.

So, do we even test? Oh, you bet we do!

Get the Goods

Red Hat OpenStack Platform 12 is designated as a “Standard” release with a one-year support window. Click here for more details on the release lifecycle for Red Hat OpenStack Platform.

Find out more about this release at the Red Hat OpenStack Platform Product page. Or visit our vast online documentation.

And if you’re ready to get started now, check out the free 60-day evaluation available on the Red Hat portal.

Looking for even more? Contact your local Red Hat office today.

Original Link

Ansible TDD Development Using Molecule 2.4


Ansible is an agentless IT orchestration tool written in Python that simplifies infrastructure automation and deployment, in contrast to agent-based tools like Puppet or Chef.

Molecule provides a set of tools to help us develop and test Ansible roles. With Molecule, a role can be tested against multiple operating systems and distributions, virtualization providers such as Docker and Vagrant, and test frameworks such as Testinfra and Goss.

This tutorial develops an Ansible role from scratch using the test-driven development (TDD) approach with Molecule, Testinfra, and Docker.


Develop an Ansible role that installs the SOS report package on Ubuntu machines and generates a report, using the TDD approach.

SOS report is a tool that captures debugging information for the current system in a compressed tarball, which can be sent to technical support for further analysis.


Install Ansible, Molecule, and docker-py using pip. This assumes Docker for Mac is installed on your macOS machine.

pip install ansible
pip install molecule
pip install docker-py

ansible --version      # ansible
molecule --version     # molecule, version 2.4.0
python -V              # Python 2.7.10

# OS: macOS High Sierra (10.13.1)

Create Ansible Role Skeleton

We can initialize an Ansible role with Molecule using the following two approaches:

Approach 1: Create Ansible role with the ansible-galaxy command, then initialize with Molecule

We can generate an Ansible skeleton role using the ansible-galaxy init command without Molecule and later initialize through the molecule init command.

This approach can be used to add Molecule tests to any existing Ansible role.

ansible-galaxy init ansible-role-sosreport
cd ansible-role-sosreport/
molecule init scenario --scenario-name default --role-name ansible-role-sosreport

Approach 2: Create Ansible role with Molecule init command

Create a new role using the Molecule init command:

molecule init role --role-name ansible-role-sosreport

The molecule init command creates the molecule directory inside the newly created Ansible role with a default scenario. The tests are written under the molecule/default/tests folder using Testinfra.

Since we are following a test-driven approach, we will write the tests first, then the actual code.

First, we have to redefine the default Molecule configuration file generated by the molecule init command, specifying the driver and image, platforms, verifier, and test sequence for our default scenario.

file: ansible-role-sosreport/molecule/default/molecule.yml

dependency:
  name: galaxy
driver:
  name: docker
lint:
  name: yamllint
platforms:
  - name: instance
    image: ubuntu:16.04
    privileged: true
provisioner:
  name: ansible
  lint:
    name: ansible-lint
    options:
      x: ANSIBLE0013
verifier:
  name: testinfra
  options:
    verbose: true
  lint:
    name: flake8
    enabled: False
scenario:
  name: default
  test_sequence:
    - destroy
    - syntax
    - create
    - prepare
    - converge
    - lint
    - side_effect
    - verify
    - destroy

We are using the Docker driver in this example, as specified in the driver section.

The platforms section lists the Docker image Molecule uses to test this role. Here we are using the Ubuntu image, but we can list multiple Docker images in this section.

In the provisioner section, we are excluding the ansible-lint rule ANSIBLE0013 so the shell module can be used.

We are overriding the scenario section by specifying the test sequence that we require. For example, we removed the idempotence step from the test_sequence list.

1. Writing Tests

The intention of the ansible-role-sosreport role is to make sure that the sosreport package is installed and that the system report is generated in a compressed tar format.

Use Case 1: Make sure the sosreport prerequisites and the sosreport package are installed.

Use Case 2: Make sure the SOS report file is generated at a given location.

Tests are written under the scenario_name/tests folder. Currently, we have only one scenario, named default. To learn more about scenarios, please refer to the reference section.

file: ansible-role-sosreport/molecule/default/tests/

import os

import testinfra.utils.ansible_runner

testinfra_hosts = testinfra.utils.ansible_runner.AnsibleRunner(
    os.environ['MOLECULE_INVENTORY_FILE']).get_hosts('all')


def test_sos_report_package(host):
    assert host.package("software-properties-common").is_installed
    apt_repo = int(host.check_output(
        "find /etc/apt/ -type f -exec grep 'canonical-support/support-tools' '{}' \; -print | wc -l").strip())
    assert apt_repo > 0
    assert host.package("sosreport").is_installed


def test_sos_report_file(host):
    assert host.check_output(
        "find /tmp/sosreport-*/ -name 'sosreport-shoneslab*.tar.xz' | wc -l").strip() == '1'

Here, we have two test functions: one verifies that the sosreport package is installed along with its dependencies, and the other verifies that the actual report is generated after applying the ansible-role-sosreport role to the Ubuntu 16.04 Docker image.

2. Run the Test

After writing the test scenarios, we can run them using Molecule. Since the actual functionality is not yet implemented, the following command will fail.

molecule --debug test

3. Implement the Code

Once the test has failed, we know the functionality is missing from the Ansible role and has yet to be implemented. We write the logic as Ansible tasks in the tasks folder of the ansible-role-sosreport role.

File: ansible-role-sosreport/tasks/install.yml

This file is to install the sosreport package and its dependencies:

- name: install prereq
  apt:
    name: software-properties-common
    state: present
  become: true

- name: add canonical support tools apt repo
  apt_repository:
    state: present
    repo: "{{ sosreport_apt_repo }}"
  become: true

- name: update apt packages
  apt:
    update_cache: yes
    cache_valid_time: 86400
  become: true

- name: install sosreport
  apt:
    name: sosreport
    state: present
  become: true

file: ansible-role-sosreport/tasks/report.yml

This file is to generate the sosreport:

- name: get the timestamp
  set_fact:
    sosreport_dir_timestamp: "{{ lookup('pipe', 'date +%m%d%Y-%H%M%S') }}"

- name: ensure the temp directory
  file:
    path: "{{ sosreport_temp_dir }}/sosreport-{{ sosreport_dir_timestamp }}"
    state: directory
  become: true

- name: generate sosreport
  shell: >
    sosreport -a \
    --name={{ sosreport_customer_name }} \
    --case-id={{ sosreport_caseid }} \
    --tmp-dir="{{ sosreport_temp_dir }}/sosreport-{{ sosreport_dir_timestamp }}" \
    --batch
  become: true
  changed_when: false

- name: find the latest generated file
  find:
    path: "{{ sosreport_temp_dir }}/sosreport-{{ sosreport_dir_timestamp }}"
    patterns: "sosreport-{{ sosreport_customer_name }}.{{ sosreport_caseid }}-*.tar.xz"
  register: reg_sosreport

- name: stat for the sosreport
  stat:
    path: "{{ reg_sosreport.files[0]['path'] }}"
  register: reg_stat_sosreport
  when: reg_sosreport | length > 0

- name: change file permissions
  file:
    name: "{{ reg_sosreport.files[0]['path'] }}"
    mode: 0755
  become: true
  when:
    - reg_sosreport | length > 0
    - reg_stat_sosreport.stat.exists is defined
    - reg_stat_sosreport.stat.exists

- name: ensure the destination directory
  file:
    path: "{{ sosreport_final_dest_path }}"
    state: directory
    mode: 0755
  delegate_to: localhost
  become: false
  when:
    - reg_sosreport | length > 0
    - reg_stat_sosreport.stat.exists is defined
    - reg_stat_sosreport.stat.exists

- name: stat whether the file is already copied
  stat:
    path: "{{ sosreport_final_dest_path }}/{{ reg_sosreport.files[0]['path'] }}"
  register: reg_stat_sosreport_copied
  delegate_to: localhost
  when: reg_sosreport | length > 0

- name: fetch the remote file to local
  fetch:
    src: "{{ reg_sosreport.files[0]['path'] }}"
    dest: "{{ sosreport_final_dest_path }}"
    flat: yes
  become: true
  when:
    - reg_sosreport | length > 0
    - not reg_stat_sosreport_copied.stat.exists
    - reg_stat_sosreport.stat.exists is defined
    - reg_stat_sosreport.stat.exists

file: ansible-role-sosreport/tasks/main.yml

The main.yml file is the entry point of an Ansible role. Since the logic is implemented in a modular way, we include the other two task files from main.yml.

# tasks file for ansible-role-sosreport
- include_tasks: install.yml
  tags: sosreport_install

- include_tasks: report.yml
  tags: sosreport_report

file: ansible-role-sosreport/defaults/main.yml

It is always good practice to externalize variables rather than hardcode them in the tasks file. These values can eventually be overridden via the inventory (beyond the scope of this tutorial).

sosreport_apt_repo: "ppa:canonical-support/support-tools"
sosreport_temp_dir: /tmp
sosreport_final_dest_path: "{{ lookup('env','HOME') }}/temp/"
# Customer name should be without spaces
sosreport_customer_name: 'shoneslab'
sosreport_caseid: 42799
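As an illustration of such overriding, a hypothetical group_vars file (the file name and values below are assumptions, not part of this tutorial) could replace the defaults per environment:

```yaml
# file: group_vars/qa.yml (hypothetical)
sosreport_customer_name: 'acmecorp'
sosreport_caseid: 51234
sosreport_final_dest_path: "/var/archive/sosreports/"
```

Ansible's variable precedence means these group-level values win over the role's defaults/main.yml without touching the role itself.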

4. Run the Test

This is an iterative process. Once all the errors are gone, your Ansible role is complete, developed with a TDD approach.

molecule --debug test


Molecule makes it easy to develop an Ansible role, and it integrates easily with CI tools like GitLab CI or Jenkins.

This Ansible role can be downloaded using the following command:

ansible-galaxy install shoneslab.sosreport

Original Link

Installing Java, Supervisord, and Other Service Dependencies With Ansible

Let’s say you spent the past 18 months building your MVP application complete with several microservices, SQL, NoSQL, Kafka, Redis, Elasticsearch, service discovery, and on-demand configuration updates. You are on the bleeding edge of technology and are prepared to scale to millions of users overnight. Your only problem? Everything is running locally and nothing has been set up in AWS. Go hire a DevOps engineer and delay your launch another month or two. Good luck.

Let’s try that again and rewind 17 months. You decided to go with a single Java monolith app, for which you build a single deployable JAR with Gradle using the shadowJar plugin. Sure, monoliths aren’t sexy, but they work. Realistically speaking, they can scale pretty far; remember when Twitter was still on a monolith with millions of tweets a day? Our goal is to get the app up and running with as little effort as possible, while keeping in mind that we may want to scale out to more than one server. No, we don’t need auto-scaling just yet, and probably won’t for a long while, if ever.

Head over to Amazon Web Services and spin up a single server. Maybe even a t2.micro will be good enough for your use case; it can always be resized later. Next, it would be nice to have a way to reliably install all of the dependencies for our app. Enter Ansible, an SSH-based automation solution for provisioning, configuring, and deploying applications using YAML. Take some time for a brief introduction to getting started with Ansible so we can dive right in. We will be following the second directory structure from Ansible best practices.

Ansible Playbook to Configure Our EC2 Instance With Java and Supervisord

An Ansible playbook is often the main entry point to some task, process, or configuration. It specifies which hosts commands are to be run on and includes all of the reusable modules (roles) for the task at hand. We can also provide variables or variable overrides here. Our playbook will include everything our app needs to run on our newly spun-up EC2 instance. This includes the current version of Java and Supervisord, which we will use to run our process and make sure it gets restarted if it crashes.

# ansible-playbook --vault-password-file .vault_pw.txt -i inventories/production stubbornjava.yml
- hosts: stubbornjava
  become: true
  roles:
    - role: common
    - role: apps/jvm_app_base
      app_name: stubbornjava
      app_command: "java8 -Denv={{env}} -Xmx640m -cp 'stubbornjava-all.jar' com.stubbornjava.webapp.StubbornJavaWebApp"

View on GitHub

Ansible Hosts

The hosts property allows us to specify which hosts we want to execute the roles/tasks in this section on. In this case, we are choosing our server group stubbornjava.
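The inventory file itself isn't reproduced here; a hypothetical sketch of a stubbornjava group (the host address below is an assumed placeholder) might look like:

```ini
; hypothetical Ansible inventory sketch
[stubbornjava]
ec2-203-0-113-10.compute-1.amazonaws.com
```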


View on GitHub

Ansible Custom Java Application Base Role

The common role is pretty trivial: it just sets a time zone and enables the EPEL repo, so we will skip it. The next role, jvm_app_base, is where most of our configuration happens. We pass two parameters into the jvm_app_base role: app_name, an identifier for our app used throughout the role, and app_command, the actual command we want to run.

Java Application Base Variables

Our base role includes the following variables. As you can see, we reuse the app_name passed in from the playbook to define additional variables.

---
app_directory: "/apps/{{app_name}}"
app_user: "{{app_name}}"
app_user_group: "{{app_name}}"

View on GitHub

Java Application Base Role

Roles are one of Ansible's primary building blocks. At its core, a role is a collection of tasks, variables, templates, and files. Our role is fairly generic, so we can reuse it for multiple Java applications if we want. Like a playbook, a role can include other roles. We use a public open-source role to install Java, and we include another custom role for Supervisord. Next, we use some built-in Ansible modules to create a user and group for our app, as well as the directory structure we want. Notice how variables cascade down, so far mostly populated from just our app_name. We then add the Supervisord config file for this app, which tells us where to put logs, what command to run, and any other Supervisord-specific settings we care about. Finally, we copy over our environment-specific configuration, with all of our secrets, using Ansible Vault.

- name: Install Java
  include_role:
    name:
  vars:
    java_packages:
      - java-1.8.0-openjdk

- name: Including Supervisord
  include_role:
    name: supervisord

- name: "Creating {{app_user_group}} Group"
  group:
    name: "{{app_user_group}}"
    state: present

- name: "Creating {{app_user}} User"
  user:
    name: "{{app_user}}"
    group: "{{app_user_group}}"

- name: "Creating {{app_directory}} Directory"
  file:
    path: "{{app_directory}}"
    state: directory
    mode: 0755
    owner: "{{app_user}}"
    group: "{{app_user_group}}"

- name: "Copying supervisor {{app_name}}.conf"
  template:
    src: supervisorapp.conf.j2
    dest: "/etc/supervisor.d/{{app_name}}.conf"
  notify: restart supervisord
  # If we start refreshing configs in the app we can
  # turn off the restart. For now we load configs once
  # at boot. So restart the app any time it changes.

- name: Copying stubbornjava secure.conf
  template:
    src: secure.conf.j2
    dest: "{{app_directory}}/secure.conf"
    owner: "{{app_user}}"
    group: "{{app_user_group}}"
    mode: u=r,g=r,o=
  notify: restart supervisord

View on GitHub

As you can see, our config file with secrets is very bare-bones. Most of our production config can live in our normal code base; the only thing we want split out is our secrets, which we want stored more securely. Also, we want the configs that change most frequently to just be part of the build/deploy. Since our secrets change much less frequently, it's reasonable to split them out.

db {
  url="{{db['url']}}"
  user="{{db['user']}}"
  password="{{db['password']}}"
}
github {
  clientId="{{github['client_id']}}"
  clientSecret="{{github['client_secret']}}"
}

View on GitHub


In the previous role, we mentioned including the Supervisord role as well as setting up the app-specific conf file. Here is a quick look at the role with some files excluded; you can view them all here. The tasks should be fairly straightforward.

- name: Supervisor init script
  copy: src=supervisord dest=/etc/init.d/supervisord mode=0755
  notify: restart supervisord

- name: Supervisor conf
  copy: src=supervisord.conf dest=/etc/supervisord.conf
  notify: restart supervisord

- name: Supervisor conf dir
  file: path=/etc/supervisor.d/ state=directory
  notify: restart supervisord

- name: Install supervisord
  easy_install: name=supervisor
  notify: restart supervisord

# TODO: Don't always restart supervisord. We can reload configs
# or even just the app when we deploy new code. Restarting it all is overkill.

View on GitHub

Next, we have our Supervisord handlers. Ansible handlers are like tasks but with one very special property: they are triggered by notify and run only once each, at the end of all tasks. Changing database passwords, changing the Supervisor conf, changing JVM properties, or anything else in our roles can trigger a handler with notify. Whether it's triggered once or N times, it will only run once after all of the tasks have completed. This prevents you from accidentally restarting your service N times during a single run.

- name: restart supervisord
  service: name=supervisord state=restarted

- name: "supervisorctl restart {{app_name}}"
  command: "supervisorctl restart {{app_name}}"

View on GitHub
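To make the run-once semantics concrete, here is a tiny Python model of notify-and-flush. This is purely an illustration of the behavior, not Ansible internals:

```python
# Toy model of Ansible handler semantics: tasks may notify a handler
# any number of times, but each notified handler runs exactly once,
# after all tasks have completed.
notified = []

def notify(handler_name):
    # Notifications are de-duplicated, preserving first-notification order.
    if handler_name not in notified:
        notified.append(handler_name)

def flush_handlers(handlers):
    # Run each notified handler once, at the end of the play.
    return [handlers[name]() for name in notified]

handlers = {"restart supervisord": lambda: "supervisord restarted"}

# Three config-changing tasks all notify the same handler...
for task in ("copy app conf", "copy secure.conf", "update command"):
    notify("restart supervisord")

# ...but the service restarts only once.
results = flush_handlers(handlers)
print(results)
```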

Lastly, for Supervisord, here is the app-specific config file referenced above in the JVM app role. Our app_command variable, passed all the way down from the playbook, is the executing command here.
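The template itself lives on GitHub; a minimal sketch of what such a supervisorapp.conf.j2 could contain (directives assumed from standard Supervisord usage, log paths hypothetical) is:

```ini
; hypothetical sketch of supervisorapp.conf.j2
[program:{{app_name}}]
command={{app_command}}
directory={{app_directory}}
user={{app_user}}
autostart=true
autorestart=true
stdout_logfile=/var/log/{{app_name}}/stdout.log
stderr_logfile=/var/log/{{app_name}}/stderr.log
```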


View on GitHub

Running Our Playbook

All that’s left is to run our playbook. By the end of the run, our server should be fully configured, missing only the actual JAR file to execute; we will cover that with a deploy script in another post. If we decide we need some redundancy or larger servers, we simply add the new hosts to the Ansible hosts file and run the playbook again, and all the servers will be configured. There are even dynamic inventories for cloud infrastructure like AWS, where you can target all instances based on their tags, and options to stagger the commands across servers so not all of them update at once.

ansible-playbook --vault-password-file .vault_pw.txt -i inventories/production stubbornjava.yml

PLAY [stubbornjava] ************************************************************

TASK [setup] *******************************************************************
ok: []

TASK [common : EPEL] ***********************************************************
ok: []

TASK [common : New York Time Zone] *********************************************
ok: []

TASK [ : Include OS-specific variables.] ***********************
ok: []

TASK [ : Include OS-specific variables for Fedora.] ************
skipping: []

TASK [ : Include version-specific variables for Ubuntu.] *******
skipping: []

TASK [ : Define java_packages.] ********************************
ok: []

TASK [ : include] **********************************************
included: /usr/local/etc/ansible/roles/ for

TASK [ : Ensure Java is installed.] ****************************
ok: [] => (item=[u'java-1.7.0-openjdk'])

TASK [ : include] **********************************************
skipping: []

TASK [ : include] **********************************************
skipping: []

TASK [ : Set JAVA_HOME if configured.] *************************
skipping: []

TASK [supervisord : Supervisor init script] ************************************
ok: []

TASK [supervisord : Supervisor conf] *******************************************
ok: []

TASK [supervisord : Supervisor conf dir] ***************************************
ok: []

TASK [supervisord : Install supervisord] ***************************************
ok: []

TASK [apps/jvm_app_base : Creating stubbornjava Group] *************************
ok: []

TASK [apps/jvm_app_base : Creating stubbornjava User] **************************
ok: []

TASK [apps/jvm_app_base : Creating /apps/stubbornjava Directory] ***************
ok: []

TASK [apps/jvm_app_base : Copying supervisor stubbornjava.conf] ****************
changed: []

TASK [apps/jvm_app_base : Copying stubbornjava secure.conf] ********************
ok: []

RUNNING HANDLER [supervisord : restart supervisord] ****************************
changed: []

PLAY RECAP *********************************************************************
 : ok=17 changed=2 unreachable=0 failed=0

Original Link

How to Check Log Files on a Server Without Logging Into the Server

My sysadmin friends spend part of their time helping developers with troubleshooting. Sometimes, when there's a big problem, that time increases a lot. When it happens, it's not difficult to feel overwhelmed, both by the pressure of solving the problem itself and, unfortunately, by the setbacks faced throughout the troubleshooting process.

Many companies have strict security policies that prevent developers from accessing servers through SSH. The problem arises when they need to check log files that exist on such servers, during an outage, for example. When a crisis happens, there’s no time to spend with bureaucracies, the log files must be accessible right away for troubleshooting.

One solution is to provide the log files to the developers, or anyone in charge of troubleshooting, with no need to log in to the servers. The security policies are followed and the required availability of the log files is met. This is possible by installing and configuring the Apache HTTP Server so that the log files are accessible through a web browser.

The solution can be checked out on Github. It uses Ansible to automate the task of making the log files accessible, and Vagrant + VirtualBox to create the development and testing environment for such automation.

The Development Environment

The development environment is very important to create, and it must be created locally, on your own computer. There's no need to develop and test Ansible playbooks any other way. You might ask why not use some server for such a task, but be aware: servers are usually shared, and someone may accidentally mess with your stuff.

Furthermore, coding is very dynamic. You need an environment in which to experiment and make mistakes (the trial-and-error method). Some code you will surely throw away until you find the solution. So imagine testing your code against a real server and leaving it in a state that's hard to roll back. With your own environment, you can easily recreate VMs and retest your code from scratch, over and over, at will.

Vagrant is an awesome tool to build your development environment. Its default integration with VirtualBox simplifies a lot managing VMs. Through the command line, you can create, provision, connect to via SSH, and destroy VMs, to name just a few operations. The command vagrant up, for example, puts your environment up and running, based on the Vagrantfile, like the one below.

Vagrant.configure("2") do |config|
  config.vm.define "jenkins" do |jenkins| = "minimal/trusty64"
    jenkins.vm.hostname = "jenkins.local" "private_network", ip: ""
    jenkins.vm.provision "ansible" do |ansible|
      ansible.playbook = "playbook-jenkins.yml"
    end
  end
end

In order to simulate a server where an application runs and adds data to log files, only one VM was used. It's important to have a VM as similar as possible to your real servers. For that reason, use VMs with the same OS and even the same basic configuration. Packer is a great tool for creating VM images that resemble your servers. Within the scope of this solution, a reduced version of an Ubuntu VM was used (minimal/trusty64).

Notice that the VM is provisioned during boot. Vagrant integrates with several provisioners, including Ansible. Basically, Oracle Java and Jenkins are installed in the VM, in that order. Jenkins is an open-source automation server, broadly used for delivering software, and with the adoption of Infrastructure as Code, it can be used for delivering infrastructure as well. If your delivery process is driven by Jenkins, you will surely need to look at the tool's log files once in a while.

- hosts: jenkins
  become: yes
  gather_facts: no
  tasks:
    - name: Install apt-transport-https (required for the apt_repository task)
      apt:
        name: apt-transport-https
        update_cache: yes
    - name: Install Oracle Java 8 (required for Jenkins installation)
      include_tasks: oracle-java8-installation.yml
    - name: Install Jenkins
      include_tasks: jenkins-installation.yml

During the playbook-jenkins.yml execution, the tasks related to the Oracle Java installation (oracle-java8-installation.yml) and those concerning the Jenkins installation (jenkins-installation.yml) are included dynamically through the include_tasks statement. This is a good code-organization practice, since it keeps everything in its right place and keeps the playbook files as small as possible. Moreover, it's a great way to enable code reuse.

The Solution Implementation

Right after the Jenkins server is turned on, you can open your web browser and type the server's URL. You will see the initial Jenkins configuration page. It asks for the auto-generated administrator password, reported in the jenkins.log file. Please don't get the password by accessing the VM through SSH. Remember, that's what we want to prevent. So keep calm and implement the solution first.

Jenkins stores its log files in the /var/log/jenkins directory. We must therefore configure the Apache HTTP Server to expose this folder. This is done using the apache-logs.conf file shown below, a template that can be used for any directory you want to make visible through the web browser.

If you want more details on how this configuration works, take a look at the Directory and the Alias directives documentation. For now, all we need to know is that the {{directory}} and the {{alias}} will be replaced respectively by the log files folder and the alias required to complement the URL address.

<Directory "{{directory}}">
    Options Indexes FollowSymLinks
    AllowOverride None
    Require all granted
</Directory>

Alias "{{alias}}" "{{directory}}"

The variables defined in playbook-jenkins-logs.yml below are used in that replacement. Notice that the directory variable points to the Jenkins log files folder cited above, and the alias value is /logs/jenkins. The other variable (conf) defines the name of the resulting configuration file, which will be placed in the Apache folders reserved for configuration files (/etc/apache2/conf*).

The Ansible playbook can be easily adapted to meet your needs. If some developer comes to you asking for help because he or she has to check inaccessible log files, just change the variable values and execute the playbook against the server where the files are.

Let’s finally implement the solution. Execute the command ansible-playbook playbook-jenkins-logs.yml -u vagrant -k -i hosts. The -u argument defines the SSH user, the -k argument prompts for the password (vagrant, too), and the -i argument points to the hosts file, where Ansible can find the Jenkins server IP address.

- hosts: jenkins
  become: yes
  gather_facts: no
  vars:
    - directory: /var/log/jenkins
    - alias: /logs/jenkins
    - conf: jenkins-logs.conf
  tasks:
    - name: Install Apache 2
      apt:
        name: apache2
        update_cache: yes
    - name: Config Apache logs
      template:
        src: apache-logs.conf
        dest: /etc/apache2/conf-available/{{conf}}
        owner: root
        group: root
        mode: 0644
    - name: Enable new config
      file:
        src: ../conf-available/{{conf}}
        dest: /etc/apache2/conf-enabled/{{conf}}
        owner: root
        group: root
        state: link
    - name: Restart Apache 2
      service:
        name: apache2
        state: restarted

During the execution, the Apache HTTP Server is installed, and the configuration file is placed with the right values in /etc/apache2/conf-available. The file content can be verified through the Ansible ad-hoc command ansible jenkins -m shell -a "cat /etc/apache2/conf-available/jenkins-logs.conf" -u vagrant -k -i hosts. After that, the configuration is enabled by creating a symbolic link in the /etc/apache2/conf-enabled folder, pointing right at the configuration file. Lastly, the Apache HTTP Server is restarted.

Now open a new tab in your web browser and type the URL. You will see all the content of the Jenkins server's /var/log/jenkins folder, including the jenkins.log file! Notice that the URL has the configured /logs/jenkins alias. You can, after all, open the log file to get the auto-generated administrator password. Just copy it, go back to the initial Jenkins configuration page, paste the password, and continue.


Despite the fact that we must follow the company's security policies, we must facilitate the troubleshooting process too. DevOps also means one problem is everyone's problem, so let's work together to solve all of them. If you enjoyed the solution, share it right now!

Before I forget, if you want my help in automating something, please give me more details, tell me your problem. It may be someone else’s problem, too.

Original Link

Dealing With the Same Configuration File With Different Content in Different Environments

Different from the previous post, in this case it was a demand from a dev friend. His application required a specific properties file in order to get the database connection string, a URL to connect to the MongoDB instance. The problem was that each environment had its own MongoDB instance, so the properties file content differed depending on where it was placed.

The common approach to this problem is to have different versions of the same file, each with the appropriate content for the related environment. What differentiates one file from another are the directories in the filesystem, or the branches in the SCM repository, where the files are placed, since they are named after the environments. With this approach, the right version of the configuration file is usually embedded in the application package during the deployment process.

The solution tries to eliminate that complexity by decoupling the configuration from the application and centralizing all the needed configuration in just one file. The solution can be checked out on GitHub. It was developed using Ansible, and tested in a VM environment built using Vagrant and the VirtualBox hypervisor. The details are shown below.

The Test Environment

In order to simulate my friend's QA environment, with different servers where the application is deployed, 3 VMs were booted up locally: qa1, qa2, and qa3. This way it was possible to test the Ansible playbook during its development, before executing it against the real servers.

The Vagrantfile below was used to build such a test environment. Notice this is Ruby: each VM is defined within a loop and receives an IP address. The VM image (box) used was minimal/trusty64, a reduced version of Ubuntu, for a faster first-time download and setup during the vagrant up command execution.

Vagrant.configure("2") do |config|
  config.vm.box = "minimal/trusty64"
  (1..3).each do |i|
    config.vm.define "qa#{i}" do |qa|
      qa.vm.hostname = "qa#{i}.local"
      qa.vm.network "private_network", ip: "192.168.33.#{i}0"
    end
  end
end

The Playbook Execution

With Ansible, you can perform tasks on several servers at the same time. It's possible because everything is done through SSH from a master host, even if it's your own machine. Besides that, Ansible knows the target servers through the inventory file (hosts), where they are defined and also grouped. In the hosts file below, the QA servers were defined inside the group qa.


The core of the solution is undoubtedly the config.json file. It concentrates all the needed configuration for each QA server. If my friend’s application requires more parameters, they can be easily added. The host element identifies the target server, and the items are the properties the application has to have in order to run appropriately.

[
  {
    "host": "qa1",
    "items": [
      { "key": "prop1", "value": "A" },
      { "key": "prop2", "value": "B" }
    ]
  },
  {
    "host": "qa2",
    "items": [
      { "key": "prop1", "value": "C" },
      { "key": "prop2", "value": "D" }
    ]
  },
  {
    "host": "qa3",
    "items": [
      { "key": "prop1", "value": "E" },
      { "key": "prop2", "value": "F" }
    ]
  }
]
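The query the playbook later runs against this file, [?host=='qa1'].items, is JMESPath syntax (Ansible's json_query filter uses the jmespath library under the hood). A plain-Python sketch of the same filter, using a trimmed-down copy of the file above, shows what it returns:

```python
# Plain-Python equivalent of the JMESPath query "[?host=='qa1'].items":
# filter the list of entries by host and project their "items".
import json

config_json = """
[
  {"host": "qa1", "items": [{"key": "prop1", "value": "A"},
                            {"key": "prop2", "value": "B"}]},
  {"host": "qa2", "items": [{"key": "prop1", "value": "C"},
                            {"key": "prop2", "value": "D"}]}
]
"""

config = json.loads(config_json)

def items_for(host):
    # [?host=='<host>'].items -> a list of the matching "items" lists
    return [entry["items"] for entry in config if entry["host"] == host]

print(items_for("qa1"))
```

Note that the result is a list of lists (one inner list per matching host), which is why a query for an unknown host simply yields an empty list instead of an error.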

In the solution, the configuration file is /etc/conf, but it could have any name and could be placed in any directory of the application server. The etc folder has root permissions, so it requires that the SSH user is able to become root (become: yes).

The playbook.yml below points to the qa group previously defined in the hosts file (hosts: qa). Ansible can then execute it against the 3 VMs: qa1, qa2, and qa3. Each one is identified during the gathering facts phase, when the hostname variable is set.

The config variable points to the config.json file content, and the items_query variable is necessary to find inside the JSON content the properties key/value pairs of the respective server. The task ensures that there will be a line in the configuration file for each property.

- hosts: qa
  become: yes
  vars:
    hostname: "{{ansible_hostname}}"
    config: "{{lookup('file', 'config.json')}}"
    items_query: "[?host=='{{hostname}}'].items"
  tasks:
    - name: Set the configuration file content
      lineinfile:
        path: /etc/conf
        create: yes
        regexp: "^{{item.key}}=.*$"
        line: "{{item.key}}={{item.value}}"
      with_items: "{{config|json_query(items_query)}}"

The execution of the playbook.yml has the following output. The -u parameter defines the SSH user, and the -k parameter prompts for that user's password (vagrant as well). All Vagrant boxes have the vagrant user. Finally, the -i parameter points to the hosts file where the QA servers were defined.

Notice that the changes are made by Ansible on the servers in parallel. If the ansible-playbook command is executed several times, you will see differently ordered outputs, because Ansible forks the main process in order to perform the tasks simultaneously on the servers.

ansible-playbook playbook.yml -u vagrant -k -i hosts
SSH password:

PLAY [qa] **********************************************************************

TASK [Gathering Facts] *********************************************************
ok: []
ok: []
ok: []

TASK [Set the configuration file content] **************************************
changed: [] => (item={'value': u'E', 'key': u'prop1'})
changed: [] => (item={'value': u'C', 'key': u'prop1'})
changed: [] => (item={'value': u'A', 'key': u'prop1'})
changed: [] => (item={'value': u'D', 'key': u'prop2'})
changed: [] => (item={'value': u'F', 'key': u'prop2'})
changed: [] => (item={'value': u'B', 'key': u'prop2'})

PLAY RECAP *********************************************************************
: ok=2 changed=1 unreachable=0 failed=0
: ok=2 changed=1 unreachable=0 failed=0
: ok=2 changed=1 unreachable=0 failed=0

Finally, you can validate the playbook execution by using Ansible ad-hoc commands, like the one shown below. The command cat /etc/conf was used to ensure that each configuration file's content is as expected. Ad-hoc commands are excellent for finding out what you want to know about several servers in just one shot.

ansible qa -m shell -a "cat /etc/conf" -u vagrant -k -i hosts
SSH password:
 | SUCCESS | rc=0 >>
prop2=F
 | SUCCESS | rc=0 >>
prop2=B

One interesting aspect of this solution is the capacity of the playbook to be executed over and over while keeping the same results. In other words, even if someone inadvertently changes the configuration file content, it will be fixed the next time the playbook is executed. This property is called idempotence.
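Idempotence can be illustrated with a minimal plain-Python sketch (not Ansible's actual code; the function below is a hypothetical model of the "replace the matching key=value line or append it" guarantee that lineinfile provides per item):

```python
def apply_desired_state(lines, items):
    # For each property, replace an existing "key=..." line or append
    # one -- a rough model of the lineinfile regexp/line behavior.
    for key, value in items:
        wanted = "%s=%s" % (key, value)
        for i, line in enumerate(lines):
            if line.split("=", 1)[0] == key:
                lines[i] = wanted  # line exists: rewrite it in place
                break
        else:
            lines.append(wanted)   # line missing: append it
    return lines

desired = [("prop1", "A"), ("prop2", "B")]
once = apply_desired_state([], desired)
twice = apply_desired_state(list(once), desired)
print(once == twice)  # True: the second run changes nothing
```

This is why re-running the playbook repairs manual edits without any side effects: the task describes the desired state, not a sequence of mutations.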


Once again, I helped a friend, and I'm happy about that. Instead of maintaining several files, he maintains a single one, and that makes the configuration much simpler.

This solution can be applied to many use cases, so share it, because you will certainly help someone else. And don't forget to tell me your problem; I want to help you, too.

Original Link

How to Unarchive Different Files in Different Servers in Just One Shot

It would be simpler if you had to unarchive just one file on several servers, but what about different files on different servers? A sysadmin friend of mine reached out to me with such a challenge: quite often, he had to place specific files on a bunch of servers for monitoring purposes.

He had a routine to package all the needed files for each server as TAR.GZ files. After the packaging step, he put all the tarball files on an Apache server, in a way they could be downloaded, each one by a URL. Finally, no matter how long it took, he logged in to each server, downloaded the specific compressed file, and extracted it to a directory. Needless to say, there was a better way.

The solution can be checked out on GitHub. It was developed using Ansible and tested in a VM environment built using Vagrant and the VirtualBox hypervisor. The details are shown below.

The Environment

In order to simulate my friend's environment, 3 VMs were used: 1 representing the Apache server, called repo, and 2 representing the different servers: server1 and server2. Each one received an IP address, and the communication between them was established through a private network. Vagrant was the VM management tool used to turn them all on with just one command: vagrant up. The Vagrantfile below was required by Vagrant to do such a task.

Vagrant.configure("2") do |config|
  config.vm.box = "minimal/trusty64"
  config.vm.define "repo" do |repo|
    repo.vm.hostname = "repo.local"
    repo.vm.network "private_network", ip: ""
    repo.vm.provision "ansible" do |ansible|
      ansible.playbook = "playbook-repo.yml"
    end
  end
  config.vm.define "server1" do |server1|
    server1.vm.hostname = "server1.local"
    server1.vm.network "private_network", ip: ""
  end
  config.vm.define "server2" do |server2|
    server2.vm.hostname = "server2.local"
    server2.vm.network "private_network", ip: ""
  end
end

Notice that the Vagrantfile defines:

  • The VM image (box) to be used: minimal/trusty64 (requires the Oracle VM VirtualBox Extension Pack), with a reduced version of Ubuntu (faster download and boot);

  • The hostname and the IP of each VM, including how they communicate with each other: private_network;

  • The provisioning of the repo VM, done by Ansible, an automation tool that must be installed on the Vagrant host machine beforehand.

The Repo Server Provisioning

The repo server is provisioned by Ansible during the vagrant up execution. The Apache HTTP Server is installed, and 2 compressed files are obtained from the Internet. The objective is to make the files available for downloading internally, by their URLs. The playbook-repo.yml below is executed by Ansible in order to do such a task.

- hosts: repo
  become: yes
  gather_facts: no
  tasks:
    - name: Install Apache 2
      apt:
        name: apache2
        update_cache: yes
    - name: Download files
      get_url:
        url: "{{item.url}}"
        dest: "/var/www/html/{{item.dest}}"
      with_items:
        - {"url": "", "dest": "server1.tar.gz"}
        - {"url": "", "dest": ""}

Some details about the playbook-repo.yml execution:

  • The VM user must become root, in order to install the Apache Server, hence the become: yes;
  • Ansible by default collects information about the target host. It's an initial step before the tasks' execution. When such information is not necessary, the step can be bypassed. The gather_facts: no, in this case, is recommended to save time, too;
  • The installation of the Apache Server was done through the apt module, which uses APT, the package management tool of Ubuntu. If the OS were CentOS, for example, it could be installed through yum;
  • Both files are downloaded in just one task. It’s possible because Ansible allows the use of loops, through the with_items statement.

The playbook-servers.yml Execution

Ansible can be used for executing tasks on several target hosts in just one shot. It's possible because of the inventory file, where groups of hosts can be defined. In the hosts file below, the servers group was defined, composed of server1 and server2.

[repo]

[servers]

An important part of the solution was separating all the needed parameters into a specific file, called params.json. In this file, each server has its compressed file URL defined, as well as its target directory, where the downloaded file will be extracted, as shown below. Notice that both URLs point to the repo server, each one to a file previously provided during the provisioning phase.

[
  {
    "host": "server1",
    "url": "",
    "target": "/var/target"
  },
  {
    "host": "server2",
    "url": "",
    "target": "/var/target"
  }
]

With the environment up and the parameters defined, we can finally unarchive different files in different servers in just one shot, executing the command ansible-playbook playbook-servers.yml -u vagrant -k -i hosts. The -u argument defines the SSH user, the -k argument prompts for password input (vagrant, too), and the -i argument points to the hosts file, commented earlier, instead of the default /etc/ansible/hosts.

- hosts: servers
  become: yes
  vars:
    hostname: "{{ansible_hostname}}"
    params: "{{lookup('file', 'params.json')}}"
    url_query: "[?host=='{{hostname}}'].url"
    url_param: "{{(params|json_query(url_query))[0]}}"
    target_query: "[?host=='{{hostname}}'].target"
    target_param: "{{(params|json_query(target_query))[0]}}"
  tasks:
    - name: Create the target directory if it doesn't exist
      file:
        path: "{{target_param}}"
        state: directory
    - name: Install unzip
      apt:
        name: unzip
        update_cache: yes
      when: url_param | match(".*\.zip$")
    - name: Unarchive from url
      unarchive:
        src: "{{url_param}}"
        dest: "{{target_param}}"
        remote_src: yes

Some details about the playbook-servers.yml execution:

  • By pointing to the group servers (hosts: servers), Ansible is able to execute the same playbook for both servers: server1 and server2;
  • The parameters of each server are obtained through variables:
    • hostname – the name of the current host found by Ansible during the gathering facts phase;
    • params – the params.json file content, returned by the lookup function;
    • url_query – the query to find the URL parameter defined for the current host;
    • url_param – the URL parameter defined for the current host, returned by the json_query filter;
    • target_query – the query to find the target parameter defined for the current host;
    • target_param – the target directory defined for the current host, returned by the json_query filter.
  • The target directory is created, if it doesn’t exist yet. It’s required by the unarchive task. Otherwise, an error occurs;
  • The unzip tool is installed only if the remote file has the .zip extension, which is the case for server2's remote file. This step is necessary because the subsequent unarchive task must be able to extract files compressed through different algorithms. If the when statement condition is not met, the task is skipped;
  • Finally, the compressed file is downloaded from the repo server and extracted to the target directory.
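The conditional behind "Install unzip" can be sketched in Python: Ansible's match test anchors at the start of the string, like Python's re.match, so ".*\.zip$" is true only for URLs ending in .zip (the URLs below are hypothetical stand-ins for the elided repo URLs):

```python
# Sketch of the `when: url_param | match(".*\.zip$")` condition.
import re

def needs_unzip(url):
    # `match` anchors at the beginning, like re.match (unlike `search`)
    return re.match(r".*\.zip$", url) is not None

print(needs_unzip("http://repo.local/server2.zip"))     # True: task runs
print(needs_unzip("http://repo.local/server1.tar.gz"))  # False: skipped
```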
ansible-playbook playbook-servers.yml -u vagrant -k -i hosts
SSH password:

PLAY [servers] *****************************************************************

TASK [Gathering Facts] *********************************************************
ok: []
ok: []

TASK [Create the target directory if it doesn't exist] *************************
changed: []
changed: []

TASK [Install unzip] ***********************************************************
skipping: []
changed: []

TASK [Unarchive from url] ******************************************************
changed: []
changed: []

PLAY RECAP *********************************************************************
: ok=3 changed=2 unreachable=0 failed=0
: ok=4 changed=3 unreachable=0 failed=0


My friend was really happy to save a lot of his time using such automation, and I'm sure other sysadmins with the same or similar tasks can benefit from it. So, if you enjoyed the solution, or think it's useful for a friend of yours, don't hesitate to share it.

Regardless of its utility, bear in mind this solution is a work in progress, so feel free to collaborate and improve it. After all, that's the open source way.

Finally, if you want my help in automating something, please give me more details and tell me your problem. It may be someone else's problem, too.

Original Link

Getting Started With Ansible

What Is Configuration Management?

Before starting with Ansible, let's discuss what configuration management is. Configuration Management (CM) is the process of handling changes in a system systematically, so that the consistency of the product is maintained. It retains this consistency because it is applied over the entire lifecycle of the system. Configuration management provides the capability of controlling and monitoring the performance of the system; using this monitoring capability, we can prevent errors by being notified of, and controlling, any change in the system. If any node in the cluster fails, we can reconfigure it. Also, configuration management keeps snapshots of every version of the infrastructure.

Why Configuration Management?

The reason we should use configuration management is to overcome the difficult situations we face while setting up a cluster. A few of these are:

  • Managing multiple servers
  • Scaling up and scaling down
  • Syncing up with development team and infrastructure team

What Is Ansible?

Ansible is a radically simple IT automation engine that automates cloud provisioning, configuration management, application deployment, intra-service orchestration, and many other IT needs. It can be used for advanced tasks such as continuous deployments or zero-downtime rolling updates. Ansible is a push-based configuration management tool, meaning we can push configuration onto the node machines directly, without any central node. It communicates with remote machines over SSH. The main goals of Ansible are ease of use and simplicity.

Features of Ansible

  • Simple and easy to use
  • Agentless
  • YAML-based playbook


Ansible communicates with other nodes using the SSH protocol. It does not require any database installation or any running daemon. You only need to install it on one machine; it can even be your local machine. You just need Python installed on your system. However, Windows is not supported as the control machine.


Ubuntu/Debian:

sudo apt install ansible

CentOS/Fedora:

sudo dnf install ansible
yum install ansible


Arch Linux-based:

sudo pacman -S ansible

FreeBSD:

sudo pkg install ansible



Via pip:

sudo pip install ansible

Ansible Inventory

Ansible can work on multiple systems at a time. It achieves this by selecting portions of the systems listed in Ansible's inventory, which by default is saved in the location /etc/ansible/hosts. You can specify a different inventory file using the -i <path> option on the command line.

The inventory file can be in one of many formats, depending on the inventory plugins you have. For this example, the format for /etc/ansible/hosts is INI-like and looks like this:

[mailservers]

[webservers]

[dbservers]

A YAML version would look like this:

all:
  hosts:
  children:
    webservers:
      hosts:
    dbservers:
      hosts:

It is easy to assign variables to hosts that will be used in playbooks:

host1 http_port=80 maxRequestsPerChild=808
host2 http_port=303 maxRequestsPerChild=909

We can also define variables that can be applied to an entire group:

host2

[testserver:vars]

In Ansible inventory, we can create groups of groups and can set a variable for those groups of groups:

host2

[japan]
host4

[asia:children]
japan

[asia:vars]
ansible_user=value

[world:children]

There are two default groups: all and ungrouped. all contains every host and ungrouped contains all the hosts that do not have a group aside from all.

Ansible Ad-Hoc Commands

An ad-hoc command is something that you might type in to do something rapidly, but don't want to save for later, just like executing a command in the shell instead of creating a shell script for it. An ad-hoc command contains two different parameters: the host group on which the task is going to run, and the module to run. If you want to ping each host with a single command, you can do it using the following:

ansible host_group -m ping

Similarly, you can perform many other operations using Ansible, like copying a file, managing packages, gathering facts, etc.

Ad-hoc commands are a powerful yet straightforward feature of Ansible.

Ansible Playbook and Modules

Playbooks are a completely different way to use Ansible than ad-hoc task execution mode, and are particularly powerful. They are a way to send commands to remote nodes using a script, like a shell script that contains a set of commands. Ansible playbooks are written in the YAML format. YAML is a data serialization language.

Every playbook contains one or more "plays" in a list. The goal of a play is to map hosts to a certain function. Ansible does this through tasks, which are nothing more than calls to Ansible modules.

Example of a playbook:

- hosts: webservers
  vars:
    http_port: 80
    max_clients: 200
  remote_user: root
  tasks:
    - name: ensure apache is at the latest version
      yum: name=httpd state=latest
    - name: write the apache config file
      template: src=/srv/httpd.j2 dest=/etc/httpd.conf
      notify:
        - restart apache
    - name: ensure apache is running (and enable it at boot)
      service: name=httpd state=started enabled=yes
  handlers:
    - name: restart apache
      service: name=httpd state=restarted

Every playbook starts with three dashes (---), followed by the host list, then a variable list, then a task list, and at the end the handlers.

The host list contains the list of hosts where we want to run the task.

The variable list is to set the properties for the current play.

The task list contains the tasks that are going to be executed.

The handlers are also tasks; the only difference is that in order to execute a handler, we need some trigger in the list of tasks, such as notify. These notify actions are triggered at the end of each block of tasks in a play, and will only be triggered once, even if notified by multiple different tasks.
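That once-per-play behavior can be sketched as a simple deduplicated queue (a rough model only, assuming nothing about Ansible's real implementation; the second task name is hypothetical):

```python
# Tasks notify handlers during the play; a notified handler is queued
# only once and runs at the end, no matter how many tasks notified it.
notified = []

def notify(handler):
    if handler not in notified:
        notified.append(handler)

for task in ["write the apache config file", "write the vhost file"]:
    notify("restart apache")  # two changed tasks notify the same handler

print(notified)  # the handler is queued a single time
```

So in the playbook above, apache is restarted once at the end, rather than after every task that touched its configuration.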

To run a playbook, we can use the following command:

ansible-playbook

Ansible ships with many modules (called the “module library”) that can be executed directly on remote hosts or through Playbooks.

Users can also write their own modules. These modules can control system resources like services, packages, or files (anything really), or handle executing system commands.

Original Link