One month with Ansible

Bertrand Florat - Feb 3, 2018

Ansible is an open source IT automation tool written in Python and sponsored by Red Hat. The best-known alternatives are Puppet, Chef and Salt.

I used Ansible for the first time (version 2.4.3, the latest release in early 2018) to produce some fairly sophisticated Docker Swarm docker-compose files and other YAML configuration files that include a significant amount of logic (port number increments, conditional suffixes, a variable number of sections depending on lists of items, etc.).

I achieved my goals in about five or six days of effective work, including reading most of the official manual. Completing a real task like this in six days is acceptable given that I had to learn the tool first, although I think I could have done it in a single day in bash (which I already know). However, Ansible is much more powerful. My first contact and first real work with Ansible were really enjoyable, and I was very surprised at how easily I got things working. I also managed to apply all the documented best practices. Sadly, I spent the last three days struggling with the remaining 5% of the work, dealing with limitations and bugs that I found hard to understand and quite irritating.

What I liked

The concept of desired state is very powerful: Ansible playbooks (lists of tasks to perform against some servers) are idempotent: only the final state has to be described (like "a /tmp/foo directory with 600 rights"), not the actions required to reach it (as in bash: mkdir, chown, chmod...). It is powerful partly because you don't have to test whether the final state already exists (in a bash script running in exit-on-error mode, you would have to check the existence of each directory, for instance).
Ansible is agentless: there is nothing to install on the targeted servers. All you need is an ssh key exchange to allow headless ssh connections. Ansible generates Python scripts from the playbook, copies them using scp or sftp, and runs them remotely over ssh as well.
A role is a kind of packaged operational procedure (like "add a mysql user" or "create and configure an Apache server"). Roles enable a lot of reuse, which is really great. A marketplace of shared roles is available on Galaxy.
The manual and reference documentation are good and extensive.
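
To make the desired-state idea from the list above concrete, here is a minimal Python sketch of an idempotent "ensure directory" operation. It is not how Ansible implements its modules; the function name ensure_directory and its "changed" return value are hypothetical, chosen only to mirror the way a task reports whether it had to modify anything. The mode 700 is used instead of the 600 of the example above so that the directory stays traversable.

```python
import os
import stat
import tempfile


def ensure_directory(path: str, mode: int = 0o700) -> bool:
    """Bring `path` to the desired state; return True if something changed.

    Hypothetical helper: it describes the final state (a directory with the
    given permission bits) and only acts when reality differs from it.
    """
    changed = False
    if not os.path.isdir(path):
        os.makedirs(path)
        changed = True
    # Compare only the permission bits, not the file-type bits.
    current_mode = stat.S_IMODE(os.stat(path).st_mode)
    if current_mode != mode:
        os.chmod(path, mode)
        changed = True
    return changed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        target = os.path.join(base, "foo")
        # The first run has to create the directory, the second has nothing to do.
        print(ensure_directory(target))
        print(ensure_directory(target))
```

Running the same call twice is safe: the second call finds the directory already in the requested state and does nothing, which is the property that the bash equivalent (mkdir, chmod) only gets with explicit existence checks.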

What I found irritating

UPDATE November 2019: all of the issues described here have been resolved in the meantime by the Ansible team, KUTGW!

I don't like YAML for complex structures. I find it harder to read than JSON, and syntax errors are very frequent and waste a great deal of time. Data structures are described by (space) indentation, which I find brittle. Worse: different indentations can both be valid but mean different things (like a map of maps versus one more key/value pair in the current map). Validators exist but, AFAIK, formatters don't. However, YAML comes with nice features like comments and multi-document files.
Playbook execution is rather slow because a new ssh connection is opened for each task, plus one to send the generated Python scripts to the remote host. Note however that even though tasks are always executed sequentially, each task is run in parallel against all the targeted servers.
You need to create a playbook that just wraps a role in order to run it; you cannot launch a role directly from the command line.
There are 16 kinds of loops in Ansible, like with_fileglob or with_filetree. Is that really necessary?
I wasn't able to increment a variable inside a loop in a jinja2 template: https://github.com/pallets/jinja/issues/641 . This is a feature, not a bug. Incrementing things (like ports) is nevertheless a very basic requirement IMO. Fortunately, there is a workaround (using a list, append and pop).
It isn't possible to match a directory with with_fileglob: https://github.com/ansible/ansible/issues/17136 . You have to use with_filetree, which comes with other constraints.
It is difficult to debug the templating, especially when using template fragments (with import). On any template module error, you only get the playbook line and the full template content (which is very difficult to read, BTW).
I find the syntax sometimes twisted, like when we have to use double quotes around variables in some places and not in others. Also, why should we add white space around variable names (like {{ variable }})? I find this ugly and annoying. Apparently, we can drop the spaces in playbooks but not in the jinja2 templates...
Ansible is not compatible with Python 3.0 to 3.5. Sometimes (like with the copy module), I didn't get any error message even though the Python package on the target server was unsupported.
It is not possible to copy recursively with remote_src (https://github.com/ansible/ansible/issues/14131). I had to use a hack: run template on the Ansible host using connection: local, and then copy using src instead of remote_src.
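
The execution model mentioned in the list above (tasks run one after another, each task runs in parallel against all targeted servers) can be sketched in Python as follows. This is only an illustration under stated assumptions, not Ansible's actual code: run_play, the forks parameter and the callables standing in for tasks are hypothetical, and no ssh connection is involved.

```python
from concurrent.futures import ThreadPoolExecutor


def run_play(hosts, tasks, forks=5):
    """Run `tasks` in order; run each task on all `hosts` concurrently.

    hosts: list of host names (plain strings in this sketch).
    tasks: list of (name, action) pairs, where action is a callable that
           takes a host name and returns a result for that host.
    forks: maximum number of hosts handled at the same time (hypothetical
           stand-in for a parallelism setting).
    """
    log = []
    with ThreadPoolExecutor(max_workers=forks) as pool:
        for name, action in tasks:
            # list() waits until every host has finished this task,
            # so the next task never starts early on any host.
            results = list(pool.map(action, hosts))
            log.append((name, dict(zip(hosts, results))))
    return log


if __name__ == "__main__":
    hosts = ["web1", "web2", "db1"]
    tasks = [
        ("create directory", lambda host: f"{host}: directory ok"),
        ("deploy template", lambda host: f"{host}: template ok"),
    ]
    for task_name, per_host in run_play(hosts, tasks):
        print(task_name, per_host)
```

The barrier between tasks is what makes the slowest host set the pace for every task, which adds to the per-task connection cost described above.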

Final thoughts

In conclusion, Ansible is a good product but can become cumbersome when you try to make it run too much logic. It is mainly a declarative system, not an imperative one. Next time, we'll have a look at Salt; it may be a more suitable solution, or maybe not?