How to safeguard Ansible deployment to mitigate accidents?
Problem
Amazon S3 recently had a major outage in the us-east-1 region. It appears to have been caused by a typo made while running a maintenance playbook in Ansible or a similar tool. You can put a shell script wrapper around ansible-playbook that looks like this:
#!/bin/bash
# Preview which hosts and tasks the playbook would touch.
/usr/bin/ansible-playbook "$@" --list-hosts --list-tasks
# Ask for explicit confirmation; anything other than "y" aborts.
read -r -p "Are you sure? (y/n) " answer
test "$answer" = "y" || exit 0
# Replace this shell with the real playbook run.
exec /usr/bin/ansible-playbook "$@"
But what other ways do you use to improve safety and reduce the chance of an error causing a major outage for your company?
Solution
We use Jenkins jobs to trigger deployments. This ensures that no matter who does the deployment, the same Ansible command is run. A nice bonus is that the build logs record when deployments were triggered, who triggered them, and exactly what happened during the deployment.
It's certainly not foolproof, but it's been a nice improvement over running Ansible playbooks by hand.
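To illustrate why a job-driven deployment keeps the command identical, here is a minimal sketch of the kind of helper a job could call. It is an illustration under assumptions, not a description of any real setup: the function name `build_deploy_command`, the `inventories/<env>` layout, the `site.yml` playbook, and the list of allowed environments are all hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: builds the one deployment command a job may run.
# Assumptions (not from the original answer): inventories live in
# inventories/<env>, the playbook is site.yml, and only the
# environments listed in the case statement are valid.

build_deploy_command() {
    local env="$1"
    case "$env" in
        staging|production)
            # The operator chooses only the environment; the rest is fixed.
            printf 'ansible-playbook -i inventories/%s site.yml\n' "$env"
            ;;
        *)
            # Anything else, including a typo, is rejected outright.
            echo "unknown environment: ${env:-<none>}" >&2
            return 1
            ;;
    esac
}
```

A job step might then do something like `cmd="$(build_deploy_command "$TARGET_ENV")" && $cmd`, where `TARGET_ENV` is a hypothetical job parameter. Because the operator picks only the environment, a typo produces an error rather than a differently shaped command.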
For larger or riskier changes, this should ideally be combined with some form of change management, so that changes are made only after another person or team has reviewed both the change and the approach to it. That helps identify and resolve potential issues early.
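One lightweight way to back a review process with tooling is to have the deployment script refuse to run until someone other than the person triggering it has signed off. The following is only a sketch of that idea under assumptions: the function `require_second_approver` and its two arguments (who triggered the run and who approved it) are hypothetical, and how those names are supplied (job parameters, a ticket system, and so on) is left open.

```shell
#!/bin/bash
# Hypothetical two-person check: succeeds only if an approver is
# recorded and is not the same person who triggered the deployment.
# Usage: require_second_approver <triggered_by> <approved_by>

require_second_approver() {
    local triggered_by="$1" approved_by="$2"

    # Both identities must be known before the comparison means anything.
    if [ -z "$triggered_by" ] || [ -z "$approved_by" ]; then
        echo "refusing to deploy: trigger user or approver missing" >&2
        return 1
    fi

    # Self-approval does not count as a review.
    if [ "$triggered_by" = "$approved_by" ]; then
        echo "refusing to deploy: approver must be a different person" >&2
        return 1
    fi

    return 0
}
```

A script could call `require_second_approver "$TRIGGERED_BY" "$APPROVED_BY" || exit 1` before invoking ansible-playbook, with both variable names again being placeholders. A check like this records that a review happened but does not replace the review itself.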
Additionally, it never hurts to have a teammate who understands the change present and watching while you make big changes, so they can catch and help prevent mistakes in its execution.
Context
StackExchange DevOps Q#309, answer score: 6