thomasorgeval.com
Notebook entry

Idempotent VPS hardening with Ansible

A practical, reusable baseline to provision and harden Linux VPS instances with Ansible, then keep them updated automatically.

If you set up a VPS once, manual hardening feels fine.

If you do it repeatedly, manual steps become a risk: a forgotten firewall rule, an inconsistent SSH config, missing updates, and no clear baseline to reuse.

This guide shows a practical way to move from ad-hoc setup to a repeatable and safer model using Ansible.

In this context, idempotence means you can run the same playbook multiple times and still converge to the same desired state, without duplicating users, re-applying risky changes blindly, or drifting configuration over time.
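As a small illustration of that definition, here is a sketch contrasting a task that is not idempotent with one that is. Both tasks are illustrative only and are not part of the baseline roles.

```yaml
# Not idempotent: appends a duplicate line on every run.
- name: Append setting with a shell redirect (avoid)
  ansible.builtin.shell: echo "PermitRootLogin no" >> /etc/ssh/sshd_config

# Idempotent: declares the end state and reports "changed" only when it edits the file.
- name: Ensure setting is present exactly once
  ansible.builtin.lineinfile:
    path: /etc/ssh/sshd_config
    regexp: "^#?PermitRootLogin"
    line: "PermitRootLogin no"
```

Running the second task twice leaves the file identical after the first run, which is the property the rest of this guide relies on.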

The goal is simple:

  • bootstrap a new server fast
  • enforce a consistent security baseline
  • keep maintenance automated over time
  • avoid “works on this one server only”

What this setup covers

This baseline focuses on concrete essentials that most projects need:

  • admin + automation users
  • SSH hardening (no password auth, no root login)
  • restrictive firewall defaults
  • Fail2Ban for brute-force protection
  • package updates and cleanup
  • weekly maintenance in CI

You can use it for personal products, internal tools, or small client workloads.

Repository layout

Here is a practical repo layout you can copy and adapt:

infra-as-code/
├── .github/
│   ├── actions/
│   │   └── setup-ansible/
│   │       └── action.yml                  # Shared CI setup (SSH key, deps, tooling)
│   └── workflows/
│       ├── ansible-setup-vps.yml           # Bootstrap a new VPS
│       ├── ansible-weekly-maintenance.yml  # Scheduled updates + checks
│       └── ansible-quality.yml             # Lint and syntax validation
└── ansible/
    ├── ansible.cfg
    ├── inventories/
    │   └── hosts.ini                       # Bootstrap inventory
    ├── group_vars/
    │   └── all.yml                         # Global defaults (security, maintenance)
    ├── playbooks/
    │   ├── setup-vps.yml                   # Main provisioning entry point
    │   ├── weekly-maintenance.yml          # Recurring maintenance
    │   └── ping.yml                        # Connectivity smoke test
    └── roles/
        ├── setup_vps/
        ├── ssh_hardening/
        ├── firewall/
        ├── fail2ban/
        └── docker/

This structure keeps responsibilities clear: playbooks orchestrate, roles implement, and workflows automate execution.

1) Start with a minimal inventory

Use hostnames that describe intent, not random IP labels.

# inventories/hosts.ini
[vps_public]
app-prod-1 ansible_host=203.0.113.10
app-dev-1  ansible_host=203.0.113.11

# global inventory vars
[all:vars]
ansible_user=root
ansible_port=22

For bootstrap, root access is acceptable once. After provisioning, switch to your automation user.

The [all:vars] section is the standard place for values that apply to every host. Without that header, the two variable lines would be parsed as host entries of vps_public, not as variables.
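If you prefer YAML inventories, the same hosts and global variables can be expressed as below. This is a sketch of an equivalent file; the name inventories/hosts.yml is an assumption, not part of the repository layout above.

```yaml
# inventories/hosts.yml (hypothetical YAML equivalent of hosts.ini)
all:
  vars:
    ansible_user: root   # bootstrap only; switch after provisioning
    ansible_port: 22
  children:
    vps_public:
      hosts:
        app-prod-1:
          ansible_host: 203.0.113.10
        app-dev-1:
          ansible_host: 203.0.113.11
```

One detail matters when you switch users later: an ansible_user set in the inventory takes precedence over the -u command-line flag. To connect as the automation user without editing the inventory, pass it as an extra variable, for example -e ansible_user=automation.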

2) Define baseline variables once

Keep defaults centralized in group_vars/all.yml so every host follows the same policy.

# group_vars/all.yml
admin_user: admin
automation_user: automation
deploy_user: deploy

ssh_port: 22
ssh_password_authentication: "no"
ssh_permit_root_login: "no"

ufw_allowed_tcp_ports:
  - "{{ ssh_port }}"

maintenance_reboot: true
maintenance_critical_services:
  - docker
  - ssh

The important part is consistency: one source of truth for security defaults.

3) Compose a readable playbook

Keep the main playbook short; push details into roles.

# playbooks/setup-vps.yml
---
- name: Setup and harden VPS
  hosts: vps_public
  become: true # run tasks with sudo/root privileges
  gather_facts: true # collect OS/network facts used by tasks and conditionals

  roles:
    - setup_vps
    - ssh_hardening
    - firewall
    - fail2ban
    - docker

This makes the setup easier to audit and evolve safely.

4) Example hardening tasks (simplified)

Create dedicated users

- name: Ensure automation user exists
  ansible.builtin.user:
    name: "{{ automation_user }}"
    shell: /bin/bash
    groups: sudo
    append: true # keep existing groups; do not overwrite them
    create_home: true # idempotent: creates home only if missing
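Because the SSH tasks further down disable password authentication and root login, the automation user needs a working public key before they run, or you risk locking yourself out. A minimal sketch, assuming the ansible.posix collection is installed and that a variable named automation_ssh_public_key (hypothetical, not defined in group_vars/all.yml above) holds the public key string:

```yaml
- name: Install SSH public key for automation user
  ansible.posix.authorized_key:
    user: "{{ automation_user }}"
    key: "{{ automation_ssh_public_key }}" # hypothetical variable holding the public key
    state: present # idempotent: adds the key only if it is missing
```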

Passwordless sudo for automation (controlled)

- name: Allow passwordless sudo for automation user
  ansible.builtin.copy:
    dest: "/etc/sudoers.d/90-{{ automation_user }}"
    owner: root
    group: root
    mode: "0440" # required permission for sudoers include files
    content: "{{ automation_user }} ALL=(ALL) NOPASSWD:ALL\n"
    validate: /usr/sbin/visudo -cf %s # syntax-check before writing, avoids lockout

Harden SSH

- name: Disable SSH password authentication
  ansible.builtin.lineinfile:
    path: /etc/ssh/sshd_config
    regexp: "^#?PasswordAuthentication" # replace both commented and uncommented lines
    line: "PasswordAuthentication {{ ssh_password_authentication }}"
    validate: /usr/sbin/sshd -t -f %s # syntax-check before writing, avoids lockout
  notify: Restart ssh # handler runs only when file changed

- name: Disable root login
  ansible.builtin.lineinfile:
    path: /etc/ssh/sshd_config
    regexp: "^#?PermitRootLogin" # same idea: match current state safely
    line: "PermitRootLogin {{ ssh_permit_root_login }}"
    validate: /usr/sbin/sshd -t -f %s
  notify: Restart ssh
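Both tasks notify a handler named Restart ssh, which must exist in the role. A minimal sketch of that handler, assuming a Debian or Ubuntu host where the service is called ssh:

```yaml
# roles/ssh_hardening/handlers/main.yml
- name: Restart ssh
  ansible.builtin.service:
    name: ssh # Debian/Ubuntu service name; RHEL-family systems use "sshd"
    state: restarted
```

Handlers run once at the end of the play, even if several tasks notified them, so sshd restarts a single time after all edits are in place.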

Restrictive firewall

- name: Deny all incoming by default
  community.general.ufw:
    direction: incoming
    policy: deny

- name: Allow required TCP ports
  community.general.ufw:
    rule: allow
    port: "{{ item }}"
    proto: tcp
  loop: "{{ ufw_allowed_tcp_ports }}" # iterate on declared ports, no hardcoded duplication

- name: Enable UFW
  community.general.ufw:
    state: enabled # idempotent: no-op if already enabled
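Fail2Ban is part of the baseline but has no example above. The following is a minimal sketch of what the fail2ban role could contain, not the author's actual role. The thresholds (maxretry, findtime, bantime) are illustrative values to tune, and on hosts that log only to the systemd journal you may also need backend = systemd in the jail.

```yaml
# roles/fail2ban/tasks/main.yml
- name: Install Fail2Ban
  ansible.builtin.apt:
    name: fail2ban
    state: present

- name: Configure the sshd jail
  ansible.builtin.copy:
    dest: /etc/fail2ban/jail.local
    owner: root
    group: root
    mode: "0644"
    content: |
      [sshd]
      enabled = true
      port = {{ ssh_port }}
      maxretry = 5
      findtime = 600
      bantime = 3600
  notify: Restart fail2ban

- name: Ensure Fail2Ban is enabled and running
  ansible.builtin.service:
    name: fail2ban
    state: started
    enabled: true

# roles/fail2ban/handlers/main.yml
- name: Restart fail2ban
  ansible.builtin.service:
    name: fail2ban
    state: restarted
```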

5) Add weekly maintenance (so the baseline does not drift)

Provisioning is day one. Operations are every week.

# playbooks/weekly-maintenance.yml
---
- name: Weekly maintenance
  hosts: vps_public
  become: true
  gather_facts: true
  serial: 1 # update one host at a time to reduce blast radius

  tasks:
    - name: Update apt cache
      ansible.builtin.apt:
        update_cache: true
        cache_valid_time: 3600 # skip refresh if cache is still fresh

    - name: Upgrade packages
      ansible.builtin.apt:
        upgrade: dist

    - name: Remove unused packages
      ansible.builtin.apt:
        autoremove: true
        autoclean: true
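The baseline variables define maintenance_reboot and maintenance_critical_services, but the playbook above does not use them yet. Here is a sketch of tasks you could append to the same tasks list, assuming Debian or Ubuntu hosts (which create /var/run/reboot-required when a reboot is pending) and systemd unit names that match the listed services:

```yaml
    - name: Check whether a reboot is required
      ansible.builtin.stat:
        path: /var/run/reboot-required
      register: reboot_required

    - name: Reboot if required and allowed
      ansible.builtin.reboot:
        reboot_timeout: 600 # seconds to wait for the host to come back
      when: maintenance_reboot | bool and reboot_required.stat.exists

    - name: Collect service states
      ansible.builtin.service_facts:

    - name: Assert critical services are running
      ansible.builtin.assert:
        that:
          - ansible_facts.services[item ~ '.service'].state == 'running'
        fail_msg: "{{ item }} is not running after maintenance"
      loop: "{{ maintenance_critical_services }}"
```

With serial: 1, a failed assertion stops the run before the next host is touched, so a bad upgrade does not spread.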

This turns “I should update servers” into “servers are updated every week by design”.
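To run this on a schedule, the repository layout includes a GitHub Actions workflow. Below is a minimal sketch of what it might look like. The secret names ANSIBLE_SSH_PRIVATE_KEY and ANSIBLE_KNOWN_HOSTS are hypothetical, the cron time is arbitrary, and the setup steps are inlined here rather than delegated to the shared setup-ansible action, whose inputs are not shown in this guide.

```yaml
# .github/workflows/ansible-weekly-maintenance.yml
name: Ansible weekly maintenance

on:
  schedule:
    - cron: "0 4 * * 1" # Mondays at 04:00 UTC
  workflow_dispatch: # allow manual runs

jobs:
  maintenance:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: ansible
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Install Ansible
        run: pip install ansible

      - name: Configure SSH access
        run: |
          install -m 700 -d ~/.ssh
          printf '%s\n' "$SSH_PRIVATE_KEY" > ~/.ssh/id_ed25519
          chmod 600 ~/.ssh/id_ed25519
          printf '%s\n' "$SSH_KNOWN_HOSTS" > ~/.ssh/known_hosts
        env:
          SSH_PRIVATE_KEY: ${{ secrets.ANSIBLE_SSH_PRIVATE_KEY }} # hypothetical secret name
          SSH_KNOWN_HOSTS: ${{ secrets.ANSIBLE_KNOWN_HOSTS }} # hypothetical secret name

      - name: Run weekly maintenance as the automation user
        run: >
          ansible-playbook -i inventories/hosts.ini
          playbooks/weekly-maintenance.yml
          -e ansible_user=automation
```

Pinning known_hosts from a secret keeps host key checking enabled in CI instead of turning it off.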

6) Run and verify

Typical bootstrap run:

ansible-playbook -u root -i inventories/hosts.ini playbooks/setup-vps.yml -v

After the first run, verify at least:

  • SSH login with non-root users works
  • password auth is disabled
  • firewall only exposes intended ports
  • Fail2Ban is active
  • Docker is running
  • maintenance workflow completes successfully
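Several of these checks can be automated. The sketch below is a hypothetical verification playbook (the file name verify.yml is an assumption) that reads the effective sshd configuration and checks service states. It assumes systemd hosts and the unit names fail2ban.service and docker.service.

```yaml
# playbooks/verify.yml (hypothetical)
---
- name: Verify hardened baseline
  hosts: vps_public
  become: true
  gather_facts: false

  tasks:
    - name: Read effective sshd configuration
      ansible.builtin.command: /usr/sbin/sshd -T
      register: sshd_effective
      changed_when: false # read-only check, never reports a change

    - name: Assert password and root login are disabled
      ansible.builtin.assert:
        that:
          # sshd -T prints keywords in lowercase
          - "'passwordauthentication no' in sshd_effective.stdout_lines"
          - "'permitrootlogin no' in sshd_effective.stdout_lines"

    - name: Collect service states
      ansible.builtin.service_facts:

    - name: Assert Fail2Ban and Docker are running
      ansible.builtin.assert:
        that:
          - ansible_facts.services[item].state == 'running'
      loop:
        - fail2ban.service
        - docker.service
```

Checking sshd -T rather than the file itself catches cases where a drop-in under /etc/ssh/sshd_config.d/ overrides the values set in sshd_config.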

Common mistakes to avoid

  • Mixing human and automation access in one account
  • Keeping root SSH enabled “temporarily” and forgetting it
  • Hardcoding secrets in repo variables
  • Skipping post-maintenance checks after package upgrades
  • Treating one successful run as enough (idempotence matters)

Final takeaway

The value is not “using Ansible”.

The value is building a security and operations baseline that is:

  • repeatable
  • auditable
  • fast to bootstrap
  • stable over time

Once you have that, each new VPS is no longer a custom snowflake. It becomes another host that follows the same production rules.