Fast Provisioning with Netbox and Ansible

Motivation

Administrators of services need new servers fast. They do not want to wait until the administrator of the virtualization environment finds time to set up their new machines.

The solution is automation. The administrators get access to the inventory tool and create their new machines themselves with a few clicks. Clever automation in the background then provisions the new machines. At best it takes only a few minutes to set up a new server in such an environment.

In this article I describe the provisioning of RHEL 8 / CentOS 8 machines using kickstart. Other distributions have other means of automated installation. It shouldn’t be too difficult to adapt the setup described here to other Linux variants.

Setup

In our environment we use Netbox as the central Source of Truth: what is documented in Netbox should exist. To enforce Netbox's view of the world in the real environment we use Ansible, which reads Netbox as a dynamic inventory. Ansible's Netbox documentation describes how to set this up.
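A minimal inventory configuration could look like the sketch below. Depending on the installed version the plugin is called netbox (shipped with Ansible 2.9) or netbox.netbox.nb_inventory (from the netbox collection); the endpoint, the token handling and the grouping are examples only and have to be adapted:

# netbox_inventory.yml -- a minimal sketch of a Netbox dynamic inventory
plugin: netbox.netbox.nb_inventory
api_endpoint: "http://[::1]"
token: "0123456789abcdef0123456789abcdef01234567"   # placeholder; better read from a vault or the environment
validate_certs: false
group_by:
  - sites
  - device_roles

The playbook is then run against this file, for example with ansible-playbook -i netbox_inventory.yml provision.yml (provision.yml being whatever we named our playbook).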

Any administrator with access to Netbox can add a virtual server that should run on a cluster also defined in Netbox. To add a new virtual server, the number of CPUs, the amount of RAM and the disk space need to be specified.
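Clicking is not even necessary: the record can also be created via the Netbox API, for example with the netbox_virtual_machine module from the netbox Ansible collection. The following task is only a sketch, with names and sizes chosen as examples:

- name: Define the new virtual machine in Netbox
  netbox.netbox.netbox_virtual_machine:
    netbox_url: "http://[::1]"
    netbox_token: "{{ netbox_token }}"
    data:
      name: newserver01       # example name
      cluster: cluster01      # must match a cluster defined in Netbox
      vcpus: 2
      memory: 4096            # MB
      disk: 20                # GB
    state: present
  delegate_to: localhost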

Now Ansible can do its work in the background. Since we use RHEL / CentOS, the tasks are:

  • Find the right cluster and node to set up the new server

  • Prepare kickstart

  • Prepare storage

  • Create the server

  • Initialize the server

Finding the Correct Cluster

When Ansible wants to create the new server it has to know on which device it should execute its tasks. The inventory data of the virtual machine only contains the name and the ID of the cluster it should run on. So let's fetch this data:

- name: Get the cluster for virtual machine
  uri:
    url: "http://[::1]/api/virtualization/virtual-machines/?name={{ inventory_hostname }}"
    method: GET
    return_content: yes
    headers:
      Accept: application/json
      Authorization: Token {{netbox_token}}
  register: nb_vm
  delegate_to: localhost

- name: Get a list of hosts in the cluster
  uri:
    url: "http://[::1]/api/dcim/devices/?cluster_id={{ nb_vm.json.results[0].cluster.id }}"
    method: GET
    return_content: yes
    headers:
      Accept: application/json
      Authorization: Token {{netbox_token}}
  register: nb_hosts
  delegate_to: localhost

- name: Finally select one random host from that list
  set_fact:
    host: "{{ nb_hosts.json.results[nb_hosts.json.count | random | int].name }}"

Please note that the token to access Netbox is stored in the netbox_token variable. Normally you would fetch it from a vault.
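A common pattern, shown here only as a sketch (the file path and the token value are placeholders), is to keep the token in a vars file and encrypt it with ansible-vault:

# group_vars/all/vault.yml -- encrypt with: ansible-vault encrypt group_vars/all/vault.yml
netbox_token: "0123456789abcdef0123456789abcdef01234567"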

Preparing the Storage

We use logical volumes as the storage for the virtual servers. They are located on a DRBD-replicated device for high availability. In order to create the logical volume we need the name of the volume group on that specific cluster node. The name is stored in a custom field of Netbox called VGName. Therefore we retrieve the information about the node we selected from Netbox again.

- name: Get the VG name on host "{{ host }}"
  uri:
    url: "http://[::1]/api/dcim/devices/?name={{ host }}"
    method: GET
    return_content: yes
    headers:
      Accept: application/json
      Authorization: Token {{netbox_token}}
  register: nb_host
  delegate_to: localhost

- name: Set fact of VG name
  set_fact:
    vgname: "{{ nb_host.json.results[0].custom_fields.VGName }}"

Creating the Disk with the Kickstart Information

The next task is to create a small disk image on the host that carries the kickstart information. The following entries in the playbook will do:

- name: Create basic disk image
  # dd_cmd is a variable defined as: dd if=/dev/zero \
  #   of=/var/lib/libvirt/images/ks-"{{ inventory_hostname }}".img bs=1M count=1
  # I found no other way to pass the special characters to the shell module.
  shell: "{{ dd_cmd }}"
  delegate_to: "{{ host }}"

- name: Create filesystem on ks file
  shell: "mkfs -t ext4 /var/lib/libvirt/images/ks-{{ inventory_hostname }}.img"
  delegate_to: "{{ host }}"

- name: Give the filesystem the correct label
  shell: "e2label /var/lib/libvirt/images/ks-{{ inventory_hostname }}.img  OEMDRV"
  delegate_to: "{{ host }}"

- name: Mount ks filesystem
  shell: "mount -o loop /var/lib/libvirt/images/ks-{{ inventory_hostname }}.img /mnt/ks/"
  delegate_to: "{{ host }}"

- name: Copy kickstart file from template
  template:
    src: ks-template.j2
    dest: "/mnt/ks/ks.cfg"
  delegate_to: "{{ host }}"

- name: Umount ks file
  shell: "umount /mnt/ks"
  delegate_to: "{{ host }}"

Please note that these tasks are delegated to the node we selected before. The OEMDRV label is important: the Anaconda installer automatically picks up a kickstart file named ks.cfg from a volume with that label. The kickstart template, of course, also makes use of the variables defined in Ansible. Since the kickstart file heavily depends on the organisation, it makes no sense to show our complete kickstart file. To show the use of variables, the snippet below defines the network configuration of the new host as given in Netbox:

# Part of the ks-template.j2
# Network information
network  --bootproto=static --device=ens1 --nameserver=2001:db8::53 --ipv6={{ primary_ip6 }}/64 --activate --ipv6gateway=2001:db8::1
network  --hostname={{ inventory_hostname }}.sys4.de

Creating the Storage

Creating the storage for the new server is no problem at all with the information gathered above. Ansible offers the lvol module for this task.

- name: Create LV for new host
  lvol:
    lv: "{{ inventory_hostname }}"
    size: "{{ disk }}G"
    vg: "{{ vgname }}"
  delegate_to: "{{ host }}"

Please note that the size is also derived from the definition in Netbox.

Creation of the New Server

The final step when creating the server is to define it in libvirt and to start it. Within Ansible we can use the virt module.

- name: Create VM
  virt:
    command: define
    xml: "{{ lookup('template', 'c8-template.j2') }}"
    autostart: no
  delegate_to: "{{ host }}"

- name: Start new VM
  virt:
    name: "{{ inventory_hostname }}"
    command: start
  delegate_to: "{{ host }}"

The define command of the virt module takes an XML parameter, which again can be a template. We exported the definition of an existing machine and modified it with the variables we need. The snippet below shows only a few interesting parts of the definition that make use of the variables.

<domain type='kvm'>
  <name>{{ inventory_hostname }}</name>
  <uuid>{{ (999999999999999999999 | random | string + (lookup('pipe', 'date +%s%N'))) | to_uuid() }}</uuid>
  <metadata>
    <libosinfo:libosinfo xmlns:libosinfo="http://libosinfo.org/xmlns/libvirt/domain/1.0">
      <libosinfo:os id="http://redhat.com/rhel/8.0"/>
    </libosinfo:libosinfo>
  </metadata>
  <memory unit='MiB'>{{ memory }}</memory>
  <currentMemory unit='MiB'>{{ memory }}</currentMemory>
  <vcpu placement='static'>{{ vcpus }}</vcpu>
  ...
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='cdrom'>
      <driver name='qemu' type='raw'/>
      <source file='/var/lib/libvirt/images/CentOS-8-x86_64-1905-dvd1.iso'/>
      <target dev='hdc' bus='ide'/>
      <readonly/>
      <address type='drive' controller='0' bus='1' target='0' unit='0'/>
    </disk>
    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native'/>
      <source dev='/dev/{{ vgname }}/{{ inventory_hostname }}'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
    </disk>
    <disk type='file' device='disk'>
      <driver name='qemu' type='raw' cache='none' io='native'/>
      <source file='/var/lib/libvirt/images/ks-{{ inventory_hostname }}.img'/>
      <target dev='vdb' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x0b' function='0x0'/>
    </disk>
  </devices>
  ...
</domain>

The number of CPUs, the RAM and the disk(s) are defined in Netbox. The definition of the virtual machine just uses these variables.

Final Tasks

The final task for the new machine is to modify its definition so that it boots from the freshly installed disk and not from the install CD. The work is done by the shell command virt-xml on the cluster node. The problem is the timing: the command can only be fired after the machine has started correctly. After that, Ansible has to wait until the kickstart process has finished its work and can then reboot the machine.
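A rough sketch of how this could look is shown below. The virt-xml arguments, the retry timing and the assumption that the machine powers itself off when kickstart has finished are examples and have to be adapted to the actual setup:

# A sketch of the final tasks; arguments and timings are assumptions.
- name: Boot from the installed disk instead of the install CD
  command: "virt-xml {{ inventory_hostname }} --edit --boot hd"
  delegate_to: "{{ host }}"

- name: Wait until the kickstart installation has shut the machine down
  virt:
    name: "{{ inventory_hostname }}"
    command: status
  register: vm_status
  until: vm_status.status == 'shutdown'
  retries: 60
  delay: 30
  delegate_to: "{{ host }}"

- name: Start the freshly installed machine
  virt:
    name: "{{ inventory_hostname }}"
    command: start
  delegate_to: "{{ host }}"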

Refining these tasks for a production environment is left to the reader as an exercise.

Feedback, improvements and other comments to ms@sys4.de.

Michael Schwartzkopff, 20 May 2020