Deploying on OpenStack with rootVolume and etcd on local disk
As a day 2 operation, you can resolve and prevent performance issues of your Red Hat OpenStack Platform (RHOSP) installation by moving etcd from a root volume (provided by OpenStack Cinder) to a dedicated ephemeral local disk.
Deploying RHOSP on local disk
If you have an existing RHOSP cloud, you can move etcd from a root volume to a dedicated ephemeral local disk.
Prerequisites

- You have an OpenStack cloud with a working Cinder.
- Your OpenStack cloud has at least 75 GB of available storage to accommodate 3 root volumes for the OpenShift control plane.
- The OpenStack cloud is deployed with Nova ephemeral storage that uses a local storage backend and not rbd.
Procedure

- Create a Nova flavor for the control plane with at least 10 GB of ephemeral disk by running the following command, replacing the values for --ram, --disk, and <flavor_name> based on your environment:

  $ openstack flavor create --ram 16384 --disk 0 --ephemeral 10 --vcpus 4 <flavor_name>
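  As an optional check that the flavor was created with the expected ephemeral disk size, you can inspect it with the OpenStack CLI; the ephemeral size is reported in the OS-FLV-EXT-DATA:ephemeral field:

  $ openstack flavor show <flavor_name> -c ram -c vcpus -c disk -c "OS-FLV-EXT-DATA:ephemeral"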
- Deploy a cluster with root volumes for the control plane; for example:

  Example YAML file

  # ...
  controlPlane:
    name: master
    platform:
      openstack:
        type: ${CONTROL_PLANE_FLAVOR}
        rootVolume:
          size: 25
          types:
          - ${CINDER_TYPE}
    replicas: 3
  # ...
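  The ${CONTROL_PLANE_FLAVOR} and ${CINDER_TYPE} values are placeholders for the flavor you created and your Cinder volume type. As a minimal sketch, assuming you keep the snippet above in a template file (the template file name here is only an example) and have the gettext envsubst utility available, you could render them like this:

  $ export CONTROL_PLANE_FLAVOR=<flavor_name>
  $ export CINDER_TYPE=<cinder_volume_type>
  $ envsubst < install-config.yaml.template > <installation_directory>/install-config.yaml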
- Deploy the cluster you created by running the following command:

  $ openshift-install create cluster --dir <installation_directory>

  For <installation_directory>, specify the location of the customized ./install-config.yaml file that you previously created.
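  If you need more detail while the installer runs, openshift-install also accepts a --log-level option; for example:

  $ openshift-install create cluster --dir <installation_directory> --log-level=debug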
- Verify that the cluster you deployed is healthy before proceeding to the next step by running the following command:

  $ oc wait clusteroperators --all --for=condition=Progressing=false

  This command ensures that the cluster Operators are finished progressing and that the cluster is not deploying or updating.
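  If the wait command times out, you can list the individual cluster Operators and their conditions to find the one that is still progressing or degraded:

  $ oc get clusteroperators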
- Create a file named 98-var-lib-etcd.yaml by using the following YAML file:

  apiVersion: machineconfiguration.openshift.io/v1
  kind: MachineConfig
  metadata:
    labels:
      machineconfiguration.openshift.io/role: master
    name: 98-var-lib-etcd
  spec:
    config:
      ignition:
        version: 3.5.0
      systemd:
        units:
        - contents: |
            [Unit]
            Description=Mount local-etcd to /var/lib/etcd

            [Mount]
            What=/dev/disk/by-label/local-etcd
            Where=/var/lib/etcd
            Type=xfs
            Options=defaults,prjquota

            [Install]
            WantedBy=local-fs.target
          enabled: true
          name: var-lib-etcd.mount
        - contents: |
            [Unit]
            Description=Create local-etcd filesystem
            DefaultDependencies=no
            After=local-fs-pre.target
            ConditionPathIsSymbolicLink=!/dev/disk/by-label/local-etcd

            [Service]
            Type=oneshot
            RemainAfterExit=yes
            ExecStart=/bin/bash -c "[ -L /dev/disk/by-label/ephemeral0 ] || ( >&2 echo Ephemeral disk does not exist; /usr/bin/false )"
            ExecStart=/usr/sbin/mkfs.xfs -f -L local-etcd /dev/disk/by-label/ephemeral0

            [Install]
            RequiredBy=dev-disk-by\x2dlabel-local\x2detcd.device
          enabled: true
          name: create-local-etcd.service
        - contents: |
            [Unit]
            Description=Migrate existing data to local etcd
            After=var-lib-etcd.mount
            Before=crio.service
            Requisite=var-lib-etcd.mount
            ConditionPathExists=!/var/lib/etcd/member
            ConditionPathIsDirectory=/sysroot/ostree/deploy/rhcos/var/lib/etcd/member

            [Service]
            Type=oneshot
            RemainAfterExit=yes
            ExecStart=/bin/bash -c "if [ -d /var/lib/etcd/member.migrate ]; then rm -rf /var/lib/etcd/member.migrate; fi"
            ExecStart=/usr/bin/cp -aZ /sysroot/ostree/deploy/rhcos/var/lib/etcd/member/ /var/lib/etcd/member.migrate
            ExecStart=/usr/bin/mv /var/lib/etcd/member.migrate /var/lib/etcd/member

            [Install]
            RequiredBy=var-lib-etcd.mount
          enabled: true
          name: migrate-to-local-etcd.service
        - contents: |
            [Unit]
            Description=Relabel /var/lib/etcd
            After=migrate-to-local-etcd.service
            Before=crio.service
            Requisite=var-lib-etcd.mount

            [Service]
            Type=oneshot
            RemainAfterExit=yes
            ExecCondition=/bin/bash -c "[ -n \"$(restorecon -nv /var/lib/etcd)\" ]"
            ExecStart=/usr/sbin/restorecon -R /var/lib/etcd

            [Install]
            RequiredBy=var-lib-etcd.mount
          enabled: true
          name: relabel-var-lib-etcd.service

  Notes on this configuration:

  - The etcd database must be mounted by the device, not a label, to ensure that systemd generates the device dependency used in this config to trigger filesystem creation.
  - The create-local-etcd.service unit does not run if the file system /dev/disk/by-label/local-etcd already exists.
  - The unit fails with an alert message if /dev/disk/by-label/ephemeral0 does not exist.
  - The migrate-to-local-etcd.service unit migrates existing data to the local etcd database. It does so after /var/lib/etcd is mounted, but before CRI-O starts, so etcd is not running yet.
  - The migration requires that etcd is mounted and does not contain a member directory, but the ostree does.
  - Any previous migration state is cleaned up first.
  - The data is copied and then moved in separate steps to ensure atomic creation of a complete member directory.
  - The relabel-var-lib-etcd.service unit performs a quick check of the mount point directory before performing a full recursive relabel. If restorecon in the file path /var/lib/etcd cannot relabel the directory, the recursive relabel is not performed.
  Warning

  After you apply the 98-var-lib-etcd.yaml file to the system, do not remove it. Removing this file will break etcd members and lead to system instability.

  If a rollback is necessary, modify the ControlPlaneMachineSet object to use a flavor that does not include ephemeral disks. This change regenerates the control plane nodes without using ephemeral disks for the etcd partition, which avoids issues related to the 98-var-lib-etcd.yaml file. It is safe to remove the 98-var-lib-etcd.yaml file only after the update to the ControlPlaneMachineSet object is complete and no control plane nodes are using ephemeral disks.
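  As a rough sketch of that rollback path: the ControlPlaneMachineSet object is named cluster and lives in the openshift-machine-api namespace, so you would edit it and point the flavor in its OpenStack provider specification at one without an ephemeral disk. The exact field path depends on your provider specification, so verify it before saving:

  $ oc -n openshift-machine-api edit controlplanemachineset cluster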
- Create the new MachineConfig object by running the following command:

  $ oc create -f 98-var-lib-etcd.yaml

  Note
  Moving the etcd database onto the local disk of each control plane machine takes time.
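  While the change rolls out, you can watch the progress of the master machine config pool:

  $ oc get machineconfigpool master -w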
- Verify that the etcd database has been transferred to the local disk of each control plane machine by running the following commands:
  - Wait for the master machine config pool to finish updating by running the following command:

    $ oc wait --timeout=45m --for=condition=Updating=false machineconfigpool/master
  - Verify that the control plane nodes are ready by running the following command:

    $ oc wait node --selector='node-role.kubernetes.io/master' --for condition=Ready --timeout=30s
  - Verify that the cluster Operators are running in the cluster by running the following command:

    $ oc wait clusteroperators --timeout=30m --all --for=condition=Progressing=false
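  As an additional, optional check that etcd now resides on the local ephemeral disk, you can inspect the mount on a control plane node from a debug pod; the node name here is a placeholder:

  $ oc debug node/<control_plane_node> -- chroot /host df -T /var/lib/etcd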