Creating a cluster with multi-architecture compute machines on IBM Z and IBM LinuxONE with z/VM
To create a cluster with multi-architecture compute machines on IBM Z® and IBM® LinuxONE (s390x) with z/VM, you must have an existing single-architecture x86_64 cluster. You can then add s390x compute machines to your OpenShift Container Platform cluster.
Before you can add s390x nodes to your cluster, you must upgrade your cluster to one that uses the multi-architecture payload. For more information on migrating to the multi-architecture payload, see Migrating to a cluster with multi-architecture compute machines.
The following procedures explain how to create a RHCOS compute machine using a z/VM instance. This will allow you to add s390x nodes to your cluster and deploy a cluster with multi-architecture compute machines.
To create an IBM Z® or IBM® LinuxONE (s390x) cluster with multi-architecture compute machines on x86_64, follow the instructions for
Installing a cluster on IBM Z® and IBM® LinuxONE. You can then add x86_64 compute machines as described in Creating a cluster with multi-architecture compute machines on bare metal, IBM Power, or IBM Z.
Note
Before adding a secondary architecture node to your cluster, it is recommended to install the Multiarch Tuning Operator, and deploy a ClusterPodPlacementConfig object. For more information, see Managing workloads on multi-architecture clusters by using the Multiarch Tuning Operator.
Verifying cluster compatibility
Before you can start adding compute nodes of different architectures to your cluster, you must verify that your cluster is multi-architecture compatible.
-
You installed the OpenShift CLI (
oc).
-
Log in to the OpenShift CLI (
oc). -
You can check that your cluster uses the architecture payload by running the following command:
$ oc adm release info -o jsonpath="{ .metadata.metadata}"
-
If you see the following output, your cluster is using the multi-architecture payload:
{ "release.openshift.io/architecture": "multi", "url": "https://access.redhat.com/errata/<errata_version>" }You can then begin adding multi-arch compute nodes to your cluster.
-
If you see the following output, your cluster is not using the multi-architecture payload:
{ "url": "https://access.redhat.com/errata/<errata_version>" }Important
To migrate your cluster so the cluster supports multi-architecture compute machines, follow the procedure in Migrating to a cluster with multi-architecture compute machines.
Creating RHCOS machines on IBM Z with z/VM
You can create more Red Hat Enterprise Linux CoreOS (RHCOS) compute machines running on IBM Z® with z/VM and attach them to your existing cluster.
-
You have a domain name server (DNS) that can perform hostname and reverse lookup for the nodes.
-
You have an HTTP or HTTPS server running on your provisioning machine that is accessible to the machines you create.
-
Extract the Ignition config file from the cluster by running the following command:
$ oc extract -n openshift-machine-api secret/worker-user-data-managed --keys=userData --to=- > worker.ign -
Upload the
worker.ignIgnition config file you exported from your cluster to your HTTP server. Note the URL of this file. -
You can validate that the Ignition file is available on the URL. The following example gets the Ignition config file for the compute node:
$ curl -k http://<http_server>/worker.ign -
Download the RHEL live
kernel,initramfs, androotfsfiles by running the following commands:$ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' \ | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.kernel.location')$ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' \ | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.initramfs.location')$ curl -LO $(oc -n openshift-machine-config-operator get configmap/coreos-bootimages -o jsonpath='{.data.stream}' \ | jq -r '.architectures.s390x.artifacts.metal.formats.pxe.rootfs.location') -
Move the downloaded RHEL live
kernel,initramfs, androotfsfiles to an HTTP or HTTPS server that is accessible from the RHCOS guest you want to add. -
Create a parameter file for the guest. The following parameters are specific for the virtual machine:
-
Optional: To specify a static IP address, add an
ip=parameter with the following entries, with each separated by a colon:-
The IP address for the machine.
-
An empty string.
-
The gateway.
-
The netmask.
-
The machine host and domain name in the form
hostname.domainname. If you omit this value, RHCOS obtains the hostname through a reverse DNS lookup. -
The network interface name. If you omit this value, RHCOS applies the IP configuration to all available interfaces.
-
The value
none.
-
-
For
coreos.inst.ignition_url=, specify the URL to theworker.ignfile. Only HTTP and HTTPS protocols are supported. -
For
coreos.live.rootfs_url=, specify the matching rootfs artifact for thekernelandinitramfsyou are booting. Only HTTP and HTTPS protocols are supported. -
For installations on DASD-type disks, complete the following tasks:
-
For
coreos.inst.install_dev=, specify/dev/dasda. -
Use
rd.dasd=to specify the DASD where RHCOS is to be installed. -
You can adjust further parameters if required.
The following is an example parameter file,
additional-worker-dasd.parm:cio_ignore=all,!condev rd.neednet=1 \ console=ttysclp0 \ coreos.inst.install_dev=/dev/dasda \ coreos.inst.ignition_url=http://<http_server>/worker.ign \ coreos.live.rootfs_url=http://<http_server>/rhcos-<version>-live-rootfs.<architecture>.img \ ip=<ip>::<gateway>:<netmask>:<hostname>::none nameserver=<dns> \ rd.znet=qeth,0.0.bdf0,0.0.bdf1,0.0.bdf2,layer2=1,portno=0 \ rd.dasd=0.0.3490 \ zfcp.allow_lun_scan=0Write all options in the parameter file as a single line and make sure that you have no newline characters.
-
-
For installations on FCP-type disks, complete the following tasks:
-
Use
rd.zfcp=<adapter>,<wwpn>,<lun>to specify the FCP disk where RHCOS is to be installed. For multipathing, repeat this step for each additional path.Note
When you install with multiple paths, you must enable multipathing directly after the installation, not at a later point in time, as this can cause problems.
-
Set the install device as:
coreos.inst.install_dev=/dev/sda.Note
If additional LUNs are configured with NPIV, FCP requires
zfcp.allow_lun_scan=0. If you must enablezfcp.allow_lun_scan=1because you use a CSI driver, for example, you must configure your NPIV so that each node cannot access the boot partition of another node. -
You can adjust further parameters if required.
Important
Additional postinstallation steps are required to fully enable multipathing. For more information, see “Enabling multipathing with kernel arguments on RHCOS" in Machine configuration.
The following is an example parameter file,
additional-worker-fcp.parmfor a worker node with multipathing:cio_ignore=all,!condev rd.neednet=1 \ console=ttysclp0 \ coreos.inst.install_dev=/dev/sda \ coreos.live.rootfs_url=http://<http_server>/rhcos-<version>-live-rootfs.<architecture>.img \ coreos.inst.ignition_url=http://<http_server>/worker.ign \ ip=<ip>::<gateway>:<netmask>:<hostname>::none nameserver=<dns> \ rd.znet=qeth,0.0.bdf0,0.0.bdf1,0.0.bdf2,layer2=1,portno=0 \ zfcp.allow_lun_scan=0 \ rd.zfcp=0.0.1987,0x50050763070bc5e3,0x4008400B00000000 \ rd.zfcp=0.0.19C7,0x50050763070bc5e3,0x4008400B00000000 \ rd.zfcp=0.0.1987,0x50050763071bc5e3,0x4008400B00000000 \ rd.zfcp=0.0.19C7,0x50050763071bc5e3,0x4008400B00000000Write all options in the parameter file as a single line and make sure that you have no newline characters.
-
-
-
Transfer the
initramfs,kernel, parameter files, and RHCOS images to z/VM, for example, by using FTP. For details about how to transfer the files with FTP and boot from the virtual reader, see Booting the installation on IBM Z® to install RHEL in z/VM. -
Punch the files to the virtual reader of the z/VM guest virtual machine.
See PUNCH in IBM® Documentation.
Tip
You can use the CP PUNCH command or, if you use Linux, the vmur command to transfer files between two z/VM guest virtual machines.
-
Log in to CMS on the bootstrap machine.
-
IPL the bootstrap machine from the reader by running the following command:
$ ipl c
See IPL in IBM® Documentation.
Approving the certificate signing requests for your machines
To add machines to a cluster, verify the status of the certificate signing requests (CSRs) generated for each machine. If manual approval is required, approve the client requests first, followed by the server requests.
-
You added machines to your cluster.
-
Confirm that the cluster recognizes the machines:
$ oc get nodesExample outputNAME STATUS ROLES AGE VERSION master-0 Ready master 63m v1.34.2 master-1 Ready master 63m v1.34.2 master-2 Ready master 64m v1.34.2The output lists all of the machines that you created.
Note
The preceding output might not include the compute nodes, also known as worker nodes, until some CSRs are approved.
-
Review the pending CSRs and ensure that you see the client requests with the
PendingorApprovedstatus for each machine that you added to the cluster:$ oc get csrExample outputNAME AGE REQUESTOR CONDITION csr-8b2br 15m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending csr-8vnps 15m system:serviceaccount:openshift-machine-config-operator:node-bootstrapper Pending ...In this example, two machines are joining the cluster. You might see more approved CSRs in the list.
-
If the CSRs were not approved, after all of the pending CSRs for the machines you added are in
Pendingstatus, approve the CSRs for your cluster machines:Note
Because the CSRs rotate automatically, approve your CSRs within an hour of adding the machines to the cluster. If you do not approve them within an hour, the certificates will rotate, and more than two certificates will be present for each node. You must approve all of these certificates. After the client CSR is approved, the Kubelet creates a secondary CSR for the serving certificate, which requires manual approval. Then, subsequent serving certificate renewal requests are automatically approved by the
machine-approverif the Kubelet requests a new certificate with identical parameters.Note
For clusters running on platforms that are not machine API enabled, such as bare metal and other user-provisioned infrastructure, you must implement a method of automatically approving the kubelet serving certificate requests (CSRs). If a request is not approved, then the
oc exec,oc rsh, andoc logscommands cannot succeed, because a serving certificate is required when the API server connects to the kubelet. Any operation that contacts the Kubelet endpoint requires this certificate approval to be in place. The method must watch for new CSRs, confirm that the CSR was submitted by thenode-bootstrapperservice account in thesystem:nodeorsystem:admingroups, and confirm the identity of the node.-
To approve them individually, run the following command for each valid CSR:
$ oc adm certificate approve <csr_name>where:
<csr_name>-
Specifies the name of a CSR from the list of current CSRs.
-
To approve all pending CSRs, run the following command:
$ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs --no-run-if-empty oc adm certificate approveNote
Some Operators might not become available until some CSRs are approved.
-
-
Now that your client requests are approved, you must review the server requests for each machine that you added to the cluster:
$ oc get csrExample outputNAME AGE REQUESTOR CONDITION csr-bfd72 5m26s system:node:ip-10-0-50-126.us-east-2.compute.internal Pending csr-c57lv 5m26s system:node:ip-10-0-95-157.us-east-2.compute.internal Pending ... -
If the remaining CSRs are not approved, and are in the
Pendingstatus, approve the CSRs for your cluster machines:-
To approve them individually, run the following command for each valid CSR:
$ oc adm certificate approve <csr_name>where:
<csr_name>-
Specifies the name of a CSR from the list of current CSRs.
-
To approve all pending CSRs, run the following command:
$ oc get csr -o go-template='{{range .items}}{{if not .status}}{{.metadata.name}}{{"\n"}}{{end}}{{end}}' | xargs oc adm certificate approve
-
-
After all client and server CSRs have been approved, the machines have the
Readystatus. Verify this by running the following command:$ oc get nodesExample outputNAME STATUS ROLES AGE VERSION master-0 Ready master 73m v1.34.2 master-1 Ready master 73m v1.34.2 master-2 Ready master 74m v1.34.2 worker-0 Ready worker 11m v1.34.2 worker-1 Ready worker 11m v1.34.2Note
It can take a few minutes after approval of the server CSRs for the machines to transition to the
Readystatus.