
Installing OKD 4.22 Single-Node on Bare Metal: A Homelab Guide That Actually Works

Author
Ifesinachi Osude
Writing about infrastructure, automation, observability, networking, security, and homelab engineering.
OKD Homelab Series - This article is part of a series.
Part 1: This Article

Install OKD on Bare Metal

I just got OKD 4.22.0-okd-scos.0 running on a single Supermicro server in my homelab, and the install eventually went smoothly. Getting to “smooth” took multiple failed attempts and a lot of swearing at DNS. This post is the guide I wish I’d had on day one.

If you want to install OKD as Single-Node OpenShift (SNO) on bare metal with bootstrap-in-place — no separate bootstrap host, no fancy infrastructure — read on.

What we’re building

  • One physical server running both the control plane and workloads (SNO)
  • OKD 4.22 SCOS (CentOS Stream CoreOS)
  • Bootstrap-in-place install: a single live ISO that installs OKD onto the server’s disk, then pivots itself into being the cluster
  • Bring-your-own DNS, no cloud, no DHCP magic, no Metal3

The end result: a fully working OpenShift web console at https://console-openshift-console.apps.okd.example.com.

Why not just use the Assisted Installer pod?

If you’ve Googled “OKD homelab install,” you’ve seen the Assisted Installer — a containerized service you run with podman play kube that gives you a slick wizard UI.

It does not work with OKD-SCOS. I burned an afternoon proving this. The assisted-service hardcodes a lookup for an okd-rpms image tag inside the release image:

error: no image tag "okd-rpms" exists in the release image
quay.io/okd/scos-release@sha256:...
failed to get okd-rpms image from release image

Old FCOS-based OKD had that tag. OKD-SCOS (4.18 and later) does not. Every ISO build for any modern OKD release will fail the same way. Don’t waste your afternoon. Use bootstrap-in-place (this guide) or openshift-install agent create image (newer agent-based installer, also works for SCOS).

Hardware checklist

  • CPU: 8+ cores
  • RAM: 16 GB+ (32+ recommended if you actually want to deploy anything)
  • Disk: 100 GB+ on the target install disk (mine is /dev/sda)
  • NIC: wired Ethernet on a stable subnet, static IP or reliable DHCP reservation
  • IPMI / BMC with virtual media support (Supermicro IPMI, iDRAC, iLO, etc.). You’ll need this to mount the install ISO without physically touching the box.
  • A second machine to act as a “bootstrap host” — any Linux box on the same network with internet access. This runs openshift-install, hosts the generated ignition, and watches the install. Mine is a small Rocky Linux 10 VM.

Pre-flight: the pull secret

OKD pulls some container images from quay.io/openshift-release-dev which requires authentication. Grab a pull secret from https://console.redhat.com/openshift/install/pull-secret (free Red Hat account is enough — you’re not paying for a subscription, just identifying as not-a-bot).

Save it; you’ll paste it into install-config.yaml in a minute.


Step 1: DNS — the most-skipped, most-critical step

In my experience, DNS causes more failed OKD homelab installs than anything else. Get this right before you touch the install media or you will hate everything.

You need forward records, a wildcard for the apps domain, and a clean reverse record (PTR) for the master node’s IP. Pick a cluster name and base domain. I’ll use okd and example.com throughout — your DNS becomes:

| Record | Type | Target | Why |
|---|---|---|---|
| api.okd.example.com | A | <master-ip> | External API endpoint |
| api-int.okd.example.com | A | <master-ip> | Internal API endpoint; kubelet uses this |
| master-0.okd.example.com | A | <master-ip> | Node hostname |
| *.apps.okd.example.com | A | <master-ip> | Wildcard for OKD ingress |
| <master-ip> (reverse) | PTR | master-0.okd.example.com. | MUST point at a single concrete hostname |
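If you'd rather generate that record list than retype it, here is a minimal sketch. The function name okd_dns_records and the address 10.0.0.10 are my own placeholders, not anything OKD ships; the output is a plain checklist, not a zone file for any particular DNS server.

```shell
# Print the DNS records a single-node cluster needs, one per line.
# Arguments: cluster name, base domain, master IPv4 address.
# (Function name and output format are illustrative assumptions.)
okd_dns_records() {
  cluster=$1
  domain=$2
  ip=$3
  # Forward records: three fixed names plus the apps wildcard.
  for host in api api-int master-0 '*.apps'; do
    printf '%s.%s.%s A %s\n' "$host" "$cluster" "$domain" "$ip"
  done
  # The PTR must name exactly one host, never the wildcard.
  printf '%s PTR master-0.%s.%s.\n' "$ip" "$cluster" "$domain"
}
```

Running okd_dns_records okd example.com 10.0.0.10 prints the five rows from the table with the IP filled in, which is handy to diff against what your DNS server actually holds.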

The DNS landmine I want you to avoid

When CoreOS boots for the first time and nothing else supplies a hostname (no DHCP-provided name, no static hostname), it falls back to a reverse DNS lookup of its own IP and uses the result as the system hostname. If your PTR for the master IP is wrong, your node ends up with a broken hostname baked into all cluster certificates and the install is unrecoverable in place.

On my first attempt, my master booted up and named itself console-openshift-console.apps.okd.example.com.example.com — that’s the wildcard *.apps getting matched by reverse DNS, with the search domain example.com appended for good measure. The entire control plane’s CA, etcd certs, and kubelet identity baked that absurd name in. Wiping and reinstalling was the only fix.

Rules:

  • The forward wildcard *.apps.okd.example.com → <master-ip> is required and totally fine.
  • The reverse PTR for <master-ip> must be master-0.okd.example.com. (or any single, sensible hostname). It must not be a wildcard pattern, and it must not match *.apps by accident.
  • Test reverse DNS before booting: dig -x <master-ip> should return your master’s hostname.

Verify both forward and reverse from any client on your LAN

for n in api.okd.example.com api-int.okd.example.com master-0.okd.example.com \
         oauth-openshift.apps.okd.example.com; do
  echo "$n -> $(dig +short @<your-dns-ip> $n)"
done
echo "PTR -> $(dig +short -x <master-ip> @<your-dns-ip>)"

Every forward record should return <master-ip>. PTR should return master-0.okd.example.com.. If you have multiple DNS servers, check them all — secondary zones won’t help you if one server is missing the reverse zone entirely.
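To turn the PTR rule into something a script can enforce, here is a rough sketch. check_ptr is a hypothetical helper of mine; it only inspects the text you hand it (typically the output of dig +short -x), so it makes no DNS queries itself.

```shell
# Check a PTR answer (as printed by `dig +short -x`) against the expected name.
# Returns 0 only if there is exactly one answer, it is not under the apps
# wildcard, and it equals the expected FQDN (trailing dot optional).
check_ptr() {
  answer=$1
  expected=$2
  # Count non-empty lines; `|| true` keeps grep's exit status from aborting
  # callers that run with `set -e` when the answer is empty.
  count=$(printf '%s\n' "$answer" | grep -c . || true)
  if [ "$count" -ne 1 ]; then
    echo "expected exactly one PTR answer, got $count" >&2
    return 1
  fi
  case $answer in
    *.apps.*)
      echo "PTR points into the apps wildcard: $answer" >&2
      return 1
      ;;
  esac
  # dig prints a trailing dot; compare with it stripped on both sides.
  [ "${answer%.}" = "${expected%.}" ]
}
```

Usage would look like check_ptr "$(dig +short -x <master-ip> @<your-dns-ip>)" master-0.okd.example.com, repeated once per DNS server.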

If you run Technitium (it’s great), don’t be fooled by the UI’s “Add Record” form letting you choose PTR for a record under a forward zone — that creates a backwards PTR that doesn’t help anything. Add PTRs in the actual reverse zone (0.10.10.in-addr.arpa for 10.10.0.0/24).
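If the reverse-zone naming trips you up, the mapping is mechanical: reverse the octets and append in-addr.arpa. A small sketch, assuming a /24 network (the helper names ptr_owner and reverse_zone are made up for this post):

```shell
# Full PTR owner name for an IPv4 address: all four octets reversed.
ptr_owner() {
  echo "$1" | awk -F. '{ printf "%s.%s.%s.%s.in-addr.arpa\n", $4, $3, $2, $1 }'
}

# Reverse zone for the /24 containing the address: first three octets reversed.
reverse_zone() {
  echo "$1" | awk -F. '{ printf "%s.%s.%s.in-addr.arpa\n", $3, $2, $1 }'
}
```

For 10.10.0.15, reverse_zone gives 0.10.10.in-addr.arpa (the zone to create) and ptr_owner gives 15.0.10.10.in-addr.arpa (the record inside it).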


Step 2: Set up the bootstrap host

On your bootstrap Linux box, install the basics:

sudo dnf install -y curl jq podman
sudo mkdir -p /opt/okd-install
sudo chown $USER /opt/okd-install
cd /opt/okd-install

Set the OKD version you want. As of writing this, 4.22.0-okd-scos.0 is the latest GA:

OKD_VERSION=4.22.0-okd-scos.0

Download openshift-install and oc for the matching version:

curl -L https://github.com/okd-project/okd/releases/download/$OKD_VERSION/openshift-install-linux-$OKD_VERSION.tar.gz | tar xz
curl -L https://github.com/okd-project/okd/releases/download/$OKD_VERSION/openshift-client-linux-$OKD_VERSION.tar.gz   | tar xz
chmod +x openshift-install oc kubectl

Step 3: Grab the SCOS live ISO

The OKD installer can tell you the exact ISO URL for the SCOS build that matches your OKD release:

./openshift-install coreos print-stream-json > stream.json
ISO_URL=$(jq -r '.architectures.x86_64.artifacts.metal.formats.iso.disk.location' stream.json)
echo "ISO: $ISO_URL"
curl -L "$ISO_URL" -o scos-live.iso

This is ~900 MB. Coffee break.
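A truncated download produces confusing boot failures later, so it can be worth checking the digest before moving on. This is a sketch only: verify_sha256 is a hypothetical helper, and it relies on sha256sum being installed on the bootstrap host.

```shell
# Compare a file's SHA-256 digest with an expected hex string.
# Prints OK on match; prints the actual digest to stderr and fails otherwise.
verify_sha256() {
  file=$1
  expected=$2
  actual=$(sha256sum "$file" | awk '{ print $1 }')
  if [ "$actual" = "$expected" ]; then
    echo "OK: $file"
  else
    echo "MISMATCH: $file (got $actual)" >&2
    return 1
  fi
}
```

The expected digest usually sits next to the ISO location in stream.json, so something like jq -r '.architectures.x86_64.artifacts.metal.formats.iso.disk.sha256' stream.json should print it. I'm assuming that field is present in your build's metadata; check the JSON if jq returns null.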

Step 4: Write install-config.yaml

cat > install-config.yaml <<'EOF'
apiVersion: v1
baseDomain: example.com
metadata:
  name: okd
compute:
- name: worker
  replicas: 0
controlPlane:
  name: master
  replicas: 1
networking:
  clusterNetwork:
  - cidr: 10.128.0.0/14
    hostPrefix: 23
  machineNetwork:
  - cidr: 10.0.0.0/24            # <- your LAN
  networkType: OVNKubernetes
  serviceNetwork:
  - 172.30.0.0/16
platform:
  none: {}
bootstrapInPlace:
  installationDisk: /dev/sda     # <- the actual disk you want OKD on
pullSecret: 'PASTE-YOUR-PULL-SECRET-JSON-HERE'
sshKey: |
  ssh-ed25519 AAAA... your@key
EOF

A few notes:

  • compute.replicas: 0 + controlPlane.replicas: 1 is the SNO incantation.
  • platform: none: {} means you are providing the infrastructure (no AWS, no vSphere, no Metal3). OKD’s installer will not provision the host — you boot it manually.
  • bootstrapInPlace.installationDisk is the device on the target server, not on your bootstrap host. /dev/sda for SATA/SCSI, /dev/nvme0n1 for NVMe, /dev/vda for virtio.
  • Your machineNetwork CIDR must match the subnet of your master node.
  • Generate an SSH keypair (ssh-keygen -t ed25519 -f ~/.ssh/okd-key) and paste the public key here. You’ll use the private key to SSH into the node later as core@<master-ip>.
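The machineNetwork rule is easy to get wrong by one octet, so here is a quick pre-flight check. It is a sketch with hypothetical helper names (ip_to_int, ip_in_cidr), handles IPv4 only, and expects plain decimal octets without leading zeros.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  old_ifs=$IFS
  IFS=.
  # Unquoted on purpose: split the address into four positional parameters.
  set -- $1
  IFS=$old_ifs
  echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Return 0 if the address lies inside the CIDR block (e.g. 10.0.0.0/24).
ip_in_cidr() {
  ip=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  # Number of addresses in the block; two addresses share a block when
  # integer division by that size gives the same network number.
  size=$(( 1 << (32 - bits) ))
  [ $(( ip / size )) -eq $(( net / size )) ]
}
```

For example, ip_in_cidr 10.0.0.10 10.0.0.0/24 succeeds, while ip_in_cidr 10.0.1.10 10.0.0.0/24 fails, which is the situation that would leave the node outside the network the installer expects.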

Step 5: Generate the ignition config

mkdir sno
cp install-config.yaml sno/
./openshift-install --dir=sno create single-node-ignition-config

This produces sno/bootstrap-in-place-for-live-iso.ign (~300 KB) and sno/auth/ containing kubeconfig + kubeadmin-password. Keep that auth folder safe — that’s how you’ll log in.

Note: openshift-install consumes install-config.yaml (deletes it). Always work from a copy.

Step 6: Embed the ignition into the live ISO

coreos-installer is the tool. The cleanest way to run it is via its official container:

podman run --privileged --rm \
  -v /dev:/dev -v /run/udev:/run/udev \
  -v "$PWD:/data" -w /data \
  quay.io/coreos/coreos-installer:release \
  iso ignition embed -fi /data/sno/bootstrap-in-place-for-live-iso.ign /data/scos-live.iso

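This modifies scos-live.iso in place. If you want to keep the original, copy it first (for example, cp scos-live.iso scos-live-original.iso). The troubleshooting section below assumes a pristine copy with that name.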

The ISO now contains everything needed to install OKD on your target server with zero further interaction.
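Before mounting the ISO, it can help to confirm that what is embedded really is the ignition you just generated. A sketch, assuming you first dump the embedded config to a file (coreos-installer has an iso ignition show subcommand for that); same_ignition is a hypothetical helper that only compares two files byte for byte.

```shell
# Return 0 if the extracted ignition ($1) is byte-identical to the
# freshly generated one ($2); otherwise warn on stderr and return 1.
same_ignition() {
  if cmp -s "$1" "$2"; then
    echo "embedded ignition matches $2"
  else
    echo "STALE: $1 differs from $2; re-embed before booting" >&2
    return 1
  fi
}
```

In practice that would be the same podman invocation as above with iso ignition show /data/scos-live.iso redirected to embedded.ign, followed by same_ignition embedded.ign sno/bootstrap-in-place-for-live-iso.ign. I have not checked whether show reproduces the file byte for byte in every coreos-installer version, so treat a mismatch as a prompt to re-embed rather than as proof of corruption.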

Step 7: Boot the server from the ISO

Mount scos-live.iso as virtual media via your IPMI/iDRAC/iLO web console, set the server to boot from the virtual CD once, and power it on.

What happens (no input needed from you):

  1. CoreOS live environment boots from the ISO and Ignition applies the embedded config.
  2. The live environment acts as the bootstrap node: it runs a temporary control plane (etcd, kube-apiserver, etc.) in memory.
  3. Once bootstrapping finishes, coreos-installer writes the OS and the control-plane ignition to the configured disk (/dev/sda here) and the system reboots.
  4. The installed system comes up as the control-plane node and runs the control plane components as static pods.
  5. Cluster Operators install, the node registers, machine-config pivots the installed OS to the cluster’s desired state, reboots again.
  6. Final reboot, master comes back, cluster Available.

End-to-end this takes ~30–45 minutes on a modern box with a fast disk.

Important: After the first install completes on disk, you want the server to boot from the disk, not the virtual ISO again. Either eject the virtual media after the first reboot, or set the BIOS boot order so the install disk comes first.

Step 8: Watch the install from the bootstrap host

./openshift-install --dir=sno wait-for install-complete

This will tail progress and exit when the cluster is Available. The default timeout is 40 minutes — long-running operator rollouts can blow past it. Don’t panic if it times out; the install keeps going in the background. Just rerun the same command — it’s safe to repeat.
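Since rerunning after a timeout is safe, you can wrap the wait in a small retry loop instead of babysitting it. A minimal sketch; retry is a generic helper I'm making up here, not an openshift-install feature.

```shell
# Run a command up to N times, stopping at the first success.
# Usage: retry <attempts> <command> [args...]
retry() {
  attempts=$1
  shift
  n=1
  while [ "$n" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $n of $attempts failed" >&2
    n=$(( n + 1 ))
  done
  return 1
}
```

For example, retry 3 ./openshift-install --dir=sno wait-for install-complete gives the install up to three timeout windows before giving up.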

If you want a live dashboard instead, watch the operators:

watch 'oc --kubeconfig=sno/auth/kubeconfig get clusterversion,nodes; oc --kubeconfig=sno/auth/kubeconfig get co | head -20'

You’re looking for:

  • clusterversion AVAILABLE=True PROGRESSING=False
  • One Ready node named master-0.okd.example.com with role control-plane,master,worker
  • All ~33 cluster operators showing AVAILABLE=True DEGRADED=False
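With thirty-odd operators it is easier to list only the unhealthy ones. This sketch filters the text output of oc get co; unhealthy_operators is a hypothetical helper, and it assumes the usual column order NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE.

```shell
# Read `oc get co` output on stdin and print the names of operators that
# are not Available or are Degraded. The header line is skipped.
# Assumed columns: NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
unhealthy_operators() {
  awk 'NR > 1 && ($3 != "True" || $5 != "False") { print $1 }'
}
```

Use it as oc --kubeconfig=sno/auth/kubeconfig get co | unhealthy_operators; empty output means everything is Available and not Degraded. Early in an install some operators have no VERSION yet, which shifts the columns, so those rows get reported too. That errs on the cautious side, but it means the output is noisy until the install is further along.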

Step 9: Log in

# Web console
echo "https://console-openshift-console.apps.okd.example.com"
echo "user:     kubeadmin"
echo "password: $(cat sno/auth/kubeadmin-password)"
# CLI (from your laptop, after copying the kubeconfig over)
scp user@bootstrap-host:/opt/okd-install/sno/auth/kubeconfig ~/.kube/okd-config
export KUBECONFIG=~/.kube/okd-config
oc get nodes
oc whoami

You’ll get a TLS warning in the browser the first time — OKD’s ingress uses a self-signed cert from the cluster’s own CA. Click through, then plan to either trust the cluster CA or replace the ingress cert with one from your real CA. (I use cert-manager with my homelab CA for this; topic for another post.)


Troubleshooting: what to check when it doesn’t work

These are the four failure modes I hit on this install. If you’re stuck, work through them in order.

Symptom: oc get nodes is empty after the install timeout

Your kubelet didn’t register. SSH into the node and look at it:

ssh -i ~/.ssh/okd-key core@<master-ip>
sudo systemctl status kubelet
sudo journalctl -u kubelet -n 50

Common causes from the journal:

| Error | Cause | Fix |
|---|---|---|
| lookup api-int.okd.example.com on <dns>:53: no such host | DNS missing the api-int record | Add it, re-run. |
| Unable to register node with API server | Same DNS issue, plus often a hostname issue | Fix DNS first. |
| The hostname in error logs looks like console-openshift-console.apps.okd.example.com.example.com | Reverse DNS for master IP matched your *.apps wildcard | Fix the PTR record. Re-install (cert state is poisoned). |
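The same table can be turned into a crude journal filter. This is a sketch based only on the three patterns above; diagnose_kubelet is a hypothetical helper and the hint wording is mine, so expect to extend the patterns for your own logs.

```shell
# Read kubelet journal lines on stdin and print a hint for each line that
# matches a known failure pattern. Lines that match nothing are ignored.
diagnose_kubelet() {
  while IFS= read -r line; do
    case $line in
      *console-openshift-console.apps.*)
        # Most specific check first: the node took its name from the wildcard.
        echo "ptr: hostname came from the apps wildcard; fix the PTR and reinstall" ;;
      *"no such host"*)
        echo "dns: a forward record is missing (often api-int)" ;;
      *"Unable to register node"*)
        echo "register: fix DNS first, then check the node hostname" ;;
    esac
  done
}
```

On the node, sudo journalctl -u kubelet -n 200 --no-pager | diagnose_kubelet | sort | uniq -c gives a quick count per failure type.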

Symptom: install hits 64% and stalls forever with “cluster operators are not functioning”

Combined with oc get nodes empty, this is the chicken-and-egg of “operators need a node, no node will register, you’re stuck.” Always traces back to DNS.

Symptom: install ISO boots but pings the wrong hostname / installs to the wrong disk

The ISO is just the SCOS live image with your ignition embedded. If you re-generated ignition without re-embedding (or re-downloaded the live ISO without re-embedding the new ignition), you have a stale ISO. Always re-embed after re-generating:

rm -rf sno && mkdir sno && cp install-config.yaml sno/
./openshift-install --dir=sno create single-node-ignition-config
# important: start from a CLEAN ISO copy each time
# (assumes you saved one as scos-live-original.iso before the first embed)
cp scos-live-original.iso scos-live.iso
podman run --privileged --rm -v /dev:/dev -v /run/udev:/run/udev -v "$PWD:/data" -w /data \
  quay.io/coreos/coreos-installer:release \
  iso ignition embed -fi /data/sno/bootstrap-in-place-for-live-iso.ign /data/scos-live.iso

Symptom: install seems fine but the bootstrap host shows the install state from days ago

openshift-install is stateful — it persists state in sno/.openshift_install_state.json. If you’ve re-run it without wiping, you might be looking at logs/state from an earlier run. When in doubt, blow away the sno/ directory and start clean (you’ll need a fresh ignition, so re-embed the ISO too).
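If deleting sno/ outright makes you nervous (it also holds auth/ from the previous attempt), one alternative is to move it aside with a timestamp. This is my own variation rather than anything openshift-install requires, and archive_install_dir is a made-up helper name.

```shell
# Move an existing install directory aside instead of deleting it, so old
# credentials and logs stay available for comparison.
# Prints the backup path; does nothing if the directory does not exist.
archive_install_dir() {
  dir=$1
  [ -d "$dir" ] || return 0
  backup="$dir.$(date +%Y%m%d-%H%M%S)"
  mv "$dir" "$backup"
  echo "$backup"
}
```

After archive_install_dir sno, recreate the directory, copy install-config.yaml in, and regenerate and re-embed the ignition as usual.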


Post-install: what’s worth doing next

The cluster is live, but kubeadmin is a temporary install user. Realistically, the next 30 minutes of work:

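  1. Add an OIDC identity provider. OKD has first-class OIDC support. If you already run something like Authentik, Keycloak, or Authelia in your homelab, wire it in via oc edit oauth/cluster. Once another user has cluster-admin and you have confirmed you can log in as them, run oc delete secret kubeadmin -n kube-system to remove the install user. Do not delete it earlier, or you can lock yourself out.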
  2. Trust or replace the ingress certificate. Default is self-signed by the cluster CA. Two options: (a) export the cluster CA and import to your laptop’s trust store, or (b) replace with a wildcard cert from cert-manager / your homelab CA. The latter is much nicer.
  3. Plan storage. A SNO install has no real shared storage out of the box. For homelab, I run Longhorn or local-path-provisioner. Pick one before you start deploying stateful workloads.
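  4. Back up etcd. Even SNO clusters can rot. Run /usr/local/bin/cluster-backup.sh on the node regularly (for example from an oc debug node/master-0.okd.example.com session after chroot /host) and save the snapshots somewhere off-cluster.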

Summary

The install itself is mostly mechanical — write the config, generate ignition, embed, boot, wait. Where it goes sideways is DNS: a forgotten api-int record, a typo, or — the killer — a wildcard PTR that poisons the node’s hostname at first boot. Fix DNS before you touch the install media and the rest is straightforward.

If you’ve been bouncing off the Assisted Installer pod with okd-rpms errors, skip it. The bootstrap-in-place path documented here works today for OKD 4.22 SCOS. The newer openshift-install agent create image is the other modern option and is on my list to write up next.

Happy clustering.

