# Security
The trust model in one page. Read this before deciding where to put
`qu` and who can talk to it.
## What qu is trying to defend against
- Eavesdropping on cluster traffic. Defended: TLS 1.3 only, fingerprint-pinned per peer.
- MITM on the cluster's inter-node link. Defended: TLS 1.3 with out-of-band fingerprint verification at `qu node add`.
- A random internet host enrolling itself as a peer. Defended: pre-shared cluster secret on every `Join`.
- A compromised peer issuing forged cluster-config mutations. Not defended. A peer trusted enough to be in `cluster.yaml.peers` can propose mutations through the master. Treat membership as a privilege.
- A compromised peer becoming master. Election is deterministic on the smallest live `NodeID`, so a compromised peer can become master if its `NodeID` sorts first. The master can rewrite `cluster.yaml` arbitrarily. This is the worst-case blast radius from one compromised node.
- DoS by handshake flood. Not directly defended at the application layer. The TLS stack accepts anyone's handshake; rate-limiting belongs at the firewall — see public-internet.md.
## The three secrets on disk
| Secret | What it is | Loss impact |
|---|---|---|
| `keys/private.pem` | RSA private key, this node's identity. | Anyone with it can impersonate this node. |
| `node.yaml.cluster_secret` | Pre-shared base64 string. | Anyone with it can `Join` the cluster. |
| `trust.yaml.entries[].cert_pem` | Other peers' public certs (not secrets, but they enable mTLS). | Loss only forces re-trust. |
The first two are real secrets and live under 0600 permissions in
the data directory. Back them up; never commit them; never paste them
in chat.
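A quick way to confirm those permissions on a running node. This is a sketch that assumes the default `/etc/quptime` data directory (the path is an assumption; adjust to your install):

```shell
# check_secret_modes: warn if either real secret is group- or world-readable.
check_secret_modes() {
  dir=$1
  for f in "$dir/keys/private.pem" "$dir/node.yaml"; do
    if [ ! -e "$f" ]; then
      echo "MISSING: $f"
      continue
    fi
    mode=$(stat -c '%a' "$f")   # GNU stat; on BSD/macOS use: stat -f '%Lp'
    [ "$mode" = "600" ] || echo "WARN: $f is mode $mode, expected 600"
  done
}

check_secret_modes /etc/quptime   # assumed default data directory
```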
## TLS handshake step by step
For every inter-node call:
- Caller dials peer on its `advertise` address.
- TLS 1.3 handshake. Both sides present their self-signed leaf cert.
- The caller's `VerifyPeerCertificate` (set in `internal/transport/tls.go`) computes the SPKI fingerprint of the server's cert and compares it against `trust.yaml`. If the caller knows which `NodeID` it expected, a strict verifier ensures the fingerprint matches that specific entry — not just any trusted peer.
- The server's TLS layer accepts any client cert (`RequireAnyClientCert`, `InsecureSkipVerify: true`) because trust is enforced one layer up.
- The RPC dispatcher reads the client's cert, computes its fingerprint, and looks it up in the server's `trust.yaml`. If no entry exists, only the `Join` method is permitted. `Join` performs a constant-time comparison of the inbound `ClusterSecret` against `node.yaml.cluster_secret`. Mismatch → refusal.
So:
- An adversary who gets your public cert can't impersonate you.
- An adversary who gets your fingerprint can't impersonate you.
- An adversary who gets your private key can impersonate you to any peer that trusts your fingerprint.
## The TOFU step
`qu node add <host:port>` runs a one-shot insecure dial against the
target (the only place `InsecureBootstrapConfig` is used in the
codebase, see `internal/transport/tls.go:91`). It fetches the
remote's cert, prints the fingerprint, and asks for confirmation.
This is identical to SSH's first-connection prompt. The operator
must verify the fingerprint out of band — by running `qu status` on
the remote side, or by reading `keys/cert.pem` directly, or via a
known-good distribution channel.
If you skip verification, you trust the network at that moment. If the network was MITM'd at exactly that moment, you trust the attacker. After the prompt, the cert is pinned and the window closes.
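For the out-of-band check, you can recompute the SPKI fingerprint directly from the remote's cert with openssl. A sketch, assuming the cert lives at `/etc/quptime/keys/cert.pem` and that the pinned value is the hex SHA-256 of the DER SubjectPublicKeyInfo (both are assumptions; verify against what `qu node add` actually prints):

```shell
# spki_fingerprint: hex SHA-256 over the DER SubjectPublicKeyInfo of a cert.
spki_fingerprint() {
  openssl x509 -in "$1" -pubkey -noout \
    | openssl pkey -pubin -outform DER \
    | openssl dgst -sha256 \
    | awk '{print $NF}'               # keep only the hex digest
}

# On the remote host itself, run and compare by eye with the prompt:
#   spki_fingerprint /etc/quptime/keys/cert.pem
```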
## Cluster secret rotation
There is no built-in command to rotate the cluster secret. The hard part isn't generating a new one — it's distributing it consistently across every node. The pragmatic recipe:
- Generate a new secret on one node and copy it to every other node.
- Update `node.yaml.cluster_secret` on every node (manual edit).
- Restart each daemon one at a time, verifying quorum returns between restarts.
Rotation only protects future `Join` calls, not anything else. If you
suspect the old secret has been seen by an adversary, also assume any
peer that was added during the leaked window is compromised, and
re-init those peers from scratch.
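The generate-and-edit steps above can be scripted. A sketch that assumes a flat top-level `cluster_secret:` key in `node.yaml` (the exact YAML shape is an assumption), leaving the restart-and-verify loop to the operator:

```shell
# Generate a fresh random secret.
new_secret=$(openssl rand -base64 32)

# set_cluster_secret FILE SECRET: rewrite the cluster_secret key in place.
# Assumes a flat top-level `cluster_secret:` key (an assumption about the format).
set_cluster_secret() {
  sed -i "s|^cluster_secret:.*|cluster_secret: $2|" "$1"
}

# Per node: install the new secret, restart the daemon, and wait for
# quorum to return before touching the next host. For example:
#   set_cluster_secret /etc/quptime/node.yaml "$new_secret"
#   systemctl restart quptime
```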
## Identity rotation
To roll a node's RSA keypair (e.g., the private key was on a laptop that got stolen):
```shell
# On the compromised node:
sudo systemctl stop quptime
sudo rm -rf /etc/quptime
sudo -u quptime qu init \
  --advertise this-host.example.com:9901 \
  --secret '<existing cluster secret>'
sudo systemctl start quptime

# On a surviving healthy node:
sudo -u quptime qu node remove <old-node-id>   # evict the old identity
sudo -u quptime qu node add this-host.example.com:9901
```
The new `node_id` is a fresh UUID; the old one is gone for good. Any
historical references to it (e.g., the `updated_by` field on past
versions of `cluster.yaml`) are cosmetic.
## What the local control socket protects
`$XDG_RUNTIME_DIR/quptime/quptime.sock` (or `/var/run/quptime/...`) is
the channel the CLI uses to talk to the local daemon. It's 0600
permissioned and authenticated solely by filesystem ACLs — no TLS, no
secrets in the protocol.
Anyone who can read+write the socket can:
- Propose cluster mutations (will be relayed to the master).
- Read full cluster state, including `cluster.yaml`.
- Trigger test alerts.
So: don't put the daemon's user in a group that other unprivileged
users share. The default systemd setup with a dedicated `quptime`
user gets this right.
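To see who can actually reach the socket, inspect it the same way you'd inspect any file (the path shown is the system-service variant; yours may be under `$XDG_RUNTIME_DIR`):

```shell
# socket_access: show owner, group, and mode of the control socket.
socket_access() {
  stat -c 'owner=%U group=%G mode=%a' "$1"   # GNU stat
}

# Anything other than the daemon's user and mode 600 widens local access:
#   socket_access /var/run/quptime/quptime.sock
```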
## Hardening checklist
- Dedicated `quptime` system user.
- Data directory owned by that user, mode 0750.
- `keys/private.pem` mode 0600.
- `node.yaml` mode 0600.
- systemd unit uses `ProtectSystem=strict`, `NoNewPrivileges=true`, and the rest of the hardening directives in systemd.md.
- If `:9901` is internet-reachable, firewall allow-list to peer IPs or use an overlay — see public-internet.md and tailscale.md.
- Cluster secret generated by `qu init` (not chosen by a human), stored in your secret manager.
- Backups of `keys/` and `node.yaml` are encrypted at rest.
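The file-mode items on the list can be audited in one shot. A sketch assuming the `/etc/quptime` data directory (the path is an assumption):

```shell
# audit_modes DIR: print mode and path for the checklist's file-mode items.
# Expected on a hardened node: 750 for the data dir, 600 for both secrets.
audit_modes() {
  stat -c '%a %n' "$1" "$1/keys/private.pem" "$1/node.yaml"
}

# On a real node:
#   audit_modes /etc/quptime
```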