# Deployment: Docker / docker-compose
The published image is a 14 MB distroless static container with the
`qu` binary as the entrypoint. It runs as root by default so the
daemon can bind privileged ports and open ICMP sockets; override with
`--user` if your host doesn't need that.
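If you drop root, the compose-level equivalent of `--user` is a `user:` key. A sketch only: `65532:65532` is the conventional distroless `nonroot` UID/GID and is an assumption here; any unprivileged pair works, provided the ICMP notes further down are honoured.
```yaml
services:
  quptime:
    # Unprivileged run: no privileged ports, and UDP-mode pings need
    # net.ipv4.ping_group_range to cover this GID (see below).
    user: "65532:65532"
```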
## Image references
```
git.cer.sh/axodouble/quptime:master # tip of main, multi-arch
git.cer.sh/axodouble/quptime:v0.1.0 # tagged release
git.cer.sh/axodouble/quptime:v0.1.0-amd64 # single-arch (if you must pin)
```
The image embeds `QUPTIME_DIR=/etc/quptime` and declares it a volume —
treat it as the only piece of state worth persisting.
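Both claims can be verified straight from the image metadata with plain `docker image inspect`; the Go template only filters the output:
```sh
docker image inspect git.cer.sh/axodouble/quptime:v0.1.0 \
  --format 'env={{.Config.Env}} volumes={{.Config.Volumes}} entrypoint={{.Config.Entrypoint}}'
```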
## Single-node, single-container compose
For a development cluster or a single-node smoke test:
```yaml
# compose.yaml
services:
  quptime:
    image: git.cer.sh/axodouble/quptime:v0.1.0
    container_name: quptime
    restart: unless-stopped
    ports:
      - "9901:9901"
    volumes:
      - quptime-data:/etc/quptime
    # ICMP UDP-mode pings need a permissive sysctl on the host:
    #   sysctl net.ipv4.ping_group_range="0 2147483647"
    # Or grant CAP_NET_RAW (more accurate, raw ICMP).
    cap_add:
      - NET_RAW
volumes:
  quptime-data:
```
You must run **`qu init` before the daemon will start**. With this compose
file:
```sh
docker compose run --rm quptime init --advertise <host-ip>:9901
docker compose up -d
docker compose exec quptime qu status
```
`<host-ip>` must be reachable from every other node — the loopback
address inside the container is useless to peers.
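If you're not sure which address qualifies, list the host's globally scoped addresses and pick the one the other nodes can actually route to (a generic Linux one-liner, nothing `qu`-specific):
```sh
ip -4 -o addr show scope global | awk '{print $2, $4}'
```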
## Three-node compose on a single host
For local testing of the full quorum machinery without three machines:
```yaml
# compose.yaml
x-quptime: &quptime
  image: git.cer.sh/axodouble/quptime:v0.1.0
  restart: unless-stopped
  cap_add:
    - NET_RAW
services:
  alpha:
    <<: *quptime
    container_name: alpha
    ports: ["9901:9901"]
    volumes: ["alpha-data:/etc/quptime"]
  bravo:
    <<: *quptime
    container_name: bravo
    ports: ["9902:9901"]
    volumes: ["bravo-data:/etc/quptime"]
  charlie:
    <<: *quptime
    container_name: charlie
    ports: ["9903:9901"]
    volumes: ["charlie-data:/etc/quptime"]
volumes:
  alpha-data:
  bravo-data:
  charlie-data:
```
Bootstrap:
```sh
# First node: prints the secret to stdout.
docker compose run --rm alpha init --advertise alpha:9901
# Capture the printed secret. The distroless image has no shell or
# `cat`, so you can't exec it back out of the container; if you lose
# it, read node.yaml out of the alpha-data volume on the host instead.
SECRET='<paste the secret printed above>'
docker compose run --rm bravo init --advertise bravo:9901 --secret "$SECRET"
docker compose run --rm charlie init --advertise charlie:9901 --secret "$SECRET"
docker compose up -d
# Invite from alpha. The hostnames resolve over the compose network.
docker compose exec alpha qu node add bravo:9901
sleep 3 # wait for heartbeats before the next add
docker compose exec alpha qu node add charlie:9901
docker compose exec alpha qu status
```
For a cluster on three separate hosts, replicate the compose file on
each box with different `advertise` addresses (the public hostname or
the overlay IP) and bootstrap the same way.
## Multi-host compose
The natural unit is one compose file per host, each running one
`qu` container. The minimum-viable file per host:
```yaml
# /etc/qu-stack/compose.yaml
services:
  quptime:
    image: git.cer.sh/axodouble/quptime:v0.1.0
    container_name: quptime
    restart: unless-stopped
    ports:
      - "9901:9901"
    volumes:
      - /srv/quptime/data:/etc/quptime
    cap_add:
      - NET_RAW
```
Persistence is a bind-mount under `/srv/quptime/data` so backups and
upgrades hit a known path. See [operations.md](../operations.md) for
the backup recipe.
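A minimal cold-backup sketch against that path, assuming a few seconds of probe downtime is acceptable; treat [operations.md](../operations.md) as authoritative:
```sh
docker compose -f /etc/qu-stack/compose.yaml stop quptime
sudo tar -C /srv/quptime -czf "/srv/quptime-backup-$(date +%F).tar.gz" data
docker compose -f /etc/qu-stack/compose.yaml start quptime
```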
TCP/9901 must be reachable between the hosts. If the boxes don't
share a private network, prefer the
[Tailscale recipe](tailscale.md) over exposing 9901 directly — see
[public-internet.md](public-internet.md) for the threat model if you
must expose it.
## Behind a reverse proxy
**Don't.** `qu` is mTLS-pinned at the application layer, so a
TLS-terminating proxy would force the daemon to trust whatever cert the
proxy presents — defeating fingerprint pinning. If you need a single
public address per node, use a Layer 4 TCP proxy (`nginx stream`,
HAProxy `mode tcp`, or a plain firewall NAT) that forwards bytes
without touching them.
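For example, an `nginx stream` pass-through that forwards raw TCP and never terminates TLS; addresses here are placeholders, and HAProxy's `mode tcp` is the direct equivalent:
```nginx
# /etc/nginx/nginx.conf (top level, next to the http block)
stream {
    server {
        listen 203.0.113.5:9901;    # public address
        proxy_pass 10.0.0.10:9901;  # qu daemon on the private interface
    }
}
```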
## Image internals
Build locally if you want to inspect what you're running:
```sh
docker buildx build \
--build-arg VERSION=$(git describe --tags --always) \
--platform linux/amd64,linux/arm64 \
--file docker/Dockerfile \
--tag quptime:dev \
--load \
.
```
The Dockerfile (see `docker/Dockerfile`) is two stages: a `golang:1.24-alpine`
builder that cross-compiles with `-trimpath -ldflags "-s -w"`, and a
`gcr.io/distroless/static-debian12` runtime. No shell, no package
manager, no SSH; you cannot `docker exec -it sh` into it. Use
`docker exec quptime qu ...` for everything.
## Healthcheck
The container exits non-zero if the daemon crashes, so the default
`restart: unless-stopped` policy is enough for liveness. A more
useful readiness check has to invoke the `qu` binary itself:
```yaml
healthcheck:
  test: ["CMD", "/usr/local/bin/qu", "status"]
  interval: 30s
  timeout: 5s
  retries: 3
  start_period: 10s
```
`qu status` exits 0 when the daemon socket is reachable and the
control RPC succeeds — it does **not** fail on quorum loss. That's
intentional: restarting a quorum-less node won't bring quorum back,
and a healthcheck that flaps a follower in and out of `unhealthy`
state every time the master is briefly unreachable is worse than no
check. If you want a stricter readiness signal, pipe `qu status`
through `grep -q 'quorum true'`.
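The distroless image has no shell to host that pipeline, so run it from the host side. A sketch that assumes `qu status` prints a literal `quorum true` line; check your version's output before alerting on it:
```sh
docker exec quptime qu status | grep -q 'quorum true' \
  || echo "quptime: daemon up but no quorum on $(hostname)"
```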
# Deployment: public-internet exposure
If your nodes do not share a private network and you can't put an
overlay between them (see [tailscale.md](tailscale.md)), this is the
recipe for exposing TCP/9901 directly to the open internet without
losing sleep.
The short version: `qu` is designed for this — every inbound call is
mTLS-pinned at the application layer and gated by the cluster secret
— but defence in depth is cheap and you should take it.
## Threat model in one paragraph
Anyone on the internet can establish a TLS connection to `:9901`
because the daemon must accept handshakes from currently-untrusted
peers (otherwise no node could ever join). The RPC dispatcher then
rejects every method except `Join` for callers whose fingerprint
isn't in `trust.yaml`. `Join` itself is gated by the **cluster
secret**, compared in constant time. So the realistic attack surface
is:
1. The TLS 1.3 stack accepting handshakes from arbitrary peers.
2. The `Join` handler's secret check and downstream cert ingestion.
3. The blast radius of a leaked cluster secret (an attacker who has
it can enrol themselves as a peer and propose mutations, which is
game over).
What can't trivially happen:
- A random attacker observing or modifying cluster traffic — TLS 1.3
with fingerprint pinning sees to that.
- A random attacker calling any method other than `Join` — the RPC
dispatcher refuses.
What you should still do:
- Treat `node.yaml.cluster_secret` like an SSH host key. Out-of-band
distribution only. Never in git, never in CI logs, never in chat.
- Rate-limit and IP-allowlist where you can. The `Join` handler does
not currently rate-limit at the application layer, so a determined
attacker could try secrets at TLS-handshake rate.
- Run on a non-default port if your operations workflow allows it.
Doesn't add security, but reduces background internet noise in the
logs and makes IDS / WAF rules cleaner.
## Firewall
### nftables (recommended)
A drop-in `/etc/nftables.d/quptime.nft`:
```nft
table inet filter {
    set quptime_peers {
        type ipv4_addr
        elements = { 198.51.100.10, 198.51.100.11, 198.51.100.12 }
    }
    chain quptime_input {
        # Drop everything that didn't come from a known peer.
        ip saddr @quptime_peers tcp dport 9901 accept
        tcp dport 9901 log prefix "quptime-drop: " level info drop
    }
    chain input {
        type filter hook input priority 0; policy drop;
        ct state established,related accept
        iif lo accept
        jump quptime_input
        # ... your other rules
    }
}
```
The allowlist is the highest-ROI mitigation by far — if you maintain
fixed IPs for your monitor nodes, use this and move on.
### ufw
```sh
sudo ufw allow from 198.51.100.10 to any port 9901 proto tcp
sudo ufw allow from 198.51.100.11 to any port 9901 proto tcp
sudo ufw allow from 198.51.100.12 to any port 9901 proto tcp
```
### Dynamic peer IPs
If peer IPs aren't fixed (e.g., one node is on a home connection with
a rotating address), you have three options ranked by preference:
1. Use an overlay instead — see [tailscale.md](tailscale.md). This is
the right answer.
2. DNS-based allowlisting (`ipset`-from-DNS or a small reconciler that
re-resolves an allowlist hostname every minute; a sketch follows
after this list). Beware: a compromised DNS resolver becomes a
compromise of the allowlist.
3. Drop the allowlist and rely solely on the cluster secret + mTLS.
This is what `qu` is designed to survive; just be sure the secret
actually has the entropy `qu init` generated for it (32 random
bytes, base64-encoded).
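The sketch for option 2: a cron-driven reconciler that re-resolves an allowlist hostname and refreshes the `quptime_peers` set from the nftables example above. The script path and hostname are placeholders.
```sh
#!/bin/sh
# /usr/local/bin/quptime-allowlist-sync (placeholder path); run from cron every minute.
set -eu
ips=$(getent ahostsv4 peers.example.com | awk '{print $1}' | sort -u | paste -sd, -)
[ -n "$ips" ] || exit 0      # refuse to wipe the set on a failed lookup
nft flush set inet filter quptime_peers
nft add element inet filter quptime_peers "{ $ips }"
```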
## Rate-limiting failed handshakes
`qu` does not currently rate-limit `Join` attempts at the application
layer. You can do it at the firewall, which catches both connect
floods and slow brute-force:
```nft
table inet filter {
    chain quptime_input {
        tcp dport 9901 ct state new \
            meter quptime_ratemeter { ip saddr limit rate over 10/second } \
            log prefix "quptime-rate: " drop
        tcp dport 9901 accept
    }
}
```
Or `fail2ban` with a tiny custom filter that watches `journalctl -u
quptime` for repeated `peer rejected join` lines:
```ini
# /etc/fail2ban/filter.d/quptime.conf
[Definition]
failregex = ^.*quptime:.*peer rejected join.*from <ADDR>.*$
```
```ini
# /etc/fail2ban/jail.d/quptime.local
[quptime]
enabled = true
filter = quptime
backend = systemd
journalmatch = _SYSTEMD_UNIT=quptime.service
maxretry = 3
findtime = 600
bantime = 86400
```
Note: the daemon doesn't currently log the *peer address* on rejected
joins. The log filter above is illustrative; check what your version
actually emits before relying on it.
## Secret hygiene
The single most important thing on a public-internet deployment:
- **Generate the secret on the first node.** `qu init` with no
`--secret` produces 32 random bytes from `crypto/rand`,
base64-encoded. Don't replace that with something memorable (a quick
length check is sketched after this list).
- **Transport out of band.** Paste it into your secret manager
immediately; share via 1Password / Vault / encrypted email.
- **Rotate if anyone with access has left.** Rotation isn't a CLI
command; do it the brute-force way: `qu init` a fresh cluster on
new ports, re-add every check via `cluster.yaml` export, swap DNS.
- **One secret per cluster.** Do not reuse the secret across staging
and prod, or across customers if you run several clusters.
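A quick sanity check that a pasted secret still looks like what `qu init` generates, assuming plain base64 of 32 bytes; the variable name is only an example:
```sh
printf '%s' "$CLUSTER_SECRET" | base64 -d | wc -c   # expect 32
```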
## Non-default ports
```sh
# Each node, in node.yaml — or pass --port on init.
qu init --advertise alpha.example.com:51234 --port 51234
```
Open the corresponding firewall rule, restart the daemon. The
cluster doesn't require uniform ports across nodes; each peer's
`advertise` field tells everyone else what to dial.
## What you should monitor on a public deployment
- `term` from `qu status` — if it's ticking up frequently the master
is flapping, which probably means at least one peer's network is
unstable. Could be benign, could be a probe attempt.
- The firewall drop counter on the `quptime-drop` rule above (a
counting one-liner follows this list).
- The number of TLS handshakes on `:9901`. A spike in handshakes that
don't progress to a successful RPC is the signature of a brute-force
on the cluster secret.
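For the drop counter, a minimal way to count what the `quptime-drop` rule logged, reading the kernel log via `journalctl -k`; adjust the window to your alerting cadence:
```sh
journalctl -k --since "1 hour ago" | grep -c 'quptime-drop: '
```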
For the operational side — backups, upgrades, recovery — see
[operations.md](../operations.md).
# Deployment: systemd on bare metal / VM
The canonical way to run `qu` on a Linux host. Single static binary,
managed by systemd, with a hardened unit file. Most production users
should start here.
## Audience and assumptions
- You have root (or `sudo`) on the host.
- You have at least three hosts that can reach each other on TCP/9901.
(Three is the minimum for a useful quorum; fewer is fine for
development but a 2-node cluster offers no consensus protection.)
- The hosts have a way to authenticate each other — direct IP or a
resolvable hostname is fine. For overlay networks see
[tailscale.md](tailscale.md).
## Install the binary
See [installation.md](../installation.md). The official `install.sh`
script writes a *minimal* unit file that's fine for development. For
production replace it with the hardened version below.
## Create a dedicated user
Running as a dedicated unprivileged user is best practice, but ICMP
support adds a wrinkle — see the next section.
```sh
sudo useradd --system --no-create-home --shell /usr/sbin/nologin quptime
sudo install -d -o quptime -g quptime -m 0750 /etc/quptime
sudo install -d -o quptime -g quptime -m 0750 /var/run/quptime
```
## ICMP capabilities
ICMP probes have two implementations:
1. **Unprivileged UDP pings** — Linux's `dgram` ICMP socket. Works on
any modern kernel without elevated privileges, but only if
`net.ipv4.ping_group_range` includes the daemon's GID. This is the
default in `qu`.
2. **Raw ICMP** — requires `CAP_NET_RAW`; gives more accurate latency
numbers and works for IPv6 on arbitrary kernels.
The simplest path: stick with unprivileged pings and widen
`ping_group_range`. Sysctl, persistent across reboots:
```sh
# /etc/sysctl.d/10-quptime.conf
net.ipv4.ping_group_range = 0 2147483647
```
```sh
sudo sysctl --system
```
If you need raw ICMP instead, grant the capability on the binary:
```sh
sudo setcap cap_net_raw=+ep /usr/local/bin/qu
```
Note that the file capability set by `setcap` is lost every time the
`qu` binary is replaced — bake the `setcap` call into your deploy
script, or re-run it after each package update.
## Hardened unit file
Drop this in `/etc/systemd/system/quptime.service`:
```ini
[Unit]
Description=QUptime distributed uptime monitor
Documentation=https://git.cer.sh/axodouble/quptime
Wants=network-online.target
After=network-online.target
[Service]
Type=simple
ExecStart=/usr/local/bin/qu serve
Restart=always
RestartSec=5s
User=quptime
Group=quptime
# Where state lives. RuntimeDirectory creates /var/run/quptime each
# time the service starts, owned by User:Group with mode 0750.
Environment=QUPTIME_DIR=/etc/quptime
RuntimeDirectory=quptime
RuntimeDirectoryMode=0750
ReadWritePaths=/etc/quptime /var/run/quptime
# Hardening. Comment out individual directives if a probe needs
# something we've revoked.
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
PrivateTmp=true
PrivateDevices=true
ProtectKernelTunables=true
ProtectKernelModules=true
ProtectControlGroups=true
ProtectClock=true
ProtectHostname=true
RestrictNamespaces=true
RestrictRealtime=true
RestrictSUIDSGID=true
LockPersonality=true
MemoryDenyWriteExecute=true
# Network access is required (we're a network monitor). Keep address
# families minimal — AF_NETLINK is needed for some libc lookups.
RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6 AF_NETLINK
# If you need raw ICMP, *also* uncomment:
# AmbientCapabilities=CAP_NET_RAW
# CapabilityBoundingSet=CAP_NET_RAW
# Otherwise drop all capabilities:
CapabilityBoundingSet=
[Install]
WantedBy=multi-user.target
```
Reload systemd and enable:
```sh
sudo systemctl daemon-reload
sudo systemctl enable quptime.service
```
## Initialise the node
**Don't start the service yet.** `qu init` must run first, and it
must run as the `quptime` user so it creates files with the right
ownership.
On the **first** host (it will print a secret; copy it):
```sh
sudo -u quptime QUPTIME_DIR=/etc/quptime \
qu init --advertise alpha.example.com:9901
```
On every **other** host (paste the secret):
```sh
sudo -u quptime QUPTIME_DIR=/etc/quptime \
qu init --advertise bravo.example.com:9901 --secret '<paste>'
sudo -u quptime QUPTIME_DIR=/etc/quptime \
qu init --advertise charlie.example.com:9901 --secret '<paste>'
```
## Open the firewall
`qu` needs TCP/9901 reachable between cluster members. Adjust to your
firewall:
```sh
# ufw
sudo ufw allow from <peer-ip> to any port 9901 proto tcp
# firewalld
sudo firewall-cmd --permanent --zone=internal \
--add-rich-rule='rule family="ipv4" source address="<peer-ip>" port port="9901" protocol="tcp" accept'
sudo firewall-cmd --reload
# nftables (drop-in)
table inet filter {
chain input {
ip saddr { 10.0.0.10, 10.0.0.11, 10.0.0.12 } tcp dport 9901 accept
}
}
```
For exposing 9901 to the open internet see
[public-internet.md](public-internet.md).
## Start the daemon
```sh
sudo systemctl start quptime
sudo systemctl status quptime
journalctl -u quptime -f
```
## Invite peers
From one node (typically `alpha`):
```sh
sudo -u quptime qu node add bravo.example.com:9901
# Pause a few seconds so heartbeats reach the new peer before the next add —
# otherwise the "needs ≥2 live to mutate" check rejects the second invite.
sleep 3
sudo -u quptime qu node add charlie.example.com:9901
```
`qu node add` prints each remote's fingerprint and asks for SSH-style
confirmation. Verify it matches an out-of-band channel (the remote
operator can show their fingerprint with
`sudo -u quptime qu status` or by reading `trust.yaml`).
## Verify
```sh
sudo -u quptime qu status
```
Expect to see all three peers `live=true` and one of them as
`master`.
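To script that expectation, a sketch that leans on the `live=true` wording above; confirm the exact output of your `qu` version before wiring it into alerting:
```sh
test "$(sudo -u quptime qu status | grep -c 'live=true')" -ge 3 \
  && echo "cluster healthy" || echo "cluster degraded"
```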
## Log scraping
`journalctl -u quptime` is the canonical log stream. Notable lines:
| Pattern | Meaning |
| ------------------------------------------------------------- | --------------------------------------------------------- |
| `listening on ... as node ...` | Daemon up. |
| `manual-edit: cluster.yaml changed externally — replicating…` | An operator edited `cluster.yaml` directly. |
| `manual-edit: parse cluster.yaml: ...` | Invalid YAML on disk; the operator must fix and re-save. |
| `report to master ...: <err>` | A follower couldn't ship a probe result to the master. |
| `replicate: pull from ...: <err>` | A follower couldn't pull a higher-version config snapshot. |
## Sample reload / restart drill
After editing the unit file:
```sh
sudo systemctl daemon-reload
sudo systemctl restart quptime
```
After editing `cluster.yaml` by hand:
```sh
sudoedit /etc/quptime/cluster.yaml
# No restart needed — the watcher picks it up within 2s and pushes to master.
```
After upgrading the binary:
```sh
sudo install -m 0755 qu-new /usr/local/bin/qu
sudo setcap cap_net_raw=+ep /usr/local/bin/qu # if you use raw ICMP
sudo systemctl restart quptime
```
Doing rolling upgrades? See [operations.md](../operations.md).
# Deployment: Tailscale / WireGuard overlay
When your nodes live in different networks — different VPS providers,
different physical sites, a mix of home and cloud — exposing TCP/9901
to the open internet is a poor idea. An overlay network gives every
node a stable private IP regardless of NAT, and `qu` only needs to
listen on that overlay address.
This page focuses on Tailscale because the repo ships an example
compose for it, but everything generalises to WireGuard, Nebula, or a
self-hosted Headscale.
## The big idea
```
+--- host A (VPS, no public ICMP) ----+
| tailscale ←→ overlay ip 100.64.1.1  |
| qu listening on 100.64.1.1:9901     |
+-------------------------------------+
                  │ mTLS over overlay
+--- host B (homelab behind NAT) -----+
| tailscale ←→ overlay ip 100.64.1.2  |
| qu listening on 100.64.1.2:9901     |
+-------------------------------------+
```
`bind_addr` is set to the tailscale IP, the host's public interface
has no port 9901 open, and the cluster secret + mTLS handshake gate
the link inside the tunnel.
## Compose recipe
The repo ships [`docker/docker-compose-tailscale.yml`](../../docker/docker-compose-tailscale.yml).
The relevant trick is `network_mode: "service:tailscale"` — the
`quptime` container shares the network namespace of the `tailscale`
sidecar so it sees the tailnet as its own interface.
```yaml
services:
  tailscale:
    image: tailscale/tailscale:latest
    container_name: tailscale
    cap_add: [NET_ADMIN]
    environment:
      - TS_AUTHKEY=${TAILSCALE_AUTHKEY}   # provision via .env
      - TS_HOSTNAME=quptime-${HOST}       # name visible in admin
    volumes:
      - /dev/net/tun:/dev/net/tun
      - tailscale:/var/lib/tailscale
    restart: unless-stopped
  quptime:
    image: git.cer.sh/axodouble/quptime:v0.1.0
    container_name: quptime
    volumes:
      - quptime:/etc/quptime
    network_mode: "service:tailscale"
    depends_on: [tailscale]
    cap_add: [NET_RAW]
    # No restart directive yet — needs `qu init` first.
volumes:
  tailscale:
  quptime:
```
### One-time bootstrap
Each host runs the same script with different `HOST` and `TAILSCALE_AUTHKEY`:
```sh
# .env
HOST=alpha
TAILSCALE_AUTHKEY=tskey-auth-xxxxxxxx
```
Start Tailscale alone first so it gets an IP:
```sh
docker compose up -d tailscale
sleep 5
TSIP=$(docker compose exec tailscale tailscale ip -4)
echo "this node's tailnet IP: $TSIP"
```
On the **first** host, init without `--secret`:
```sh
docker compose run --rm quptime init --advertise "$TSIP:9901"
# Grab the printed secret; pipe through your password manager.
```
On every **other** host, paste the secret:
```sh
docker compose run --rm quptime init \
--advertise "$TSIP:9901" \
--secret "$CLUSTER_SECRET"
```
Then bring up `qu` on every node and invite from the first:
```sh
# Each host
docker compose up -d quptime
# From alpha
docker compose exec quptime qu node add 100.64.1.2:9901
sleep 3
docker compose exec quptime qu node add 100.64.1.3:9901
docker compose exec quptime qu status
```
## Tailscale ACLs
Belt and braces — even though mTLS pins identities, lock down the
tailnet itself so only the `qu` nodes can reach each other's :9901.
In the Tailscale admin console:
```jsonc
{
  "tagOwners": { "tag:qu-node": ["group:ops"] },
  "acls": [
    {
      "action": "accept",
      "src": ["tag:qu-node"],
      "dst": ["tag:qu-node:9901"]
    }
    // ...your other rules
  ]
}
```
Then tag every `qu` node in its auth key:
```yaml
environment:
  - TS_AUTHKEY=${TAILSCALE_AUTHKEY}?ephemeral=false&tags=tag:qu-node
```
## WireGuard / Nebula / Headscale equivalents
The recipe generalises:
1. Provision the overlay interface on each host with a stable
private IP (the tunnel's own address).
2. `qu init --advertise <overlay-ip>:9901`.
3. Set `bind_addr: <overlay-ip>` in `node.yaml` so the daemon does
**not** also listen on the public interface.
4. Open `:9901` only on the overlay interface in your firewall — for
nftables that's something like `iifname "wg0" tcp dport 9901
accept`.
The cluster secret and mTLS fingerprints still apply; the overlay just
removes the open-internet attack surface.
## Why prefer overlay over public exposure
- Defence in depth at the network layer: an attacker who finds an
exploit in your overlay client (rare; Tailscale and WireGuard are
small surfaces) still hits the application-layer pinning before any
cluster-level operation.
- The cluster secret can be lower-entropy when it's already
unreachable from outside. (You should still treat it as a real
secret; "defence in depth" only works if every layer is real.)
- ICMP probes from a homelab to a target on the public internet are
trivial through NAT, but ICMP *into* a homelab usually isn't.
Running `qu` on a tailnet means peers can heartbeat each other
regardless of NAT direction.
## Trade-offs
- One more thing to monitor. If your tailnet is down, your monitor is
down. Counter-measure: run *another* tiny `qu` cluster (or a single
node) on the public internet that watches the overlay's coordinator
health.
- Probe latency includes the overlay's hop. Tailscale's WireGuard path
is fast (<1 ms LAN, single-digit ms WAN) so this rarely matters, but
if you're alerting on tight latency thresholds, account for it.