# Deployment: Docker / docker-compose

The published image is a 14 MB distroless static container with the
`qu` binary as the entrypoint. It runs as root by default so the
daemon can bind privileged ports and open ICMP sockets; override with
`--user` if your host doesn't need that.

## Image references

```
git.cer.sh/axodouble/quptime:master        # tip of main, multi-arch
git.cer.sh/axodouble/quptime:v0.1.0        # tagged release
git.cer.sh/axodouble/quptime:v0.1.0-amd64  # single-arch (if you must pin)
```

The image embeds `QUPTIME_DIR=/etc/quptime` and declares it a volume —
treat it as the only piece of state worth persisting.

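If you want a quick snapshot of that state, a minimal sketch using a
throwaway container (assumes the `quptime-data` named volume from the
compose example below):

```sh
# Tar the named volume's contents into the current directory.
docker run --rm \
  -v quptime-data:/src:ro \
  -v "$PWD":/backup \
  busybox tar czf /backup/quptime-data.tgz -C /src .
```
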
## Single-node, single-container compose

For a development cluster or a single-node smoke test:

```yaml
# compose.yaml
services:
  quptime:
    image: git.cer.sh/axodouble/quptime:v0.1.0
    container_name: quptime
    restart: unless-stopped
    ports:
      - "9901:9901"
    volumes:
      - quptime-data:/etc/quptime
    # ICMP UDP-mode pings need a permissive sysctl on the host:
    #   sysctl net.ipv4.ping_group_range="0 2147483647"
    # Or grant CAP_NET_RAW (more accurate, raw ICMP).
    cap_add:
      - NET_RAW

volumes:
  quptime-data:
```

You must run **`qu init` before the daemon will start**. With this
compose file:

```sh
docker compose run --rm quptime init --advertise <host-ip>:9901
docker compose up -d
docker compose exec quptime qu status
```

`<host-ip>` must be reachable from every other node — the loopback
address inside the container is useless to peers.

## Three-node compose on a single host

For local testing of the full quorum machinery without three machines:

```yaml
# compose.yaml
x-quptime: &quptime
  image: git.cer.sh/axodouble/quptime:v0.1.0
  restart: unless-stopped
  cap_add:
    - NET_RAW

services:
  alpha:
    <<: *quptime
    container_name: alpha
    ports: ["9901:9901"]
    volumes: ["alpha-data:/etc/quptime"]

  bravo:
    <<: *quptime
    container_name: bravo
    ports: ["9902:9901"]
    volumes: ["bravo-data:/etc/quptime"]

  charlie:
    <<: *quptime
    container_name: charlie
    ports: ["9903:9901"]
    volumes: ["charlie-data:/etc/quptime"]

volumes:
  alpha-data:
  bravo-data:
  charlie-data:
```

Bootstrap:

```sh
# First node: prints the secret to stdout.
docker compose run --rm alpha init --advertise alpha:9901
# Capture the secret (or read it back from alpha-data).
SECRET=$(docker compose exec alpha cat /etc/quptime/node.yaml | grep cluster_secret | awk '{print $2}')

docker compose run --rm bravo init --advertise bravo:9901 --secret "$SECRET"
docker compose run --rm charlie init --advertise charlie:9901 --secret "$SECRET"

docker compose up -d

# Invite from alpha. The hostnames resolve over the compose network.
docker compose exec alpha qu node add bravo:9901
sleep 3  # wait for heartbeats before the next add
docker compose exec alpha qu node add charlie:9901

docker compose exec alpha qu status
```

For a cluster on three separate hosts, replicate the compose file on
each box with different `advertise` addresses (the public hostname or
the overlay IP) and bootstrap the same way.

## Multi-host compose

The natural unit is one compose file per host, each running one
`qu` container. The minimum-viable file per host:

```yaml
# /etc/qu-stack/compose.yaml
services:
  quptime:
    image: git.cer.sh/axodouble/quptime:v0.1.0
    container_name: quptime
    restart: unless-stopped
    ports:
      - "9901:9901"
    volumes:
      - /srv/quptime/data:/etc/quptime
    cap_add:
      - NET_RAW
```

Persistence is a bind-mount under `/srv/quptime/data` so backups and
upgrades hit a known path. See [operations.md](../operations.md) for
the backup recipe.

TCP/9901 must be reachable between the hosts. If the boxes don't
share a private network, prefer the
[Tailscale recipe](tailscale.md) over exposing 9901 directly — see
[public-internet.md](public-internet.md) for the threat model if you
must expose it.

## Behind a reverse proxy

**Don't.** `qu` is mTLS-pinned at the application layer, so a TLS-
terminating proxy would force the daemon to trust whatever cert the
proxy presents — defeating fingerprint pinning. If you need a single
public address per node, use a Layer 4 TCP proxy (`nginx stream`,
HAProxy `mode tcp`, or a plain firewall NAT) that forwards bytes
without touching them.

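For illustration, a minimal `nginx` `stream` sketch of such a
pass-through (the upstream address is a placeholder):

```nginx
# stream {} lives at the top level of nginx.conf, not inside http {}.
stream {
    server {
        listen 9901;
        # No ssl_* directives: TLS passes through end-to-end, so the
        # daemon still sees and pins the real peer certificate.
        proxy_pass 10.0.0.10:9901;
    }
}
```
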
## Image internals

Build locally if you want to inspect what you're running:

```sh
# Note: --load only supports a single platform; use --push (or build
# one platform at a time) for the multi-arch image.
docker buildx build \
  --build-arg VERSION=$(git describe --tags --always) \
  --platform linux/amd64 \
  --file docker/Dockerfile \
  --tag quptime:dev \
  --load \
  .
```

The Dockerfile (see `docker/Dockerfile`) is two stages: a `golang:1.24-alpine`
builder that cross-compiles with `-trimpath -ldflags "-s -w"`, and a
`gcr.io/distroless/static-debian12` runtime. No shell, no package
manager, no SSH; you cannot `docker exec -it sh` into it. Use
`docker exec quptime qu ...` for everything.

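A sketch of that layout (the real file is `docker/Dockerfile`; the
build command and `CMD` here are reconstructed from details on this
page, so treat them as illustrative):

```dockerfile
FROM golang:1.24-alpine AS build
ARG VERSION=dev
WORKDIR /src
COPY . .
# Static build, stripped, with reproducible paths.
RUN CGO_ENABLED=0 go build -trimpath -ldflags "-s -w" -o /qu .

FROM gcr.io/distroless/static-debian12
COPY --from=build /qu /usr/local/bin/qu
ENV QUPTIME_DIR=/etc/quptime
VOLUME /etc/quptime
ENTRYPOINT ["/usr/local/bin/qu"]
CMD ["serve"]
```
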
## Healthcheck

The container exits non-zero if the daemon crashes, so the default
`restart: unless-stopped` policy is enough for liveness. For a more
useful readiness check, run the `qu` binary that's already in the
image:

```yaml
healthcheck:
  test: ["CMD", "/usr/local/bin/qu", "status"]
  interval: 30s
  timeout: 5s
  retries: 3
  start_period: 10s
```

`qu status` exits 0 when the daemon socket is reachable and the
control RPC succeeds — it does **not** fail on quorum loss. That's
intentional: restarting a quorum-less node won't bring quorum back,
and a healthcheck that flaps a follower in and out of `unhealthy`
state every time the master is briefly unreachable is worse than no
check. If you want a stricter readiness signal, pipe `qu status`
through `grep -q 'quorum true'`; note the distroless image has no
shell, so that pipeline has to run from the host (e.g.
`docker exec quptime qu status | grep -q 'quorum true'`), not as a
container healthcheck.

# Deployment: public-internet exposure

If your nodes do not share a private network and you can't put an
overlay between them (see [tailscale.md](tailscale.md)), this is the
recipe for exposing TCP/9901 directly to the open internet without
losing sleep.

The short version: `qu` is designed for this — every inbound call is
mTLS-pinned at the application layer and gated by the cluster secret
— but defence in depth is cheap and you should take it.

## Threat model in one paragraph

Anyone on the internet can establish a TLS connection to `:9901`
because the daemon must accept handshakes from currently-untrusted
peers (otherwise no node could ever join). The RPC dispatcher then
rejects every method except `Join` for callers whose fingerprint
isn't in `trust.yaml`. `Join` itself is gated by the **cluster
secret**, compared in constant time. So the realistic attack surface
is:

1. The TLS 1.3 stack accepting handshakes from arbitrary peers.
2. The `Join` handler's secret check and downstream cert ingestion.
3. The blast radius of a leaked cluster secret (an attacker who has
   it can enrol themselves as a peer and propose mutations, which is
   game over).

What can't trivially happen:

- A random attacker observing or modifying cluster traffic — TLS 1.3
  with fingerprint pinning sees to that.
- A random attacker calling any method other than `Join` — the RPC
  dispatcher refuses.

What you should still do:

- Treat `node.yaml.cluster_secret` like an SSH host key. Out-of-band
  distribution only. Never in git, never in CI logs, never in chat.
- Rate-limit and IP-allowlist where you can. The `Join` handler does
  not currently rate-limit at the application layer, so a determined
  attacker could try secrets at TLS-handshake rate.
- Run on a non-default port if your operations workflow allows it.
  Doesn't add security, but reduces background internet noise in the
  logs and makes IDS / WAF rules cleaner.

## Firewall

### nftables (recommended)

A drop-in `/etc/nftables.d/quptime.nft`:

```nft
table inet filter {
    set quptime_peers {
        type ipv4_addr
        elements = { 198.51.100.10, 198.51.100.11, 198.51.100.12 }
    }

    chain quptime_input {
        # Drop everything that didn't come from a known peer.
        # `counter` feeds the monitoring section at the end of this doc.
        ip saddr @quptime_peers tcp dport 9901 accept
        tcp dport 9901 counter log prefix "quptime-drop: " level info drop
    }

    chain input {
        type filter hook input priority 0; policy drop;
        ct state established,related accept
        iif lo accept
        jump quptime_input
        # ... your other rules
    }
}
```

The allowlist is the highest-ROI mitigation by far — if you maintain
fixed IPs for your monitor nodes, use this and move on.

### ufw

```sh
sudo ufw allow from 198.51.100.10 to any port 9901 proto tcp
sudo ufw allow from 198.51.100.11 to any port 9901 proto tcp
sudo ufw allow from 198.51.100.12 to any port 9901 proto tcp
```

### Dynamic peer IPs

If peer IPs aren't fixed (e.g., one node is on a home connection with
a rotating address), you have three options, ranked by preference:

1. Use an overlay instead — see [tailscale.md](tailscale.md). This is
   the right answer.
2. DNS-based allowlisting (`ipset`-from-DNS, or a small reconciler
   that re-resolves an allowlist hostname every minute; sketched
   below). Beware: a compromised DNS resolver becomes a compromise of
   the allowlist.
3. Drop the allowlist and rely solely on the cluster secret + mTLS.
   This is what `qu` is designed to survive; just be sure the secret
   actually has the entropy `qu init` generated for it (32 random
   bytes, base64-encoded).

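A sketch of option 2's reconciler, assuming the `quptime_peers` set
from the nftables section above; `peers.example.com` is a placeholder
for a record listing your peers. Run it from cron or a systemd timer:

```sh
#!/bin/sh
# Re-resolve the allowlist hostname and rebuild the nftables set.
ADDRS=$(getent ahostsv4 peers.example.com | awk '{print $1}' | sort -u | paste -sd, -)
[ -n "$ADDRS" ] || exit 0  # resolver failure: keep the old set
nft flush set inet filter quptime_peers
nft add element inet filter quptime_peers "{ $ADDRS }"
```
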
## Rate-limiting failed handshakes

`qu` does not currently rate-limit `Join` attempts at the application
layer. You can do it at the firewall, which catches both connect
floods and slow brute-force:

```nft
table inet filter {
    chain quptime_input {
        tcp dport 9901 ct state new meter quptime_ratemeter { ip saddr limit rate over 10/second } log prefix "quptime-rate: " drop
        tcp dport 9901 accept
    }
}
```

Or `fail2ban` with a tiny custom filter that watches `journalctl -u
quptime` for repeated `peer rejected join` lines:

```ini
# /etc/fail2ban/filter.d/quptime.conf
[Definition]
failregex = ^.*quptime:.*peer rejected join.*from <ADDR>.*$
```

```ini
# /etc/fail2ban/jail.d/quptime.local
[quptime]
enabled = true
filter = quptime
backend = systemd
journalmatch = _SYSTEMD_UNIT=quptime.service
maxretry = 3
findtime = 600
bantime = 86400
```

Note: the daemon doesn't currently log the *peer address* on rejected
joins. The log filter above is illustrative; check what your version
actually emits before relying on it.

## Secret hygiene

The single most important thing on a public-internet deployment:

- **Generate the secret on the first node.** `qu init` with no
  `--secret` produces 32 random bytes from `crypto/rand`, base64-
  encoded. Don't replace that with something memorable.
- **Transport out of band.** Paste it into your secret manager
  immediately; share via 1Password / Vault / encrypted email.
- **Rotate if anyone with access has left.** Rotation isn't a CLI
  command; do it the brute-force way: `qu init` a fresh cluster on
  new ports, re-add every check via `cluster.yaml` export, swap DNS.
- **One secret per cluster.** Do not reuse the secret across staging
  and prod, or across customers if you run several clusters.

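If you ever need to mint one by hand (say, to pre-seed a secret
manager), this produces the same shape `qu init` generates:

```sh
# 32 random bytes, base64-encoded.
openssl rand -base64 32
```
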
## Non-default ports

```sh
# Each node, in node.yaml — or pass --port on init.
qu init --advertise alpha.example.com:51234 --port 51234
```

Open the corresponding firewall rule, restart the daemon. The
cluster doesn't require uniform ports across nodes; each peer's
`advertise` field tells everyone else what to dial.

## What you should monitor on a public deployment

- `term` from `qu status` — if it's ticking up frequently the master
  is flapping, which probably means at least one peer's network is
  unstable. Could be benign, could be a probe attempt.
- The firewall drop counter on the `quptime-drop` rule above (see the
  read-out sketch below).
- The number of TLS handshakes on `:9901`. A spike in handshakes that
  don't progress to a successful RPC is the signature of a brute-force
  on the cluster secret.

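To read that drop counter, assuming the `quptime_input` chain from
the nftables section (the `counter` keyword on the drop rule is what
makes this work):

```sh
# Human-readable packet/byte counts per rule.
sudo nft list chain inet filter quptime_input
# Just the numbers, for scraping.
sudo nft -j list chain inet filter quptime_input | jq '.. | .counter? // empty'
```
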
For the operational side — backups, upgrades, recovery — see
[operations.md](../operations.md).

# Deployment: systemd on bare metal / VM

The canonical way to run `qu` on a Linux host. Single static binary,
managed by systemd, with a hardened unit file. Most production users
should start here.

## Audience and assumptions

- You have root (or `sudo`) on the host.
- You have at least three hosts that can reach each other on TCP/9901.
  (Three is the minimum for a useful quorum; fewer is fine for
  development but a 2-node cluster offers no consensus protection.)
- The hosts have a way to authenticate each other — direct IP or a
  resolvable hostname is fine. For overlay networks see
  [tailscale.md](tailscale.md).

## Install the binary

See [installation.md](../installation.md). The official `install.sh`
script writes a *minimal* unit file that's fine for development. For
production replace it with the hardened version below.

## Create a dedicated user

Running as a dedicated unprivileged user is best practice, but ICMP
support adds a wrinkle — see the next section.

```sh
sudo useradd --system --no-create-home --shell /usr/sbin/nologin quptime
sudo install -d -o quptime -g quptime -m 0750 /etc/quptime
sudo install -d -o quptime -g quptime -m 0750 /var/run/quptime
```

## ICMP capabilities

ICMP probes have two implementations:

1. **Unprivileged UDP pings** — Linux's `dgram` ICMP socket. Works on
   any modern kernel without elevated privileges, but only if
   `net.ipv4.ping_group_range` includes the daemon's GID. This is the
   default in `qu`.
2. **Raw ICMP** — requires `CAP_NET_RAW`; gives more accurate latency
   numbers and works for IPv6 on arbitrary kernels.

The simplest path: stick with unprivileged pings and widen
`ping_group_range`. Sysctl, persistent across reboots:

```sh
# /etc/sysctl.d/10-quptime.conf
net.ipv4.ping_group_range = 0 2147483647
```

```sh
sudo sysctl --system
```

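Confirm the setting took:

```sh
sysctl net.ipv4.ping_group_range
# expect: net.ipv4.ping_group_range = 0  2147483647
```
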
If you need raw ICMP instead, grant the capability on the binary:

```sh
sudo setcap cap_net_raw=+ep /usr/local/bin/qu
```

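Verify the capability is present (output format varies slightly
across libcap versions):

```sh
getcap /usr/local/bin/qu
# e.g.: /usr/local/bin/qu cap_net_raw=ep
```
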
Note that the file capability is lost every time the `qu` binary is
replaced — bake the `setcap` call into your deploy script, or re-run
it after each package update.

## Hardened unit file

Drop this in `/etc/systemd/system/quptime.service`:

```ini
[Unit]
Description=QUptime distributed uptime monitor
Documentation=https://git.cer.sh/axodouble/quptime
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
ExecStart=/usr/local/bin/qu serve
Restart=always
RestartSec=5s

User=quptime
Group=quptime

# Where state lives. RuntimeDirectory creates /run/quptime at each
# service start, owned by User:Group with mode 0750.
Environment=QUPTIME_DIR=/etc/quptime
RuntimeDirectory=quptime
RuntimeDirectoryMode=0750
ReadWritePaths=/etc/quptime /var/run/quptime

# Hardening. Comment out individual directives if a probe needs
# something we've revoked.
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
PrivateTmp=true
PrivateDevices=true
ProtectKernelTunables=true
ProtectKernelModules=true
ProtectControlGroups=true
ProtectClock=true
ProtectHostname=true
RestrictNamespaces=true
RestrictRealtime=true
RestrictSUIDSGID=true
LockPersonality=true
MemoryDenyWriteExecute=true

# Network access is required (we're a network monitor). Keep address
# families minimal — AF_NETLINK is needed for some libc lookups.
RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6 AF_NETLINK

# If you need raw ICMP, *also* uncomment:
# AmbientCapabilities=CAP_NET_RAW
# CapabilityBoundingSet=CAP_NET_RAW
# Otherwise drop all capabilities:
CapabilityBoundingSet=

[Install]
WantedBy=multi-user.target
```

Reload systemd and enable:

```sh
sudo systemctl daemon-reload
sudo systemctl enable quptime.service
```

## Initialise the node

**Don't start the service yet** — `qu init` must run first, and it
must run as the `quptime` user so it creates files with the right
ownership.

On the **first** host (it will print a secret; copy it):

```sh
sudo -u quptime QUPTIME_DIR=/etc/quptime \
  qu init --advertise alpha.example.com:9901
```

On every **other** host (paste the secret):

```sh
sudo -u quptime QUPTIME_DIR=/etc/quptime \
  qu init --advertise bravo.example.com:9901 --secret '<paste>'

sudo -u quptime QUPTIME_DIR=/etc/quptime \
  qu init --advertise charlie.example.com:9901 --secret '<paste>'
```

## Open the firewall

`qu` needs TCP/9901 reachable between cluster members. Adjust to your
firewall:

```sh
# ufw
sudo ufw allow from <peer-ip> to any port 9901 proto tcp

# firewalld
sudo firewall-cmd --permanent --zone=internal \
  --add-rich-rule='rule family=ipv4 source address=<peer-ip> port port=9901 protocol=tcp accept'
sudo firewall-cmd --reload
```

```nft
# nftables (drop-in)
table inet filter {
    chain input {
        ip saddr { 10.0.0.10, 10.0.0.11, 10.0.0.12 } tcp dport 9901 accept
    }
}
```

For exposing 9901 to the open internet see
[public-internet.md](public-internet.md).

## Start the daemon

```sh
sudo systemctl start quptime
sudo systemctl status quptime
journalctl -u quptime -f
```

## Invite peers

From one node (typically `alpha`):

```sh
sudo -u quptime qu node add bravo.example.com:9901
# Pause a few seconds so heartbeats reach the new peer before the next add —
# otherwise the "needs ≥2 live to mutate" check rejects the second invite.
sleep 3
sudo -u quptime qu node add charlie.example.com:9901
```

`qu node add` prints each remote's fingerprint and asks for SSH-style
confirmation. Verify it against an out-of-band channel (the remote
operator can show their fingerprint with
`sudo -u quptime qu status` or by reading `trust.yaml`).

## Verify

```sh
sudo -u quptime qu status
```

Expect to see all three peers `live=true` and one of them as
`master`.

## Log scraping

`journalctl -u quptime` is the canonical log stream. Notable lines:

| Pattern | Meaning |
| ------- | ------- |
| `listening on ... as node ...` | Daemon up. |
| `manual-edit: cluster.yaml changed externally — replicating…` | An operator edited `cluster.yaml` directly. |
| `manual-edit: parse cluster.yaml: ...` | Invalid YAML on disk; the operator must fix and re-save. |
| `report to master ...: <err>` | A follower couldn't ship a probe result to the master. |
| `replicate: pull from ...: <err>` | A follower couldn't pull a higher-version config snapshot. |

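A quick scrape you can wire into cron or an alert, using the error
patterns from the table (verify them against what your version
actually emits):

```sh
# Count report/replication errors over the last ten minutes.
journalctl -u quptime --since -10min --no-pager \
  | grep -Ec 'report to master|replicate: pull from'
```
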
## Sample reload / restart drill

After editing the unit file:

```sh
sudo systemctl daemon-reload
sudo systemctl restart quptime
```

After editing `cluster.yaml` by hand:

```sh
sudoedit /etc/quptime/cluster.yaml
# No restart needed — the watcher picks it up within 2s and pushes to master.
```

After upgrading the binary:

```sh
sudo install -m 0755 qu-new /usr/local/bin/qu
sudo setcap cap_net_raw=+ep /usr/local/bin/qu  # if you use raw ICMP
sudo systemctl restart quptime
```

Doing rolling upgrades? See [operations.md](../operations.md).

# Deployment: Tailscale / WireGuard overlay

When your nodes live in different networks — different VPS providers,
different physical sites, a mix of home and cloud — exposing TCP/9901
to the open internet is a poor idea. An overlay network gives every
node a stable private IP regardless of NAT, and `qu` only needs to
listen on that overlay address.

This page focuses on Tailscale because the repo ships an example
compose for it, but everything generalises to WireGuard, Nebula, or a
self-hosted Headscale.

## The big idea

```
+--- host A (VPS, no public ICMP) ----+
| tailscale ←→ overlay ip 100.64.1.1  |
| qu listening on 100.64.1.1:9901     |
+-------------------------------------+
           │ mTLS over overlay
           ▼
+--- host B (homelab behind NAT) -----+
| tailscale ←→ overlay ip 100.64.1.2  |
| qu listening on 100.64.1.2:9901     |
+-------------------------------------+
```

`bind_addr` is set to the Tailscale IP, the host's public interface
has no port 9901 open, and the cluster secret + mTLS handshake gate
the link inside the tunnel.

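Concretely, the relevant `node.yaml` fields might look like this
(field names as used elsewhere in these docs; check your generated
`node.yaml` for the exact schema):

```yaml
# node.yaml (excerpt)
bind_addr: 100.64.1.1       # listen only on the overlay interface
advertise: 100.64.1.1:9901  # what peers dial to reach this node
```
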
## Compose recipe

The repo ships [`docker/docker-compose-tailscale.yml`](../../docker/docker-compose-tailscale.yml).
The relevant trick is `network_mode: "service:tailscale"` — the
`quptime` container shares the network namespace of the `tailscale`
sidecar so it sees the tailnet as its own interface.

```yaml
services:
  tailscale:
    image: tailscale/tailscale:latest
    container_name: tailscale
    cap_add: [NET_ADMIN]
    environment:
      - TS_AUTHKEY=${TAILSCALE_AUTHKEY}  # provision via .env
      - TS_HOSTNAME=quptime-${HOST}      # name visible in admin
    volumes:
      - /dev/net/tun:/dev/net/tun
      - tailscale:/var/lib/tailscale
    restart: unless-stopped

  quptime:
    image: git.cer.sh/axodouble/quptime:v0.1.0
    container_name: quptime
    volumes:
      - quptime:/etc/quptime
    network_mode: "service:tailscale"
    depends_on: [tailscale]
    cap_add: [NET_RAW]
    # No restart directive yet — needs `qu init` first.

volumes:
  tailscale:
  quptime:
```

### One-time bootstrap

Each host runs the same script with different `HOST` and `TAILSCALE_AUTHKEY`:

```sh
# .env
HOST=alpha
TAILSCALE_AUTHKEY=tskey-auth-xxxxxxxx
```

Start Tailscale alone first so it gets an IP:

```sh
docker compose up -d tailscale
sleep 5
TSIP=$(docker compose exec tailscale tailscale ip -4)
echo "this node's tailnet IP: $TSIP"
```

On the **first** host, init without `--secret`:

```sh
docker compose run --rm quptime init --advertise "$TSIP:9901"
# Grab the printed secret and store it in your password manager.
```

On every **other** host, paste the secret:

```sh
docker compose run --rm quptime init \
  --advertise "$TSIP:9901" \
  --secret "$CLUSTER_SECRET"
```

Then bring up `qu` on every node and invite from the first:

```sh
# Each host
docker compose up -d quptime

# From alpha
docker compose exec quptime qu node add 100.64.1.2:9901
sleep 3
docker compose exec quptime qu node add 100.64.1.3:9901

docker compose exec quptime qu status
```

## Tailscale ACLs

Belt and braces — even though mTLS pins identities, lock down the
tailnet itself so only the `qu` nodes can reach each other's :9901.
In the Tailscale admin console:

```jsonc
{
  "tagOwners": { "tag:qu-node": ["group:ops"] },
  "acls": [
    {
      "action": "accept",
      "src": ["tag:qu-node"],
      "dst": ["tag:qu-node:9901"]
    }
    // ...your other rules
  ]
}
```

Then tag every `qu` node. The cleanest way is to create the auth key
with `tag:qu-node` attached in the admin console; alternatively, the
sidecar can advertise the tag itself:

```yaml
environment:
  - TS_AUTHKEY=${TAILSCALE_AUTHKEY}
  - TS_EXTRA_ARGS=--advertise-tags=tag:qu-node
```

## WireGuard / Nebula / Headscale equivalents

The recipe generalises:

1. Provision the overlay interface on each host with a stable
   private IP (the tunnel's own address; see the WireGuard sketch
   below).
2. `qu init --advertise <overlay-ip>:9901`.
3. Set `bind_addr: <overlay-ip>` in `node.yaml` so the daemon does
   **not** also listen on the public interface.
4. Open `:9901` only on the overlay interface in your firewall — for
   nftables that's something like `iifname "wg0" tcp dport 9901
   accept`.

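For step 1 with plain WireGuard, a minimal `wg0.conf` sketch (keys
and endpoint are placeholders; the `10.10.0.0/24` subnet is an
arbitrary example):

```ini
# /etc/wireguard/wg0.conf
[Interface]
Address = 10.10.0.1/24            # this host's stable overlay IP
PrivateKey = <this-host-private-key>
ListenPort = 51820

[Peer]
PublicKey = <peer-public-key>
AllowedIPs = 10.10.0.2/32         # the peer's overlay IP
Endpoint = peer.example.com:51820
PersistentKeepalive = 25          # keeps NAT mappings alive
```

Then `qu init --advertise 10.10.0.1:9901` on this host, and so on.
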
The cluster secret and mTLS fingerprints still apply; the overlay just
removes the open-internet attack surface.

## Why prefer overlay over public exposure

- Single failure domain at the network layer: an attacker who finds an
  exploit in your overlay client (rare; Tailscale and WireGuard are
  small surfaces) still hits the application-layer pinning before any
  cluster-level operation.
- The cluster secret can be lower-entropy when it's already
  unreachable from outside. (You should still treat it as a real
  secret; "defence in depth" only works if every layer is real.)
- ICMP probes from a homelab to a target on the public internet are
  trivial through NAT, but ICMP *into* a homelab usually isn't.
  Running `qu` on a tailnet means peers can heartbeat each other
  regardless of NAT direction.

## Trade-offs

- One more thing to monitor. If your tailnet is down, your monitor is
  down. Counter-measure: run *another* tiny `qu` cluster (or a single
  node) on the public internet that watches the overlay's coordinator
  health.
- Probe latency includes the overlay's hop. Tailscale's WireGuard is
  fast (<1 ms LAN, single-digit ms WAN) so this rarely matters, but
  if you're alerting on tight latency thresholds, account for it.