QUptime documentation
Production-oriented documentation for qu, a small distributed uptime
monitor that votes on the health of HTTP/TCP/ICMP targets across a
cluster of cooperating nodes.
The top-level README.md is the marketing pitch and quick-start. The
pages here go deeper and are organised by what you're trying to do.
Getting set up
- Installation — pre-built binaries, building from source, verifying release artifacts, what the install script does.
- Configuration — node.yaml, cluster.yaml, trust.yaml, environment variables, file layout, defaults.
Running it
- Architecture — how nodes form quorum, how a master is elected, how cluster state replicates, what happens during a partition, and exactly which guarantees the design gives you.
- Operations — day-2 tasks: upgrades, backups, recovery from a lost node, recovery from a lost quorum, monitoring qu itself.
- Security — the mTLS / TOFU trust model, what the cluster secret protects, how to rotate keys, what to put on a public network and what not to.
- Troubleshooting — common failure modes with the log lines you'll see and the fix.
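The Architecture page is the authority on how nodes form quorum and vote on target health. As a rough illustration only, the sketch below assumes a strict-majority rule, which this index does not confirm. `hasQuorum` and `targetDown` are hypothetical helpers, not part of qu's code or API.

```go
package main

import "fmt"

// hasQuorum reports whether the reachable nodes form a strict majority
// of the configured cluster. ASSUMPTION: qu's real quorum rule may differ.
func hasQuorum(reachable, clusterSize int) bool {
	if clusterSize <= 0 || reachable < 0 {
		return false
	}
	return reachable > clusterSize/2
}

// targetDown reports whether a strict majority of the votes cast mark a
// target as down. ASSUMPTION: hypothetical voting rule for illustration.
func targetDown(downVotes, totalVotes int) bool {
	if totalVotes <= 0 || downVotes < 0 {
		return false
	}
	return downVotes > totalVotes/2
}

func main() {
	fmt.Println(hasQuorum(2, 3))  // true: 2 of 3 is a majority
	fmt.Println(hasQuorum(2, 4))  // false: an even split is not a majority
	fmt.Println(targetDown(2, 3)) // true: 2 of 3 voters report the target down
}
```

Under this assumed rule, a cluster split into two equal halves has no side with quorum, which is why odd cluster sizes are the usual choice for majority-based designs.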
Deployment recipes
Pick the one that matches your environment. They share most of the
operational guidance — what differs is how qu is packaged and how
the inter-node link is secured at the network layer.
- systemd on bare metal / VM — single static binary, hardened unit file, CAP_NET_RAW for ICMP.
- Docker / docker-compose — official image, single-node and multi-node compose files, persistent volumes.
- Tailscale / WireGuard overlay — nodes in separate networks with no public ingress; cluster traffic stays on the tailnet.
- Public-internet exposure — when you have no overlay and :9901 is reachable from the open internet: firewalling, rate-limiting, secret hygiene.
A note on stability
The wire protocol (internal/transport) and the on-disk format
(cluster.yaml, node.yaml, trust.yaml) are considered stable
within a minor version. Breaking changes will bump the major version
and ship with a migration note.
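
One way to act on this policy before a rolling upgrade is to check that the old and new releases share a major version. The sketch below assumes version strings of the form `vMAJOR.MINOR.PATCH`, which this page does not specify. `majorOf` and `sameMajor` are hypothetical helpers, not part of qu.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// majorOf extracts the major number from a version such as "v1.4.2".
// ASSUMPTION: versions look like vMAJOR.MINOR.PATCH.
func majorOf(version string) (int, error) {
	v := strings.TrimPrefix(version, "v")
	head, _, _ := strings.Cut(v, ".")
	major, err := strconv.Atoi(head)
	if err != nil {
		return 0, fmt.Errorf("parse major of %q: %w", version, err)
	}
	return major, nil
}

// sameMajor reports whether two versions share a major number, meaning
// no breaking wire or on-disk change is expected between them.
func sameMajor(a, b string) (bool, error) {
	ma, err := majorOf(a)
	if err != nil {
		return false, err
	}
	mb, err := majorOf(b)
	if err != nil {
		return false, err
	}
	return ma == mb, nil
}

func main() {
	ok, err := sameMajor("v1.4.2", "v1.9.0")
	fmt.Println(ok, err) // true <nil>
	ok, err = sameMajor("v1.4.2", "v2.0.0")
	fmt.Println(ok, err) // false <nil>
}
```

A `false` result is a cue to read the migration note before upgrading, not proof that the upgrade will fail.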