Configuration
This page is the canonical reference for the on-disk files, the
environment variables, and every field that qu reads. It's
deliberately tedious — when something doesn't behave the way you
expect, this is where the answer lives.
File layout
When running as root (the typical case under systemd):
/etc/quptime/
├── node.yaml           identity, never replicated
├── cluster.yaml        replicated state
├── trust.yaml          local fingerprint trust store
└── keys/
    ├── private.pem     RSA private key (0600)
    ├── public.pem      RSA public key
    └── cert.pem        self-signed X.509 cert
/var/run/quptime/quptime.sock   control socket (0600)
When running as a non-root user (the typical case for go run or a
desktop test):
~/.config/quptime/... same shape as /etc/quptime
$XDG_RUNTIME_DIR/quptime/quptime.sock control socket
Override the data directory with QUPTIME_DIR=/some/path qu serve.
Override the socket path with QUPTIME_SOCKET=/run/foo.sock.
Environment variables
Paths
| Variable | Purpose |
|---|---|
| QUPTIME_DIR | Data directory. Defaults to /etc/quptime (root) or $XDG_CONFIG_HOME/quptime. |
| QUPTIME_SOCKET | Path to the CLI ↔ daemon unix socket. Defaults to /var/run/quptime/quptime.sock (root) or $XDG_RUNTIME_DIR/quptime/…. |
| XDG_CONFIG_HOME | Honored when running as non-root and QUPTIME_DIR is unset. |
| XDG_RUNTIME_DIR | Honored when running as non-root and QUPTIME_SOCKET is unset. |
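The lookup order in the table above can be sketched in Go as follows. This is an illustration only, not the daemon's actual code: the function name dataDir and its parameters are hypothetical, and the fallback to ~/.config when XDG_CONFIG_HOME is unset is an assumption based on the XDG default and the file layout shown earlier.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// dataDir sketches the data-directory lookup order. getenv is injected so
// the logic can be exercised without touching the real environment; uid and
// home describe the calling user. All names here are hypothetical.
func dataDir(getenv func(string) string, uid int, home string) string {
	if dir := getenv("QUPTIME_DIR"); dir != "" {
		return dir // an explicit override always wins
	}
	if uid == 0 {
		return "/etc/quptime" // root default
	}
	if xdg := getenv("XDG_CONFIG_HOME"); xdg != "" {
		return filepath.Join(xdg, "quptime")
	}
	// Assumption: fall back to the XDG default of ~/.config.
	return filepath.Join(home, ".config", "quptime")
}

func main() {
	home, _ := os.UserHomeDir()
	fmt.Println(dataDir(os.Getenv, os.Getuid(), home))
}
```

The socket path would follow the same pattern with QUPTIME_SOCKET and XDG_RUNTIME_DIR.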
node.yaml field overrides
Every field in node.yaml can also be supplied via an environment
variable. This is the recommended way to drive Docker / Compose
deployments: drop the env vars into the compose file and the daemon
will bootstrap on first start without a separate qu init step.
| Variable | node.yaml field | Notes |
|---|---|---|
| QUPTIME_NODE_ID | node_id | Pin a specific UUID. Leave unset to let qu init / auto-init generate one. |
| QUPTIME_BIND_ADDR | bind_addr | Defaults to 0.0.0.0. |
| QUPTIME_BIND_PORT | bind_port | Integer. Defaults to 9901. |
| QUPTIME_ADVERTISE | advertise | host:port other peers use to reach this node. Required when bound to a wildcard or behind NAT. |
| QUPTIME_CLUSTER_SECRET | cluster_secret | Pre-shared join secret. Set the same value on every node. If unset on the very first node, one is generated. |
Precedence is env > file > compiled default. Non-empty env values
win over whatever is stored in node.yaml at load time, so changing a
variable in docker-compose.yml and restarting the container is
enough to roll out new bind/advertise values — no on-disk edit
required. Empty env values are ignored (they will not clear a
previously persisted field).
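As a rough illustration of the env > file > compiled default rule, the sketch below resolves a single field. The helper name resolve and the sample values are hypothetical; the real loader may be structured differently.

```go
package main

import (
	"fmt"
	"os"
)

// resolve returns the effective value for one node.yaml field, applying
// env > file > compiled default. Empty values are skipped at every level,
// so an empty env var never clears a persisted field.
func resolve(envValue, fileValue, compiledDefault string) string {
	if envValue != "" {
		return envValue
	}
	if fileValue != "" {
		return fileValue
	}
	return compiledDefault
}

func main() {
	// fileBindAddr stands in for the value loaded from node.yaml.
	fileBindAddr := "127.0.0.1"
	fmt.Println(resolve(os.Getenv("QUPTIME_BIND_ADDR"), fileBindAddr, "0.0.0.0"))
}
```

With QUPTIME_BIND_ADDR unset or empty, this prints the persisted 127.0.0.1; with it set, the env value wins.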
For qu init specifically, explicit command-line flags take
precedence over env values; env values fill in only the fields the
operator did not pass on the command line.
The daemon does not read any other environment variables. SMTP, Discord,
and HTTP probe targets are configured exclusively in cluster.yaml.
Auto-init on qu serve
If node.yaml does not exist when qu serve starts, the daemon
bootstraps it in-place using the QUPTIME_* env vars above: a fresh
UUID is generated (or QUPTIME_NODE_ID is honored if set), an RSA
keypair and self-signed cert are written under keys/, and
cluster.yaml is seeded with this node as its sole peer. If no
QUPTIME_CLUSTER_SECRET was provided, a random one is generated and
printed to stderr — copy it to every follower node's
QUPTIME_CLUSTER_SECRET (or --secret flag) before they start.
This is what lets the Docker Compose flow work with nothing more than
docker compose up on a fresh volume. To opt out (e.g. so a misconfigured deployment
crashes loudly instead of silently generating a new identity), run
qu init against the volume yourself before letting qu serve ever
see it.
node.yaml — local identity
Never replicated. One file per host. Generated by qu init.
node_id: 7f3a5b9e-... # UUIDv4, immutable after init
bind_addr: 0.0.0.0 # listen address for :9901
bind_port: 9901 # listen port
advertise: alpha.example.com:9901 # how peers reach us; may differ from bind
cluster_secret: 4hZqK8vT9... # base64; required to Join, never replicated
Field reference
- node_id: UUIDv4 generated at qu init. Used by every peer to refer to this node across IP changes and restarts. Do not edit.
- bind_addr: Address the daemon listens on. 0.0.0.0 is the default. Set to 127.0.0.1 if you only want to expose the daemon through an overlay (Tailscale, WireGuard); see deployment/tailscale.md.
- bind_port: Defaults to 9901. Change here if 9901 is taken; the cluster does not require port-uniformity, peers just need to know what to dial via the advertise field.
- advertise: Host:port other nodes use to reach this one. Must be routable from every peer. Falls back to bind_addr:bind_port if unset, which is rarely what you want behind NAT.
- cluster_secret: Pre-shared base64 string. Required on every Join RPC; constant-time comparison on the receiver. Generate on the first node, distribute out-of-band, keep out of version control.
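For readers unfamiliar with constant-time comparison, here is a minimal Go sketch of generating a base64 secret and checking a presented one. The function names and the 32-byte secret length are assumptions for illustration; only the standard-library calls are real.

```go
package main

import (
	"crypto/rand"
	"crypto/subtle"
	"encoding/base64"
	"fmt"
)

// newSecret returns a random base64 string. The 32-byte length is an
// assumption; the daemon may use a different size.
func newSecret() (string, error) {
	buf := make([]byte, 32)
	if _, err := rand.Read(buf); err != nil {
		return "", err
	}
	return base64.StdEncoding.EncodeToString(buf), nil
}

// secretsMatch compares two secrets without leaking, through timing,
// how many leading bytes matched.
func secretsMatch(presented, expected string) bool {
	return subtle.ConstantTimeCompare([]byte(presented), []byte(expected)) == 1
}

func main() {
	secret, err := newSecret()
	if err != nil {
		panic(err)
	}
	fmt.Println(secretsMatch(secret, secret), secretsMatch("wrong", secret))
}
```

A plain == comparison can return early on the first differing byte, which is the timing signal the constant-time variant avoids.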
How qu init populates this file
qu init \
--advertise alpha.example.com:9901 \
--bind 0.0.0.0 \
--port 9901 \
--secret '<paste from first node, or omit on the first node>'
qu init will not replace an existing identity: if node.yaml exists, it
refuses to overwrite it. To re-init, delete the data directory entirely.
cluster.yaml — replicated state
This is the file that every node converges on. The master is the only
one allowed to bump version; followers Replace it whole each time
they receive a higher-versioned snapshot.
version: 12
updated_at: 2026-05-15T14:01:00Z
updated_by: 7f3a5b9e-...
peers:
  - node_id: 7f3a5b9e-...
    advertise: alpha.example.com:9901
    fingerprint: SHA256:abcd...
    cert_pem: |
      -----BEGIN CERTIFICATE-----
      ...
      -----END CERTIFICATE-----
checks:
  - id: 0006a1...
    name: homepage
    type: http
    target: https://example.com
    interval: 30s
    timeout: 10s
    expect_status: 200
    alert_ids: [oncall]
    suppress_alert_ids: []
alerts:
  - id: f001ab...
    name: oncall
    type: discord
    default: true
    discord_webhook: https://discord.com/api/webhooks/...
    body_template: |
      :rotating_light: {{.Check.Name}} is {{.Verb}}
Top-level fields
| Field | Owner | Notes |
|---|---|---|
| version | master | Monotonic. Followers reject snapshots whose version is ≤ their local. |
| updated_at | master | UTC RFC3339. Cosmetic: humans use it, no logic depends on it. |
| updated_by | master | NodeID of the committing master. |
| peers | editable | Cluster members. Edits go through add_peer / remove_peer mutations. |
| checks | editable | Monitored targets. |
| alerts | editable | Notifier destinations. |
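The version rule in the table amounts to a strict greater-than check on the follower. A minimal sketch, with a hypothetical function name and an assumed unsigned integer type for the version:

```go
package main

import "fmt"

// shouldReplace reports whether a follower should replace its local
// cluster.yaml with an incoming snapshot. Snapshots whose version is less
// than or equal to the local one are rejected.
func shouldReplace(localVersion, incomingVersion uint64) bool {
	return incomingVersion > localVersion
}

func main() {
	fmt.Println(shouldReplace(12, 13)) // true: newer snapshot is accepted
	fmt.Println(shouldReplace(12, 12)) // false: same version is rejected
	fmt.Println(shouldReplace(12, 11)) // false: stale snapshot is rejected
}
```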
peers[]
- node_id: 7f3a5b9e-...     # immutable, the peer's own UUID
  advertise: host:port      # how anyone dials this peer
  fingerprint: SHA256:...   # SPKI fingerprint of the peer's cert
  cert_pem: |               # full PEM so other peers can mTLS without a separate invite
    -----BEGIN CERTIFICATE-----
    ...
The cert_pem field is what enables N-node clusters without N×(N-1)
manual invites: when peer X is added via the master, every other node
that receives the new cluster.yaml learns X's cert at the same time
and adds it to the local trust store. See
internal/daemon/daemon.go:syncTrustFromCluster.
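To make the fingerprint field concrete, the sketch below creates a throwaway self-signed RSA certificate and hashes its SubjectPublicKeyInfo with SHA-256. It is not taken from the qu source: the function names are hypothetical, and the unpadded base64 encoding after the SHA256: prefix is an assumption (the daemon may use hex or padded base64).

```go
package main

import (
	"crypto/rand"
	"crypto/rsa"
	"crypto/sha256"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/base64"
	"fmt"
	"math/big"
	"time"
)

// selfSignedCert builds a throwaway self-signed certificate for the demo.
func selfSignedCert(commonName string) (*x509.Certificate, error) {
	key, err := rsa.GenerateKey(rand.Reader, 2048)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: commonName},
		NotBefore:    time.Now().Add(-time.Minute),
		NotAfter:     time.Now().Add(365 * 24 * time.Hour),
	}
	// Passing tmpl as both template and parent makes the cert self-signed.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

// spkiFingerprint hashes the DER-encoded SubjectPublicKeyInfo of the cert.
// Assumption: unpadded base64 output; the real encoding may differ.
func spkiFingerprint(cert *x509.Certificate) string {
	sum := sha256.Sum256(cert.RawSubjectPublicKeyInfo)
	return "SHA256:" + base64.RawStdEncoding.EncodeToString(sum[:])
}

func main() {
	cert, err := selfSignedCert("alpha.example.com")
	if err != nil {
		panic(err)
	}
	fmt.Println(spkiFingerprint(cert))
}
```

Because the hash covers only the public key, the fingerprint stays the same if a certificate is reissued with the same keypair.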
checks[]
- id: 0006a1...               # UUIDv4, generated when the check is created
  name: homepage              # human-friendly, must be unique within cluster
  type: http                  # http | tcp | icmp
  target: https://example.com
  interval: 30s               # Go duration syntax: 5s, 1m30s, 2h
  timeout: 10s                # default 10s
  expect_status: 200          # http only; 0 = accept anything < 400
  body_match: "OK"            # http only; substring match on response body
  alert_ids: [oncall]         # alerts attached explicitly
  suppress_alert_ids: []      # opt out of specific default alerts
Defaults:
- interval: 30s
- timeout: 10s
- expect_status: 0 → any 2xx is OK; otherwise the configured status must match exactly.
ICMP checks default to unprivileged UDP-mode pings so the daemon does not need root. For raw ICMP, grant the capability — see deployment/systemd.md.
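The interval and timeout fields use Go duration syntax, which the standard library parses with time.ParseDuration. The short sketch below shows what that syntax accepts and rejects; the helper name intervalSeconds is hypothetical, and whether the daemon calls ParseDuration directly is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// intervalSeconds parses a Go duration string such as "30s" or "1m30s"
// and returns its length in seconds.
func intervalSeconds(s string) (float64, error) {
	d, err := time.ParseDuration(s)
	if err != nil {
		return 0, err
	}
	return d.Seconds(), nil
}

func main() {
	for _, s := range []string{"5s", "1m30s", "2h", "30"} {
		secs, err := intervalSeconds(s)
		if err != nil {
			// A bare number such as "30" has no unit and is rejected.
			fmt.Printf("%q: invalid: %v\n", s, err)
			continue
		}
		fmt.Printf("%q: %.0f seconds\n", s, secs)
	}
}
```

The practical consequence is that a value like interval: 30 without a unit is not a valid Go duration.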
alerts[]
Two notifier kinds, distinguished by type:
# Discord
- id: f001ab...
  name: oncall
  type: discord
  default: true                 # attach to every check automatically
  discord_webhook: https://...
  body_template: |              # optional Go text/template override
    {{.Check.Name}} is {{.Verb}}

# SMTP
- id: f002cd...
  name: ops
  type: smtp
  smtp_host: smtp.example.com
  smtp_port: 587
  smtp_user: mailbot
  smtp_password: '...'
  smtp_from: monitor@example.com
  smtp_to: [ops@example.com]
  smtp_starttls: true
  subject_template: '[{{.Verb}}] {{.Check.Name}}'
  body_template: |
    Check {{.Check.Name}} ({{.Check.Target}}) is now {{.Verb}}.
If default: true, the alert fires for every check unless the check
lists the alert's ID or name in suppress_alert_ids. Otherwise the
alert only fires for checks that name it in alert_ids.
Templates are Go text/template. The full variable list is in the
top-level README under "Custom alert messages" — qu alert add smtp --help and qu alert add discord --help print the same table.
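To preview how a template renders before committing it, a small Go program using text/template is enough. The data types below are hypothetical stand-ins that expose only the three variables used in the examples on this page (.Check.Name, .Check.Target, .Verb); the value "down" for Verb is likewise an assumption. The README remains the authority on the full variable list.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// check and alertData are hypothetical stand-ins for the values the
// dispatcher passes to templates.
type check struct {
	Name   string
	Target string
}

type alertData struct {
	Check check
	Verb  string
}

// render executes a Go text/template against the sample data.
func render(tmpl string, data alertData) (string, error) {
	t, err := template.New("alert").Parse(tmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	data := alertData{
		Check: check{Name: "homepage", Target: "https://example.com"},
		Verb:  "down",
	}
	out, err := render("Check {{.Check.Name}} ({{.Check.Target}}) is now {{.Verb}}.", data)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

A template that references a field missing from the data struct fails at Execute time, which makes this a cheap way to catch typos.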
Suppression precedence
For each check, the dispatcher computes the effective alert list as:
( explicit alert_ids ∪ alerts with default=true ) \ suppress_alert_ids
de-duplicated by alert ID. So a check can both opt in to specific alerts and opt out of specific defaults.
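The set expression above can be sketched in Go as follows. The type and function names are hypothetical, and matching alert_ids entries by either ID or name is an assumption drawn from the examples on this page, where alert_ids: [oncall] refers to an alert by name.

```go
package main

import "fmt"

// alert is a hypothetical, trimmed-down view of an alerts[] entry.
type alert struct {
	ID      string
	Name    string
	Default bool
}

// refers reports whether refs names the alert by ID or by name.
func refers(refs []string, a alert) bool {
	for _, r := range refs {
		if r == a.ID || r == a.Name {
			return true
		}
	}
	return false
}

// effectiveAlerts computes (explicit ∪ default) \ suppressed and returns
// alert IDs. Each alert is visited once, so the result has no duplicates
// as long as IDs in all are unique.
func effectiveAlerts(all []alert, alertIDs, suppress []string) []string {
	var out []string
	for _, a := range all {
		if !a.Default && !refers(alertIDs, a) {
			continue // neither a default nor explicitly attached
		}
		if refers(suppress, a) {
			continue // explicitly suppressed for this check
		}
		out = append(out, a.ID)
	}
	return out
}

func main() {
	all := []alert{
		{ID: "a1", Name: "oncall", Default: true},
		{ID: "a2", Name: "ops", Default: false},
		{ID: "a3", Name: "pager", Default: true},
	}
	// Opt in to "ops", opt out of the default "pager".
	fmt.Println(effectiveAlerts(all, []string{"ops"}, []string{"a3"})) // [a1 a2]
}
```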
trust.yaml — local trust store
A flat list of fingerprints this node accepts. One entry per peer,
populated by qu node add (or pulled in automatically when a peer's
cert arrives via the replicated cluster.yaml).
entries:
  - node_id: 7f3a5b9e-...
    address: alpha.example.com:9901
    fingerprint: SHA256:...
    cert_pem: |
      -----BEGIN CERTIFICATE-----
      ...
Never edit this by hand. Use qu trust list and qu trust remove.
Key material
keys/private.pem is the only secret on disk besides
node.yaml.cluster_secret. It's chmod 0600 by default; preserve that.
The public cert at keys/cert.pem is what gets fingerprinted and
shipped in cluster.yaml.peers[].cert_pem.
There is no automatic key rotation. Rolling a node's identity
means wiping its data directory, running qu init again, and
re-adding it from another node as a fresh peer.
Tunables that don't live in YAML
A few values are compiled constants. Change them in source and rebuild if you need different behaviour.
| Constant | Default | What it does |
|---|---|---|
| quorum.DefaultHeartbeatInterval | 1s | How often each node heartbeats every peer. |
| quorum.DefaultDeadAfter | 4s | A peer is dead if no heartbeat is seen within this window. |
| checks.HysteresisCount | 2 | Consecutive aggregate evaluations needed before a state flip. |
| checks.ReconcileInterval | 5s | How often the scheduler reconciles its workers vs checks[]. |
| daemon.manualEditPollInterval (internal/daemon/watcher.go) | 2s | How often the daemon hashes cluster.yaml for hand edits. |
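As an illustration of what a hysteresis count of 2 means in practice, the sketch below flips a check's reported state only after two consecutive evaluations disagree with it. This is one plausible reading of checks.HysteresisCount, not the daemon's implementation; the flipper type and its fields are hypothetical.

```go
package main

import "fmt"

// flipper holds a check's reported state and changes it only after
// threshold consecutive evaluations disagree with that state.
type flipper struct {
	up        bool // currently reported state
	streak    int  // consecutive evaluations that disagree with up
	threshold int  // e.g. 2, mirroring checks.HysteresisCount
}

// observe feeds one aggregate evaluation and returns the reported state.
func (f *flipper) observe(up bool) bool {
	if up == f.up {
		f.streak = 0 // agreement resets the streak
		return f.up
	}
	f.streak++
	if f.streak >= f.threshold {
		f.up = up
		f.streak = 0
	}
	return f.up
}

func main() {
	f := &flipper{up: true, threshold: 2}
	for _, sample := range []bool{false, true, false, false} {
		fmt.Println(f.observe(sample))
	}
}
```

In this run a single failed evaluation followed by a success does not flip the state; only the two consecutive failures at the end do, so the program prints true three times and then false.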