Configuration
This page is the canonical reference for the on-disk files, the
environment variables, and every field that qu reads. It's
deliberately tedious — when something doesn't behave the way you
expect, this is where the answer lives.
File layout
When running as root (the typical case under systemd):
/etc/quptime/
├── node.yaml       identity, never replicated
├── cluster.yaml    replicated state
├── trust.yaml      local fingerprint trust store
└── keys/
    ├── private.pem   RSA private key (0600)
    ├── public.pem    RSA public key
    └── cert.pem      self-signed X.509 cert
/var/run/quptime/quptime.sock   control socket (0600)
When running as a non-root user (the typical case for go run or a
desktop test):
~/.config/quptime/... same shape as /etc/quptime
$XDG_RUNTIME_DIR/quptime/quptime.sock control socket
Override the data directory with QUPTIME_DIR=/some/path qu serve.
Override the socket path with QUPTIME_SOCKET=/run/foo.sock.
Environment variables
| Variable | Purpose |
|---|---|
| QUPTIME_DIR | Data directory. Defaults to /etc/quptime (root) or $XDG_CONFIG_HOME/quptime. |
| QUPTIME_SOCKET | Path to the CLI ↔ daemon unix socket. Defaults to /var/run/quptime/quptime.sock (root) or $XDG_RUNTIME_DIR/quptime/…. |
| XDG_CONFIG_HOME | Honored when running as non-root and QUPTIME_DIR is unset. |
| XDG_RUNTIME_DIR | Honored when running as non-root and QUPTIME_SOCKET is unset. |
The daemon does not read any other environment variables. SMTP, Discord,
and HTTP probe targets are configured exclusively in cluster.yaml.
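The lookup order described above can be sketched in Go as follows. This is an illustration of the documented precedence, not the daemon's actual code: the function names `resolveDataDir` and `resolveSocket` are hypothetical, and the behaviour when XDG_RUNTIME_DIR is unset is not specified on this page, so the sketch simply assumes it is set.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// resolveDataDir applies the documented precedence for the data directory:
// QUPTIME_DIR, then /etc/quptime for root, then $XDG_CONFIG_HOME/quptime,
// then ~/.config/quptime. getenv is injected so the logic is easy to test.
func resolveDataDir(getenv func(string) string, isRoot bool, home string) string {
	if dir := getenv("QUPTIME_DIR"); dir != "" {
		return dir
	}
	if isRoot {
		return "/etc/quptime"
	}
	if xdg := getenv("XDG_CONFIG_HOME"); xdg != "" {
		return filepath.Join(xdg, "quptime")
	}
	return filepath.Join(home, ".config", "quptime")
}

// resolveSocket applies the documented precedence for the control socket.
// Assumption: XDG_RUNTIME_DIR is set for non-root users; if it is empty this
// sketch returns a relative path, which the real daemon may handle differently.
func resolveSocket(getenv func(string) string, isRoot bool) string {
	if s := getenv("QUPTIME_SOCKET"); s != "" {
		return s
	}
	if isRoot {
		return "/var/run/quptime/quptime.sock"
	}
	return filepath.Join(getenv("XDG_RUNTIME_DIR"), "quptime", "quptime.sock")
}

func main() {
	home, _ := os.UserHomeDir()
	isRoot := os.Geteuid() == 0
	fmt.Println("data dir:", resolveDataDir(os.Getenv, isRoot, home))
	fmt.Println("socket:  ", resolveSocket(os.Getenv, isRoot))
}
```

Running it with and without QUPTIME_DIR set is a quick way to see which branch a given shell or systemd unit would take.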
node.yaml — local identity
Never replicated. One file per host. Generated by qu init.
node_id: 7f3a5b9e-... # UUIDv4, immutable after init
bind_addr: 0.0.0.0 # listen address for :9901
bind_port: 9901 # listen port
advertise: alpha.example.com:9901 # how peers reach us; may differ from bind
cluster_secret: 4hZqK8vT9... # base64; required to Join, never replicated
Field reference
- node_id: UUIDv4 generated at qu init. Used by every peer to refer to this node across IP changes and restarts. Do not edit.
- bind_addr: Address the daemon listens on. 0.0.0.0 is the default. Set to 127.0.0.1 if you only want to expose the daemon through an overlay (Tailscale, WireGuard); see deployment/tailscale.md.
- bind_port: Defaults to 9901. Change here if 9901 is taken; the cluster does not require port-uniformity, peers just need to know what to dial via the advertise field.
- advertise: Host:port other nodes use to reach this one. Must be routable from every peer. Falls back to bind_addr:bind_port if unset, which is rarely what you want behind NAT.
- cluster_secret: Pre-shared base64 string. Required on every Join RPC; constant-time comparison on the receiver. Generate on the first node, distribute out-of-band, keep out of version control.
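Two of these rules are easy to get subtly wrong, so here is a minimal Go sketch of them: the advertise fallback and the constant-time secret check. The function names are hypothetical and this is not the daemon's source; it only uses the standard library's `net.JoinHostPort` and `crypto/subtle.ConstantTimeCompare`.

```go
package main

import (
	"crypto/subtle"
	"fmt"
	"net"
	"strconv"
)

// effectiveAdvertise returns the address peers should dial. When advertise is
// unset it falls back to bind_addr:bind_port, as the field reference describes.
func effectiveAdvertise(advertise, bindAddr string, bindPort int) string {
	if advertise != "" {
		return advertise
	}
	return net.JoinHostPort(bindAddr, strconv.Itoa(bindPort))
}

// secretsMatch compares two secrets without leaking, through timing, how many
// leading bytes matched. ConstantTimeCompare returns immediately when the
// lengths differ, so only the length is observable.
func secretsMatch(presented, expected string) bool {
	return subtle.ConstantTimeCompare([]byte(presented), []byte(expected)) == 1
}

func main() {
	// With the defaults and no advertise value, peers would be told to dial
	// 0.0.0.0:9901, which is why the fallback is rarely useful.
	fmt.Println(effectiveAdvertise("", "0.0.0.0", 9901))
	fmt.Println(effectiveAdvertise("alpha.example.com:9901", "0.0.0.0", 9901))
	fmt.Println(secretsMatch("4hZqK8vT9", "4hZqK8vT9"))
}
```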
How qu init populates this file
qu init \
--advertise alpha.example.com:9901 \
--bind 0.0.0.0 \
--port 9901 \
--secret '<paste from first node, or omit on the first node>'
Idempotent in one direction only: if node.yaml exists, qu init
refuses to overwrite. To re-init, delete the data directory entirely.
cluster.yaml — replicated state
This is the file that every node converges on. The master is the only
one allowed to bump version; followers Replace it whole each time
they receive a higher-versioned snapshot.
version: 12
updated_at: 2026-05-15T14:01:00Z
updated_by: 7f3a5b9e-...
peers:
  - node_id: 7f3a5b9e-...
    advertise: alpha.example.com:9901
    fingerprint: SHA256:abcd...
    cert_pem: |
      -----BEGIN CERTIFICATE-----
      ...
      -----END CERTIFICATE-----
checks:
  - id: 0006a1...
    name: homepage
    type: http
    target: https://example.com
    interval: 30s
    timeout: 10s
    expect_status: 200
    alert_ids: [oncall]
    suppress_alert_ids: []
alerts:
  - id: f001ab...
    name: oncall
    type: discord
    default: true
    discord_webhook: https://discord.com/api/webhooks/...
    body_template: |
      :rotating_light: {{.Check.Name}} is {{.Verb}}
Top-level fields
| Field | Owner | Notes |
|---|---|---|
| version | master | Monotonic. Followers reject snapshots whose version is ≤ their local. |
| updated_at | master | UTC RFC3339. Cosmetic: humans use it, no logic depends on it. |
| updated_by | master | NodeID of the committing master. |
| peers | editable | Cluster members. Edits go through add_peer / remove_peer mutations. |
| checks | editable | Monitored targets. |
| alerts | editable | Notifier destinations. |
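The version rule in the table boils down to a strict greater-than comparison. A minimal Go sketch, with a hypothetical `snapshot` type and `shouldReplace` helper that are not taken from the source:

```go
package main

import "fmt"

// snapshot holds only the fields this sketch needs from cluster.yaml.
type snapshot struct {
	Version   uint64
	UpdatedBy string
}

// shouldReplace reports whether a follower should swap its local cluster.yaml
// for the incoming one. Equal or lower versions are rejected.
func shouldReplace(local, incoming snapshot) bool {
	return incoming.Version > local.Version
}

func main() {
	local := snapshot{Version: 12}
	for _, v := range []uint64{11, 12, 13} {
		fmt.Printf("incoming v%d accepted: %v\n", v, shouldReplace(local, snapshot{Version: v}))
	}
}
```

Because updated_at is cosmetic, the sketch deliberately ignores it: only version decides.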
peers[]
- node_id: 7f3a5b9e-...       # immutable, the peer's own UUID
  advertise: host:port        # how anyone dials this peer
  fingerprint: SHA256:...     # SPKI fingerprint of the peer's cert
  cert_pem: |                 # full PEM so other peers can mTLS without a separate invite
    -----BEGIN CERTIFICATE-----
    ...
The cert_pem field is what enables N-node clusters without N×(N-1)
manual invites: when peer X is added via the master, every other node
that receives the new cluster.yaml learns X's cert at the same time
and adds it to the local trust store. See
internal/daemon/daemon.go:syncTrustFromCluster.
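The idea behind that sync step can be illustrated with a small Go sketch. It is not the real syncTrustFromCluster: the types, the `syncTrust` name, and the choice to key the store by fingerprint are assumptions made for the example, and it skips the check that the fingerprint actually matches the certificate, which a real implementation would presumably perform.

```go
package main

import "fmt"

// peer mirrors one entry of cluster.yaml peers[] (fields from this page).
type peer struct {
	NodeID, Advertise, Fingerprint, CertPEM string
}

// trustEntry mirrors one entry of trust.yaml.
type trustEntry struct {
	NodeID, Address, Fingerprint, CertPEM string
}

// syncTrust adds a trust entry for every peer whose fingerprint is not yet in
// the store and returns how many entries were added. Existing entries are
// left untouched.
func syncTrust(trust map[string]trustEntry, peers []peer) int {
	added := 0
	for _, p := range peers {
		if p.CertPEM == "" || p.Fingerprint == "" {
			continue // nothing to trust yet
		}
		if _, known := trust[p.Fingerprint]; known {
			continue
		}
		trust[p.Fingerprint] = trustEntry{
			NodeID:      p.NodeID,
			Address:     p.Advertise,
			Fingerprint: p.Fingerprint,
			CertPEM:     p.CertPEM,
		}
		added++
	}
	return added
}

func main() {
	trust := map[string]trustEntry{}
	peers := []peer{
		{NodeID: "node-a", Advertise: "alpha.example.com:9901", Fingerprint: "SHA256:aaaa", CertPEM: "<pem>"},
		{NodeID: "node-b", Advertise: "beta.example.com:9901", Fingerprint: "SHA256:bbbb", CertPEM: "<pem>"},
	}
	fmt.Println("added:", syncTrust(trust, peers))
	fmt.Println("added on second pass:", syncTrust(trust, peers))
}
```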
checks[]
- id: 0006a1...              # UUIDv4, generated when the check is created
  name: homepage             # human-friendly, must be unique within cluster
  type: http                 # http | tcp | icmp
  target: https://example.com
  interval: 30s              # Go duration syntax: 5s, 1m30s, 2h
  timeout: 10s               # default 10s
  expect_status: 200         # http only; 0 = accept anything < 400
  body_match: "OK"           # http only; substring match on response body
  alert_ids: [oncall]        # alerts attached explicitly
  suppress_alert_ids: []     # opt out of specific default alerts
Defaults:
- interval: 30s
- timeout: 10s
- expect_status: 0 → any 2xx is OK; otherwise the configured status must match exactly.
ICMP checks default to unprivileged UDP-mode pings so the daemon does not need root. For raw ICMP, grant the capability — see deployment/systemd.md.
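For a feel of how the duration fields and body_match behave, here is a short Go sketch. `time.ParseDuration` is the standard-library parser for the syntax shown above; the helper names are hypothetical, and treating an empty body_match as "no constraint" is an assumption.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// Defaults as documented on this page.
const (
	defaultInterval = 30 * time.Second
	defaultTimeout  = 10 * time.Second
)

// durationOrDefault parses a Go duration string such as "5s", "1m30s" or "2h",
// returning def when the field was left empty.
func durationOrDefault(raw string, def time.Duration) (time.Duration, error) {
	if raw == "" {
		return def, nil
	}
	return time.ParseDuration(raw)
}

// bodyMatches implements a plain substring match on the response body.
// Assumption: an empty pattern means the body is not checked at all.
func bodyMatches(body, pattern string) bool {
	return pattern == "" || strings.Contains(body, pattern)
}

func main() {
	for _, raw := range []string{"", "5s", "1m30s", "2h", "30"} {
		d, err := durationOrDefault(raw, defaultInterval)
		fmt.Printf("%q -> %v (err: %v)\n", raw, d, err)
	}
	fmt.Println(bodyMatches("status: OK", "OK"))
}
```

Note that a bare number such as "30" is rejected by `time.ParseDuration` because it lacks a unit.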
alerts[]
Two notifier kinds, distinguished by type:
# Discord
- id: f001ab...
  name: oncall
  type: discord
  default: true              # attach to every check automatically
  discord_webhook: https://...
  body_template: |           # optional Go text/template override
    {{.Check.Name}} is {{.Verb}}
# SMTP
- id: f002cd...
  name: ops
  type: smtp
  smtp_host: smtp.example.com
  smtp_port: 587
  smtp_user: mailbot
  smtp_password: '...'
  smtp_from: monitor@example.com
  smtp_to: [ops@example.com]
  smtp_starttls: true
  subject_template: '[{{.Verb}}] {{.Check.Name}}'
  body_template: |
    Check {{.Check.Name}} ({{.Check.Target}}) is now {{.Verb}}.
If default: true, the alert fires for every check unless the check
lists the alert's ID or name in suppress_alert_ids. Otherwise the
alert only fires for checks that name it in alert_ids.
Templates are Go text/template. The full variable list is in the
top-level README under "Custom alert messages" — qu alert add smtp --help and qu alert add discord --help print the same table.
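To try a template before committing it to cluster.yaml, something like the following Go sketch can help. Only the three variables used in the examples on this page (.Check.Name, .Check.Target, .Verb) are modelled; the struct types and the sample value "down" for .Verb are assumptions, and the README remains the authority on the full variable list.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// checkInfo and alertData are hypothetical stand-ins for the data the
// dispatcher passes to templates.
type checkInfo struct {
	Name   string
	Target string
}

type alertData struct {
	Check checkInfo
	Verb  string
}

// render parses and executes one template string against the sample data.
func render(tmpl string, data alertData) (string, error) {
	t, err := template.New("alert").Parse(tmpl)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	data := alertData{
		Check: checkInfo{Name: "homepage", Target: "https://example.com"},
		Verb:  "down", // sample value; the real set of verbs is not listed here
	}
	for _, tmpl := range []string{
		"[{{.Verb}}] {{.Check.Name}}",
		"Check {{.Check.Name}} ({{.Check.Target}}) is now {{.Verb}}.",
	} {
		out, err := render(tmpl, data)
		fmt.Println(out, err)
	}
}
```

A typo in a field name surfaces as an error from `Execute`, which is cheaper to discover here than in a missed alert.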
Suppression precedence
For each check, the dispatcher computes the effective alert list as:
( explicit alert_ids ∪ alerts with default=true ) \ suppress_alert_ids
de-duplicated by alert ID. So a check can both opt in to specific alerts and opt out of specific defaults.
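The set expression above can be sketched in Go as below. The types and the `effectiveAlerts` name are hypothetical, and the sketch assumes that alert_ids, like suppress_alert_ids, may reference an alert by either ID or name.

```go
package main

import "fmt"

// alert and check carry only the fields the precedence rule needs.
type alert struct {
	ID, Name string
	Default  bool
}

type check struct {
	AlertIDs         []string
	SuppressAlertIDs []string
}

// refers reports whether list mentions the alert by ID or by name.
func refers(list []string, a alert) bool {
	for _, ref := range list {
		if ref == a.ID || ref == a.Name {
			return true
		}
	}
	return false
}

// effectiveAlerts computes (explicit ∪ defaults) \ suppressed and returns
// alert IDs. Walking the alert list once means each ID appears at most once,
// provided IDs are unique in cluster.yaml.
func effectiveAlerts(c check, alerts []alert) []string {
	var out []string
	for _, a := range alerts {
		selected := a.Default || refers(c.AlertIDs, a)
		if selected && !refers(c.SuppressAlertIDs, a) {
			out = append(out, a.ID)
		}
	}
	return out
}

func main() {
	alerts := []alert{
		{ID: "f001ab", Name: "oncall", Default: true},
		{ID: "f002cd", Name: "ops"},
		{ID: "f003ef", Name: "pager", Default: true},
	}
	c := check{AlertIDs: []string{"ops", "oncall"}, SuppressAlertIDs: []string{"pager"}}
	fmt.Println(effectiveAlerts(c, alerts))
}
```

In the example, oncall is both explicit and default but is emitted once, ops is opted in, and pager is a default that the check opts out of.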
trust.yaml — local trust store
A flat list of fingerprints this node accepts. One entry per peer,
populated by qu node add (or pulled in automatically when a peer's
cert arrives via the replicated cluster.yaml).
entries:
  - node_id: 7f3a5b9e-...
    address: alpha.example.com:9901
    fingerprint: SHA256:...
    cert_pem: |
      -----BEGIN CERTIFICATE-----
      ...
Never edit this by hand. Use qu trust list and qu trust remove.
Key material
keys/private.pem is the only secret on disk besides
node.yaml.cluster_secret. It's chmod 0600 by default; preserve that.
The public cert at keys/cert.pem is what gets fingerprinted and
shipped in cluster.yaml.peers[].cert_pem.
There is no automatic key rotation. Rolling a node's identity
means wiping its data directory, running qu init again, and
re-adding it from another node as a fresh peer.
Tunables that don't live in YAML
A few values are compiled constants. Change them in source and rebuild if you need different behaviour.
| Constant | Default | What it does |
|---|---|---|
| quorum.DefaultHeartbeatInterval | 1s | How often each node heartbeats every peer. |
| quorum.DefaultDeadAfter | 4s | A peer is dead if no heartbeat is seen within this window. |
| checks.HysteresisCount | 2 | Consecutive aggregate evaluations needed before a state flip. |
| checks.ReconcileInterval | 5s | How often the scheduler reconciles its workers vs checks[]. |
| daemon.manualEditPollInterval (internal/daemon/watcher.go) | 2s | How often the daemon hashes cluster.yaml for hand edits. |
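To make the interaction of DefaultDeadAfter and HysteresisCount concrete, here is a rough Go sketch. The constants reuse the documented defaults, but the types and function names are hypothetical, and the exact boundary behaviour (for example whether a heartbeat exactly 4s old counts as dead) is an assumption.

```go
package main

import (
	"fmt"
	"time"
)

// Values copied from the table above.
const (
	deadAfter       = 4 * time.Second
	hysteresisCount = 2
)

// isDead reports whether a peer should be considered dead, given how long ago
// its last heartbeat was seen.
func isDead(sinceLastHeartbeat time.Duration) bool {
	return sinceLastHeartbeat > deadAfter
}

// hysteresis flips its reported state only after hysteresisCount consecutive
// evaluations disagree with the current state.
type hysteresis struct {
	up     bool
	streak int // consecutive evaluations that disagree with up
}

// observe feeds one aggregate evaluation and returns the reported state.
func (h *hysteresis) observe(up bool) bool {
	if up == h.up {
		h.streak = 0
		return h.up
	}
	h.streak++
	if h.streak >= hysteresisCount {
		h.up = up
		h.streak = 0
	}
	return h.up
}

func main() {
	fmt.Println("dead after 3s of silence:", isDead(3*time.Second))
	fmt.Println("dead after 5s of silence:", isDead(5*time.Second))

	h := &hysteresis{up: true}
	for _, eval := range []bool{false, true, false, false} {
		fmt.Printf("evaluation up=%v -> reported up=%v\n", eval, h.observe(eval))
	}
}
```

With a 1s heartbeat and a 4s window, a peer has to miss several heartbeats in a row before it is declared dead, and a single failed evaluation never flips a check on its own.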