# Configuration

This page is the canonical reference for the on-disk files, the environment variables, and every field that `qu` reads. It's deliberately tedious: when something doesn't behave the way you expect, this is where the answer lives.

## File layout

When running as **root** (the typical case under systemd):

```
/etc/quptime/
├── node.yaml          identity, never replicated
├── cluster.yaml       replicated state
├── trust.yaml         local fingerprint trust store
└── keys/
    ├── private.pem    RSA private key (0600)
    ├── public.pem     RSA public key
    └── cert.pem       self-signed X.509 cert

/var/run/quptime/quptime.sock   control socket (0600)
```

When running as a **non-root** user (the typical case for `go run` or a desktop test):

```
~/.config/quptime/...                    same shape as /etc/quptime
$XDG_RUNTIME_DIR/quptime/quptime.sock    control socket
```

Override the data directory with `QUPTIME_DIR=/some/path qu serve`. Override the socket path with `QUPTIME_SOCKET=/run/foo.sock`.

## Environment variables

### Paths

| Variable          | Purpose |
| ----------------- | ------- |
| `QUPTIME_DIR`     | Data directory. Defaults to `/etc/quptime` (root) or `$XDG_CONFIG_HOME/quptime`. |
| `QUPTIME_SOCKET`  | Path to the CLI ↔ daemon unix socket. Defaults to `/var/run/quptime/quptime.sock` (root) or `$XDG_RUNTIME_DIR/quptime/…`. |
| `XDG_CONFIG_HOME` | Honored when running as non-root and `QUPTIME_DIR` is unset. |
| `XDG_RUNTIME_DIR` | Honored when running as non-root and `QUPTIME_SOCKET` is unset. |

### `node.yaml` field overrides

Every field in `node.yaml` can also be supplied via an environment variable. This is the recommended way to drive Docker / Compose deployments: drop the env vars into the compose file and the daemon will bootstrap on first start without a separate `qu init` step.

| Variable                 | `node.yaml` field | Notes |
| ------------------------ | ----------------- | ----- |
| `QUPTIME_NODE_ID`        | `node_id`         | Pin a specific UUID. Leave unset to let `qu init` / auto-init generate one. |
| `QUPTIME_BIND_ADDR`      | `bind_addr`       | Defaults to `0.0.0.0`. |
| `QUPTIME_BIND_PORT`      | `bind_port`       | Integer. Defaults to `9901`. |
| `QUPTIME_ADVERTISE`      | `advertise`       | `host:port` other peers use to reach this node. Required when bound to a wildcard or behind NAT. |
| `QUPTIME_CLUSTER_SECRET` | `cluster_secret`  | Pre-shared join secret. Set the same value on every node. If unset on the very first node, one is generated. |

Precedence is **env > file > compiled default**. Non-empty env values win over whatever is stored in `node.yaml` at load time, so changing a variable in `docker-compose.yml` and restarting the container is enough to roll out new bind/advertise values; no on-disk edit is required. Empty env values are ignored (they will not clear a previously persisted field).

For `qu init` specifically, explicit command-line flags take precedence over env values; env values fill in only the fields the operator did not pass on the command line.

The daemon does not read any other environment variables. SMTP, Discord, and HTTP probe targets are configured exclusively in `cluster.yaml`.

## Auto-init on `qu serve`

If `node.yaml` does not exist when `qu serve` starts, the daemon bootstraps it in-place using the `QUPTIME_*` env vars above: a fresh UUID is generated (or `QUPTIME_NODE_ID` is honored if set), an RSA keypair and self-signed cert are written under `keys/`, and `cluster.yaml` is seeded with this node as its sole peer. If no `QUPTIME_CLUSTER_SECRET` was provided, a random one is generated and printed to stderr; copy it to every follower node's `QUPTIME_CLUSTER_SECRET` (or `--secret` flag) before they start.
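
Both auto-init and a normal load rely on the **env > file > compiled default** precedence described above. The following is a minimal sketch of that rule for a single string field, not the daemon's actual code: the `resolve` helper and the persisted value `10.0.0.5` are hypothetical and exist only for illustration.

```go
package main

import (
	"fmt"
	"os"
)

// resolve is a hypothetical helper that applies the documented order for one
// field: a non-empty env value wins, then the value persisted in node.yaml,
// then the compiled default. An empty env value is ignored, so it never
// clears a persisted field.
func resolve(envValue, fileValue, compiledDefault string) string {
	if envValue != "" {
		return envValue
	}
	if fileValue != "" {
		return fileValue
	}
	return compiledDefault
}

func main() {
	// Assumed persisted value; a real daemon would read it from node.yaml.
	persistedBindAddr := "10.0.0.5"

	// With QUPTIME_BIND_ADDR unset or empty, the persisted value is kept.
	fmt.Println(resolve(os.Getenv("QUPTIME_BIND_ADDR"), persistedBindAddr, "0.0.0.0"))
}
```

Treating the empty string as "not provided" is what lets a compose file omit a variable without wiping a value that an earlier start already wrote to disk.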

Auto-init is what makes the docker-compose flow `docker compose up`-only on a fresh volume. To opt out (e.g. so a misconfigured deployment crashes loudly instead of silently generating a new identity), run `qu init` against the volume yourself before letting `qu serve` ever see it.

## `node.yaml`: local identity

Never replicated. One file per host. Generated by `qu init`, or by auto-init on `qu serve`.

```yaml
node_id: 7f3a5b9e-...                 # UUIDv4, immutable after init
bind_addr: 0.0.0.0                    # listen address for :9901
bind_port: 9901                       # listen port
advertise: alpha.example.com:9901     # how peers reach us; may differ from bind
cluster_secret: 4hZqK8vT9...          # base64; required to Join, never replicated
```

### Field reference

- `node_id`: UUIDv4 generated at `qu init`. Used by every peer to refer to this node across IP changes and restarts. Do not edit.
- `bind_addr`: Address the daemon listens on. `0.0.0.0` is the default. Set to `127.0.0.1` if you only want to expose the daemon through an overlay (Tailscale, WireGuard); see [deployment/tailscale.md](deployment/tailscale.md).
- `bind_port`: Defaults to `9901`. Change it here if 9901 is taken. The cluster does not require port uniformity; peers only need to know what to dial, which they learn from the `advertise` field.
- `advertise`: Host:port other nodes use to reach this one. Must be routable from every peer. Falls back to `bind_addr:bind_port` if unset, which is rarely what you want behind NAT.
- `cluster_secret`: Pre-shared base64 string. Required on every `Join` RPC; the receiver compares it in constant time. Generate on the first node, distribute out-of-band, keep out of version control.

### How `qu init` populates this file

```sh
qu init \
  --advertise alpha.example.com:9901 \
  --bind 0.0.0.0 \
  --port 9901 \
  --secret ''
```

Idempotent in one direction only: if `node.yaml` exists, `qu init` refuses to overwrite it. To re-init, delete the data directory entirely.

## `cluster.yaml`: replicated state

This is the file that every node converges on.
The master is the only node allowed to bump `version`; followers `Replace` the file whole each time they receive a higher-versioned snapshot.

```yaml
version: 12
updated_at: 2026-05-15T14:01:00Z
updated_by: 7f3a5b9e-...
peers:
  - node_id: 7f3a5b9e-...
    advertise: alpha.example.com:9901
    fingerprint: SHA256:abcd...
    cert_pem: |
      -----BEGIN CERTIFICATE-----
      ...
      -----END CERTIFICATE-----
checks:
  - id: 0006a1...
    name: homepage
    type: http
    target: https://example.com
    interval: 30s
    timeout: 10s
    expect_status: 200
    alert_ids: [oncall]
    suppress_alert_ids: []
alerts:
  - id: f001ab...
    name: oncall
    type: discord
    default: true
    discord_webhook: https://discord.com/api/webhooks/...
    body_template: |
      :rotating_light: {{.Check.Name}} is {{.Verb}}
```

### Top-level fields

| Field        | Owner    | Notes |
| ------------ | -------- | ----- |
| `version`    | master   | Monotonic. Followers reject snapshots whose version is ≤ their local version. |
| `updated_at` | master   | UTC RFC3339. Cosmetic: humans use it, no logic depends on it. |
| `updated_by` | master   | NodeID of the committing master. |
| `peers`      | editable | Cluster members. Edits go through `add_peer` / `remove_peer` mutations. |
| `checks`     | editable | Monitored targets. |
| `alerts`     | editable | Notifier destinations. |

### `peers[]`

```yaml
- node_id: 7f3a5b9e-...        # immutable, the peer's own UUID
  advertise: host:port         # how anyone dials this peer
  fingerprint: SHA256:...      # SPKI fingerprint of the peer's cert
  cert_pem: |                  # full PEM so other peers can mTLS without a separate invite
    -----BEGIN CERTIFICATE-----
    ...
```

The `cert_pem` field is what enables N-node clusters without N×(N-1) manual invites: when peer X is added via the master, every other node that receives the new `cluster.yaml` learns X's cert at the same time and adds it to the local trust store. See `internal/daemon/daemon.go:syncTrustFromCluster`.

### `checks[]`

```yaml
- id: 0006a1...                # UUIDv4, generated when the check is created
  name: homepage               # human-friendly, must be unique within cluster
  type: http                   # http | tcp | icmp
  target: https://example.com
  interval: 30s                # Go duration syntax: 5s, 1m30s, 2h
  timeout: 10s                 # default 10s
  expect_status: 200           # http only; 0 = accept anything < 400
  body_match: "OK"             # http only; substring match on response body
  alert_ids: [oncall]          # alerts attached explicitly
  suppress_alert_ids: []       # opt out of specific default alerts
```

Defaults:

- `interval`: 30s
- `timeout`: 10s
- `expect_status`: 0 → any 2xx is OK; otherwise the configured status must match exactly.

ICMP checks default to **unprivileged UDP-mode pings** so the daemon does not need root. For raw ICMP, grant the capability; see [deployment/systemd.md](deployment/systemd.md).

### `alerts[]`

Two notifier kinds, distinguished by `type`:

```yaml
# Discord
- id: f001ab...
  name: oncall
  type: discord
  default: true                # attach to every check automatically
  discord_webhook: https://...
  body_template: |             # optional Go text/template override
    {{.Check.Name}} is {{.Verb}}

# SMTP
- id: f002cd...
  name: ops
  type: smtp
  smtp_host: smtp.example.com
  smtp_port: 587
  smtp_user: mailbot
  smtp_password: '...'
  smtp_from: monitor@example.com
  smtp_to: [ops@example.com]
  smtp_starttls: true
  subject_template: '[{{.Verb}}] {{.Check.Name}}'
  body_template: |
    Check {{.Check.Name}} ({{.Check.Target}}) is now {{.Verb}}.
```

If `default: true`, the alert fires for every check unless the check lists the alert's ID or name in `suppress_alert_ids`. Otherwise the alert only fires for checks that name it in `alert_ids`.

Templates are Go `text/template`. The full variable list is in the top-level README under "Custom alert messages"; `qu alert add smtp --help` and `qu alert add discord --help` print the same table.
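
To preview how a `body_template` or `subject_template` expands, a template can be rendered locally with Go's `text/template`. This is a sketch under assumptions: the `check` and `alertData` types are hypothetical stand-ins that carry only the three variables used in the examples above (`.Check.Name`, `.Check.Target`, `.Verb`), and `"down"` is an assumed value for `.Verb`. The real data passed by the dispatcher may contain more fields.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// check and alertData are hypothetical types that mirror only the template
// variables shown on this page.
type check struct {
	Name   string
	Target string
}

type alertData struct {
	Check check
	Verb  string
}

// render parses tmpl as a Go text/template and executes it against data.
func render(tmpl string, data alertData) (string, error) {
	t, err := template.New("alert").Parse(tmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, data); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	data := alertData{
		Check: check{Name: "homepage", Target: "https://example.com"},
		Verb:  "down", // assumed value, for illustration only
	}
	out, err := render("Check {{.Check.Name}} ({{.Check.Target}}) is now {{.Verb}}.", data)
	if err != nil {
		fmt.Println("template error:", err)
		return
	}
	fmt.Println(out)
}
```

Parsing a template this way before committing it to `cluster.yaml` catches syntax errors such as an unclosed `{{` early.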

### Suppression precedence

For each check, the dispatcher computes the effective alert list as:

```
( explicit alert_ids ∪ alerts with default=true ) \ suppress_alert_ids
```

de-duplicated by alert ID. So a check can both opt in to specific alerts and opt out of specific defaults.

## `trust.yaml`: local trust store

A flat list of fingerprints this node accepts. One entry per peer, populated by `qu node add` (or pulled in automatically when a peer's cert arrives via the replicated `cluster.yaml`).

```yaml
entries:
  - node_id: 7f3a5b9e-...
    address: alpha.example.com:9901
    fingerprint: SHA256:...
    cert_pem: |
      -----BEGIN CERTIFICATE-----
      ...
```

Never edit this by hand. Use `qu trust list` and `qu trust remove`.

## Key material

`keys/private.pem` is the only secret on disk besides `node.yaml.cluster_secret`. Its mode is 0600 by default; preserve that. The public cert at `keys/cert.pem` is what gets fingerprinted and shipped in `cluster.yaml.peers[].cert_pem`.

There is **no automatic key rotation**. Rolling a node's identity means wiping its data directory, running `qu init` again, and re-adding it from another node as a fresh peer.

## Tunables that don't live in YAML

A few values are compiled constants. Change them in source and rebuild if you need different behaviour.

| Constant                                                       | Default | What it does |
| -------------------------------------------------------------- | ------- | ------------ |
| `quorum.DefaultHeartbeatInterval`                              | `1s`    | How often each node heartbeats every peer. |
| `quorum.DefaultDeadAfter`                                      | `4s`    | A peer is dead if no heartbeat is seen within this window. |
| `checks.HysteresisCount`                                       | `2`     | Consecutive aggregate evaluations needed before a state flip. |
| `checks.ReconcileInterval`                                     | `5s`    | How often the scheduler reconciles its workers vs `checks[]`. |
| `daemon.manualEditPollInterval` (`internal/daemon/watcher.go`) | `2s`    | How often the daemon hashes `cluster.yaml` for hand edits. |
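
To make the effect of `checks.HysteresisCount` concrete, here is one plausible reading of "consecutive aggregate evaluations needed before a state flip", written as a small sketch. It is an interpretation, not the scheduler's actual code: the `tracker` type and `observe` method are hypothetical, and the local constant only mirrors the documented default of `2`.

```go
package main

import "fmt"

// hysteresisCount mirrors the documented default of checks.HysteresisCount.
const hysteresisCount = 2

// tracker is a hypothetical type holding the reported state and the number
// of consecutive evaluations that have disagreed with it.
type tracker struct {
	up      bool
	pending int
}

// observe feeds one aggregate evaluation into the tracker and reports
// whether the reported state flipped as a result.
func (t *tracker) observe(up bool) bool {
	if up == t.up {
		// The evaluation agrees with the reported state, so any streak resets.
		t.pending = 0
		return false
	}
	t.pending++
	if t.pending < hysteresisCount {
		return false
	}
	t.up = up
	t.pending = 0
	return true
}

func main() {
	tr := &tracker{up: true}
	// A single failing evaluation is absorbed; two in a row flip the state.
	for _, result := range []bool{false, true, false, false} {
		flipped := tr.observe(result)
		fmt.Printf("evaluation up=%v flipped=%v reported up=%v\n", result, flipped, tr.up)
	}
}
```

Under this reading, a lone failed evaluation between two healthy ones never triggers an alert, which is the usual reason for a hysteresis threshold.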