# Configuration

This page is the canonical reference for the on-disk files, the environment variables, and every field that `qu` reads. It's deliberately tedious — when something doesn't behave the way you expect, this is where the answer lives.

## File layout

When running as **root** (the typical case under systemd):

```
/etc/quptime/
├── node.yaml             identity, never replicated
├── cluster.yaml          replicated state
├── trust.yaml            local fingerprint trust store
└── keys/
    ├── private.pem       RSA private key (0600)
    ├── public.pem        RSA public key
    └── cert.pem          self-signed X.509 cert

/var/run/quptime/quptime.sock   control socket (0600)
```

When running as a **non-root** user (the typical case for `go run` or a desktop test):

```
~/.config/quptime/...                     same shape as /etc/quptime
$XDG_RUNTIME_DIR/quptime/quptime.sock     control socket
```

Override the data directory with `QUPTIME_DIR=/some/path qu serve`. Override the socket path with `QUPTIME_SOCKET=/run/foo.sock`.

## Environment variables

| Variable          | Purpose                                                                                                                   |
| ----------------- | ------------------------------------------------------------------------------------------------------------------------- |
| `QUPTIME_DIR`     | Data directory. Defaults to `/etc/quptime` (root) or `$XDG_CONFIG_HOME/quptime`.                                          |
| `QUPTIME_SOCKET`  | Path to the CLI ↔ daemon unix socket. Defaults to `/var/run/quptime/quptime.sock` (root) or `$XDG_RUNTIME_DIR/quptime/…`. |
| `XDG_CONFIG_HOME` | Honored when running as non-root and `QUPTIME_DIR` is unset.                                                              |
| `XDG_RUNTIME_DIR` | Honored when running as non-root and `QUPTIME_SOCKET` is unset.                                                           |

The daemon does not read any other environment variables. SMTP, Discord, and HTTP probe targets are configured exclusively in `cluster.yaml`.

## `node.yaml` — local identity

Never replicated. One file per host. Generated by `qu init`.

```yaml
node_id: 7f3a5b9e-...                 # UUIDv4, immutable after init
bind_addr: 0.0.0.0                    # listen address for :9901
bind_port: 9901                       # listen port
advertise: alpha.example.com:9901     # how peers reach us; may differ from bind
cluster_secret: 4hZqK8vT9...          # base64; required to Join, never replicated
```

### Field reference

- `node_id` — UUIDv4 generated at `qu init`. Used by every peer to refer to this node across IP changes and restarts. Do not edit.
- `bind_addr` — Address the daemon listens on. `0.0.0.0` is the default. Set to `127.0.0.1` if you only want to expose the daemon through an overlay (Tailscale, WireGuard) — see [deployment/tailscale.md](deployment/tailscale.md).
- `bind_port` — Defaults to `9901`. Change here if 9901 is taken; the cluster does not require port-uniformity, peers just need to know what to dial via the `advertise` field.
- `advertise` — Host:port other nodes use to reach this one. Must be routable from every peer. Falls back to `bind_addr:bind_port` if unset, which is rarely what you want behind NAT.
- `cluster_secret` — Pre-shared base64 string. Required on every `Join` RPC; constant-time comparison on the receiver. Generate on the first node, distribute out-of-band, keep out of version control.

### How `qu init` populates this file

```sh
qu init \
  --advertise alpha.example.com:9901 \
  --bind 0.0.0.0 \
  --port 9901 \
  --secret ''
```

Idempotent in one direction only: if `node.yaml` exists, `qu init` refuses to overwrite. To re-init, delete the data directory entirely.

## `cluster.yaml` — replicated state

This is the file that every node converges on. The master is the only one allowed to bump `version`; followers `Replace` it whole each time they receive a higher-versioned snapshot.

```yaml
version: 12
updated_at: 2026-05-15T14:01:00Z
updated_by: 7f3a5b9e-...
peers:
  - node_id: 7f3a5b9e-...
    advertise: alpha.example.com:9901
    fingerprint: SHA256:abcd...
    cert_pem: |
      -----BEGIN CERTIFICATE-----
      ...
      -----END CERTIFICATE-----
checks:
  - id: 0006a1...
    name: homepage
    type: http
    target: https://example.com
    interval: 30s
    timeout: 10s
    expect_status: 200
    alert_ids: [oncall]
    suppress_alert_ids: []
alerts:
  - id: f001ab...
    name: oncall
    type: discord
    default: true
    discord_webhook: https://discord.com/api/webhooks/...
    body_template: |
      :rotating_light: {{.Check.Name}} is {{.Verb}}
```

### Top-level fields

| Field        | Owner    | Notes                                                                   |
| ------------ | -------- | ----------------------------------------------------------------------- |
| `version`    | master   | Monotonic. Followers reject snapshots whose version is ≤ their local.   |
| `updated_at` | master   | UTC RFC3339. Cosmetic — humans use it, no logic depends on it.          |
| `updated_by` | master   | NodeID of the committing master.                                        |
| `peers`      | editable | Cluster members. Edits go through `add_peer` / `remove_peer` mutations. |
| `checks`     | editable | Monitored targets.                                                      |
| `alerts`     | editable | Notifier destinations.                                                  |

### `peers[]`

```yaml
- node_id: 7f3a5b9e-...        # immutable, the peer's own UUID
  advertise: host:port         # how anyone dials this peer
  fingerprint: SHA256:...      # SPKI fingerprint of the peer's cert
  cert_pem: |                  # full PEM so other peers can mTLS without a separate invite
    -----BEGIN CERTIFICATE-----
    ...
```

The `cert_pem` field is what enables N-node clusters without N×(N-1) manual invites: when peer X is added via the master, every other node that receives the new `cluster.yaml` learns X's cert at the same time and adds it to the local trust store. See `internal/daemon/daemon.go:syncTrustFromCluster`.

### `checks[]`

```yaml
- id: 0006a1...                # UUIDv4, generated when the check is created
  name: homepage               # human-friendly, must be unique within cluster
  type: http                   # http | tcp | icmp
  target: https://example.com
  interval: 30s                # Go duration syntax: 5s, 1m30s, 2h
  timeout: 10s                 # default 10s
  expect_status: 200           # http only; 0 = accept anything < 400
  body_match: "OK"             # http only; substring match on response body
  alert_ids: [oncall]          # alerts attached explicitly
  suppress_alert_ids: []       # opt out of specific default alerts
```

Defaults:

- `interval`: 30s
- `timeout`: 10s
- `expect_status`: 0 → any 2xx is OK; otherwise the configured status must match exactly.

ICMP checks default to **unprivileged UDP-mode pings** so the daemon does not need root. For raw ICMP, grant the capability — see [deployment/systemd.md](deployment/systemd.md).

### `alerts[]`

Two notifier kinds, distinguished by `type`:

```yaml
# Discord
- id: f001ab...
  name: oncall
  type: discord
  default: true                # attach to every check automatically
  discord_webhook: https://...
  body_template: |             # optional Go text/template override
    {{.Check.Name}} is {{.Verb}}

# SMTP
- id: f002cd...
  name: ops
  type: smtp
  smtp_host: smtp.example.com
  smtp_port: 587
  smtp_user: mailbot
  smtp_password: '...'
  smtp_from: monitor@example.com
  smtp_to: [ops@example.com]
  smtp_starttls: true
  subject_template: '[{{.Verb}}] {{.Check.Name}}'
  body_template: |
    Check {{.Check.Name}} ({{.Check.Target}}) is now {{.Verb}}.
```

If `default: true`, the alert fires for every check unless the check lists the alert's ID or name in `suppress_alert_ids`. Otherwise the alert only fires for checks that name it in `alert_ids`.

Templates are Go `text/template`. The full variable list is in the top-level README under "Custom alert messages" — `qu alert add smtp --help` and `qu alert add discord --help` print the same table.
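The selection rule and the template rendering described above can be sketched together in a few lines of Go. This is a minimal illustration, not the daemon's code: the `Check`, `Alert`, and `TemplateData` types, the helper names, the shortened IDs, and the `"down"` value for `Verb` are all assumptions made for the example. Only `.Check.Name`, `.Check.Target`, and `.Verb` are taken from the templates shown on this page, and matching `alert_ids` by ID or name is assumed from the `alert_ids: [oncall]` example.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// Hypothetical types for illustration; the fields mirror the YAML keys above.
type Check struct {
	Name, Target     string
	AlertIDs         []string
	SuppressAlertIDs []string
}

type Alert struct {
	ID, Name     string
	Default      bool
	BodyTemplate string
}

// TemplateData carries only the variables used in this page's examples.
type TemplateData struct {
	Check Check
	Verb  string
}

// refers reports whether list names the alert by ID or by name.
func refers(list []string, a Alert) bool {
	for _, s := range list {
		if s == a.ID || s == a.Name {
			return true
		}
	}
	return false
}

// effectiveAlerts keeps an alert if it is a default or explicitly attached,
// and the check does not suppress it. Walking the alert list once means
// each alert ID appears at most once in the result.
func effectiveAlerts(c Check, all []Alert) []Alert {
	var out []Alert
	for _, a := range all {
		if (a.Default || refers(c.AlertIDs, a)) && !refers(c.SuppressAlertIDs, a) {
			out = append(out, a)
		}
	}
	return out
}

// render executes a Go text/template string against the data.
func render(tmpl string, d TemplateData) (string, error) {
	t, err := template.New("body").Parse(tmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, d); err != nil {
		return "", err
	}
	return sb.String(), nil
}

func main() {
	alerts := []Alert{
		{ID: "f001ab", Name: "oncall", Default: true,
			BodyTemplate: "{{.Check.Name}} is {{.Verb}}"},
		{ID: "f002cd", Name: "ops",
			BodyTemplate: "Check {{.Check.Name}} ({{.Check.Target}}) is now {{.Verb}}."},
	}
	check := Check{
		Name:             "homepage",
		Target:           "https://example.com",
		AlertIDs:         []string{"ops"},    // opt in to a non-default alert
		SuppressAlertIDs: []string{"oncall"}, // opt out of a default alert
	}
	for _, a := range effectiveAlerts(check, alerts) {
		body, err := render(a.BodyTemplate, TemplateData{Check: check, Verb: "down"})
		if err != nil {
			fmt.Println("template error:", err)
			continue
		}
		fmt.Printf("%s: %s\n", a.Name, body)
	}
}
```

With the sample data, only `ops` survives: `oncall` is a default alert but the check suppresses it by name.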
### Suppression precedence

For each check, the dispatcher computes the effective alert list as:

```
( explicit alert_ids  ∪  alerts with default=true )  \  suppress_alert_ids
```

de-duplicated by alert ID. So a check can both opt in to specific alerts and opt out of specific defaults.

## `trust.yaml` — local trust store

A flat list of fingerprints this node accepts. One entry per peer, populated by `qu node add` (or pulled in automatically when a peer's cert arrives via the replicated `cluster.yaml`).

```yaml
entries:
  - node_id: 7f3a5b9e-...
    address: alpha.example.com:9901
    fingerprint: SHA256:...
    cert_pem: |
      -----BEGIN CERTIFICATE-----
      ...
```

Never edit this by hand. Use `qu trust list` and `qu trust remove`.

## Key material

`keys/private.pem` is the only secret on disk besides `node.yaml.cluster_secret`. It's chmod 0600 by default; preserve that. The public cert at `keys/cert.pem` is what gets fingerprinted and shipped in `cluster.yaml.peers[].cert_pem`.

There is **no automatic key rotation**. Rolling a node's identity means wiping its data directory, running `qu init` again, and re-adding it from another node as a fresh peer.

## Tunables that don't live in YAML

A few values are compiled constants. Change them in source and rebuild if you need different behaviour.

| Constant                                                       | Default | What it does                                                  |
| -------------------------------------------------------------- | ------- | ------------------------------------------------------------- |
| `quorum.DefaultHeartbeatInterval`                              | `1s`    | How often each node heartbeats every peer.                    |
| `quorum.DefaultDeadAfter`                                      | `4s`    | A peer is dead if no heartbeat is seen within this window.    |
| `checks.HysteresisCount`                                       | `2`     | Consecutive aggregate evaluations needed before a state flip. |
| `checks.ReconcileInterval`                                     | `5s`    | How often the scheduler reconciles its workers vs `checks[]`. |
| `daemon.manualEditPollInterval` (`internal/daemon/watcher.go`) | `2s`    | How often the daemon hashes `cluster.yaml` for hand edits.    |