Updated readme

This commit is contained in:
2026-05-14 00:31:02 +00:00
parent 1dc3ad1215
commit 6d7c0ce58b
+64 -2
View File
@@ -44,6 +44,12 @@ Master election is deterministic: among the live members of the quorum,
the node with the lexicographically smallest NodeID wins. No the node with the lexicographically smallest NodeID wins. No
negotiation, no split-brain window. negotiation, no split-brain window.
`cluster.yaml` is the single replicated source of truth (peers, checks,
alerts). Mutations from the CLI route through the master, which bumps a
monotonic version and broadcasts the result. The same file is also
watched on disk, so an operator can `sudoedit cluster.yaml` on any node
and the daemon will replicate the edit cluster-wide.
## Build ## Build
Requires Go 1.23 or newer. Requires Go 1.23 or newer.
@@ -150,6 +156,61 @@ Mutations always route to the master, which bumps a monotonic version
and pushes the new `cluster.yaml` to every peer. If quorum is lost, and pushes the new `cluster.yaml` to every peer. If quorum is lost,
mutating commands fail loudly. mutating commands fail loudly.
`qu status` shows the effective alert list for each check. Default
alerts are suffixed with `*` so you can tell at a glance which alerts
were attached automatically vs explicitly listed on the check:
```
CHECKS
ID NAME STATE OK/TOTAL ALERTS DETAIL
ddbd... homepage up 3/3 oncall,ops*
0006... db down 1/3 ops* dial timeout
24f4... gateway up 3/3 -
(alerts marked * are attached as defaults)
```
## Default alerts (attach to every check)
Rather than listing the same `--alerts` on every `check add`, mark an
alert as default and it fires for every check automatically:
```sh
# at creation
qu alert add discord oncall --webhook https://... --default
# or toggle later
qu alert default oncall on
qu alert default oncall off
```
`qu alert list` shows a DEFAULT column. A check can opt out of a
specific default by adding the alert's ID or name to its
`suppress_alert_ids` list in `cluster.yaml` (see "Edit cluster.yaml
directly" below).
## Edit cluster.yaml directly
Anything you can do through the CLI you can also do by editing
`$QUPTIME_DIR/cluster.yaml` on any node. The daemon polls the file every
few seconds; when it sees a hash that differs from what it last wrote,
it parses the YAML and forwards the change through the master, which
bumps the version and broadcasts the result everywhere — so a hand-edit
on `bravo` propagates to `alpha` and `charlie` automatically.
```sh
sudoedit /etc/quptime/cluster.yaml
# add `default: true` to an alert, or `suppress_alert_ids: [oncall]`
# on a check, then save and quit
```
You'll see a `manual-edit: cluster.yaml changed externally —
replicating via master` line in the daemon log when it picks the change
up. Invalid YAML is logged and ignored until you save a valid file.
The replicated fields are `peers`, `checks`, and `alerts`. `version`,
`updated_at`, and `updated_by` are server-controlled — the master
overwrites them on commit.
## Test an alert without waiting for a real outage ## Test an alert without waiting for a real outage
```sh ```sh
@@ -197,9 +258,10 @@ qu check add tcp <name> <host:port>
qu check add icmp <name> <host> qu check add icmp <name> <host>
qu check list qu check list
qu check remove <id-or-name> qu check remove <id-or-name>
qu alert add smtp <name> --host … --port … --from … --to … [--user --password --starttls] qu alert add smtp <name> --host … --port … --from … --to … [--user --password --starttls] [--default]
qu alert add discord <name> --webhook … qu alert add discord <name> --webhook … [--default]
qu alert list / remove / test <id-or-name> qu alert list / remove / test <id-or-name>
qu alert default <id-or-name> on|off toggle default attachment to every check
qu trust list / remove <node-id> qu trust list / remove <node-id>
``` ```