v0.0.1 release
+34
@@ -0,0 +1,34 @@
+# Build artifacts
+/qu
+/qu-*
+/dist/
+*.exe
+*.test
+*.out
+
+# Go workspace / module cache (only relevant if vendored)
+/vendor/
+
+# Local node state — never commit anything that looks like a data dir
+/quptime/
+/etc/quptime/
+node.yaml
+cluster.yaml
+trust.yaml
+keys/
+
+# Compose / secrets
+.env
+.env.local
+*.local.yml
+*.local.yaml
+
+# Editor / OS scratch
+*.swp
+*.swo
+*~
+.DS_Store
+
+# Test / coverage
+coverage.out
+coverage.html
@@ -0,0 +1,86 @@
+# Changelog
+
+All notable changes to this project are documented here. The format
+follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/) and
+this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
+
+## [v0.0.1] — 2026-05-15
+
+Initial public release.
+
+### Added
+
+- **Quorum-based uptime monitoring.** Multiple cooperating nodes run
+  the same probes (HTTP, TCP, ICMP) and vote on the cluster-wide
+  truth. A check flips state only after two consecutive aggregate
+  evaluations agree (hysteresis), so single-node flake doesn't page
+  anyone.
+- **Deterministic master election.** Among the live members of the
+  quorum the lexicographically smallest NodeID wins — no negotiation
+  step, no split-brain window.
+- **mTLS inter-node transport** with TLS 1.3 minimum, SSH-style
+  fingerprint pinning, and a pre-shared `cluster_secret` gating the
+  Join RPC.
+- **Replicated `cluster.yaml`** carrying peers, checks, and alerts.
+  Master is the only writer; followers receive monotonic-versioned
+  snapshots and converge on the latest. Hand-edits to the file on any
+  node are picked up by the manual-edit watcher and forwarded through
+  the master.
+- **HTTP, TCP, and ICMP probes** with configurable interval,
+  timeout, expected status, and optional body-substring match. ICMP
+  defaults to unprivileged UDP-mode pings so the daemon can run as a
+  non-root user.
+- **SMTP and Discord alerts** with optional Go `text/template`
+  subject/body overrides per alert, default-attach mode (`default:
+  true`), and per-check opt-outs via `suppress_alert_ids`.
+- **Docker-friendly env-var configuration.** Every field in
+  `node.yaml` can also be supplied via a `QUPTIME_*` environment
+  variable; `qu serve` auto-initialises a fresh data volume from
+  these on first start, so `docker compose up` is enough to launch a
+  node.
+- **Interactive TUI** (`qu tui`) for peers, checks, and alerts with
+  live refresh.
+- **Hardened systemd unit** shipped via `install.sh`: dedicated
+  `quptime` user, `ProtectSystem=strict`, all capabilities dropped by
+  default.
+- **Multi-arch Docker images** (`linux/amd64`, `linux/arm64`)
+  published to `git.cer.sh/axodouble/quptime`.
+- **Static Linux binaries** (`amd64`, `arm64`) published per tag with
+  a `SHA256SUMS` file; the official installer verifies the checksum
+  before placing the binary on disk.
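Review note on the first two bullets: the election rule and the hysteresis rule are both small enough to sketch in a few lines of shell. This is an illustration of the rules as the changelog words them, not the daemon's Go implementation. The function names, the sample NodeIDs, and the `up`/`down`/`unknown` state labels are assumptions made for the example (the changelog itself only names `StateUnknown`).

```sh
# elect_master ID...: print the lexicographically smallest NodeID.
# LC_ALL=C forces plain byte-order comparison, so every node that sees the
# same live set picks the same winner regardless of its locale settings.
elect_master() {
    [ "$#" -gt 0 ] || return 1
    printf '%s\n' "$@" | LC_ALL=C sort | head -n 1
}

# apply_hysteresis: read one aggregate evaluation per line on stdin and
# print the committed state after each one. The committed state only
# flips once two consecutive evaluations agree on the new value.
apply_hysteresis() (
    state="unknown"   # assumed starting label, mirroring StateUnknown
    pending=""        # candidate state seen once but not yet confirmed
    while IFS= read -r verdict; do
        if [ "$verdict" = "$state" ]; then
            pending=""                 # back to the committed state: drop the candidate
        elif [ "$verdict" = "$pending" ]; then
            state="$verdict"           # second agreeing evaluation: commit the flip
            pending=""
        else
            pending="$verdict"         # first sighting of a different state
        fi
        printf '%s\n' "$state"
    done
)
```

With the hypothetical IDs `c0d4 a1b2 b3c4`, `elect_master` prints `a1b2`. Feeding `up up down up down down` (one per line) to `apply_hysteresis` prints `unknown up up up up down`: the lone `down` in the middle never commits, which is the single-flake suppression the first bullet describes.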
+
+### Security
+
+- Cluster secret is compared in constant time
+  (`crypto/subtle.ConstantTimeCompare`).
+- Self-signed RSA certs minted at `qu init`; SPKI SHA-256
+  fingerprints are what's pinned, matching the canonical OpenSSL
+  representation.
+- Private keys are written with mode `0600`; data and runtime
+  directories with `0700`/`0750`.
+- All `cluster.yaml` writes go through an atomic `tmpfile + rename`.
+- `install.sh` downloads the published `SHA256SUMS` and refuses to
+  install if the downloaded binary doesn't match.
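Review note on the `tmpfile + rename` bullet: the pattern can be sketched in shell as follows. This only illustrates the general technique, under the assumption that the temp file lives in the same directory as the target so that `mv` is a same-filesystem rename. `atomic_write` is a hypothetical helper name, and the real daemon does this in Go, possibly with extra steps (such as an fsync) that this sketch omits.

```sh
# atomic_write PATH: replace PATH with the content of stdin.
# Readers see either the old file or the complete new one, never a
# partially written file, because the final step is a rename.
atomic_write() (
    target="$1"
    dir=$(dirname "$target")
    # mktemp creates the file readable and writable by the owner only (0600).
    tmp=$(mktemp "$dir/.tmp.XXXXXX") || exit 1
    # Clean up the temp file if anything fails before the rename.
    trap 'rm -f "$tmp"' EXIT
    cat > "$tmp" || exit 1
    mv -f "$tmp" "$target"
)
```

Usage would look like `printf 'version: 2\n' | atomic_write /path/to/cluster.yaml`, where the path and the YAML content are placeholders.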
+
+### Known limitations
+
+- **Cluster-wide secret distribution.** SMTP passwords and Discord
+  webhook URLs configured via `qu alert add …` are stored in
+  `cluster.yaml`, which is replicated to every node. Treat every node
+  as having read access to every alert credential. Restrict who can
+  reach the data directory accordingly. See
+  [docs/security.md](docs/security.md) for the threat model.
+- **No automatic key rotation.** Rolling a node's identity means
+  wiping its data directory, running `qu init` again, and re-adding
+  it from another node.
+- **No historical metrics.** Only the current aggregate state is kept
+  in memory. There is no built-in graph store, SLA calculator, or
+  audit log.
+- **Master-flap state.** Aggregator hysteresis state lives in
+  memory on the current master. When leadership changes the new
+  master starts from `StateUnknown` and re-accumulates hysteresis —
+  expect a few seconds of delayed alerting after a master switch.
+- **No release signing beyond SHA256SUMS** (no cosign / GPG).
+  Planned for a future release.
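Review note on the last bullet: since `SHA256SUMS` is the only integrity check for now, anyone installing a binary by hand may want to repeat the check the installer performs. A minimal sketch, assuming a `SHA256SUMS` file in the usual `sha256sum` output format sits next to the downloaded binary. `verify_against_sums` is a hypothetical helper, not something shipped with this release.

```sh
# verify_against_sums FILE SUMS: succeed only if FILE's SHA-256 digest
# matches the entry for its basename in the SUMS file.
verify_against_sums() (
    file="$1"
    sums="$2"
    name=$(basename "$file")
    # Match the filename field exactly; sha256sum may prefix it with '*'
    # when the digest was computed in binary mode.
    expected=$(awk -v n="$name" '{ f = $2; sub(/^\*/, "", f); if (f == n) { print $1; exit } }' "$sums")
    if [ -z "$expected" ]; then
        echo "no entry for $name in $sums" >&2
        exit 1
    fi
    actual=$(sha256sum "$file" | awk '{ print $1 }')
    [ "$expected" = "$actual" ]
)
```

A non-zero exit status means either the entry is missing or the digest differs, and in both cases the binary should not be installed. Note that a checksum only protects against corruption or a tampered download when `SHA256SUMS` itself comes from a trusted channel.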
+
+[v0.0.1]: https://git.cer.sh/axodouble/quptime/releases/tag/v0.0.1
@@ -88,7 +88,7 @@ go build -o qu ./cmd/qu
 To stamp the version into the binary:

 ```sh
-go build -ldflags "-X main.version=v0.1.0" -o qu ./cmd/qu
+go build -ldflags "-X main.version=v0.0.1" -o qu ./cmd/qu
 qu --version
 ```

@@ -100,7 +100,7 @@ amd64 and arm64, and publishes them as a Gitea release with a
 `SHA256SUMS` file alongside.

 ```sh
-git tag v0.1.0
+git tag v0.0.1
 git push --tags
 ```

@@ -166,6 +166,15 @@ c0d4... charlie.example.com:9901 true 2026-05-12T15:01:32Z

 ## Adding checks and alerts

+> ⚠️ **Alert credentials are replicated cluster-wide.** SMTP passwords
+> and Discord webhook URLs live in `cluster.yaml`, which is mirrored to
+> every node. Any node that can read its own data directory can read
+> every alert secret. Treat compromising one node as compromising every
+> alert credential, and restrict who can reach `$QUPTIME_DIR` on each
+> host (the hardened systemd unit and the Docker image both default to
+> `0700`/`0750`). See [docs/security.md](docs/security.md) for the full
+> threat model.
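Review note on the warning above: a quick way to confirm that a data directory is not reachable by other local users is to look at its "other" permission bits, which are empty for both `0700` and `0750`. The sketch below is an assumption-laden convenience, not part of `qu`: `dir_is_private` is a hypothetical helper, and it inspects only the classic mode bits, so ACLs or a symlinked path would need a separate check.

```sh
# dir_is_private DIR: succeed only if "other" has no permission bits on DIR.
# ls -ld prints a mode string such as drwxr-x---; characters 8-10 are the
# read/write/execute bits for users outside the owner and group.
dir_is_private() {
    [ "$(ls -ld "$1" | cut -c8-10)" = "---" ]
}
```

For example, `dir_is_private "${QUPTIME_DIR:-/etc/quptime}" || echo "data directory is accessible to other users" >&2` could run as a pre-start sanity check on each host.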
+
 ```sh
 # alerts first so checks can reference them
 qu alert add discord oncall --webhook https://discord.com/api/webhooks/...

@@ -9,8 +9,9 @@ daemon can bind privileged ports and open ICMP sockets; override with

 ```
 git.cer.sh/axodouble/quptime:master # tip of main, multi-arch
-git.cer.sh/axodouble/quptime:v0.1.0 # tagged release
-git.cer.sh/axodouble/quptime:v0.1.0-amd64 # single-arch (if you must pin)
+git.cer.sh/axodouble/quptime:latest # latest tagged release
+git.cer.sh/axodouble/quptime:v0.0.1 # specific tagged release
+git.cer.sh/axodouble/quptime:latest-amd64 # single-arch (if you must pin)
 ```

 The image embeds `QUPTIME_DIR=/etc/quptime` and declares it a volume —
@@ -24,7 +25,7 @@ For a development cluster or a single-node smoke test:
 # compose.yaml
 services:
   quptime:
-    image: git.cer.sh/axodouble/quptime:v0.1.0
+    image: git.cer.sh/axodouble/quptime:latest
     container_name: quptime
     restart: unless-stopped
     environment:
@@ -76,7 +77,7 @@ For local testing of the full quorum machinery without three machines:
 ```yaml
 # compose.yaml
 x-quptime: &quptime
-  image: git.cer.sh/axodouble/quptime:v0.1.0
+  image: git.cer.sh/axodouble/quptime:latest
   restart: unless-stopped
   cap_add:
     - NET_RAW
@@ -146,7 +147,7 @@ The natural unit is one compose file per host, each running one
 # /etc/qu-stack/compose.yaml
 services:
   quptime:
-    image: git.cer.sh/axodouble/quptime:v0.1.0
+    image: git.cer.sh/axodouble/quptime:latest
     container_name: quptime
     restart: unless-stopped
     environment:

@@ -51,7 +51,7 @@ services:
     restart: unless-stopped

   quptime:
-    image: git.cer.sh/axodouble/quptime:v0.1.0
+    image: git.cer.sh/axodouble/quptime:latest
     container_name: quptime
     environment:
       # host:port other QUptime nodes use to reach this one. Should be

@@ -79,7 +79,8 @@ registry on every tag and every push to `master`:

 ```
 git.cer.sh/axodouble/quptime:master # tip of main
-git.cer.sh/axodouble/quptime:v0.1.0 # tagged release
+git.cer.sh/axodouble/quptime:latest # latest tagged release
+git.cer.sh/axodouble/quptime:v0.0.1 # pinned release
 ```

 See the [Docker deployment guide](deployment/docker.md) for compose

+166
-29
@@ -1,21 +1,28 @@
 #!/bin/bash
+# QUptime installer.
+#
+# Downloads the latest released `qu` binary from the Gitea release
+# page, verifies it against the published SHA256SUMS, installs it to
+# /usr/local/bin, and (on systemd hosts) drops in a hardened
+# quptime.service that matches the unit documented in
+# docs/deployment/systemd.md. Idempotent — re-running upgrades the
+# binary and refreshes the unit without touching the data directory.
 set -euo pipefail

 INSTALL_BIN="/usr/local/bin/qu"
-SERVICE_FILE="/etc/systemd/system/qu-serve.service"
-SERVICE_USER="${SUDO_USER:-$(whoami)}"
-SERVICE_GROUP="$(id -gn "$SERVICE_USER" 2>/dev/null || echo root)"
+SERVICE_FILE="/etc/systemd/system/quptime.service"
+SERVICE_NAME="$(basename "$SERVICE_FILE")"
+SERVICE_USER="quptime"
+SERVICE_GROUP="quptime"
+DATA_DIR="/etc/quptime"
+REPO_API="https://git.cer.sh/api/v1/repos/axodouble/quptime/releases/latest"
+RELEASE_BASE="https://git.cer.sh/axodouble/quptime/releases/download"

 fail() {
     echo "Error: $*" >&2
     exit 1
 }

-echo_cmd() {
-    echo -e "\033[90m> $1\033[0m"
-    eval "$1"
-}
-
 require_command() {
     command -v "$1" >/dev/null 2>&1 || fail "$1 is not installed. Please install $1 and try again."
 }
@@ -31,52 +38,182 @@ write_completion() {
     return 1
 }

-require_command jq
 require_command curl
+require_command jq
+require_command sha256sum
+require_command install
+require_command mktemp
+
+# --- target architecture ------------------------------------------------
+case "$(uname -m)" in
+    x86_64) ARCH=amd64 ;;
+    aarch64|arm64) ARCH=arm64 ;;
+    *) fail "unsupported architecture: $(uname -m). Pre-built binaries are published for amd64 and arm64 only — build from source for other platforms." ;;
+esac

 if [ ! -w "$(dirname "$INSTALL_BIN")" ]; then
-    fail "You are not allowed to write to $(dirname "$INSTALL_BIN"). Run this script with sudo or install qu manually."
+    fail "Cannot write to $(dirname "$INSTALL_BIN"). Run this script with sudo, or set INSTALL_BIN to a writable location."
 fi

-RELEASE=$(curl -s https://git.cer.sh/api/v1/repos/axodouble/quptime/releases/latest | jq -r '.tag_name')
+# --- latest release tag -------------------------------------------------
+RELEASE=$(curl -fsSL "$REPO_API" | jq -r '.tag_name')
+[ -n "$RELEASE" ] && [ "$RELEASE" != "null" ] \
+    || fail "could not determine the latest release tag from $REPO_API"

-echo_cmd "curl -L -o '$INSTALL_BIN' 'https://git.cer.sh/axodouble/quptime/releases/download/${RELEASE}/qu-${RELEASE}-linux-amd64'"
-echo_cmd "chmod +x '$INSTALL_BIN'"
-echo "> qu has been installed to $INSTALL_BIN"
+BINARY_NAME="qu-${RELEASE}-linux-${ARCH}"
+BINARY_URL="${RELEASE_BASE}/${RELEASE}/${BINARY_NAME}"
+SUMS_URL="${RELEASE_BASE}/${RELEASE}/SHA256SUMS"
+
+# --- download + verify --------------------------------------------------
+# Stage in a temp dir so a failed verification never leaves a partial
+# or unverified binary on disk.
+TMPDIR=$(mktemp -d)
+trap 'rm -rf "$TMPDIR"' EXIT
+
+echo "> downloading $BINARY_NAME"
+curl -fsSL --proto '=https' --tlsv1.2 -o "$TMPDIR/$BINARY_NAME" "$BINARY_URL"
+echo "> downloading SHA256SUMS"
+curl -fsSL --proto '=https' --tlsv1.2 -o "$TMPDIR/SHA256SUMS" "$SUMS_URL"
+
+echo "> verifying checksum"
+# Pull just our binary's entry so sha256sum -c doesn't fail on the
+# arches we didn't download.
+(
+    cd "$TMPDIR"
+    if ! grep -E "[[:space:]]\\*?${BINARY_NAME}\$" SHA256SUMS > expected.sum; then
+        fail "no entry for $BINARY_NAME in published SHA256SUMS — refusing to install"
+    fi
+    if ! sha256sum -c expected.sum >/dev/null 2>&1; then
+        echo "expected: $(awk '{print $1}' expected.sum)"
+        echo "actual: $(sha256sum "$BINARY_NAME" | awk '{print $1}')"
+        fail "checksum mismatch for $BINARY_NAME — refusing to install"
+    fi
+)
+echo "> checksum OK"
+
+install -m 0755 "$TMPDIR/$BINARY_NAME" "$INSTALL_BIN"
+echo "> qu ${RELEASE} installed to $INSTALL_BIN"

+# --- shell completions --------------------------------------------------
 if "$INSTALL_BIN" --help 2>/dev/null | grep -q "completion"; then
     write_completion bash /usr/share/bash-completion/completions/qu \
-        || write_completion bash /etc/bash_completion.d/qu || true
+        || write_completion bash /etc/bash_completion.d/qu \
+        || true
     write_completion zsh /usr/share/zsh/site-functions/_qu || true
     write_completion fish /usr/share/fish/vendor_completions.d/qu.fish || true
 else
     echo "> qu does not expose completion support; skipping shell completion installation."
 fi

+# --- systemd unit -------------------------------------------------------
 if ! command -v systemctl >/dev/null 2>&1; then
-    echo "> Warning: systemd is not available on this system. qu serve will not be automatically started on boot."
-    echo "Installation complete, before starting qu serve, make sure to run qu init and read the documentation."
+    echo
+    echo "> systemd is not available on this system. Installation stops here."
+    echo "> Run \`qu serve\` manually (or wire it into the supervisor of your choice)."
     exit 0
 fi

-echo "> Creating systemd service file for qu serve..."
-cat > "$SERVICE_FILE" <<EOL
+# Dedicated service user. Hardened unit drops all capabilities and
+# locks the daemon down with ProtectSystem=strict, so it must run as
+# its own unprivileged account rather than the invoking sudo user.
+if ! id "$SERVICE_USER" >/dev/null 2>&1; then
+    echo "> creating system user $SERVICE_USER"
+    useradd --system --no-create-home --shell /usr/sbin/nologin "$SERVICE_USER"
+fi
+
+install -d -o "$SERVICE_USER" -g "$SERVICE_GROUP" -m 0750 "$DATA_DIR"
+
+echo "> writing $SERVICE_FILE"
+cat > "$SERVICE_FILE" <<'EOF'
 [Unit]
-Description=QUptime Serve
-After=network.target
+Description=QUptime distributed uptime monitor
+Documentation=https://git.cer.sh/axodouble/quptime
+Wants=network-online.target
+After=network-online.target

 [Service]
-ExecStart=$INSTALL_BIN serve
+Type=simple
+ExecStart=/usr/local/bin/qu serve
 Restart=always
-User=$SERVICE_USER
-Group=$SERVICE_GROUP
+RestartSec=5s
+
+User=quptime
+Group=quptime
+
+# Where state lives. RuntimeDirectory creates /var/run/quptime/ each
+# boot owned by User:Group with mode 0750.
+Environment=QUPTIME_DIR=/etc/quptime
+RuntimeDirectory=quptime
+RuntimeDirectoryMode=0750
+ReadWritePaths=/etc/quptime /var/run/quptime
+
+# Hardening. Comment out individual directives if a probe needs
+# something we've revoked.
+NoNewPrivileges=true
+ProtectSystem=strict
+ProtectHome=true
+PrivateTmp=true
+PrivateDevices=true
+ProtectKernelTunables=true
+ProtectKernelModules=true
+ProtectControlGroups=true
+ProtectClock=true
+ProtectHostname=true
+RestrictNamespaces=true
+RestrictRealtime=true
+RestrictSUIDSGID=true
+LockPersonality=true
+MemoryDenyWriteExecute=true
+
+# Network access is required (we're a network monitor). Keep address
+# families minimal — AF_NETLINK is needed for some libc lookups.
+RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6 AF_NETLINK
+
+# If you need raw ICMP, *also* uncomment:
+# AmbientCapabilities=CAP_NET_RAW
+# CapabilityBoundingSet=CAP_NET_RAW
+# Otherwise drop all capabilities:
+CapabilityBoundingSet=

 [Install]
 WantedBy=multi-user.target
-EOL
+EOF

-echo_cmd "systemctl daemon-reload"
-echo_cmd "systemctl enable $(basename "$SERVICE_FILE")"
-echo "> qu serve service has been created and enabled. You can start it with 'systemctl start $(basename "$SERVICE_FILE")'"
+systemctl daemon-reload
+systemctl enable "$SERVICE_NAME" >/dev/null
+echo "> ${SERVICE_NAME} installed and enabled (not yet started)"

-echo "Installation complete, before starting qu serve, make sure to run qu init and read the documentation."
+cat <<EOF
+
+Installation complete.
+
+Next steps:
+
+  1. Initialise the node identity. Either:
+
+     a) Let \`qu serve\` auto-init from environment variables.
+        Drop a systemd override like:
+
+          sudo systemctl edit ${SERVICE_NAME}
+          [Service]
+          Environment=QUPTIME_ADVERTISE=<this-host>:9901
+          # On follower nodes, also set the shared join secret:
+          # Environment=QUPTIME_CLUSTER_SECRET=<paste from first node>
+
+     b) Or run \`qu init\` once explicitly:
+
+          sudo -u ${SERVICE_USER} QUPTIME_DIR=${DATA_DIR} \\
+            qu init --advertise <this-host>:9901
+
+  2. Start the service:
+
+       sudo systemctl start ${SERVICE_NAME}
+       sudo -u ${SERVICE_USER} qu status
+
+  3. For ICMP checks, the daemon defaults to unprivileged UDP-mode
+     pings — those need the ping_group_range sysctl widened to include
+     the ${SERVICE_USER} GID, or grant CAP_NET_RAW in the unit. See
+     docs/deployment/systemd.md for the recipes.
+
+Full documentation: https://git.cer.sh/axodouble/quptime/src/branch/master/docs
+EOF
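Review note on step 3 of the closing message: on Linux, unprivileged ICMP sockets are only available to processes whose group ID falls inside the inclusive range held in the `net.ipv4.ping_group_range` sysctl, and the kernel default of `1 0` is an empty range. A small sketch of the membership check follows. `gid_in_ping_range` is a hypothetical helper for illustration and is not part of this installer; the recipes the script points to in docs/deployment/systemd.md remain the reference.

```sh
# gid_in_ping_range GID "LOW HIGH": succeed if GID lies inside the
# inclusive range, given in the same two-number form the kernel exposes
# in /proc/sys/net/ipv4/ping_group_range.
gid_in_ping_range() {
    gid="$1"
    # Deliberately unquoted: split "LOW HIGH" (space or tab separated)
    # into the positional parameters $1 and $2.
    set -- $2
    [ "$gid" -ge "$1" ] && [ "$gid" -le "$2" ]
}
```

On a host where the `quptime` user exists, `gid_in_ping_range "$(id -g quptime)" "$(cat /proc/sys/net/ipv4/ping_group_range)"` reports whether UDP-mode pings can work as configured. If it fails, either widen the sysctl or grant `CAP_NET_RAW` as the unit's comments describe.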