
One box, twenty services

Operating daily · 2025–2026

A personal AI platform on a single RTX 3090 server — built, instrumented, and operated solo with the discipline of a small infra team.

45 projects under management · ~24 scheduled jobs · 33 mirrored repos · 3-tier backups → Backblaze B2 · deadman switch, 30-min cadence
Architecture diagram: One box, twenty services

Most side projects die because nothing is watching them. I built the thing that watches.

tjserv is a single Ubuntu server (RTX 3090, 24 GB VRAM) running my entire AI ecosystem: an evaluation lab, a knowledge archive, segmentation and transcription services, LLM decision tools, and the voice/assistant stack — around 20 services at any time, with a ROCK 5B+ single-board computer running 13 more at the edge.

The interesting part isn't any one service. It's that the fleet stays alive.

The fabric

  • Project tracker as control plane. Every project declares itself with one TOML file (.tracker/manifest.toml); a hub service (FastAPI + PostgreSQL) discovers it, polls its health endpoint, maps its dependencies into a live force-directed graph, and renders a single dashboard that answers "what's the state of everything?" at a glance. Projects never call the tracker: they write to their own directory, and the tracker reads it. Zero coupling, so abandoned projects can't leave stale integrations behind.
  • Markdown is the work queue. Tasks live in version-controlled markdown, rendered live at /work, with a pre-computed daily LLM recommendation answering "what should I work on right now?" — I removed the kanban after markdown won in practice.
  • Operational safety nets, layered. Nightly restic backups to Backblaze B2 (system tier) plus a weekly local tier; nightly source + state archives with a cross-disk tarball mirror; a daily git auto-snapshot that commits dirty trees across all 33 mirrored repos; and a deadman switch that runs every 30 minutes and pings Discord if any daemon stops stamping. The alert is rate-limited, so it fires once per outage, not forever.
  • Health is a state machine, not a boolean. Services report green; stale health files degrade to amber after six hours; hard dependency edges (and only those) cascade failures red. Watchdogs that exit non-zero by design are first-class, not false alarms.
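
The health rules above can be sketched in a few lines. This is a minimal illustration, not the tracker's actual code: the Service fields, the own_state and resolve helpers, and the split into hard_deps and soft_deps are hypothetical names chosen for the example. Only the six-hour staleness window and the "hard edges cascade red" rule come from the description above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(hours=6)  # stale health files degrade to amber


@dataclass
class Service:
    name: str
    healthy: bool          # last self-reported health result
    stamped_at: datetime   # when the health file was last written
    hard_deps: list[str] = field(default_factory=list)  # failures cascade
    soft_deps: list[str] = field(default_factory=list)  # never cascade


def own_state(svc: Service, now: datetime) -> str:
    """State of a service judged only by its own health file."""
    if not svc.healthy:
        return "red"
    if now - svc.stamped_at > STALE_AFTER:
        return "amber"
    return "green"


def resolve(services: list[Service], now: datetime) -> dict[str, str]:
    """Propagate red along hard dependency edges until nothing changes."""
    states = {s.name: own_state(s, now) for s in services}
    changed = True
    while changed:  # fixpoint loop, so dependency cycles cannot recurse forever
        changed = False
        for s in services:
            if states[s.name] == "red":
                continue
            if any(states.get(dep) == "red" for dep in s.hard_deps):
                states[s.name] = "red"
                changed = True
    return states


now = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
fleet = [
    Service("ui", True, now, hard_deps=["api"]),
    Service("api", True, now, hard_deps=["db"]),
    Service("db", False, now),
    Service("notes", True, now - timedelta(hours=7), soft_deps=["db"]),
    Service("lab", True, now - timedelta(hours=1)),
]
states = resolve(fleet, now)
```

In this example a failed db turns api and ui red through hard edges, while notes only goes amber: its health file is stale, and its soft edge to db is ignored on purpose.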

Why it matters

Anyone can start twenty projects. The platform is what lets one person operate twenty — pause thirteen of them without killing their services, recover any of them from any of three backup layers, and know within thirty minutes when something silently dies.

Status & limits — running daily as my actual morning ritual; v1 trust phase in progress. Single-node by design: high availability isn't the goal, recoverability is. Quarterly restore drills are scheduled, and the first one already caught a gap.

One box, twenty services screenshot

Stack

Ubuntu · systemd · FastAPI · PostgreSQL · nginx · restic · Hermes agent platform · Discord alerting
