Gobind Preet Singh
Systems. Network. Cloud. Security.

Homelab Infrastructure

Self-hosted infrastructure stack with reverse proxying, monitoring, secrets management, encrypted backups, file storage, and Ansible-based deployment, built to document practical Linux and operations workflows.

Linux · Docker Compose · Ansible · Caddy · Uptime Kuma · Prometheus · Grafana · Vaultwarden · Seafile · Restic

Problem

The goal was to build a reproducible homelab that demonstrates reverse proxying, service monitoring, metrics collection, encrypted backups, and self-hosted operational tooling without relying on ad hoc manual setup.

Constraints

  • Built locally first on Windows 11 using WSL2 Ubuntu as the Linux control environment.
  • Deployment needed to be repeatable and not rely on manual container startup.
  • Secrets, runtime state, and persistent service data had to stay out of version control.
  • The architecture needed to remain expandable to multiple remote servers later.

Architecture

  • Ansible renders deployment templates into a separate runtime directory and launches the stack with Docker Compose (a minimal Compose sketch follows this list).
  • Caddy acts as the single browser-facing entry point for local service routing.
  • Prometheus scrapes node_exporter metrics and Grafana visualizes them in dashboards.
  • Uptime Kuma performs availability checks separately from metrics collection.
  • Restic performs encrypted backups of persistent runtime data with documented restore testing.
  • Seafile provides self-hosted file storage, while Vaultwarden provides self-hosted password management.
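
To make the wiring concrete, here is a minimal Docker Compose sketch of how these services could fit together. The service names, image tags, and the homelab network are illustrative stand-ins, not the project's actual stack definition; note that Caddy is the only service publishing host ports.

    # docker-compose.yml (illustrative sketch)
    services:
      caddy:
        image: caddy:2
        ports:
          - "80:80"
          - "443:443"
        volumes:
          - ./caddy/Caddyfile:/etc/caddy/Caddyfile:ro
        networks: [homelab]

      node_exporter:
        image: prom/node-exporter:latest
        networks: [homelab]

      prometheus:
        image: prom/prometheus:latest
        volumes:
          - ./prometheus/prometheus.yml:/etc/prometheus/prometheus.yml:ro
        networks: [homelab]

      grafana:
        image: grafana/grafana:latest
        networks: [homelab]

      uptime-kuma:
        image: louislam/uptime-kuma:1
        volumes:
          - ./data/uptime-kuma:/app/data
        networks: [homelab]

    networks:
      homelab: {}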

Key decisions

Separated source repo from runtime deployment path

Kept Git-tracked templates and documentation in the repo while rendering live deployment files into a separate runtime directory to avoid mixing source-controlled files with generated state and persistent data.
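A sketch of that render-then-deploy pattern as Ansible tasks, assuming a runtime_dir variable and illustrative template paths rather than the project's actual file names:

    # tasks/deploy.yml (illustrative): render Git-tracked templates
    # into the runtime directory, then launch the stack from there.
    - name: Ensure the runtime directory exists
      ansible.builtin.file:
        path: "{{ runtime_dir }}"   # generated state lives here, outside Git
        state: directory
        mode: "0755"

    - name: Render docker-compose.yml from the repo template
      ansible.builtin.template:
        src: templates/docker-compose.yml.j2
        dest: "{{ runtime_dir }}/docker-compose.yml"
        mode: "0644"

    - name: Launch or update the stack with Docker Compose
      ansible.builtin.command:
        cmd: docker compose up -d
        chdir: "{{ runtime_dir }}"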

Used Caddy as the single ingress layer

Routed browser traffic through Caddy instead of exposing each service independently, which kept the architecture cleaner and closer to real deployments.
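A minimal Caddyfile sketch of that ingress pattern (Caddy's native config format); the hostnames are hypothetical, and the upstream ports are the tools' defaults:

    # Caddyfile (illustrative): one browser-facing entry point,
    # with upstreams addressed by their Compose service names.
    grafana.lab.localhost {
        reverse_proxy grafana:3000
    }
    uptime.lab.localhost {
        reverse_proxy uptime-kuma:3001
    }
    vault.lab.localhost {
        reverse_proxy vaultwarden:80
    }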

Split uptime monitoring from metrics monitoring

Used Uptime Kuma for availability checks and Prometheus plus Grafana for metrics, which reflects a more realistic observability setup than relying on a single tool for everything.
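On the metrics side, a minimal prometheus.yml sketch; the target assumes node_exporter's default port and a Compose service name, both illustrative:

    # prometheus.yml (illustrative): metrics collection only.
    # Availability checks live in Uptime Kuma, configured separately.
    global:
      scrape_interval: 15s

    scrape_configs:
      - job_name: node
        static_configs:
          - targets: ["node_exporter:9100"]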

Added encrypted backups early

Introduced Restic before treating the storage stack as usable, so the system had actual recovery capability instead of just persistent folders.
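A sketch of how that backup step could look as Ansible tasks; the restic_repo and restic_pw_file variables are hypothetical, and restic reads both from the environment rather than from anything Git-tracked:

    # tasks/backup.yml (illustrative)
    - name: Back up persistent runtime data with restic
      ansible.builtin.command:
        cmd: "restic backup {{ runtime_dir }}/data"
      environment:
        RESTIC_REPOSITORY: "{{ restic_repo }}"        # hypothetical variable
        RESTIC_PASSWORD_FILE: "{{ restic_pw_file }}"  # kept out of Git

    - name: Check repository integrity as part of restore testing
      ansible.builtin.command:
        cmd: restic check
      environment:
        RESTIC_REPOSITORY: "{{ restic_repo }}"
        RESTIC_PASSWORD_FILE: "{{ restic_pw_file }}"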

Risks and mitigations

Single-node local deployment is still a limited approximation of real multi-node infrastructure

The repository and Ansible structure were designed from the start to expand to remote servers without rewriting the stack.
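A sketch of what that expansion could look like in a YAML inventory; the host names and documentation-range IPs are placeholders, not real targets:

    # inventory.yml (illustrative): the same roles target the local
    # node today and remote nodes later.
    all:
      children:
        homelab:
          hosts:
            local-node:
              ansible_connection: local
            # Future remote servers, reached over SSH with the same flow:
            # remote-1:
            #   ansible_host: 203.0.113.10
            # remote-2:
            #   ansible_host: 203.0.113.11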

Stateful services can become fragile if secrets or initialization state change after first boot

Secrets were separated from version-controlled templates, and the deployment model was adjusted to keep runtime state explicit and isolated.
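One way that separation can look in practice, assuming Ansible Vault-encrypted variables feed an env template; the file names are illustrative, and the project may use a different secrets mechanism:

    # tasks/secrets.yml (illustrative): secret values are rendered
    # into the runtime directory, which stays out of version control.
    - name: Render the runtime .env from vault-protected variables
      ansible.builtin.template:
        src: templates/env.j2
        dest: "{{ runtime_dir }}/.env"
        mode: "0600"   # readable only by the deploy user
      no_log: true     # keep secret values out of task output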

Next steps

  • Bootstrap remote server 1 and remote server 2 with the same Ansible deployment flow
  • Extend Prometheus scraping and Grafana dashboards to multiple nodes
  • Add Tailscale-only remote management
  • Expand restore testing into a fuller disaster recovery runbook