BetelgeuseBytes – Architecture Overview
High-Level Architecture
This platform is a self-hosted, production-grade Kubernetes stack designed for:
- AI / ML experimentation and serving
- Data engineering & observability
- Knowledge graphs & vector search
- Automation, workflows, and research tooling
The architecture follows a hub-and-spoke model:
- Core Infrastructure: Kubernetes + networking + storage
- Platform Services: databases, messaging, auth, observability
- ML / AI Services: labeling, embeddings, LLM serving, notebooks
- Automation & Workflows: Argo Workflows, n8n
- Access Layer: DNS, Ingress, TLS
Logical Architecture Diagram (Textual)
Internet
│
▼
DNS (betelgeusebytes.io)
│
▼
Ingress-NGINX (TLS via cert-manager)
│
├── Platform UIs (Grafana, Kibana, Gitea, Neo4j, MinIO, etc.)
├── ML UIs (Jupyter, Label Studio, MLflow)
├── Automation (n8n, Argo)
└── APIs (Postgres TCP, Neo4j Bolt, Kafka)
Kubernetes Cluster
├── Control Plane
├── Worker Nodes
├── Stateful Workloads (local SSD)
└── Observability Stack
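The ingress routing in the diagram can be read as a lookup table. The sketch below copies the four route groups and their services from the diagram. The `protocol` labels are an assumption for illustration (UIs served over HTTPS, APIs over raw TCP), and `route_group` is a hypothetical helper, not part of the platform.

```python
# Route groups and service names are copied from the diagram.
# The "protocol" values are an illustrative assumption.
ROUTES = {
    "Platform UIs": {
        "protocol": "https",
        "services": ["Grafana", "Kibana", "Gitea", "Neo4j", "MinIO"],
    },
    "ML UIs": {
        "protocol": "https",
        "services": ["Jupyter", "Label Studio", "MLflow"],
    },
    "Automation": {
        "protocol": "https",
        "services": ["n8n", "Argo"],
    },
    "APIs": {
        "protocol": "tcp",
        "services": ["Postgres", "Neo4j Bolt", "Kafka"],
    },
}


def route_group(service: str) -> tuple[str, str]:
    """Return (group, protocol) for a service named in the diagram."""
    for group, entry in ROUTES.items():
        if service in entry["services"]:
            return group, entry["protocol"]
    raise KeyError(f"{service!r} is not listed behind the ingress")


if __name__ == "__main__":
    for name in ("Grafana", "Neo4j Bolt"):
        print(name, "->", route_group(name))
```

Note that Neo4j appears twice: its browser UI sits with the Platform UIs, while the Bolt endpoint is listed with the APIs.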
Key Design Principles
- Bare-metal friendly (Hetzner dedicated servers)
- Local SSD storage for stateful workloads
- Everything observable (logs, metrics, traces)
- CPU-first ML with optional GPU expansion
- Single-tenant but multi-project ready
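The local SSD principle can be illustrated with a storage class for statically provisioned local volumes. This is a sketch of one common approach, not necessarily the one used here: the overview does not say which provisioner the platform runs, and the name `local-ssd` is hypothetical. The manifest is built as a Python dict and printed as JSON, which `kubectl apply -f` accepts.

```python
import json


def local_storage_class(name: str = "local-ssd") -> dict:
    """Build a StorageClass for pre-created local PersistentVolumes.

    Assumption: volumes are provisioned statically, so no dynamic
    provisioner is configured.
    """
    return {
        "apiVersion": "storage.k8s.io/v1",
        "kind": "StorageClass",
        "metadata": {"name": name},
        # No dynamic provisioning: PersistentVolumes are created by hand.
        "provisioner": "kubernetes.io/no-provisioner",
        # Delay binding until a pod is scheduled, so the claim binds to
        # a volume on the node where the pod will actually run.
        "volumeBindingMode": "WaitForFirstConsumer",
    }


if __name__ == "__main__":
    print(json.dumps(local_storage_class(), indent=2))
```

Delayed binding matters for local disks because a volume is only reachable from the node that physically holds it.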
Networking
- Cilium CNI (eBPF-based networking)
- NGINX Ingress Controller
- TCP services (Postgres, Neo4j Bolt) exposed through a patch to the Ingress-NGINX controller
- WireGuard mesh between nodes
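Ingress-NGINX forwards raw TCP through a ConfigMap whose keys are external ports and whose values have the form `namespace/service:port`. The sketch below assumes this is the mechanism behind the TCP exposure listed above, which the overview does not confirm. The service names `postgres` and `neo4j` are hypothetical, the namespaces `db` and `graph` come from the Security Model section, and the ports are the Postgres and Bolt defaults.

```python
import json

# Hypothetical backends; only the namespaces appear in the source.
TCP_SERVICES = [
    {"port": 5432, "namespace": "db", "service": "postgres", "target_port": 5432},
    {"port": 7687, "namespace": "graph", "service": "neo4j", "target_port": 7687},
]


def tcp_services_configmap(entries, namespace: str = "ingress-nginx") -> dict:
    """Build the ConfigMap that maps external ports to backend services."""
    data = {
        # Key: port the controller listens on.
        # Value: "<namespace>/<service>:<port>" of the backend.
        str(e["port"]): f'{e["namespace"]}/{e["service"]}:{e["target_port"]}'
        for e in entries
    }
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": "tcp-services", "namespace": namespace},
        "data": data,
    }


if __name__ == "__main__":
    print(json.dumps(tcp_services_configmap(TCP_SERVICES), indent=2))
```

For this to take effect, the controller has to be started with `--tcp-services-configmap` pointing at the ConfigMap, and the controller's Service has to expose the same ports.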
Security Model
- TLS everywhere (cert-manager + Let’s Encrypt)
- Namespace isolation per domain (db, ml, graph, observability…)
- Credentials stored as Kubernetes Secrets
- Optional Basic Auth on sensitive UIs
- Keycloak available for future SSO
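The TLS and Basic Auth items above map onto a small set of Ingress fields and annotations. The sketch below builds such a manifest as a Python dict. The issuer name `letsencrypt-prod`, the host `grafana.betelgeusebytes.io`, the service name, port, and secret names are all hypothetical; only the annotation keys and the `networking.k8s.io/v1` layout are standard.

```python
import json
from typing import Optional


def ui_ingress(name: str, namespace: str, host: str, service: str, port: int,
               issuer: str = "letsencrypt-prod",
               basic_auth_secret: Optional[str] = None) -> dict:
    """Build an Ingress with a cert-manager certificate and optional Basic Auth."""
    # cert-manager watches this annotation and issues the certificate.
    annotations = {"cert-manager.io/cluster-issuer": issuer}
    if basic_auth_secret:
        # Ingress-NGINX Basic Auth, backed by an htpasswd-style Secret.
        annotations.update({
            "nginx.ingress.kubernetes.io/auth-type": "basic",
            "nginx.ingress.kubernetes.io/auth-secret": basic_auth_secret,
            "nginx.ingress.kubernetes.io/auth-realm": "Authentication Required",
        })
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": name, "namespace": namespace,
                     "annotations": annotations},
        "spec": {
            "ingressClassName": "nginx",
            # The certificate is stored in this Secret.
            "tls": [{"hosts": [host], "secretName": f"{name}-tls"}],
            "rules": [{
                "host": host,
                "http": {"paths": [{
                    "path": "/",
                    "pathType": "Prefix",
                    "backend": {"service": {"name": service,
                                            "port": {"number": port}}},
                }]},
            }],
        },
    }


if __name__ == "__main__":
    manifest = ui_ingress("grafana", "observability",
                          "grafana.betelgeusebytes.io", "grafana", 3000,
                          basic_auth_secret="grafana-basic-auth")
    print(json.dumps(manifest, indent=2))
```

Placing each Ingress in the namespace of its backing service keeps the per-domain namespace isolation intact.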
Scalability Notes
- Currently single control-plane + workers
- Designed to add:
  - More workers
  - Dedicated control-plane VPS nodes
  - GPU nodes (for vLLM / training)
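The value of adding control-plane nodes can be shown with quorum arithmetic. The sketch below assumes etcd runs on the control-plane nodes, which the overview does not state. Both helpers are hypothetical and only encode the majority rule that etcd's consensus depends on.

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with the given member count."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many members can fail while a majority is still reachable."""
    return members - quorum(members)


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(f"{n} control-plane node(s): quorum {quorum(n)}, "
              f"tolerates {tolerated_failures(n)} failure(s)")
```

Under that assumption, the current single control plane tolerates no failures, two nodes tolerate none either, and three nodes tolerate one, which is why control planes are usually grown in odd numbers.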
What This Enables
- Research platforms
- Knowledge graph + LLM pipelines
- End-to-end ML lifecycle
- Automated data pipelines
- Observability-first production applications