betelgeusebytes/INFRASTRUCTURE.md

2.9 KiB
Raw Blame History

BetelgeuseBytes Infrastructure & Cluster Configuration

Hosting Provider

  • Provider: Hetzner
  • Server Type: Dedicated servers
  • Region: EU
  • Network: Private LAN + WireGuard

Nodes

Current Nodes

Node Role Notes
hetzner-1 control-plane + worker runs core workloads
hetzner-2 worker + storage hosts local SSD PVs

Kubernetes Setup

  • Kubernetes installed via kubeadm
  • Single cluster
  • Control plane is also schedulable

CNI

  • Cilium

    • eBPF dataplane
    • kube-proxy replacement
    • Network policy support

Storage

Persistent Volumes

  • Backed by local NVMe / SSD
  • Manually provisioned PVs
  • Bound via PVCs

Storage Layout

/mnt/local-ssd/
├── postgres/
├── neo4j/
├── elasticsearch/
├── prometheus/
├── loki/
├── tempo/
├── grafana/
├── minio/
└── qdrant/

Networking

  • Ingress Controller: nginx

  • External DNS records → ingress IP

  • TCP mappings for:

    • PostgreSQL
    • Neo4j Bolt

TLS & Certificates

  • cert-manager
  • ClusterIssuer: Lets Encrypt
  • Automatic renewal

Namespaces

Namespace Purpose
db Databases (Postgres, Redis)
graph Neo4j
broker Kafka
ml ML tooling (Jupyter, Argo, MLflow)
observability Grafana, Prometheus, Loki, Tempo
automation n8n
devops Gitea, Argo CD

What This Infra Enables

  • Full onprem AI platform
  • Predictable performance
  • Low-latency data access
  • Independence from cloud providers
flowchart TB
  subgraph NET[Internet]
    W[Web/Clients]
  end

  subgraph EDGE[Edge]
    DNS[DNS: betelgeusebytes.io\nA/AAAA -> Ingress IP]
    LE[cert-manager\nLet's Encrypt]
    ING[Ingress-NGINX]
    DNS --> ING
    LE --> ING
    W --> DNS
  end

  subgraph K8S
    direction TB
    subgraph N1[Node 1]
      CP[control-plane + worker]
      PV1[(local SSD PVs)]
    end
    subgraph N2[Node 2]
      WK[worker + storage-heavy]
      PV2[(local SSD PVs)]
    end

    subgraph NS
      AI[ai: LLM, TEI, Label Studio]
      VEC[vec: Qdrant]
      GRAPH[graph: Neo4j]
      DB[db: Postgres, Redis]
      BROKER[broker: Kafka]
      STORE[storage: MinIO]
      MLOPS[ml/mlops: MLflow, Argo WF, Jupyter]
      OBS[observability: Grafana/Prom/Loki/Tempo/Alloy]
      DEV[devops: ArgoCD, Gitea]
      HAD
    end

    CP --- WK
    PV1 --- DB
    PV2 --- STORE
    PV2 --- OBS
    PV2 --- GRAPH
    PV2 --- VEC
  end

  ING -->| host routing| NS
  ING -.TCP (optional).- DB
  ING -.Bolt (optional).- GRAPH