betelgeusebytes/ARCHITECTURE.md

BetelgeuseBytes Architecture Overview

High-Level Architecture

This platform is a self-hosted, production-grade Kubernetes stack designed for:

  • AI / ML experimentation and serving
  • Data engineering & observability
  • Knowledge graphs & vector search
  • Automation, workflows, and research tooling

The architecture follows a hub-and-spoke model:

  • Core Infrastructure: Kubernetes + networking + storage
  • Platform Services: databases, messaging, auth, observability
  • ML / AI Services: labeling, embeddings, LLM serving, notebooks
  • Automation & Workflows: Argo Workflows, n8n
  • Access Layer: DNS, Ingress, TLS

Logical Architecture Diagram (Textual)

Internet
   │
   ▼
DNS (betelgeusebytes.io)
   │
   ▼
Ingress-NGINX (TLS via cert-manager)
   │
   ├── Platform UIs (Grafana, Kibana, Gitea, Neo4j, MinIO, etc.)
   ├── ML UIs (Jupyter, Label Studio, MLflow)
   ├── Automation (n8n, Argo)
   └── APIs (Postgres TCP, Neo4j Bolt, Kafka)

Kubernetes Cluster
   ├── Control Plane
   ├── Worker Nodes
   ├── Stateful Workloads (local SSD)
   └── Observability Stack
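
The HTTP(S) branch of the diagram above amounts to one Ingress rule per hostname. The following is a minimal sketch of such a rule, assuming the `networking.k8s.io/v1` Ingress schema, an ingress class named `nginx`, and a cert-manager ClusterIssuer called `letsencrypt-prod`. The issuer name, the service name, and the port are illustrative assumptions, not values taken from this repository.

```python
import json


def ingress_manifest(host, namespace, service, port,
                     issuer="letsencrypt-prod"):  # issuer name is an assumption
    """Build a networking.k8s.io/v1 Ingress dict for a single hostname."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {
            "name": service,
            "namespace": namespace,
            # cert-manager reads this annotation and requests the certificate
            "annotations": {"cert-manager.io/cluster-issuer": issuer},
        },
        "spec": {
            "ingressClassName": "nginx",
            "tls": [{"hosts": [host], "secretName": f"{service}-tls"}],
            "rules": [{
                "host": host,
                "http": {"paths": [{
                    "path": "/",
                    "pathType": "Prefix",
                    "backend": {
                        "service": {"name": service, "port": {"number": port}}
                    },
                }]},
            }],
        },
    }


if __name__ == "__main__":
    # Hypothetical service name and port for the Grafana UI
    manifest = ingress_manifest(
        "grafana.betelgeusebytes.io", "observability", "grafana", 3000
    )
    print(json.dumps(manifest, indent=2))
```

Generating manifests from a small function like this keeps the TLS and ingress-class settings identical across all UIs; the same result could equally be achieved with Helm values or Kustomize overlays.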

Key Design Principles

  • Bare-metal friendly (Hetzner dedicated servers)
  • Local SSD storage for stateful workloads
  • Everything observable (logs, metrics, traces)
  • CPU-first ML with optional GPU expansion
  • Single-tenant but multi-project ready

Networking

  • Cilium CNI (eBPF-based networking)
  • NGINX Ingress Controller
  • TCP services exposed via Ingress patch (Postgres, Neo4j Bolt)
  • WireGuard mesh between nodes
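
As an illustration of the TCP exposure listed above, the sketch below builds the `tcp-services` ConfigMap that ingress-nginx reads when started with `--tcp-services-configmap`. It assumes that the "Ingress patch" refers to this mechanism. The service names `postgres` and `neo4j` are hypothetical; the ports are the PostgreSQL and Bolt defaults, and the namespaces `db` and `graph` are the ones used elsewhere in this document.

```python
def tcp_services_configmap(routes, namespace="ingress-nginx"):
    """Build the ConfigMap that maps external TCP ports to backend services.

    routes: {external_port: (service_namespace, service_name, service_port)}
    """
    data = {}
    for ext_port, (svc_ns, svc_name, svc_port) in sorted(routes.items()):
        if not 1 <= ext_port <= 65535:
            raise ValueError(f"invalid TCP port: {ext_port}")
        # ingress-nginx expects "<namespace>/<service>:<port>" as the value
        data[str(ext_port)] = f"{svc_ns}/{svc_name}:{svc_port}"
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {"name": "tcp-services", "namespace": namespace},
        "data": data,
    }


# Service names are assumptions; ports are the PostgreSQL and Bolt defaults.
ROUTES = {
    5432: ("db", "postgres", 5432),
    7687: ("graph", "neo4j", 7687),
}

if __name__ == "__main__":
    for port, target in tcp_services_configmap(ROUTES)["data"].items():
        print(port, "->", target)
```

The controller's Service must also list the same ports, otherwise the mapping is loaded but nothing reaches it from outside the cluster.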

Security Model

  • TLS everywhere (cert-manager + Let's Encrypt)
  • Namespace isolation per domain (db, ml, graph, observability…)
  • Secrets stored in Kubernetes Secrets
  • Optional Basic Auth on sensitive UIs
  • Keycloak available for future SSO
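
To make the Secrets bullet concrete, here is a minimal sketch of how a plain key/value pair becomes a namespaced Kubernetes Secret manifest. The secret name `pg-credentials` and the key names are hypothetical; only the `db` namespace comes from this document.

```python
import base64


def secret_manifest(name, namespace, values):
    """Build an Opaque Secret dict; values are plain strings."""
    encoded = {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in values.items()
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        "data": encoded,  # base64-encoded, not encrypted
    }


if __name__ == "__main__":
    # Hypothetical credentials for illustration only
    manifest = secret_manifest(
        "pg-credentials", "db", {"username": "app", "password": "change-me"}
    )
    print(manifest["data"])
```

Because the `data` values are only base64-encoded, anyone who can read the Secret object can recover them, which is why the namespace isolation listed above matters for limiting access.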

Scalability Notes

  • Currently a single control-plane node plus workers

  • Designed to add:

    • More workers
    • Dedicated control-plane VPS nodes
    • GPU nodes (for vLLM / training)
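
A GPU expansion would mostly show up as scheduling constraints on the workloads that need it. The sketch below builds a pod spec fragment that pins a container to GPU nodes. It assumes the NVIDIA device plugin is installed (it provides the `nvidia.com/gpu` resource); the node label `hardware=gpu`, the container name, and the image tag are hypothetical choices, not part of the current cluster.

```python
def gpu_pod_spec(image, gpus=1, node_label=("hardware", "gpu")):
    """Return a pod spec fragment that requests GPUs on labeled nodes."""
    if gpus < 1:
        raise ValueError("at least one GPU must be requested")
    key, value = node_label  # hypothetical label applied to GPU nodes
    return {
        "nodeSelector": {key: value},
        "containers": [{
            "name": "vllm",
            "image": image,
            # GPUs are requested through limits; they cannot be overcommitted
            "resources": {"limits": {"nvidia.com/gpu": gpus}},
        }],
    }


if __name__ == "__main__":
    # Image name is an assumption for illustration
    print(gpu_pod_spec("vllm/vllm-openai:latest"))
```

CPU-only workloads need no change under this scheme: without the node selector and the GPU limit they keep scheduling onto the existing workers.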

What This Enables

  • Research platforms
  • Knowledge graph + LLM pipelines
  • End-to-end ML lifecycle
  • Automated data pipelines
  • Production-grade, observability-first apps

Full Architecture Diagram (Mermaid)

flowchart TB
  %% =========================
  %% BetelgeuseBytes AI Platform: Full Architecture (CPU-first, K8s)
  %% =========================

  %% ---- External / Users ----
  subgraph EXT["External Users"]
    U1["Scholar / Admin User"]
    U2["API Client\n(curl / SDK / Bots)"]
    U3["Annotator\n(Labeling UI)"]
  end

  %% ---- DNS + TLS + Ingress ----
  subgraph EDGE["Edge: DNS → TLS → Ingress"]
    DNS["DNS: betelgeusebytes.io\nA/AAAA records → Ingress IP"]
    CM["cert-manager\nLet's Encrypt TLS"]
    INGRESS["NGINX Ingress Controller\nHTTP(S) + SNI routing"]
    TCPMAP["Ingress TCP Services\n(Postgres, Neo4j Bolt)"]
  end

  %% ---- Kubernetes Cluster ----
  subgraph K8S["K8S Cluster"]
    direction TB

    subgraph NET["Networking"]
      CILIUM["Cilium CNI\n(eBPF dataplane / policies)"]
      WG["WireGuard\n(node mesh / private networking)"]
    end

    subgraph DEVOPS["Dev/GitOps"]
      GITEA["Gitea\nGit repos"]
      ARGOCD["Argo CD\nGitOps deployments"]
    end

    subgraph OBS["Observability"]
      ALLOY["Grafana Alloy\n(collect logs+traces)"]
      PROM["Prometheus\n(metrics)"]
      LOKI["Loki\n(logs)"]
      TEMPO["Tempo\n(traces)"]
      GRAF["Grafana\n(dashboards)"]
      KSM["kube-state-metrics"]
      NODEX["node-exporter"]
    end

    subgraph DATA["Core Data Layer"]
      PG["PostgreSQL\n(app DB / MLflow / Label Studio)\nNamespace: db"]
      REDIS["Redis\n(cache)\nNamespace: db"]
      ES["Elasticsearch\n(search/log store)\nNamespace: elastic"]
      KIB["Kibana\nUI\nNamespace: elastic"]
      KAFKA["Kafka\n(event bus)\nNamespace: broker"]
      KAFKAUI["Kafka UI\nUI\nNamespace: broker"]
      MINIO["MinIO (S3)\n(datasets & artifacts)\nNamespace: storage"]
    end

    subgraph KG["Knowledge & Retrieval"]
      NEO4J["Neo4j\n(knowledge graph)\nNamespace: graph"]
      QDRANT["Qdrant\n(vector DB + UI)\nNamespace: vec"]
      TEI["Text Embeddings Inference\n(embeddings API)\nNamespace: ai"]
    end

    subgraph AI["AI / ML Services"]
      LLM["LLM Server (CPU)\nOllama / llama.cpp\nNamespace: ai"]
      JUP["Jupyter\n(research notebooks)\nNamespace: ml"]
      LABEL["Label Studio\n(annotation UI)\nNamespace: ai"]
      MLFLOW["MLflow\n(tracking + registry)\nNamespace: mlops/ml"]
    end

    subgraph PIPE["Automation / Pipelines"]
      ARGO_WF["Argo Workflows\n(pipelines)\nNamespace: ml/argo"]
      N8N["n8n\n(automation)\nNamespace: automation"]
    end

    subgraph AUTH["Authentication"]
      KEYCLOAK["Keycloak\n(OIDC/SSO)\nNamespace: auth"]
    end

    subgraph APPS["Custom Applications (to build)"]
      ORCH["Hadith Orchestrator API\nNamespace: hadith"]
      ADMIN["Hadith Admin UI\nNamespace: hadith"]
      NER["NER Service\nNamespace: hadith"]
      RE["Relation Extraction Service\nNamespace: hadith"]
    end
  end

  %% ---- Edge wiring ----
  U1 --> DNS
  U2 --> DNS
  U3 --> DNS
  DNS --> INGRESS
  CM --> INGRESS

  %% ---- Public HTTP(S) routes ----
  INGRESS -->|hadith-admin.betelgeusebytes.io| ADMIN
  INGRESS -->|hadith-api.betelgeusebytes.io| ORCH
  INGRESS -->|llm.betelgeusebytes.io| LLM
  INGRESS -->|embeddings.betelgeusebytes.io| TEI
  INGRESS -->|vector.betelgeusebytes.io| QDRANT
  INGRESS -->|neo4j.betelgeusebytes.io| NEO4J
  INGRESS -->|label.betelgeusebytes.io| LABEL
  INGRESS -->|mlflow.betelgeusebytes.io| MLFLOW
  INGRESS -->|minio.betelgeusebytes.io| MINIO
  INGRESS -->|argo.betelgeusebytes.io| ARGO_WF
  INGRESS -->|auth.betelgeusebytes.io| KEYCLOAK
  INGRESS -->|grafana.betelgeusebytes.io| GRAF
  INGRESS -->|kibana.betelgeusebytes.io| KIB
  INGRESS -->|broker.betelgeusebytes.io| KAFKAUI

  %% ---- TCP routes (optional/external) ----
  INGRESS -.-> TCPMAP
  TCPMAP -.-> PG
  TCPMAP -.-> NEO4J

  %% ---- GitOps flow ----
  GITEA -->|manifests + app code| ARGOCD
  ARGOCD -->|sync/apply| K8S

  %% ---- Auth flows ----
  ADMIN -->|OIDC login| KEYCLOAK
  ORCH -->|validate JWT / introspect| KEYCLOAK
  LABEL -->|optional OIDC| KEYCLOAK
  MLFLOW -->|OIDC| KEYCLOAK

  %% ---- Orchestrator runtime data flows ----
  ORCH -->|reasoning / JSON extraction| LLM
  ORCH -->|embed queries/docs| TEI
  ORCH -->|vector search| QDRANT
  ORCH -->|graph read/write| NEO4J
  ORCH -->|metadata/users/jobs| PG
  ORCH -->|cache| REDIS
  ORCH -->|full-text search| ES

  %% ---- NER/RE services (future) ----
  ORCH --> NER
  ORCH --> RE
  NER -->|entities| NEO4J
  RE -->|relations| NEO4J

  %% ---- Data curation loop ----
  LABEL -->|labeled datasets| MINIO
  ARGO_WF -->|training data| MINIO
  ARGO_WF -->|log metrics| MLFLOW
  ARGO_WF -->|publish artifacts| MINIO
  MLFLOW -->|model versions| MINIO
  ARGO_WF -->|deploy/update services| ARGOCD

  %% ---- Event-driven (optional) ----
  ORCH -->|events| KAFKA
  ARGO_WF -->|consume triggers| KAFKA
  N8N -->|integrations/alerts| KAFKA

  %% ---- Observability wiring ----
  ALLOY --> LOKI
  ALLOY --> TEMPO
  PROM --> GRAF
  LOKI --> GRAF
  TEMPO --> GRAF
  KSM --> PROM
  NODEX --> PROM

  %% ---- Internal networking ----
  CILIUM --- INGRESS
  WG --- CILIUM
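
The orchestrator runtime data flows in the flowchart (cache, embeddings, vector search, graph lookup, LLM) can be read as a single request path. The following is a minimal sketch of that path, not the Hadith Orchestrator's actual code, which the flowchart marks as still to be built. Every function name, signature, and the ordering of the steps is an assumption, and the backends are passed in as plain callables so the example runs without Redis, TEI, Qdrant, Neo4j, or an LLM server.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence


@dataclass
class Backends:
    """Hypothetical adapters for the services the orchestrator talks to."""
    cache_get: Callable[[str], Optional[str]]                    # Redis
    cache_set: Callable[[str, str], None]                        # Redis
    embed: Callable[[str], Sequence[float]]                      # TEI
    vector_search: Callable[[Sequence[float], int], List[str]]   # Qdrant
    graph_neighbors: Callable[[str], List[str]]                  # Neo4j
    generate: Callable[[str], str]                               # LLM server


def answer(query: str, b: Backends, top_k: int = 3) -> str:
    """Cache lookup, then embed, search, expand via the graph, and generate."""
    cached = b.cache_get(query)
    if cached is not None:
        return cached
    vector = b.embed(query)
    doc_ids = b.vector_search(vector, top_k)
    # Expand each retrieved document with its neighbours in the graph
    related = [node for doc_id in doc_ids for node in b.graph_neighbors(doc_id)]
    prompt = (
        f"Question: {query}\n"
        f"Documents: {doc_ids}\n"
        f"Related entities: {related}"
    )
    result = b.generate(prompt)
    b.cache_set(query, result)
    return result


if __name__ == "__main__":
    store: Dict[str, str] = {}
    fake = Backends(
        cache_get=store.get,
        cache_set=store.__setitem__,
        embed=lambda text: [float(len(text))],
        vector_search=lambda vec, k: ["doc-1", "doc-2"][:k],
        graph_neighbors=lambda doc_id: [f"{doc_id}-entity"],
        generate=lambda prompt: "stub answer",
    )
    print(answer("example question", fake))
    print(answer("example question", fake))  # second call is served from the cache
```

Passing the backends in as callables keeps the request path testable in isolation; real adapters for each service can replace the lambdas without changing `answer`.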