# BetelgeuseBytes – Infrastructure & Cluster Configuration

## Hosting Provider

* **Provider**: Hetzner
* **Server Type**: Dedicated servers
* **Region**: EU
* **Network**: Private LAN + WireGuard

---

## Nodes

### Current Nodes

| Node      | Role                   | Notes               |
| --------- | ---------------------- | ------------------- |
| hetzner-1 | control-plane + worker | runs core workloads |
| hetzner-2 | worker + storage       | hosts local SSD PVs |

---

## Kubernetes Setup

* Kubernetes installed via kubeadm
* Single cluster
* Control plane is also schedulable

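
One way to get a schedulable control plane with kubeadm is to register the node without the default control-plane taint. The fragment below is only a sketch: the file name and the `apiVersion` are assumptions (use whichever kubeadm config version your release supports), and the actual cluster may instead have been untainted afterwards with `kubectl taint nodes hetzner-1 node-role.kubernetes.io/control-plane:NoSchedule-`.

```yaml
# kubeadm-config.yaml (hypothetical file name)
# Would be passed as: kubeadm init --config kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta3   # assumption: match your kubeadm release
kind: InitConfiguration
nodeRegistration:
  name: hetzner-1
  # An explicit empty list tells kubeadm not to add the default
  # node-role.kubernetes.io/control-plane:NoSchedule taint,
  # so regular workloads can be scheduled on this node.
  taints: []
```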

### CNI

* **Cilium**

  * eBPF dataplane
  * kube-proxy replacement
  * Network policy support

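
As an illustration of the kube-proxy replacement mode listed above, these are the Cilium Helm values that would typically be involved. This is a sketch, not the cluster's actual values file: the file name and the API server address are placeholders, and the accepted value for `kubeProxyReplacement` depends on the chart version. Because no kube-proxy is running, Cilium has to be told how to reach the API server directly.

```yaml
# cilium-values.yaml (hypothetical file name) for the Cilium Helm chart
kubeProxyReplacement: true   # older chart versions expect "strict" instead
k8sServiceHost: 10.0.0.1     # placeholder: API server address on the private LAN
k8sServicePort: 6443         # kubeadm's default API server port
```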

---

## Storage

### Persistent Volumes

* Backed by **local NVMe / SSD**
* Manually provisioned PVs
* Bound via PVCs

### Storage Layout

```
/mnt/local-ssd/
├── postgres/
├── neo4j/
├── elasticsearch/
├── prometheus/
├── loki/
├── tempo/
├── grafana/
├── minio/
└── qdrant/
```

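
To make the manual PV workflow concrete, here is a minimal sketch of a statically provisioned local volume for the `neo4j/` directory and a claim that binds to it. The path, the node name `hetzner-2`, and the `graph` namespace come from this document; the object names, the `local-ssd` StorageClass, and the 50Gi size are assumptions for illustration. `WaitForFirstConsumer` delays binding until a pod is scheduled, which is the usual choice for node-local disks.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-ssd                      # hypothetical name
provisioner: kubernetes.io/no-provisioner   # static (manual) provisioning
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: neo4j-pv                       # hypothetical name
spec:
  capacity:
    storage: 50Gi                      # assumed size
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-ssd
  local:
    path: /mnt/local-ssd/neo4j
  nodeAffinity:                        # required for local volumes
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - hetzner-2
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: neo4j-data                     # hypothetical name
  namespace: graph
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: local-ssd
  resources:
    requests:
      storage: 50Gi
```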

---

## Networking

* Ingress Controller: nginx
* External DNS records → ingress IP
* TCP mappings for:

  * PostgreSQL
  * Neo4j Bolt

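
With ingress-nginx, raw TCP mappings like the two above are usually declared in a ConfigMap that the controller reads through its `--tcp-services-configmap` flag. The sketch below assumes the controller runs in an `ingress-nginx` namespace and that the backing Services are named `postgres` and `neo4j` on their default ports; none of those names are confirmed by this document. The same ports also have to be exposed on the controller's own Service for traffic to reach it.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: tcp-services          # must match the --tcp-services-configmap flag
  namespace: ingress-nginx    # assumption
data:
  # "<exposed port>": "<namespace>/<service name>:<service port>"
  "5432": "db/postgres:5432"  # PostgreSQL (hypothetical Service name)
  "7687": "graph/neo4j:7687"  # Neo4j Bolt (hypothetical Service name)
```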

---

## TLS & Certificates

* cert-manager
* ClusterIssuer: Let’s Encrypt
* Automatic renewal

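
For reference, a Let’s Encrypt ClusterIssuer for cert-manager generally looks like the sketch below. The issuer name, contact email, and secret name are placeholders, and the HTTP-01 solver through the nginx ingress class is an assumption (a DNS-01 solver would be equally plausible). Certificates requested through such an issuer are renewed by cert-manager before they expire.

```yaml
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod                 # hypothetical name
spec:
  acme:
    server: https://acme-v02.api.letsencrypt.org/directory
    email: admin@betelgeusebytes.io      # placeholder contact address
    privateKeySecretRef:
      name: letsencrypt-prod-account-key # Secret holding the ACME account key
    solvers:
      - http01:
          ingress:
            # Older cert-manager releases use `class: nginx` here instead.
            ingressClassName: nginx
```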

---

## Namespaces

| Namespace     | Purpose                            |
| ------------- | ---------------------------------- |
| db            | Databases (Postgres, Redis)        |
| graph         | Neo4j                              |
| broker        | Kafka                              |
| ml            | ML tooling (Jupyter, Argo, MLflow) |
| observability | Grafana, Prometheus, Loki, Tempo   |
| automation    | n8n                                |
| devops        | Gitea, Argo CD                     |

---

## What This Infra Enables

* Full on-prem AI platform
* Predictable performance
* Low-latency data access
* Independence from cloud providers

```mermaid
flowchart TB
  subgraph NET[Internet]
    W[Web/Clients]
  end

  subgraph EDGE[Edge]
    DNS[DNS: betelgeusebytes.io\nA/AAAA -> Ingress IP]
    LE[cert-manager\nLet's Encrypt]
    ING[Ingress-NGINX]
    DNS --> ING
    LE --> ING
    W --> DNS
  end

  subgraph K8S
    direction TB
    subgraph N1[Node 1]
      CP[control-plane + worker]
      PV1[(local SSD PVs)]
    end
    subgraph N2[Node 2]
      WK[worker + storage-heavy]
      PV2[(local SSD PVs)]
    end

    subgraph NS
      AI[ai: LLM, TEI, Label Studio]
      VEC[vec: Qdrant]
      GRAPH[graph: Neo4j]
      DB[db: Postgres, Redis]
      BROKER[broker: Kafka]
      STORE[storage: MinIO]
      MLOPS[ml/mlops: MLflow, Argo WF, Jupyter]
      OBS[observability: Grafana/Prom/Loki/Tempo/Alloy]
      DEV[devops: ArgoCD, Gitea]
      HAD
    end

    CP --- WK
    PV1 --- DB
    PV2 --- STORE
    PV2 --- OBS
    PV2 --- GRAPH
    PV2 --- VEC
  end

  ING -->| host routing| NS
  ING -.TCP (optional).- DB
  ING -.Bolt (optional).- GRAPH