# 🧠 BetelgeuseBytes – Full Stack Catalog

This document lists **every major component deployed in the cluster**, what it is used for today, and what it can be reused for.

---

## Core Platform

| Component     | Namespace     | Purpose         | Reuse           |
| ------------- | ------------- | --------------- | --------------- |
| Kubernetes    | all           | Orchestration   | Any platform    |
| Cilium        | kube-system   | Networking      | Secure clusters |
| NGINX Ingress | ingress-nginx | Traffic routing | API gateway     |
| cert-manager  | cert-manager  | TLS automation  | PKI             |

---

## Databases & Messaging

| Component     | URL / Access    | Purpose         | Reuse            |
| ------------- | --------------- | --------------- | ---------------- |
| PostgreSQL    | TCP via Ingress | Relational DB   | App backends     |
| Redis         | internal        | Cache           | Queues           |
| Kafka         | kafka-ui UI     | Event streaming | Streaming ETL    |
| Elasticsearch | Kibana UI       | Search + logs   | Full‑text search |

---

## Knowledge & Vector

| Component | URL                       | Purpose         | Reuse           |
| --------- | ------------------------- | --------------- | --------------- |
| Neo4j     | neo4j.betelgeusebytes.io  | Knowledge graph | Graph analytics |
| Qdrant    | vector.betelgeusebytes.io | Vector search   | RAG             |

---

## ML & AI

| Component    | URL                           | Purpose         | Reuse            |
| ------------ | ----------------------------- | --------------- | ---------------- |
| Jupyter      | notebook UI                   | Experiments     | Research         |
| Label Studio | label.betelgeusebytes.io      | Annotation      | Dataset creation |
| MLflow       | mlflow.betelgeusebytes.io     | Model tracking  | MLOps            |
| Ollama / LLM | llm.betelgeusebytes.io        | LLM inference   | Agents           |
| Embeddings   | embeddings.betelgeusebytes.io | Text embeddings | Semantic search  |

---

## Automation & DevOps

| Component      | URL                     | Purpose             | Reuse       |
| -------------- | ----------------------- | ------------------- | ----------- |
| Argo Workflows | argo.betelgeusebytes.io | Pipelines           | ETL         |
| Argo CD        | argocd UI               | GitOps              | CI/CD       |
| Gitea          | gitea UI                | Git hosting         | SCM         |
| n8n            | automation UI           | Workflow automation | Integration |

---

## Observability (LGTM)

| Component  | Purpose         | Reuse                  |
| ---------- | --------------- | ---------------------- |
| Grafana    | Dashboards      | Ops center             |
| Prometheus | Metrics         | Monitoring             |
| Loki       | Logs            | Debugging              |
| Tempo      | Traces          | Distributed tracing    |
| Alloy      | Telemetry agent | Standardized telemetry |

---

## Authentication

| Component | Purpose    | Reuse |
| --------- | ---------- | ----- |
| Keycloak  | OIDC / SSO | IAM   |

---

## Why This Stack Matters

* Covers **data → ML → serving → observability** end‑to‑end
* Suitable for research **and** production
* Modular and future‑proof

# 📚 Stack Catalog – Services, URLs, Access & Usage

This document lists **every deployed component**, how to access it,
what it is used for **now**, and what it enables **in the future**.

---

## 🌐 Public Services (Ingress / HTTPS)

| Component | URL | Auth | What It Is | Current Usage | Future Usage |
|--------|-----|------|------------|---------------|--------------|
| LLM Inference | https://llm.betelgeusebytes.io | none / internal | CPU LLM server (Ollama / llama.cpp) | Extract sanad & matn as JSON | Agents, doc AI, RAG |
| Embeddings | https://embeddings.betelgeusebytes.io | none / internal | Text Embeddings Inference (HF) | Hadith & bio embeddings | Semantic search |
| Vector DB | https://vector.betelgeusebytes.io | none | Qdrant + UI | Similarity search | Recommendations |
| Graph DB | https://neo4j.betelgeusebytes.io | Basic Auth | Neo4j Browser | Isnād graph | Knowledge graphs |
| Orchestrator | https://hadith-api.betelgeusebytes.io | OIDC | FastAPI router | Core AI API | Any AI backend |
| Admin UI | https://hadith-admin.betelgeusebytes.io | OIDC | Next.js UI | Scholar review | Any internal tool |
| Labeling | https://label.betelgeusebytes.io | Local / OIDC | Label Studio | NER/RE annotation | Dataset curation |
| ML Tracking | https://mlflow.betelgeusebytes.io | OIDC | MLflow UI | Experiments & models | Governance |
| Object Storage | https://minio.betelgeusebytes.io | Access key | MinIO Console | Datasets & artifacts | Data lake |
| Pipelines | https://argo.betelgeusebytes.io | SA / OIDC | Argo Workflows UI | ML pipelines | ETL |
| Auth | https://auth.betelgeusebytes.io | Admin login | Keycloak | SSO & tokens | IAM |
| Observability | https://grafana.betelgeusebytes.io | Login | Grafana | Metrics/logs/traces | Ops center |
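
The LLM Inference row above lists "Extract sanad & matn as JSON" as the current usage. As a rough illustration only, the sketch below builds (but does not send) such a request. It assumes the endpoint is backed by Ollama rather than llama.cpp and exposes Ollama's `/api/generate` route; the model name `example-model`, the prompt wording, and the output keys `sanad` and `matn` are hypothetical placeholders, not values taken from the deployment.

```python
import json
import urllib.request

LLM_BASE = "https://llm.betelgeusebytes.io"  # URL from the Public Services table


def build_extraction_request(hadith_text, model="example-model"):
    """Build (without sending) a JSON-mode generation request.

    Assumption: the server speaks Ollama's HTTP API. The model name and
    the prompt are placeholders for illustration.
    """
    prompt = (
        "Split the following hadith into its sanad (chain of narrators) "
        "and matn (body text). Reply as JSON with the keys "
        '"sanad" and "matn".\n\n' + hadith_text
    )
    payload = {
        "model": model,
        "prompt": prompt,
        "format": "json",  # ask the server to constrain output to JSON
        "stream": False,   # return one response object instead of chunks
    }
    return urllib.request.Request(
        f"{LLM_BASE}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# Sending is left to the caller, for example:
#   with urllib.request.urlopen(build_extraction_request(text)) as resp:
#       result = json.load(resp)
```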

---

## 🔐 Authentication & Access Summary

| System | Auth Method | Who Uses It |
|-----|------------|-------------|
| Keycloak | Username / Password | Admins |
| Admin UI | OIDC (Keycloak) | Scholars |
| Orchestrator API | OIDC Bearer Token | Apps |
| MLflow | OIDC | ML engineers |
| Label Studio | Local / OIDC | Annotators |
| Neo4j | Basic Auth | Engineers |
| MinIO | Access / Secret key | Pipelines |
| Grafana | Login | Operators |
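
For the "OIDC Bearer Token" row, an app would typically obtain a token from Keycloak and pass it in the `Authorization` header of each Orchestrator call. The sketch below builds both requests without sending them. It assumes a client-credentials client exists and that Keycloak serves the token endpoint at `/realms/<realm>/protocol/openid-connect/token` (the layout of recent Keycloak releases; older ones prefix the path with `/auth`). The realm, client ID, secret, and the `/health` path are hypothetical.

```python
import urllib.parse
import urllib.request

AUTH_BASE = "https://auth.betelgeusebytes.io"       # Keycloak, from the catalog
API_BASE = "https://hadith-api.betelgeusebytes.io"  # Orchestrator, from the catalog


def build_token_request(realm, client_id, client_secret):
    """Build a client-credentials token request (realm and client are assumptions)."""
    url = f"{AUTH_BASE}/realms/{realm}/protocol/openid-connect/token"
    form = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("ascii")
    # Supplying form-encoded bytes as data makes this a POST body.
    return urllib.request.Request(url, data=form, method="POST")


def build_api_request(path, access_token):
    """Attach the bearer token to a call against the Orchestrator API."""
    return urllib.request.Request(
        f"{API_BASE}{path}",
        headers={"Authorization": f"Bearer {access_token}"},
    )

# Hypothetical usage; names below are placeholders:
#   token_req = build_token_request("example-realm", "example-app", "secret")
#   api_req = build_api_request("/health", "<access_token from the token response>")
```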

---

## 🧠 Internal Cluster Services (ClusterIP)

| Component | Namespace | Purpose |
|--------|-----------|--------|
| PostgreSQL | db | Relational storage |
| Redis | db | Cache / temp state |
| Kafka | broker | Event backbone |
| Prometheus | observability | Metrics |
| Loki | observability | Logs |
| Tempo | observability | Traces |
| Alloy | observability | Telemetry agent |
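
ClusterIP services are reachable from inside the cluster through Kubernetes DNS names of the form `<service>.<namespace>.svc.<cluster-domain>`. The sketch below derives those names from the table above. The namespaces come from the table, but the Service names (lowercased component names) and the default `cluster.local` domain are assumptions; the actual Service objects may be named differently.

```python
# Namespace per component, copied from the table above.
# Service names are ASSUMED to be the lowercased component names.
INTERNAL_SERVICES = {
    "postgresql": "db",
    "redis": "db",
    "kafka": "broker",
    "prometheus": "observability",
    "loki": "observability",
    "tempo": "observability",
    "alloy": "observability",
}


def cluster_dns(service, namespace, cluster_domain="cluster.local"):
    """Return the in-cluster DNS name for a Service."""
    return f"{service}.{namespace}.svc.{cluster_domain}"


def all_dns_names(services=INTERNAL_SERVICES):
    """Map each assumed Service name to its in-cluster DNS name."""
    return {name: cluster_dns(name, ns) for name, ns in services.items()}
```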

---

## 🗂 Storage Responsibilities

| Storage | Used By | Contains |
|------|--------|---------|
| MinIO | Pipelines, MLflow | Datasets, models |
| Neo4j PVC | Graph DB | Isnād graph |
| Qdrant PVC | Vector DB | Embeddings |
| PostgreSQL PVC | DB | Metadata |
| Observability PVCs | LGTM | Logs, metrics, traces |