P5G Marvis · AI Integration
End-to-End Request Flow
How a natural-language query becomes a network-aware AI response
On-Prem LLM · No data leaves the network
1 · Operator
NCM Browser
React SPA (HPE NCM)
P5G Marvis AI pane
loaded in iframe
HTTPS query
2 · Proxy
Traefik
172.27.0.159
/core/marvis/*
→ :8100 strip prefix
POST /api/query
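The proxy step above can be sketched as a Traefik dynamic-configuration fragment. Only the path prefix (`/core/marvis`), the strip-prefix behavior, and the target port `:8100` come from this deck; the router, middleware, and service names are illustrative:

```yaml
# Illustrative Traefik (v2+) dynamic config: route /core/marvis/* to the
# FastAPI backend on :8100, stripping the prefix before forwarding.
http:
  routers:
    marvis:
      rule: "PathPrefix(`/core/marvis`)"
      middlewares: [marvis-strip]
      service: marvis-svc
  middlewares:
    marvis-strip:
      stripPrefix:
        prefixes: ["/core/marvis"]
  services:
    marvis-svc:
      loadBalancer:
        servers:
          - url: "http://127.0.0.1:8100"
```

With the prefix stripped, a browser request to `/core/marvis/api/query` arrives at the backend as `POST /api/query`, matching the flow above.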
3 · Backend
p5g-marvis
FastAPI · :8100
Fetches live NF status
+ active alerts
→ builds system prompt
OpenAI-compat API
4 · Inference
llama.cpp server
172.27.0.135:8001 · HTTPS · self-signed TLS
Gemma 4 · 26B · Q4_K_S quant · Reasoning model
Context injected into system prompt by p5g-marvis before every LLM call
📡 12 NF states (UDR, AMF, SMF, UPF…)
🔴 Active alerts (name, severity, summary)
🕐 Timestamp
📝 User query (natural language)
No training data leaves the site.
Generation limit: max_tokens = 1 024 · request timeout: 120 s.
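The context injection listed above might look like the following sketch. The alert fields (`name`, `severity`, `summary`) are taken from this slide; the function name and prompt wording are assumptions:

```python
from datetime import datetime, timezone

def build_system_prompt(nf_states: dict, alerts: list) -> str:
    """Assemble the context-enriched system prompt (illustrative format)."""
    lines = [
        "You are Marvis, an assistant for a private 5G core network.",
        f"Snapshot time: {datetime.now(timezone.utc).isoformat()}",
        "Network function states:",
    ]
    for nf, state in nf_states.items():          # e.g. {"AMF": "RUNNING"}
        lines.append(f"  - {nf}: {state}")
    lines.append("Active alerts:" if alerts else "Active alerts: none")
    for a in alerts:                             # fields per this slide
        lines.append(f"  - [{a['severity']}] {a['name']}: {a['summary']}")
    return "\n".join(lines)
```

The user's natural-language query is then sent as a separate `user` message on top of this system prompt.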
↩ Response path: LLM generates markdown analysis (via content or reasoning_content field) → FastAPI returns JSON → iframe renders markdown → operator reads actionable insight
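A minimal sketch of the backend's call to the llama.cpp server, assuming the `requests` library. The endpoint, model filename, `max_tokens`, timeout, and disabled TLS verification are taken from this deck; the function names are illustrative:

```python
import requests  # common HTTP client; an assumption, not confirmed by the deck

BASE_URL = "https://172.27.0.135:8001"
MODEL = "gemma-4-26B-A4B-it-UD-Q4_K_S.gguf"

def build_payload(system_prompt: str, user_query: str) -> dict:
    """OpenAI-compatible chat payload with the limits from the deck."""
    return {
        "model": MODEL,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_query},
        ],
        "max_tokens": 1024,
    }

def ask_llm(system_prompt: str, user_query: str) -> dict:
    # Self-signed cert inside the air-gapped network, hence verify=False.
    r = requests.post(
        f"{BASE_URL}/v1/chat/completions",
        json=build_payload(system_prompt, user_query),
        timeout=120,
        verify=False,
    )
    r.raise_for_status()
    return r.json()
```

The JSON response is then unpacked (from `content` or `reasoning_content`) and returned to the iframe as described in the response path above.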
1 / 2
P5G Marvis · AI Integration
Configuration & Design Choices
How the integration is wired, and why
172.27.0.159 only · Rule-based fallback
Design Choices
🔒
Fully air-gapped inference
LLM at 172.27.0.135 stays inside the private 5G network. No cloud API keys, no data egress. Self-signed TLS with verify=False for local trust boundary.
🧠
Context-enriched prompt engineering
Every request carries live NF state and alert data. The model never sees a bare question — it always gets the full network picture, so answers are grounded in real telemetry.
Reasoning model handling
Gemma 4 returns a reasoning_content field when content is empty. The backend falls back gracefully so the thinking trace is surfaced rather than dropped.
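The graceful fallback described here can be sketched as follows; the field names follow the OpenAI-compatible response shape, while the helper name is illustrative:

```python
def extract_answer(response: dict) -> str:
    """Prefer `content`; fall back to `reasoning_content` when the
    reasoning model leaves `content` empty, so the trace is surfaced
    rather than dropped."""
    msg = response["choices"][0]["message"]
    return msg.get("content") or msg.get("reasoning_content") or ""
```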
🛡️
Rule-based fallback
If the LLM is unreachable or times out, the backend falls through to a deterministic rule engine that still returns a formatted, accurate network health summary.
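A sketch of this fallback path, assuming a simple exception-driven switch; the summary format is an illustration, not the actual rule engine:

```python
def health_summary(nf_states: dict, alerts: list) -> str:
    """Deterministic network health summary (format is an assumption)."""
    down = [nf for nf, s in nf_states.items() if s != "RUNNING"]
    parts = [f"{len(nf_states) - len(down)}/{len(nf_states)} network functions running."]
    if down:
        parts.append("Down: " + ", ".join(down) + ".")
    parts.append(f"{len(alerts)} active alert(s).")
    return " ".join(parts)

def answer_query(system_prompt, query, llm_call, nf_states, alerts):
    try:
        # llm_call may raise on timeout or connection refused
        return llm_call(system_prompt, query)
    except Exception:
        # Fall through to the deterministic rule engine
        return health_summary(nf_states, alerts)
```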
Runtime Configuration (systemd env)
# /etc/systemd/system/p5g-marvis.service
Environment=MARVIS_AI_MODE=openai
Environment=MARVIS_OPENAI_BASE_URL=https://172.27.0.135:8001
Environment=MARVIS_OPENAI_MODEL=gemma-4-26B-A4B-it-UD-Q4_K_S.gguf
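On the backend side, these variables might be read as below. The variable names come from the unit file above; the defaults are assumptions (`rule` matching the deterministic mode listed under Key Parameters):

```python
import os

# Runtime configuration injected by the systemd unit.
AI_MODE = os.environ.get("MARVIS_AI_MODE", "rule")       # "openai" or "rule"
BASE_URL = os.environ.get("MARVIS_OPENAI_BASE_URL", "")  # e.g. the llama.cpp URL
MODEL = os.environ.get("MARVIS_OPENAI_MODEL", "")        # GGUF model filename
```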
Key Parameters
LLM endpoint · 172.27.0.135:8001
API format · /v1/chat/completions
Auth header · None (skipped if key empty)
TLS verify · Disabled (self-signed)
max_tokens · 1 024
Timeout · 120 s
Hosts with LLM mode · 172.27.0.159 only
192.168.86.173 mode · rule (deterministic)
Routing (Traefik)
NCM sidebar inject · patch-ncm.py → JS bundle
Marvis iframe path · /core/marvis/ → :8100
AI sub-page · /core/marvis/ai
2 / 2