Logging LLM calls is the baseline. In 2025 the bar is higher: real-time quality scoring, embedding drift detection, and predictive alerting.
## Beyond logging
- Real-time quality: every response is scored inline as it is served
- Embedding drift: automatically detect shifts in the distribution of incoming queries (see the sketch after this list)
- Predictive cost: forecast AI spending rather than just reporting it after the fact
- User satisfaction: correlate user feedback with automated quality scores
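A minimal sketch of the embedding-drift idea: compare the centroid of recent query embeddings against a frozen baseline and flag the shift when the cosine distance crosses a threshold. The `DRIFT_THRESHOLD` value and the `embed` helper in the usage comment are assumptions for illustration, not part of any specific tool.

```python
import numpy as np

DRIFT_THRESHOLD = 0.15  # assumed tolerance; tune against your own traffic

def centroid(embeddings: np.ndarray) -> np.ndarray:
    """Mean vector of a batch of query embeddings (shape: [n, dim])."""
    return embeddings.mean(axis=0)

def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    """1 - cosine similarity; 0 means identical direction."""
    return 1.0 - float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def detect_drift(baseline: np.ndarray, recent: np.ndarray) -> tuple[bool, float]:
    """Compare the centroid of recent query embeddings against a frozen baseline.

    Returns (drifted, distance). A centroid shift is a crude but cheap signal;
    heavier setups add population-level tests on top of it.
    """
    distance = cosine_distance(centroid(baseline), centroid(recent))
    return distance > DRIFT_THRESHOLD, distance

# Usage sketch: `embed` is whatever embedding model you already run.
# baseline = embed(queries_from_last_month)   # shape [n, dim]
# recent   = embed(queries_from_last_hour)
# drifted, score = detect_drift(baseline, recent)
```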
## Stack 2025
Langfuse for tracing. Arize Phoenix for evaluations. Grafana for business metrics. PagerDuty for alerts.
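To make the wiring concrete, a hedged sketch of how a single request could flow through Langfuse: one trace per request, a generation span for the model call, and the inline quality score attached to the same trace so dashboards can aggregate it. `call_llm` and `score_quality` are hypothetical stand-ins, and the method names follow the v2-style low-level Langfuse Python SDK; newer SDK releases may expose a different API.

```python
from langfuse import Langfuse  # assumes the Langfuse Python SDK (v2-style low-level API)

langfuse = Langfuse()  # reads LANGFUSE_PUBLIC_KEY / LANGFUSE_SECRET_KEY / LANGFUSE_HOST from env

def call_llm(question: str) -> str:
    """Hypothetical stand-in for your actual model call."""
    return "stubbed answer to: " + question

def score_quality(question: str, response: str) -> float:
    """Hypothetical stand-in for your inline evaluator (0.0-1.0)."""
    return 0.9

def answer(question: str) -> str:
    # One trace per user request; the spans below hang off it.
    trace = langfuse.trace(name="qa-request", input=question)

    response = call_llm(question)
    trace.generation(
        name="completion",
        model="gpt-4o",  # illustrative model name
        input=question,
        output=response,
    )

    # Attach the inline quality score to the same trace so it can feed
    # the quality-drop alert described below.
    trace.score(name="quality", value=score_quality(question, response))
    return response

if __name__ == "__main__":
    answer("What does the refund policy cover?")
    langfuse.flush()  # send buffered events before the process exits
```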
## Alert fatigue
Quality drop >10% sustained for 1 hour → alert. Cost spike >50% → alert. Error rate >5% → immediate page. Everything else → daily digest.
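As a concrete reading of these thresholds, a small routing sketch. The `MetricWindow` shape and field names are assumptions for illustration, not the API of any particular alerting tool.

```python
from dataclasses import dataclass

@dataclass
class MetricWindow:
    """Aggregated metrics over one monitoring window."""
    quality_drop_pct: float    # relative drop vs. the rolling baseline, in percent
    quality_drop_minutes: int  # how long the drop has been sustained
    cost_spike_pct: float      # relative cost increase vs. the forecast, in percent
    error_rate_pct: float      # share of failed LLM calls, in percent

def route_alert(m: MetricWindow) -> str:
    """Map a metric window onto the three severities above."""
    if m.error_rate_pct > 5:
        return "page"    # error rate >5% → immediate page (PagerDuty)
    if m.quality_drop_pct > 10 and m.quality_drop_minutes >= 60:
        return "alert"   # quality drop >10% sustained for 1 hour
    if m.cost_spike_pct > 50:
        return "alert"   # cost spike >50%
    return "digest"      # everything else → daily digest

# A sustained quality regression without an error spike lands in "alert".
print(route_alert(MetricWindow(quality_drop_pct=12, quality_drop_minutes=75,
                               cost_spike_pct=5, error_rate_pct=1)))
```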
## Observability is the new testing
In the non-deterministic world of LLMs, production monitoring is more important than pre-production testing.
Tags: llm monitoring, observability, ai ops, production