Production Telemetry

Your AI learns from what actually breaks in production — automatically.

When your AI writes code and that code breaks in production, ekkOS detects the failure, links it to the session that wrote the code, and asks you to confirm before creating a permanent anti-pattern. Next time your AI writes similar code, it avoids the mistake. No setup required — it works through the conversation itself.

The Problem

Every AI coding tool today operates in a bubble. Your AI writes code, you deploy it, something breaks at 2am. You get a Sentry alert, dig through logs, find the root cause, go back to your IDE, explain the whole context to your AI, and fix it. The AI learns nothing from the production failure. Next time it writes similar code, it makes the same mistake.

AI writes code → deploy → breaks at 2am → you investigate → explain to AI → fix → AI forgets → repeat

How It Works

Three steps make up the loop. The first two run silently in the background; the third asks for your confirmation. With the Pulse proxy, none of them requires setup.

1. Commit Tracking

Every time your AI commits code during a session, ekkOS records which session produced which commit. This is the attribution chain.
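One way to picture the attribution chain is as a lookup from commit SHA to the session that produced it. The sketch below is illustrative only: `AttributionChain`, `recordCommit`, and `sessionFor` are hypothetical names rather than ekkOS internals, and the in-memory map stands in for whatever storage ekkOS actually uses.

```typescript
// Hypothetical in-memory model of the attribution chain (not the ekkOS implementation).
interface CommitRecord {
  commitSha: string;
  sessionId: string;
  recordedAt: Date;
}

class AttributionChain {
  private bySha = new Map<string, CommitRecord>();

  // Record that a session produced a commit.
  recordCommit(commitSha: string, sessionId: string): void {
    this.bySha.set(commitSha, { commitSha, sessionId, recordedAt: new Date() });
  }

  // Look up the session that produced a commit, if the commit was tracked.
  sessionFor(commitSha: string): string | undefined {
    const record = this.bySha.get(commitSha);
    return record ? record.sessionId : undefined;
  }
}

const chain = new AttributionChain();
chain.recordCommit("abc1234", "session-42");
const trackedSession = chain.sessionFor("abc1234");   // the session that wrote the commit
const untrackedSession = chain.sessionFor("def5678"); // undefined: commit was never tracked
```

Keying on the SHA means a later failure only needs the commit identifier to find the session that wrote the code.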

2. Failure Detection

When you paste a stack trace, mention a CI failure, or share an error in conversation, ekkOS detects it and links it to the commit that plausibly caused it.
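How ekkOS recognizes a failure in conversation text is not spelled out here. As a rough illustration of the idea, a heuristic could match a few common error shapes. The patterns and the `looksLikeFailure` helper below are assumptions made for this sketch, not the detection rules ekkOS ships.

```typescript
// Illustrative heuristic only; the patterns ekkOS actually uses are not documented here.
const FAILURE_PATTERNS: RegExp[] = [
  /\b\w*(Error|Exception):/,                               // e.g. "TypeError: ..."
  /^\s+at .+\(.+:\d+:\d+\)/m,                              // JavaScript stack frame
  /Traceback \(most recent call last\)/,                   // Python traceback header
  /\b(CI|build|pipeline)\b[^.\n]*\bfail(ed|ing|ure|s)?\b/i // "the CI build failed"
];

// Returns true when a message resembles a stack trace, error, or CI failure report.
function looksLikeFailure(message: string): boolean {
  return FAILURE_PATTERNS.some((pattern) => pattern.test(message));
}

const pastedTrace = "TypeError: boom\n    at handler (/app/auth.js:12:5)";
const ciMention = "The CI build failed on main after my last push";
const ordinaryRequest = "Can you rename this function?";
```

A heuristic like this will produce false positives, which is one reason the verification step that follows matters.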

3. User Verification

ekkOS asks you to confirm: "Was this failure caused by your code change, or something else?" Only confirmed failures become anti-patterns.

AI commits code → ekkOS tracks commit → CI fails → ekkOS detects → asks you → you confirm → anti-pattern forged → AI avoids that mistake next time
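The verification step can be read as a small state transition: a detected failure stays a pending candidate until you answer, and only a confirmation yields an anti-pattern. The following is a minimal sketch under that reading. The types, field names, and `resolveCandidate` function are hypothetical and are not part of any ekkOS API.

```typescript
// Hypothetical model of the verification gate; names are illustrative, not ekkOS APIs.
type Verdict = "caused_by_change" | "something_else";

interface FailureCandidate {
  id: string;
  commitSha: string;
  errorContent: string;
  status: "pending" | "confirmed" | "dismissed";
}

interface AntiPattern {
  sourceCandidateId: string;
  commitSha: string;
  errorContent: string;
}

// Apply the user's answer. An anti-pattern is produced only on explicit confirmation.
function resolveCandidate(
  candidate: FailureCandidate,
  verdict: Verdict
): { candidate: FailureCandidate; antiPattern: AntiPattern | null } {
  if (candidate.status !== "pending") {
    return { candidate, antiPattern: null }; // already resolved, nothing to forge
  }
  if (verdict === "something_else") {
    return { candidate: { ...candidate, status: "dismissed" }, antiPattern: null };
  }
  return {
    candidate: { ...candidate, status: "confirmed" },
    antiPattern: {
      sourceCandidateId: candidate.id,
      commitSha: candidate.commitSha,
      errorContent: candidate.errorContent,
    },
  };
}

const pendingCandidate: FailureCandidate = {
  id: "c1",
  commitSha: "abc1234",
  errorContent: "TypeError: Cannot read property...",
  status: "pending",
};
const confirmedResult = resolveCandidate(pendingCandidate, "caused_by_change");
const dismissedResult = resolveCandidate(pendingCandidate, "something_else");
```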

Two Ways It Works

Production telemetry works either through the Pulse proxy or directly through MCP tools.

Via Pulse (Automatic)

If you route through the ekkOS proxy, everything is automatic. Commits are tracked passively, failures are detected from conversation text, and, once GitHub is connected, GitHub App events flow in without further setup.

Zero configuration required

Via MCP Tools (Universal)

If you use ekkOS via MCP tools only (Cursor, Windsurf, ChatGPT, etc.), three tools give you the same telemetry loop. Your AI calls them when relevant.

Works with any MCP-compatible client

MCP Tools

MCP-only users get three tools. To have your AI call them automatically, add the rules under CLAUDE.md Instructions to your CLAUDE.md or system prompt.

ekkOS_TrackCommit

Call after any git commit. Links the commit SHA to the current session.

ekkOS_TrackCommit({
  commit_sha: "abc1234",
  branch: "main",
  message: "fix: auth middleware null check"
})

ekkOS_ReportFailure

Call when the user reports a production or CI failure. Pass the full error text.

ekkOS_ReportFailure({
  error_content: "TypeError: Cannot read property...",
  commit_sha: "abc1234",
  source: "github_actions"
})

ekkOS_CheckCandidates

Call at the start of a session. Returns pending failures that need user verification.

ekkOS_CheckCandidates()
// Returns: pending failures to ask the user about

CLAUDE.md Instructions

Add these rules so your AI calls the tools automatically:

## Production Telemetry
- After any git commit, call ekkOS_TrackCommit with the SHA
- When the user reports a production/CI failure, call ekkOS_ReportFailure
- At the start of each session, call ekkOS_CheckCandidates

Safety Guarantees

Production telemetry never auto-forges patterns. Every safeguard is designed to prevent false lessons from entering your memory.

User Verification

Nothing is forged without your confirmation

AI Causality Check

Gemini analyzes whether the commit plausibly caused the error

Flaky Test Detection

Same error across 3+ users = infrastructure, not code

Noise Filtering

Rate limiting, dedup, and signature normalization

Staleness Decay

Patterns lose priority if not reinforced within 90 days

Collective Gate

3 independent users must confirm before a pattern enters collective intelligence
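Three of the safeguards above lend themselves to a short sketch: signature normalization, the distinct-user threshold for flaky failures, and the 90-day staleness window. The thresholds of 3 users and 90 days come from the list above. Everything else, including the function names, the normalization rules, and the choice to compare only the first line of an error, is an assumption made for illustration.

```typescript
// Illustrative sketches of three safeguards; not the rules ekkOS actually applies.

// Noise filtering: collapse volatile details so repeated errors share one signature.
// Assumption: only the first line of the error is used as the headline.
function normalizeSignature(error: string): string {
  const headline = error.split("\n")[0] || "";
  return headline
    .replace(/0x[0-9a-f]+/gi, "<hex>") // memory addresses
    .replace(/\d+/g, "<n>")            // line numbers, ports, ids
    .replace(/\s+/g, " ")
    .trim()
    .toLowerCase();
}

interface FailureReport {
  userId: string;
  error: string;
}

// Flaky test detection: the same signature from 3+ distinct users points at infrastructure.
const FLAKY_USER_THRESHOLD = 3;

function isLikelyInfrastructure(reports: FailureReport[], signature: string): boolean {
  const users = new Set<string>();
  reports.forEach((report) => {
    if (normalizeSignature(report.error) === signature) users.add(report.userId);
  });
  return users.size >= FLAKY_USER_THRESHOLD;
}

// Staleness decay: a pattern not reinforced for more than 90 days counts as stale.
const STALE_AFTER_DAYS = 90;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

function isStale(lastReinforcedAt: Date, now: Date): boolean {
  const ageInDays = (now.getTime() - lastReinforcedAt.getTime()) / MS_PER_DAY;
  return ageInDays > STALE_AFTER_DAYS;
}

const timeoutSignature = normalizeSignature("Error: connect ETIMEDOUT 10.0.0.5:5432");
const sampleReports: FailureReport[] = [
  { userId: "u1", error: "Error: connect ETIMEDOUT 10.0.0.5:5432" },
  { userId: "u2", error: "Error: connect ETIMEDOUT 10.0.0.9:5432" },
  { userId: "u3", error: "Error: connect ETIMEDOUT 10.0.1.2:5432" },
];
const sharedAcrossUsers = isLikelyInfrastructure(sampleReports, timeoutSignature);
```

Counting distinct users rather than raw reports keeps one person retrying the same build from tripping the threshold.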

GitHub Integration

If you've connected GitHub through the ekkOS dashboard, CI failure detection works automatically via the GitHub App. No additional setup needed.

When a GitHub Actions workflow fails, ekkOS receives the event, finds the commit SHA, looks up which session wrote that code, runs a causality check, and stores the failure as a candidate for you to verify in your next session.
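The sequence just described can be sketched as a single handler. Only the order of steps comes from the paragraph above. The `workflow_run` payload fields `head_sha` and `conclusion` are part of GitHub's webhook format, but whether ekkOS subscribes to that particular event is an assumption, and `PipelineDeps`, `CiCandidate`, and `handleWorkflowRun` are hypothetical names. The causality check is shown as a synchronous stub for brevity; a model-backed check would be asynchronous in practice.

```typescript
// Hypothetical pipeline for a failed workflow run; not the ekkOS implementation.
interface WorkflowRunEvent {
  action: string;
  workflow_run: { head_sha: string; conclusion: string | null };
}

interface CiCandidate {
  commitSha: string;
  sessionId: string;
}

interface PipelineDeps {
  sessionFor(commitSha: string): string | undefined; // attribution lookup
  plausiblyCaused(commitSha: string): boolean;       // stand-in for the causality check
  store(candidate: CiCandidate): void;               // queue for user verification
}

function handleWorkflowRun(event: WorkflowRunEvent, deps: PipelineDeps): CiCandidate | null {
  // Only completed runs that failed are of interest.
  if (event.action !== "completed" || event.workflow_run.conclusion !== "failure") return null;

  const commitSha = event.workflow_run.head_sha;
  const sessionId = deps.sessionFor(commitSha);
  if (sessionId === undefined) return null;          // commit was not written in a tracked session
  if (!deps.plausiblyCaused(commitSha)) return null; // failure looks unrelated to the commit

  const candidate: CiCandidate = { commitSha, sessionId };
  deps.store(candidate);                             // held until the user confirms or dismisses it
  return candidate;
}

const storedCandidates: CiCandidate[] = [];
const stubDeps: PipelineDeps = {
  sessionFor: (sha) => (sha === "abc1234" ? "session-42" : undefined),
  plausiblyCaused: () => true,
  store: (candidate) => { storedCandidates.push(candidate); },
};
const failedRun: WorkflowRunEvent = {
  action: "completed",
  workflow_run: { head_sha: "abc1234", conclusion: "failure" },
};
const queued = handleWorkflowRun(failedRun, stubDeps);
```

Nothing in this flow forges a pattern; the handler stops at storing a candidate, which matches the verification guarantee described earlier.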
