
Data Flow

How information moves through the ekkOS memory system.

Ingestion Pipeline

User Message
     │
     ▼
┌─────────────────────────────────────────────────┐
│  WORKING MEMORY (Layer 1)                       │
│  • Raw messages stored with timestamps          │
│  • 24-hour retention window                     │
│  • Indexed by session_id                        │
└─────────────────────────────────────────────────┘
     │
     │ (episodic-ingestion worker, every 5 min)
     ▼
┌─────────────────────────────────────────────────┐
│  EPISODIC MEMORY (Layer 2)                      │
│  • Problem-solution episodes extracted          │
│  • LLM analyzes conversation patterns           │
│  • 30-day retention                             │
└─────────────────────────────────────────────────┘
     │
     │ (semantic compression)
     ▼
┌─────────────────────────────────────────────────┐
│  SEMANTIC MEMORY (Layer 3)                      │
│  • Facts distilled from episodes                │
│  • Vector embeddings generated                  │
│  • Permanent storage                            │
└─────────────────────────────────────────────────┘
     │
     │ (pattern discovery)
     ▼
┌─────────────────────────────────────────────────┐
│  PATTERN MEMORY (Layer 4)                       │
│  • Reusable problem-solution pairs              │
│  • Confidence scores assigned                   │
│  • Success metrics tracked                      │
└─────────────────────────────────────────────────┘

Step 1: Capture

Messages are written to the chat_messages table via MCP tools or REST API. Each message includes role, content, session_id, and source platform.
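A minimal sketch of the capture step. The field names below are illustrative, not the actual ekkOS `chat_messages` schema:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

# Hypothetical shape of a chat_messages row; field names are
# illustrative, not the real ekkOS schema.
@dataclass
class ChatMessage:
    role: str        # "user", "assistant", or "system"
    content: str
    session_id: str
    source: str      # originating platform, e.g. "mcp" or "rest"
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

def capture(msg: ChatMessage) -> dict:
    """Validate a message and serialize it for the table insert."""
    if msg.role not in ("user", "assistant", "system"):
        raise ValueError(f"unknown role: {msg.role}")
    return dict(msg.__dict__)
```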

Step 2: Episode Extraction

The episodic ingestion worker runs every 5 minutes. It uses an LLM to identify complete problem-solution cycles and extracts them as episodes with titles, summaries, and key learnings.
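The extraction logic can be stubbed as below. In ekkOS an LLM identifies the problem-solution cycle; here a simple first-user / last-assistant heuristic stands in for that call, and the output shape is an assumption:

```python
from typing import Dict, List, Optional

def extract_episode(messages: List[Dict]) -> Optional[Dict]:
    """Stub of episode extraction: a heuristic stands in for the
    LLM that identifies complete problem-solution cycles."""
    problem = next((m for m in messages if m["role"] == "user"), None)
    solution = next(
        (m for m in reversed(messages) if m["role"] == "assistant"), None
    )
    if problem is None or solution is None:
        return None  # no complete cycle yet; retry on the next 5-minute run
    return {
        "title": problem["content"][:60],
        "summary": solution["content"][:200],
        "key_learnings": [],  # produced by the LLM in the real worker
    }
```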

Step 3: Semantic Compression

Episodes are compressed into semantic knowledge entries. Facts are extracted and embedded using OpenAI's text-embedding-3-small (1536 dimensions).
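A sketch of the compression step. The `embed` callable would wrap OpenAI's text-embedding-3-small in production; it is injected here so the logic runs offline, and the entry format is illustrative:

```python
from typing import Callable, List

EMBEDDING_DIMS = 1536  # output size of text-embedding-3-small

def compress_episode(
    episode: dict,
    embed: Callable[[str], List[float]],
) -> dict:
    """Distill an episode into a semantic knowledge entry.
    `embed` is injected so the step can run without an API key."""
    fact = f"{episode['title']}: {episode['summary']}"
    vector = embed(fact)
    if len(vector) != EMBEDDING_DIMS:
        raise ValueError(f"expected {EMBEDDING_DIMS}-dim embedding")
    return {"fact": fact, "embedding": vector}
```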

Step 4: Pattern Discovery

The system identifies reusable patterns from semantic knowledge. Each pattern gets an initial confidence score of 0.5 that evolves based on application outcomes.
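The initial state of a pattern can be sketched as a record; the field set beyond the documented 0.5 starting confidence is assumed:

```python
from dataclasses import dataclass

@dataclass
class Pattern:
    problem: str
    solution: str
    confidence: float = 0.5  # documented starting score for new patterns
    applications: int = 0    # n in the update rule, incremented per use

def discover(problem: str, solution: str) -> Pattern:
    """Sketch: promote a recurring problem-solution pair to a pattern."""
    return Pattern(problem=problem, solution=solution)
```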

Retrieval Pipeline

User Query: "How do we handle authentication?"
     │
     ▼
┌─────────────────────────────────────────────────┐
│  QUERY PROCESSING                               │
│  • Generate embedding for query                 │
│  • Parse intent and extract keywords            │
└─────────────────────────────────────────────────┘
     │
     ▼
┌─────────────────────────────────────────────────┐
│  MULTI-LAYER SEARCH (parallel)                  │
│  ├── L3 Semantic: cosine similarity search      │
│  ├── L4 Patterns: match problem descriptions    │
│  └── L9 Directives: check behavioral rules      │
└─────────────────────────────────────────────────┘
     │
     ▼
┌─────────────────────────────────────────────────┐
│  RESULT RANKING                                 │
│  • Relevance score (similarity)                 │
│  • Recency boost (newer = higher)               │
│  • Confidence weighting (pattern success rate)  │
└─────────────────────────────────────────────────┘
     │
     ▼
┌─────────────────────────────────────────────────┐
│  CONTEXT ASSEMBLY                               │
│  • Top N results selected                       │
│  • Formatted for AI consumption                 │
│  • Retrieval ID generated for tracking          │
└─────────────────────────────────────────────────┘
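The assembly stage above can be sketched as follows; the output shape, formatting, and `top_n` default are illustrative assumptions:

```python
import uuid
from typing import Dict, List

def assemble_context(ranked: List[Dict], top_n: int = 5) -> Dict:
    """Select the top-N ranked results and format them for the prompt."""
    selected = sorted(ranked, key=lambda r: r["score"], reverse=True)[:top_n]
    return {
        # the retrieval ID lets outcome tracking tie results back here
        "retrieval_id": str(uuid.uuid4()),
        "context": "\n".join(
            f"- {r['text']} (score {r['score']:.2f})" for r in selected
        ),
    }
```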

Vector Search

HNSW indexes enable ~18ms searches across millions of embeddings, using cosine similarity for semantic matching.
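For reference, the exact form of the metric the index approximates:

```python
import math
from typing import Sequence

def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Exact cosine similarity; the HNSW index performs an approximate
    nearest-neighbor search over the same metric."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)
```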

Ranking Formula

score = similarity × recency_boost × confidence
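A sketch of the formula in code. The exponential-decay recency boost and its 30-day half-life are assumptions; the docs only state that newer results rank higher:

```python
def recency_boost(age_seconds: float, half_life_days: float = 30.0) -> float:
    # Exponential decay; the 30-day half-life is an assumed constant,
    # not a documented ekkOS value.
    return 0.5 ** (age_seconds / (half_life_days * 86400))

def rank(similarity: float, age_seconds: float, confidence: float) -> float:
    return similarity * recency_boost(age_seconds) * confidence
```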

Application & Learning

Retrieved Context
     │
     ▼
┌─────────────────────────────────────────────────┐
│  PROMPT INJECTION                               │
│  • Context added to system prompt               │
│  • Patterns formatted with success rates        │
│  • Directives applied as rules                  │
└─────────────────────────────────────────────────┘
     │
     ▼
┌─────────────────────────────────────────────────┐
│  AI RESPONSE                                    │
│  • LLM generates response with context          │
│  • May use patterns to guide solution           │
└─────────────────────────────────────────────────┘
     │
     ▼
┌─────────────────────────────────────────────────┐
│  OUTCOME TRACKING                               │
│  • Application recorded                         │
│  • Success/failure reported                     │
│  • Confidence scores updated                    │
│                                                 │
│  success: confidence += 0.1/sqrt(n+1)           │
│  failure: confidence -= 0.1/sqrt(n+1)           │
└─────────────────────────────────────────────────┘
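The confidence update rule above, sketched directly; clamping the score to [0, 1] is an assumption, not stated in the docs:

```python
import math

def update_confidence(confidence: float, n: int, success: bool) -> float:
    """Apply the documented rule: confidence moves by 0.1/sqrt(n+1),
    where n counts prior applications of the pattern."""
    delta = 0.1 / math.sqrt(n + 1)
    new = confidence + delta if success else confidence - delta
    return min(1.0, max(0.0, new))  # clamping is an assumption
```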

Performance Metrics

• Avg retrieval: 18ms
• P99 latency: <50ms
• Vectors per index: 1M+
• Ingestion cycle: 5 min
