Memory System & IT Raven Roadmap¶
Implementing Knowledge Graph + Multi-Agent Architecture¶
Date: October 28, 2025
Status: Phase 3 Planning (Post-Trading System)
Based On: Architecture Evaluation from 3 articles (SuperClaude, Multi-Agent, Knowledge Graph)
🎯 Executive Summary¶
Now that the trading system is isolated and complete, we can focus 100% on IT Raven Enterprise with these architectural improvements:
Top 3 Priorities¶
1. Knowledge Graph Migration ⭐⭐⭐⭐⭐ (HIGHEST)
   - Migrate 100+ file-based entities to a graph database
   - Implement canonical ID system
   - Add observation-based entity model with citations
2. Multi-MCP Architecture ⭐⭐⭐⭐⭐ (HIGHEST)
   - Split memory_server into specialized MCP servers
   - Memory, HubSpot, Notion, GitHub servers
   - Domain-aware routing logic
3. Multi-Agent System ⭐⭐⭐⭐☆ (HIGH)
   - MemoryAgent, HubSpotAgent, ProjectAgent, NotionAgent
   - Event-driven communication via Redis
   - Circuit breakers for external APIs
Why Now?¶
- ✅ Trading system is isolated - No longer competing for development time
- ✅ Patterns proven - Just successfully implemented multi-agent for trading
- ✅ Clear business need - IT Raven requires enterprise-scale knowledge management
- ✅ 100+ entities waiting - File-based system doesn't scale for complex queries
📊 Current State Analysis¶
What We Have¶
✅ Strengths:
- 100+ entity files with YAML frontmatter
- FastMCP server with 51KB of memory tools
- Qdrant vector database (17,661 entities indexed)
- HubSpot MCP integration (hybrid approach)
- Auto-indexer running every 4 hours
- ChatGPT history imported (context preservation)
❌ Pain Points:
- No graph structure - can't query "all projects where person X is owner"
- No canonical IDs - duplicate entities (person in HubSpot vs Notion)
- Weak relationships - just ID references, no context or timestamps
- No citations - can't track where facts came from
- Single MCP server - monolithic, hard to scale
- No complex queries - grep/search doesn't work for relationship traversal
What Enterprise IT Raven Needs¶
Multi-developer platform (10+ developers):
- Complex relationship queries
- Cross-system search (HubSpot + Notion + GitHub + Memory)
- Deduplication across sources
- Audit trail (who added what, when, from where)
- Real-time collaboration
- Enterprise reliability
🏗️ Phase 3: Knowledge Graph Foundation¶
Timeline: 4-6 weeks
Priority: ⭐⭐⭐⭐⭐ HIGHEST
Objective¶
Transform file-based entities into a knowledge graph with:
- Canonical IDs (deterministic, human-readable)
- Observation-based model (structured data + citations)
- Rich relationships (typed, timestamped, sourced)
- Complex query support (Cypher-like queries)
Components¶
1. Canonical ID System (Week 1)¶
Problem: The same entity appears in multiple places with different IDs
Before:
entities/people/bert.md (id: bert)
hubspot: contact_id_12345 (id: 12345)
notion: Person-ABC123 (id: ABC123)
→ Three separate entities for same person!
After (Canonical ID):
person__bert_frichot__itraven
- Source 1: memory (entities/people/bert.md)
- Source 2: hubspot (contact_id_12345)
- Source 3: notion (Person-ABC123)
→ Single unified entity with multiple sources!
Implementation:
# canonical_id.py
from typing import List


def generate_canonical_id(entity_type: str, name: str, context: dict = None) -> str:
    """
    Generate a deterministic, human-readable entity ID.

    Examples:
    - project__itraven_dashboard
    - person__bert_frichot__itraven
    - decision__2025-10-28__hubspot_migration
    - lesson__2025-10-19__loki_schema_v13
    - company__itraven
    - contact__acme_corp__john_smith
    """
    # Normalize name
    normalized = name.lower().replace(" ", "_").replace("-", "_")

    # Add context for disambiguation
    if entity_type == "person" and context and "company" in context:
        return f"person__{normalized}__{context['company']}"
    if entity_type == "decision" and context and "date" in context:
        return f"decision__{context['date']}__{normalized}"
    if entity_type == "lesson" and context and "date" in context:
        return f"lesson__{context['date']}__{normalized}"

    # Default
    return f"{entity_type}__{normalized}"


# Deduplication rules
def merge_entities(canonical_id: str, sources: List["EntitySource"]) -> dict:
    """
    Merge multiple entity sources into a single canonical entity.

    Rules:
    - Observations from all sources are preserved
    - The most recent observation wins on conflicts
    - All source IDs are tracked
    """
    merged = {
        "id": canonical_id,
        "sources": [],
        "observations": [],
        "relations": []
    }
    for source in sources:
        merged["sources"].append({
            "source_type": source.type,  # "memory", "hubspot", "notion"
            "source_id": source.id,
            "synced_at": source.timestamp
        })
        # Merge observations
        for obs in source.observations:
            merged["observations"].append({
                **obs,
                "source": source.type,
                "source_id": source.id
            })
    return merged
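To make the merge rules concrete, here is a minimal self-contained sketch, with a stub EntitySource and hypothetical source data, showing two sources collapsing into one canonical entity:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EntitySource:
    """Stub of the source record merge_entities expects."""
    type: str
    id: str
    timestamp: str
    observations: List[dict] = field(default_factory=list)


def merge_entities(canonical_id: str, sources: List[EntitySource]) -> dict:
    merged = {"id": canonical_id, "sources": [], "observations": [], "relations": []}
    for source in sources:
        merged["sources"].append({"source_type": source.type, "source_id": source.id,
                                  "synced_at": source.timestamp})
        for obs in source.observations:
            # Every fact keeps a pointer back to where it came from
            merged["observations"].append({**obs, "source": source.type,
                                           "source_id": source.id})
    return merged


bert = merge_entities("person__bert_frichot__itraven", [
    EntitySource("memory", "entities/people/bert.md", "2025-10-28T10:00:00Z",
                 [{"name": "role", "value": "founder"}]),
    EntitySource("hubspot", "contact_id_12345", "2025-10-28T11:00:00Z",
                 [{"name": "email", "value": "bert@example.com"}]),  # hypothetical value
])
print(len(bert["sources"]))       # 2
print(len(bert["observations"]))  # 2
```

One entity, two tracked sources, every observation tagged with its origin.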
2. Observation-Based Entity Model (Week 2)¶
Problem: Entity content is an unstructured text blob with no citations
Before (entities/people/bert.md):
---
id: bert
entity_type: person
relations:
- proj_itraven
- proj_trading_center
---
# Bert Frichot
Founder of IT Raven. Trader. Developer. Likes automation.
After (Observation-Based):
{
  "id": "person__bert_frichot__itraven",
  "name": "Bert Frichot",
  "entity_type": "person",
  "observations": [
    {
      "name": "role",
      "value": "founder",
      "source": "hubspot",
      "source_id": "company__itraven",
      "timestamp": "2025-10-28T10:00:00Z",
      "citation": "HubSpot company record: Owner field",
      "confidence": 1.0
    },
    {
      "name": "interest",
      "value": "trading",
      "source": "memory",
      "source_id": "lesson__2025-10-28__trend_following",
      "timestamp": "2025-10-28T12:00:00Z",
      "citation": "Created trading strategy document with 3:1 R/R",
      "confidence": 1.0
    },
    {
      "name": "skill",
      "value": "python_development",
      "source": "github",
      "source_id": "repo__mem_agent_mcp",
      "timestamp": "2025-10-28T09:00:00Z",
      "citation": "97 Python scripts in tools/ directory",
      "confidence": 0.9
    },
    {
      "name": "preference",
      "value": "9_percent_stop_losses",
      "source": "memory",
      "source_id": "preference__trading_preferences",
      "timestamp": "2025-10-22T15:00:00Z",
      "citation": "User stated: 'I prefer 9% stop losses on crypto'",
      "confidence": 1.0
    }
  ],
  "relations": [
    {
      "type": "OWNS",
      "target": "company__itraven",
      "source": "hubspot",
      "since": "2020-01-01",
      "confidence": 1.0
    },
    {
      "type": "WORKS_ON",
      "target": "project__mem_agent_mcp",
      "source": "memory",
      "since": "2025-10-01",
      "status": "active",
      "confidence": 1.0
    },
    {
      "type": "LEARNED",
      "target": "lesson__2025-10-19__loki_schema_v13",
      "source": "memory",
      "timestamp": "2025-10-19T10:00:00Z",
      "confidence": 1.0
    }
  ]
}
Implementation:
# observation_model.py
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class Observation:
    """Single fact about an entity, with citation."""
    name: str          # "role", "skill", "preference", "status"
    value: Any         # "founder", "python", "9%"
    source: str        # "hubspot", "memory", "github"
    source_id: str     # Specific record that supports this fact
    timestamp: str     # When the fact was recorded (ISO 8601)
    citation: str      # Human-readable source citation
    confidence: float  # 0.0-1.0 (for inferred facts)


@dataclass
class Relation:
    """Relationship to another entity."""
    type: str                    # "OWNS", "WORKS_ON", "LEARNED", "USES"
    target: str                  # Canonical ID of target entity
    source: str                  # Where the relationship came from
    since: Optional[str] = None  # When the relationship started (optional)
    until: Optional[str] = None  # When the relationship ended (optional)
    status: str = "active"       # "active", "past", "planned"
    metadata: dict = field(default_factory=dict)  # Additional context
    confidence: float = 1.0      # 0.0-1.0


@dataclass
class Entity:
    """Knowledge graph entity with observations and relations."""
    id: str              # Canonical ID
    name: str            # Human-readable name
    entity_type: str     # "person", "project", "company", etc.
    observations: List[Observation]
    relations: List[Relation]
    sources: List[dict]  # All source systems
    created_at: str
    updated_at: str
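A practical benefit of plain dataclasses: dataclasses.asdict recurses into nested dataclasses, so entities serialize to JSON in one call. A trimmed, self-contained sketch (only a subset of the fields above, for illustration):

```python
import json
from dataclasses import dataclass, asdict, field
from typing import Any, List


@dataclass
class Observation:
    name: str
    value: Any
    source: str


@dataclass
class Entity:
    id: str
    name: str
    entity_type: str
    observations: List[Observation] = field(default_factory=list)


bert = Entity(
    id="person__bert_frichot__itraven",
    name="Bert Frichot",
    entity_type="person",
    observations=[Observation("role", "founder", "hubspot")],
)

# asdict converts the entity and its nested observations to plain dicts,
# so the whole entity round-trips through JSON in one call
payload = json.dumps(asdict(bert), indent=2)
print(json.loads(payload)["observations"][0]["value"])  # founder
```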
3. Migration Script (Week 3)¶
Goal: Convert 100+ existing entity files to graph format
Implementation:
# migrate_entities_to_graph.py
import yaml
from datetime import datetime
from pathlib import Path

from graph_store import GraphStore
from canonical_id import generate_canonical_id
from observation_model import Entity, Observation, Relation


def migrate_entity_file(file_path: Path) -> Entity:
    """Convert a single entity file to graph format."""
    # Parse YAML frontmatter + content
    with open(file_path) as f:
        content = f.read()
    parts = content.split('---')
    if len(parts) >= 3:
        frontmatter = yaml.safe_load(parts[1])
        markdown_content = parts[2].strip()
    else:
        raise ValueError(f"Invalid entity file: {file_path}")

    # Generate canonical ID
    canonical_id = generate_canonical_id(
        entity_type=frontmatter.get("entity_type", "unknown"),
        name=frontmatter.get("name", file_path.stem),
        context=frontmatter
    )

    # Extract observations from frontmatter
    observations = []

    # Status observation
    if "status" in frontmatter:
        observations.append(Observation(
            name="status",
            value=frontmatter["status"],
            source="memory",
            source_id=canonical_id,
            timestamp=frontmatter.get("created_at", "unknown"),
            citation=f"Entity file: {file_path}",
            confidence=1.0
        ))

    # Parse markdown content for additional observations
    # (Simple NLP: extract key-value pairs, dates, skills, etc.)
    content_observations = extract_observations_from_text(markdown_content)
    observations.extend(content_observations)

    # Convert relations
    relations = []
    for rel_id in frontmatter.get("relations", []):
        relations.append(Relation(
            type="RELATES_TO",  # Generic, refine later
            target=rel_id,
            source="memory",
            since=frontmatter.get("created_at"),
            status="active",
            metadata={},
            confidence=1.0
        ))

    # Create entity
    entity = Entity(
        id=canonical_id,
        name=frontmatter.get("name", file_path.stem),
        entity_type=frontmatter["entity_type"],
        observations=observations,
        relations=relations,
        sources=[{
            "source_type": "memory",
            "source_id": str(file_path),
            "synced_at": datetime.now().isoformat()
        }],
        created_at=frontmatter.get("created_at", datetime.now().isoformat()),
        updated_at=datetime.now().isoformat()
    )
    return entity


def migrate_all_entities():
    """Migrate all entity files to the graph database."""
    entity_dir = Path("/Users/bertfrichot/Documents/memory/entities/")
    graph = GraphStore()
    migrated = 0
    errors = []

    # Find all entity markdown files
    for file_path in entity_dir.rglob("*.md"):
        try:
            entity = migrate_entity_file(file_path)
            graph.store_entity(entity)
            migrated += 1
            print(f"✅ Migrated: {entity.id}")
        except Exception as e:
            errors.append((file_path, str(e)))
            print(f"❌ Error: {file_path} - {e}")

    print(f"\n✅ Migrated: {migrated} entities")
    print(f"❌ Errors: {len(errors)} entities")
    return migrated, errors


if __name__ == "__main__":
    migrate_all_entities()
4. Graph Database Choice (Week 3)¶
Options:
1. Neo4j (Recommended for IT Raven)
   - ✅ Native graph database
   - ✅ Cypher query language (powerful)
   - ✅ Enterprise features (ACID, clustering)
   - ✅ Graph visualization tools
   - ❌ Cost: $50-150/month (managed) or self-host
2. Qdrant with Graph Extensions
   - ✅ Already using it for vectors
   - ✅ Can add graph layer on top
   - ✅ Free (self-hosted)
   - ❌ Less mature graph features
   - ❌ Custom query language needed
3. PostgreSQL with Graph Extension
   - ✅ SQL familiar
   - ✅ Reliable, proven
   - ✅ Free
   - ❌ Not optimized for graph traversal
Recommendation: Neo4j for IT Raven (production use case)
5. Complex Query Support (Week 4)¶
Goal: Enable relationship-based queries
Example Queries:
# Query 1: "All active projects where Bert is owner"
query = """
MATCH (p:Person {id: "person__bert_frichot__itraven"})
-[:OWNS]->(proj:Project {status: "active"})
RETURN proj.name, proj.created_at, proj.updated_at
ORDER BY proj.created_at DESC
"""
# Query 2: "Who are all people related to IT Raven?"
query = """
MATCH (company:Company {id: "company__itraven"})
      <-[r:WORKS_FOR|OWNS|CONTACTS]-(person:Person)
RETURN person.name, person.id, type(r) AS relationship_type
"""
# Query 3: "What lessons have we learned about Loki?"
query = """
MATCH (lesson:Lesson)-[:RELATES_TO]->(topic:Topic)
WHERE topic.name CONTAINS "loki"
RETURN lesson.name, lesson.date, lesson.solution
ORDER BY lesson.date DESC
"""
# Query 4: "Project dependency graph"
query = """
MATCH path = (proj:Project)-[:DEPENDS_ON*1..3]->(dep:Project)
RETURN path
"""
# Query 5: "People who know Python AND work on active projects"
query = """
MATCH (p:Person)-[:HAS_SKILL]->(skill:Skill {name: "python"})
MATCH (p)-[:WORKS_ON]->(proj:Project {status: "active"})
RETURN p.name, proj.name
"""
Implementation:
# graph_queries.py
from neo4j import GraphDatabase


class GraphQueries:
    def __init__(self, uri, user, password):
        self.driver = GraphDatabase.driver(uri, auth=(user, password))

    def get_active_projects_for_person(self, person_id: str):
        """Get all active projects where the person is owner."""
        with self.driver.session() as session:
            result = session.run("""
                MATCH (p:Person {id: $person_id})
                      -[:OWNS]->(proj:Project {status: "active"})
                RETURN proj.name AS name,
                       proj.created_at AS created,
                       proj.status AS status
                ORDER BY proj.created_at DESC
            """, person_id=person_id)
            return [dict(record) for record in result]

    def get_related_entities(self, entity_id: str, max_depth: int = 2):
        """Get all entities related to the given entity.

        Note: Cypher does not allow parameters inside variable-length
        patterns, so the depth is validated and interpolated directly.
        """
        max_depth = int(max_depth)  # guard against injection
        with self.driver.session() as session:
            result = session.run(f"""
                MATCH path = (e {{id: $entity_id}})-[*1..{max_depth}]-(related)
                RETURN related.id AS id,
                       related.name AS name,
                       related.entity_type AS type,
                       length(path) AS distance
                ORDER BY distance
            """, entity_id=entity_id)
            return [dict(record) for record in result]

    def search_by_observation(self, obs_name: str, obs_value: str):
        """Find entities with a specific observation."""
        with self.driver.session() as session:
            result = session.run("""
                MATCH (e:Entity)-[:HAS_OBSERVATION]->(obs:Observation)
                WHERE obs.name = $obs_name AND obs.value CONTAINS $obs_value
                RETURN e.id AS id, e.name AS name, obs.citation AS source
            """, obs_name=obs_name, obs_value=obs_value)
            return [dict(record) for record in result]
🔄 Phase 4: Multi-MCP Architecture¶
Timeline: 4-6 weeks (parallel with Phase 3)
Priority: ⭐⭐⭐⭐⭐ HIGHEST
Objective¶
Split monolithic memory_server into specialized MCP servers that work together via domain-aware routing.
Architecture¶
┌─────────────────────────────────────────────────────────┐
│ Claude Desktop │
└─────────────────┬───────────────────────────────────────┘
│
┌────────────┼────────────┬────────────┬──────────────┐
│ │ │ │ │
┌────▼───┐ ┌────▼───┐ ┌────▼───┐ ┌────▼───┐ ┌───────▼───────┐
│ Memory │ │ HubSpot│ │ Notion │ │ GitHub │ │ File System │
│ MCP │ │ MCP │ │ MCP │ │ MCP │ │ MCP │
└────┬───┘ └────┬───┘ └────┬───┘ └────┬───┘ └───────┬───────┘
│ │ │ │ │
└────────────┴────────────┴────────────┴──────────────┘
│
┌────────▼────────┐
│ Neo4j Graph │
│ Database │
└─────────────────┘
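On the Claude Desktop side, the split simply means listing each server in claude_desktop_config.json. A sketch with hypothetical paths (the real entry points depend on where each server lands):

```json
{
  "mcpServers": {
    "memory": {
      "command": "python",
      "args": ["/path/to/agent/memory_server.py"]
    },
    "hubspot": {
      "command": "python",
      "args": ["/path/to/tools/hubspot_mcp_server.py"],
      "env": { "HUBSPOT_API_KEY": "..." }
    },
    "notion": {
      "command": "python",
      "args": ["/path/to/memory_connectors/notion_mcp/server.py"]
    },
    "github": {
      "command": "python",
      "args": ["/path/to/memory_connectors/github_mcp/server.py"]
    }
  }
}
```

Each server is an independent process, so one can be restarted or upgraded without taking down the others.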
MCP Server Breakdown¶
1. Memory MCP Server (Core)¶
Responsibility: Core knowledge graph operations
Tools:
- store_entity(entity: Entity) -> str
- get_entity(entity_id: str) -> Entity
- search_entities(query: str, filters: dict) -> List[Entity]
- add_observation(entity_id: str, observation: Observation) -> bool
- add_relation(source_id: str, relation: Relation) -> bool
- get_related(entity_id: str, relation_type: str, max_depth: int) -> List[Entity]
- complex_query(cypher: str) -> List[dict]
- merge_entities(canonical_id: str, source_ids: List[str]) -> Entity
Location: agent/memory_server.py (refactored)
2. HubSpot MCP Server¶
Responsibility: CRM data (contacts, companies, deals)
Tools (already exists, enhance):
- search_contacts(query: str) -> List[Contact]
- get_company(company_id: str) -> Company
- get_deals(filters: dict) -> List[Deal]
- sync_to_memory() -> SyncResult # NEW: Auto-sync to memory graph
Location: tools/hubspot_mcp_server.py (exists)
3. Notion MCP Server (NEW)¶
Responsibility: Extended Brain (projects, domains, identities)
Tools:
- get_projects(filters: dict) -> List[Project]
- get_domains() -> List[Domain]
- get_identities() -> List[Identity]
- create_project(name: str, domain_id: str) -> Project
- sync_to_memory() -> SyncResult # Auto-sync to memory graph
Location: memory_connectors/notion_mcp/server.py (NEW)
4. GitHub MCP Server (NEW)¶
Responsibility: Code repos, issues, PRs
Tools:
- list_repos() -> List[Repo]
- get_issues(repo: str, state: str) -> List[Issue]
- get_prs(repo: str, state: str) -> List[PR]
- search_code(query: str) -> List[CodeMatch]
- sync_to_memory() -> SyncResult # Auto-sync to memory graph
Location: memory_connectors/github_mcp/server.py (NEW)
5. Filesystem MCP Server¶
Responsibility: Local codebase access
Status: Already available (mcp-server-filesystem)
Domain-Aware Routing¶
Goal: Claude automatically routes queries to correct MCP servers
Implementation:
# query_router.py
from typing import List


def route_query(query: str) -> List[str]:
    """
    Determine which MCP servers to query based on intent.

    Examples:
    - "Who is the account owner for Acme Corp?" → [hubspot, memory]
    - "What projects is Bert working on?" → [notion, memory]
    - "Show me recent PRs for mem-agent-mcp" → [github, memory]
    - "What are my trading positions?" → [memory]
    """
    # Intent detection (simple keyword matching, could use LLM)
    q = query.lower()
    servers = set()

    # CRM-related
    if any(kw in q for kw in ["contact", "company", "deal", "account", "customer"]):
        servers.update({"hubspot", "memory"})

    # Project-related
    if any(kw in q for kw in ["project", "domain", "identity", "extended brain"]):
        servers.update({"notion", "memory"})

    # Code-related
    if any(kw in q for kw in ["repo", "pr", "pull request", "issue", "code", "github"]):
        servers.update({"github", "memory"})

    # Trading-related
    if any(kw in q for kw in ["trade", "position", "portfolio", "stock", "alpaca", "schwab"]):
        servers.add("memory")

    # If no specific intent, query memory (fallback)
    if not servers:
        servers.add("memory")

    return list(servers)


# Example: Claude automatically routes queries
query = "Who are all people working on IT Raven projects?"
servers = route_query(query)  # → ["notion", "memory"] with this simple matcher;
                              #   an LLM-based router could also pull in hubspot

# Claude queries:
# 1. Notion MCP: Get all IT Raven projects
# 2. Memory MCP: Get people relations for those projects
# 3. HubSpot MCP: Enrich with contact details
Cross-Server Synchronization¶
Goal: Keep memory graph in sync with external systems
Implementation:
# sync_coordinator.py
class SyncCoordinator:
    """Coordinate syncs between MCP servers and the memory graph."""

    def __init__(self):
        self.hubspot = HubSpotMCP()
        self.notion = NotionMCP()
        self.github = GitHubMCP()
        self.memory = MemoryMCP()

    async def sync_all(self):
        """Sync all external systems to the memory graph."""
        # HubSpot → Memory
        hubspot_contacts = await self.hubspot.get_all_contacts()
        for contact in hubspot_contacts:
            canonical_id = generate_canonical_id("person", contact.name, {"company": contact.company})
            entity = Entity(
                id=canonical_id,
                name=contact.name,
                entity_type="person",
                observations=[
                    Observation(name="email", value=contact.email, source="hubspot", ...),
                    Observation(name="company", value=contact.company, source="hubspot", ...),
                ],
                relations=[...],
                sources=[{"source_type": "hubspot", "source_id": contact.id}]
            )
            await self.memory.store_entity(entity)

        # Notion → Memory
        notion_projects = await self.notion.get_all_projects()
        for project in notion_projects:
            canonical_id = generate_canonical_id("project", project.name)
            entity = Entity(
                id=canonical_id,
                name=project.name,
                entity_type="project",
                observations=[
                    Observation(name="status", value=project.status, source="notion", ...),
                    Observation(name="domain", value=project.domain, source="notion", ...),
                ],
                relations=[...],
                sources=[{"source_type": "notion", "source_id": project.id}]
            )
            await self.memory.store_entity(entity)

        # GitHub → Memory
        github_repos = await self.github.get_all_repos()
        # ... similar pattern


# Scheduled sync (every 4 hours, like auto-indexer)
# LaunchAgent or cron job
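For the schedule itself, a crontab entry is the simplest option (the sync_all.py entry point and paths here are hypothetical; a LaunchAgent plist works equally well on macOS):

```shell
# Run the sync coordinator every 4 hours, appending output to a log
0 */4 * * * /usr/bin/python3 /path/to/sync_all.py >> /tmp/itraven_sync.log 2>&1
```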
🤖 Phase 5: Multi-Agent Architecture for IT Raven¶
Timeline: 4-6 weeks (after Phases 3-4)
Priority: ⭐⭐⭐⭐☆ HIGH
Objective¶
Build enterprise agent system modeled after successful trading agent pattern.
Agent Boundaries¶
We just implemented 5 agents for trading - now apply the same pattern to IT Raven:
1. MemoryAgent (Core)¶
Responsibility: Entity storage, retrieval, relationships
Subscribes to:
- project.created - Store project entity
- person.added - Store person entity
- lesson.learned - Store lesson entity
- decision.made - Store decision entity
Publishes:
- memory.entity_stored
- memory.entity_updated
- memory.query_result
2. HubSpotAgent (CRM Integration)¶
Responsibility: CRM data sync, contact management
Subscribes to:
- sync.hubspot_requested - Trigger sync
- contact.updated - Update HubSpot contact
Publishes:
- hubspot.contact_synced
- hubspot.sync_complete
- hubspot.sync_failed
Circuit Breaker: 3 failures → 5 min backoff
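The "3 failures → 5 min backoff" policy reduces to a small reusable class; a minimal sketch, independent of any particular HubSpot client:

```python
import time


class CircuitBreaker:
    """Open after `max_failures` consecutive failures; probe again after `reset_after` seconds."""

    def __init__(self, max_failures: int = 3, reset_after: float = 300.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None  # monotonic timestamp when the circuit opened

    def allow(self) -> bool:
        """Is a call currently permitted?"""
        if self.opened_at is None:
            return True
        if time.monotonic() - self.opened_at >= self.reset_after:
            # Half-open: permit one probe; a failure re-opens immediately
            self.opened_at = None
            self.failures = self.max_failures - 1
            return True
        return False

    def record_success(self):
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = time.monotonic()


breaker = CircuitBreaker()
for _ in range(3):
    breaker.record_failure()
print(breaker.allow())  # False: circuit is open after 3 consecutive failures
```

The agent wraps every external API call: check allow() first, then record_success/record_failure on the result.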
3. ProjectAgent (Knowledge Management)¶
Responsibility: Project tracking, decision logging
State Machine: Draft → Active → On Hold → Complete → Archived
Subscribes to:
- project.create_requested
- project.status_changed
- decision.logged
Publishes:
- project.created
- project.milestone_reached
- project.archived
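The lifecycle above (Draft → Active → On Hold → Complete → Archived) reduces to a transition table. A minimal sketch of how ProjectAgent might enforce it - the allowed-transition set here is an assumption to be refined:

```python
# Allowed transitions for the project lifecycle (assumed set; refine as needed)
TRANSITIONS = {
    "draft": {"active", "archived"},
    "active": {"on_hold", "complete", "archived"},
    "on_hold": {"active", "archived"},
    "complete": {"archived"},
    "archived": set(),  # terminal state
}


class ProjectStateMachine:
    def __init__(self, state: str = "draft"):
        self.state = state

    def transition(self, new_state: str) -> str:
        # Reject anything not in the table, so invalid flows fail loudly
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"Invalid transition: {self.state} → {new_state}")
        self.state = new_state
        return self.state


sm = ProjectStateMachine()
sm.transition("active")
sm.transition("on_hold")
print(sm.state)  # on_hold
```

This is the same "enforced valid transitions" pattern that prevented bugs in the trading system.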
4. NotionAgent (Extended Brain Sync)¶
Responsibility: Notion workspace sync
Subscribes to:
- sync.notion_requested
- project.created - Create Notion page
Publishes:
- notion.page_created
- notion.sync_complete
5. AnalyticsAgent (Insights & Reporting)¶
Responsibility: Cross-system analytics, dashboards
Subscribes to:
- analytics.report_requested
- *.* - Collect all events for metrics
Publishes:
- analytics.report_ready
- analytics.metrics_updated
Event-Driven Flow Example¶
User: "Create new project: IT Raven Dashboard"
1. ProjectAgent receives project.create_requested event
→ Creates project entity
→ Transitions state to "Draft"
→ Publishes project.created event
2. MemoryAgent receives project.created event
→ Stores entity in graph database
→ Generates canonical ID: project__itraven_dashboard
→ Publishes memory.entity_stored event
3. NotionAgent receives project.created event
→ Creates Notion page in Projects database
→ Links to domain and identities
→ Publishes notion.page_created event
4. HubSpotAgent receives project.created event
→ Creates associated deal in HubSpot
→ Links to company
→ Publishes hubspot.deal_created event
5. AnalyticsAgent receives all events
→ Updates project count metric
→ Logs event timeline
→ Publishes analytics.metrics_updated event
Message Bus Architecture¶
Use Redis Pub/Sub (proven in trading system):
# Example: Publishing a "project created" event
redis_client.publish("project.created", json.dumps({
    "project_id": "project__itraven_dashboard",
    "name": "IT Raven Dashboard",
    "owner": "person__bert_frichot__itraven",
    "status": "draft",
    "created_at": "2025-10-28T10:00:00Z",
    "trace_id": "trace_abc123"  # For distributed tracing
}))
Distributed Tracing¶
Use OpenTelemetry + Jaeger (proven in trading system):
@tracer.start_as_current_span("memory_agent.store_entity")
def store_entity(entity: Entity):
    span = trace.get_current_span()
    span.set_attribute("entity.id", entity.id)
    span.set_attribute("entity.type", entity.entity_type)

    # Store in Neo4j
    result = graph.create_entity(entity)

    span.set_attribute("storage.success", True)
    return result
📅 Implementation Timeline¶
Weeks 1-6: Knowledge Graph Foundation¶
- [ ] Week 1: Canonical ID system design and implementation
- [ ] Week 2: Observation-based entity model
- [ ] Week 3: Migration script + Neo4j setup
- [ ] Week 4: Complex query support (Cypher)
- [ ] Week 5: Testing and validation
- [ ] Week 6: Performance optimization
Weeks 7-12: Multi-MCP Architecture¶
- [ ] Week 7: Split memory_server into specialized servers
- [ ] Week 8: Notion MCP server implementation
- [ ] Week 9: GitHub MCP server implementation
- [ ] Week 10: Domain-aware routing logic
- [ ] Week 11: Cross-server sync coordinator
- [ ] Week 12: Testing and documentation
Weeks 13-18: Multi-Agent System¶
- [ ] Week 13: MemoryAgent + ProjectAgent
- [ ] Week 14: HubSpotAgent + NotionAgent
- [ ] Week 15: AnalyticsAgent
- [ ] Week 16: Circuit breakers for all external APIs
- [ ] Week 17: Distributed tracing + observability
- [ ] Week 18: Load testing and optimization
Weeks 19-24: Production Deployment¶
- [ ] Week 19: Frontend development starts
- [ ] Week 20: API endpoints for frontend
- [ ] Week 21: User authentication & authorization
- [ ] Week 22: Production infrastructure (Docker, CI/CD)
- [ ] Week 23: Security audit & hardening
- [ ] Week 24: Launch IT Raven MVP!
Total Timeline: 24 weeks (~6 months)
🎯 Success Criteria¶
Phase 3: Knowledge Graph¶
- ✅ 100+ entities migrated to graph database
- ✅ Canonical IDs generated for all entities
- ✅ Observations with citations for all facts
- ✅ Complex queries working (relationship traversal)
- ✅ Automatic deduplication across sources
Phase 4: Multi-MCP¶
- ✅ 5 MCP servers running (Memory, HubSpot, Notion, GitHub, Filesystem)
- ✅ Domain-aware routing working
- ✅ Cross-server sync automatic every 4 hours
- ✅ Query performance < 500ms for simple queries
Phase 5: Multi-Agent¶
- ✅ 5 agents implemented and tested
- ✅ Event-driven architecture with Redis
- ✅ Circuit breakers for all external APIs
- ✅ Distributed tracing with Jaeger
- ✅ Load testing at 100 concurrent users
💡 Lessons from Trading System¶
What Worked Well¶
- Event-driven architecture - Agents decoupled, easy to add new ones
- State machines - Enforced valid transitions, prevented bugs
- Circuit breakers - Graceful degradation when APIs fail
- Distributed tracing - Easy debugging across agents
- Test-first approach - Foundation tests caught issues early
Apply to IT Raven¶
- Same Redis Pub/Sub pattern - Proven reliable
- Same state machine approach - Project lifecycle, decision workflow
- Same circuit breaker code - Copy from trading, adapt for HubSpot/Notion
- Same OpenTelemetry setup - Copy tracing infrastructure
- Same testing strategy - Foundation tests → integration tests → load tests
Don't Repeat Mistakes¶
- ❌ Don't skip planning - the trading system shipped in 4 days because of a good design doc
- ❌ Don't skip tests - writing tests alongside code caught bugs early
- ❌ Don't guess at edge cases - documenting common failures up front paid off
🚀 Next Immediate Steps¶
This Week¶
- Review this roadmap with user
- Create Phase 3 detailed spec (canonical IDs, observation model)
- Set up Neo4j locally for testing
- Write canonical ID generator and test with 10 entities
- Start migration script (parse first 10 entity files)
Next Week¶
- Complete migration script
- Migrate all 100+ entities
- Validate migration (check for duplicates, missing relations)
- Write first complex queries
- Document migration results
Status: Ready for Phase 3
Prerequisites: ✅ Trading system isolated and complete
Blockers: None - ready to start!
Estimated Completion: April 2026 (6 months from now)
"From file-based chaos to enterprise knowledge graph. From single MCP server to multi-server architecture. From proof-of-concept to production platform. Let's build IT Raven."