
Memory System & IT Raven Roadmap

Implementing Knowledge Graph + Multi-Agent Architecture

Date: October 28, 2025
Status: Phase 3 Planning (Post-Trading System)
Based On: Architecture evaluation from 3 articles (SuperClaude, Multi-Agent, Knowledge Graph)


🎯 Executive Summary

Now that the trading system is isolated and complete, we can focus 100% on IT Raven Enterprise with these architectural improvements:

Top 3 Priorities

  1. Knowledge Graph Migration ⭐⭐⭐⭐⭐ (HIGHEST)
     - Migrate 100+ file-based entities to graph database
     - Implement canonical ID system
     - Add observation-based entity model with citations

  2. Multi-MCP Architecture ⭐⭐⭐⭐⭐ (HIGHEST)
     - Split memory_server into specialized MCP servers
     - Memory, HubSpot, Notion, GitHub servers
     - Domain-aware routing logic

  3. Multi-Agent System ⭐⭐⭐⭐☆ (HIGH)
     - MemoryAgent, HubSpotAgent, ProjectAgent, NotionAgent
     - Event-driven communication via Redis
     - Circuit breakers for external APIs

Why Now?

✅ Trading system is isolated - no longer competing for development time
✅ Patterns proven - just successfully implemented multi-agent for trading
✅ Clear business need - IT Raven requires enterprise-scale knowledge management
✅ 100+ entities waiting - file-based system doesn't scale for complex queries


📊 Current State Analysis

What We Have

✅ Strengths:

- 100+ entity files with YAML frontmatter
- FastMCP server with 51KB of memory tools
- Qdrant vector database (17,661 entities indexed)
- HubSpot MCP integration (hybrid approach)
- Auto-indexer running every 4 hours
- ChatGPT history imported (context preservation)

❌ Pain Points:

- No graph structure - can't query "all projects where person X is owner"
- No canonical IDs - duplicate entities (person in HubSpot vs Notion)
- Weak relationships - just ID references, no context or timestamps
- No citations - can't track where facts came from
- Single MCP server - monolithic, hard to scale
- No complex queries - grep/search doesn't work for relationship traversal

What Enterprise IT Raven Needs

Multi-developer platform (10+ developers) requiring:

- Complex relationship queries
- Cross-system search (HubSpot + Notion + GitHub + Memory)
- Deduplication across sources
- Audit trail (who added what, when, from where)
- Real-time collaboration
- Enterprise reliability


🏗️ Phase 3: Knowledge Graph Foundation

Timeline: 4-6 weeks
Priority: ⭐⭐⭐⭐⭐ HIGHEST

Objective

Transform file-based entities into a knowledge graph with:

- Canonical IDs (deterministic, human-readable)
- Observation-based model (structured data + citations)
- Rich relationships (typed, timestamped, sourced)
- Complex query support (Cypher-like queries)

Components

1. Canonical ID System (Week 1)

Problem: Same entity appears multiple places with different IDs

Before:

entities/people/bert.md              (id: bert)
hubspot: contact_id_12345            (id: 12345)
notion: Person-ABC123                (id: ABC123)
→ Three separate entities for same person!

After (Canonical ID):

person__bert_frichot__itraven
  - Source 1: memory (entities/people/bert.md)
  - Source 2: hubspot (contact_id_12345)
  - Source 3: notion (Person-ABC123)
→ Single unified entity with multiple sources!

Implementation:

# canonical_id.py

from typing import List, Optional

def generate_canonical_id(entity_type: str, name: str, context: Optional[dict] = None) -> str:
    """
    Generate deterministic, human-readable entity ID.

    Examples:
    - project__itraven_dashboard
    - person__bert_frichot__itraven
    - decision__2025-10-28__hubspot_migration
    - lesson__2025-10-19__loki_schema_v13
    - company__itraven
    - contact__acme_corp__john_smith
    """

    # Normalize name
    normalized = name.lower().replace(" ", "_").replace("-", "_")

    # Add context for disambiguation
    if entity_type == "person" and context and "company" in context:
        return f"person__{normalized}__{context['company']}"

    if entity_type == "decision" and context and "date" in context:
        return f"decision__{context['date']}__{normalized}"

    if entity_type == "lesson" and context and "date" in context:
        return f"lesson__{context['date']}__{normalized}"

    # Default
    return f"{entity_type}__{normalized}"


# Deduplication rules
def merge_entities(canonical_id: str, sources: List["EntitySource"]):
    """
    Merge multiple entity sources into single canonical entity.

    Rules:
    - Observations from all sources preserved
    - Most recent observation wins for conflicts
    - All source_ids tracked
    """

    merged = {
        "id": canonical_id,
        "sources": [],
        "observations": [],
        "relations": []
    }

    for source in sources:
        merged["sources"].append({
            "source_type": source.type,  # "memory", "hubspot", "notion"
            "source_id": source.id,
            "synced_at": source.timestamp
        })

        # Merge observations
        for obs in source.observations:
            merged["observations"].append({
                **obs,
                "source": source.type,
                "source_id": source.id
            })

    return merged
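
A quick sanity check of the ID scheme above (the generator is inlined in condensed form so the snippet runs standalone):

```python
# Condensed copy of generate_canonical_id, inlined for a self-contained check
def generate_canonical_id(entity_type, name, context=None):
    normalized = name.lower().replace(" ", "_").replace("-", "_")
    if entity_type == "person" and context and "company" in context:
        return f"person__{normalized}__{context['company']}"
    if entity_type == "decision" and context and "date" in context:
        return f"decision__{context['date']}__{normalized}"
    return f"{entity_type}__{normalized}"

# Deterministic: same inputs always yield the same ID
generate_canonical_id("person", "Bert Frichot", {"company": "itraven"})
# → "person__bert_frichot__itraven"
generate_canonical_id("decision", "HubSpot Migration", {"date": "2025-10-28"})
# → "decision__2025-10-28__hubspot_migration"
```

Determinism is the point: any source system that knows the name and context can recompute the same ID without a lookup table.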

2. Observation-Based Entity Model (Week 2)

Problem: Entity content is unstructured blob text, no citations

Before (entities/people/bert.md):

---
id: bert
entity_type: person
relations:
  - proj_itraven
  - proj_trading_center
---

# Bert Frichot

Founder of IT Raven. Trader. Developer. Likes automation.

After (Observation-Based):

{
  "id": "person__bert_frichot__itraven",
  "name": "Bert Frichot",
  "entity_type": "person",
  "observations": [
    {
      "name": "role",
      "value": "founder",
      "source": "hubspot",
      "source_id": "company__itraven",
      "timestamp": "2025-10-28T10:00:00Z",
      "citation": "HubSpot company record: Owner field",
      "confidence": 1.0
    },
    {
      "name": "interest",
      "value": "trading",
      "source": "memory",
      "source_id": "lesson__2025-10-28__trend_following",
      "timestamp": "2025-10-28T12:00:00Z",
      "citation": "Created trading strategy document with 3:1 R/R",
      "confidence": 1.0
    },
    {
      "name": "skill",
      "value": "python_development",
      "source": "github",
      "source_id": "repo__mem_agent_mcp",
      "timestamp": "2025-10-28T09:00:00Z",
      "citation": "97 Python scripts in tools/ directory",
      "confidence": 0.9
    },
    {
      "name": "preference",
      "value": "9_percent_stop_losses",
      "source": "memory",
      "source_id": "preference__trading_preferences",
      "timestamp": "2025-10-22T15:00:00Z",
      "citation": "User stated: 'I prefer 9% stop losses on crypto'",
      "confidence": 1.0
    }
  ],
  "relations": [
    {
      "type": "OWNS",
      "target": "company__itraven",
      "source": "hubspot",
      "since": "2020-01-01",
      "confidence": 1.0
    },
    {
      "type": "WORKS_ON",
      "target": "project__mem_agent_mcp",
      "source": "memory",
      "since": "2025-10-01",
      "status": "active",
      "confidence": 1.0
    },
    {
      "type": "LEARNED",
      "target": "lesson__2025-10-19__loki_schema_v13",
      "source": "memory",
      "timestamp": "2025-10-19T10:00:00Z",
      "confidence": 1.0
    }
  ]
}

Implementation:

# observation_model.py

from dataclasses import dataclass
from typing import Any, List

@dataclass
class Observation:
    """Single fact about an entity with citation."""
    name: str          # "role", "skill", "preference", "status"
    value: Any         # "founder", "python", "9%"
    source: str        # "hubspot", "memory", "github"
    source_id: str     # Specific record that supports this fact
    timestamp: str     # When fact was recorded (ISO 8601)
    citation: str      # Human-readable source citation
    confidence: float  # 0.0-1.0 (for inferred facts)


@dataclass
class Relation:
    """Relationship to another entity."""
    type: str          # "OWNS", "WORKS_ON", "LEARNED", "USES"
    target: str        # Canonical ID of target entity
    source: str        # Where relationship came from
    since: str         # When relationship started (optional)
    until: str         # When relationship ended (optional)
    status: str        # "active", "past", "planned"
    metadata: dict     # Additional context
    confidence: float  # 0.0-1.0


@dataclass
class Entity:
    """Knowledge graph entity with observations and relations."""
    id: str                    # Canonical ID
    name: str                  # Human-readable name
    entity_type: str           # "person", "project", "company", etc.
    observations: List[Observation]
    relations: List[Relation]
    sources: List[dict]        # All source systems
    created_at: str
    updated_at: str
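
The dataclasses compose and serialize cleanly via stdlib dataclasses.asdict, which recurses into nested dataclasses. A condensed, self-contained sketch (field values are illustrative, taken from the JSON example above):

```python
from dataclasses import dataclass, field, asdict
from typing import Any, List

@dataclass
class Observation:
    name: str
    value: Any
    source: str
    source_id: str
    timestamp: str
    citation: str
    confidence: float

@dataclass
class Entity:
    id: str
    name: str
    entity_type: str
    observations: List[Observation] = field(default_factory=list)

bert = Entity(
    id="person__bert_frichot__itraven",
    name="Bert Frichot",
    entity_type="person",
    observations=[Observation(
        name="role", value="founder", source="hubspot",
        source_id="company__itraven", timestamp="2025-10-28T10:00:00Z",
        citation="HubSpot company record: Owner field", confidence=1.0,
    )],
)

# Nested dataclasses become plain dicts, ready for json.dumps or a graph driver
record = asdict(bert)
```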

3. Migration Script (Week 3)

Goal: Convert 100+ existing entity files to graph format

Implementation:

# migrate_entities_to_graph.py

from datetime import datetime
from pathlib import Path

import yaml

from graph_store import GraphStore
from canonical_id import generate_canonical_id
from observation_model import Entity, Observation, Relation

def migrate_entity_file(file_path: Path) -> Entity:
    """Convert single entity file to graph format."""

    # Parse YAML frontmatter + content
    with open(file_path) as f:
        content = f.read()

    # Split at most twice, in case the markdown body itself contains '---'
    parts = content.split('---', 2)
    if len(parts) >= 3:
        frontmatter = yaml.safe_load(parts[1])
        markdown_content = parts[2].strip()
    else:
        raise ValueError(f"Invalid entity file: {file_path}")

    # Generate canonical ID
    canonical_id = generate_canonical_id(
        entity_type=frontmatter.get("entity_type", "unknown"),
        name=frontmatter.get("name", file_path.stem),
        context=frontmatter
    )

    # Extract observations from frontmatter
    observations = []

    # Status observation
    if "status" in frontmatter:
        observations.append(Observation(
            name="status",
            value=frontmatter["status"],
            source="memory",
            source_id=canonical_id,
            timestamp=frontmatter.get("created_at", "unknown"),
            citation=f"Entity file: {file_path}",
            confidence=1.0
        ))

    # Parse markdown content for additional observations.
    # (extract_observations_from_text is a helper still to be written:
    #  simple NLP to pull out key-value pairs, dates, skills, etc.)
    content_observations = extract_observations_from_text(markdown_content)
    observations.extend(content_observations)

    # Convert relations
    relations = []
    for rel_id in frontmatter.get("relations", []):
        relations.append(Relation(
            type="RELATES_TO",  # Generic, refine later
            target=rel_id,
            source="memory",
            since=frontmatter.get("created_at"),
            status="active",
            metadata={},
            confidence=1.0
        ))

    # Create entity
    entity = Entity(
        id=canonical_id,
        name=frontmatter.get("name", file_path.stem),
        entity_type=frontmatter.get("entity_type", "unknown"),
        observations=observations,
        relations=relations,
        sources=[{
            "source_type": "memory",
            "source_id": str(file_path),
            "synced_at": datetime.now().isoformat()
        }],
        created_at=frontmatter.get("created_at", datetime.now().isoformat()),
        updated_at=datetime.now().isoformat()
    )

    return entity


def migrate_all_entities():
    """Migrate all entity files to graph database."""

    entity_dir = Path("/Users/bertfrichot/Documents/memory/entities/")
    graph = GraphStore()

    migrated = 0
    errors = []

    # Find all entity markdown files
    for file_path in entity_dir.rglob("*.md"):
        try:
            entity = migrate_entity_file(file_path)
            graph.store_entity(entity)
            migrated += 1
            print(f"✅ Migrated: {entity.id}")
        except Exception as e:
            errors.append((file_path, str(e)))
            print(f"❌ Error: {file_path} - {e}")

    print(f"\n✅ Migrated: {migrated} entities")
    print(f"❌ Errors: {len(errors)} entities")

    return migrated, errors


if __name__ == "__main__":
    migrate_all_entities()

4. Graph Database Choice (Week 3)

Options:

  1. Neo4j (recommended for IT Raven)
     - ✅ Native graph database
     - ✅ Cypher query language (powerful)
     - ✅ Enterprise features (ACID, clustering)
     - ✅ Graph visualization tools
     - ❌ Cost: $50-150/month (managed) or self-host

  2. Qdrant with Graph Extensions
     - ✅ Already in use for vectors
     - ✅ Can add a graph layer on top
     - ✅ Free (self-hosted)
     - ❌ Less mature graph features
     - ❌ Custom query language needed

  3. PostgreSQL with Graph Extension
     - ✅ Familiar SQL
     - ✅ Reliable, proven
     - ✅ Free
     - ❌ Not optimized for graph traversal

Recommendation: Neo4j for IT Raven (production use case)
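
If Neo4j is chosen, canonical IDs are worth enforcing at the database level from day one. A Cypher sketch (Neo4j 5.x syntax; the Entity label and index name are assumptions consistent with the query examples later in this document):

```cypher
// Reject duplicate canonical IDs at write time
CREATE CONSTRAINT entity_id_unique IF NOT EXISTS
FOR (e:Entity) REQUIRE e.id IS UNIQUE;

// Speed up lookups filtered by entity type
CREATE INDEX entity_type_idx IF NOT EXISTS
FOR (e:Entity) ON (e.entity_type);
```

With the constraint in place, the migration script fails loudly on duplicates instead of silently creating a second node for the same person.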

5. Complex Query Support (Week 4)

Goal: Enable relationship-based queries

Example Queries:

# Query 1: "All active projects where Bert is owner"
query = """
MATCH (p:Person {id: "person__bert_frichot__itraven"})
      -[:OWNS]->(proj:Project {status: "active"})
RETURN proj.name, proj.created_at, proj.updated_at
ORDER BY proj.created_at DESC
"""

# Query 2: "Who are all people related to IT Raven?"
query = """
MATCH (company:Company {id: "company__itraven"})
      <-[r:WORKS_FOR|OWNS|CONTACTS]-(person:Person)
RETURN person.name, person.id, type(r) AS relationship_type
"""

# Query 3: "What lessons have we learned about Loki?"
query = """
MATCH (lesson:Lesson)-[:RELATES_TO]->(topic:Topic)
WHERE topic.name CONTAINS "loki"
RETURN lesson.name, lesson.date, lesson.solution
ORDER BY lesson.date DESC
"""

# Query 4: "Project dependency graph"
query = """
MATCH path = (proj:Project)-[:DEPENDS_ON*1..3]->(dep:Project)
RETURN path
"""

# Query 5: "People who know Python AND work on active projects"
query = """
MATCH (p:Person)-[:HAS_SKILL]->(skill:Skill {name: "python"})
MATCH (p)-[:WORKS_ON]->(proj:Project {status: "active"})
RETURN p.name, proj.name
"""

Implementation:

# graph_queries.py

from neo4j import GraphDatabase

class GraphQueries:
    def __init__(self, uri, user, password):
        self.driver = GraphDatabase.driver(uri, auth=(user, password))

    def get_active_projects_for_person(self, person_id: str):
        """Get all active projects where person is owner."""
        with self.driver.session() as session:
            result = session.run("""
                MATCH (p:Person {id: $person_id})
                      -[:OWNS]->(proj:Project {status: "active"})
                RETURN proj.name AS name,
                       proj.created_at AS created,
                       proj.status AS status
                ORDER BY proj.created_at DESC
            """, person_id=person_id)

            return [dict(record) for record in result]

    def get_related_entities(self, entity_id: str, max_depth: int = 2):
        """Get all entities related to given entity."""
        # Cypher does not allow parameters inside variable-length bounds,
        # so validate the depth and interpolate it explicitly.
        depth = int(max_depth)
        with self.driver.session() as session:
            result = session.run(f"""
                MATCH path = (e {{id: $entity_id}})-[*1..{depth}]-(related)
                RETURN related.id AS id,
                       related.name AS name,
                       related.entity_type AS type,
                       length(path) AS distance
                ORDER BY distance
            """, entity_id=entity_id)

            return [dict(record) for record in result]

    def search_by_observation(self, obs_name: str, obs_value: str):
        """Find entities with specific observation."""
        with self.driver.session() as session:
            result = session.run("""
                MATCH (e:Entity)-[:HAS_OBSERVATION]->(obs:Observation)
                WHERE obs.name = $obs_name AND obs.value CONTAINS $obs_value
                RETURN e.id AS id, e.name AS name, obs.citation AS source
            """, obs_name=obs_name, obs_value=obs_value)

            return [dict(record) for record in result]

🔄 Phase 4: Multi-MCP Architecture

Timeline: 4-6 weeks (parallel with Phase 3)
Priority: ⭐⭐⭐⭐⭐ HIGHEST

Objective

Split monolithic memory_server into specialized MCP servers that work together via domain-aware routing.

Architecture

┌─────────────────────────────────────────────────────────┐
│                     Claude Desktop                       │
└─────────────────┬───────────────────────────────────────┘
     ┌────────────┼────────────┬────────────┬──────────────┐
     │            │            │            │              │
┌────▼───┐  ┌────▼───┐  ┌────▼───┐  ┌────▼───┐  ┌───────▼───────┐
│ Memory │  │ HubSpot│  │ Notion │  │ GitHub │  │ File System   │
│  MCP   │  │  MCP   │  │  MCP   │  │  MCP   │  │     MCP       │
└────┬───┘  └────┬───┘  └────┬───┘  └────┬───┘  └───────┬───────┘
     │            │            │            │              │
     └────────────┴────────────┴────────────┴──────────────┘
                      ┌────────▼────────┐
                      │   Neo4j Graph   │
                      │    Database     │
                      └─────────────────┘
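
Registering the servers with Claude Desktop is then a matter of listing each one in claude_desktop_config.json. A sketch of the standard MCP server config format (commands and paths below are illustrative assumptions, not the project's actual entry points):

```json
{
  "mcpServers": {
    "memory": {
      "command": "python",
      "args": ["-m", "agent.memory_server"]
    },
    "hubspot": {
      "command": "python",
      "args": ["tools/hubspot_mcp_server.py"]
    },
    "notion": {
      "command": "python",
      "args": ["memory_connectors/notion_mcp/server.py"]
    },
    "github": {
      "command": "python",
      "args": ["memory_connectors/github_mcp/server.py"]
    }
  }
}
```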

MCP Server Breakdown

1. Memory MCP Server (Core)

Responsibility: Core knowledge graph operations

Tools:

- store_entity(entity: Entity) -> str
- get_entity(entity_id: str) -> Entity
- search_entities(query: str, filters: dict) -> List[Entity]
- add_observation(entity_id: str, observation: Observation) -> bool
- add_relation(source_id: str, relation: Relation) -> bool
- get_related(entity_id: str, relation_type: str, max_depth: int) -> List[Entity]
- complex_query(cypher: str) -> List[dict]
- merge_entities(canonical_id: str, source_ids: List[str]) -> Entity

Location: agent/memory_server.py (refactored)

2. HubSpot MCP Server

Responsibility: CRM data (contacts, companies, deals)

Tools (already exists, enhance):

- search_contacts(query: str) -> List[Contact]
- get_company(company_id: str) -> Company
- get_deals(filters: dict) -> List[Deal]
- sync_to_memory() -> SyncResult  # NEW: Auto-sync to memory graph

Location: tools/hubspot_mcp_server.py (exists)

3. Notion MCP Server (NEW)

Responsibility: Extended Brain (projects, domains, identities)

Tools:

- get_projects(filters: dict) -> List[Project]
- get_domains() -> List[Domain]
- get_identities() -> List[Identity]
- create_project(name: str, domain_id: str) -> Project
- sync_to_memory() -> SyncResult  # Auto-sync to memory graph

Location: memory_connectors/notion_mcp/server.py (NEW)

4. GitHub MCP Server (NEW)

Responsibility: Code repos, issues, PRs

Tools:

- list_repos() -> List[Repo]
- get_issues(repo: str, state: str) -> List[Issue]
- get_prs(repo: str, state: str) -> List[PR]
- search_code(query: str) -> List[CodeMatch]
- sync_to_memory() -> SyncResult  # Auto-sync to memory graph

Location: memory_connectors/github_mcp/server.py (NEW)

5. Filesystem MCP Server

Responsibility: Local codebase access

Status: Already available (mcp-server-filesystem)

Domain-Aware Routing

Goal: Claude automatically routes queries to correct MCP servers

Implementation:

# query_router.py

from typing import List

def route_query(query: str) -> List[str]:
    """
    Determine which MCP servers to query based on intent.

    Examples:
    - "Who is the account owner for Acme Corp?" → [hubspot, memory]
    - "What projects is Bert working on?" → [notion, memory]
    - "Show me recent PRs for mem-agent-mcp" → [github, memory]
    - "What are my trading positions?" → [memory]
    """

    # Intent detection (simple keyword matching, could use LLM)
    servers = set()

    # CRM-related
    if any(kw in query.lower() for kw in ["contact", "company", "deal", "account", "customer"]):
        servers.add("hubspot")
        servers.add("memory")

    # Project-related
    if any(kw in query.lower() for kw in ["project", "domain", "identity", "extended brain"]):
        servers.add("notion")
        servers.add("memory")

    # Code-related
    if any(kw in query.lower() for kw in ["repo", "pr", "pull request", "issue", "code", "github"]):
        servers.add("github")
        servers.add("memory")

    # Trading-related
    if any(kw in query.lower() for kw in ["trade", "position", "portfolio", "stock", "alpaca", "schwab"]):
        servers.add("memory")

    # If no specific intent, query memory (fallback)
    if not servers:
        servers.add("memory")

    return list(servers)


# Example: Claude automatically routes queries
query = "Who are all people working on IT Raven projects?"
servers = route_query(query)  # → ["notion", "memory", "hubspot"]

# Claude queries:
# 1. Notion MCP: Get all IT Raven projects
# 2. Memory MCP: Get people relations for those projects
# 3. HubSpot MCP: Enrich with contact details

Cross-Server Synchronization

Goal: Keep memory graph in sync with external systems

Implementation:

# sync_coordinator.py

class SyncCoordinator:
    """Coordinate syncs between MCP servers and memory graph."""

    def __init__(self):
        self.hubspot = HubSpotMCP()
        self.notion = NotionMCP()
        self.github = GitHubMCP()
        self.memory = MemoryMCP()

    async def sync_all(self):
        """Sync all external systems to memory graph."""

        # HubSpot → Memory
        hubspot_contacts = await self.hubspot.get_all_contacts()
        for contact in hubspot_contacts:
            canonical_id = generate_canonical_id("person", contact.name, {"company": contact.company})

            entity = Entity(
                id=canonical_id,
                name=contact.name,
                entity_type="person",
                observations=[
                    Observation(name="email", value=contact.email, source="hubspot", ...),
                    Observation(name="company", value=contact.company, source="hubspot", ...),
                ],
                relations=[...],
                sources=[{"source_type": "hubspot", "source_id": contact.id}]
            )

            await self.memory.store_entity(entity)

        # Notion → Memory
        notion_projects = await self.notion.get_all_projects()
        for project in notion_projects:
            canonical_id = generate_canonical_id("project", project.name)

            entity = Entity(
                id=canonical_id,
                name=project.name,
                entity_type="project",
                observations=[
                    Observation(name="status", value=project.status, source="notion", ...),
                    Observation(name="domain", value=project.domain, source="notion", ...),
                ],
                relations=[...],
                sources=[{"source_type": "notion", "source_id": project.id}]
            )

            await self.memory.store_entity(entity)

        # GitHub → Memory
        github_repos = await self.github.get_all_repos()
        # ... similar pattern


# Scheduled sync (every 4 hours, like auto-indexer)
# LaunchAgent or cron job
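
The 4-hour cadence can reuse the auto-indexer pattern. A crontab sketch (the script path and log location are placeholder assumptions):

```
# m h    dom mon dow  command
0   */4  *   *   *    /usr/bin/python3 /path/to/sync_coordinator.py >> ~/sync.log 2>&1
```

On macOS a LaunchAgent plist with StartInterval 14400 achieves the same schedule and survives sleep/wake better than cron.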

🤖 Phase 5: Multi-Agent Architecture for IT Raven

Timeline: 4-6 weeks (after Phases 3-4)
Priority: ⭐⭐⭐⭐☆ HIGH

Objective

Build enterprise agent system modeled after successful trading agent pattern.

Agent Boundaries

We just implemented 5 agents for trading - now we apply the same pattern to IT Raven:

1. MemoryAgent (Core)

Responsibility: Entity storage, retrieval, relationships

Subscribes to:

- project.created - store project entity
- person.added - store person entity
- lesson.learned - store lesson entity
- decision.made - store decision entity

Publishes:

- memory.entity_stored
- memory.entity_updated
- memory.query_result

2. HubSpotAgent (CRM Integration)

Responsibility: CRM data sync, contact management

Subscribes to:

- sync.hubspot_requested - trigger sync
- contact.updated - update HubSpot contact

Publishes:

- hubspot.contact_synced
- hubspot.sync_complete
- hubspot.sync_failed

Circuit Breaker: 3 failures → 5 min backoff
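
The "3 failures → 5 min backoff" policy reduces to a small wrapper. A minimal standalone sketch (not the trading-system code, which this would be copied from and adapted):

```python
import time

class CircuitBreaker:
    """Open after `max_failures` consecutive errors; retry after `cooldown` seconds."""

    def __init__(self, max_failures: int = 3, cooldown: float = 300.0):
        self.max_failures = max_failures
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None  # monotonic timestamp when the circuit opened

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.cooldown:
                raise RuntimeError("circuit open - backing off")
            self.opened_at = None  # half-open: allow one trial call through
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        else:
            self.failures = 0  # any success fully closes the circuit
            return result
```

HubSpotAgent would wrap every outbound API call in `breaker.call(...)`, so a flapping CRM degrades gracefully instead of hammering the API.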

3. ProjectAgent (Knowledge Management)

Responsibility: Project tracking, decision logging

State Machine: Draft → Active → On Hold → Complete → Archived

Subscribes to:

- project.create_requested
- project.status_changed
- decision.logged

Publishes:

- project.created
- project.milestone_reached
- project.archived
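
The lifecycle above reduces to a transition table that ProjectAgent consults before any status change. A minimal sketch (the extra edges, e.g. draft → archived, are assumptions beyond the linear chain stated above):

```python
# Allowed moves per state; anything not listed is rejected
VALID_TRANSITIONS = {
    "draft":    {"active", "archived"},
    "active":   {"on_hold", "complete", "archived"},
    "on_hold":  {"active", "archived"},
    "complete": {"archived"},
    "archived": set(),  # terminal state
}

def transition(current: str, target: str) -> str:
    """Return the new state, or raise if the move is not allowed."""
    if target not in VALID_TRANSITIONS.get(current, set()):
        raise ValueError(f"invalid transition: {current} -> {target}")
    return target
```

Enforcing transitions in one place is what made the trading-system state machines effective at preventing invalid-state bugs.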

4. NotionAgent (Extended Brain Sync)

Responsibility: Notion workspace sync

Subscribes to:

- sync.notion_requested
- project.created - create Notion page

Publishes:

- notion.page_created
- notion.sync_complete

5. AnalyticsAgent (Insights & Reporting)

Responsibility: Cross-system analytics, dashboards

Subscribes to:

- analytics.report_requested
- *.* - collect all events for metrics

Publishes:

- analytics.report_ready
- analytics.metrics_updated

Event-Driven Flow Example

User: "Create new project: IT Raven Dashboard"

1. ProjectAgent receives project.create_requested event
   → Creates project entity
   → Transitions state to "Draft"
   → Publishes project.created event

2. MemoryAgent receives project.created event
   → Stores entity in graph database
   → Generates canonical ID: project__itraven_dashboard
   → Publishes memory.entity_stored event

3. NotionAgent receives project.created event
   → Creates Notion page in Projects database
   → Links to domain and identities
   → Publishes notion.page_created event

4. HubSpotAgent receives project.created event
   → Creates associated deal in HubSpot
   → Links to company
   → Publishes hubspot.deal_created event

5. AnalyticsAgent receives all events
   → Updates project count metric
   → Logs event timeline
   → Publishes analytics.metrics_updated event

Message Bus Architecture

Use Redis Pub/Sub (proven in trading system):

# Example: Publishing project created event

redis_client.publish("project.created", json.dumps({
    "project_id": "project__itraven_dashboard",
    "name": "IT Raven Dashboard",
    "owner": "person__bert_frichot__itraven",
    "status": "draft",
    "created_at": "2025-10-28T10:00:00Z",
    "trace_id": "trace_abc123"  # For distributed tracing
}))
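
On the consuming side, each agent maps channels to handlers. The routing itself is plain Python and can be unit-tested without a broker; the actual Redis wiring is sketched in a trailing comment, and the handler below is illustrative:

```python
import json

HANDLERS = {}

def subscribe(channel: str):
    """Register a handler function for a pub/sub channel."""
    def decorator(fn):
        HANDLERS[channel] = fn
        return fn
    return decorator

@subscribe("project.created")
def on_project_created(event: dict) -> str:
    # MemoryAgent side: store the entity, keyed by canonical ID
    return f"stored {event['project_id']}"

def dispatch(channel: str, payload: str):
    """Decode a raw message and invoke the registered handler."""
    return HANDLERS[channel](json.loads(payload))

# Real wiring (sketch): pubsub = redis_client.pubsub(); pubsub.subscribe(*HANDLERS)
# then: for msg in pubsub.listen(): dispatch(msg["channel"], msg["data"])
```

Keeping dispatch separate from the Redis loop means handler logic gets covered by fast unit tests, with only the listen loop needing integration tests.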

Distributed Tracing

Use OpenTelemetry + Jaeger (proven in trading system):

@tracer.start_as_current_span("memory_agent.store_entity")
def store_entity(entity: Entity):
    span = trace.get_current_span()
    span.set_attribute("entity.id", entity.id)
    span.set_attribute("entity.type", entity.entity_type)

    # Store in Neo4j
    result = graph.create_entity(entity)

    span.set_attribute("storage.success", True)
    return result

📅 Implementation Timeline

Weeks 1-6: Knowledge Graph Foundation

  • [ ] Week 1: Canonical ID system design and implementation
  • [ ] Week 2: Observation-based entity model
  • [ ] Week 3: Migration script + Neo4j setup
  • [ ] Week 4: Complex query support (Cypher)
  • [ ] Week 5: Testing and validation
  • [ ] Week 6: Performance optimization

Weeks 7-12: Multi-MCP Architecture

  • [ ] Week 7: Split memory_server into specialized servers
  • [ ] Week 8: Notion MCP server implementation
  • [ ] Week 9: GitHub MCP server implementation
  • [ ] Week 10: Domain-aware routing logic
  • [ ] Week 11: Cross-server sync coordinator
  • [ ] Week 12: Testing and documentation

Weeks 13-18: Multi-Agent System

  • [ ] Week 13: MemoryAgent + ProjectAgent
  • [ ] Week 14: HubSpotAgent + NotionAgent
  • [ ] Week 15: AnalyticsAgent
  • [ ] Week 16: Circuit breakers for all external APIs
  • [ ] Week 17: Distributed tracing + observability
  • [ ] Week 18: Load testing and optimization

Weeks 19-24: Production Deployment

  • [ ] Week 19: Frontend development starts
  • [ ] Week 20: API endpoints for frontend
  • [ ] Week 21: User authentication & authorization
  • [ ] Week 22: Production infrastructure (Docker, CI/CD)
  • [ ] Week 23: Security audit & hardening
  • [ ] Week 24: Launch IT Raven MVP!

Total Timeline: 24 weeks (~6 months)


🎯 Success Criteria

Phase 3: Knowledge Graph

✅ 100+ entities migrated to graph database
✅ Canonical IDs generated for all entities
✅ Observations with citations for all facts
✅ Complex queries working (relationship traversal)
✅ Automatic deduplication across sources

Phase 4: Multi-MCP

✅ 5 MCP servers running (Memory, HubSpot, Notion, GitHub, Filesystem)
✅ Domain-aware routing working
✅ Cross-server sync automatic every 4 hours
✅ Query performance < 500ms for simple queries

Phase 5: Multi-Agent

✅ 5 agents implemented and tested
✅ Event-driven architecture with Redis
✅ Circuit breakers for all external APIs
✅ Distributed tracing with Jaeger
✅ Load testing with 100 concurrent users


💡 Lessons from Trading System

What Worked Well

  1. Event-driven architecture - Agents decoupled, easy to add new ones
  2. State machines - Enforced valid transitions, prevented bugs
  3. Circuit breakers - Graceful degradation when APIs fail
  4. Distributed tracing - Easy debugging across agents
  5. Test-first approach - Foundation tests caught issues early

Apply to IT Raven

  1. Same Redis Pub/Sub pattern - Proven reliable
  2. Same state machine approach - Project lifecycle, decision workflow
  3. Same circuit breaker code - Copy from trading, adapt for HubSpot/Notion
  4. Same OpenTelemetry setup - Copy tracing infrastructure
  5. Same testing strategy - Foundation tests → integration tests → load tests

Don't Repeat Mistakes

  1. Don't skip planning - trading took only 4 days because of a good design doc
  2. Don't skip tests - we wrote tests alongside the code and caught bugs early
  3. Don't guess at edge cases - we documented common failures up front

🚀 Next Immediate Steps

This Week

  1. Review this roadmap with user
  2. Create Phase 3 detailed spec (canonical IDs, observation model)
  3. Set up Neo4j locally for testing
  4. Write canonical ID generator and test with 10 entities
  5. Start migration script (parse first 10 entity files)

Next Week

  1. Complete migration script
  2. Migrate all 100+ entities
  3. Validate migration (check for duplicates, missing relations)
  4. Write first complex queries
  5. Document migration results

Status: Ready for Phase 3
Prerequisites: ✅ Trading system isolated and complete
Blockers: None - ready to start!
Estimated Completion: April 2026 (6 months from now)


"From file-based chaos to enterprise knowledge graph. From single MCP server to multi-server architecture. From proof-of-concept to production platform. Let's build IT Raven."