
HubSpot Developer Agent - Implementation Plan

Status: 🚧 In Progress
Priority: P0 - CRITICAL (10 hrs/week time savings)
Complexity: Medium
Estimated Time: 16-24 hours
ROI: Eliminate manual CRM data entry, proactive client health tracking


Table of Contents

  1. Executive Summary
  2. Current Infrastructure
  3. Architecture Overview
  4. Implementation Phases
  5. Code Implementation
  6. Testing Strategy
  7. Deployment
  8. Success Metrics

Executive Summary

Problem Statement

IT Raven generates 10,000+ client emails annually, but HubSpot CRM data entry is manual, time-consuming, and inconsistent. Critical client interactions aren't being tracked, leading to missed follow-ups and at-risk accounts.

Solution

Build an intelligent HubSpot Developer Agent that:

  • Auto-syncs M365 emails → HubSpot contacts/companies/deals
  • Extracts action items, commitments, and issues from email content
  • Tracks client health scores based on communication frequency
  • Alerts on at-risk clients (no contact >30 days)
  • Generates weekly client intelligence reports

Business Impact

  • Time Savings: 10 hrs/week manual data entry eliminated
  • Revenue Protection: Proactive at-risk client identification
  • Data Quality: 100% of client interactions tracked
  • Scalability: Handle 2x client growth without CRM admin overhead

Current Infrastructure

✅ What You Already Have

1. HubSpot MCP Server (tools/hubspot_mcp_server.py)

  • FastMCP-based server with stdio/HTTP transport
  • Tools: get_contacts, search_contacts, create_contact, etc.
  • OpenTelemetry tracing integrated
  • Credentials configured in .env.hubspot

2. IT Raven Automation (tools/it_raven_hubspot_automation.py)

  • Auto-links contacts to companies by email domain
  • Domain caching to reduce API calls
  • Dry-run mode for testing
  • Weekly LaunchAgent execution

3. HubSpot Credentials

HUBSPOT_ACCESS_TOKEN=<redacted; stored in .env.hubspot>
HUBSPOT_CLIENT_SECRET=<redacted; stored in .env.hubspot>
HUBSPOT_PORTAL_ID=443524610

4. M365 Email Data

  • 10,000+ IT Raven client emails in ~/Documents/memory/entities/itraven-email/
  • Already indexed in Qdrant
  • Includes: sender, recipient, subject, body, timestamp
  • Client discovery JSON with 100+ companies

5. Existing HubSpot Utilities (18 scripts)

  • CRM exporters, bulk importers, workflow fetchers
  • Audit tools, health checkers
  • Proven patterns for rate limiting and error handling

🔨 What Needs to Be Built

1. Email Intelligence Extractor

  • Parse M365 emails for CRM-relevant information
  • Extract: client names, action items, issues, commitments
  • Identify email type (support ticket, sales inquiry, meeting, etc.)

2. Smart CRM Sync Engine

  • Auto-create/update HubSpot contacts from email senders
  • Auto-create/update companies from email domains
  • Auto-create deals for new projects/opportunities
  • Log email interactions as HubSpot engagements

3. Client Health Monitor

  • Track days since last contact per client
  • Calculate health scores (email frequency, issue resolution time)
  • Alert on at-risk clients (>30 days, unresolved critical issues)

4. Lifecycle Stage Automation

  • Move contacts through stages based on email patterns
  • Handle backwards movement (active → inactive → churned)
  • Trigger workflows based on stage changes

5. Weekly Intelligence Report Generator

  • Top active clients this week
  • At-risk clients requiring attention
  • Unresolved action items
  • Communication trends


Architecture Overview

┌─────────────────────────────────────────────────────────────┐
│                  M365 Email Source                          │
│  ~/Documents/memory/entities/itraven-email/*.md            │
│  (10,000+ emails, indexed in Qdrant)                       │
└────────────────────┬────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│          Email Intelligence Extractor                       │
│  - Parse email metadata and content                         │
│  - Identify client company from domain/email                │
│  - Extract action items using NER/LLM                       │
│  - Classify email type (support, sales, meeting)           │
│  - Calculate urgency/priority                               │
└────────────────────┬────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│            Smart CRM Sync Engine                            │
│  ┌─────────────────────────────────────────────────────┐   │
│  │  Contacts: Create/update from email senders        │   │
│  │  - Extract: firstname, lastname, email, phone      │   │
│  │  - Set: lifecycle stage, last contact date         │   │
│  └─────────────────────────────────────────────────────┘   │
│  ┌─────────────────────────────────────────────────────┐   │
│  │  Companies: Create/update from domains             │   │
│  │  - Match domain → company name (Clearbit/manual)   │   │
│  │  - Set: industry, size, website                    │   │
│  └─────────────────────────────────────────────────────┘   │
│  ┌─────────────────────────────────────────────────────┐   │
│  │  Deals: Create from project emails                 │   │
│  │  - Detect keywords: "quote", "proposal", "RFP"     │   │
│  │  - Extract: deal amount, timeline, services        │   │
│  └─────────────────────────────────────────────────────┘   │
│  ┌─────────────────────────────────────────────────────┐   │
│  │  Engagements: Log all email interactions           │   │
│  │  - Type: EMAIL, timestamp, subject, body excerpt   │   │
│  │  - Link to: contact + company + deal (if any)      │   │
│  └─────────────────────────────────────────────────────┘   │
└────────────────────┬────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│              HubSpot MCP Server                             │
│  (Existing: tools/hubspot_mcp_server.py)                   │
│  - FastMCP with get/create/update/search tools             │
│  - Rate limiting with exponential backoff                  │
│  - Token refresh automation                                 │
└────────────────────┬────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────┐
│              HubSpot CRM (Portal: 443524610)                │
│  - Contacts: IT Raven client contacts                      │
│  - Companies: Client organizations                          │
│  - Deals: Active projects/opportunities                     │
│  - Engagements: Email interaction log                       │
└─────────────────────────────────────────────────────────────┘

Data Flow

Email → CRM Sync (Hourly)

  1. Scan new emails in ~/Documents/memory/entities/itraven-email/
  2. Extract sender email + company domain
  3. Check HubSpot: Contact exists? Company exists?
  4. Create/update Contact with latest info
  5. Link Contact ← → Company (if not already)
  6. Create Engagement record for this email
  7. Update "Last Contact Date" on both Contact + Company
  8. Extract action items → Create Tasks in HubSpot

Client Health Monitor (Daily)

  1. Query all IT Raven companies from HubSpot
  2. Calculate days since last contact
  3. Flag if >30 days (at-risk), >90 days (churned)
  4. Check for unresolved critical issues (from emails)
  5. Generate alert email with at-risk client list

Weekly Intelligence Report (Monday 8am)

  1. Top 10 most active clients this week
  2. At-risk clients requiring attention
  3. New companies added
  4. Open action items summary
  5. Send to IT Raven team + store in Notion
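Step 2 of the hourly sync (extract sender email + company domain) could look roughly like the sketch below. It uses only the standard library. The function name extract_sender and the FREE_MAIL_DOMAINS set are illustrative assumptions, not part of the existing tools; the idea is that personal mailboxes should not become HubSpot companies.

```python
from email.utils import parseaddr

# Assumption: free-mail domains carry no company signal. Illustrative, not exhaustive.
FREE_MAIL_DOMAINS = {"gmail.com", "outlook.com", "hotmail.com", "yahoo.com", "icloud.com"}


def extract_sender(from_header: str) -> dict:
    """Split a From header into display name, address, and company domain."""
    name, address = parseaddr(from_header)
    address = address.strip().lower()
    if "@" not in address:
        # Unparseable header: report nothing rather than guess
        return {"name": name, "email": None, "domain": None}
    domain = address.rsplit("@", 1)[1]
    # Personal mailboxes are kept as contacts but get no company domain
    company_domain = None if domain in FREE_MAIL_DOMAINS else domain
    return {"name": name, "email": address, "domain": company_domain}
```

Lower-casing the address before lookup keeps Jane.Doe@AcmeCorp.com and jane.doe@acmecorp.com from producing duplicate contacts.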


Implementation Phases

Phase 1: Email Intelligence Extractor (4-6 hours)

Goal: Parse M365 emails and extract CRM-relevant structured data

Tasks:

  1. ✅ Read email files from ~/Documents/memory/entities/itraven-email/
  2. ✅ Parse metadata: From, To, Subject, Date, Body
  3. ✅ Extract sender email and company domain
  4. ✅ Classify email type (support, sales, meeting, general)
  5. ✅ Extract action items using LLM (LM Studio local model)
  6. ✅ Identify urgency/priority keywords
  7. ✅ Store extracted data in structured JSON

Deliverables:

  • tools/email_intelligence_extractor.py
  • Output: JSON with parsed email data for CRM sync

Testing:

# Test on 10 sample emails
uv run python3 tools/email_intelligence_extractor.py --limit 10 --verbose

# Expected output: JSON with contacts, companies, action items extracted
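As a starting point for the metadata-parsing task, here is a minimal sketch. It assumes each email file begins with RFC 822-style "Header: value" lines (From, To, Subject, Date) followed by a blank line and the body. The actual layout of the .md files is not shown in this plan, so adjust the parser if they use front matter or another format. The name parse_email_text is hypothetical.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime


def parse_email_text(text: str) -> dict:
    """Parse 'Header: value' lines, a blank line, then the body."""
    msg = message_from_string(text)
    raw_date = msg.get("Date")
    try:
        sent_at = parsedate_to_datetime(raw_date).isoformat() if raw_date else None
    except (TypeError, ValueError):
        sent_at = None  # Malformed date: keep the email, drop the timestamp
    return {
        "from": msg.get("From"),
        "to": msg.get("To"),
        "subject": msg.get("Subject", ""),
        "date": sent_at,
        "body": msg.get_payload().strip(),
    }
```

The returned dict is JSON-serializable as is, which fits the "structured JSON" deliverable.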


Phase 2: Smart CRM Sync Engine (6-8 hours)

Goal: Auto-sync extracted email data to HubSpot CRM

Tasks:

  1. ✅ Load extracted email intelligence JSON
  2. ✅ For each email sender:
       • Check if contact exists (by email)
       • Create new contact if not found
       • Update existing contact (last contact date, lifecycle stage)
  3. ✅ For each company domain:
       • Check if company exists (by domain property)
       • Create new company if not found
       • Update existing company (last contact date)
  4. ✅ Link contact ← → company association
  5. ✅ Create engagement record for this email interaction
  6. ✅ Extract and create tasks for action items
  7. ✅ Implement rate limiting (100 req/10s, with exponential backoff)
  8. ✅ Handle lifecycle stage updates (including backwards movement)

Deliverables:

  • tools/hubspot_email_sync.py
  • Dry-run mode for safe testing
  • Comprehensive logging

Testing:

# Dry run (no actual CRM changes)
uv run python3 tools/hubspot_email_sync.py --dry-run --limit 50

# Live run on small batch
uv run python3 tools/hubspot_email_sync.py --limit 10

# Full sync (after validation)
uv run python3 tools/hubspot_email_sync.py


Phase 3: Client Health Monitor (3-4 hours)

Goal: Track client engagement and identify at-risk accounts

Tasks:

  1. ✅ Query all IT Raven companies from HubSpot
  2. ✅ Calculate "days since last contact" for each
  3. ✅ Categorize:
       • Active: <30 days
       • At-Risk: 30-90 days
       • Churned: >90 days
  4. ✅ Check for unresolved critical issues (from email extraction)
  5. ✅ Generate health score (0-100) per client
  6. ✅ Send alert email with at-risk list
  7. ✅ Store health data in Notion database

Deliverables:

  • tools/client_health_monitor.py
  • Daily LaunchAgent execution (8am)
  • Alert email template

Testing:

# Generate health report
uv run python3 tools/client_health_monitor.py --verbose

# Expected: List of at-risk clients with days since last contact
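The Phase 3 categories (Active <30 days, At-Risk 30-90 days, Churned >90 days) leave the boundary days implicit. The sketch below makes them explicit. The "Unknown" status for companies with no recorded contact, and the helper names, are assumptions for illustration.

```python
from typing import Optional


def categorize_client(days_since_contact: Optional[int]) -> str:
    """Map days since last contact to the Phase 3 status buckets."""
    if days_since_contact is None:
        return "Unknown"  # Assumption: no recorded contact is reported separately
    if days_since_contact < 30:
        return "Active"
    if days_since_contact <= 90:
        return "At-Risk"  # 30 and 90 both count as At-Risk
    return "Churned"


def at_risk_names(companies: list) -> list:
    """Names of At-Risk companies, longest-silent first."""
    flagged = [c for c in companies if categorize_client(c.get("days")) == "At-Risk"]
    return [c["name"] for c in sorted(flagged, key=lambda c: c["days"], reverse=True)]
```

Keeping the thresholds in one function means the alert email and the weekly report cannot drift apart on what "at-risk" means.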


Phase 4: Lifecycle Stage Automation (2-3 hours)

Goal: Auto-update contact lifecycle stages based on email patterns

Tasks:

  1. ✅ Define stage transition rules:
       • Lead → MQL: First meaningful response
       • MQL → SQL: Meeting scheduled
       • SQL → Opportunity: Quote/proposal sent
       • Opportunity → Customer: Contract signed
       • Customer → Active: Regular communication (<30 days)
       • Active → Inactive: No contact 30-90 days
       • Inactive → Churned: No contact >90 days
  2. ✅ Implement backwards movement workaround (clear + set)
  3. ✅ Trigger HubSpot workflows on stage changes
  4. ✅ Log stage transitions with reasoning

Deliverables:

  • tools/lifecycle_stage_manager.py
  • Integration with CRM sync engine

Testing:

# Test stage transitions
uv run python3 tools/lifecycle_stage_manager.py --test-mode


Phase 5: Weekly Intelligence Report (2-3 hours)

Goal: Automated weekly summary of client activity and health

Tasks:

  1. ✅ Query HubSpot for:
       • Top 10 most active clients (email count this week)
       • At-risk clients (30-90 days since contact)
       • New companies added this week
       • Open action items summary
  2. ✅ Generate markdown report
  3. ✅ Send via email to IT Raven team
  4. ✅ Store in Notion "Weekly Reports" database
  5. ✅ Create LaunchAgent for Monday 8am execution

Deliverables:

  • tools/weekly_intelligence_report.py
  • Email template (HTML + markdown)
  • Notion integration

Testing:

# Generate test report
uv run python3 tools/weekly_intelligence_report.py --week 2025-11-11

# Expected: Markdown report with client activity summary
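The "generate markdown report" task can be sketched as a pure function that takes already-queried data and returns a string. That keeps it testable without HubSpot access. The function name, section titles, and input dict keys (name, emails, days) are assumptions for illustration.

```python
from datetime import date


def render_weekly_report(week_start: date, active: list, at_risk: list,
                         new_companies: list, open_items: list) -> str:
    """Render the four report sections as a markdown string."""

    def section(title: str, rows: list) -> list:
        # An empty section still gets a heading so the layout stays stable
        return [f"## {title}", ""] + (rows or ["None this week."]) + [""]

    # Rank by email count and keep the ten busiest clients
    top = sorted(active, key=lambda c: c["emails"], reverse=True)[:10]
    lines = [f"# Weekly Client Intelligence Report ({week_start.isoformat()})", ""]
    lines += section("Top Active Clients",
                     [f"{i}. {c['name']}: {c['emails']} emails" for i, c in enumerate(top, 1)])
    lines += section("At-Risk Clients",
                     [f"- {c['name']}: {c['days']} days since last contact" for c in at_risk])
    lines += section("New Companies", [f"- {name}" for name in new_companies])
    lines += section("Open Action Items", [f"- {item}" for item in open_items])
    return "\n".join(lines)
```

The same string can be sent as the email body and stored in Notion, so both destinations show identical content.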


Code Implementation

Phase 1: Email Intelligence Extractor

File: tools/email_intelligence_extractor.py

Key Functions:

from pathlib import Path
from typing import Dict, List


def parse_email_file(filepath: Path) -> Dict:
    """Parse email markdown file and extract structured data."""
    pass

def extract_contact_info(email_data: Dict) -> Dict:
    """Extract firstname, lastname, email, company from sender."""
    pass

def classify_email_type(subject: str, body: str) -> str:
    """Classify: support, sales, meeting, general."""
    pass

def extract_action_items(body: str) -> List[Dict]:
    """Use LLM to extract action items with assignee and due date."""
    pass

def calculate_urgency(subject: str, body: str) -> str:
    """Determine: critical, high, medium, low."""
    pass
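Before wiring in an LLM, classify_email_type and calculate_urgency can start as simple keyword rules. The sketch below is one possible baseline. The keyword lists are illustrative assumptions (only "quote", "proposal", and "RFP" come from this plan) and should be tuned against real IT Raven emails.

```python
import re

# Assumption: illustrative keyword lists, not a validated taxonomy
TYPE_KEYWORDS = {
    "support": ("error", "outage", "not working", "issue", "ticket", "down"),
    "sales": ("quote", "proposal", "rfp", "pricing", "renewal"),
    "meeting": ("meeting", "calendar", "schedule", "invite", "call"),
}
# Checked in order, so the most severe matching level wins
URGENCY_KEYWORDS = [
    ("critical", ("outage", "breach", "ransomware", "down")),
    ("high", ("urgent", "asap", "immediately")),
    ("medium", ("soon", "this week", "follow up")),
]


def _contains(text: str, phrase: str) -> bool:
    """Whole-word match, so 'down' does not fire on 'download'."""
    return re.search(rf"\b{re.escape(phrase)}\b", text) is not None


def classify_email_type(subject: str, body: str) -> str:
    """Classify: support, sales, meeting, general."""
    text = f"{subject}\n{body}".lower()
    scores = {kind: sum(_contains(text, k) for k in words)
              for kind, words in TYPE_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "general"


def calculate_urgency(subject: str, body: str) -> str:
    """Determine: critical, high, medium, low."""
    text = f"{subject}\n{body}".lower()
    for level, words in URGENCY_KEYWORDS:
        if any(_contains(text, k) for k in words):
            return level
    return "low"
```

A rule-based baseline like this also gives a cheap fallback when the local model is unavailable.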

LLM Prompt for Action Item Extraction:

# Literal braces in the JSON example are doubled so that
# PROMPT.format(subject=..., body=...) does not treat them as placeholders.
PROMPT = """
Extract action items from this email:

Subject: {subject}
Body: {body}

Return JSON array with:
- task: Brief description
- assignee: Who should do it (if mentioned)
- due_date: Deadline (if mentioned)
- priority: critical/high/medium/low

Example:
[
  {{
    "task": "Update firewall rules for Acme Corp",
    "assignee": "John",
    "due_date": "2025-11-20",
    "priority": "high"
  }}
]
"""
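Local models often wrap the requested JSON in extra prose or code fences, so the reply needs defensive parsing before it becomes HubSpot tasks. A minimal sketch, assuming the reply contains at most one JSON array; parse_action_items and its defaults (missing priority becomes "medium") are illustrative choices, not existing code.

```python
import json
import re

ALLOWED_PRIORITIES = {"critical", "high", "medium", "low"}


def parse_action_items(llm_output: str) -> list:
    """Pull the first JSON array out of an LLM reply and normalise each item."""
    match = re.search(r"\[.*\]", llm_output, re.DOTALL)
    if not match:
        return []
    try:
        raw_items = json.loads(match.group(0))
    except json.JSONDecodeError:
        return []  # Unparseable reply: create no tasks rather than bad ones
    items = []
    for raw in raw_items:
        # Skip entries that are not objects or lack a task description
        if not isinstance(raw, dict) or not raw.get("task"):
            continue
        priority = str(raw.get("priority", "medium")).lower()
        items.append({
            "task": str(raw["task"]).strip(),
            "assignee": raw.get("assignee") or None,
            "due_date": raw.get("due_date") or None,
            "priority": priority if priority in ALLOWED_PRIORITIES else "medium",
        })
    return items
```

Returning an empty list on any parse failure keeps a bad model reply from blocking the rest of the sync.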


Phase 2: Smart CRM Sync Engine

File: tools/hubspot_email_sync.py

Key Classes:

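import time
from typing import Dict

from hubspot import HubSpot  # Provided by the hubspot-api-client package


class HubSpotEmailSyncer: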
    def __init__(self, access_token: str, dry_run: bool = False):
        self.client = HubSpot(access_token=access_token)
        self.dry_run = dry_run
        self.rate_limiter = RateLimiter(max_req=100, window=10)  # 100/10s

    def sync_contact(self, email_data: Dict) -> str:
        """Create or update contact from email data."""
        contact_id = self._find_contact(email_data['email'])
        if contact_id:
            return self._update_contact(contact_id, email_data)
        else:
            return self._create_contact(email_data)

    def sync_company(self, domain: str, company_name: str) -> str:
        """Create or update company from domain."""
        pass

    def link_contact_to_company(self, contact_id: str, company_id: str):
        """Associate contact with company."""
        pass

    def create_engagement(self, contact_id: str, email_data: Dict):
        """Log email as engagement in HubSpot."""
        pass

    def update_lifecycle_stage(self, contact_id: str, new_stage: str):
        """Update lifecycle stage with backwards movement handling."""
        # Clear stage first if moving backwards
        current_stage = self._get_lifecycle_stage(contact_id)
        if self._is_backwards_movement(current_stage, new_stage):
            self._clear_lifecycle_stage(contact_id)
            time.sleep(0.5)  # Ensure clear completes
        self._set_lifecycle_stage(contact_id, new_stage)
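update_lifecycle_stage relies on _is_backwards_movement, which is not shown above. One way to implement the check is to compare positions in an ordered stage list, as sketched below. The STAGE_ORDER values follow the funnel stages named in this plan and are assumptions: HubSpot's internal stage identifiers, and any custom stages such as Active, Inactive, or Churned, may be named and ordered differently in the portal. Unknown stages are treated conservatively and take the clear-then-set path.

```python
from typing import Optional

# Assumption: funnel order taken from this plan's transition rules
STAGE_ORDER = ["lead", "mql", "sql", "opportunity", "customer"]


def is_backwards_movement(current_stage: Optional[str], new_stage: str) -> bool:
    """Return True when the stage must be cleared before it can be set."""
    if not current_stage:
        return False  # Nothing set yet: any stage can be written directly
    try:
        current_pos = STAGE_ORDER.index(current_stage.lower())
        new_pos = STAGE_ORDER.index(new_stage.lower())
    except ValueError:
        # Stage outside the known funnel: clear first to be safe
        return True
    return new_pos < current_pos
```

Setting the same stage again returns False, so repeated syncs of an unchanged contact do not trigger a needless clear.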

Rate Limiting Implementation:

import asyncio
import time
from collections import deque


class RateLimiter:
    def __init__(self, max_req: int, window: float):
        self.max_req = max_req
        self.window = window
        self.requests = deque()  # Monotonic timestamps of recent requests

    async def acquire(self):
        """Wait until a request slot is free in the sliding window."""
        while True:
            now = time.monotonic()
            # Drop timestamps that have left the window
            while self.requests and now - self.requests[0] >= self.window:
                self.requests.popleft()
            if len(self.requests) < self.max_req:
                break
            sleep_time = self.window - (now - self.requests[0])
            print(f"Rate limit reached. Sleeping {sleep_time:.2f}s...")
            await asyncio.sleep(sleep_time)

        # Record when the slot was actually taken, not when waiting began
        self.requests.append(time.monotonic())

Exponential Backoff for 429 Errors:

async def make_request_with_retry(self, func, *args, max_retries=5, **kwargs):
    """Execute HubSpot API call with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return await func(*args, **kwargs)
        except ApiException as e:
            if e.status == 429:  # Rate limited
                if attempt >= max_retries - 1:
                    raise
                delay = 2 ** attempt  # 1s, 2s, 4s, 8s (the final attempt re-raises instead of sleeping)
                print(f"Rate limited. Retry {attempt+1}/{max_retries} after {delay}s...")
                await asyncio.sleep(delay)
            else:
                raise
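To see what the retry loop above costs in the worst case, the delay schedule can be computed separately. The sketch below mirrors the 2 ** attempt rule; the cap and jitter parameters are additions that are not part of the original loop, shown as optional knobs for avoiding synchronized retries.

```python
import random


def backoff_delays(max_retries: int = 5, base: float = 1.0,
                   cap: float = 30.0, jitter: float = 0.0) -> list:
    """Delays slept between attempts; the final attempt raises instead of sleeping."""
    delays = []
    for attempt in range(max_retries - 1):
        delay = min(cap, base * 2 ** attempt)
        # Optional jitter: add up to `jitter` times the delay (0.0 disables it)
        delay += random.uniform(0, jitter * delay)
        delays.append(delay)
    return delays
```

With the defaults this gives four sleeps totaling 15 seconds before the fifth attempt either succeeds or raises.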


Phase 3: Client Health Monitor

File: tools/client_health_monitor.py

Health Score Algorithm:

from datetime import datetime
from typing import Dict


def calculate_health_score(company: Dict) -> int:
    """Calculate 0-100 health score for client."""
    score = 100

    # Days since last contact (max penalty: -40 points)
    days_since_contact = (datetime.now() - company['last_contact_date']).days
    if days_since_contact > 90:
        score -= 40
    elif days_since_contact > 60:
        score -= 30
    elif days_since_contact > 30:
        score -= 20
    elif days_since_contact > 14:
        score -= 10

    # Unresolved critical issues (max penalty: -30 points)
    critical_issues = company.get('critical_issues', 0)
    score -= min(critical_issues * 15, 30)

    # Response time to emails (max penalty: -20 points)
    avg_response_time_hours = company.get('avg_response_time_hours', 0)
    if avg_response_time_hours > 48:
        score -= 20
    elif avg_response_time_hours > 24:
        score -= 10
    elif avg_response_time_hours > 8:
        score -= 5

    # Communication frequency (max penalty: -10 points)
    emails_this_month = company.get('emails_this_month', 0)
    if emails_this_month == 0:
        score -= 10
    elif emails_this_month < 2:
        score -= 5

    return max(0, score)  # Floor at 0
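calculate_health_score subtracts company['last_contact_date'] from a datetime, so the value must already be a datetime object. CRM property values typically arrive as strings, so a conversion step is needed first. The sketch below assumes the value is either an ISO 8601 timestamp or an epoch-milliseconds string; verify the actual format returned for the property used in this portal. Both helper names are hypothetical.

```python
from datetime import datetime, timezone


def parse_hubspot_datetime(value):
    """Convert a property string to an aware UTC datetime, or None if empty."""
    if not value:
        return None
    text = str(value).strip()
    if text.isdigit():  # Epoch milliseconds
        return datetime.fromtimestamp(int(text) / 1000, tz=timezone.utc)
    if text.endswith("Z"):  # fromisoformat() rejects "Z" before Python 3.11
        text = text[:-1] + "+00:00"
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # Assumption: naive values are UTC
    return parsed.astimezone(timezone.utc)


def days_since(value, now=None):
    """Whole days between the stored timestamp and now, or None if unset."""
    last = parse_hubspot_datetime(value)
    if last is None:
        return None
    now = now or datetime.now(timezone.utc)
    return (now - last).days
```

Returning None for a missing date lets the caller decide how to score companies that have never been contacted, instead of raising inside the scoring function.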

At-Risk Alert Email:

from html import escape
from typing import Dict, List


def generate_alert_email(at_risk_clients: List[Dict]) -> str:
    """Generate HTML email with at-risk client list."""
    # Escape company names so stray markup in CRM data cannot break the table
    rows = ''.join(
        f"<tr><td>{escape(str(c['name']))}</td><td>{c['days']}</td>"
        f"<td>{c['score']}</td><td>{c['issues']}</td></tr>"
        for c in at_risk_clients
    )
    return f"""
    <html>
    <body>
        <h2>🚨 Client Health Alert</h2>
        <p>The following clients require attention:</p>
        <table>
            <tr>
                <th>Company</th>
                <th>Days Since Contact</th>
                <th>Health Score</th>
                <th>Critical Issues</th>
            </tr>
            {rows}
        </table>
        <p>Review and reach out to prevent churn.</p>
    </body>
    </html>
    """


Testing Strategy

Unit Tests

# Test email parsing
pytest tests/test_email_intelligence_extractor.py

# Test CRM sync logic (with mocked HubSpot API)
pytest tests/test_hubspot_email_sync.py

# Test health score calculations
pytest tests/test_client_health_monitor.py

Integration Tests

# Test against HubSpot sandbox portal
HUBSPOT_ACCESS_TOKEN=<sandbox_token> pytest tests/integration/test_hubspot_sync.py

# Test full email → CRM workflow
uv run python3 tests/integration/test_full_workflow.py

Manual Testing Checklist

Email Extraction (Phase 1)

  • [ ] Parse 10 sample IT Raven emails
  • [ ] Verify correct contact/company extraction
  • [ ] Check action item extraction quality

CRM Sync (Phase 2)

  • [ ] Dry-run sync 50 emails (no CRM changes)
  • [ ] Live sync 10 emails to test portal
  • [ ] Verify contacts created correctly
  • [ ] Verify companies linked to contacts
  • [ ] Verify engagements logged

Health Monitoring (Phase 3)

  • [ ] Run health check on all companies
  • [ ] Verify at-risk client identification
  • [ ] Test alert email generation

Lifecycle Stages (Phase 4)

  • [ ] Test forwards movement (Lead → MQL)
  • [ ] Test backwards movement (Customer → Inactive)
  • [ ] Verify workflows triggered correctly

Weekly Reports (Phase 5)

  • [ ] Generate test report for past week
  • [ ] Verify data accuracy
  • [ ] Test email delivery

Deployment

LaunchAgents Setup

1. Hourly Email Sync

File: ~/Library/LaunchAgents/com.itraven.hubspot.email_sync.plist

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>Label</key>
    <string>com.itraven.hubspot.email_sync</string>
    <key>ProgramArguments</key>
    <array>
        <string>/Users/bertfrichot/.local/bin/uv</string>
        <string>run</string>
        <string>python3</string>
        <string>/Users/bertfrichot/mem-agent-mcp/tools/hubspot_email_sync.py</string>
    </array>
    <key>StartInterval</key>
    <integer>3600</integer> <!-- Every hour -->
    <key>StandardOutPath</key>
    <string>/tmp/hubspot_email_sync.log</string>
    <key>StandardErrorPath</key>
    <string>/tmp/hubspot_email_sync.error.log</string>
</dict>
</plist>

2. Daily Health Check

File: ~/Library/LaunchAgents/com.itraven.hubspot.health_monitor.plist

<!-- Similar structure, runs daily at 8am -->
<key>StartCalendarInterval</key>
<dict>
    <key>Hour</key>
    <integer>8</integer>
    <key>Minute</key>
    <integer>0</integer>
</dict>

3. Weekly Intelligence Report

File: ~/Library/LaunchAgents/com.itraven.hubspot.weekly_report.plist

<!-- Runs Mondays at 8am -->
<key>StartCalendarInterval</key>
<dict>
    <key>Weekday</key>
    <integer>1</integer> <!-- Monday -->
    <key>Hour</key>
    <integer>8</integer>
    <key>Minute</key>
    <integer>0</integer> <!-- Omitted keys act as wildcards, so pin the minute -->
</dict>
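Since the daily and weekly plists are only shown as fragments, one option is to generate the complete files with Python's plistlib rather than hand-editing XML. The sketch below builds the weekly report job. The log paths are assumptions modeled on the email sync job, and the uv and script paths are passed in by the caller. Minute is set explicitly because launchd treats omitted StartCalendarInterval keys as wildcards.

```python
import plistlib


def build_weekly_report_plist(uv_path: str, script_path: str) -> bytes:
    """Return the XML plist bytes for the Monday 8:00 weekly report job."""
    job = {
        "Label": "com.itraven.hubspot.weekly_report",
        "ProgramArguments": [uv_path, "run", "python3", script_path],
        # Weekday 1 = Monday; Minute pinned so the job fires once, not every minute
        "StartCalendarInterval": {"Weekday": 1, "Hour": 8, "Minute": 0},
        # Assumption: log locations follow the email sync job's /tmp convention
        "StandardOutPath": "/tmp/hubspot_weekly_report.log",
        "StandardErrorPath": "/tmp/hubspot_weekly_report.error.log",
    }
    return plistlib.dumps(job)
```

Writing the returned bytes to ~/Library/LaunchAgents/com.itraven.hubspot.weekly_report.plist yields a file that can be loaded with launchctl like the others.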

Load LaunchAgents:

launchctl load ~/Library/LaunchAgents/com.itraven.hubspot.email_sync.plist
launchctl load ~/Library/LaunchAgents/com.itraven.hubspot.health_monitor.plist
launchctl load ~/Library/LaunchAgents/com.itraven.hubspot.weekly_report.plist


Success Metrics

Quantitative Metrics

Time Savings:

  • Manual CRM entry: 10 hrs/week → 0 hrs/week
  • ROI: about $2,170/month (10 hrs/week × $50/hr contractor rate × 52 weeks ÷ 12 months)

Data Quality:

  • Email tracking: 0% → 100%
  • Contact completeness: 60% → 95%
  • Company linkage accuracy: 70% → 98%

Client Retention:

  • At-risk client identification: 0 → 15-20 clients/month
  • Churn reduction: TBD (measure after 3 months)

Qualitative Metrics

  • [ ] IT Raven team no longer manually enters CRM data
  • [ ] Proactive outreach to at-risk clients increases
  • [ ] Client health trends visible in weekly reports
  • [ ] Lifecycle stages accurately reflect client status
  • [ ] Action items from emails don't get lost

Next Steps

Immediate Actions (Next 2 weeks)

  1. Week 1: Implement Phases 1-2 (Email extraction + CRM sync)
  2. Week 2: Implement Phases 3-5 (Health monitoring + reporting)

Phase 1 First Task

Start with tools/email_intelligence_extractor.py:

# Create the extractor
touch tools/email_intelligence_extractor.py
chmod +x tools/email_intelligence_extractor.py

# Test on 10 sample emails
uv run python3 tools/email_intelligence_extractor.py --limit 10 --verbose

Questions for User

  1. Client Prioritization: Should certain clients (e.g., top revenue) get higher health score weight?
  2. Action Item Assignment: Should we auto-assign tasks to specific IT Raven team members?
  3. Alert Thresholds: Is 30 days the right threshold for "at-risk" status?
  4. Report Recipients: Who should receive the weekly intelligence reports?

Document Version: 1.0
Last Updated: 2025-11-13
Author: Claude Code (with Bert Frichot)
Implementation Guide Source: Medium article by @Saurabh Rai

Ready to start implementation? Run: uv run python3 tools/email_intelligence_extractor.py --help