Catalyst Data Integration Status¶
Date: 2025-11-03
Session: Post-phase 3 & 4 implementation
Summary¶
Implemented a catalyst data provider framework for real-world catalyst scoring. The earnings calendar is now working via yfinance. The SEC Edgar and FDA integrations require additional work.
✅ Completed: Earnings Calendar (yfinance)¶
Status: ✅ WORKING
File: tools/catalyst_data_providers.py
API: yfinance (free, no key required)
Features:
- Next earnings date for any symbol
- EPS estimates (high/low/average)
- Revenue estimates
- Days-to-earnings calculator
- 24-hour cache to reduce API calls
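The 24-hour cache in the feature list can be pictured with a small time-to-live cache. This is a minimal sketch, not the code in tools/catalyst_data_providers.py: the `TTLCache` class, its `get_or_fetch` method, and the injectable `clock` parameter are all hypothetical names chosen for illustration.

```python
import time
from typing import Any, Callable, Dict, Tuple

CACHE_TTL_SECONDS = 24 * 60 * 60  # 24 hours, matching the feature list


class TTLCache:
    """Tiny in-memory cache whose entries expire after a fixed time-to-live.

    Hypothetical helper for illustration; the real provider may cache differently.
    """

    def __init__(self, ttl_seconds: float = CACHE_TTL_SECONDS,
                 clock: Callable[[], float] = time.time) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without waiting
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[], Any]) -> Any:
        """Return the cached value for key, calling fetch() only if missing or stale."""
        now = self._clock()
        cached = self._entries.get(key)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        value = fetch()
        self._entries[key] = (now, value)
        return value
```

Keying the cache by symbol would mean a repeated lookup for the same ticker within a day never reaches yfinance, which is the stated goal of reducing API calls.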
Test Results:
PLTR: Earnings in -1 days (just reported)
AAPL: Earnings in 86 days (Jan 29, 2026)
TSLA: Earnings in 85 days (Jan 28, 2026)
Usage:
from datetime import datetime
from tools.catalyst_data_providers import EarningsCalendarProvider

provider = EarningsCalendarProvider()
# Next earnings date and EPS estimate for one symbol
earnings = provider.get_earnings_data('AAPL')
print(f"Next earnings: {earnings.next_earnings_date}")
print(f"EPS estimate: ${earnings.estimated_eps}")
# Days from now until the next report (negative if it has already passed)
days_to_earnings = provider.get_days_to_earnings('AAPL', datetime.now())
print(f"Days until earnings: {days_to_earnings}")
Catalyst Scoring:
- Earnings Beat (25 points): Earnings in 30-60 days + >10% surprise expected
- Pre-Earnings Risk: Reduce position 50% at 7 days before earnings
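The two earnings rules can be written down as small pure functions. This is a sketch under stated assumptions, not the logic in strategy_agent.py or position_management_agent.py: the function names are hypothetical, the expected surprise is assumed to arrive as a percentage, and "at 7 days before earnings" is read as "once earnings are 7 or fewer days ahead".

```python
from datetime import date
from typing import Optional

EARNINGS_BEAT_POINTS = 25
PRE_EARNINGS_REDUCTION_DAYS = 7
PRE_EARNINGS_REDUCTION_FRACTION = 0.5


def days_to_earnings(next_earnings: date, today: date) -> int:
    """Calendar days until the report; negative once the date has passed."""
    return (next_earnings - today).days


def earnings_beat_points(days: Optional[int], expected_surprise_pct: float) -> int:
    """Award 25 points when earnings are 30-60 days out and the expected surprise exceeds 10%."""
    if days is None:
        return 0
    if 30 <= days <= 60 and expected_surprise_pct > 10.0:
        return EARNINGS_BEAT_POINTS
    return 0


def position_after_pre_earnings_rule(shares: int, days: Optional[int]) -> int:
    """Halve the position when earnings are 0-7 days ahead; otherwise leave it unchanged."""
    if days is not None and 0 <= days <= PRE_EARNINGS_REDUCTION_DAYS:
        return int(shares * (1 - PRE_EARNINGS_REDUCTION_FRACTION))
    return shares
```

The reduction function is stateless, so a caller would need to record that the cut has already been applied; otherwise running it daily inside the 7-day window would halve the position repeatedly.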
⏳ Pending: SEC Edgar Integration¶
Status: ⏳ FRAMEWORK CREATED, NEEDS IMPLEMENTATION
API: SEC Edgar (free, no key required)
Rate Limit: 10 requests/second
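Staying under the 10 requests/second limit can be handled by spacing calls at least 0.1 seconds apart. The sketch below is one simple way to do that; `RequestThrottle` is a hypothetical helper, not part of the existing SECEdgarProvider. SEC also asks automated clients to send a descriptive User-Agent header, which a real client would set alongside the throttle.

```python
import time
from typing import Callable, Optional

SEC_MAX_REQUESTS_PER_SECOND = 10


class RequestThrottle:
    """Spaces calls so that consecutive requests are at least 1/max_per_second apart."""

    def __init__(self, max_per_second: float = SEC_MAX_REQUESTS_PER_SECOND,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self._min_interval = 1.0 / max_per_second
        self._clock = clock    # injectable for testing
        self._sleep = sleep    # injectable for testing
        self._last_call: Optional[float] = None

    def wait(self) -> None:
        """Block until enough time has passed since the previous call."""
        now = self._clock()
        if self._last_call is not None:
            remaining = self._min_interval - (now - self._last_call)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_call = now
```

Calling `throttle.wait()` immediately before each HTTP request keeps a single-threaded client under the limit; a multi-threaded client would additionally need a lock around `wait()`.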
Form 4 - Insider Trading¶
What We Need:
- Parse Form 4 XML filings
- Extract transaction data (buy/sell, shares, price)
- Calculate net buying over 90 days
- Determine if buying is >10% of shares outstanding
Catalyst Scoring:
- Insider Buying (15 points): >10% shares bought by insiders in 90 days
Implementation Steps:
1. Get CIK number for symbol (SEC lookup)
2. Fetch recent Form 4 filings (last 90 days)
3. Parse XML to extract transactions
4. Sum net shares bought
5. Compare to shares outstanding
Complexity: Medium (XML parsing, need shares outstanding data)
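Steps 3 to 5 above can be sketched with the standard library XML parser. This is an illustration under assumptions, not the pending implementation: the sample document is a heavily trimmed stand-in for a Form 4 ownership document, only open-market purchases (code P) and sales (code S) in the non-derivative table are counted, the 90-day filter is assumed to happen when filings are selected, and all function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Iterable, List

# Trimmed, made-up sample shaped like a Form 4 ownership document.
SAMPLE_FORM4 = """
<ownershipDocument>
  <nonDerivativeTable>
    <nonDerivativeTransaction>
      <transactionCoding><transactionCode>P</transactionCode></transactionCoding>
      <transactionAmounts>
        <transactionShares><value>1000</value></transactionShares>
        <transactionPricePerShare><value>4.25</value></transactionPricePerShare>
      </transactionAmounts>
    </nonDerivativeTransaction>
    <nonDerivativeTransaction>
      <transactionCoding><transactionCode>S</transactionCode></transactionCoding>
      <transactionAmounts>
        <transactionShares><value>400</value></transactionShares>
        <transactionPricePerShare><value>4.80</value></transactionPricePerShare>
      </transactionAmounts>
    </nonDerivativeTransaction>
  </nonDerivativeTable>
</ownershipDocument>
"""


@dataclass
class InsiderTransaction:
    code: str      # "P" = open-market purchase, "S" = open-market sale
    shares: float
    price: float


def parse_form4(xml_text: str) -> List[InsiderTransaction]:
    """Extract non-derivative transactions (code, shares, price) from Form 4 XML."""
    root = ET.fromstring(xml_text.strip())
    transactions = []
    for tx in root.iter("nonDerivativeTransaction"):
        code = tx.findtext("transactionCoding/transactionCode", default="") or ""
        shares = tx.findtext("transactionAmounts/transactionShares/value") or "0"
        price = tx.findtext("transactionAmounts/transactionPricePerShare/value") or "0"
        transactions.append(InsiderTransaction(code.strip(), float(shares), float(price)))
    return transactions


def net_shares_bought(transactions: Iterable[InsiderTransaction]) -> float:
    """Purchases minus sales; other transaction codes (grants, gifts) are ignored."""
    net = 0.0
    for tx in transactions:
        if tx.code == "P":
            net += tx.shares
        elif tx.code == "S":
            net -= tx.shares
    return net


def insider_buying_points(net_shares: float, shares_outstanding: float) -> int:
    """Award 15 points when net insider buying exceeds 10% of shares outstanding."""
    if shares_outstanding <= 0:
        return 0
    return 15 if net_shares / shares_outstanding > 0.10 else 0
```

Ignoring grant and option-exercise codes is a deliberate choice in this sketch, since those transactions say less about insider conviction than open-market purchases do.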
Schedule 13D - Activist Investors¶
What We Need:
- Detect recent 13D filings (180-day lookback)
- Extract filer name and stake percentage
- Flag if activist investor (>5% stake)
Catalyst Scoring:
- M&A Activity (20 points): Recent 13D filing or M&A rumors
Implementation Steps:
1. Get CIK number for symbol
2. Fetch Schedule 13D filings (last 180 days)
3. Parse filing to get investor name and stake
4. Flag if stake >5% (activist threshold)
Complexity: Medium (filing detection, stake extraction)
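The detection step (step 2) can be sketched as a filter over a company's recent filing index. The sketch assumes the index arrives as parallel lists keyed `form`, `filingDate`, and `accessionNumber`, which is how the SEC submissions JSON lays out its recent filings; the set of form-type strings that mark a Schedule 13D is an assumption to verify against live data, and the sample records are made up. Extracting the filer name and stake (step 3) is not covered here.

```python
from datetime import date, timedelta
from typing import Dict, List, Sequence

# Assumed form-type labels for Schedule 13D and its amendments; verify against real filings.
SCHEDULE_13D_FORMS = ("SC 13D", "SC 13D/A", "SCHEDULE 13D", "SCHEDULE 13D/A")


def recent_13d_filings(recent: Dict[str, List[str]], today: date,
                       lookback_days: int = 180,
                       forms: Sequence[str] = SCHEDULE_13D_FORMS) -> List[Dict[str, str]]:
    """Return 13D filings dated within the lookback window, in index order."""
    cutoff = today - timedelta(days=lookback_days)
    hits = []
    for form, filed, accession in zip(recent["form"], recent["filingDate"],
                                      recent["accessionNumber"]):
        if form in forms and date.fromisoformat(filed) >= cutoff:
            hits.append({"form": form, "filingDate": filed,
                         "accessionNumber": accession})
    return hits


# Made-up index entries for illustration only.
SAMPLE_RECENT = {
    "form": ["8-K", "SC 13D", "SC 13D", "10-Q"],
    "filingDate": ["2025-10-20", "2025-09-15", "2025-01-10", "2025-08-05"],
    "accessionNumber": ["example-0001", "example-0002", "example-0003", "example-0004"],
}

if __name__ == "__main__":
    for filing in recent_13d_filings(SAMPLE_RECENT, today=date(2025, 11, 3)):
        print(filing["form"], filing["filingDate"])
```

Keeping the filter separate from the HTTP fetch means the lookback logic can be tested offline, with network access and rate limiting handled elsewhere.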
⏳ Pending: FDA API Integration¶
Status: ⏳ RESEARCH NEEDED
API: FDA openFDA (free, no key required)
Rate Limit: 240 requests/minute
PDUFA Dates (FDA Decision Dates)¶
What We Need:
- Map stock symbols to drug candidates
- Get FDA action dates (PDUFA calendar)
- Estimate approval probability
Catalyst Scoring:
- FDA Approval (25 points): PDUFA date within 90 days, >50% approval probability
Challenges:
1. Symbol-to-drug mapping: No direct API - requires manual database
2. PDUFA calendar: FDA doesn't have public API for this
3. Approval probability: Typically from biotech analysts (paid data)
Alternative Data Sources:
- Biopharmcatalyst.com (has PDUFA calendar, may need scraping)
- FDA's "Drugs@FDA" database (approvals after the fact, not predictions)
- Biotech-focused data providers (paid: $500-2000/month)
Recommendation:
- Skip for now - Most complex integration
- Focus on stocks with clear catalysts (earnings, M&A, insider buying)
- Add manually for specific biotech plays if needed
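If specific biotech plays are added by hand, the FDA rule needs nothing more than a small lookup table. The sketch below assumes a manually maintained mapping from symbol to PDUFA date and approval probability; the `PdufaEntry` type, the table, and its single placeholder row are hypothetical and do not describe any real drug or ticker.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict

FDA_APPROVAL_POINTS = 25
PDUFA_WINDOW_DAYS = 90
MIN_APPROVAL_PROBABILITY = 0.5


@dataclass(frozen=True)
class PdufaEntry:
    drug_name: str
    pdufa_date: date
    approval_probability: float  # 0.0 to 1.0, entered by hand


# Placeholder row only; "XYZ" and "example-drug" are not real.
MANUAL_PDUFA_TABLE: Dict[str, PdufaEntry] = {
    "XYZ": PdufaEntry("example-drug", date(2026, 1, 15), 0.6),
}


def fda_approval_points(symbol: str, today: date,
                        table: Dict[str, PdufaEntry] = MANUAL_PDUFA_TABLE) -> int:
    """Award 25 points when a PDUFA date is 0-90 days ahead with >50% approval probability."""
    entry = table.get(symbol)
    if entry is None:
        return 0
    days = (entry.pdufa_date - today).days
    if 0 <= days <= PDUFA_WINDOW_DAYS and entry.approval_probability > MIN_APPROVAL_PROBABILITY:
        return FDA_APPROVAL_POINTS
    return 0
```

Symbols missing from the table simply score zero, so the rest of the catalyst stack is unaffected until an entry is added.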
Current Catalyst Scoring (With Earnings Data)¶
Fully Functional (3 catalysts)¶
- ✅ Earnings Beat (25 pts) - yfinance provides earnings dates
- ✅ Technical Breakout (10 pts) - From price/volume data
- ✅ Analyst Upgrade (inferred) (15 pts) - Strong rally + volume
Partially Functional (2 catalysts)¶
- ⏳ M&A Activity (inferred) (20 pts) - Volume spikes (needs 13D for confirmation)
- ⏳ Insider Buying (inferred) (15 pts) - Accumulation patterns (needs Form 4 for confirmation)
Not Yet Functional (2 catalysts)¶
- ❌ Product Launch (15 pts) - Needs press release aggregator or manual tracking
- ❌ FDA Approval (25 pts) - Needs PDUFA calendar + drug-to-symbol mapping
Total Available: 70 points (with current data)¶
Entry Threshold: 50 points (requires 2-3 catalysts)
Realistic Scenarios:
- Earnings + Technical + Analyst = 50 pts ✅ (CAN ENTER)
- Earnings + M&A + Analyst = 60 pts ✅ (CAN ENTER)
- Earnings + Technical + M&A = 55 pts ✅ (CAN ENTER)
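The scenario arithmetic above can be reproduced with a point table and a threshold check. This is a minimal sketch of the scoring idea, not the actual score_catalyst_stack() method; the dictionary keys are hypothetical labels for the seven catalyst types, while the point values and the 50-point threshold come from this document.

```python
from typing import Iterable

# Point values as listed in this document; key names are illustrative.
CATALYST_POINTS = {
    "earnings_beat": 25,
    "technical_breakout": 10,
    "analyst_upgrade": 15,
    "ma_activity": 20,
    "insider_buying": 15,
    "product_launch": 15,
    "fda_approval": 25,
}
ENTRY_THRESHOLD = 50


def stack_score(active: Iterable[str]) -> int:
    """Sum the points of every catalyst currently detected for a symbol."""
    return sum(CATALYST_POINTS[name] for name in active)


def can_enter(active: Iterable[str]) -> bool:
    """A position may be opened once the stacked score reaches the threshold."""
    return stack_score(active) >= ENTRY_THRESHOLD


if __name__ == "__main__":
    scenarios = [
        ("earnings_beat", "technical_breakout", "analyst_upgrade"),
        ("earnings_beat", "ma_activity", "analyst_upgrade"),
        ("earnings_beat", "technical_breakout", "ma_activity"),
    ]
    for scenario in scenarios:
        print(" + ".join(scenario), "=", stack_score(scenario), can_enter(scenario))
```

With these values, earnings plus a technical breakout alone score 35 points and stay below the threshold, which is why a third catalyst is usually needed.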
Next Steps¶
Short-Term (This Week)¶
- ✅ Earnings integration complete - Working and tested
- ⏳ Update backtest to use real earnings data
- ⏳ Test pre-earnings reduction logic (reduce 50% at 7 days)
- ⏳ Document what works vs what needs more APIs
Medium-Term (Next 2 Weeks)¶
- SEC Edgar Form 4 - Insider trading data
  - Research Python libraries (sec-edgar-downloader, sec-api)
  - Parse Form 4 XML
  - Calculate net buying
- SEC Edgar 13D - Activist investors
  - Search for Schedule 13D filings
  - Extract filer and stake
  - Flag activist positions
Long-Term (Future)¶
- FDA Integration - If focusing on biotech
  - Build symbol-to-drug mapping database
  - Scrape PDUFA calendar
  - Add approval probability estimates
- Alternative Catalysts
  - Press release monitoring (product launches)
  - Patent approvals (USPTO API)
  - Clinical trial results (clinicaltrials.gov API)
Recommendation: Deploy Momentum Strategy First¶
Given the complexity of full catalyst integration:
Option 1: Deploy Momentum Strategy (Oct 31)
- ✅ Proven (Sharpe 0.49, Win Rate 52.8%)
- ✅ No external APIs needed
- ✅ Ready for paper trading NOW
- ✅ File: micro_cap.yaml already configured
Option 2: Use Hybrid Approach
- Deploy momentum strategy as base
- Add earnings calendar overlay (pre-earnings reduction)
- Gradually add SEC/FDA as APIs are integrated
- Best of both worlds
Option 3: Wait for Full Catalyst Integration
- Complete SEC Edgar (Form 4 + 13D)
- Add FDA if targeting biotech
- Re-run full backtest
- Could take 2-4 weeks
My Recommendation: Option 2 (Hybrid)
- Start with proven momentum strategy
- Add earnings-aware risk management (we have this now!)
- Reduce positions 50% at 7 days before earnings
- Gradually layer in SEC/FDA as time permits
Files Created/Modified¶
Created¶
- tools/catalyst_data_providers.py (424 lines)
  - EarningsCalendarProvider (✅ working)
  - SECEdgarProvider (⏳ framework only)
  - FDAAPIProvider (⏳ framework only)
  - CatalystDataAggregator (unified interface)
Modified¶
- trading_agents/strategy_agent.py (+118 lines)
  - Added score_catalyst_stack() method
  - 7 catalyst types with scoring logic
- trading_agents/position_management_agent.py (+192 lines)
  - Dynamic 9% stops for micro-caps
  - _is_micro_cap() detection
  - Pre-earnings reduction (skeleton)
  - Trailing stops (skeleton)
  - Profit-taking logic (skeleton)
- tools/catalyst_stacking_backtest.py (820 lines)
  - Full backtest framework
  - Catalyst scoring integration
  - Ready to use real earnings data
Documentation¶
- docs/trading/MICRO_CAP_PHASE_3_4_COMPLETE.md (implementation summary)
- docs/trading/CATALYST_DATA_INTEGRATION_STATUS.md (this file)
Success Metrics¶
Earnings Integration: ✅ SUCCESS
- Working API calls
- Correct date parsing
- 24-hour caching
- Days-to-earnings calculator
SEC Edgar: ⏳ FRAMEWORK READY
- Need XML parsing implementation
- ~1-2 days of work per filing type
FDA API: ⏳ NEEDS RESEARCH
- More complex than others
- May not be worth effort for diversified micro-cap portfolio
- Better for biotech-focused strategies
Implementation Date: 2025-11-03
Status: Earnings ✅ Complete, SEC ⏳ Pending, FDA ⏳ Pending
Recommendation: Deploy momentum + earnings hybrid strategy