Genetic Strategy Optimizer Specification
Created: 2025-10-30
Author: Bert Frichot (with Claude Code)
Status: Draft
Target: Phase 1 of "Raven" AI Trading Agent
1. Executive Summary
Problem: Current trading strategies use manually-selected parameters (RSI threshold, stop loss %, position size, etc.). Finding optimal parameters requires dozens of manual backtests.
Solution: Implement genetic algorithm (NSGA-II) to automatically optimize strategy parameters across multiple objectives (Sharpe ratio, total return, max drawdown).
Inspiration: NexusTrade's Aurora agent uses genetic optimization to find optimal rebalancing strategies, generating 1200+ backtests in 15 seconds.
Expected Impact:
- 95% reduction in manual parameter tuning time
- Multi-objective optimization (not just maximizing returns)
- Discover non-obvious parameter combinations
- Validate robustness with training/validation splits
2. Technical Architecture
2.1 Algorithm Choice: NSGA-II
Why NSGA-II (Non-Dominated Sorting Genetic Algorithm II)?
- Multi-objective: Optimize Sharpe ratio AND total return AND max drawdown simultaneously
- Pareto frontier: Returns population of solutions with different trade-offs
- Proven: Used by NexusTrade, academic trading research
- Python library: Available via the pymoo package
Alternative Considered: Grid Search - Rejected: too slow (the grid grows exponentially with the number of parameters) and no multi-objective support
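The Pareto-dominance test at the heart of NSGA-II's non-dominated sorting can be shown in a few lines of plain Python. This is a conceptual sketch only (the real implementation comes from pymoo), and all names here are illustrative:

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b.

    Convention: every objective is minimized, so "maximize Sharpe"
    is encoded as "minimize -Sharpe".
    """
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Return the non-dominated subset of a list of objective vectors."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]

# Candidates as (-sharpe, -total_return, max_drawdown)
candidates = [(-2.0, -0.30, 0.10),   # high Sharpe, modest return
              (-1.5, -0.50, 0.18),   # lower Sharpe, higher return
              (-1.0, -0.20, 0.25)]   # dominated by the first candidate
print(pareto_front(candidates))      # keeps the first two trade-offs
```

NSGA-II applies this sorting generation after generation, which is why the final population approximates the whole Pareto frontier rather than converging on a single "best" point.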
2.2 Architecture Overview
┌─────────────────────────────────────────────────────────────┐
│ Strategy Optimizer │
├─────────────────────────────────────────────────────────────┤
│ │
│ 1. Parameter Space Definition │
│ - Define min/max/step for each parameter │
│ - Example: RSI threshold [20-40], stop loss [5%-15%] │
│ │
│ 2. Genetic Algorithm Engine (NSGA-II) │
│ ┌────────────────────────────────────────┐ │
│ │ Population: 20 individuals/generation │ │
│ │ Generations: 20 iterations │ │
│ │ Crossover: 0.9 probability │ │
│ │ Mutation: 0.1 probability │ │
│ └────────────────────────────────────────┘ │
│ │
│ 3. Fitness Evaluation (per individual) │
│ ┌────────────────────────────────────────┐ │
│ │ For each parameter set: │ │
│ │ - Run backtest on training data │ │
│ │ - Calculate 3 objectives: │ │
│ │ * Maximize: Sharpe ratio │ │
│ │ * Maximize: Total return % │ │
│ │ * Minimize: Max drawdown % │ │
│ └────────────────────────────────────────┘ │
│ │
│ 4. Selection & Evolution │
│ - Non-dominated sorting (Pareto ranking) │
│ - Tournament selection │
│ - Simulated binary crossover (SBX) │
│ - Polynomial mutation │
│ │
│ 5. Validation │
│ - Test best solutions on validation data │
│ - Detect overfitting (train vs. validation gap) │
│ │
│ 6. Output │
│ - Pareto frontier of solutions │
│ - Top 5 strategies by composite score │
│ - Visualization (Sharpe vs. Return vs. Drawdown) │
│ │
└─────────────────────────────────────────────────────────────┘
2.3 Data Flow
Input:
- Strategy class (e.g., MeanReversionStrategy)
- Parameter bounds (e.g., rsi_threshold: [20, 40])
- Historical data (2 years, split 70% train / 30% validation)
- Objective weights (default: equal weight)
Processing:
Generation 0: 20 random parameter sets
For generations 1-20:
1. Evaluate fitness (run backtest for each individual)
2. Non-dominated sorting (rank by Pareto dominance)
3. Selection (pick best performers)
4. Crossover (combine parameters)
5. Mutation (random parameter changes)
Output:
- Pareto frontier (list of non-dominated solutions)
- Best strategy (highest composite score on validation data)
- Optimization report (convergence plot, parameter distributions)
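The per-individual evaluation step in this flow reduces to a pure function from a flat decision vector to three objective values. The sketch below uses a stub in place of the real backtest engine; parameter names come from the Mean Reversion example, everything else is illustrative:

```python
# Parameter names match the Mean Reversion example in Section 4.
PARAM_NAMES = ["rsi_oversold", "rsi_overbought", "bb_std",
               "stop_loss_pct", "base_position_size", "max_positions"]

def decode(vector):
    """Map a flat GA decision vector to named strategy parameters."""
    return dict(zip(PARAM_NAMES, vector))

def evaluate(vector, run_backtest):
    """Return the three objectives for one individual.

    NSGA-II minimizes by convention, so maximized metrics are negated.
    """
    stats = run_backtest(decode(vector))
    return (-stats["sharpe"], -stats["total_return"], stats["max_drawdown"])

# Stub backtest standing in for the real engine
stub = lambda params: {"sharpe": 1.2, "total_return": 0.25, "max_drawdown": 0.12}
print(evaluate([28, 72, 2.2, 0.085, 650, 5], stub))  # (-1.2, -0.25, 0.12)
```

Because the GA only ever sees the objective tuple, the backtest engine can be swapped or vectorized without touching the optimizer.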
3. Implementation Plan
3.1 New Files to Create
tools/genetic_optimizer.py (Main optimizer)
- GeneticOptimizer class
- NSGA-II implementation via pymoo
- Fitness evaluation engine
- Result analysis and visualization
tools/optimizable_strategy.py (Base class)
- OptimizableStrategy abstract base class
- Defines parameter space interface
- Backtest execution wrapper
- Objective calculation (Sharpe, return, drawdown)
tools/mean_reversion_optimized.py (Example)
- Extends OptimizableStrategy
- Defines parameter bounds for Mean Reversion strategy
- Example usage of genetic optimizer
docs/guides/genetic-optimization-guide.md (Documentation)
- How to use the genetic optimizer
- How to create optimizable strategies
- Interpreting Pareto frontier results
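The objective calculation that OptimizableStrategy is responsible for (Sharpe, return, drawdown) can be sketched from an equity curve alone. A stdlib-only illustration, assuming a zero risk-free rate, a daily curve with at least three points, and a hypothetical function name:

```python
import math

def objectives_from_equity(equity, periods_per_year=252):
    """Compute (sharpe, total_return, max_drawdown) from a daily equity curve."""
    rets = [equity[i] / equity[i - 1] - 1 for i in range(1, len(equity))]
    mean = sum(rets) / len(rets)
    # Sample variance; annualize Sharpe by sqrt(trading days per year)
    var = sum((r - mean) ** 2 for r in rets) / (len(rets) - 1) if len(rets) > 1 else 0.0
    sharpe = mean / math.sqrt(var) * math.sqrt(periods_per_year) if var else 0.0
    total_return = equity[-1] / equity[0] - 1
    # Max drawdown: worst peak-to-trough decline, tracked in one pass
    peak, max_dd = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)
        max_dd = max(max_dd, (peak - value) / peak)
    return sharpe, total_return, max_dd
```

The real implementation would likely be vectorized with pandas/numpy, but the definitions are the same.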
3.2 Dependencies
New Python packages:
pymoo = "^0.6.1" # Multi-objective optimization
matplotlib = "^3.8.0" # Visualization
seaborn = "^0.13.0" # Statistical plots
Existing dependencies:
- alpaca-py (data fetching)
- pandas (data manipulation)
- numpy (numerical operations)
3.3 Implementation Phases
Phase 1.1: Core Optimizer (3 days)
- Implement GeneticOptimizer class
- Integrate pymoo NSGA-II algorithm
- Test with dummy fitness function
Phase 1.2: Strategy Base Class (2 days)
- Create OptimizableStrategy abstract class
- Define parameter space interface
- Implement backtest wrapper with train/validation split
Phase 1.3: Mean Reversion Example (2 days)
- Convert existing mean_reversion_backtest.py to optimizable
- Define parameter bounds (RSI, BB std, stop loss, position size)
- Run full optimization and document results
Phase 1.4: Visualization & Reporting (2 days)
- Create Pareto frontier plots
- Generate optimization report (PDF)
- Add convergence analysis
4. Parameter Spaces (Example: Mean Reversion)
parameter_space = {
    'rsi_oversold': {
        'type': 'int',
        'bounds': [20, 40],     # RSI oversold threshold
        'step': 1
    },
    'rsi_overbought': {
        'type': 'int',
        'bounds': [60, 80],     # RSI overbought threshold
        'step': 1
    },
    'bb_std': {
        'type': 'float',
        'bounds': [1.5, 3.0],   # Bollinger Band standard deviations
        'step': 0.1
    },
    'stop_loss_pct': {
        'type': 'float',
        'bounds': [0.05, 0.15], # Stop loss percentage
        'step': 0.01
    },
    'base_position_size': {
        'type': 'int',
        'bounds': [300, 1000],  # Position size in dollars
        'step': 50
    },
    'max_positions': {
        'type': 'int',
        'bounds': [3, 8],       # Max concurrent positions
        'step': 1
    }
}
Total combinations: ~7.0 million
Grid search time (assuming 1 sec per backtest): ~81 days
Genetic algorithm time (20 gen × 20 pop × 1 sec): ~6.7 minutes
Speedup: ~17,000× faster
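The grid-size arithmetic can be checked directly from the bounds and steps above (a quick stdlib sketch):

```python
# (low, high, step) per parameter, taken from the table above.
specs = {
    "rsi_oversold":       (20,   40,   1),
    "rsi_overbought":     (60,   80,   1),
    "bb_std":             (1.5,  3.0,  0.1),
    "stop_loss_pct":      (0.05, 0.15, 0.01),
    "base_position_size": (300,  1000, 50),
    "max_positions":      (3,    8,    1),
}

total = 1
for low, high, step in specs.values():
    total *= round((high - low) / step) + 1  # grid points on this axis
print(total)  # 6985440 exhaustive combinations (~7 million)
```

The genetic algorithm evaluates only 400 of these points (20 generations × 20 individuals), which is where the speedup comes from.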
5. Optimization Objectives
5.1 Primary Objectives (Multi-Objective)
Objective 1: Maximize Sharpe Ratio
- Measures risk-adjusted returns
- Preferred per your trading preferences
- Target: >1.0 (good), >2.0 (excellent)

Objective 2: Maximize Total Return
- Raw performance metric
- Can conflict with Sharpe (high return may come with high volatility)

Objective 3: Minimize Max Drawdown
- Worst peak-to-trough decline
- Critical per your risk management preferences
- Your target: <20%

5.2 Constraints (Hard Limits)
Must satisfy ALL constraints to be viable:
- Win rate ≥ 50% (per your trading preferences)
- Max drawdown ≤ 20% (per your trading preferences)
- Average trade P&L > 0
- At least 30 trades in the backtest period (statistical significance)
Strategies failing any constraint are assigned the worst possible fitness and eliminated early.
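That elimination rule can be sketched as a simple "death penalty": infeasible parameter sets get infinite objective values under the minimization convention. The stat keys below are hypothetical:

```python
WORST = (float("inf"), float("inf"), float("inf"))  # worst fitness when minimizing

def feasible(stats):
    """Hard constraints from Section 5.2 (keys are illustrative)."""
    return (stats["win_rate"] >= 0.50
            and stats["max_drawdown"] <= 0.20
            and stats["avg_trade_pnl"] > 0
            and stats["n_trades"] >= 30)

def penalized_objectives(stats):
    """Assign the worst possible fitness to infeasible parameter sets."""
    if not feasible(stats):
        return WORST
    return (-stats["sharpe"], -stats["total_return"], stats["max_drawdown"])
```

pymoo also supports declaring constraints explicitly on the problem definition; the death-penalty form above is just the most direct translation of "assigned worst fitness, eliminated early".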
6. Training/Validation Split Strategy
6.1 Time-Series Split (Recommended)
Training Data: first 70% of historical data
- Used for fitness evaluation during optimization
- Parameter tuning happens here

Validation Data: last 30% of historical data
- Used ONLY for final evaluation
- Tests generalization to unseen data

Example (Jan 2023 - Oct 2025 data):
- Training: Jan 2023 - Jul 2024 (19 months)
- Validation: Aug 2024 - Oct 2025 (15 months)
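A chronological split like the one above reduces to a single cut point, with no shuffling, so every validation bar is strictly later than every training bar (sketch; works on any ordered index):

```python
def time_series_split(index, train_frac=0.70):
    """Split an ordered sequence of bars/dates chronologically."""
    cut = int(len(index) * train_frac)
    return index[:cut], index[cut:]

# Illustrative ordered index of 1000 trading days
days = [f"day_{i:03d}" for i in range(1000)]
train, val = time_series_split(days)
print(len(train), len(val))  # 700 300
```

The "no shuffling" property is the whole point: a random split would leak future information into training and make validation results look better than they are.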
6.2 Walk-Forward Windows (Advanced)
Alternative approach (used by NexusTrade):
- Split the training data into 3 non-overlapping windows
- Evaluate fitness as the average across all 3 windows
- Reduces overfitting to a specific market regime

Example:
- Window 1: Jan 2023 - Jun 2023 (6 months)
- Window 2: Jul 2023 - Dec 2023 (6 months)
- Window 3: Jan 2024 - Jun 2024 (6 months)
- Validation: Jul 2024 - Oct 2025 (16 months)
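Partitioning the training range into equal, non-overlapping windows is equally small; fitness is then the average of the per-window backtests (sketch with hypothetical names):

```python
def walk_forward_windows(index, n_windows=3):
    """Partition an ordered training index into non-overlapping windows."""
    size = len(index) // n_windows
    return [index[i * size:(i + 1) * size] for i in range(n_windows)]

def windowed_fitness(index, run_backtest, n_windows=3):
    """Average a scalar objective across all walk-forward windows."""
    scores = [run_backtest(w) for w in walk_forward_windows(index, n_windows)]
    return sum(scores) / len(scores)
```

A parameter set that only shines in one regime is pulled down by its scores in the other windows, which is exactly the overfitting pressure this approach removes.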
6.3 Overfitting Detection
Red flags:
- Training Sharpe 2.5 vs. validation Sharpe 0.8 (huge gap = overfitting)
- Training return 80% vs. validation return 15% (not robust)
- Training drawdown 5% vs. validation drawdown 35% (fragile)

Acceptable ranges:
- Sharpe gap: ≤30% degradation (e.g., 2.0 → 1.4)
- Return gap: ≤50% degradation (e.g., 40% → 20%)
- Drawdown increase: ≤2× (e.g., 10% → 20%)
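The acceptable-range checks translate directly into code; the threshold constants mirror the bullets above, and the stat keys are illustrative:

```python
def overfit_flags(train, val):
    """Return red-flag booleans; True means a threshold is breached."""
    return {
        "sharpe":   val["sharpe"] < 0.70 * train["sharpe"],              # >30% degradation
        "return":   val["total_return"] < 0.50 * train["total_return"],  # >50% degradation
        "drawdown": val["max_drawdown"] > 2.0 * train["max_drawdown"],   # >2x increase
    }

train = {"sharpe": 2.0, "total_return": 0.40, "max_drawdown": 0.10}
val   = {"sharpe": 1.4, "total_return": 0.20, "max_drawdown": 0.20}
print(overfit_flags(train, val))  # all False: exactly at the acceptable limits
```

Any True flag would disqualify the candidate from deployment regardless of how good its training numbers look.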
7. Success Criteria
7.1 Phase 1 Success Metrics
Functional Requirements (must work):
- ✅ Optimizer completes 20 generations in <10 minutes
- ✅ Produces a Pareto frontier with 10+ non-dominated solutions
- ✅ Best solution passes all hard constraints
- ✅ Validation performance within the acceptable degradation range

Performance Requirements (must beat baseline):
- ✅ Optimized strategy Sharpe exceeds the manual strategy's Sharpe by ≥20%
- ✅ Optimized strategy max drawdown ≤ manual strategy drawdown
- ✅ Optimized strategy win rate ≥ 50%
Usability Requirements:
- ✅ CLI tool: uv run python3 tools/optimize_strategy.py mean_reversion
- ✅ Clear output report (top 5 strategies, Pareto plot, parameter values)
- ✅ One-line deployment to paper trading
7.2 Comparison to Manual Tuning
Current Mean Reversion Performance (from trading-preferences.md):
- Win rate: 37.5% ❌ (below 50% target)
- Sharpe: -0.04 ❌ (negative)
- Max drawdown: $102.08
Optimizer Target:
- Win rate: ≥50% ✅
- Sharpe: >1.0 ✅
- Max drawdown: <20% ✅
If optimizer achieves this: Deploy to paper trading for 1 week validation
8. Example Usage
8.1 Command-Line Interface
# Optimize existing Mean Reversion strategy
uv run python3 tools/optimize_strategy.py \
--strategy mean_reversion \
--symbols AAPL,MSFT,NVDA \
--train-start 2023-01-01 \
--train-end 2024-07-31 \
--val-start 2024-08-01 \
--val-end 2025-10-30 \
--population 20 \
--generations 20 \
--output reports/mean_reversion_optimized.json
# Output:
# 🧬 Generation 1/20 | Best Sharpe: 0.85 | Avg Return: 18.3%
# 🧬 Generation 2/20 | Best Sharpe: 1.12 | Avg Return: 22.1%
# ...
# 🧬 Generation 20/20 | Best Sharpe: 2.15 | Avg Return: 38.7%
#
# ✅ Optimization complete! Found 12 non-dominated solutions
#
# 📊 Top 5 Strategies (by composite score):
# 1. Balanced Strategy
# - Sharpe: 2.15 | Return: 38.7% | Drawdown: 12.3%
# - RSI: [28, 72] | BB Std: 2.2 | Stop: 8.5% | Position: $650
#
# 2. High Return Strategy
# - Sharpe: 1.88 | Return: 52.1% | Drawdown: 18.9%
# - RSI: [25, 75] | BB Std: 2.8 | Stop: 12.0% | Position: $850
#
# 3. Conservative Strategy
# - Sharpe: 2.01 | Return: 28.4% | Drawdown: 8.2%
# - RSI: [30, 70] | BB Std: 1.8 | Stop: 6.5% | Position: $450
#
# 📈 Pareto frontier saved to: reports/mean_reversion_pareto.png
# 📄 Full report saved to: reports/mean_reversion_optimized.json
8.2 Python API
from tools.genetic_optimizer import GeneticOptimizer
from tools.mean_reversion_optimized import MeanReversionStrategy
# Define strategy with parameter space
strategy = MeanReversionStrategy(
    symbols=['AAPL', 'MSFT', 'NVDA'],
    parameter_space={
        'rsi_oversold': (20, 40),
        'rsi_overbought': (60, 80),
        'bb_std': (1.5, 3.0),
        'stop_loss_pct': (0.05, 0.15),
        'base_position_size': (300, 1000),
        'max_positions': (3, 8)
    }
)
# Run optimization
optimizer = GeneticOptimizer(
    strategy=strategy,
    population_size=20,
    n_generations=20,
    train_period=('2023-01-01', '2024-07-31'),
    val_period=('2024-08-01', '2025-10-30')
)
result = optimizer.optimize()
# Get best strategy
best = result.get_best_strategy(metric='sharpe')
print(f"Best Sharpe: {best.sharpe:.2f}")
print(f"Parameters: {best.parameters}")
# Deploy to paper trading
best.deploy_to_paper_trading()
9. Risks & Mitigations
Risk 1: Overfitting
Problem: Optimizer finds parameters that work perfectly on training data but fail on validation/live
Mitigation:
- ✅ Always use a train/validation split
- ✅ Require validation performance within 30% of training
- ✅ Use walk-forward windows (test across multiple time periods)
- ✅ Mandate paper trading before live deployment
Risk 2: Optimization Time
Problem: 20 gen × 20 pop = 400 backtests, could take hours for complex strategies
Mitigation:
- ✅ Use vectorized backtesting (pandas operations, not loops)
- ✅ Parallelize fitness evaluation (process pool)
- ✅ Start with a smaller population/fewer generations, scale up if needed
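The process-pool mitigation might look like the sketch below. `run_backtest` here is a trivial stand-in for the real engine; because individuals are independent, fitness evaluation is embarrassingly parallel, and a process pool sidesteps the GIL for CPU-bound work:

```python
from concurrent.futures import ProcessPoolExecutor

def run_backtest(params):
    """Stand-in for the real backtest; a pure function of its parameters."""
    return {"sharpe": params["rsi_oversold"] / 20.0}

def evaluate_population(population, max_workers=4):
    """Evaluate independent individuals in parallel worker processes."""
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_backtest, population))

if __name__ == "__main__":
    pop = [{"rsi_oversold": 20.0}, {"rsi_oversold": 40.0}]
    print(evaluate_population(pop, max_workers=2))
```

The `__main__` guard matters on platforms that spawn rather than fork worker processes. pymoo also accepts pluggable parallel runners on the problem definition, so parallel evaluation slots in without changing the GA itself.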
Risk 3: Parameter Instability
Problem: Optimal parameters change frequently (not robust)
Mitigation:
- ✅ Re-optimize monthly and track parameter drift
- ✅ Use an ensemble of the top 5 strategies (not just #1)
- ✅ Set parameter bounds conservatively
Risk 4: Market Regime Change
Problem: Optimized for bull market, fails in bear market
Mitigation:
- ✅ Include 2008/2020 crash data in training if possible
- ✅ Optimize across multiple market regimes
- ✅ Add a market regime classifier (bull/bear/sideways detection)
10. Future Enhancements (Phase 2+)
Not in scope for Phase 1, but planned:

- Agentic Workflow (Phase 2)
  - ReAct loop: agent decides which strategies to optimize
  - Autonomous hypothesis generation
  - Automatic paper trading deployment
- Natural Language Interface (Phase 2)
  - "Create a mean reversion strategy for tech stocks"
  - LLM generates strategy code + parameter space
  - Optimizer tunes parameters automatically
- Multi-Strategy Optimization (Phase 3)
  - Optimize portfolio allocation across strategies
  - Correlation analysis between strategies
  - Risk parity / Sharpe-optimized portfolio
- Real-Time Adaptation (Phase 3)
  - Detect market regime changes
  - Re-optimize parameters dynamically
  - A/B test strategies in live trading
11. Next Steps
Immediate Actions (This Week):
- [TODAY] Review this spec with user → get approval
- [Day 2] Install dependencies (pymoo, matplotlib, seaborn)
- [Day 3-5] Implement GeneticOptimizer core class
- [Day 6-7] Create OptimizableStrategy base class
- [Day 8-9] Convert Mean Reversion to optimizable format
Week 2 Actions:
- [Day 10-11] Run full optimization on Mean Reversion
- [Day 12] Analyze results, create Pareto plots
- [Day 13] Deploy best strategy to Alpaca paper trading
- [Day 14] Monitor paper trading results
Success Gate:
Proceed to Phase 2 (Agentic Workflow) IF:
- ✅ Optimized strategy outperforms manual on validation data
- ✅ Paper trading shows positive results (1 week)
- ✅ User approves moving forward
12. Questions for User
Before implementation, clarify:
- Time commitment: 1-2 weeks for Phase 1 - acceptable?
- Primary objective: Sharpe ratio or total return? (for composite scoring)
- Historical data period: 2 years enough, or prefer 5 years?
- Which strategy first: Mean Reversion, or different strategy?
- Paper trading duration: 1 week minimum - acceptable?
13. References
Inspiration:
- NexusTrade Aurora AI Agent (Austin Starks, Medium articles)
- NSGA-II algorithm (Deb et al., 2002)
- "Advances in Financial Machine Learning" (Marcos López de Prado)
Libraries:
- pymoo: https://pymoo.org/
- NSGA-II tutorial: https://pymoo.org/algorithms/moo/nsga2.html
Related Docs:
- ~/Documents/memory/entities/preferences/trading-preferences.md
- /Users/bertfrichot/mem-agent-mcp/tools/mean_reversion_backtest.py
- /Users/bertfrichot/mem-agent-mcp/tools/strategy_tester.py
Status: ✅ Spec complete, awaiting user approval