← Corpus / investment-memo-orchestrator / reminder

Format Memo According to Outline Input

Refactor the investment memo system to use YAML-based content outlines that agents can reference for structure, guiding questions, and vocabulary.

Path
reminders/Format-Memo-According-to-Template-Input.md
Authors
Michael Staton, AI Labs Team
Augmented with
Claude Code (Sonnet 4.5)
Tags
Refactoring · Outlines · YAML · Agent-Design · Workflow

Format Memo According to Outline Input

Status: In Progress - Phase 1 Complete Date: 2025-11-21 Last Updated: 2025-11-21 Author: AI Labs Team Related: Multi-Agent-Orchestration-for-Investment-Memo-Generation.md, Improving-Memo-Output.md

Outstanding Issues

  • Automated citation agent integration using Perplexity sonar-pro API
    • Make sure Perplexity AI requests to sonar-pro are working.

Executive Summary

This document specifies a refactoring initiative to move section definitions, guiding questions, and vocabulary from hardcoded markdown templates and agent logic into structured YAML outline files. This will enable:

  1. Dynamic section generation based on investment type (direct/fund) and mode (consider/justify)
  2. Consistent but custom prompting with firm-specific outlines guiding sections, topics, vocabulary and nomenclature, as well as questions per section accessible to all agents
  3. Vocabulary control with industry-specific terms and preferred language embedded in each outline
  4. Easier outline maintenance without modifying agent code
  5. Multi-outline support for different firm styles or memo formats

Terminology: We use “outlines” for content structure (sections, questions, vocabulary) and “brand configs” for visual styling (colors, fonts, logos). This clear separation prevents confusion.

The system will maintain backward compatibility with existing templates while gradually migrating to YAML-based outlines.

Phase 1 Status: ✅ COMPLETE - Default outlines created for both direct investment and fund commitment memo types.


Problem Statement

Current Architecture Limitations

Issue #1: Hardcoded Section Definitions

  • Section lists exist in multiple places:
    • src/agents/writer.py:SECTION_ORDER (hardcoded list)
    • templates/<brand>_memo-template-direct.md (markdown headers)
    • templates/<brand>_memo-template-fund.md (markdown headers)
      • <brand> is the brand name of the firm, e.g. “Hypernova”
  • Adding a section requires editing 3+ files
  • Section numbering must be manually synchronized
  • No single source of truth for section structure

Issue #2: No Centralized Guiding Questions

  • Writer agent has implicit knowledge of what each section should contain
  • No structured guidance for what questions to answer per section
  • Difficult to ensure consistency across memo generations
  • New team members must reverse-engineer expectations from examples

Issue #3: Vocabulary Scattered Across System

  • Style guide (templates/style-guide.md) contains general guidance
  • Specific terminology preferences not codified
  • No section-specific vocabulary control
  • Agents may use inconsistent terminology

Issue #4: Template Modifications Require Code Changes

  • Changing section order requires updating SECTION_ORDER constant
  • Adding sections requires updating validation logic
  • Renaming sections breaks file naming conventions
  • Template changes coupled to agent logic

Issue #5: No Mode-Specific Guidance

  • “Consider” vs “Justify” modes have different requirements
  • Recommendation section should differ significantly by mode
  • Currently handled implicitly in prompts, not declaratively

Consequences

  • Maintenance burden: Template changes require code changes
  • Inconsistency: Different agents may interpret sections differently
  • Limited flexibility: Cannot easily create custom memo formats
  • Poor documentation: Section expectations not explicitly documented
  • Error-prone: Manual synchronization between files leads to bugs

Solution: YAML-Based Content Outlines

Core Concept

Replace hardcoded section definitions with structured YAML outline files that define:

  • Section metadata (name, number, filename)
  • Guiding questions (what to address in this section)
  • Vocabulary (preferred terms, phrases to use/avoid) embedded in each outline
  • Mode-specific variations (consider vs justify)
  • Validation criteria (what makes a good section)
  • Default outlines run if no custom outline is specified

Terminology Clarification

ConceptPurposeLocationControls
OutlineContent structuretemplates/outlines/Sections, questions, vocabulary
Brand ConfigVisual stylingtemplates/brand-configs/Colors, fonts, logos, CSS

This separation ensures “templates” refers only to visual styling, while “outlines” refers to content structure.

Architecture Overview

templates/
├── brand-configs/          # VISUAL styling (already exists)
│   ├── brand-hypernova-config.yaml
│   ├── brand-collide-config.yaml
│   └── ...

└── outlines/               # CONTENT structure (NEW)
    ├── direct-investment.yaml       # Default for direct investments
    ├── fund-commitment.yaml         # Default for fund commitments
    ├── sections-schema.json         # Validation schema
    ├── README.md                    # Documentation

    └── custom/                      # Firm-specific overrides
        ├── hypernova-direct-consider.yaml
        ├── hypernova-direct-justify.yaml
        └── ...

Key Design Decision: Vocabulary is embedded within each outline, not split into separate files. This makes outlines self-contained and easier to understand.

Data Flow:

  1. User specifies investment type + mode (optionally custom outline)
  2. System loads appropriate outline YAML (default or custom)
  3. If custom, inherits from default and applies overrides
  4. Agents receive structured guidance for each section (questions + vocabulary)
  5. Writer agent generates sections following outline specifications
  6. Validator checks against outline validation criteria

YAML Schema Design

Outline Definition Schema

File: templates/outlines/direct-investment.yamlCREATED

Key Feature: Vocabulary is embedded within the outline, not in separate files.

# Direct Investment Memo Outline
# Default structure for evaluating startup/company investments

metadata:
  outline_type: "direct_investment"
  version: "1.0.0"
  description: "10-section structure for evaluating direct startup investments"
  date_created: "2025-11-21"
  compatible_modes: ["consider", "justify"]

# Global vocabulary (applies to all sections) - EMBEDDED
vocabulary:
  financial:
    preferred:
      - term: "annual recurring revenue (ARR)"
        first_use: "annual recurring revenue (ARR)"
        subsequent: "ARR"
        definition: "Yearly value of recurring revenue contracts"
    avoid:
      - term: "revenue"
        instead: "Specify ARR, MRR, or total revenue"
        reason: "Ambiguous whether recurring or one-time"

  phrases_to_avoid:
    promotional:
      - phrase: "significant traction"
        reason: "Quantify with specific metrics"

  style_rules:
    tone: "Professional, analytical, balanced"
    citations: "Obsidian-style: [^1], [^2], [^3]"

# Section definitions
sections:
  - number: 1
    name: "Executive Summary"
    filename: "01-executive-summary.md"
    target_length:
      min_words: 150
      max_words: 250
      ideal_words: 200

    description: |
      Concise overview of the investment opportunity, synthesizing key findings
      from all other sections. Written last, read first.

    guiding_questions:
      - "What problem does this company solve?"
      - "What is the solution and how is it differentiated?"
      - "Who are the founders and what relevant experience do they have?"
      - "What traction has been achieved to date? (specific metrics)"
      - "What are the key risks and why are they manageable?"
      - "What is the investment recommendation (PASS/CONSIDER/COMMIT) and primary rationale?"

    # SECTION-SPECIFIC VOCABULARY (in addition to global vocabulary)
    section_vocabulary:
      preferred_terms:
        - "value proposition" (not "unique selling point")
        - "competitive advantage" (not "moat" unless technical context)
        - "go-to-market strategy" (not "GTM" on first use)
        - "traction" (not "progress" for metrics)

      required_elements:
        - Company name and brief description
        - Stage (Seed/Series A/B/C)
        - Specific traction metrics (ARR, customers, growth rate)
        - Clear recommendation with 1-sentence rationale

      avoid:
        - "revolutionary", "game-changing" (without evidence)
        - "significant traction", "strong team" (without specifics)
        - Undefined acronyms
        - "could potentially", "might be able to"

    mode_specific:
      consider:
        emphasis: "Objective assessment with balanced risk discussion"
        recommendation_options: ["PASS", "CONSIDER", "COMMIT"]
      justify:
        emphasis: "Clear rationale for why investment was made"
        recommendation_options: ["COMMIT"]

    validation_criteria:
      - "Length within target range (150-250 words)"
      - "Contains clear recommendation (PASS/CONSIDER/COMMIT)"
      - "Includes specific traction metrics (numbers, not adjectives)"
      - "Mentions founders by name with relevant credentials"
      - "Identifies 2-3 key risks"
      - "No promotional or speculative language"

  # ... (9 more sections with same structure) ...

See templates/outlines/direct-investment.yaml for the complete 10-section definition.

Custom Outline Schema (with Inheritance)

File: templates/outlines/custom/hypernova-direct-consider.yaml

Custom outlines extend default outlines and override specific aspects:

# Hypernova Capital - Direct Investment - Consider Mode

metadata:
  firm: "Hypernova Capital"
  investment_type: "direct"
  mode: "consider"
  extends: "../direct-investment.yaml"  # Inherits from default
  version: "1.0.0"

# Override/extend vocabulary (merged with base)
vocabulary:
  hypernova_preferred:
    - term: "founder-market fit"
      emphasis: "Critical for our thesis"

# Firm-specific philosophy
firm_preferences:
  tone: "Analytical, balanced, not promotional"
  critical_questions:
    - "Why this team?"
    - "Why now?"
    - "Why Hypernova?"

# Override specific sections
section_overrides:
  executive_summary:
    target_length:
      ideal_words: 175  # Shorter than default 200

  team:
    guiding_questions_add:
      - "Assess founder coachability"
    emphasis: "Founder-market fit is critical"

  risks_mitigations:
    minimum_risks: 7  # More than default 5

Inheritance Flow:

  1. Load base outline (direct-investment.yaml)
  2. Merge vocabulary (base + custom additions)
  3. Apply section overrides (modify specific sections)
  4. Result: Complete outline with firm customizations

Removed: Separate Vocabulary Files

Previous Design: Vocabulary in separate templates/vocabulary/*.yaml files New Design: Vocabulary embedded in each outline (self-contained)

Rationale:

  • Simpler: One file to understand
  • Self-contained: Outline has everything it needs
  • Easier to customize: Firms can override vocabulary per outline
  • No cross-file dependencies

Examples from Existing Sections

The document previously showed all 10 sections in detail. For brevity, here are key examples:

Business Overview Section

  - number: 2
    name: "Business Overview"
    guiding_questions:
      - "What does the company do? (in plain language)"
      - "What specific problem are they solving?"
      - "What is the business model?"
      - "What are the unit economics?"

    section_vocabulary:
      preferred_terms:
        - "business model" (not "revenue model")
        - "customer acquisition cost (CAC)"
        - "product-market fit" (not "PMF" on first use)

All 10 sections follow this pattern. See the complete files for all sections:

  • templates/outlines/direct-investment.yaml ✅ Created
  • templates/outlines/fund-commitment.yaml ✅ Created

Implementation Status

Phase 1: Schema Design & Validation ✅ COMPLETE

Completed Tasks:

  • ✅ Created templates/outlines/direct-investment.yaml (complete with all 10 sections)
  • ✅ Created templates/outlines/fund-commitment.yaml (all 10 sections)
  • ✅ Created templates/outlines/sections-schema.json (JSON validation schema)
  • ✅ Created templates/outlines/README.md (complete documentation)
  • Design Decision: Embedded vocabulary in outlines (no separate vocabulary files)

Deliverables:

  • ✅ 2 default outline YAML files (direct, fund)
  • ✅ 1 JSON schema for validation
  • ✅ Complete documentation with examples
  • ⏳ Unit tests for schema validation (pending)

Phase 2: YAML Loader Utility ⏳ NEXT

Objective: Create utility module to load and merge YAML outlines

Remaining Tasks:

  1. Create src/outline_loader.py
  2. Implement loading functions:
    • load_outline(investment_type: str) -> OutlineDefinition
    • load_custom_outline(name: str) -> OutlineDefinition
    • Configuration merging logic (inheritance/overrides)
  3. Create dataclasses in src/schemas/
  4. Unit tests

Phase 3: Writer Agent Integration ⏳ PENDING

Update writer agent to use outline guiding questions

Phase 4: Validator Agent Integration ⏳ PENDING

Update validator to use outline validation criteria

Phase 5: CLI & Configuration Selection ⏳ PENDING

Add --outline flag:

# Default outline
python -m src.main "Avalanche"

# Custom outline
python -m src.main "Avalanche" --outline hypernova-direct-consider

Phases 6-8: Section Naming, Multi-Firm, Template Deprecation ⏳ PENDING


Usage Examples (Post-Implementation)

CLI Usage

# Default outline (uses direct-investment.yaml or fund-commitment.yaml)
python -m src.main "Avalanche"

# Custom firm outline
python -m src.main "Avalanche" --outline hypernova-direct-consider

# Specify in company data file
# data/Avalanche.json: {"outline": "hypernova-direct-consider", ...}
python -m src.main "Avalanche"

Company Data File

Specify outline in data/{CompanyName}.json:

{
  "type": "direct",
  "mode": "consider",
  "outline": "hypernova-direct-consider",  # Optional: use custom outline
  "description": "Company description..."
}

How Agents Will Use Outlines

Writer Agent

Loads outline and uses:

  • Guiding questions to construct prompts for each section
  • Section vocabulary to guide terminology
  • Target length to constrain output
  • Mode-specific guidance to adjust tone

Validator Agent

Loads outline and uses:

  • Validation criteria to check section quality
  • Length targets to ensure appropriate sizing
  • Required elements to verify completeness

Previous Content (Condensed)

The following sections have been condensed for brevity. The original detailed examples for all 10 sections are preserved in the actual YAML files.

Original detailed section examples (click to expand)

[Original 800+ lines of section examples preserved here for reference]


Vocabulary Configuration Schema

REMOVED - Vocabulary is now embedded in outlines

Previous Design:

templates/vocabulary/
├── investment-vocabulary.yaml
├── technical-vocabulary.yaml
└── style-preferences.yaml

New Design: Vocabulary embedded in each outline


Memo Configuration Schema

REMOVED - Replaced by custom outlines with inheritance

Previous Design: Separate templates/memo-configs/ directory New Design: Custom outlines in templates/outlines/custom/ that extend defaults


Implementation Plan

Phase 1: Schema Design & Validation (Week 1-2) ✅ COMPLETE

Objective: Create and validate YAML schemas for outlines

Completed:

  • templates/outlines/direct-investment.yaml (complete with all 10 sections, vocabulary embedded)
  • templates/outlines/fund-commitment.yaml (all 10 sections, vocabulary embedded)
  • templates/outlines/sections-schema.json (validation schema)
  • templates/outlines/README.md (documentation)

Remaining:

  • ⏳ Unit tests for schema validation

Deliverables:

  • ✅ 2 complete outline files
  • ✅ JSON validation schema
  • ✅ Documentation
  • ⏳ Unit tests (pending)

Key Decision: Embedded vocabulary in outlines instead of separate files. This simplifies architecture and makes outlines self-contained.


Phase 2: YAML Loader Utility (Week 2-3) ⏳ NEXT

Objective: Create utility module to load and merge YAML outlines

Tasks:

  1. Create Outline Loader Module

    • Create src/outline_loader.py
    • Implement load_outline(investment_type: str) -> OutlineDefinition
    • Implement load_custom_outline(name: str) -> OutlineDefinition
    • Implement inheritance/override merging logic
    • Add caching for performance
  2. Create Data Classes

    • Create src/schemas/outline_schema.py with dataclasses:
      • OutlineDefinition
      • OutlineMetadata
      • SectionDefinition
      • VocabularyGuide
      • ValidationCriteria
  3. Testing

    • Unit tests for loader functions
    • Test outline merging (overrides work correctly)
    • Test error handling (missing files, invalid YAML)

Deliverables:

  • Working outline loader module
  • Complete dataclass schemas
  • Unit tests passing

Phase 3: Writer Agent Integration (Week 3-4) ⏳ PENDING

Objective: Refactor writer agent to use YAML-based outlines

Tasks:

  1. Update Writer Agent

    • Modify src/agents/writer.py:write_sections_individually()
    • Replace hardcoded SECTION_ORDER with dynamic loading
    • Load outlines based on investment type
    • Generate prompts from outline guiding questions
    • Include vocabulary guidance in prompts
  2. Prompt Generation

    • Create src/prompts/section_prompt_builder.py
    • Implement build_section_prompt(section_def, state) -> str
    • Include guiding questions, vocabulary, mode-specific guidance
  3. Testing

    • Test prompt generation from outlines
    • Regression tests (quality maintained)

Phases 4-8: Remaining Integration ⏳ PENDING

Summary of remaining phases:

  • Phase 4: Validator agent integration
  • Phase 5: CLI --outline flag
  • Phase 6: YAML-defined section filenames
  • Phase 7: Multi-firm support with custom outlines
  • Phase 8: Deprecate old markdown templates

See original sections below for detailed task lists.


Benefits of YAML-Based Outlines

Clarity: “Outlines” vs “Templates”

Before: “Templates” meant both content structure AND visual styling (confusing!) After:

  • Outlines = content structure (sections, questions, vocabulary)
  • Brand configs = visual styling (colors, fonts, logos)

For Developers

  1. Separation of Concerns - Content structure separate from agent logic
  2. Type Safety - Dataclasses and schema validation
  3. Maintainability - Single source of truth for sections
  4. Self-Contained - Vocabulary embedded (no cross-file dependencies)

For Users

  1. Consistency - All agents use same section definitions
  2. Transparency - Section expectations explicitly documented
  3. Flexibility - Easy to customize for different firms
  4. Quality - Guiding questions improve completeness

Success Criteria

Phase 1 (Complete) ✅

  • ✅ All 10 sections defined in YAML for both investment types
  • ✅ Vocabulary embedded in outlines (no separate files)
  • ✅ JSON schema for validation
  • ✅ Complete documentation

Remaining Must-Haves

  • Outline loader module (src/outline_loader.py)
  • Writer agent uses outlines
  • Validator agent uses outline criteria
  • CLI supports --outline flag
  • Backward compatibility maintained

  • Multi-Agent-Orchestration-for-Investment-Memo-Generation.md - Main architecture
  • Improving-Memo-Output.md - Section improvement features
  • templates/brand-configs/README.md - Brand configuration guide (visual styling)
  • templates/outlines/README.md - Outline documentation (content structure) ✅ Created
  • CLAUDE.md - Developer guide

Changelog

2025-11-21:

  • Document created with comprehensive plan for YAML-based outline system
  • Phase 1 COMPLETE: Created default outlines for direct investment and fund commitment
  • Key Decision: Embedded vocabulary in outlines (no separate vocabulary files)
  • Terminology: Renamed “templates” to “outlines” for content structure to avoid confusion with brand templates