
Improve Investment Memo Output

Improve the quality and depth of investment memos generated by the Investment Memo Orchestrator.

Path
issue-resolution/Improving-Memo-Output.md
Authors
Michael Staton, Tugce Ergul
Augmented with
Claude Code (Sonnet 4.5)
Tags
Workflow · Investment-Analysis · Content-Generation · Venture-Capital · AI-Assisted-Writing

Improving Memo Output: Section Improvement & Key Information Rewrite

Status: Feature #1 Implemented ✅ | Feature #2 Planned | Date: 2025-11-20 | Last Updated: 2025-11-20 | Author: AI Labs Team | Related: Multi-Agent-Orchestration-for-Investment-Memo-Generation.md

Implementation Status

Feature #1: Section Improvement with Sonar Pro ✅

Status: COMPLETED (Steps 1-2)

Completed Work:

  • ✅ Step 1: Sonar Pro Integration (commit: 6fbafe5)
    • Replaced Claude with Perplexity Sonar Pro in improve-section.py
    • Added comprehensive citation instructions to prompt
    • Citations now added during improvement (one-step process)
  • ✅ Step 2: Automatic Final Draft Reassembly (commit: 6fbafe5)
    • Implemented reassemble_final_draft() function
    • Automatic reassembly after section improvement
    • Includes header.md (company trademark) and all sections
  • ✅ Testing: Successfully tested on Avalanche Team section
    • 11 citations added automatically
    • Section quality improved significantly
    • Final draft reassembled correctly

Test Results (Avalanche Team Section, 2025-11-20):

Input: output/Avalanche-v0.0.1/2-sections/04-team.md (existing weak section)
Command: python improve-section.py "Avalanche" "Team" --version v0.0.1
API: Perplexity Sonar Pro
Time: ~60 seconds

Results:
  ✓ Citations added: 11
  ✓ Obsidian-style formatting: Correct
  ✓ Section saved: 2-sections/04-team.md
  ✓ Final draft reassembled: 4-final-draft.md
  ✓ Quality improvement: Significant (specific metrics, company names, concrete details)

Pending Work:

  • ⏳ Step 3: Before/After Preview Mode (not yet implemented)
  • ⏳ Step 4: Error Handling & Edge Cases (not yet implemented)
  • 🔄 Step 5: Documentation & Testing (IN PROGRESS)

Documentation:

  • ✅ README.md: Section improvement usage documented
  • 🔄 CLAUDE.md: Update in progress
  • 🔄 This spec: Update in progress

Feature #2: Key Information Rewrite Agent 📋

Status: PLANNED (Steps 5-10)

Next Steps:

  1. Create src/agents/key_info_rewrite.py
  2. Create rewrite-key-info.py CLI script
  3. Test on Avalanche $50M → $10M correction

Executive Summary

This document specifies two complementary features for improving memo quality without regenerating entire memos:

  1. Section Improvement: Enhance individual sections with better research and citations using Perplexity Sonar Pro
  2. Key Information Rewrite: Correct crucial facts that appear across multiple sections (e.g., fund size, dates, names)

Both features leverage the section-by-section architecture introduced in the 2025-11-20 refactor, allowing targeted improvements while preserving the artifact trail.


Problem Statement

Current Limitations

Issue #1: No Targeted Section Improvements

  • When one section is weak, users must regenerate the entire memo
  • Full regeneration is expensive (10 LLM calls + research)
  • Good sections may degrade during regeneration
  • No way to iteratively improve specific sections

Issue #2: No Global Fact Correction

  • Factual errors often appear in multiple sections
  • Example: Avalanche memo states “$50M fund” in 7 different sections, but actual size is “$10M”
  • Manually editing each section is error-prone
  • Citations may reference the wrong information

Requirements

Feature #1 Requirements:

  • Improve a single section without touching others
  • Use Perplexity Sonar Pro for real-time research
  • Add citations automatically during improvement
  • Preserve existing artifact structure
  • Allow reassembly of final draft

Feature #2 Requirements:

  • Identify all sections affected by a correction
  • Apply corrections consistently across sections
  • Preserve citations and formatting
  • Update research data if needed
  • Reassemble final draft automatically

Feature #1: Section Improvement with Sonar Pro

Current Implementation Review

Existing File: improve-section.py (created 2025-11-18)

Current Behavior:

  • Loads artifacts (state, research, other sections)
  • Uses Claude to improve section content
  • Saves to 2-sections/ directory
  • Missing: Citations must be added separately

What Exists:

def improve_section_with_agent(
    section_name: str,
    artifacts: dict,
    artifact_dir: Path,
    console: Console
) -> str:
    """Use agents to improve or create a specific section."""
    # Uses ChatAnthropic (Claude)
    # Does NOT add citations
    # Requires separate citation enrichment step

What’s Needed:

  • Replace Claude with Perplexity Sonar Pro
  • Citations added during improvement (not after)
  • One-step process instead of two-step

Target Architecture

Improved Function:

def improve_section_with_sonar_pro(
    section_name: str,
    artifacts: dict,
    artifact_dir: Path,
    console: Console
) -> str:
    """Use Perplexity Sonar Pro to improve section with citations."""
    from openai import OpenAI

    # Initialize Perplexity client
    perplexity_client = OpenAI(
        api_key=os.getenv("PERPLEXITY_API_KEY"),
        base_url="https://api.perplexity.ai"
    )

    # Build comprehensive improvement prompt
    prompt = build_improvement_prompt(
        section_name=section_name,
        existing_content=artifacts["sections"].get(section_file, ""),
        company_name=artifacts["state"]["company_name"],
        research_data=artifacts["research"],
        other_sections=artifacts["sections"],
        investment_type=artifacts["state"]["investment_type"],
        memo_mode=artifacts["state"]["memo_mode"]
    )

    # Call Sonar Pro with improvement + citation instructions
    response = perplexity_client.chat.completions.create(
        model="sonar-pro",
        messages=[{"role": "user", "content": prompt}]
    )

    improved_content = response.choices[0].message.content

    # Save improved section
    save_section_artifact(artifact_dir, section_num, section_name, improved_content)

    return improved_content

Prompt Design

Sonar Pro Improvement Prompt Structure:

You are improving the '{section_name}' section for an investment memo about {company_name}.

INVESTMENT TYPE: {investment_type.upper()}
MEMO MODE: {memo_mode.upper()} ({'retrospective justification' if memo_mode == 'justify' else 'prospective analysis'})

CURRENT SECTION CONTENT:
{existing_content}

RESEARCH DATA AVAILABLE:
{research_data_json}

CONTEXT FROM OTHER SECTIONS:
{other_sections_summary}

TASK: Significantly improve this section by:
1. Adding specific metrics and data from authoritative sources
2. Removing vague or speculative language ("could potentially", "might be", etc.)
3. Strengthening analysis with concrete evidence
4. Adding inline citations [^1], [^2], [^3] for ALL factual claims
5. Including a comprehensive Citations section at the end

REQUIREMENTS:
- Use Obsidian-style citations: [^1], [^2], etc.
- Place citations AFTER punctuation: "text. [^1]" not "text[^1]."
- Always include ONE SPACE before each citation: "text. [^1] [^2]"
- Use quality sources:
  * Company websites, blogs, press releases
  * TechCrunch, The Information, Sifted, Protocol, Axios
  * Crunchbase, PitchBook (for funding data)
  * SEC filings, investor letters
  * Industry analyst reports (Gartner, CB Insights, McKinsey)
  * Bloomberg, Reuters, WSJ, FT (for news)
- Match the analytical tone of professional VC memos
- Be specific, not promotional or dismissive
- For {memo_mode} mode: {'justify the investment decision' if memo_mode == 'justify' else 'objectively assess'}

CITATION FORMAT:
[^1]: YYYY, MMM DD. [Source Title](https://full-url-here.com). Publisher Name. Published: YYYY-MM-DD | Updated: YYYY-MM-DD

IMPROVED SECTION CONTENT:

Key Differences from Citation Enrichment Agent:

  • Citation Enrichment: Preserves narrative, only adds citations
  • Section Improvement: Rewrites for quality AND adds citations
  • Both use same citation format (Obsidian-style)
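The citation placement rules in the prompt above (after punctuation, exactly one space before each marker) can also be checked mechanically on the returned content. A minimal sketch of such a linter; the regexes and function name are illustrative, not part of the current implementation:

```python
import re

# Citations must follow punctuation with one space, e.g. "text. [^1] [^2]"
BAD_PLACEMENT_RE = re.compile(r"\[\^\d+\][.!?,]")   # citation before punctuation
MISSING_SPACE_RE = re.compile(r"[^\s\]]\[\^\d+\]")  # no space before citation

def check_citation_style(text: str) -> list[str]:
    """Return a list of Obsidian-style citation formatting violations."""
    issues = []
    for m in BAD_PLACEMENT_RE.finditer(text):
        issues.append(f"citation before punctuation: {m.group(0)!r}")
    for m in MISSING_SPACE_RE.finditer(text):
        issues.append(f"missing space before citation: {m.group(0)!r}")
    return issues
```

A check like this could run after the Sonar Pro call and flag malformed markers before the section is saved.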

CLI Interface

Usage:

# Activate venv first (recommended)
source .venv/bin/activate

# Basic usage: improve section
python improve-section.py "Avalanche" "Team"

# Specify version
python improve-section.py "Avalanche" "Team" --version v0.0.1

# With final draft reassembly
python improve-section.py "Avalanche" "Team" --rebuild-final

# Direct path to artifact directory
python improve-section.py output/Avalanche-v0.0.1 "Market Context"

New Flags:

  • --rebuild-final: Reassemble 4-final-draft.md after improvement
  • --preview: Show before/after comparison without saving

Output:

✓ Loading artifacts from: output/Avalanche-v0.0.1/
✓ Loaded state.json
✓ Loaded research data
✓ Loaded 10 existing sections

🔧 Improving section: Team
  Using Perplexity Sonar Pro for real-time research...

✓ Section improved with 8 new citations added
✓ Saved to: output/Avalanche-v0.0.1/2-sections/04-team.md

📊 Changes Summary:
  - Original length: 850 words
  - Improved length: 1,200 words
  - Citations added: 8
  - Vague claims removed: 5
  - Specific metrics added: 12

✓ Reassembled final draft: 4-final-draft.md

Next steps:
  1. Review improved section in: output/Avalanche-v0.0.1/2-sections/
  2. Export to HTML: python export-branded.py output/Avalanche-v0.0.1/4-final-draft.md

Implementation Steps

Step 1: Update improve-section.py for Sonar Pro

Files Modified:

  • improve-section.py

Changes:

  1. Replace improve_section_with_agent() with improve_section_with_sonar_pro()
  2. Import OpenAI client for Perplexity
  3. Update prompt to include citation instructions
  4. Test with PERPLEXITY_API_KEY

Testing:

# Test on weak section
python improve-section.py "Avalanche" "Team" --version v0.0.1

# Verify:
# - Section has inline citations [^1], [^2]
# - Citations section at end with URLs
# - Content quality improved
# - Vague language removed

Step 2: Add Reassembly Feature

Changes:

  1. Add --rebuild-final flag
  2. Implement reassemble_final_draft() function:
    • Load header.md if exists
    • Load all sections from 2-sections/ in order
    • Concatenate with proper spacing
    • Save as 4-final-draft.md

Code:

def reassemble_final_draft(artifact_dir: Path, console: Console):
    """Reassemble 4-final-draft.md from section files."""
    console.print("\n[bold]Reassembling final draft...[/bold]")

    # Load header if exists
    header_file = artifact_dir / "header.md"
    if header_file.exists():
        with open(header_file) as f:
            content = f.read() + "\n"
    else:
        content = ""

    # Load sections in order
    sections_dir = artifact_dir / "2-sections"
    section_files = sorted(sections_dir.glob("*.md"))

    for section_file in section_files:
        with open(section_file) as f:
            content += f.read() + "\n\n"

    # Save final draft
    final_draft = artifact_dir / "4-final-draft.md"
    with open(final_draft, "w") as f:
        f.write(content.strip())

    console.print(f"[green]✓ Final draft reassembled:[/green] {final_draft}")

Step 3: Add Before/After Comparison

Changes:

  1. Add --preview flag
  2. Show diff before saving
  3. Require confirmation

Output Example:

📊 Section Improvement Preview:

BEFORE (850 words):
  "The team has extensive experience in the industry..."

AFTER (1,200 words):
  "The founding team brings 40+ years of combined experience. [^1]

   CEO Jane Doe previously scaled Acme Corp from $5M to $150M ARR
   over 6 years (2015-2021). [^2] CTO John Smith led engineering at..."

Changes:
  ✓ Removed 5 vague claims
  ✓ Added 12 specific metrics
  ✓ Added 8 citations
  ✓ Increased depth by 41%

Save improved section? [y/N]:
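The preview flow above (diff, then confirmation before saving) could be built on the standard library; a sketch, with function names chosen here for illustration:

```python
import difflib

def preview_section_diff(original: str, improved: str) -> str:
    """Render a unified diff of the section for the --preview flag."""
    return "".join(difflib.unified_diff(
        original.splitlines(keepends=True),
        improved.splitlines(keepends=True),
        fromfile="before", tofile="after",
    ))

def confirm_save() -> bool:
    """Prompt the user before overwriting the section on disk."""
    return input("Save improved section? [y/N]: ").strip().lower() == "y"
```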

Step 4: Error Handling & Edge Cases

Handle:

  • Missing PERPLEXITY_API_KEY
  • Invalid section names
  • Missing artifact directories
  • Network errors during API calls
  • Malformed citations in response

Code:

def validate_environment():
    """Check required environment variables."""
    if not os.getenv("PERPLEXITY_API_KEY"):
        console.print("[red]Error: PERPLEXITY_API_KEY not set[/red]")
        console.print("[yellow]Set it in .env file or export it[/yellow]")
        sys.exit(1)

def validate_section_name(section_name: str) -> bool:
    """Validate section name against known sections."""
    if section_name not in SECTION_MAP:
        console.print(f"[red]Error: Unknown section '{section_name}'[/red]")
        console.print("\n[yellow]Available sections:[/yellow]")
        for name in sorted(SECTION_MAP.keys()):
            console.print(f"  • {name}")
        return False
    return True

Step 5: Documentation & Testing

Update Files:

  • CLAUDE.md: Add Section Improvement section
  • README.md: Add to “Remaining Enhancements” → “Completed”
  • Create examples in docs/EXAMPLES.md

Test Cases:

  1. ✅ Improve existing weak section
  2. ✅ Create missing section from scratch
  3. ✅ Handle section with existing citations (preserve them)
  4. ✅ Error: Invalid section name
  5. ✅ Error: Missing artifacts
  6. ✅ Reassemble final draft after improvement

Feature #2: Key Information Rewrite Agent

Use Cases

Scenario 1: Fund Size Correction

  • Error: Memo states “$50M fund” in 7 sections
  • Correction: Actual size is “$10M”
  • Impact: Affects deployment strategy, check sizes, portfolio construction, economics

Scenario 2: Person Title Correction

  • Error: “Katelyn Donnelly, Partner at Avalanche”
  • Correction: “Katelyn Donnelly, Managing Partner and Founder at Avalanche”
  • Impact: Affects GP Background, Track Record, decision-making authority

Scenario 3: Date Correction

  • Error: “Company founded in 2020”
  • Correction: “Company founded in 2019”
  • Impact: Affects traction timeline, milestones, growth metrics

Scenario 4: Investment Stage Correction

  • Error: “Series B company”
  • Correction: “Series A company”
  • Impact: Affects valuation expectations, metrics benchmarks, competitive positioning

YAML-Based Correction System

Rationale

Why YAML over CLI flags?

The original design used simple CLI corrections: --correction "Fund size is $10M, not $50M"

Problems with CLI approach:

  • Can only handle one correction at a time
  • No way to provide source verification
  • Cannot distinguish between inaccurate, incomplete, and narrative guidance
  • Difficult to track/audit corrections
  • Not reusable across memo versions

YAML Template Benefits:

  • Structured corrections: Explicit categories (inaccurate vs incomplete vs narrative)
  • Batch corrections: Multiple corrections in one file
  • Source references: Authoritative sources for verification
  • Auditable: Corrections file becomes part of project history
  • Reusable: Save correction templates for common issues
  • Version control: Track changes to correction guidance over time
  • Rich guidance: Narrative shaping comments guide tone and framing

YAML Template Structure

Template Location: data/{CompanyName}-corrections.yaml

Schema:

# Correction template for investment memo improvements
company: "Avalanche"

# VERSION MANAGEMENT
source_version: "v0.0.3"  # Which version to correct (required, can be path or version tag)
# source_version: "output/Avalanche-v0.0.3"  # Alternative: full path
output_mode: "new_version"  # "new_version" or "in_place"
# output_mode: "in_place"  # Overwrites source version artifacts

date_created: "2025-11-20"

corrections:
  # Correction Object 1: Inaccurate Information
  - type: "inaccurate"
    inaccurate_information: |
      The memo states that Avalanche VC Fund II is raising $50M, appearing
      in multiple sections (Fund Strategy, Economics, Portfolio Construction).
    correct_information: |
      Avalanche VC Fund II is raising $10M, not $50M. The fund targets
      $10M with a hard cap at $12M.
    affected_sections:
      - "Fund Strategy & Thesis"
      - "Portfolio Construction"
      - "Fee Structure & Economics"
      - "Executive Summary"
    sources:
      - "https://avalanche.vc/fund-ii"
      - "data/Avalanche-v0.0.1/0-deck-analysis.json"
    narrative_shaping_comments:
      - "Emphasize that the $10M fund size is intentional for boutique, high-touch approach"
      - "Connect fund size to check size strategy ($250K-$500K initial)"
      - "Frame smaller fund as competitive advantage for emerging EdTech companies"

  # Correction Object 2: Incomplete Information
  - type: "incomplete"
    incomplete_information: |
      The Team section mentions Katelyn Donnelly but doesn't specify her
      previous role at Pearson Ventures or the fund's performance metrics.
    additional_information: |
      Katelyn Donnelly was Managing Director at Pearson Ventures, where she
      oversaw a $65M fund that delivered an 18% IRR. She's also a Kauffman
      Fellow (Class 21) and was featured on Forbes 30 Under 30 in 2014.
    affected_sections:
      - "GP Background & Track Record"
      - "Executive Summary"
    sources:
      - "https://www.linkedin.com/in/katelyndonnelly/"
      - "https://avalanche.vc/team"
    narrative_shaping_comments:
      - "Highlight the 18% IRR as significantly above industry average"
      - "Connect Pearson Ventures experience to EdTech sector expertise"
      - "Emphasize operational experience (co-founded Delivery Associates, $40M revenue)"

  # Correction Object 3: Narrative Shaping Only
  - type: "narrative"
    section: "Investment Thesis"
    narrative_shaping_comments:
      - "Reduce promotional language about 'revolutionary' and 'game-changing'"
      - "Add more balanced risk discussion alongside opportunity"
      - "Include specific competitive comparisons (Reach Capital, Learn Capital)"
      - "Quantify claims wherever possible (e.g., 'market leader' → 'top 3 in sector')"
    sources:
      - "https://www.crunchbase.com/organization/reach-capital"
      - "https://www.crunchbase.com/organization/learn-capital"

  # Correction Object 4: Multiple Facts + Narrative
  - type: "mixed"
    inaccurate_information: "Portfolio construction assumes 25 investments"
    correct_information: "Portfolio will include 15-20 core investments, not 25"
    incomplete_information: "No mention of reserve strategy for follow-on rounds"
    additional_information: |
      The fund reserves 50% of capital for follow-on investments in top performers.
      Average initial check: $400K. Reserve per company: $300-500K.
    affected_sections:
      - "Portfolio Construction"
      - "Fund Strategy & Thesis"
    sources:
      - "data/Avalanche-deck.pdf"
    narrative_shaping_comments:
      - "Frame reserve strategy as deliberate capital deployment discipline"
      - "Compare concentration to industry norms (seed funds typically 30-40 companies)"
      - "Connect to ownership targets (8-12% initial, 10-15% after reserves)"

Version Management Options

Critical Design Decision: Should corrections modify the existing version or create a new version?

Option 1: output_mode: "new_version" (Recommended for most cases)

Creates a new version directory with corrected content, preserving the original.

How it works:

source_version: "v0.0.3"
output_mode: "new_version"

Behavior:

  1. Reads all artifacts from output/Avalanche-v0.0.3/
  2. Applies corrections to sections
  3. Creates output/Avalanche-v0.0.4/ with:
    • Corrected section files (2-sections/)
    • Updated final draft (4-final-draft.md)
    • Copied artifacts from v0.0.3 (state.json, research, validation)
    • New corrections-log.json documenting changes
    • Updated state.json with correction metadata
  4. Increments version: v0.0.3 → v0.0.4
  5. Updates output/versions.json
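Steps 3-4 above (copy artifacts, increment the version tag) might be sketched as follows; the directory layout matches the examples in this document, and the function names are illustrative:

```python
import re
import shutil
from pathlib import Path

def next_version(tag: str) -> str:
    """Increment the patch component: v0.0.3 -> v0.0.4."""
    major, minor, patch = re.fullmatch(r"v(\d+)\.(\d+)\.(\d+)", tag).groups()
    return f"v{major}.{minor}.{int(patch) + 1}"

def create_new_version(company: str, source_tag: str,
                       output_root: Path = Path("output")) -> Path:
    """Copy all artifacts from the source version into the next version dir."""
    dst = output_root / f"{company}-{next_version(source_tag)}"
    shutil.copytree(output_root / f"{company}-{source_tag}", dst)
    return dst
```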

Use when:

  • You want to preserve the original memo for comparison
  • Corrections might substantially change the narrative/recommendation
  • You want an audit trail of what changed between versions
  • Multiple stakeholders reviewing different versions
  • Experimenting with different correction approaches

Example: Fund size correction ($50M → $10M)

  • Creates v0.0.4 with corrected fund size
  • Original v0.0.3 remains unchanged
  • Can export both versions to compare side-by-side
  • If correction is wrong, v0.0.3 is still available

Option 2: output_mode: "in_place" (Use with caution)

Overwrites the source version’s artifacts directly. Original content is lost.

How it works:

source_version: "v0.0.3"
output_mode: "in_place"

Behavior:

  1. Reads all artifacts from output/Avalanche-v0.0.3/
  2. Applies corrections to sections
  3. Overwrites files in output/Avalanche-v0.0.3/:
    • Replaces section files in 2-sections/
    • Replaces 4-final-draft.md
    • Adds corrections-log.json
    • Updates state.json with correction metadata
  4. Version tag remains v0.0.3
  5. No new version created

Use when:

  • Minor corrections (typos, small factual updates)
  • You don’t need to preserve the original
  • Disk space is limited
  • Corrections are unambiguously correct
  • Internal draft that hasn’t been shared

Example: Typo correction (“Managign Partner” → “Managing Partner”)

  • Fixes typo directly in v0.0.3
  • No need to create v0.0.4 for a typo
  • Original v0.0.3 is overwritten

⚠️ Warning: This is destructive. Use --preview flag first to verify changes.


Version Specification Options

Option A: Version Tag (Recommended)

source_version: "v0.0.3"
  • Agent resolves to output/Avalanche-v0.0.3/
  • Validates version exists
  • Works with version history

Option B: Full Path

source_version: "output/Avalanche-v0.0.3"
  • Direct path to artifact directory
  • Useful if directory isn’t in standard location
  • Bypasses version resolution

Option C: Latest (via CLI, not YAML)

# Use --source-version latest flag
python rewrite-key-info.py "Avalanche" \
  --corrections data/Avalanche-corrections.yaml \
  --source-version latest
  • Agent finds latest version automatically
  • Useful for quick iterations
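The three options above could share a single resolver; a sketch, assuming the `output/{Company}-{tag}` layout used throughout this document:

```python
from pathlib import Path

def resolve_source_dir(company: str, source_version: str,
                       output_root: Path = Path("output")) -> Path:
    """Resolve a source_version value (tag, full path, or 'latest') to a directory."""
    if "/" in source_version:                      # Option B: full path
        return Path(source_version)
    if source_version == "latest":                 # Option C: newest version dir
        candidates = sorted(output_root.glob(f"{company}-v*"))
        if not candidates:
            raise FileNotFoundError(f"no versions found for {company}")
        return candidates[-1]  # naive lexicographic sort; parse tags in real code
    return output_root / f"{company}-{source_version}"  # Option A: version tag
```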

Version Comparison & Audit Trail

With new_version mode, the system creates a comparison log:

File: output/Avalanche-v0.0.4/corrections-log.json

{
  "source_version": "v0.0.3",
  "output_version": "v0.0.4",
  "output_mode": "new_version",
  "corrections_applied": 4,
  "sections_modified": 7,
  "timestamp": "2025-11-20T15:30:00Z",
  "corrections_file": "data/Avalanche-corrections.yaml",
  "changes": [
    {
      "correction_type": "inaccurate",
      "sections_affected": ["Fund Strategy & Thesis", "Portfolio Construction", "Fee Structure & Economics", "Executive Summary"],
      "instances_corrected": 11,
      "summary": "Corrected fund size from $50M to $10M"
    },
    {
      "correction_type": "incomplete",
      "sections_affected": ["GP Background & Track Record", "Executive Summary"],
      "facts_added": 5,
      "summary": "Added Katelyn's Pearson Ventures track record (18% IRR)"
    },
    {
      "correction_type": "narrative",
      "sections_affected": ["Investment Thesis"],
      "summary": "Reduced promotional language, added competitive comparisons"
    },
    {
      "correction_type": "mixed",
      "sections_affected": ["Portfolio Construction", "Fund Strategy & Thesis"],
      "instances_corrected": 6,
      "facts_added": 3,
      "summary": "Corrected portfolio size and added reserve strategy"
    }
  ],
  "narrative_impact": "Substantial - fund size correction changes check size strategy, portfolio construction, and economics sections. May affect overall recommendation.",
  "recommendation_changed": false,
  "recommendation_note": "Recommendation remains COMMIT but with updated rationale reflecting smaller fund size"
}
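Assembling a log in the shape shown above is straightforward; a sketch using the same field names (the function name is illustrative):

```python
from datetime import datetime, timezone

def build_corrections_log(source_version: str, output_version: str,
                          output_mode: str, changes: list[dict],
                          corrections_file: str) -> dict:
    """Assemble a corrections-log payload with a UTC timestamp."""
    return {
        "source_version": source_version,
        "output_version": output_version,
        "output_mode": output_mode,
        "corrections_applied": len(changes),
        "sections_modified": len({s for c in changes
                                  for s in c.get("sections_affected", [])}),
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "corrections_file": corrections_file,
        "changes": changes,
    }
```

The result would be serialized with `json.dumps(log, indent=2)` into the output version directory.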

Comparison Command (Future Enhancement):

# Compare two versions
python compare-versions.py Avalanche v0.0.3 v0.0.4

# Output:
# Differences between v0.0.3 and v0.0.4:
#   7 sections modified
#   11 factual corrections
#   3 narrative improvements
#   Recommendation: COMMIT (unchanged)
#   Key changes:
#     - Fund size: $50M → $10M
#     - Portfolio: 25 → 15-20 investments
#     - Added reserve strategy details

Impact on Research Data

Important: Corrections do NOT re-run research or regenerate from scratch.

What happens to research artifacts:

New Version Mode:

  • 1-research.json and 1-research.md are copied from source version
  • Optionally updated if correction fundamentally conflicts (see “Research Conflicts” below)
  • 0-deck-analysis.json is copied unchanged

In-Place Mode:

  • Research artifacts remain unchanged
  • Only section files and final draft are modified

Research Conflicts (Optional --update-research flag):

If correction contradicts research data, the agent can optionally update research:

# In corrections YAML
corrections:
  - type: "inaccurate"
    inaccurate_information: "Fund size $50M"
    correct_information: "Fund size $10M"
    update_research: true  # Optional: update research artifacts

Behavior with update_research: true:

  1. Agent detects conflict: research mentions “$50M” but correction says “$10M”
  2. Updates 1-research.json to reflect $10M
  3. Updates 1-research.md narrative
  4. Logs research update in corrections-log.json

Default behavior (update_research: false or omitted):

  • Research artifacts unchanged
  • Only sections and final draft corrected
  • Potential discrepancy logged in corrections-log.json

Why this matters: If research says “$50M” but memo says “$10M”, future regenerations might reintroduce the error. Updating research ensures consistency.
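Detecting the conflict described above (research still containing the outdated value) could be a recursive scan of 1-research.json; a minimal sketch with an illustrative function name:

```python
def find_research_conflicts(research_json: dict, incorrect_value: str) -> list[str]:
    """Return dotted paths in the research data that still contain an outdated value."""
    conflicts = []

    def walk(node, path):
        if isinstance(node, dict):
            for k, v in node.items():
                walk(v, f"{path}.{k}" if path else k)
        elif isinstance(node, list):
            for i, v in enumerate(node):
                walk(v, f"{path}[{i}]")
        elif isinstance(node, str) and incorrect_value.lower() in node.lower():
            conflicts.append(path)

    walk(research_json, "")
    return conflicts
```

With `update_research: false`, these paths would only be logged in corrections-log.json; with `true`, they identify the fields to rewrite.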


Correction Types

Type 1: inaccurate - Factual errors that must be corrected

  • Required fields: inaccurate_information, correct_information, affected_sections
  • Optional: sources, narrative_shaping_comments

Type 2: incomplete - Missing information that should be added

  • Required fields: incomplete_information, additional_information, affected_sections
  • Optional: sources, narrative_shaping_comments

Type 3: narrative - Tone/framing improvements without factual changes

  • Required fields: section, narrative_shaping_comments
  • Optional: sources (for competitive research, benchmarking)

Type 4: mixed - Combination of inaccurate + incomplete

  • Required fields: All of the above
  • Most comprehensive correction type
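The required-field rules above map directly onto a simple schema check; a sketch of what `validate_corrections_schema()` might do per correction object:

```python
REQUIRED_FIELDS = {
    "inaccurate": ["inaccurate_information", "correct_information", "affected_sections"],
    "incomplete": ["incomplete_information", "additional_information", "affected_sections"],
    "narrative":  ["section", "narrative_shaping_comments"],
    "mixed":      ["inaccurate_information", "correct_information",
                   "incomplete_information", "additional_information",
                   "affected_sections"],
}

def validate_correction(corr: dict) -> list[str]:
    """Return the missing required fields for a single correction object."""
    ctype = corr.get("type")
    if ctype not in REQUIRED_FIELDS:
        return [f"unknown correction type: {ctype!r}"]
    return [f for f in REQUIRED_FIELDS[ctype] if f not in corr]
```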

Workflow with YAML Corrections

Step 1: User Creates Correction File

# Copy template
cp templates/corrections-template.yaml data/Avalanche-corrections.yaml

# Edit with corrections
# User fills in specific corrections based on feedback

Step 2: Agent Parses YAML

import yaml  # requires PyYAML
from pathlib import Path
from typing import List

# CorrectionObject: dataclass defined alongside this loader in the
# key_info_rewrite agent module
def load_corrections_yaml(corrections_file: Path) -> List[CorrectionObject]:
    """Load and validate corrections YAML file."""
    with open(corrections_file) as f:
        data = yaml.safe_load(f)

    # Validate schema
    validate_corrections_schema(data)

    # Parse into CorrectionObject list
    corrections = []
    for corr in data['corrections']:
        corrections.append(CorrectionObject(
            type=corr['type'],
            inaccurate_info=corr.get('inaccurate_information'),
            correct_info=corr.get('correct_information'),
            incomplete_info=corr.get('incomplete_information'),
            additional_info=corr.get('additional_information'),
            affected_sections=corr.get('affected_sections', []),
            sources=corr.get('sources', []),
            narrative_comments=corr.get('narrative_shaping_comments', [])
        ))

    return corrections

Step 3: Source Verification (Optional)

def verify_corrections_with_sources(
    corrections: List[CorrectionObject],
    use_sonar_pro: bool = True
) -> List[VerificationResult]:
    """
    Use Perplexity Sonar Pro to verify corrections against provided sources.

    For each correction with sources:
    1. Fetch source content (if URL)
    2. Use Sonar Pro to verify correctness
    3. Return confidence score + evidence
    """

    if not use_sonar_pro:
        # Skip verification: trust every correction as-is
        return [VerificationResult(verified=True, confidence=1.0)
                for _ in corrections]

    results = []
    for correction in corrections:
        if not correction.sources:
            results.append(VerificationResult(verified=True, confidence=0.8,
                note="No sources provided, assuming user is correct"))
            continue

        # Build verification prompt
        prompt = f"""Verify this correction using the provided sources:

CLAIMED INACCURATE INFO: {correction.inaccurate_info}
CLAIMED CORRECT INFO: {correction.correct_info}

SOURCES TO VERIFY:
{chr(10).join(correction.sources)}

TASK:
1. Check if the correction is accurate according to sources
2. Return confidence score (0.0-1.0)
3. Provide evidence from sources

Return JSON:
{{
    "verified": true/false,
    "confidence": 0.95,
    "evidence": "Quote or summary from sources",
    "concerns": "Any potential issues"
}}
"""

        # Call Sonar Pro
        response = perplexity_client.chat.completions.create(
            model="sonar-pro",
            messages=[{"role": "user", "content": prompt}]
        )

        result = parse_verification_result(response)
        results.append(result)

    return results

Step 4: Apply Corrections Section-by-Section

def apply_correction_to_section(
    section_file: Path,
    correction: CorrectionObject,
    company_name: str
) -> str:
    """Apply single correction to section with narrative guidance."""

    with open(section_file) as f:
        original_content = f.read()

    # Build correction prompt with narrative guidance
    correction_prompt = f"""You are correcting an investment memo section for {company_name}.

CORRECTION TYPE: {correction.type}

{"INACCURATE INFORMATION: " + correction.inaccurate_info if correction.inaccurate_info else ""}
{"CORRECT INFORMATION: " + correction.correct_info if correction.correct_info else ""}
{"INCOMPLETE - MISSING: " + correction.incomplete_info if correction.incomplete_info else ""}
{"ADDITIONAL INFORMATION: " + correction.additional_info if correction.additional_info else ""}

NARRATIVE SHAPING GUIDANCE:
{chr(10).join(f"• {comment}" for comment in correction.narrative_comments)}

SOURCES FOR REFERENCE:
{chr(10).join(correction.sources)}

CURRENT SECTION CONTENT:
{original_content}

TASK:
1. Apply factual corrections (inaccurate → correct)
2. Add missing information (incomplete → additional)
3. Follow narrative shaping guidance for tone and framing
4. Preserve ALL existing citations
5. Add NEW citations for newly added facts (use sources provided)
6. Maintain formatting and structure

Return ONLY the corrected section content with citations.
"""

    # Call Claude for correction
    response = anthropic_client.invoke(correction_prompt)
    corrected_content = response.content

    return corrected_content

Step 5: CLI Usage

# Basic usage: apply corrections from YAML
# (source_version and output_mode specified in YAML)
python rewrite-key-info.py --corrections data/Avalanche-corrections.yaml

# With source verification (uses Sonar Pro to verify corrections)
python rewrite-key-info.py \
  --corrections data/Avalanche-corrections.yaml \
  --verify-sources

# Preview mode (show what would change without saving)
python rewrite-key-info.py \
  --corrections data/Avalanche-corrections.yaml \
  --preview

# Override YAML output mode (force in-place even if YAML says new_version)
python rewrite-key-info.py \
  --corrections data/Avalanche-corrections.yaml \
  --output-mode in_place

# Override source version (use latest instead of YAML-specified version)
python rewrite-key-info.py \
  --corrections data/Avalanche-corrections.yaml \
  --source-version latest

# Direct path to artifact directory (bypasses company resolution)
python rewrite-key-info.py \
  --corrections data/Avalanche-corrections.yaml \
  --source-path output/Avalanche-v0.0.3

CLI Flag Priority:

  1. CLI flags override YAML settings
  2. YAML settings override defaults
  3. Defaults: output_mode: "new_version", source_version: "latest"
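The precedence rule above (CLI flags over YAML over defaults) can be expressed as a small merge; a sketch, with the function name chosen for illustration:

```python
DEFAULTS = {"output_mode": "new_version", "source_version": "latest"}

def resolve_settings(yaml_config: dict, cli_args: dict) -> dict:
    """Merge settings with precedence: CLI flags > YAML > defaults."""
    settings = dict(DEFAULTS)
    settings.update({k: v for k, v in yaml_config.items() if v is not None})
    settings.update({k: v for k, v in cli_args.items() if v is not None})
    return settings
```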

Step 6: Output

Example 1: New Version Mode

📋 Loaded corrections: data/Avalanche-corrections.yaml
  Company: Avalanche
  Source version: v0.0.3
  Output mode: new_version → v0.0.4
  Corrections: 4

🔍 Verifying corrections with sources...
  ✓ Correction 1: Verified (confidence: 0.95) - Fund size $10M confirmed
  ✓ Correction 2: Verified (confidence: 0.92) - Katelyn's track record confirmed
  ✓ Correction 3: No verification needed (narrative only)
  ✓ Correction 4: Verified (confidence: 0.88) - Portfolio construction confirmed

📝 Applying corrections...
  Correction 1 (inaccurate):
    ✓ Fund Strategy & Thesis (3 instances corrected)
    ✓ Portfolio Construction (2 instances corrected)
    ✓ Fee Structure & Economics (4 instances corrected)
    ✓ Executive Summary (2 instances corrected)

  Correction 2 (incomplete):
    ✓ GP Background & Track Record (added Pearson metrics)
    ✓ Executive Summary (added track record summary)

  Correction 3 (narrative):
    ✓ Investment Thesis (toned down promotional language, added comparisons)

  Correction 4 (mixed):
    ✓ Portfolio Construction (corrected count, added reserve strategy)
    ✓ Fund Strategy & Thesis (added reserve discussion)

📦 Creating new version: v0.0.4
  ✓ Copied artifacts from v0.0.3
  ✓ Applied corrections to 7 sections
  ✓ Updated state.json with correction metadata
  ✓ Created corrections-log.json

✅ Reassembled final draft: output/Avalanche-v0.0.4/4-final-draft.md

📊 Correction Summary:
  Source version: v0.0.3
  Output version: v0.0.4 (NEW)
  Total corrections: 4
  Sections modified: 7/10
  Instances corrected: 15
  Citations added: 8
  Narrative improvements: 1 section

📝 Correction log saved: output/Avalanche-v0.0.4/corrections-log.json

Next steps:
  1. Review corrections: output/Avalanche-v0.0.4/2-sections/
  2. View final draft: output/Avalanche-v0.0.4/4-final-draft.md
  3. Compare versions: diff output/Avalanche-v0.0.3/4-final-draft.md output/Avalanche-v0.0.4/4-final-draft.md
  4. Export to HTML: python export-branded.py output/Avalanche-v0.0.4/4-final-draft.md

Example 2: In-Place Mode

📋 Loaded corrections: data/Avalanche-corrections.yaml
  Company: Avalanche
  Source version: v0.0.3
  Output mode: in_place (⚠️ will overwrite v0.0.3)
  Corrections: 1

⚠️  WARNING: In-place mode will overwrite existing artifacts.
    Use --preview to see changes before applying.
    Original content will be lost. Continue? [y/N]: y

📝 Applying corrections...
  Correction 1 (inaccurate):
    ✓ GP Background & Track Record (1 instance corrected)

✅ Updated final draft: output/Avalanche-v0.0.3/4-final-draft.md

📊 Correction Summary:
  Version: v0.0.3 (MODIFIED IN-PLACE)
  Total corrections: 1
  Sections modified: 1/10
  Instances corrected: 1

📝 Correction log saved: output/Avalanche-v0.0.3/corrections-log.json

Next steps:
  1. Review corrections: output/Avalanche-v0.0.3/2-sections/
  2. View final draft: output/Avalanche-v0.0.3/4-final-draft.md
  3. Export to HTML: python export-branded.py output/Avalanche-v0.0.3/4-final-draft.md

Benefits of YAML Approach

1. Comprehensive Corrections

  • Single file can fix multiple issues across entire memo
  • Supports fact corrections, additions, and narrative guidance
  • Clear categorization of correction types

2. Source Integration

  • Reference authoritative sources for verification
  • Automatically verify corrections with Sonar Pro
  • Add citations to newly added facts

3. Narrative Control

  • Shape tone and framing with explicit guidance
  • Control not just the facts but how they are presented
  • Maintain analytical rigor over promotional tone

4. Audit Trail

  • Correction YAML files tracked in version control
  • corrections-log.json records what was changed
  • Easy to understand what was corrected and why

5. Reusability

  • Save correction templates for common issues
  • Apply same corrections to multiple memo versions
  • Share correction patterns across projects

6. Batch Efficiency

  • Fix 10+ issues in one run
  • Fewer API calls than iterative corrections
  • Consistent application across all sections

Architecture Design

New Agent: src/agents/key_info_rewrite.py

Agent Function:

def key_information_rewrite_agent(state: MemoState) -> dict:
    """
    Correct crucial information that affects multiple sections.

    Args:
        state: Must contain:
            - correction_instruction: str
              Example: "The fund size is $10M, not $50M"
            - company_name: str
            - latest_output_dir: Path (optional, auto-detected if not provided)

    Process:
        1. Load final draft from latest version
        2. Analyze correction to identify affected sections
        3. For each affected section:
            a. Load section file from 2-sections/
            b. Apply correction via LLM
            c. Preserve citations and formatting
            d. Save corrected section
        4. Reassemble final draft
        5. Update metadata

    Returns:
        {
            "sections_corrected": int,
            "instances_found": int,
            "files_updated": List[str],
            "messages": List[str]
        }
    """

Correction Analysis Algorithm

Phase 1: Parse Correction Instruction

def analyze_correction(instruction: str, company_name: str) -> CorrectionAnalysis:
    """
    Use LLM to understand correction and identify search terms.

    Returns:
        CorrectionAnalysis:
            - incorrect_info: str ("$50M")
            - correct_info: str ("$10M")
            - semantic_variations: List[str] (["fifty million", "Fund II size", "10M fund"])
            - affected_section_types: List[str] (["Fund Strategy", "Economics", "Portfolio"])
    """

    analysis_prompt = f"""Analyze this correction instruction for {company_name}:

INSTRUCTION: {instruction}

TASK: Extract structured information:
1. What information is INCORRECT?
2. What is the CORRECT information?
3. What semantic variations might appear? (paraphrases, related concepts)
4. Which section types are likely affected?

Return JSON:
{{
    "incorrect_info": "exact text",
    "correct_info": "exact text",
    "semantic_variations": ["variant1", "variant2"],
    "affected_section_types": ["section name 1", "section name 2"]
}}
"""

    # Call Claude for analysis
    response = anthropic_client.invoke(analysis_prompt)
    return CorrectionAnalysis.parse(response.content)

Phase 2: Identify Affected Sections

def identify_affected_sections(
    correction_analysis: CorrectionAnalysis,
    artifact_dir: Path
) -> List[SectionInfo]:
    """
    Scan all section files to find which ones contain the error.

    Returns:
        List of SectionInfo:
            - section_name: str
            - section_file: Path
            - instances_found: int
            - sample_text: str (preview of error)
    """

    affected_sections = []
    sections_dir = artifact_dir / "2-sections"

    for section_file in sections_dir.glob("*.md"):
        with open(section_file) as f:
            content = f.read()

        # Check for exact match
        exact_count = content.count(correction_analysis.incorrect_info)

        # Check for semantic variations
        variation_count = 0
        for variation in correction_analysis.semantic_variations:
            variation_count += content.lower().count(variation.lower())

        total_instances = exact_count + variation_count

        if total_instances > 0:
            affected_sections.append(SectionInfo(
                section_name=extract_section_name(section_file),
                section_file=section_file,
                instances_found=total_instances,
                sample_text=extract_sample(content, correction_analysis.incorrect_info)
            ))

    return affected_sections

Phase 3: Apply Correction to Each Section

def correct_section(
    section_file: Path,
    correction_analysis: CorrectionAnalysis,
    other_sections_context: str,
    company_name: str
) -> str:
    """
    Use LLM to apply correction while preserving formatting and citations.
    """

    with open(section_file) as f:
        original_content = f.read()

    correction_prompt = f"""You are correcting a factual error in an investment memo section.

COMPANY: {company_name}

CORRECTION REQUIRED:
  Incorrect: {correction_analysis.incorrect_info}
  Correct: {correction_analysis.correct_info}

CONTEXT FROM OTHER SECTIONS:
{other_sections_context}

CURRENT SECTION CONTENT:
{original_content}

TASK:
1. Find ALL instances of the incorrect information (including paraphrases)
2. Replace with the correct information
3. Ensure consistency throughout the section
4. Update any dependent claims (e.g., if fund size changes, check sizes may change)
5. Preserve ALL citations - do not remove or modify them
6. Preserve all formatting (headers, lists, emphasis)
7. Do NOT change other content unrelated to the correction

CRITICAL:
- If a claim becomes unsupported after correction, flag it with [NEEDS CITATION]
- Maintain the analytical tone and depth
- Return ONLY the corrected section content

CORRECTED SECTION:
"""

    # Call Claude
    response = anthropic_client.invoke(correction_prompt)
    corrected_content = response.content

    # Save corrected section
    with open(section_file, "w") as f:
        f.write(corrected_content)

    return corrected_content

Phase 4: Reassemble Final Draft

def reassemble_after_correction(artifact_dir: Path) -> Path:
    """Reassemble 4-final-draft.md after corrections."""

    # Same logic as Feature #1 reassembly
    content = ""

    # Load header
    header_file = artifact_dir / "header.md"
    if header_file.exists():
        with open(header_file) as f:
            content = f.read() + "\n"

    # Load all sections in order
    sections_dir = artifact_dir / "2-sections"
    for section_file in sorted(sections_dir.glob("*.md")):
        with open(section_file) as f:
            content += f.read() + "\n\n"

    # Save final draft
    final_draft = artifact_dir / "4-final-draft.md"
    with open(final_draft, "w") as f:
        f.write(content.strip())

    return final_draft

CLI Interface

Standalone Script: rewrite-key-info.py

Usage:

# Activate venv first
source .venv/bin/activate

# Basic correction
python rewrite-key-info.py "Avalanche" \
  --correction "The fund size is $10M, not $50M"

# Specify version
python rewrite-key-info.py "Avalanche" \
  --correction "Katelyn Donnelly is Managing Partner, not Partner" \
  --version v0.0.1

# Direct path
python rewrite-key-info.py output/Avalanche-v0.0.1 \
  --correction "Company founded in 2019, not 2020"

# Preview mode (don't save)
python rewrite-key-info.py "Avalanche" \
  --correction "Series A, not Series B" \
  --preview

# Update research data too (deep mode)
python rewrite-key-info.py "Avalanche" \
  --correction "Fund size is $10M" \
  --update-research

Output Example:

🔍 Analyzing correction...
  Incorrect: "$50M"
  Correct: "$10M"
  Semantic variations: "fifty million", "Fund II target", "target size"

🔎 Scanning sections...
  ✓ Found errors in 7/10 sections:
    • Fund Strategy & Thesis (3 instances)
    • Portfolio Construction (2 instances)
    • Fee Structure & Economics (4 instances)
    • Value Add & Differentiation (1 instance)
    • Track Record Analysis (2 instances)
    • Risks & Mitigations (1 instance)
    • Executive Summary (2 instances)

📝 Applying corrections...
  ✓ Corrected: Fund Strategy & Thesis
  ✓ Corrected: Portfolio Construction
  ✓ Corrected: Fee Structure & Economics
  ✓ Corrected: Value Add & Differentiation
  ✓ Corrected: Track Record Analysis
  ✓ Corrected: Risks & Mitigations
  ✓ Corrected: Executive Summary

✅ Reassembled final draft

📊 Correction Summary:
  Sections modified: 7/10
  Total instances corrected: 15
  Files updated:
    • 2-sections/03-fund-strategy--thesis.md
    • 2-sections/04-portfolio-construction.md
    • 2-sections/07-fee-structure--economics.md
    • 2-sections/05-value-add--differentiation.md
    • 2-sections/06-track-record-analysis.md
    • 2-sections/08-risks--mitigations.md
    • 2-sections/01-executive-summary.md
    • 4-final-draft.md

Next steps:
  1. Review corrections in: output/Avalanche-v0.0.1/
  2. Export to HTML: python export-branded.py output/Avalanche-v0.0.1/4-final-draft.md
  3. Create new version: python -m src.main "Avalanche" --version-only

Implementation Steps (YAML-Based)

Step 0: Create YAML Template

New File: templates/corrections-template.yaml

Content:

# Investment Memo Correction Template
# Copy to data/{CompanyName}-corrections.yaml and fill in corrections

company: "CompanyName"

# VERSION MANAGEMENT (required)
source_version: "v0.0.3"  # Which version to use as source
# Alternatives:
#   source_version: "latest"  # Use latest version
#   source_version: "output/CompanyName-v0.0.3"  # Full path

output_mode: "new_version"  # "new_version" or "in_place"
# new_version: Creates v0.0.4 from v0.0.3 (preserves original)
# in_place: Overwrites v0.0.3 directly (DESTRUCTIVE - use with caution)

date_created: "YYYY-MM-DD"

corrections:
  # Example 1: Inaccurate information
  - type: "inaccurate"
    inaccurate_information: |
      Describe what's incorrect in the memo
    correct_information: |
      Provide the correct information
    affected_sections:
      - "Section Name 1"
      - "Section Name 2"
    sources:
      - "https://source-url.com"
      - "data/document.pdf"
    narrative_shaping_comments:
      - "Guidance on how to frame this correction"
      - "Additional context or emphasis"

  # Example 2: Incomplete information
  - type: "incomplete"
    incomplete_information: |
      Describe what's missing
    additional_information: |
      Provide the missing information
    affected_sections:
      - "Section Name"
    sources:
      - "https://source-url.com"
    narrative_shaping_comments:
      - "How to integrate this information"

  # Example 3: Narrative shaping only
  - type: "narrative"
    section: "Section Name"
    narrative_shaping_comments:
      - "Remove promotional language"
      - "Add balanced risk discussion"
      - "Quantify vague claims"
    sources:
      - "https://competitor-comparison.com"

  # Example 4: Mixed correction
  - type: "mixed"
    inaccurate_information: "What's wrong"
    correct_information: "What's correct"
    incomplete_information: "What's missing"
    additional_information: "What to add"
    affected_sections:
      - "Section 1"
      - "Section 2"
    sources:
      - "https://source.com"
    narrative_shaping_comments:
      - "How to present this holistically"

Step 1: Create YAML Parser & Schema

New File: src/corrections.py

Implement:

from dataclasses import dataclass
from typing import List, Optional
from pathlib import Path
import yaml

@dataclass
class CorrectionObject:
    """Represents a single correction from YAML."""
    type: str  # "inaccurate", "incomplete", "narrative", "mixed"
    inaccurate_info: Optional[str] = None
    correct_info: Optional[str] = None
    incomplete_info: Optional[str] = None
    additional_info: Optional[str] = None
    affected_sections: Optional[List[str]] = None
    section: Optional[str] = None  # For narrative-only corrections
    sources: Optional[List[str]] = None
    narrative_comments: Optional[List[str]] = None

    def __post_init__(self):
        if self.affected_sections is None:
            self.affected_sections = []
        if self.sources is None:
            self.sources = []
        if self.narrative_comments is None:
            self.narrative_comments = []

def load_corrections_yaml(corrections_file: Path) -> dict:
    """Load and validate corrections YAML file."""
    with open(corrections_file) as f:
        data = yaml.safe_load(f)

    # Validate schema
    validate_corrections_schema(data)

    return data

def validate_corrections_schema(data: dict) -> None:
    """Validate YAML structure and required fields."""
    required_top = ["company", "corrections"]
    for field in required_top:
        if field not in data:
            raise ValueError(f"Missing required field: {field}")

    for i, corr in enumerate(data["corrections"]):
        if "type" not in corr:
            raise ValueError(f"Correction {i+1}: Missing 'type' field")

        corr_type = corr["type"]

        if corr_type == "inaccurate":
            required = ["inaccurate_information", "correct_information", "affected_sections"]
            for field in required:
                if field not in corr:
                    raise ValueError(f"Correction {i+1} (inaccurate): Missing '{field}'")

        elif corr_type == "incomplete":
            required = ["incomplete_information", "additional_information", "affected_sections"]
            for field in required:
                if field not in corr:
                    raise ValueError(f"Correction {i+1} (incomplete): Missing '{field}'")

        elif corr_type == "narrative":
            required = ["section", "narrative_shaping_comments"]
            for field in required:
                if field not in corr:
                    raise ValueError(f"Correction {i+1} (narrative): Missing '{field}'")

        elif corr_type == "mixed":
            required = ["affected_sections"]
            for field in required:
                if field not in corr:
                    raise ValueError(f"Correction {i+1} (mixed): Missing '{field}'")

        else:
            raise ValueError(f"Correction {i+1}: Invalid type '{corr_type}'")

def parse_corrections(data: dict) -> List[CorrectionObject]:
    """Parse validated YAML into CorrectionObject list."""
    corrections = []
    for corr in data["corrections"]:
        corrections.append(CorrectionObject(
            type=corr["type"],
            inaccurate_info=corr.get("inaccurate_information"),
            correct_info=corr.get("correct_information"),
            incomplete_info=corr.get("incomplete_information"),
            additional_info=corr.get("additional_information"),
            affected_sections=corr.get("affected_sections", []),
            section=corr.get("section"),
            sources=corr.get("sources", []),
            narrative_comments=corr.get("narrative_shaping_comments", [])
        ))
    return corrections

Testing:

def test_load_corrections_yaml(tmp_path):
    yaml_content = """
company: "TestCo"
corrections:
  - type: "inaccurate"
    inaccurate_information: "Wrong info"
    correct_information: "Right info"
    affected_sections: ["Team"]
"""
    corrections_file = tmp_path / "test-corrections.yaml"
    corrections_file.write_text(yaml_content)

    data = load_corrections_yaml(corrections_file)
    corrections = parse_corrections(data)

    assert data["company"] == "TestCo"
    assert len(corrections) == 1
    assert corrections[0].type == "inaccurate"

Step 2: Create CLI Script with YAML Support

New File: rewrite-key-info.py

Structure:

#!/usr/bin/env python3
"""
Correct crucial information in investment memos using YAML correction files.

USAGE:
    python rewrite-key-info.py "Company" --corrections data/Company-corrections.yaml
    python rewrite-key-info.py "Company" --corrections data/Company-corrections.yaml --verify-sources
"""

import argparse
import sys
from pathlib import Path
from rich.console import Console
from rich.panel import Panel
from src.corrections import load_corrections_yaml, parse_corrections
from src.agents.key_info_rewrite import apply_corrections_to_memo
from src.utils import get_latest_output_dir

def main():
    parser = argparse.ArgumentParser(
        description="Apply YAML-based corrections to investment memos"
    )
    parser.add_argument("target", help="Company name or path to artifact directory")
    parser.add_argument("--corrections", required=True, help="Path to corrections YAML file")
    parser.add_argument("--version", help="Specific version (default: latest)")
    parser.add_argument("--verify-sources", action="store_true",
                       help="Verify corrections with Perplexity Sonar Pro")
    parser.add_argument("--preview", action="store_true", help="Preview without saving")

    args = parser.parse_args()

    console = Console()

    # Load corrections YAML
    corrections_file = Path(args.corrections)
    if not corrections_file.exists():
        console.print(f"[red]Error: Corrections file not found:[/red] {corrections_file}")
        sys.exit(1)

    console.print(f"[bold]Loading corrections:[/bold] {corrections_file}")
    data = load_corrections_yaml(corrections_file)
    corrections = parse_corrections(data)

    console.print(f"  Company: {data['company']}")
    console.print(f"  Corrections: {len(corrections)}")

    # Determine artifact directory
    # ... (similar to improve-section.py)

    # Apply corrections
    result = apply_corrections_to_memo(
        artifact_dir=artifact_dir,
        corrections=corrections,
        verify_sources=args.verify_sources,
        preview=args.preview,
        console=console
    )

    # Display summary
    # ...

Step 3: State Schema Updates

Update: src/state.py

Add Field:

class MemoState(TypedDict):
    # ... existing fields ...

    # NEW: For key information corrections
    correction_instruction: NotRequired[str]
    correction_metadata: NotRequired[Dict[str, Any]]  # Track what was corrected

Step 4: Workflow Integration (Optional)

Update: src/workflow.py

Add Conditional Node:

def build_workflow():
    workflow = StateGraph(MemoState)

    # ... existing nodes ...

    # NEW: Optional correction node
    workflow.add_node("correct_key_info", key_information_rewrite_agent)

    # Conditional routing
    def should_correct(state: MemoState) -> str:
        if state.get("correction_instruction"):
            return "correct_key_info"
        return "continue"

    workflow.add_conditional_edges(
        "validate",
        should_correct,
        {
            "correct_key_info": "correct_key_info",
            "continue": "finalize"
        }
    )
    workflow.add_edge("correct_key_info", "finalize")

CLI Support:

# Run memo generation with correction
python -m src.main "Avalanche" --correct "Fund size is $10M, not $50M"

Step 5: Handle Edge Cases

Scenarios:

  1. No instances found: Warn user, don’t modify anything
  2. Conflicting citations: Flag sections that need manual review
  3. Dependent claims: Identify claims that may be affected
  4. Research data conflicts: Warn if correction contradicts research

Code:

def validate_correction_safety(
    correction_analysis: CorrectionAnalysis,
    affected_sections: List[SectionInfo],
    research_data: dict
) -> List[str]:
    """Check for potential issues before applying correction."""

    warnings = []

    # No instances found
    if not affected_sections:
        warnings.append("⚠️  No instances of incorrect information found")

    # Check research data conflicts
    research_text = str(research_data)
    if correction_analysis.incorrect_info in research_text:
        warnings.append(
            "⚠️  Research data contains the incorrect information. "
            "Consider using --update-research flag."
        )

    # Check for many instances (may indicate systemic issue)
    total_instances = sum(s.instances_found for s in affected_sections)
    if total_instances > 20:
        warnings.append(
            f"⚠️  Found {total_instances} instances across {len(affected_sections)} sections. "
            "This may indicate a deeper issue. Review carefully after correction."
        )

    return warnings

Step 6: Research Data Updates (--update-research)

If Flag Set:

def update_research_data(
    artifact_dir: Path,
    correction_analysis: CorrectionAnalysis
) -> None:
    """Update research.json with corrected information."""

    research_file = artifact_dir / "1-research.json"
    if not research_file.exists():
        return

    with open(research_file) as f:
        research_data = json.load(f)

    # Apply correction to research data fields
    research_json = json.dumps(research_data)
    corrected_json = research_json.replace(
        correction_analysis.incorrect_info,
        correction_analysis.correct_info
    )
    research_data = json.loads(corrected_json)

    # Save updated research
    with open(research_file, "w") as f:
        json.dump(research_data, f, indent=2)

    # Also update 1-research.md
    research_md = artifact_dir / "1-research.md"
    if research_md.exists():
        with open(research_md) as f:
            content = f.read()

        corrected_content = content.replace(
            correction_analysis.incorrect_info,
            correction_analysis.correct_info
        )

        with open(research_md, "w") as f:
            f.write(corrected_content)

Step 7: Testing & Validation

Test Suite:

  1. ✅ Simple correction (fund size)
  2. ✅ Complex correction (person title + role)
  3. ✅ Date correction with timeline impact
  4. ✅ Multiple semantic variations
  5. ✅ Correction with citation conflicts
  6. ✅ No instances found (error case)
  7. ✅ Preview mode
  8. ✅ Research data update

Manual Testing Checklist:

  • Run on Avalanche $50M → $10M
  • Verify all 7 sections corrected
  • Check citations preserved
  • Verify formatting maintained
  • Review reassembled final draft
  • Export to HTML and verify
  • Test with --update-research flag
  • Test with --preview flag

Step 8: Documentation

Update Files:

  1. CLAUDE.md: Add Key Information Rewrite section
  2. README.md: Move to “Completed” ✅
  3. Create docs/CORRECTIONS.md: Guide with examples
  4. Add examples to docs/EXAMPLES.md

Documentation Structure:

# Key Information Rewrite Guide

## When to Use

Use key information rewrite when:
- A crucial fact appears in multiple sections
- The error affects related claims (e.g., fund size affects check sizes)
- Manual editing would be error-prone

Do NOT use when:
- Error is in only one section (use improve-section.py instead)
- You want to rephrase content (use improve-section.py)
- You need to add new information (use improve-section.py or regenerate)

## Common Scenarios

### Fund Size Correction
...

### Person Title/Role Correction
...

### Date/Timeline Correction
...

### Investment Stage Correction
...

Implementation Roadmap

Step 1: Feature #1 - Sonar Pro Integration ✅ COMPLETED

Objective: Update improve-section.py to use Perplexity Sonar Pro for one-step improvements with citations

Tasks:

  • Replace Claude with Sonar Pro in improve_section_with_agent()
  • Update prompt to include citation instructions
  • Test on Avalanche Team section
  • Verify citations properly formatted
  • Compare quality to Claude-only approach

Deliverables:

  • ✅ Updated improve-section.py (commit: 6fbafe5)
  • ✅ Test results: Avalanche Team section (11 citations added)
  • ✅ Quality verified: Significant improvement with concrete details

Completion Date: 2025-11-20


Step 2: Feature #1 - Reassembly ✅ COMPLETED

Objective: Add ability to reassemble final draft after section improvement

Tasks:

  • Implement reassemble_final_draft() function
  • Make reassembly automatic after improvement (the planned --rebuild-final flag became unnecessary)
  • Test reassembly on improved sections
  • Verify formatting preserved

Deliverables:

  • ✅ Working reassembly feature (automatic after improvement)
  • ✅ Includes header.md (company trademark)
  • ✅ Verified on Avalanche final draft

Completion Date: 2025-11-20


Step 3: Feature #1 - Before/After Preview ⏳ PENDING

Objective: Show improvements before applying

Tasks:

  • Add --preview flag
  • Implement diff display
  • Add confirmation prompt
  • Show metrics (word count, citations, etc.)

Deliverables:

  • Preview mode implementation
  • User-friendly diff output

Step 4: Feature #1 - Documentation & Testing 🔄 IN PROGRESS

Objective: Document Feature #1 and complete testing

Tasks:

  • Test on 1 section (Avalanche Team) ✅
  • Test on 4 more sections from different memos
  • Handle edge cases (missing API key, invalid sections) ✅
  • Update README.md ✅
  • Update CLAUDE.md
  • Mark as complete in README “Remaining Enhancements” ✅

Deliverables:

  • 🔄 Test results: 1/5 memos tested
  • 🔄 Documentation: README done, CLAUDE in progress
  • ⏳ Feature marked complete

Step 5: Feature #2 - YAML Template & Parser

Objective: Create correction YAML template and parser

Tasks:

  • Create templates/corrections-template.yaml
  • Create src/corrections.py with CorrectionObject dataclass
  • Implement load_corrections_yaml()
  • Implement validate_corrections_schema()
  • Implement parse_corrections()
  • Write unit tests for YAML parsing

Deliverables:

  • Working YAML template
  • Validated YAML parser
  • Unit tests passing

Step 6: Feature #2 - Agent Core (YAML-Based)

Objective: Create key_info_rewrite agent with YAML corrections support

Tasks:

  • Create src/agents/key_info_rewrite.py
  • Implement apply_correction_to_section() with narrative guidance
  • Implement apply_corrections_to_memo() (batch processor)
  • Optional: Implement verify_corrections_with_sources() (Sonar Pro)
  • Implement reassemble_after_correction()
  • Handle all 4 correction types (inaccurate, incomplete, narrative, mixed)
  • Write unit tests

Deliverables:

  • Working agent module
  • Support for all correction types
  • Unit tests passing

Step 7: Feature #2 - CLI Script (YAML-Based)

Objective: Create standalone CLI for YAML-based corrections

Tasks:

  • Create rewrite-key-info.py
  • Implement --corrections flag (required, YAML path)
  • Implement --verify-sources flag (optional, uses Sonar Pro)
  • Add preview mode
  • Implement rich console output with progress
  • Save corrections-log.json for audit trail

Deliverables:

  • Working CLI script
  • Help documentation
  • Example YAML files

Step 8: Feature #2 - Testing & Validation

Objective: Comprehensive testing of correction feature

Tasks:

  • Test on Avalanche $50M → $10M
  • Test person title correction
  • Test date correction
  • Test with semantic variations
  • Test preview mode
  • Test research updates
  • Handle edge cases

Deliverables:

  • Test results for all scenarios
  • Edge case handling
  • Bug fixes

Step 9: Feature #2 - Workflow Integration (Optional)

Objective: Allow corrections during memo generation workflow

Tasks:

  • Update MemoState schema
  • Add conditional routing in workflow
  • Add --correct flag to main CLI
  • Test integrated workflow

Deliverables:

  • Workflow integration
  • Updated CLI interface

Step 10: Documentation & Examples

Objective: Complete documentation for both features

Tasks:

  • Create docs/CORRECTIONS.md guide
  • Add examples to docs/EXAMPLES.md
  • Update CLAUDE.md comprehensively
  • Update README.md
  • Mark both features complete ✅

Deliverables:

  • Complete documentation
  • Usage examples
  • Features marked complete in README

Success Criteria

Feature #1: Section Improvement

Must Have:

  • ✅ Uses Perplexity Sonar Pro (not Claude) - COMPLETED
  • ✅ Citations added during improvement (not after) - COMPLETED
  • ✅ Obsidian-style citation format - COMPLETED
  • ✅ Preserves artifact structure - COMPLETED
  • ✅ Can reassemble final draft - COMPLETED (automatic)
  • ✅ Error handling for missing API keys - COMPLETED

Nice to Have:

  • ⏳ Before/after preview mode - PENDING
  • ⏳ Word count and quality metrics - PENDING
  • ⏳ Comparison with original section - PENDING

Status: Core functionality COMPLETED ✅ | Enhancement features PENDING

Feature #2: Key Information Rewrite

Must Have:

  • ✅ Identifies all affected sections
  • ✅ Applies corrections consistently
  • ✅ Preserves citations and formatting
  • ✅ Reassembles final draft automatically
  • ✅ Shows summary of changes

Nice to Have:

  • ✅ Updates research data (--update-research)
  • ✅ Preview mode before applying
  • ✅ Semantic variation detection
  • ✅ Workflow integration

Technical Considerations

API Costs

Feature #1 (Sonar Pro per section):

  • Cost: ~$0.50-1.00 per section improvement
  • Context: ~5k chars in, ~7k chars out
  • Model: sonar-pro

Feature #2 (Corrections):

  • Analysis: 1 Claude call (~$0.01)
  • Per section: 1 Claude call (~$0.05)
  • Total for 7 sections: ~$0.36
  • Model: claude-sonnet-4-5

Comparison to Full Regeneration:

  • Full regeneration: 10 sections × $1.00 = $10.00
  • Section improvement: 1 section × $0.75 = $0.75 (13× cheaper)
  • Key correction: 7 sections × $0.05 = $0.35 (29× cheaper)
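A quick back-of-envelope calculation reproduces the comparison above. The per-unit prices are the rough estimates from this section, not measured API billing:

```python
# Rough per-unit cost estimates from this section (not measured billing)
FULL_REGEN_PER_SECTION = 1.00   # USD, full regeneration via Sonar Pro
IMPROVE_PER_SECTION = 0.75      # USD, single-section improvement
CORRECTION_PER_SECTION = 0.05   # USD, Claude correction call

full_regeneration = 10 * FULL_REGEN_PER_SECTION   # $10.00 for a 10-section memo
section_improvement = 1 * IMPROVE_PER_SECTION     # $0.75 for one section
key_correction = 7 * CORRECTION_PER_SECTION       # $0.35 for 7 sections

print(f"Improvement is ~{full_regeneration / section_improvement:.0f}x cheaper")
print(f"Correction is ~{full_regeneration / key_correction:.0f}x cheaper")
```

The savings scale with memo length: the larger the memo, the more a targeted correction undercuts full regeneration.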

Performance

Feature #1:

  • Time: ~30-60 seconds per section (Sonar Pro call)
  • Parallel: Not applicable (one section at a time)

Feature #2:

  • Analysis: ~5 seconds
  • Section scanning: ~1 second
  • Correction per section: ~10-15 seconds
  • Total for 7 sections: ~90 seconds (vs. 10+ minutes for full regeneration)

Rate Limits

Perplexity Sonar Pro:

  • Rate limit: 50 requests/minute
  • Constraint: None (processing one section at a time)

Anthropic Claude:

  • Rate limit: 50 requests/minute
  • Constraint: None for corrections (max ~10 sections)
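Even well under these limits, a small retry wrapper is cheap insurance against transient 429 responses. A minimal sketch (the callable, retry counts, and delays are illustrative, not the scripts' actual error handling):

```python
import time

def with_backoff(call, max_retries=4, base_delay=2.0):
    """Retry `call` with exponential backoff on failure.

    `call` is any zero-argument function that raises on error; for
    brevity this sketch treats every exception as retryable.
    """
    for attempt in range(max_retries):
        try:
            return call()
        except Exception:
            if attempt == max_retries - 1:
                raise                                # out of retries
            time.sleep(base_delay * (2 ** attempt))  # 2s, 4s, 8s, ...
```

Wrapping each per-section API call this way keeps a single rate-limit hiccup from failing an entire improvement run.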

Monitoring & Quality Assurance

Metrics to Track

Feature #1:

  • Sections improved per week
  • Average quality improvement (word count, citations added)
  • User satisfaction (manual review scores)
  • Time saved vs. full regeneration

Feature #2:

  • Corrections performed per week
  • Average sections affected per correction
  • Accuracy (manual review of corrections)
  • Time saved vs. manual editing

Quality Checks

Pre-Deployment:

  • Test both features on 5 real memos
  • Manual review of outputs
  • Verify citations preserved
  • Check formatting maintained

Post-Deployment:

  • Monitor error rates
  • Collect user feedback
  • Review edge cases
  • Iterate on prompts

Future Enhancements

Feature #1 Extensions

Batch Improvements:

# Improve multiple sections at once
python improve-section.py "Avalanche" --sections "Team,Market Context,Technology"

Comparative Mode:

# Compare section across versions
python improve-section.py "Avalanche" --compare v0.0.1 v0.0.2 --section "Team"

Auto-Improve:

# Automatically improve sections scoring < 7/10
python improve-section.py "Avalanche" --auto-improve --threshold 7

Feature #2 Extensions

Multiple Corrections:

# Apply multiple corrections at once
python rewrite-key-info.py "Avalanche" \
  --corrections corrections.json
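The corrections file format is not yet specified; one plausible shape, sketched with hypothetical field names and placeholder values:

```python
import json

# Hypothetical corrections.json entries -- field names and values are
# illustrative placeholders, not a finalized schema.
corrections = [
    {"find": "$50M Series B", "replace": "$42M Series B",
     "reason": "corrected per latest filing"},
    {"find": "150 employees", "replace": "175 employees",
     "reason": "updated headcount"},
]

payload = json.dumps(corrections, indent=2)
```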

Validation Mode:

# Validate consistency across sections
python rewrite-key-info.py "Avalanche" --validate
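A validation pass could work by extracting each fact type across sections and flagging disagreements. A minimal sketch for one fact type (the regex and function names are illustrative; a real --validate mode would cover many fact types):

```python
import re

# Find every dollar amount that follows "raised" in each section and
# flag the memo when sections disagree.
FUNDING = re.compile(r"raised\s+(\$[\d.]+[MB])")

def funding_claims(sections):
    """sections: {filename: text} -> {filename: [amounts found]}."""
    return {name: FUNDING.findall(text) for name, text in sections.items()}

def is_consistent(claims):
    amounts = {a for found in claims.values() for a in found}
    return len(amounts) <= 1  # more than one distinct value = mismatch
```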

Rollback Support:

# Undo last correction
python rewrite-key-info.py "Avalanche" --rollback

Related Documents

  • Multi-Agent-Orchestration-for-Investment-Memo-Generation.md - Main architecture
  • changelog/2025-11-20_01.md - Section-by-section processing refactor
  • CLAUDE.md - Developer guide
  • README.md - User guide

Changelog

2025-11-20: Document created with comprehensive implementation plan for both features