# Legacy Code & Brownfield Onboarding

**The fundamental difference:** Greenfield = design → implement. Brownfield = **observe → infer → validate → modify.**

### The Onboarding Pipeline (All 4 Models Agree)

#### Phase 1: Structural Analysis (Deterministic)

- Dependency graph mapping
- Module identification, LOC per component
- Test coverage analysis, entry point discovery
- Database schema mapping

#### Phase 2: Convention Extraction (LLM-Assisted)

- Sample representative files across modules
- Identify: error handling patterns, naming conventions, API structure, DB access patterns, testing patterns
- Output: a **conventions document** that becomes critical reference context

#### Phase 3: Pattern Mining

- Extract implicit "tribal knowledge": workarounds for browser bugs, special customer cases, performance hacks that look like mistakes
- Generate decision records into project state

### The Cardinal Rules

| Rule | Why |
|------|-----|
| **Observe first, edit later** | Agents must never modify code they don't understand |
| **Preserve local consistency over global ideals** | Resist the "Junior Refactor": don't "fix" legacy code to modern standards |
| **Add characterization tests before modifying** | Tests that document *current behavior*, not *correct behavior* |
| **Minimal, surgical modifications** | Refactoring is a separate task requiring explicit human approval |

### The Biggest Pitfall

The agent will try to refactor legacy code to match its sense of good patterns. Left unchecked, this produces massive diffs that change behavior in subtle ways. **Enforce strict rules:** modifications to legacy code should be minimal and surgical.

---
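The "characterization tests before modifying" rule can be made concrete with a small sketch. Everything named here is hypothetical and for illustration only: `legacy_format_price` stands in for a piece of legacy code with a quirk, `tidied_format_price` stands in for a well-meaning "Junior Refactor", and `record_snapshot` / `find_drift` are made-up helper names, not part of any library. The idea is to record what the code does today, quirks and exceptions included, and then flag any change that alters it.

```python
from typing import Any, Callable, List, Tuple

Outcome = Tuple[str, Any]          # ("returned", value) or ("raised", exception name)
Snapshot = List[Tuple[Any, Outcome]]


def legacy_format_price(cents):
    """Hypothetical legacy function: note the missing zero padding."""
    if cents < 0:
        return "$0.00"
    return "$" + str(cents // 100) + "." + str(cents % 100)


def tidied_format_price(cents):
    """Hypothetical 'cleanup' that silently changes behavior."""
    return f"${max(cents, 0) / 100:.2f}"


def observe(func: Callable[[Any], Any], arg: Any) -> Outcome:
    """Capture what func does for arg, including which exception it raises."""
    try:
        return ("returned", func(arg))
    except Exception as exc:
        return ("raised", type(exc).__name__)


def record_snapshot(func: Callable[[Any], Any], inputs: List[Any]) -> Snapshot:
    """Document current behavior, without judging whether it is correct."""
    return [(arg, observe(func, arg)) for arg in inputs]


def find_drift(func: Callable[[Any], Any], snapshot: Snapshot):
    """Return (input, expected, actual) for every input whose behavior changed."""
    drift = []
    for arg, expected in snapshot:
        actual = observe(func, arg)
        if actual != expected:
            drift.append((arg, expected, actual))
    return drift


# Record the baseline before touching the code.
baseline = record_snapshot(legacy_format_price, [1999, 105, -5, "abc"])

# The untouched code matches its own baseline; the tidied version does not.
unchanged = find_drift(legacy_format_price, baseline)
changed = find_drift(tidied_format_price, baseline)
```

In this sketch the tidied version differs from the baseline only for the input `105`, where the legacy code omits zero padding. Whether that quirk is a bug or something downstream code depends on is exactly the question a human should answer before the change is approved.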