singularity-forge/docs/building-coding-agents/16-encoding-taste-aesthetics.md
Lex Christopherson 9f4bf8c452 fix: restore PR files lost during merge conflict resolution
Files added by PR #2008 that were not in main were dropped during
the merge. Restore all src/, docs/, and scripts/ files from the
pre-merge PR head.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-25 22:39:33 -06:00


# Encoding Taste & Aesthetics

**The honest frontier:** This is where all four models are most candid about current limitations.

## What CAN Be Automated

| Technique | Description |
| --- | --- |
| Reference-based extraction | "Feels like Linear" → extract concrete attributes: spacing ratios, animation timing curves, color relationships, typography |
| Style specification | Convert extracted attributes to verifiable parameters: "transitions 150-200ms ease-out, 8px grid spacing, specific contrast ratios" |
| Automated verification | Lighthouse scores, visual regression tests, accessibility audits, performance budgets, design system linting |
| Visual comparison | Render output, compare against reference screenshots using vision-capable models |
| A/B comparison | Show two versions, human picks which "feels better" — faster than absolute judgment |
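The "style specification" and "automated verification" rows can be combined into a minimal sketch: express the extracted attributes as machine-checkable parameters, then lint output against them. All names and thresholds below (`SPEC`, `check_transition`, the 4.5 contrast floor) are illustrative assumptions, not values prescribed by the text.

```python
# A design "taste" spec expressed as verifiable parameters (illustrative values).
SPEC = {
    "transition_ms": (150, 200),  # allowed transition duration range, ease-out assumed
    "grid_px": 8,                 # spacing must sit on this grid
    "min_contrast": 4.5,          # WCAG AA contrast ratio for body text
}

def check_transition(duration_ms: int) -> bool:
    lo, hi = SPEC["transition_ms"]
    return lo <= duration_ms <= hi

def check_spacing(px: int) -> bool:
    return px % SPEC["grid_px"] == 0

def check_contrast(ratio: float) -> bool:
    return ratio >= SPEC["min_contrast"]

# Lint a rendered component's measured values against the spec.
violations = [
    name
    for name, ok in [
        ("transition", check_transition(180)),
        ("spacing", check_spacing(12)),   # 12px is off the 8px grid
        ("contrast", check_contrast(4.8)),
    ]
    if not ok
]
```

Everything in `SPEC` can run in CI; only what cannot be written this way needs to reach a human.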

## What CANNOT Be Automated

The gestalt — the overall feeling, emotional response, sense of quality emerging from a thousand small interacting decisions. Does this feel premium? Fast? Trustworthy? These are fundamentally subjective.

## The Optimal Strategy

Narrow the gap by converting as much "taste" as possible into concrete, verifiable specifications upfront:

- Not "use nice spacing" → "16px between sections, 8px between related elements, 4px between tightly coupled elements"
- Exact animation timing curves, color values with contrast ratios, typography weights and sizes
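The spacing bullet above can be captured as a lookup table so the agent never improvises a value. The relationship names (`sections`, `related`, `tight`) are illustrative labels, not from the text.

```python
# "Use nice spacing" converted into a concrete, enforceable rule table.
SPACING_PX = {
    "sections": 16,  # between sections
    "related": 8,    # between related elements
    "tight": 4,      # between tightly coupled elements
}

def gap(relationship: str) -> int:
    """Return the agreed spacing, failing loudly instead of guessing."""
    if relationship not in SPACING_PX:
        raise ValueError(f"no spacing rule for {relationship!r}")
    return SPACING_PX[relationship]
```

Failing loudly on unknown relationships is the point: any gap in the spec surfaces as an error rather than a silent aesthetic judgment call.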

Then reserve human review for the remaining subjective layer with structured, specific questions:

> "Does the density feel right? Does the transition timing feel snappy enough? Does the empty state feel intentional or broken?"
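Structured review works best when the answers are recorded as data rather than vibes, so sessions are comparable across iterations. A minimal sketch, with `ReviewSession` and its methods as hypothetical names:

```python
from dataclasses import dataclass, field

# The structured questions reserved for human review (from the text above).
QUESTIONS = (
    "Does the density feel right?",
    "Does the transition timing feel snappy enough?",
    "Does the empty state feel intentional or broken?",
)

@dataclass
class ReviewSession:
    answers: dict = field(default_factory=dict)

    def record(self, question: str, passed: bool, note: str = "") -> None:
        self.answers[question] = {"passed": passed, "note": note}

    def failures(self) -> list:
        return [q for q, a in self.answers.items() if not a["passed"]]

session = ReviewSession()
session.record(QUESTIONS[0], True)
session.record(QUESTIONS[1], False, "easing feels sluggish on mobile")
session.record(QUESTIONS[2], True)
```

The failing answers, with their notes, become the next iteration's concrete work items.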

## The Emerging Frontier

Vision-capable models for aesthetic evaluation — render output, capture screenshot, compare against references on specific visual dimensions. Imperfect but improving rapidly. Grok reports ~80-85% of taste can be automated this way; the remaining 15-20% stays human-only.
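The render-compare-gate loop can be sketched without a real vision model by treating screenshots as grayscale pixel grids and gating on a diff threshold. In practice the comparison step would call a vision-capable model on specific visual dimensions; the pixel diff below is a pure-Python stand-in, and all names and thresholds are assumptions.

```python
def pixel_diff_ratio(rendered, reference, tolerance=8):
    """Fraction of pixels whose grayscale values differ by more than tolerance.

    Both inputs are 2-D lists of 0-255 grayscale values of the same shape.
    """
    flat_a = [p for row in rendered for p in row]
    flat_b = [p for row in reference for p in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("screenshots must be the same size")
    differing = sum(1 for a, b in zip(flat_a, flat_b) if abs(a - b) > tolerance)
    return differing / len(flat_a)

def passes_visual_check(rendered, reference, max_diff=0.02):
    """Gate: accept the render only if it stays close to the reference."""
    return pixel_diff_ratio(rendered, reference) <= max_diff
```

A per-pixel tolerance absorbs anti-aliasing noise, while `max_diff` bounds how far the render may drift from the reference before a human (or a stronger model) has to look.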