singularity-forge/docs/dev/building-coding-agents/16-encoding-taste-aesthetics.md

Encoding Taste & Aesthetics

The honest frontier: This is where all four models are most candid about current limitations.

What CAN Be Automated

| Technique | Description |
| --- | --- |
| Reference-based extraction | "Feels like Linear" → extract concrete attributes: spacing ratios, animation timing curves, color relationships, typography |
| Style specification | Convert extracted attributes to verifiable parameters: "transitions 150-200ms ease-out, 8px grid spacing, specific contrast ratios" |
| Automated verification | Lighthouse scores, visual regression tests, accessibility audits, performance budgets, design system linting |
| Visual comparison | Render output, compare against reference screenshots using vision-capable models |
| A/B comparison | Show two versions, human picks which "feels better"; faster than absolute judgment |
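The first two rows can be sketched concretely. Below is a minimal, hypothetical `StyleSpec` that turns "feels like Linear" into checkable numbers; the field names, defaults, and check functions are illustrative assumptions, not a real extraction pipeline:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class StyleSpec:
    """Concrete attributes extracted from a reference design."""
    grid_px: int = 8                    # base spacing grid
    transition_ms: tuple = (150, 200)   # allowed transition duration range
    easing: str = "ease-out"
    min_contrast_ratio: float = 4.5     # WCAG AA for body text

def check_transition(spec: StyleSpec, duration_ms: int, easing: str) -> list[str]:
    """Return a list of violations for one transition rule."""
    problems = []
    lo, hi = spec.transition_ms
    if not lo <= duration_ms <= hi:
        problems.append(f"duration {duration_ms}ms outside {lo}-{hi}ms")
    if easing != spec.easing:
        problems.append(f"easing {easing!r} != {spec.easing!r}")
    return problems

def check_spacing(spec: StyleSpec, value_px: int) -> list[str]:
    """Spacing must sit on the base grid."""
    if value_px % spec.grid_px != 0:
        return [f"spacing {value_px}px is off the {spec.grid_px}px grid"]
    return []
```

Once taste lives in a spec object like this, every downstream check is a mechanical comparison rather than a judgment call.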

What CANNOT Be Automated

The gestalt — the overall feeling, emotional response, sense of quality emerging from a thousand small interacting decisions. Does this feel premium? Fast? Trustworthy? These are fundamentally subjective.

The Optimal Strategy

Narrow the gap by converting as much "taste" as possible into concrete, verifiable specifications upfront:

  • Not "use nice spacing" → "16px between sections, 8px between related elements, 4px between tightly coupled elements"
  • Exact animation timing curves, color values with contrast ratios, typography weights and sizes
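As a hedged sketch of what those bullets look like as tooling: the token names and the regex-over-CSS approach below are illustrative assumptions, but they show how "nice spacing" becomes a lintable rule:

```python
import re

# Hypothetical spacing tokens: taste ("use nice spacing") converted
# into exact, approved pixel values per relationship.
SPACING_TOKENS = {
    "section": 16,   # px between sections
    "related": 8,    # px between related elements
    "coupled": 4,    # px between tightly coupled elements
}

def lint_spacing(css: str) -> list[str]:
    """Flag margin values that don't match any approved spacing token."""
    allowed = set(SPACING_TOKENS.values())
    violations = []
    for match in re.finditer(r"margin(?:-\w+)?:\s*(\d+)px", css):
        px = int(match.group(1))
        if px not in allowed:
            violations.append(f"{match.group(0)!r}: {px}px is not an approved token")
    return violations
```

A real setup would lint design tokens or a style-dictionary export rather than raw CSS, but the principle is identical: the spec is the source of truth, and deviations are caught mechanically.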

Then reserve human review for the remaining subjective layer with structured, specific questions:

"Does the density feel right? Does the transition timing feel snappy enough? Does the empty state feel intentional or broken?"

The Emerging Frontier

Vision-capable models for aesthetic evaluation: render output, capture a screenshot, compare against references on specific visual dimensions. Imperfect but improving rapidly. Grok reports that roughly 80-85% of taste can be automated this way; the remaining 15-20% stays human-only.
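A sketch of the scoring side of that loop, assuming some multimodal API returns per-dimension JSON (the prompt wording, dimension names, and threshold are illustrative assumptions; the call to any specific vision API is deliberately omitted):

```python
# Score specific visual dimensions rather than asking "is this pretty?",
# so results are comparable across runs and models.
DIMENSIONS = ["spacing consistency", "visual hierarchy", "color harmony"]

def build_comparison_prompt(dimensions: list[str]) -> str:
    """Prompt for per-dimension 1-5 scores in a machine-readable shape."""
    return (
        "Compare the candidate screenshot against the reference. "
        "For each dimension, score the candidate 1-5 and explain briefly. "
        'Respond as JSON: {dimension: {"score": int, "note": str}}. '
        f"Dimensions: {', '.join(dimensions)}"
    )

def flag_regressions(scores: dict, threshold: int = 3) -> list[str]:
    """Dimensions scoring below threshold go back to a human reviewer."""
    return [d for d, r in scores.items() if r["score"] < threshold]
```

This keeps the model's judgment narrow and auditable: the automated layer filters, and only the low-scoring dimensions consume human review time.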