
# Encoding Taste & Aesthetics

**The honest frontier:** This is where all four models are most candid about current limitations.

### What CAN Be Automated

| Technique | Description |
|-----------|-------------|
| **Reference-based extraction** | "Feels like Linear" → extract concrete attributes: spacing ratios, animation timing curves, color relationships, typography |
| **Style specification** | Convert extracted attributes to verifiable parameters: "transitions 150-200ms ease-out, 8px grid spacing, specific contrast ratios" |
| **Automated verification** | Lighthouse scores, visual regression tests, accessibility audits, performance budgets, design system linting |
| **Visual comparison** | Render output, compare against reference screenshots using vision-capable models |
| **A/B comparison** | Show two versions, human picks which "feels better" — faster than absolute judgment |
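
As a rough illustration of the "style specification" and "automated verification" rows, the sketch below encodes the example parameters from the table (150-200ms ease-out transitions, 8px grid) and checks measured values against them. The names `StyleSpec` and `check_style`, and the input shapes, are hypothetical and chosen only for this example; a real pipeline would extract the measured values from rendered output or a design-token file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StyleSpec:
    """Verifiable parameters; defaults mirror the example in the table."""
    min_transition_ms: int = 150
    max_transition_ms: int = 200
    easing: str = "ease-out"
    grid_px: int = 8


def check_style(spec, transitions, spacings_px):
    """Return a list of human-readable violations (empty means compliant).

    transitions: iterable of (name, duration_ms, easing) tuples (assumed shape)
    spacings_px: dict mapping a spacing name to its size in pixels (assumed shape)
    """
    violations = []
    for name, duration_ms, easing in transitions:
        if not spec.min_transition_ms <= duration_ms <= spec.max_transition_ms:
            violations.append(
                f"{name}: duration {duration_ms}ms outside "
                f"{spec.min_transition_ms}-{spec.max_transition_ms}ms"
            )
        if easing != spec.easing:
            violations.append(f"{name}: easing '{easing}' is not '{spec.easing}'")
    for name, px in spacings_px.items():
        if px % spec.grid_px != 0:
            violations.append(f"{name}: {px}px is off the {spec.grid_px}px grid")
    return violations


if __name__ == "__main__":
    problems = check_style(
        StyleSpec(),
        transitions=[("modal", 180, "ease-out"), ("tooltip", 400, "linear")],
        spacings_px={"card-gap": 16, "icon-gap": 6},
    )
    for problem in problems:
        print(problem)
```

A check like this only covers what the spec names explicitly; anything not written down as a parameter still falls to visual comparison or human review.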

### What CANNOT Be Automated

The **gestalt** — the overall feeling, emotional response, sense of quality emerging from a thousand small interacting decisions. *Does this feel premium? Fast? Trustworthy?* These are fundamentally subjective.

### The Optimal Strategy

**Narrow the gap** by converting as much "taste" as possible into **concrete, verifiable specifications upfront:**

- Not "use nice spacing" → "16px between sections, 8px between related elements, 4px between tightly coupled elements"
- Likewise, specify exact animation timing curves, color values with contrast ratios, and typography weights and sizes
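
One of these specifications, "color values with contrast ratios", can be checked mechanically. The following is a minimal sketch using the WCAG 2.x relative-luminance and contrast-ratio formulas; the function names are illustrative, and the 4.5:1 and 3:1 thresholds are the WCAG AA values for normal and large text, which the original text does not name and are assumed here as a plausible target.

```python
def _linearize(channel_8bit):
    """Convert an 8-bit sRGB channel to linear light."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    """Relative luminance of a '#rrggbb' color, per the WCAG definition."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(fg, bg):
    """Contrast ratio between two colors, from 1.0 (identical) up to 21.0."""
    lighter, darker = sorted(
        (relative_luminance(fg), relative_luminance(bg)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


def meets_aa(fg, bg, large_text=False):
    """Assumed target: WCAG AA (4.5:1 normal text, 3:1 large text)."""
    return contrast_ratio(fg, bg) >= (3.0 if large_text else 4.5)


if __name__ == "__main__":
    print(round(contrast_ratio("#000000", "#ffffff"), 2))
    print(meets_aa("#767676", "#ffffff"))
```

Writing the spec as "text color X on background Y must reach ratio Z" turns a taste statement ("readable, not washed out") into a pass/fail check.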

Then **reserve human review for the remaining subjective layer** with structured, specific questions:

> "Does the density feel right? Does the transition timing feel snappy enough? Does the empty state feel intentional or broken?"

### The Emerging Frontier

Vision-capable models can support aesthetic evaluation: render the output, capture a screenshot, and compare it against references on specific visual dimensions. The approach is imperfect but improving rapidly. Grok reports that ~80-85% of taste can be automated this way; the remaining 15-20% stays human-only.
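
One way to combine this with the human-review layer is to score each visual dimension separately and send only the weak ones to a person. The sketch below assumes a vision-capable model has already returned a similarity score between 0 and 1 per dimension; that model call is not shown, and the `route_review` name, the score scale, and the 0.8 threshold are all hypothetical choices for illustration.

```python
def route_review(scores, threshold=0.8):
    """Split visual dimensions into auto-accepted and human-review lists.

    scores: dict mapping a dimension name to a similarity score in [0, 1],
            assumed to come from a vision-model comparison against a reference.
    threshold: assumed cutoff; dimensions scoring below it go to a human.
    """
    auto_accepted, needs_human = [], []
    for dimension, score in sorted(scores.items()):
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"{dimension}: score {score} is outside [0, 1]")
        (auto_accepted if score >= threshold else needs_human).append(dimension)
    return auto_accepted, needs_human


if __name__ == "__main__":
    auto, human = route_review({"spacing": 0.93, "color": 0.88, "density": 0.61})
    print("auto-accepted:", auto)
    print("needs human review:", human)
```

The dimensions routed to a human map naturally onto the structured questions above, for example asking "Does the density feel right?" only when the density score is low.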

---