Design Systems at Scale

A design system isn’t a component library. It’s a shared language between design and engineering that scales your team’s output without scaling headcount.

We’ve built design systems for products ranging from early-stage MVPs to platforms serving thousands of users. The patterns that work at scale are not the ones most teams start with. Here’s what we’ve learned from shipping real products.

Start Small

The biggest mistake teams make is building a design system before they need one. You end up with a library of components nobody uses because they were designed in a vacuum.

We’ve seen teams spend three months building a “comprehensive” component library with 40+ components, complete with Storybook documentation and Figma integration, only to scrap half of it when they actually started building features. The components didn’t fit the real use cases because they were designed from theory, not practice.

Instead, extract patterns from your existing product. If three screens use the same card layout, that’s a component. If five buttons share the same padding and border radius, that’s a token. This approach — extraction over invention — means every component in your system has at least one proven use case from day one.
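
As a rough illustration of extraction over invention, the sketch below counts how many screens share the same style declaration and reports the ones that repeat. The `ScreenStyles` shape, the `findTokenCandidates` helper, and the three-screen threshold are assumptions made for this example, not part of any real tool.

```typescript
// Hypothetical input: each screen's style declarations as "property: value" strings.
type ScreenStyles = Record<string, string[]>;

// Returns the declarations that appear on at least `minScreens` distinct screens.
function findTokenCandidates(screens: ScreenStyles, minScreens = 3): string[] {
  const screenCount: Record<string, number> = {};

  for (const screen of Object.keys(screens)) {
    // De-duplicate within a screen so each screen counts once per declaration.
    const unique = screens[screen].filter((d, i, all) => all.indexOf(d) === i);
    for (const declaration of unique) {
      screenCount[declaration] = (screenCount[declaration] ?? 0) + 1;
    }
  }

  return Object.keys(screenCount)
    .filter((declaration) => screenCount[declaration] >= minScreens)
    .sort();
}

// Illustrative data only; the screen names and values are made up.
const candidates = findTokenCandidates({
  dashboard: ['padding: 1.5rem', 'border-radius: 0.5rem', 'color: #fafafa'],
  settings: ['padding: 1.5rem', 'border-radius: 0.5rem'],
  profile: ['padding: 1.5rem', 'border-radius: 0.5rem', 'margin: 2rem'],
});
```

Here `candidates` contains only the padding and border-radius declarations, because those are the two values shared by all three screens. A real audit would more likely parse stylesheets than hand-written strings, but the counting step would look much the same.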

When we built the design system for MindHyv, we started with the product. We built the first five screens with raw CSS, intentionally duplicating styles. After those screens shipped, we looked at what repeated and extracted those patterns into shared tokens and components. The resulting system was lean, practical, and every piece had a reason to exist.

Tokens Over Components

Design tokens are the foundation. Colors, spacing, typography, shadows — these are the atomic values everything else is built from. Get these right, and your components almost design themselves.

:root {
  /* Spacing scale — 4px base unit */
  --spacing-1: 0.25rem;
  --spacing-2: 0.5rem;
  --spacing-3: 0.75rem;
  --spacing-4: 1rem;
  --spacing-6: 1.5rem;
  --spacing-8: 2rem;
  --spacing-12: 3rem;
  --spacing-16: 4rem;

  /* Color primitives */
  --gray-50: #fafafa;
  --gray-100: #f5f5f5;
  --gray-200: #e5e5e5;
  --gray-700: #404040;
  --gray-800: #262626;
  --gray-900: #171717;
  --gray-950: #0a0a0a;

  /* Semantic color tokens */
  --color-surface: var(--gray-950);
  --color-surface-raised: var(--gray-900);
  --color-text-primary: var(--gray-50);
  --color-text-secondary: var(--gray-200);
  --color-border: var(--gray-800);
  --color-accent: #3b82f6;
  --color-accent-hover: #2563eb;
  --color-error: #ef4444;
  --color-success: #22c55e;

  /* Typography scale */
  --text-xs: 0.75rem;
  --text-sm: 0.875rem;
  --text-base: 1rem;
  --text-lg: 1.125rem;
  --text-xl: 1.25rem;
  --text-2xl: 1.5rem;
  --text-3xl: 1.875rem;
  --text-4xl: 2.25rem;

  /* Radii */
  --radius-sm: 0.25rem;
  --radius-md: 0.5rem;
  --radius-lg: 0.75rem;
  --radius-full: 9999px;
}

Notice the two-layer token architecture: primitive tokens (like --gray-900) define raw values, and semantic tokens (like --color-surface) assign meaning. This separation is critical. When you want to support a light theme, you change the semantic tokens to point at different primitives. The components never change.

[data-theme="light"] {
  --color-surface: var(--gray-50);
  --color-surface-raised: white;
  --color-text-primary: var(--gray-900);
  --color-text-secondary: var(--gray-700);
  --color-border: var(--gray-200);
}

Every component references semantic tokens, never primitives directly. This discipline pays off the first time someone asks for dark mode, a high-contrast theme, or brand-specific theming for a white-label product.
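
The same two-layer idea can be modeled in TypeScript, which may help when tokens are also consumed outside CSS. This is a minimal sketch: the `Theme` type and the `resolveToken` helper are hypothetical names, and only a few of the tokens listed above are included.

```typescript
// Primitive layer: raw values, copied from the CSS above.
const primitives = {
  'gray-50': '#fafafa',
  'gray-200': '#e5e5e5',
  'gray-800': '#262626',
  'gray-900': '#171717',
  'gray-950': '#0a0a0a',
} as const;

type Primitive = keyof typeof primitives;
type SemanticToken = 'color-surface' | 'color-text-primary' | 'color-border';

// Semantic layer: a theme only maps meanings to primitives, never to raw values.
type Theme = Record<SemanticToken, Primitive>;

const darkTheme: Theme = {
  'color-surface': 'gray-950',
  'color-text-primary': 'gray-50',
  'color-border': 'gray-800',
};

const lightTheme: Theme = {
  'color-surface': 'gray-50',
  'color-text-primary': 'gray-900',
  'color-border': 'gray-200',
};

// Components ask for a meaning; the active theme decides the value.
function resolveToken(theme: Theme, token: SemanticToken): string {
  return primitives[theme[token]];
}
```

Because `Theme` is typed as a complete record, adding a new semantic token forces every theme to define it, and a theme cannot point at a primitive that does not exist.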

The Component API Contract

Once your tokens are solid, components become compositions of those tokens with defined APIs. The key decision here is how much flexibility to expose.

We follow a principle we call constrained flexibility: components accept a small, well-defined set of props that map to design-approved variations. No arbitrary style overrides. No className prop that lets consumers break the visual contract.

import type { CSSProperties, ReactNode } from 'react';

// Good: constrained API
interface ButtonProps {
  variant: 'primary' | 'secondary' | 'ghost' | 'danger';
  size: 'sm' | 'md' | 'lg';
  disabled?: boolean;
  loading?: boolean;
  children: ReactNode;
}

// Bad: open API that guarantees visual inconsistency
// (named differently here so the two interfaces do not merge in one file)
interface OpenButtonProps {
  className?: string;
  style?: CSSProperties;
  color?: string;
  fontSize?: string;
  children: ReactNode;
}

The constrained API means a Button always looks like a Button across your entire product. The open API means every developer on the team will create their own slightly different button, and within six months you’ll have 14 button variants that all look almost-but-not-quite the same.

For LancerSpace, we defined exactly four button variants and three sizes. Every button in the application uses one of those twelve combinations. When the design team wanted to refresh the button styles eight months later, we changed the token values in one place and every button in the product updated simultaneously. The entire visual refresh took an afternoon instead of a week.
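
To make the arithmetic concrete, here is a small sketch that derives the full set of allowed buttons from the variant and size lists. The `buttonClassName` helper and its class naming scheme are assumptions for illustration and say nothing about how LancerSpace implements its buttons.

```typescript
// The design-approved options, taken from the constrained ButtonProps above.
const VARIANTS = ['primary', 'secondary', 'ghost', 'danger'] as const;
const SIZES = ['sm', 'md', 'lg'] as const;

type Variant = (typeof VARIANTS)[number];
type Size = (typeof SIZES)[number];

// Hypothetical mapping from props to class names. Consumers cannot pass
// anything outside the two unions, so no other class can be produced.
function buttonClassName(variant: Variant, size: Size): string {
  return `button button-${variant} button-${size}`;
}

// Enumerate every combination the API permits.
const allCombinations: string[] = [];
for (const variant of VARIANTS) {
  for (const size of SIZES) {
    allCombinations.push(buttonClassName(variant, size));
  }
}
```

`allCombinations` has twelve entries, four variants times three sizes. Since the list is finite and known up front, each entry can get its own story and its own visual regression baseline.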

Naming Conventions That Scale

Naming is the hardest part of any design system, and getting it wrong creates friction that compounds over time. We use a structured naming convention:

[category]-[element]-[variant]-[state]

For CSS custom properties:

--color-button-primary-hover: var(--color-accent-hover);
--color-input-border-focus: var(--color-accent);
--spacing-card-padding: var(--spacing-6);

For component files:

Button.tsx          /* Base component */
ButtonGroup.tsx     /* Composition */
button.module.css   /* Styles */
button.test.tsx     /* Tests */
button.stories.tsx  /* Documentation */

Consistent naming removes an entire category of decisions from your daily work. You never have to debate whether it’s btn or button, colour or color, CardWrapper or CardContainer. Decide once, document it, enforce it with linting.
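
Enforcing the convention with linting can start as a simple name check. The sketch below assumes a fixed category list and allows one to three lowercase segments after the category, so that shorter names such as --spacing-card-padding still pass. Both `CATEGORIES` and `isValidTokenName` are hypothetical, and the category list would need to match your own tokens.

```typescript
// Assumed category list; extend it to match the categories your system defines.
const CATEGORIES = ['color', 'spacing', 'text', 'radius'];

// "--" + category + one to three lowercase, hyphen-separated segments.
const TOKEN_NAME = new RegExp(`^--(${CATEGORIES.join('|')})(-[a-z0-9]+){1,3}$`);

function isValidTokenName(name: string): boolean {
  return TOKEN_NAME.test(name);
}

// Returns the names that break the convention, ready to report from a lint script.
function findInvalidTokenNames(names: string[]): string[] {
  return names.filter((name) => !isValidTokenName(name));
}
```

A check like this could run in CI against the custom property names extracted from the token stylesheet, failing the build when `findInvalidTokenNames` returns anything.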

Automated Visual Regression Testing

A design system that drifts from the product is worse than no design system at all. We’ve seen it happen: the Figma file shows one thing, the Storybook shows another, and the production app shows a third. Within six months, the design system becomes a fiction that everyone ignores.

Visual regression testing catches drift before it reaches production. We use Playwright for screenshot comparison tests that run on every pull request.

// visual-regression/button.spec.ts
import { test, expect } from '@playwright/test';

test('primary button renders correctly', async ({ page }) => {
  await page.goto('/storybook/button--primary');
  await expect(page.locator('.button-primary')).toHaveScreenshot(
    'button-primary.png',
    { maxDiffPixelRatio: 0.01 }
  );
});

test('primary button hover state', async ({ page }) => {
  await page.goto('/storybook/button--primary');
  await page.locator('.button-primary').hover();
  await expect(page.locator('.button-primary')).toHaveScreenshot(
    'button-primary-hover.png',
    { maxDiffPixelRatio: 0.01 }
  );
});

When a pull request changes a button’s padding by 2 pixels — whether intentionally or accidentally — the CI pipeline flags it with a visual diff. Intentional changes get approved and the baseline snapshot updates. Accidental changes get caught before they merge.

We run these tests across three viewport widths (375px, 768px, 1440px) to catch responsive regressions. The test suite adds about 90 seconds to our CI pipeline, and it’s saved us from shipping visual bugs more times than we can count.
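
One way to run the same specs at several widths is to define a Playwright project per viewport. The configuration below is a sketch, not the setup described above: the project names, the `testDir` path, and the viewport heights are assumptions. Only the three widths come from the text.

```typescript
// playwright.config.ts
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // Assumed location of the visual regression specs.
  testDir: './visual-regression',
  projects: [
    // Heights are illustrative; only the widths are taken from the article.
    { name: 'mobile', use: { viewport: { width: 375, height: 667 } } },
    { name: 'tablet', use: { viewport: { width: 768, height: 1024 } } },
    { name: 'desktop', use: { viewport: { width: 1440, height: 900 } } },
  ],
});
```

With projects set up this way, every spec runs once per viewport. With Playwright's default snapshot path the project name should be part of each baseline's file name, so each width keeps its own screenshot.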

Documentation as a Product

The best design system in the world is useless if nobody knows how to use it. Documentation needs to be treated as a product, not an afterthought.

What works for us:

  • Live code examples that developers can copy directly. No pseudocode, no abbreviated snippets. Show the full import, the full usage, the full prop interface.
  • Do/Don’t comparisons showing correct and incorrect usage side by side. Developers learn faster from examples of what not to do.
  • Migration guides when components change. If you rename a prop from type to variant, document exactly what to find-and-replace and in what order.
  • Decision records explaining why a component works the way it does. Six months from now, someone will want to add an outline button variant. If the decision record explains why you chose ghost instead and what problem it solved, that conversation is five minutes instead of an hour.
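
A migration guide can also ship a small script alongside the find-and-replace instructions. The sketch below renames the `type` prop to `variant` on `Button` elements using a regular expression. The `renameTypeProp` helper is hypothetical, and the pattern is deliberately naive: it will miss the prop if an earlier attribute value contains a `>` character, so an AST-based codemod is the safer choice for a large codebase.

```typescript
// Matches an opening <Button ...> tag up to a whitespace-preceded `type=` attribute.
// The \b after "Button" keeps <ButtonGroup> out, and requiring whitespace before
// `type=` keeps attributes such as data-type untouched.
const BUTTON_TYPE_PROP = /(<Button\b[^>]*?\s)type=/g;

function renameTypeProp(source: string): string {
  return source.replace(BUTTON_TYPE_PROP, '$1variant=');
}

const migrated = renameTypeProp('<Button size="md" type="primary">Save</Button>');
```

Here `migrated` is `<Button size="md" variant="primary">Save</Button>`, while native elements such as `<input type="text" />` pass through unchanged.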

For the Trackelio dashboard, we kept component documentation alongside the component source code. Every component directory contains a README.md with usage examples, prop definitions, and design rationale. When a developer opens a component file, the documentation is one click away.

Keep It Honest

Regular audits, automated visual regression tests, and tight feedback loops between design and engineering keep things in sync. We schedule a quarterly design system review where we compare what’s in the system against what’s actually shipping in the product. Components that aren’t used get deprecated. Patterns that appear multiple times in the product but aren’t in the system get extracted.

The metrics we track:

  • Adoption rate — What percentage of UI elements use system components versus custom one-offs?
  • Override frequency — How often are developers overriding system styles? Frequent overrides signal that a component’s API isn’t flexible enough.
  • Time to new pattern — How long does it take from identifying a new pattern to shipping it in the system?
  • Visual consistency score — How many unique variations exist for elements that should be identical (buttons, inputs, cards)?
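
The first two metrics reduce to simple ratios once usage data exists. The sketch below assumes a hypothetical `UiElementUsage` record per rendered element. How those records are collected (static analysis, a lint rule, manual audit) is left open, and the sample data is invented.

```typescript
// Hypothetical audit record for one UI element found in the product.
interface UiElementUsage {
  element: string;          // e.g. "button", "card"
  fromSystem: boolean;      // rendered by a design-system component?
  overridesStyles: boolean; // does the call site override system styles?
}

// Share of all UI elements that come from the design system (0 to 1).
function adoptionRate(usages: UiElementUsage[]): number {
  if (usages.length === 0) return 0;
  return usages.filter((u) => u.fromSystem).length / usages.length;
}

// Share of system component usages that override system styles (0 to 1).
function overrideFrequency(usages: UiElementUsage[]): number {
  const systemUsages = usages.filter((u) => u.fromSystem);
  if (systemUsages.length === 0) return 0;
  return systemUsages.filter((u) => u.overridesStyles).length / systemUsages.length;
}

// Invented sample: three system components (one overridden) and one custom one-off.
const sampleAudit: UiElementUsage[] = [
  { element: 'button', fromSystem: true, overridesStyles: false },
  { element: 'button', fromSystem: true, overridesStyles: true },
  { element: 'card', fromSystem: true, overridesStyles: false },
  { element: 'card', fromSystem: false, overridesStyles: false },
];
```

For this sample, `adoptionRate(sampleAudit)` is 0.75 and `overrideFrequency(sampleAudit)` is one third. Tracking both per quarter gives the review a trend to discuss rather than a single snapshot.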

Scaling Across Products

When you have multiple products that share a design language — like a main application, a marketing site, and an admin dashboard — the design system needs to work across all of them without creating a deployment bottleneck.

We handle this by publishing tokens as a standalone package, separate from components. Tokens change infrequently and are consumed by every product. Components change more often, so they ship as their own shared package, built to be tree-shakeable so each product only bundles what it uses.

@threshline/tokens      → CSS custom properties, shared across everything
@threshline/components  → React components, tree-shakeable
@threshline/icons       → SVG icons as components

Each package is versioned independently. A token update doesn’t force a component update, and a component update doesn’t force every product to redeploy. Teams can adopt new versions on their own schedule while maintaining visual consistency through the shared token layer.

Conclusion

The goal isn’t perfection — it’s consistency. A design system is a living thing that evolves with your product. The teams that succeed with design systems treat them as infrastructure: invest early in the foundation (tokens), be disciplined about component APIs, automate quality checks, and document everything as if your future self is a new hire who’s never seen the codebase.

Start by extracting patterns from what you’ve already built. Define your tokens before you define your components. Constrain your component APIs to enforce consistency. Test visually on every PR. And audit regularly to keep the system honest.

The payoff compounds over time. The first month, a design system feels like overhead. By month six, new features ship faster because the building blocks already exist. By month twelve, a visual refresh that would have taken weeks takes days. That’s the ROI of doing it right.