How It Works

SuperImg turns your HTML/CSS into MP4 video. Here's exactly what happens under the hood.

The Pipeline

┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│  Template   │     │  Playwright │     │   FFmpeg    │
│  (HTML/CSS) │ ──▶ │  screenshots│ ──▶ │  encodes    │
│             │     │  30x/second │     │  to MP4     │
└─────────────┘     └─────────────┘     └─────────────┘
    You write         Browser            Final video

That's the entire system. No magic.

Step 1: You Write a Template

A template is a TypeScript file that default-exports a scene created with defineScene. The scene's render function receives timing information and returns an HTML string.

export default defineScene({
  config: {
    width: 1920,
    height: 1080,
    fps: 30,
    duration: "3s"
  },
  defaults: {
    title: "Hello World"
  },
  render(ctx) {
    const opacity = ctx.std.tween(0, 1, ctx.sceneProgress);
    return `<div style="opacity: ${opacity}">${ctx.data.title}</div>`;
  }
})

The key insight: you're not "animating" anything. You're answering the question: "What should be on screen at this exact moment?"

Step 2: SuperImg Calls Your Function

For a 3-second video at 30fps, SuperImg calls your render function 90 times—once per frame.

Each call receives a ctx object with timing info:

Property	Frame 0	Frame 45	Frame 89
`ctx.frame`	0	45	89
`ctx.sceneTimeSeconds`	0.0	1.5	2.97
`ctx.sceneProgress`	0.0	0.5	0.99

Your function uses these values to calculate positions, colors, and opacities—then returns the HTML for that specific frame.
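The numbers in the table can be reproduced with a few lines of arithmetic. This is a minimal sketch, not SuperImg's source: frameTimings is a hypothetical helper, and the formulas (time = frame / fps, progress = frame / total frames) are inferred from the table values.

```typescript
// Hypothetical helper: derives the per-frame timing values shown in the table.
// The formulas are inferred from the table, not taken from SuperImg's source.
interface FrameTiming {
  frame: number;
  sceneTimeSeconds: number;
  sceneProgress: number;
}

function frameTimings(durationSeconds: number, fps: number): FrameTiming[] {
  const totalFrames = Math.round(durationSeconds * fps); // 3s at 30fps gives 90
  const timings: FrameTiming[] = [];
  for (let frame = 0; frame < totalFrames; frame++) {
    timings.push({
      frame,
      sceneTimeSeconds: frame / fps,      // seconds since the scene started
      sceneProgress: frame / totalFrames, // fraction of the scene completed
    });
  }
  return timings;
}

const exampleTimings = frameTimings(3, 30);
console.log(exampleTimings.length); // 90
console.log(exampleTimings[45]);    // { frame: 45, sceneTimeSeconds: 1.5, sceneProgress: 0.5 }
```

Under these assumptions the last frame (89) lands at about 2.97 seconds and 0.99 progress, matching the table.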

Step 3: Browser Screenshots Each Frame

SuperImg runs a headless Chromium browser (via Playwright). For each frame:

1. Injects your HTML into the page
2. Takes a screenshot at the exact canvas dimensions
3. Saves the image

This is why CSS works perfectly—it's a real browser rendering your styles.
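The per-frame loop above can be sketched roughly like this. It is an illustration under assumptions, not SuperImg's implementation: captureFrames, PageLike, and the frame-00000.png naming are hypothetical. The sketch only relies on a page object with setContent() and screenshot() methods, which Playwright's Page provides.

```typescript
// Minimal shape of the page object this sketch needs. Playwright's Page
// has setContent() and screenshot() methods that fit this shape.
interface PageLike {
  setContent(html: string): Promise<void>;
  screenshot(options: { path: string }): Promise<unknown>;
}

// Hypothetical capture loop: one render call and one screenshot per frame.
async function captureFrames(
  page: PageLike,
  renderFrame: (frame: number) => string,
  totalFrames: number,
  outDir: string,
): Promise<string[]> {
  const paths: string[] = [];
  for (let frame = 0; frame < totalFrames; frame++) {
    // Inject the HTML for this frame into the page.
    await page.setContent(renderFrame(frame));
    // Zero-padded names keep the frames in order on disk (naming is an assumption).
    const path = `${outDir}/frame-${String(frame).padStart(5, "0")}.png`;
    // Save the rendered page as an image.
    await page.screenshot({ path });
    paths.push(path);
  }
  return paths;
}
```

With Playwright, the page could come from chromium.launch() followed by browser.newPage({ viewport: { width: 1920, height: 1080 } }), so that screenshots match the canvas size.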

Step 4: FFmpeg Encodes to Video

Once all frames are captured, FFmpeg stitches them into an MP4 (or WebM, or other formats). The video codec, bitrate, and quality are all configurable.
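For a sense of what the encoding step involves, here is a sketch of an FFmpeg invocation for a numbered PNG sequence. The flags are standard FFmpeg options, but the ffmpegArgs helper, the file naming, and the codec choice are assumptions for illustration. SuperImg's actual command may differ.

```typescript
// Hypothetical helper that builds an FFmpeg argument list for a PNG sequence.
// The flags are standard FFmpeg options; the file naming is an assumption.
function ffmpegArgs(framesDir: string, fps: number, output: string): string[] {
  return [
    "-y",                                // overwrite the output file if it exists
    "-framerate", String(fps),           // frame rate of the input image sequence
    "-i", `${framesDir}/frame-%05d.png`, // numbered input frames
    "-c:v", "libx264",                   // H.264 video codec
    "-pix_fmt", "yuv420p",               // pixel format most players can decode
    output,
  ];
}

console.log(["ffmpeg", ...ffmpegArgs("out", 30, "video.mp4")].join(" "));
// ffmpeg -y -framerate 30 -i out/frame-%05d.png -c:v libx264 -pix_fmt yuv420p video.mp4
```

The argument list could be passed to a process runner such as Node's child_process.spawn("ffmpeg", args).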

Why This Architecture?

CSS Just Works

Any CSS property renders correctly because every frame is drawn by a real browser. Flexbox, grid, transforms, filters, blend modes: they all work exactly as you'd expect.

No Learning Curve

You already know HTML and CSS. There's no new animation language to learn. If you can build a website, you can build a video.

Deterministic Output

The same template + data = the same video. Every time, provided your render function depends only on ctx. This makes videos testable, reproducible, and safe to generate in CI/CD pipelines.
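Determinism is easy to break by calling Math.random() or Date.now() inside render. If a template needs randomness, one option is a seeded generator keyed on the frame number. The sketch below uses mulberry32, a small public-domain PRNG. It is a swapped-in technique, not a SuperImg feature, and jitterX is a hypothetical helper.

```typescript
// mulberry32: a small seeded pseudo-random generator (not part of SuperImg).
// The same seed always yields the same sequence of values in [0, 1).
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical render helper: horizontal jitter derived only from the frame
// number, so rendering the same frame twice gives the same offset.
function jitterX(frame: number): number {
  const random = mulberry32(frame);
  return Math.round(random() * 20 - 10); // whole pixels between -10 and 10
}
```

Because jitterX depends only on its input, frame 7 gets the same offset on every render, on any machine.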

Batch Rendering

Pass different data to the same template, get different videos. Render 1,000 personalized videos from a CSV without changing any code.
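As a rough sketch of the CSV case: each row becomes one data object, merged over the template's defaults. rowsFromCsv is a hypothetical helper, and the parsing assumes a simple CSV with no quoted commas. A real pipeline would likely use a CSV library.

```typescript
// Hypothetical batch helper: turns CSV text into one data object per row,
// merged over the template defaults. Assumes simple CSV with no quoted commas.
type SceneData = Record<string, string>;

function rowsFromCsv(csv: string, defaults: SceneData): SceneData[] {
  const [headerLine, ...lines] = csv.trim().split(/\r?\n/);
  const headers = headerLine.split(",").map((h) => h.trim());
  return lines.map((line) => {
    const cells = line.split(",").map((c) => c.trim());
    const row: SceneData = {};
    headers.forEach((header, i) => {
      if (cells[i]) row[header] = cells[i]; // empty cells fall back to defaults
    });
    return { ...defaults, ...row };
  });
}

const jobs = rowsFromCsv("title\nHello Ada\nHello Grace", { title: "Hello World" });
console.log(jobs.map((job) => job.title)); // [ 'Hello Ada', 'Hello Grace' ]
```

Each entry in jobs would then drive one render of the same template.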

What's in the Context?

The ctx object your render function receives:

render(ctx) {
  // Timing
  ctx.frame              // Current frame number (0, 1, 2, ...)
  ctx.sceneTimeSeconds   // Time in seconds (0.0, 0.033, 0.067, ...)
  ctx.sceneProgress      // Progress from 0 toward 1 (0.99 on the last frame of the 3s example)
 
  // Dimensions
  ctx.width              // Video width in pixels
  ctx.height             // Video height in pixels
 
  // Data (merged from defaults + runtime data)
  ctx.data.title         // Your custom data
 
  // Standard library
  ctx.std.tween(...)     // Smooth interpolation
  ctx.std.math.clamp(...) // Constrain values
  ctx.std.css(...)       // Generate CSS strings
}
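To make the standard-library helpers less abstract, here are illustrative stand-ins for tween and clamp. They are not SuperImg's implementations. The tween call shape (from, to, progress) follows the template example, the clamp argument order (value, min, max) is an assumption, and the real tween may apply easing rather than the plain linear interpolation used here.

```typescript
// Illustrative stand-ins, not SuperImg's actual implementations.
// clamp's argument order (value, min, max) is an assumption.
function clamp(value: number, min: number, max: number): number {
  return Math.min(Math.max(value, min), max);
}

// Linear interpolation from `from` to `to`, using the same call shape as
// tween(0, 1, progress) in the template example. Progress is clamped to 0..1.
function tween(from: number, to: number, progress: number): number {
  const t = clamp(progress, 0, 1);
  return from + (to - from) * t;
}

// Hypothetical usage: fade in over the first third of the scene, then hold
// at full opacity for the rest.
function fadeInOpacity(sceneProgress: number): number {
  return tween(0, 1, sceneProgress * 3);
}
```

Because the progress value is clamped, fadeInOpacity stays at 1 once the scene is a third of the way through.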

Next Steps

CLI Reference — Commands for creating and rendering videos
Animation Basics — How to think about timing and motion
Player — Embed videos in web apps