Transformer Explainer
Backend Engineering Tour

A walkthrough of the full-stack codebase behind transformer-explainer-three.vercel.app

Next.js 14 App Router · Drizzle + Postgres · Auth.js v5 · Vitest + Playwright

Roadmap

🏛️ The shape
🔄 Request lifecycle
🔐 Data & auth
⚙️ API patterns
🚀 Test & deploy
🐛 Production reality

Each section drills from "what" → "how" → "why we chose it" → "what broke in production". Target audience: a junior backend engineer meeting a modern TypeScript codebase for the first time.

Part 1

The shape of the system

                ┌───────────────────────────────────────────────┐
                │                  Browser                      │
                │  React UI · Service Worker · localStorage     │
                └────────────────┬──────────────────────────────┘
                                 │  HTTPS
                                 ▼
                ┌───────────────────────────────────────────────┐
                │                  Vercel                       │
                │   Edge network · Serverless Node.js runtime   │
                │   • Server Components (pages)                 │
                │   • Server Actions  (sign-in, sign-out)       │
                │   • Route Handlers  (/api/*)                  │
                └───────┬─────────────────────────┬─────────────┘
                        │                         │
                        │ Drizzle ORM             │ HTTPS
                        ▼                         ▼
            ┌──────────────────┐       ┌────────────────────┐
            │   Neon Postgres  │       │   GitHub OAuth     │
            │  (managed pg17)  │       │  (identity prov.)  │
            └──────────────────┘       └────────────────────┘

The only stateful pieces are Postgres and the browser (cookies and localStorage). Everything in between is functions: the same code that runs locally under pnpm dev runs on Vercel as serverless invocations, spun up on demand.

The stack

Layer        Choice                           Why
Runtime      Node.js 24                       Default on Vercel
Framework    Next.js 14 (App Router)          Co-locates server + client React
Language     TypeScript (strict)              Catches bugs at compile time
ORM          Drizzle                          Thin, SQL-shaped, type-safe
Database     Postgres / Neon                  Free-tier managed, branched DBs
Auth         Auth.js v5                       Handles OAuth + sessions
Validation   Zod                              Runtime schemas + TS inference
Tests        Vitest + Playwright + axe-core   Unit, e2e, a11y
Deploy       Vercel + GitHub Actions          Push to main → live in ~90s

Constraint: nothing exotic. Every piece is something a junior engineer will plausibly meet at a normal employer.

Part 2

What happens on GET /learn/03-attention

 Browser                Vercel Edge          Serverless Fn        Postgres
   │                       │                     │                   │
   │  GET /learn/03-...    │                     │                   │
   │ ─────────────────────►│   route lookup      │                   │
   │                       │ ──── invoke ───────►│                   │
   │                       │                     │  read MDX file    │
   │                       │                     │  (bundled)        │
   │                       │                     │  SELECT session   │
   │                       │                     │ ─────────────────►│
   │                       │                     │ ◄─────────────────│
   │                       │                     │  SELECT progress  │
   │                       │                     │ ─────────────────►│
   │                       │                     │ ◄─────────────────│
   │                       │   render JSX → HTML │                   │
   │  HTML + RSC payload   │ ◄── return ─────────│                   │
   │ ◄─────────────────────│                     │                   │
   │  paint, hydrate JS    │                     │                   │
   │                       │                     │                   │
   │  POST /api/events     │                     │                   │
   │ ─────────────────────►│ ──── invoke ───────►│  INSERT events    │
   │                       │                     │ ─────────────────►│
   │  200 ok               │ ◄────────────────── │                   │
   │ ◄─────────────────────│                     │                   │

One page visit = many independent function invocations. Each one is stateless, runs to completion, exits. That is the serverless mental model.

Four kinds of code

                ┌─── server ────┐    ┌──── browser ────┐
                │               │    │                 │
                │  Server       │    │                 │
                │  Components   │    │                 │
                │               │    │                 │
                │  ┌──────────┐ │    │                 │
                │  │ Server   │ │    │                 │
                │  │ Actions  │◄├────┤  form submit    │
                │  └──────────┘ │    │                 │
                │  ┌──────────┐ │    │  ┌──────────┐   │
                │  │ Route    │◄├────┤──┤ Client   │   │
                │  │ Handlers │ │    │  │ Comps    │   │
                │  │ /api/... │ │    │  └──────────┘   │
                │  └──────────┘ │    │     ▲           │
                │  Server       │    │     │ initial   │
                │  Components ──┼────┤──►  │ HTML +    │
                │  emit HTML +  │    │     │ hydration │
                │  hydration    │    │     │           │
                └───────────────┘    └─────────────────┘

Server Components & Server Actions never ship JS to the browser. They run only on the server, can await directly, and can read files and the DB.
Client Components run server-side once to produce HTML, then in the browser to hydrate.
Route Handlers are plain HTTP endpoints, the closest thing to "traditional" backend code.
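
To make the Route Handler case concrete, here is a minimal sketch that uses only the standard Request and Response objects the App Router passes in and expects back. The path src/app/api/hello/route.ts and the greeting payload are hypothetical, and the { ok, data } envelope is an assumption that mirrors the response shape this deck uses for its API routes.

```typescript
// src/app/api/hello/route.ts (hypothetical path, for illustration only)
// A Route Handler is an exported function named after the HTTP method.
export async function GET(request: Request): Promise<Response> {
  // Read input from the URL, exactly as any HTTP endpoint would.
  const url = new URL(request.url);
  const name = url.searchParams.get("name") ?? "world";

  // Response.json serialises the body and sets the JSON content type.
  return Response.json({ ok: true, data: { greeting: `Hello, ${name}` } });
}
```

Because the function only depends on Request and Response, it can be called directly in a unit test without starting a server.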

A Server Component reads the DB directly


// src/app/learn/page.tsx — Server Component
export default async function LearnIndex(): Promise<JSX.Element> {
  const status = await Promise.all(
    SECTIONS.map(async (s) => ({
      ...s,
      ready: (await readSectionMdx(s.slug)) !== null,
    })),
  );

  const session = await runOrFallback(
    "learn-index:auth",
    getSession,     // ← Auth.js: reads cookie, looks up sessions table
    null,
  );
  const userId = session?.user?.id ?? null;
  const progress = userId
    ? await runOrFallback("learn-index:progress", () => listForUser(userId), [])
    : [];

  return <main>...</main>;  // ← rendered to HTML on the server
}
          

No useState, no useEffect — none of that machinery exists. A Server Component is a plain async function. The JSX it returns is serialised to HTML and sent to the browser.

Part 3

Database schema (eight tables)

              ┌─────────────┐
              │   users     │ ◄────────────────┐
              │  id (uuid)  │                  │
              │  email ─────┤◄──────┐          │
              │  github_id  │       │          │
              └─────────────┘       │          │
                  │       │         │          │
        ┌─────────┘       └─────────┐          │
        ▼                           ▼          │
┌──────────────┐            ┌──────────────┐   │
│  accounts    │            │  sessions    │   │
│  user_id ────┘            │  user_id ────┘   │
│  provider    │            │  expires      │  │
└──────────────┘            └──────────────┘   │
                                                │
        ┌──────────────┐    ┌──────────────┐   │
        │  comments    │    │  experiments │   │
        │  user_id ────┼────┤  owner_id ───┼───┘
        │  parent_id ──┐    │  slug         │
        │  section_slug│    │  config_json  │
        └──────┬───────┘    │  forked_from  │
               │            └──────────────┘
               │ self-FK (one level threading)
               └───────────►
        ┌──────────────┐    ┌──────────────┐
        │  progress    │    │   events     │
        │  user_id ────┼─►  │  user_id?    │ ← nullable (anon ok)
        │  section_slug│    │  session_id  │ ← UUID from localStorage
        │  status      │    │  kind        │
        └──────────────┘    └──────────────┘

UUIDs everywhere (defaultRandom()), foreign keys with onDelete: "cascade", timestamps with timezone, and .notNull() on every required column.

A real-world SQL pattern: monotonic upsert


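// src/lib/progress.ts
import { sql } from "drizzle-orm";

// newRank is the numeric rank of the incoming status, computed earlier in this
// file (assumed here: not_started = 0, in_progress = 1, completed = 2).
await db.insert(progress).values({ userId, sectionSlug, status })
  .onConflictDoUpdate({
    target: [progress.userId, progress.sectionSlug],
    set: {
      // ${progress.status} is the value already stored; ${status} is the incoming one.
      status: sql`CASE
        WHEN ${progress.status} = 'completed' THEN ${progress.status}
        WHEN ${progress.status} = 'in_progress' AND ${newRank} >= 1 THEN ${status}
        WHEN ${progress.status} = 'not_started' THEN ${status}
        ELSE ${progress.status}
      END`,
      updatedAt: new Date(),
    },
  });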
          

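Problem: every page visit fires in_progress. A return visit must not regress a completed row. Solution: the CASE clause encodes a state machine in SQL, so only forward transitions are allowed. Because the check runs inside a single INSERT ... ON CONFLICT DO UPDATE statement, Postgres applies it atomically per row, so two concurrent requests cannot interleave a read and a write.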
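
The same forward-only rule can be sketched as a pure TypeScript function, which is handy for unit-testing the transitions without a database. The numeric ranks and the name nextStatus are assumptions made for this illustration; the SQL version stays the source of truth because it runs atomically.

```typescript
type Status = "not_started" | "in_progress" | "completed";

// Assumed ordering: a higher rank means further along.
const RANK: Record<Status, number> = {
  not_started: 0,
  in_progress: 1,
  completed: 2,
};

// Returns the status to store: the incoming one only if it moves forward.
export function nextStatus(current: Status, incoming: Status): Status {
  return RANK[incoming] > RANK[current] ? incoming : current;
}
```

A return visit to a finished section calls nextStatus("completed", "in_progress"), which keeps "completed".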

OAuth flow — sign-in with GitHub

 Browser              Our server          GitHub             Postgres
   │  click "Sign in"    │                  │                   │
   │ ───────────────────►│ set CSRF cookie  │                   │
   │                     │ 302 redirect     │                   │
   │ ◄───────────────────│                  │                   │
   │  GET /authorize?... │                  │                   │
   │ ──────────────────────────────────────►│                   │
   │  consent screen     │                  │                   │
   │ ◄──────────────────────────────────────│                   │
   │  POST consent       │                  │                   │
   │ ──────────────────────────────────────►│                   │
   │  302 to /callback?code=XXX             │                   │
   │ ◄──────────────────────────────────────│                   │
   │  GET /api/auth/callback?code=XXX       │                   │
   │ ───────────────────►│  POST /token (server→server)         │
   │                     │ ────────────────►│                   │
   │                     │  access_token    │                   │
   │                     │ ◄────────────────│                   │
   │                     │  GET /user → profile                 │
   │                     │ upsert user, insert account, session │
   │                     │ ───────────────────────────────────►│
   │  Set session cookie │                  │                   │
   │  302 to /           │                  │                   │
   │ ◄───────────────────│                  │                   │

Auth.js handles every step. We provide a profile mapper and a Drizzle adapter; the library does CSRF, the token exchange, the DB inserts, and the session cookie.
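
As an illustration of the profile mapper, the sketch below turns a GitHub profile into the fields for a new user row. The field names on both sides (githubId, image, and the GitHubProfile shape) are assumptions for this example; the one detail taken from this deck is that the mapper does not return an id, so the database can generate its own UUID.

```typescript
// Subset of the GitHub /user response used here (assumed shape).
interface GitHubProfile {
  id: number;
  login: string;
  name: string | null;
  email: string | null;
  avatar_url: string;
}

// Fields handed to the adapter; note there is deliberately no `id`.
interface MappedUser {
  name: string;
  email: string | null;
  image: string;
  githubId: string;
}

export function mapGitHubProfile(profile: GitHubProfile): MappedUser {
  return {
    // Fall back to the login when the user has no display name.
    name: profile.name ?? profile.login,
    email: profile.email,
    image: profile.avatar_url,
    // GitHub's numeric id is kept as a string in its own column,
    // never used as the primary key.
    githubId: String(profile.id),
  };
}
```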

Sessions — database vs JWT

Database (production)
  • Cookie holds opaque ID
  • Each request: DB read on sessions
  • Logout: delete the row → instant revoke
  • Slightly slower (~5 ms/request) but stronger guarantees
JWT (E2E test build)
  • Cookie IS the session (signed JWT)
  • No DB read per request
  • Logout deletes cookie; already-issued JWTs stay valid until they expire
  • Required if using Credentials provider

session: { strategy: e2eAuthEnabled ? "jwt" : "database" },
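
To see why a signed cookie needs no DB read, and why it cannot be revoked early, here is a simplified sketch of a self-contained token. It is not a real JWT and not what Auth.js does internally; issueToken, verifyToken, and the payload shape are hypothetical names for illustration, built on Node's crypto module.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// HMAC-SHA256 over the payload, encoded so it is safe inside a cookie.
function sign(payload: string, secret: string): string {
  return createHmac("sha256", secret).update(payload).digest("base64url");
}

export function issueToken(userId: string, expiresAt: number, secret: string): string {
  const payload = Buffer.from(JSON.stringify({ userId, expiresAt })).toString("base64url");
  return `${payload}.${sign(payload, secret)}`;
}

// Verification uses only the token and the secret: no database lookup.
export function verifyToken(
  token: string,
  secret: string,
  now: number = Date.now(),
): { userId: string } | null {
  const [payload, signature] = token.split(".");
  if (!payload || !signature) return null;

  const expected = Buffer.from(sign(payload, secret));
  const given = Buffer.from(signature);
  // timingSafeEqual requires equal lengths, so check that first.
  if (expected.length !== given.length || !timingSafeEqual(expected, given)) return null;

  const { userId, expiresAt } = JSON.parse(Buffer.from(payload, "base64url").toString("utf8"));
  if (typeof userId !== "string" || typeof expiresAt !== "number") return null;
  return now < expiresAt ? { userId } : null;
}
```

Nothing on the server records that the token exists, so deleting the cookie on logout does not invalidate a copy held elsewhere; only the expiry does. A database session avoids this by deleting the row.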
          

Part 4

Three flavours of API route

Pure compute

/api/compute/embed, etc. CPU-bound, no I/O. Safest endpoints — touch no shared state.

Auth + DB

Session lookup, authorise, validate, query, return. Uniform response: { ok: true, data } | { ok: false, error }.

Analytics

/api/events — accepts anonymous batches. Rate-limited per IP, with graceful DB fallback. Loss of analytics ≪ user-visible 500s.

The choice between Server Action and Route Handler comes down to who calls it. Server Actions are for your own forms. Route Handlers are for anyone with HTTP — JS clients, mobile apps, curl.
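
The uniform response shape can be expressed as a discriminated union, so that checking ok narrows the type on the client. This is a sketch under assumptions: the names ApiResult, ok, fail, and unwrap are hypothetical, and error is assumed to be a plain string.

```typescript
// Either a success with data or a failure with a message, never both.
export type ApiResult<T> =
  | { ok: true; data: T }
  | { ok: false; error: string };

export function ok<T>(data: T): ApiResult<T> {
  return { ok: true, data };
}

export function fail(error: string): ApiResult<never> {
  return { ok: false, error };
}

// Client-side helper: returns the data or throws the server's message.
export function unwrap<T>(result: ApiResult<T>): T {
  if (result.ok) return result.data; // narrowed to the success branch
  throw new Error(result.error);     // narrowed to the failure branch
}
```

With one envelope everywhere, a client needs a single helper like unwrap instead of per-endpoint parsing.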

Validate every input with Zod


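// src/lib/analytics-shared.ts
import { z } from "zod";

// EVENT_KIND is the tuple of allowed event names, defined elsewhere.
export const ingestSchema = z.object({
  sessionId: z.string().min(8).max(64),
  events: z.array(z.object({
    kind: z.enum(EVENT_KIND),
    sectionSlug: z.string().min(1).max(80).optional(),
    meta: z.record(z.unknown()).optional(),
  })).min(1).max(50),
});
type Ingest = z.infer<typeof ingestSchema>;  // ← TS type for free

// usage inside a route handler such as src/app/api/events/route.ts
import { NextResponse } from "next/server";

// A malformed JSON body makes req.json() throw; map that to null so it
// fails validation with a 400 instead of surfacing as a 500.
const parsed = ingestSchema.safeParse(await req.json().catch(() => null));
if (!parsed.success) {
  return NextResponse.json(
    { ok: false, error: parsed.error.issues[0]?.message },
    { status: 400 },
  );
}
// parsed.data is now validated AND typed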
          

One schema gives you: (1) runtime validation, (2) a static TS type via z.infer, (3) cheap DoS protection (.max(50)).

Graceful fallback — degrade, don't fail


// src/lib/db-fallback.ts
export async function runOrFallback<T>(
  key: string,
  fn: () => Promise<T>,
  fallback: T,
): Promise<T> {
  try { return await fn(); }
  catch (err) {
    maybeWarn(key, err);   // throttled to once per minute per key
    return fallback;
  }
}

// usage
const session = await runOrFallback("learn:auth", getSession, null);
const rows = await runOrFallback("comments:list", () => listForSection(slug), []);
          

A fresh clone with placeholder env values still renders /learn/* and /playground cleanly. DB outage = empty data, not 500. Pick which writes are critical; for everything else, degrade.
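
The throttling inside maybeWarn is not shown in this deck; one way it could work is a per-key timestamp map, sketched below. The now and log parameters, the interval constant, and the message format are assumptions added so the behaviour can be tested with made-up timestamps.

```typescript
const WARN_INTERVAL_MS = 60_000; // at most one warning per key per minute

// Last time each key produced a warning (in-memory, per instance).
const lastWarnedAt = new Map<string, number>();

export function maybeWarn(
  key: string,
  err: unknown,
  now: number = Date.now(),
  log: (message: string) => void = console.warn,
): boolean {
  const last = lastWarnedAt.get(key);
  // Still inside the quiet window for this key: stay silent.
  if (last !== undefined && now - last < WARN_INTERVAL_MS) return false;

  lastWarnedAt.set(key, now);
  const detail = err instanceof Error ? err.message : String(err);
  log(`[fallback] ${key}: ${detail}`);
  return true;
}
```

Throttling per key matters during an outage: every page paint hits the fallback, and an unthrottled warning would flood the logs with the same line.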

Sanitising user-supplied Markdown (XSS)


// src/lib/comments-render.ts
export function renderCommentHtml(md: string): string {
  const rawHtml = marked.parse(md, { async: false }) as string;
  return getSanitize()(rawHtml, {
    USE_PROFILES: { html: true },
    FORBID_TAGS:  ["style", "script"],
    FORBID_ATTR:  ["style", "onerror", "onload", "onclick"],
  });
}
          

Two stages: Marked turns MD → HTML, then DOMPurify walks the tree and removes <script>, event handlers, and javascript: URLs.
Lazy-loaded: jsdom, which DOMPurify needs for a DOM on the server, reads a stylesheet at import time, and that breaks the Vercel build. A cached dynamic require defers the load to runtime.
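
The "cached dynamic require" idea generalises to a small memoising helper. This is a sketch, not the project's getSanitize: the helper name lazy is hypothetical, and the factory stands in for whatever expensive import or setup needs to be deferred.

```typescript
// Wraps an expensive factory so it runs on first use, then never again.
export function lazy<T>(factory: () => T): () => T {
  // Boxed so that a factory returning undefined is still cached.
  let cached: { value: T } | undefined;
  return () => {
    if (cached === undefined) cached = { value: factory() };
    return cached.value;
  };
}

// Example: nothing heavy happens at module load, only on the first call.
let loads = 0;
export const getExpensiveThing = lazy(() => {
  loads += 1; // stands in for a costly require() or client construction
  return { loadedAt: loads };
});
```

Importing this module costs nothing at build time; the factory runs the first time a request actually calls getExpensiveThing().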

Token-bucket rate limiter

       capacity = 5       refill = 1 token / sec
       ┌─────────────────┐
       │ ●●●●●           │  t = 0s    bucket starts full
       └─────────────────┘
            │ call (cost 1)
            ▼
       ┌─────────────────┐
       │ ●●●●            │  t = 0.1s
       └─────────────────┘
            │ … 4 more calls in quick succession
            ▼
       ┌─────────────────┐
       │                 │  t = 0.5s   empty → 6th call rejected (429)
       └─────────────────┘
            │ wait 1s; refill adds a token
            ▼
       ┌─────────────────┐
       │ ●               │  t = 1.5s   one call worth of capacity
       └─────────────────┘

Classic algorithm, ~20 lines of code. The bucket lives in memory per instance, which is fine for our traffic; at larger scale you'd move the counters to a shared store such as Redis. The function takes now as a parameter so tests can pass made-up timestamps; the real clock is the default.
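
A minimal sketch of such a limiter follows. The names (tryConsume, BucketOptions) are hypothetical, and it assumes continuous refill, meaning fractional tokens accrue between calls, whereas the diagram shows whole tokens for simplicity.

```typescript
export interface BucketOptions {
  capacity: number;        // maximum burst size
  refillPerSecond: number; // steady-state rate
}

interface Bucket {
  tokens: number;
  updatedAt: number; // ms timestamp of the last refill calculation
}

// One bucket per key (for example per IP), in memory for this instance only.
const buckets = new Map<string, Bucket>();

export function tryConsume(
  key: string,
  opts: BucketOptions,
  now: number = Date.now(), // injectable clock for tests
): boolean {
  // A key seen for the first time starts with a full bucket.
  const bucket = buckets.get(key) ?? { tokens: opts.capacity, updatedAt: now };

  // Refill for the time elapsed since the last call, capped at capacity.
  const elapsedSeconds = Math.max(0, now - bucket.updatedAt) / 1000;
  bucket.tokens = Math.min(
    opts.capacity,
    bucket.tokens + elapsedSeconds * opts.refillPerSecond,
  );
  bucket.updatedAt = now;
  buckets.set(key, bucket);

  // Each call costs one token; without one, the caller should answer 429.
  if (bucket.tokens < 1) return false;
  bucket.tokens -= 1;
  return true;
}
```

In a handler, a false result maps to a 429 response. Passing now explicitly lets a test replay the burst-then-wait scenario from the diagram without sleeping.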

One env-var boundary


// src/lib/env.ts — ONLY file that reads process.env
const parsed = makeEnvSchema(isProd).safeParse(process.env);
if (!parsed.success) {
  console.error("❌ Invalid environment variables:");
  for (const issue of parsed.error.issues) {
    console.error(`  - ${issue.path.join(".") || "(root)"}: ${issue.message}`);
  }
  throw new Error("Invalid environment configuration. See .env.example.");
}
export const env = parsed.data;   // ← rest of the app imports this
          

One chokepoint validates at boot. A bad deploy crashes immediately with a clear message — not on the first request that happened to need that variable. Production requires real secrets; dev tolerates placeholders.

Part 5

The test pyramid

Level           Tool               What                                                    Speed
Unit            Vitest             Pure functions, Zod schemas, helpers, the math          ~5s
Integration     Vitest             Comment Markdown → safe HTML pipeline                   ~5s
End-to-end      Playwright         Real Chromium drives pnpm dev against a real Postgres   ~50s
Accessibility   axe-core           WCAG violations on key pages                            ~30s
Numerical       PyTorch fixtures   TS ops match PyTorch to within 1e-5                     ~3s

100% line coverage on src/lib/transformer/ is enforced in CI — that's where the maths lives. Drop below 100% and the PR fails to merge.
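
A numerical check of this kind boils down to an element-wise tolerance comparison. The sketch below is illustrative only: softmax and allClose are hypothetical stand-ins for the project's ops and test helper, and the expected values are computed by hand rather than loaded from a PyTorch fixture.

```typescript
// Numerically stable softmax: subtract the max before exponentiating.
export function softmax(xs: number[]): number[] {
  const max = Math.max(...xs);
  const exps = xs.map((x) => Math.exp(x - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((e) => e / sum);
}

// True when both arrays have the same length and every pair of elements
// differs by at most `atol` (absolute tolerance).
export function allClose(actual: number[], expected: number[], atol = 1e-5): boolean {
  if (actual.length !== expected.length) return false;
  for (let i = 0; i < actual.length; i++) {
    const a = actual[i];
    const e = expected[i];
    if (a === undefined || e === undefined || Math.abs(a - e) > atol) return false;
  }
  return true;
}

// softmax([0, ln 3]) is [1/4, 3/4] by hand: e^0 = 1, e^(ln 3) = 3, sum = 4.
export const matchesFixture = allClose(softmax([0, Math.log(3)]), [0.25, 0.75]);
```

Exact equality would fail on floating-point rounding differences between runtimes, which is why the comparison uses a tolerance.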

CI/CD — push to main → live in ~90 seconds

   Developer's machine            GitHub                Vercel
        │                            │                     │
        │ git push branch            │                     │
        │ ──────────────────────────►│                     │
        │                            │  Actions: 4 jobs    │
        │                            │  ├─ Lint & Type     │
        │                            │  ├─ Unit            │
        │                            │  ├─ Verify maths    │
        │                            │  └─ E2E (with pg)   │
        │                            │  ─── all pass ───►  │ build preview
        │                            │                     │ deploys to
        │                            │                     │ <hash>.vercel.app
        │ open PR ──────────────────►│  reviewer reads     │
        │                            │  preview URL + diff │
        │ squash-merge to main ─────►│                     │
        │                            │  Actions re-run     │
        │                            │  on main            │
        │                            │  ─── all pass ───►  │ build prod
        │                            │                     │ deploys to
        │                            │                     │ canonical alias

This loop only works if CI is trustworthy: fast enough you don't ignore it, strict enough that "CI is green" actually means "safe to ship".

Vercel — two delivery paths

File type       Path             Delivery
Static asset    /_next/static/   Edge CDN, cached forever
Static page     /some/page       Edge CDN, may revalidate
Dynamic page    /learn/[slug]    Serverless function on demand
Route handler   /api/...         Serverless function on demand

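Cold start tax: ~200-500 ms for the first request that lands on a fresh function instance (after a deploy, after an idle period, or when traffic scales out). Mitigations: small bundles, no work at module load (lazy init), and steady traffic to keep instances warm.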

Part 6

Six real bugs we hit going live

1. DATABASE_URL not loaded. drizzle-kit doesn't auto-load .env.local — only Next.js does. Fix: explicit dotenv in drizzle.config.ts.
2. 500-cascade on fresh clone. Placeholder DATABASE_URL → DNS failure → /api/events 500s. Fix: runOrFallback on every page-paint DB call.
3. Vercel security scanner blocked deploy. next-mdx-remote@5.0.0 had a CVE. Fix: bump to 6.0.0 (API-compatible).
4. OAuthCallbackError on sign-in. Postgres rejected GitHub's numeric id as invalid uuid. Fix: drop id from mapGitHubProfile; let defaultRandom() generate.
5. ERR_TOO_MANY_REDIRECTS. pages.signIn: "/api/auth/signin" pointed at the Auth.js handler itself. Fix: remove the field; the handler auto-redirects to the only provider.
6. OAuthAccountNotLinked. Seeded admin user blocked first real OAuth sign-in by email collision. Fix: allowDangerousEmailAccountLinking: true on the GitHub provider (GitHub verifies emails, so safe).

What's missing for "real" production

  1. Distributed rate limiting. The in-memory bucket resets on every cold start and is not shared between instances. → Redis / Upstash.
  2. Background jobs. Anything time-consuming off the request path. → Vercel Cron, Inngest, BullMQ.
  3. Observability. Tracing (OpenTelemetry → Honeycomb / Datadog) and error tracking (Sentry).
  4. Real migrations, not db:push. Generated SQL files, applied deterministically.
  5. Multi-environment Postgres. Separate dev / staging / prod (Neon branches make this cheap).
  6. Secrets rotation. Auth secret, OAuth client secret, DB password rotated on a schedule.
  7. RBAC. Beyond admin/non-admin: scoped tokens, audit logs.

If you can articulate the why behind each of these, you're past the junior bar.

Lessons for a junior backend engineer

  1. One env-var boundary. Validate at boot. Crash loud if anything's missing.
  2. One response shape. { ok, data } | { ok, error } everywhere — clients never guess.
  3. Validate every input with Zod. Network = untrusted. DB = trusted.
  4. Degrade gracefully on read paths. Empty data beats 500s.
  5. TypeScript ≠ DB constraints. Your column type is the real source of truth.
  6. Lazy-load expensive imports. Top-level side effects bite at build time.
  7. Read production logs. Half of debugging is finding the actual error message, not guessing.
  8. Control the inputs to your tests. Pass in now, pass in seeds: deterministic tests don't flake.

Thanks

The codebase is itself the teaching artefact. Open src/lib/auth/config.ts, src/lib/db/schema.ts, or src/app/api/events/route.ts alongside this presentation — every pattern shown here is one git-blame away.

Next.js 14 · TypeScript strict · Drizzle + Neon · Auth.js v5 · Vitest + Playwright · Vercel