TECHNICAL PRESENTATION

Introduction to
FastAPI

A modern Python framework where type hints become the API
Pydantic v2 Starlette / ASGI Depends() OpenAPI · ReDoc
📝 type hints 🧱 validation 🔌 inject ⚡ async 📜 OpenAPI 🚀 deploy

Type hints drive validation, serialisation, dependency injection and OpenAPI generation — one declaration, four behaviours, no decorator stack to learn.

Type  ·  Validate  ·  Inject  ·  Async  ·  Document  ·  Deploy
02

Topics

Foundations

  • What FastAPI is — origins, Sebastián Ramírez, Starlette + Pydantic
  • Why FastAPI — type-driven design, performance, ergonomics
  • Installation, the first endpoint, uvicorn
  • Path operations — methods, path / query / body params
  • Pydantic v2 — models, validators, field types

Behaviour

  • Dependency injection with Depends()
  • Sub-dependencies, generators, yield for cleanup
  • Async vs sync — when each runs and where
  • Authentication — OAuth2 password, JWT, API keys
  • Authorisation — scopes, dependencies as policies
  • Background tasks, streaming, file uploads

Integration

  • Database — SQLAlchemy 2.0 async, Tortoise, SQLModel
  • Pydantic Settings — typed config from env / .env
  • Middleware & lifespan events
  • WebSockets & SSE
  • Testing — TestClient, async tests, fixtures
  • OpenAPI — tags, descriptions, ReDoc, custom schema

Production

  • Performance — uvicorn, gunicorn, workers, async pitfalls
  • Deployment — Docker, K8s, ASGI servers compared
  • Observability — logging, metrics, OpenTelemetry
  • Security — CORS, HTTPS, rate limiting, request size
  • Comparisons — Flask, Django REST, Litestar, Starlette
  • Production patterns — lifespan, dependency overrides
03

What Is FastAPI?

FastAPI is a Python web framework created by Sebastián Ramírez (tiangolo) in 2018. It builds on Starlette (an ASGI toolkit) and Pydantic v2 (data validation with a Rust core) to turn type hints into HTTP behaviour.

The two-line pitch

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class User(BaseModel):
    email: str
    name: str
    age: int | None = None

@app.post('/users', response_model=User, status_code=201)
async def create_user(user: User) -> User:
    return user
# → validation, deserialisation, serialisation, OpenAPI,
#    Swagger UI, ReDoc — all from the type hints alone.

What FastAPI owns

  • Routing — path operations, path / query / body params
  • Validation — via Pydantic v2 (Rust-backed, fast)
  • Serialisation — response models, exclude / include
  • DI — Depends() with sub-deps and yield-cleanup
  • Docs — OpenAPI 3.1, Swagger UI, ReDoc, schema export
  • WebSockets, SSE, background tasks
  • Test client (httpx-based)

Where it sits in the stack

Layer         Component
App           FastAPI — routes, DI, OpenAPI
HTTP          Starlette — ASGI app, middleware, WebSocket
Validation    Pydantic v2 — pydantic-core (Rust)
Server        Uvicorn / Hypercorn / Granian (ASGI)
Process mgr   Gunicorn (with uvicorn workers) / systemd

Who uses it

  • Microsoft (Azure SDK examples), Uber, Netflix internal services
  • Most Python ML / LLM serving stacks — Hugging Face TGI, OpenAI cookbook examples, vLLM API server
  • Anthropic / OpenAI / Cohere SDK example servers
  • Cited as the third-most-used Python web framework after Django and Flask in the JetBrains 2024 survey

What it is not

Not a full-stack framework. No ORM (you bring SQLAlchemy / SQLModel / Tortoise). No template engine (you bring Jinja2). No admin / auth UI (you bring your own). It's a web API framework.

04

Why FastAPI?

Three claims, each backed by something real: type-driven design, performance close to Node / Go, developer ergonomics from one declaration doing four jobs.

One declaration, four behaviours

@app.get('/items/{item_id}')
def read_item(
    item_id: int,                          # path — coerced & validated
    q: str | None = None,                  # query — optional
    skip: int = 0, limit: int = 100,       # query — defaults
    user: User = Depends(current_user),    # injected
):
    ...

# → routing, type coercion, validation, OpenAPI param entries,
#    DI, automatic error responses with location info — one signature.

Performance

  • Pydantic v2 core is in Rust — validation is >10× faster than v1
  • Async-first — one event loop, thousands of in-flight requests
  • Comparable to Node / Go for I/O-bound APIs in independent benchmarks
  • Sync paths run in a thread pool — mixing sync and async is graceful

What you give up — honestly

  • No batteries — you assemble ORM, migrations, auth from libraries
  • Async-first culture — sync-only DBs (e.g. some legacy drivers) need extra care
  • Pydantic learning curve — v1 vs v2 differences still trip teams up
  • DI is opinionated — Depends() is great, but unusual if you've come from Spring / DI containers

When FastAPI is the right tool

  • JSON / OpenAPI HTTP APIs — the original sweet spot
  • ML / LLM model servers — PyTorch, transformers, vLLM
  • Internal microservices in a Python house
  • Anywhere you want OpenAPI as a contract for free
  • Background-task / webhook receivers

When something else fits better

  • Server-rendered HTML apps with auth, admin, ORM → Django
  • Tiny single-page proof-of-concepts → Flask
  • Need msgspec speed and stricter type discipline → Litestar
05

Installation & First Endpoint

FastAPI ships as a regular Python package. The recommended install pulls in uvicorn (ASGI server), httpx (test client), and a curated set of optional extras — one command and you're live.

Install

# the recommended bundle — FastAPI + uvicorn + extras
uv add 'fastapi[standard]'

# minimal — just the framework
uv add fastapi
uv add 'uvicorn[standard]'

# pip equivalent
pip install 'fastapi[standard]'

# version
uv run python -c 'import fastapi; print(fastapi.__version__)'

The first app

# app/main.py
from fastapi import FastAPI

app = FastAPI(title='Hello API', version='0.1.0')

@app.get('/healthz')
def health() -> dict:
    return {'ok': True}

@app.get('/echo/{msg}')
async def echo(msg: str) -> dict:
    return {'msg': msg}

Run it

# dev: hot reload, single worker
uv run fastapi dev app/main.py
# → http://127.0.0.1:8000
# → http://127.0.0.1:8000/docs   (Swagger UI)
# → http://127.0.0.1:8000/redoc  (ReDoc)

# prod-style: uvicorn directly
uv run uvicorn app.main:app --host 0.0.0.0 --port 8000 --workers 4

Anatomy of a path operation

  • Decorator — @app.get / .post / .put / .patch / .delete sets method & path
  • Function name — cosmetic; appears as the OpenAPI operationId
  • Parameters — FastAPI inspects type + default & classifies as path / query / body / dep
  • Return type — used as response_model if not overridden; drives serialisation

Project layout that scales

app/
├── main.py              # creates FastAPI(), includes routers
├── api/
│   ├── deps.py          # shared dependencies
│   └── v1/
│       ├── __init__.py
│       ├── users.py     # APIRouter
│       └── items.py
├── core/
│   ├── config.py        # pydantic Settings
│   ├── security.py
│   └── logging.py
├── schemas/             # pydantic models (DTOs)
├── models/              # SQLAlchemy ORM models
├── services/            # domain logic
├── db/                  # session, migrations
└── tests/
06

Path Operations — Params & Responses

Parameters are classified by their type and where they're declared. FastAPI never guesses — it follows a single rule: a name present in the path template is a path param; an explicit Path, Query, Body, Header or Cookie annotation overrides; otherwise a scalar type → query and a Pydantic model → body.

The classification rule

from fastapi import Query, Path, Header, Cookie, Body
from typing import Annotated

@app.get('/users/{user_id}/posts')
def list_posts(
    user_id: Annotated[int, Path(ge=1)],
    # query params with constraints
    page:  Annotated[int, Query(ge=1)] = 1,
    size:  Annotated[int, Query(ge=1, le=100)] = 20,
    q:     Annotated[str | None, Query(max_length=64)] = None,
    # header
    x_request_id: Annotated[str | None, Header()] = None,
    # cookie
    session: Annotated[str | None, Cookie()] = None,
):
    ...

Body — one model or many fields

from pydantic import BaseModel

class ItemCreate(BaseModel):
    name: str
    price: float

@app.post('/items')
def create(item: ItemCreate):     # whole body
    ...

@app.post('/items/multi')
def create2(
    item: ItemCreate,
    user: UserStub,               # another BaseModel — also read from the body
    importance: Annotated[int, Body(ge=1, le=5)] = 1,
):
    # body becomes { "item": ..., "user": ..., "importance": ... }
    ...

Response models & status

from fastapi import status

class ItemRead(BaseModel):
    id: int
    name: str
    price: float

@app.post('/items',
          response_model=ItemRead,
          status_code=status.HTTP_201_CREATED,
          tags=['items'],
          summary='Create an item',
          response_description='The created item')
def create(item: ItemCreate) -> ItemRead:
    return ItemRead(id=42, **item.model_dump())

Multiple responses, error shapes

from fastapi import HTTPException

class Error(BaseModel):
    code: str
    message: str

@app.get('/items/{id}',
         response_model=ItemRead,
         responses={
           404: {'model': Error, 'description': 'Not found'},
           409: {'model': Error, 'description': 'Conflict'},
         })
def get(id: int):
    if id > 1000:
        raise HTTPException(404, detail='not found')
    return ItemRead(id=id, name='x', price=1.0)

The Annotated-first style

FastAPI 0.95+ recommends Annotated[T, Query(...)] over = Query(...) default values — lets type checkers see the real type, plays nicely with stricter mypy / pyright settings, and reads better in long signatures.
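Why checkers prefer Annotated can be shown with the stdlib alone — no FastAPI here; Marker is a hypothetical stand-in for the Query(...) metadata object. The checker-visible type stays plain str, while the metadata survives for a framework to inspect:

```python
from typing import Annotated, get_type_hints

class Marker:
    # hypothetical stand-in for FastAPI's Query(...) metadata object
    def __init__(self, max_length: int):
        self.max_length = max_length

def search(q: Annotated[str, Marker(max_length=64)] = 'x') -> str:
    return q

# without extras, the type of q resolves to plain str — what mypy/pyright see
assert get_type_hints(search)['q'] is str

# with extras, the Marker metadata is still attached for a framework to read
meta = get_type_hints(search, include_extras=True)['q'].__metadata__[0]
print(meta.max_length)  # → 64
```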

07

Pydantic v2 — The Validation Engine

Pydantic v2 (released 2023) rewrote the validation core in Rust. It is what makes FastAPI fast, strict, and informative — and learning Pydantic is most of learning FastAPI.

Models & field constraints

from pydantic import BaseModel, Field, EmailStr, HttpUrl
from datetime import datetime
from decimal import Decimal
from typing import Literal

class User(BaseModel):
    id:       int
    email:    EmailStr
    name:     str = Field(min_length=1, max_length=120)
    role:     Literal['user', 'admin'] = 'user'
    avatar:   HttpUrl | None = None
    balance:  Decimal = Field(decimal_places=2)
    created:  datetime
    metadata: dict[str, str] = {}

# parse / validate
u = User.model_validate(payload)        # raises ValidationError on bad input
u = User.model_validate_json(raw_bytes) # parse JSON directly — faster than json.loads + .model_validate

Field validators

from pydantic import field_validator, model_validator

class Order(BaseModel):
    qty:  int
    unit_price: Decimal
    total: Decimal

    @field_validator('qty')
    @classmethod
    def positive(cls, v):
        if v <= 0: raise ValueError('qty must be positive')
        return v

    @model_validator(mode='after')
    def check_total(self):
        if self.total != self.qty * self.unit_price:
            raise ValueError('total mismatch')
        return self

Serialisation — model_dump

u.model_dump()              # → dict
u.model_dump(mode='json')   # JSON-compatible (Decimal/datetime as str)
u.model_dump_json()         # → JSON str
u.model_dump(exclude={'metadata'})
u.model_dump(include={'id', 'email'})
u.model_dump(by_alias=True, exclude_none=True)

# computed fields surface in serialisation
from pydantic import computed_field

class Box(BaseModel):
    w: float; h: float; d: float

    @computed_field
    @property
    def volume(self) -> float:
        return self.w * self.h * self.d

Common types you'll reach for

Type                                Use
EmailStr                            Validated email (needs email-validator)
HttpUrl · AnyHttpUrl                URL with scheme / host parsing
UUID4 · UUID5                       UUID with version check
SecretStr                           Hidden in repr / dump
conlist · conint                    Constrained list / int
Annotated[..., AfterValidator(f)]   Plug a function in
RootModel[list[X]]                  Top-level list / scalar
Field(discriminator='kind')         Tagged union
08

Dependency Injection with Depends()

FastAPI's signature feature: anything that can be a parameter can be a dependency. Dependencies compose, can have their own dependencies, and can yield to clean up — replacing what a DI container does in other ecosystems.

The basic shape

from fastapi import Depends

# any callable can be a dependency
def common_paginate(
    page: int = 1,
    size: int = 20,
) -> dict:
    return {'page': page, 'size': size}

@app.get('/items')
def list_items(
    p: Annotated[dict, Depends(common_paginate)],
):
    return {'page': p['page'], 'size': p['size']}

# the same mechanism with a real resource — a DB session
def get_db() -> Session: ...
@app.get('/u/{id}')
def by_id(id: int, db: Annotated[Session, Depends(get_db)]):
    ...

Yield — setup / teardown

from sqlalchemy.orm import Session
from app.db import SessionLocal

def get_db():
    db = SessionLocal()
    try:
        yield db          # the value passed to the handler
    finally:
        db.close()        # runs after response is sent

# any exception in the handler → the finally still runs.

Sub-dependencies and policies

def current_user(
    token: Annotated[str, Depends(oauth2_scheme)],
    db:    Annotated[Session, Depends(get_db)],
) -> User:
    user = decode_and_load(token, db)
    if not user: raise HTTPException(401)
    return user

def require_admin(
    user: Annotated[User, Depends(current_user)],
) -> User:
    if user.role != 'admin': raise HTTPException(403)
    return user

@app.delete('/users/{id}')
def remove(id: int,
           _admin: Annotated[User, Depends(require_admin)]):
    ...

Where to attach dependencies

  • Per-handler — def f(x = Depends(dep))
  • Per-router — APIRouter(dependencies=[Depends(verify_api_key)])
  • Per-app — FastAPI(dependencies=[...]) for global guards
  • As policy — raise HTTPException in the dep; never reach the handler

Don't fight the cache

Dependencies are cached per request: if get_db is in three deps in the same request, you get one session. To opt out: Depends(get_db, use_cache=False).
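The caching semantics can be sketched without FastAPI — a toy per-request resolver that memoises dependency results, illustrating the behaviour but not FastAPI's actual implementation:

```python
# toy resolver — one scope per incoming request, results memoised by callable
def make_request_scope():
    cache: dict = {}
    def resolve(dep, use_cache=True):
        if use_cache and dep in cache:
            return cache[dep]
        value = dep()
        cache[dep] = value
        return value
    return resolve

def get_db():
    return object()               # pretend this opens a session

resolve = make_request_scope()    # fresh scope = new request
a = resolve(get_db)
b = resolve(get_db)                    # cache hit — the same session
c = resolve(get_db, use_cache=False)   # opt-out — a fresh one
assert a is b and a is not c
```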

09

Async vs Sync — When Each Runs

FastAPI runs async path operations on the event loop. Sync ones run in a thread pool (anyio). Mixing is fine; getting the boundary right is the difference between fast and stalled.

The decision tree

  • All your I/O is async-aware (httpx, asyncpg, aiokafka) → declare async def and await normally
  • Some I/O is sync-only (psycopg2, requests, openai with the sync client) → declare the handler def; FastAPI runs it in the threadpool
  • Never mix blocking I/O inside an async def — one stalled handler stalls every other request on the loop

Async handler — fully async stack

import httpx

@app.get('/weather/{city}')
async def weather(city: str):
    async with httpx.AsyncClient(timeout=5) as c:
        r = await c.get(f'https://api.weather/{city}')
        r.raise_for_status()
        return r.json()

Sync handler — blocking work

import requests

@app.get('/legacy/{q}')
def legacy(q: str):              # def, not async def
    # runs in threadpool — main loop unblocked
    r = requests.get(f'https://legacy/{q}', timeout=5)
    return r.json()

Run sync work from async

from fastapi.concurrency import run_in_threadpool
from anyio import to_thread

@app.post('/render')
async def render(req: RenderReq):
    # CPU-bound or sync-only call from an async handler
    img = await run_in_threadpool(make_pdf, req)
    # equivalent:
    img2 = await to_thread.run_sync(make_pdf, req)
    return {'size': len(img)}

The classic foot-gun

# BAD — time.sleep blocks the event loop;
# every request to ANY endpoint stalls for 2s
@app.get('/slow')
async def slow():
    time.sleep(2)         # blocking!
    return {'ok': True}

# GOOD
import asyncio
@app.get('/slow')
async def slow():
    await asyncio.sleep(2)
    return {'ok': True}

CPU-bound? Off the loop entirely

  • Threads for short I/O wrappers (the threadpool default is 40 workers)
  • Process pool (concurrent.futures) for genuinely CPU-bound work — one process per core
  • Background queue (Celery, RQ, dramatiq, arq) when the work is > ~1 second or needs retries
  • GPU inference — async handler that awaits the model client; never block the loop on .generate()
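A minimal sketch of the await-don't-block pattern, with a hash as a stand-in for real CPU work. asyncio.to_thread keeps the loop free for short jobs; for genuinely CPU-bound work you'd swap the thread for a ProcessPoolExecutor via loop.run_in_executor, as the bullets suggest:

```python
import asyncio
import hashlib

def digest(data: bytes) -> str:
    # stand-in for genuinely CPU-bound work
    return hashlib.sha256(data).hexdigest()

async def handler(data: bytes) -> str:
    # the event loop stays free while the hash runs in a worker thread;
    # for heavy CPU work: loop.run_in_executor(ProcessPoolExecutor(), digest, data)
    return await asyncio.to_thread(digest, data)

print(asyncio.run(handler(b'hello'))[:8])  # → 2cf24dba
```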
10

Authentication

FastAPI ships security utilities — classes that double as dependencies and OpenAPI Security Schemes. The framework parses the right header / query / cookie; you decide how to verify.

OAuth2 password / JWT

from fastapi.security import OAuth2PasswordBearer
from jose import jwt, JWTError

oauth2_scheme = OAuth2PasswordBearer(tokenUrl='/auth/token')

def current_user(
    token: Annotated[str, Depends(oauth2_scheme)],
) -> User:
    try:
        payload = jwt.decode(token, settings.JWT_KEY,
                             algorithms=['HS256'],
                             audience=settings.AUD,
                             issuer=settings.ISS)
    except JWTError:
        raise HTTPException(401, 'invalid token')
    return User(**payload['user'])

@app.get('/users/me', response_model=User)
def me(user: Annotated[User, Depends(current_user)]):
    return user

Scopes

from fastapi import Security
from fastapi.security import SecurityScopes

oauth2_scheme = OAuth2PasswordBearer(
    tokenUrl='/auth/token',
    scopes={'read': 'Read', 'write': 'Modify'})

def require_scopes(
    scopes: SecurityScopes,
    token: Annotated[str, Depends(oauth2_scheme)],
):
    payload = jwt.decode(token, ...)
    have = set(payload.get('scope', '').split())
    for s in scopes.scopes:
        if s not in have: raise HTTPException(403)
    return payload

@app.delete('/items/{id}')
def remove(id: int,
           _: Annotated[dict, Security(require_scopes,
                                        scopes=['write'])]):
    ...

API keys

from fastapi.security import APIKeyHeader

api_key = APIKeyHeader(name='X-API-Key', auto_error=False)

def require_api_key(
    key: Annotated[str | None, Depends(api_key)],
    db:  Annotated[Session, Depends(get_db)],
) -> APIKey:
    if not key: raise HTTPException(401)
    row = db.query(APIKey).filter_by(hash=sha256(key.encode()).hexdigest()).first()
    if not row or row.revoked: raise HTTPException(401)
    return row

What FastAPI gives you

  • Security utilities — OAuth2*, APIKey*, HTTPBearer, HTTPBasic
  • OpenAPI security schemes auto-populated; "Authorize" button in Swagger
  • Composability — the security dep can have its own deps (DB, cache, logger)

What you bring

  • Token signing / verification — python-jose, authlib, or hit your IdP's JWKS
  • Password hashing — passlib[argon2] or argon2-cffi directly
  • Refresh-token rotation, idle timeouts, revocation lists
  • Rate limiting on the auth endpoints — FastAPI doesn't ship one
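As a reference point for what the signing libraries do under the hood, here is a stdlib-only HS256 sketch — no expiry, audience or issuer checks, so it's illustrative, not a replacement for python-jose / authlib in production:

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    # JWT uses unpadded URL-safe base64
    return base64.urlsafe_b64encode(data).rstrip(b'=').decode()

def sign_hs256(claims: dict, key: bytes) -> str:
    header = b64url(json.dumps({'alg': 'HS256', 'typ': 'JWT'}).encode())
    payload = b64url(json.dumps(claims).encode())
    msg = f'{header}.{payload}'.encode()
    sig = b64url(hmac.new(key, msg, hashlib.sha256).digest())
    return f'{header}.{payload}.{sig}'

def verify_hs256(token: str, key: bytes) -> dict:
    header, payload, sig = token.split('.')
    msg = f'{header}.{payload}'.encode()
    expected = b64url(hmac.new(key, msg, hashlib.sha256).digest())
    if not hmac.compare_digest(sig, expected):   # constant-time compare
        raise ValueError('bad signature')
    pad = '=' * (-len(payload) % 4)              # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(payload + pad))

token = sign_hs256({'sub': 'user-1'}, b'secret')
assert verify_hs256(token, b'secret') == {'sub': 'user-1'}
```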
11

Errors, Exception Handlers & Middleware

FastAPI turns raised exceptions into HTTP responses. Hook in handlers for your own classes; add middleware for cross-cutting concerns — timing, request IDs, CORS.

HTTPException & custom errors

from fastapi import HTTPException
from fastapi.responses import JSONResponse

class NotFoundError(Exception):
    def __init__(self, what: str): self.what = what

@app.exception_handler(NotFoundError)
async def not_found(_req, exc: NotFoundError):
    return JSONResponse(status_code=404, content={
      'error': {'code': 'not_found',
                'message': f'{exc.what} not found'}})

@app.get('/users/{id}')
def get_user(id: int):
    user = repo.find(id)
    if not user: raise NotFoundError('user')
    return user

Override the validation error shape

from fastapi.exceptions import RequestValidationError

@app.exception_handler(RequestValidationError)
async def validation_error(req, exc):
    return JSONResponse(status_code=422, content={
      'error': {
        'code': 'validation_failed',
        'details': exc.errors(),
        'request_id': req.state.request_id,
      }
    })

Middleware

from fastapi import Request
from uuid import uuid4
import time

@app.middleware('http')
async def request_id_and_timing(request: Request, call_next):
    rid = request.headers.get('x-request-id') or str(uuid4())
    request.state.request_id = rid

    t0 = time.perf_counter()
    try:
        response = await call_next(request)
    except Exception:
        # logged by your structured logger via exception_handler
        raise
    dur_ms = (time.perf_counter() - t0) * 1000
    response.headers['x-request-id'] = rid
    response.headers['server-timing'] = f'app;dur={dur_ms:.1f}'
    return response

CORS & trusted host

from fastapi.middleware.cors import CORSMiddleware
from fastapi.middleware.trustedhost import TrustedHostMiddleware

app.add_middleware(CORSMiddleware,
  allow_origins=['https://app.example.com'],
  allow_credentials=True,
  allow_methods=['GET','POST','PUT','PATCH','DELETE'],
  allow_headers=['authorization','content-type'])

app.add_middleware(TrustedHostMiddleware,
  allowed_hosts=['api.example.com', '*.example.com'])

Order matters

Middleware added later runs outermost. Put request-ID and timing last in code so they wrap everything — including CORS rejections.
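The LIFO wrapping can be shown with plain functions — a toy version of what add_middleware does, not Starlette's real code:

```python
# each "middleware" wraps the current app; the last one added ends up outermost
def make_mw(name):
    def middleware(inner):
        def handler(request):
            return [f'{name}:in'] + inner(request) + [f'{name}:out']
        return handler
    return middleware

def endpoint(request):
    return ['endpoint']

stack = endpoint
for mw in [make_mw('cors'), make_mw('timing')]:  # timing added last
    stack = mw(stack)                            # wraps everything so far

print(stack(None))
# → ['timing:in', 'cors:in', 'endpoint', 'cors:out', 'timing:out']
```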

12

Background, Streaming & WebSockets

Three patterns for non-classical request/response: BackgroundTasks for fire-and-forget after the response, StreamingResponse for chunked output, and WebSockets for full-duplex.

Background tasks

from fastapi import BackgroundTasks

@app.post('/users')
def create_user(
    body: UserCreate,
    bg: BackgroundTasks,
    mailer: Annotated[Mailer, Depends(get_mailer)],
):
    user = users.create(body)
    bg.add_task(mailer.send_welcome, user.email)
    return user
# → response sent immediately; mailer runs after.
# Good for: emails, webhooks, audit logs, <1s work.
# For real retry / scheduling: use Celery / arq / dramatiq.

Streaming — SSE / NDJSON

from fastapi.responses import StreamingResponse
import json

async def gen_events():
    for i in range(10):
        yield f'event: tick\ndata: {json.dumps({"i": i})}\n\n'
        await asyncio.sleep(1)

@app.get('/stream')
async def stream():
    return StreamingResponse(gen_events(),
                             media_type='text/event-stream',
                             headers={'Cache-Control': 'no-cache',
                                      'X-Accel-Buffering': 'no'})

# NDJSON
async def gen_rows():
    async for r in db.stream(query):
        yield json.dumps(r).encode() + b'\n'

@app.get('/users.ndjson')
async def ndjson():
    return StreamingResponse(gen_rows(),
                             media_type='application/x-ndjson')

WebSockets

from fastapi import WebSocket, WebSocketDisconnect

@app.websocket('/ws')
async def ws(socket: WebSocket):
    await socket.accept()
    try:
        async for msg in socket.iter_text():
            await socket.send_text(f'echo: {msg}')
    except WebSocketDisconnect:
        ...
    finally:
        # cleanup — remove from any pubsub fanout, etc.
        ...

LLM-streaming pattern

from fastapi.responses import StreamingResponse

@app.post('/chat')
async def chat(req: ChatReq):
    async def tokens():
        async for delta in llm.stream(req.messages):
            yield f'data: {json.dumps({"delta": delta})}\n\n'
        yield 'data: [DONE]\n\n'
    return StreamingResponse(tokens(),
        media_type='text/event-stream',
        headers={'Cache-Control': 'no-cache',
                 'X-Accel-Buffering': 'no'})

Pitfalls

  • BackgroundTasks blocks worker shutdown — long jobs go in a queue, not here
  • SSE behind nginx without X-Accel-Buffering: no → silent client
  • WS handlers don't auto-reconnect — client must handle it
  • StreamingResponse + a response_model doesn't work — you're past Pydantic
13

Database — SQLAlchemy 2.0 & Friends

FastAPI doesn't bundle an ORM. Three real choices: SQLAlchemy 2.0 async (the default), SQLModel (Pydantic + SQLAlchemy by tiangolo), Tortoise ORM (Django-ish async).

SQLAlchemy 2.0 async

# db.py
from collections.abc import AsyncIterator
from sqlalchemy.ext.asyncio import (
    create_async_engine, async_sessionmaker, AsyncSession)
from sqlalchemy.orm import DeclarativeBase

engine = create_async_engine(settings.DATABASE_URL,
                             pool_size=10, max_overflow=20)
SessionLocal = async_sessionmaker(engine, expire_on_commit=False)

class Base(DeclarativeBase): ...

# deps.py
async def get_db() -> AsyncIterator[AsyncSession]:
    async with SessionLocal() as s:
        yield s

# usage
@app.get('/users/{id}', response_model=UserRead)
async def get_user(id: int,
                   db: Annotated[AsyncSession, Depends(get_db)]):
    user = await db.get(User, id)
    if not user: raise HTTPException(404)
    return user

Migrations — Alembic

uv add alembic
uv run alembic init alembic
uv run alembic revision --autogenerate -m 'init'
uv run alembic upgrade head

SQLModel — one class, two roles

from sqlmodel import SQLModel, Field

class User(SQLModel, table=True):
    id:    int | None = Field(default=None, primary_key=True)
    email: str        = Field(index=True, unique=True)
    name:  str
# → this is BOTH the SQLAlchemy table and the Pydantic model
#    used in request / response bodies.

# pros: less duplication for simple CRUD
# cons: ORM and DTO concerns become tangled in larger apps

Patterns that pay off

  • DTOs separate from ORM — UserCreate / UserRead / UserDB; never expose the table directly past the boundary
  • Repository functions over the session — users.by_email(db, email) is testable; chained ORM calls in handlers aren't
  • Transactions — async with db.begin(): at the boundary, not deep in services
  • Pool sizing — pool_size per worker × workers < DB max_connections
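The repository-function idea in miniature — a dict stands in for the AsyncSession, and by_email is an illustrative name, not a library API. The point is that a function taking its "session" as an argument is trivially testable:

```python
from dataclasses import dataclass

@dataclass
class User:
    id: int
    email: str

# illustrative repository function — `db` is a dict here, an AsyncSession in real code
def by_email(db, email: str):
    return db.get(email)

# a handler would call by_email(db, ...) with the injected session;
# a test just passes a fake in — no app, no engine, no mocks
fake_db = {'a@x.io': User(1, 'a@x.io')}
assert by_email(fake_db, 'a@x.io').id == 1
assert by_email(fake_db, 'missing@x.io') is None
```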

Don't

  • Don't share a session across requests — one per request, scoped via Depends
  • Don't call sync ORM (psycopg2 / SQLAlchemy 1.4 sync) from async def handlers
  • Don't ship without a connection-pool monitor — pool exhaustion looks like "FastAPI is slow"
14

Configuration — Pydantic Settings

Twelve-factor config, but typed. pydantic-settings reads env vars / .env / secrets dirs, validates types and defaults at boot, and surfaces missing values as startup errors, not runtime ones.

Settings model

from pydantic import BaseModel, SecretStr, AnyHttpUrl, PostgresDsn
from pydantic_settings import BaseSettings, SettingsConfigDict
from typing import Literal

class SMTPSettings(BaseModel):
    host: str | None = None
    port: int = 587
    user: str | None = None
    password: SecretStr | None = None

class Settings(BaseSettings):
    model_config = SettingsConfigDict(
        env_file='.env', env_prefix='APP_',
        env_nested_delimiter='__', case_sensitive=False)

    env:        Literal['dev', 'staging', 'prod'] = 'dev'
    debug:      bool = False
    database_url: PostgresDsn
    redis_url:    str
    jwt_key:      SecretStr
    cors_origins: list[AnyHttpUrl] = []
    log_level:    Literal['DEBUG', 'INFO', 'WARNING', 'ERROR'] = 'INFO'

    # nested group — env_nested_delimiter maps APP_SMTP__HOST to smtp.host
    smtp: SMTPSettings = SMTPSettings()

settings = Settings()

Use from a dependency

from functools import lru_cache

@lru_cache
def get_settings() -> Settings:
    return Settings()

@app.get('/config')
def cfg(s: Annotated[Settings, Depends(get_settings)]):
    return {'env': s.env, 'log_level': s.log_level}

Per-environment

# .env.dev / .env.staging / .env.prod
ENV_FILE=.env.prod uv run uvicorn app.main:app

# in a Settings model:
import os

class Settings(BaseSettings):
    model_config = SettingsConfigDict(
        env_file=os.getenv('ENV_FILE', '.env'))

Why this is load-bearing

  • Type errors at boot — the pod that can't start is the one you want, not the one that crashes mid-request
  • SecretStr hides secrets in repr / logs / model_dump
  • .env for local dev, real env vars for k8s — same code path
  • Tests override with app.dependency_overrides[get_settings] = lambda: TestSettings()

Don't

  • Don't call Settings() at module import time — it makes tests harder; cache it behind get_settings()
  • Don't log secrets — even with SecretStr, custom __str__ in error paths can leak
  • Don't conflate config and feature flags — flags belong in a flag service / DB, not Settings
15

Testing

FastAPI ships a TestClient built on httpx. The interesting feature is app.dependency_overrides — swap any Depends dependency in tests for a stub, with no patching.

Sync tests with TestClient

from fastapi.testclient import TestClient
from app.main import app

client = TestClient(app)

def test_health():
    r = client.get('/healthz')
    assert r.status_code == 200
    assert r.json() == {'ok': True}

def test_create_user():
    r = client.post('/users', json={'email': 'a@x.io', 'name': 'A'})
    assert r.status_code == 201
    assert r.json()['email'] == 'a@x.io'

Async tests with httpx

import pytest
from httpx import AsyncClient, ASGITransport
from app.main import app

@pytest.mark.asyncio
async def test_async():
    transport = ASGITransport(app=app)
    async with AsyncClient(transport=transport,
                           base_url='http://test') as c:
        r = await c.get('/healthz')
        assert r.status_code == 200

Dependency overrides — the killer feature

def fake_db():
    return InMemoryDB()

def fake_user():
    return User(id=1, email='t@t', role='admin')

app.dependency_overrides[get_db]      = fake_db
app.dependency_overrides[current_user] = fake_user

# now every endpoint sees the fakes — no monkeypatch, no patch.object
client = TestClient(app)
r = client.get('/users/me')
assert r.status_code == 200

# clean up between tests
app.dependency_overrides.clear()

Pytest fixtures pattern

@pytest.fixture
def client(app_with_overrides):
    return TestClient(app_with_overrides)

@pytest.fixture
def app_with_overrides():
    app.dependency_overrides[get_db] = fake_db
    yield app
    app.dependency_overrides.clear()

What to fake, what to keep real

  • Database: real Postgres (Testcontainers) for integration; SQLite for fast unit; never mock
  • Auth: override current_user; don't mint real tokens in unit tests
  • External HTTP: respx (httpx) or pytest-httpserver
  • Queues / mailers: in-memory fake; assert calls in the test
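The in-memory-fake pattern for mailers and queues, sketched without any framework — FakeMailer and create_user are illustrative names, not library APIs:

```python
# a fake that records calls instead of sending mail
class FakeMailer:
    def __init__(self):
        self.sent: list[str] = []
    def send_welcome(self, email: str) -> None:
        self.sent.append(email)

def create_user(email: str, mailer) -> dict:
    user = {'email': email}       # stand-in for the real persistence call
    mailer.send_welcome(email)    # side effect we want to observe
    return user

mailer = FakeMailer()
create_user('a@x.io', mailer)
assert mailer.sent == ['a@x.io']  # assert on the call, not a real mailbox
```

In a FastAPI test, the fake would go in via app.dependency_overrides on whatever dependency provides the mailer.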
16

OpenAPI & Documentation

OpenAPI generation is the point of FastAPI — not a side feature. The schema you serve at /openapi.json is the same schema Swagger UI / ReDoc render, the same one you ship to clients, the same one CI validates against.

What's free

  • /openapi.json — OpenAPI 3.1, generated lazily on first request, then cached
  • /docs — Swagger UI
  • /redoc — ReDoc
  • Path / query / body / header param entries
  • Request bodies and response schemas
  • Security schemes from the security utilities
  • Example payloads from Field(examples=[...])

Tags, summaries, deprecation

@app.post('/items',
          tags=['items'],
          summary='Create an item',
          description='Accepts a JSON body...',
          response_description='The created item',
          status_code=201,
          deprecated=False)
def create(item: ItemCreate): ...

# group tags
app = FastAPI(openapi_tags=[
    {'name': 'items', 'description': 'Catalogue items'},
    {'name': 'users', 'description': 'Account ops'},
])

Customise the schema

from fastapi.openapi.utils import get_openapi

def custom_openapi():
    if app.openapi_schema:
        return app.openapi_schema
    schema = get_openapi(title='My API', version='1.2.3',
                         routes=app.routes)
    schema['info']['x-logo'] = {'url': 'https://example/logo.png'}
    schema['servers'] = [{'url': 'https://api.example.com'}]
    app.openapi_schema = schema
    return schema

app.openapi = custom_openapi

Export & client generation

# export the schema at build time
uv run python -c \
  'from app.main import app; import json; \
   print(json.dumps(app.openapi()))' \
  > openapi.json

# typescript types for the frontend
npx openapi-typescript openapi.json -o api.ts

# pydantic models from the schema
uv run datamodel-code-generator \
  --input openapi.json --output client.py

Hide what shouldn't be in the public schema

  • Internal / admin routes — include_in_schema=False on the operation or router
  • response_model_exclude_none=True for cleaner output schemas
  • Mount Swagger UI behind admin auth in prod, or disable with FastAPI(docs_url=None)
17

Lifespan & Startup / Shutdown

Where to open a DB pool, warm a model, connect to Redis, register Prometheus collectors. The modern pattern is asynccontextmanager + FastAPI(lifespan=...).

Lifespan context manager

from contextlib import asynccontextmanager

@asynccontextmanager
async def lifespan(app: FastAPI):
    # —— startup ——
    app.state.db    = create_engine_pool()
    app.state.redis = await aioredis.from_url(settings.REDIS_URL)
    app.state.model = load_model('llama-3-8b')
    log.info('startup complete')

    yield                            # app runs

    # —— shutdown ——
    await app.state.redis.close()
    await app.state.db.dispose()
    log.info('shutdown complete')

app = FastAPI(lifespan=lifespan)

Sharing state with handlers

def get_redis(req: Request):
    return req.app.state.redis

@app.get('/cache/{k}')
async def cache(k: str,
                r: Annotated[Redis, Depends(get_redis)]):
    return {'value': await r.get(k)}

Health vs ready

@app.get('/healthz')         # liveness: process alive
def healthz(): return {'ok': True}

@app.get('/readyz')          # readiness: deps OK + not draining
async def readyz(req: Request):
    if req.app.state.draining:
        return JSONResponse({'ok': False}, status_code=503)
    try:
        async with req.app.state.db.connect() as c:
            await c.execute(text('select 1'))
        await req.app.state.redis.ping()
    except Exception:
        return JSONResponse({'ok': False}, status_code=503)
    return {'ok': True}

Graceful shutdown

  • Uvicorn handles SIGTERM — calls the lifespan teardown after in-flight requests drain
  • Set app.state.draining = True in a SIGTERM handler (or in lifespan exit) so /readyz goes 503 first — LB stops sending traffic
  • K8s: terminationGracePeriodSeconds > longest_request; preStop hook of sleep 5 avoids the LB race

Avoid the deprecated form

The old @app.on_event('startup') / 'shutdown' decorators still work but are deprecated. Use lifespan; it integrates with Starlette and is what the docs recommend.

18

Performance

FastAPI is rarely the bottleneck. Wins live in workers vs concurrency, avoiding loop blocking, payload size & serialisation, and downstream parallelism.

Workers & processes

# single process, multi-coroutine — great for I/O
uvicorn app.main:app --host 0.0.0.0 --port 8000

# multi-process — one event loop per worker
uvicorn app.main:app --workers 4

# production: gunicorn manages uvicorn workers
gunicorn app.main:app \
  -k uvicorn.workers.UvicornWorker \
  -w 4 -b 0.0.0.0:8000 \
  --timeout 60 --graceful-timeout 30 \
  --max-requests 10000 --max-requests-jitter 1000

Rule of thumb: workers ≈ 2 × CPU cores. In K8s prefer one worker per pod — the orchestrator scales pods.

Newer ASGI servers

  • Granian — Rust-backed; faster than uvicorn on raw I/O
  • Hypercorn — HTTP/2, HTTP/3, trio support
  • For LLM streaming, uvicorn + --http httptools is usually plenty

Avoid loop blocking

  • No time.sleep, requests, blocking open() in async def
  • Use asyncio.sleep, httpx.AsyncClient, aiofiles
  • If you must call sync code — await run_in_threadpool(fn, ...)
  • Tune the anyio threadpool: anyio.to_thread.current_default_thread_limiter().total_tokens = N (default 40)

Serialisation

  • Pydantic v2 is fast — usually the fastest part of your handler
  • Switch to orjson via ORJSONResponse for > 2× on big lists
  • response_model_exclude_unset=True — trims default fields from the wire
  • Stream large lists as NDJSON — constant memory

from fastapi.responses import ORJSONResponse
app = FastAPI(default_response_class=ORJSONResponse)

If you're still slow

  • Profile with py-spy record against the prod-like image — intuition is wrong half the time
  • Watch the connection pool — pool exhaustion looks identical to "FastAPI is slow"
  • Measure p50 / p95 / p99 separately — a slow tail is usually a slow downstream
19

Deployment

The shape that ships: multi-stage Dockerfile, non-root user, distroless or slim base, healthchecks wired to /healthz & /readyz.

Dockerfile (uv-based)

ARG PY=3.12
FROM python:${PY}-slim-bookworm AS base
ENV PYTHONDONTWRITEBYTECODE=1 PYTHONUNBUFFERED=1

# —— deps stage ——
FROM base AS deps
RUN pip install --no-cache-dir uv
WORKDIR /app
COPY pyproject.toml uv.lock ./
RUN uv sync --frozen --no-dev

# —— runtime stage ——
FROM base AS runtime
RUN useradd -r -u 10001 app
WORKDIR /app
COPY --from=deps /app/.venv /app/.venv
COPY app/ ./app/
USER app
ENV PATH=/app/.venv/bin:$PATH

EXPOSE 8000
HEALTHCHECK --interval=30s --timeout=3s \
  CMD python -c "import urllib.request,sys; \
                 sys.exit(0 if urllib.request.urlopen('http://localhost:8000/healthz').status==200 else 1)"

CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]

Kubernetes basics

spec:
  template:
    spec:
      terminationGracePeriodSeconds: 60
      containers:
      - name: api
        image: ghcr.io/me/api:1.2.3
        ports: [{containerPort: 8000}]
        livenessProbe:
          httpGet: {path: /healthz, port: 8000}
          periodSeconds: 10
        readinessProbe:
          httpGet: {path: /readyz, port: 8000}
          periodSeconds: 5
          failureThreshold: 3
        resources:
          requests: {cpu: "200m", memory: "256Mi"}
          limits:   {cpu: "1",    memory: "512Mi"}
        lifecycle:
          preStop:
            exec: {command: ["sleep", "5"]}

Where to host

Platform                     Notes
K8s (EKS / GKE / AKS)        One worker per pod; HPA
Cloud Run / Lambda + Mangum  Stateless; cold starts; no WebSockets on Lambda
Render / Fly / Railway       Easy buttons; great for staging
VPS + nginx + systemd        Cheap, predictable; gunicorn + uvicorn

Don't ship

  • As root
  • With --reload
  • With /docs public if the API is non-public
  • Without HTTPS at the proxy layer
20

Observability

Three pillars, same as everywhere: structured logs, metrics, distributed traces. The minimum competent setup is structlog + prometheus-fastapi-instrumentator + OpenTelemetry.

Structured logs (structlog)

import time
from uuid import uuid4

import structlog

structlog.configure(
  processors=[
    structlog.contextvars.merge_contextvars,
    structlog.processors.add_log_level,
    structlog.processors.TimeStamper(fmt='iso'),
    structlog.processors.JSONRenderer(),
  ],
  wrapper_class=structlog.stdlib.BoundLogger,
)
log = structlog.get_logger()

@app.middleware('http')
async def log_request(req, call_next):
    rid = req.headers.get('x-request-id') or str(uuid4())
    structlog.contextvars.bind_contextvars(request_id=rid)
    t0 = time.perf_counter()
    resp = await call_next(req)
    log.info('http_request',
             method=req.method, path=req.url.path,
             status=resp.status_code,
             dur_ms=round((time.perf_counter()-t0)*1000, 1))
    return resp

Metrics — Prometheus

from prometheus_fastapi_instrumentator import Instrumentator
Instrumentator().instrument(app).expose(app, endpoint='/metrics')
# → per-route RED metrics + python_gc + process_cpu

Tracing — OpenTelemetry

from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter \
  import OTLPSpanExporter
from opentelemetry.instrumentation.fastapi \
  import FastAPIInstrumentor
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor

provider = TracerProvider(resource=Resource.create(
    {'service.name': 'api', 'service.version': '1.2.3'}))
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter()))
trace.set_tracer_provider(provider)

FastAPIInstrumentor.instrument_app(app)

Hygiene

  • Redact authorization, cookie, x-api-key in logs
  • Never log request / response bodies in prod
  • Bound metric cardinality — never tag by user ID
  • Sample traces 1–5% with tail-sampling on errors
  • Bind request_id via contextvars — appears on every line
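A sketch of the redaction bullet, as a plain helper you could call from the logging middleware before emitting headers (the function name is illustrative):

```python
SENSITIVE = {'authorization', 'cookie', 'x-api-key'}

def redact_headers(headers) -> dict:
    # replace secret-bearing values, keep the keys so logs stay greppable
    return {k: ('[REDACTED]' if k.lower() in SENSITIVE else v)
            for k, v in dict(headers).items()}

print(redact_headers({'Authorization': 'Bearer abc',
                      'accept': 'application/json'}))
# {'Authorization': '[REDACTED]', 'accept': 'application/json'}
```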

What to look at first

  1. p95 / p99 per route
  2. Error rate per route
  3. DB pool wait time
  4. Downstream call duration histograms
21

Security Hardening

FastAPI is a thin layer; security is mostly composing the right middleware, validating at the boundary, and not trusting input. Six controls cover the OWASP Top 10 for an API.

Required middleware

from fastapi.middleware.cors import CORSMiddleware
from fastapi.middleware.trustedhost import TrustedHostMiddleware
from fastapi.middleware.httpsredirect import HTTPSRedirectMiddleware
from starlette.middleware.gzip import GZipMiddleware

app.add_middleware(HTTPSRedirectMiddleware)
app.add_middleware(TrustedHostMiddleware,
  allowed_hosts=['api.example.com','*.example.com'])
app.add_middleware(CORSMiddleware,
  allow_origins=settings.cors_origins,
  allow_credentials=True,
  allow_methods=['GET','POST','PUT','PATCH','DELETE'],
  allow_headers=['authorization','content-type'])
app.add_middleware(GZipMiddleware, minimum_size=1024)

Security headers

@app.middleware('http')
async def security_headers(req, call_next):
    resp = await call_next(req)
    resp.headers['Strict-Transport-Security'] = \
      'max-age=31536000; includeSubDomains; preload'
    resp.headers['X-Content-Type-Options'] = 'nosniff'
    resp.headers['Referrer-Policy'] = 'no-referrer'
    resp.headers['X-Frame-Options'] = 'DENY'
    resp.headers['Content-Security-Policy'] = \
      "default-src 'none'; frame-ancestors 'none'"
    return resp

Body size & rate limiting

from starlette.middleware.base import BaseHTTPMiddleware

class BodySizeLimit(BaseHTTPMiddleware):
    def __init__(self, app, max_bytes: int):
        super().__init__(app); self.max = max_bytes
    async def dispatch(self, req, call_next):
        cl = int(req.headers.get('content-length', 0))
        if cl > self.max:
            return JSONResponse({'error': 'too large'}, status_code=413)
        return await call_next(req)

app.add_middleware(BodySizeLimit, max_bytes=128 * 1024)

# rate limiting: slowapi (limits + redis)
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded

limiter = Limiter(key_func=lambda r: r.client.host,
                  storage_uri=settings.redis_url)
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@app.post('/auth/login')        # route decorator on top,
@limiter.limit('5/minute')      # limiter below, so the limit wraps the handler
async def login(request: Request, body: Login): ...
# slowapi requires an argument named `request`

Input & auth hygiene

  • Validate every field with Pydantic — EmailStr, HttpUrl, length / range
  • Parameterise SQL — never f-string a user value
  • Hash passwords with argon2 — passlib[argon2]
  • Verify JWTs strictly — iss / aud / exp / nbf
  • Constant-time compare for tokens / API keys — secrets.compare_digest
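The last bullet in stdlib form: compare_digest takes the same time whether the first or the last byte differs, so response timing leaks nothing about the key.

```python
import secrets

def api_key_valid(presented: str, expected: str) -> bool:
    # constant-time comparison; a plain == short-circuits on first mismatch
    return secrets.compare_digest(presented.encode(), expected.encode())

print(api_key_valid('sk-live-123', 'sk-live-123'))   # True
print(api_key_valid('sk-live-124', 'sk-live-123'))   # False
```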

Don't

  • Echo unsanitised user input in errors / headers / logs
  • Trust X-Forwarded-For without configuring trust at the proxy
  • Disable CORS "temporarily" with '*' — nothing is more permanent than a temporary fix
22

FastAPI vs Flask, Django REST, Litestar, Starlette

The Python web stack splits along two axes: async / sync and batteries / minimalism. Pick by which axis matters more.

Criterion    FastAPI                Litestar                     Starlette            Flask                  Django REST
Style        Type hints + DI        Type hints + DI              ASGI toolkit         Decorators + context   Class-based views
Validation   Pydantic v2            msgspec / Pydantic / attrs   None (BYO)           None (BYO)             DRF serializers
Speed        Fast                   Faster (msgspec)             Fast (raw)           Sync, slower           Sync, slowest
Async        Native                 Native                       Native               Limited                Limited (3.x)
OpenAPI      Built-in               Built-in                     BYO                  BYO (Spectree, etc.)   drf-spectacular
Batteries    Few                    Few                          None                 Few                    Many (admin/ORM)
Maturity     Mature (2018)          Active (2023+)               Mature               Mature                 Very mature
Best for     JSON APIs, ML servers  Throughput, stricter typing  Building frameworks  Tiny apps, prototypes  Server-rendered + admin

Choose FastAPI when

  • JSON API or ML / LLM model server
  • You want OpenAPI for free
  • You like Pydantic v2's validation feel
  • Most Python web jobs in 2026

Choose Litestar when

  • You want msgspec speed and stricter typing discipline
  • You prefer a richer DI / plugin model
  • You're starting a fresh greenfield project

Choose Django / Flask when

  • Django + DRF: server-rendered HTML + admin + ORM in one
  • Flask: tiny scripts, glue services, minimum surface area
23

Production Recipes

The recurring patterns: repository functions over the session, DTOs separate from ORM, an outbox for "DB write + queue", dependency overrides for tests and feature flags.

Repository + service shape

# schemas/users.py — DTOs
class UserCreate(BaseModel):
    email: EmailStr; name: str
class UserRead(BaseModel):
    id: int; email: EmailStr; name: str

# repositories/users.py
async def create(db, data: UserCreate) -> UserRead:
    row = User(**data.model_dump())
    db.add(row); await db.commit(); await db.refresh(row)
    return UserRead.model_validate(row, from_attributes=True)

# services/users.py
async def signup(db, queue, data: UserCreate) -> UserRead:
    if await users_repo.by_email(db, data.email):
        raise HTTPException(409)
    user = await users_repo.create(db, data)
    await queue.enqueue('emails.welcome', user.id)
    return user

# api/v1/users.py
@router.post('/', response_model=UserRead, status_code=201)
async def signup_endpoint(
    body: UserCreate,
    db: Annotated[AsyncSession, Depends(get_db)],
    q:  Annotated[Queue,        Depends(get_queue)],
):
    return await users_service.signup(db, q, body)

Outbox pattern

async def signup(db, data):
    async with db.begin():
        user = await users_repo.create(db, data)
        await db.execute(insert(Outbox).values(
            kind='emails.welcome',
            payload={'user_id': user.id}))
    # a relay process polls outbox and pushes to the queue
    return user

Versioned routers

from fastapi import APIRouter

v1 = APIRouter(prefix='/v1')
v1.include_router(users_router, prefix='/users', tags=['v1.users'])
v1.include_router(items_router, prefix='/items', tags=['v1.items'])

v2 = APIRouter(prefix='/v2')
v2.include_router(users_router_v2, prefix='/users', tags=['v2.users'])

app.include_router(v1)
app.include_router(v2)

Feature flags via dependency overrides

def feature_x_enabled() -> bool:
    return get_settings().flags.feature_x

@app.get('/x')
def x(enabled: Annotated[bool, Depends(feature_x_enabled)]):
    if not enabled: raise HTTPException(404)
    ...
# tests can flip the flag with dependency_overrides[feature_x_enabled]
24

Summary & Next Steps

Why FastAPI earns its keep

  • Type hints become the API — one declaration drives validation, serialisation, DI and OpenAPI
  • Async-first with a graceful sync escape hatch via the threadpool
  • Pydantic v2 is fast — the Rust core leaves serialisation off the hot list
  • DI via Depends() is small enough to hold in your head, big enough to compose policies
  • OpenAPI for free — clients, contract tests, mock servers all fall out

Take-aways

  • Use Annotated[T, Query/Path/Body(...)] for type-checker-friendly signatures
  • One Depends() chain per concern: DB, current user, scope guard
  • Lifespan over deprecated on_event
  • DTOs separate from ORM — UserCreate / UserRead / UserDB
  • app.dependency_overrides in tests, not monkeypatch
  • Pydantic Settings for config; fail at boot, not at request time

Production checklist

  • Lifespan opens/closes pools; /healthz + /readyz wired
  • HTTPS + TrustedHost + CORS allow-list + CSP
  • Rate limit on auth endpoints; body-size cap globally
  • Structured logs (structlog), Prometheus, OpenTelemetry
  • Non-root container, distroless / slim base, K8s probes & preStop
  • Tests use TestClient + dependency_overrides

Next steps

  • Build a small CRUD service: FastAPI + SQLAlchemy 2.0 async + Alembic + uvicorn + Docker
  • Wire structlog + Prometheus + OTel; ship dashboards
  • Generate a TypeScript client from openapi.json
  • Add a Testcontainers Postgres test suite
  • Try Litestar on a side project — see what stricter typing buys you

Essential reading