COMPUTER SCIENCE FUNDAMENTALS SERIES

Balanced
Trees

AVL trees · Red-Black trees · Rotations ·
2-3 trees · Balance factor · Persistent trees
Mid-level software engineer track  ·  20 slides
02

Why Balancing Matters

The promise of binary search trees

A BST offers O(log n) search, insert, and delete — but only when the tree is balanced. The height determines the worst-case cost of every operation.

  • Balanced tree — height O(log n), operations are fast
  • Degenerate tree — height O(n), operations degrade to a linked list
  • Balancing guarantees a logarithmic height bound regardless of insertion order
  • Without balancing, sorted input produces the worst case every time

The cost of imbalance

Scenario                       Height       Search cost
Perfectly balanced (n = 1M)    ~20          20 comparisons
Random insertion (n = 1M)      ~40          40 comparisons
Sorted insertion (n = 1M)      1,000,000    1M comparisons

Key insight: self-balancing trees pay a small rebalancing cost per mutation to guarantee O(log n) height at all times. The amortised overhead is negligible compared to the worst-case degenerate scenario.

03

BST Recap & the Degenerate Case

Binary Search Tree invariant

For every node x: all keys in the left subtree are less than x.key, and all keys in the right subtree are greater.

Balanced BST:          Degenerate BST:

       50                  10
      /  \                   \
    30    70                  20
   / \   / \                   \
  20 40 60 80                   30
                                 \
                                  40
                                   \
                                    50

Why sorted input kills a plain BST

Inserting [10, 20, 30, 40, 50] into a naive BST produces a right-skewed chain. Every operation becomes O(n).

  • Search — must traverse the entire chain
  • Insert — always appends at the deepest point
  • Delete — removing a node leaves the chain just as skewed; nothing reduces the height

Self-balancing trees solve this by restructuring after each mutation. The mechanisms differ — rotations, colour flips, node splits — but the goal is always the same: keep h = O(log n).
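The degeneration is easy to demonstrate. A minimal Python sketch (the `Node`, `insert`, and `height` names are illustrative, not from any library):

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    # Plain BST insert: no rebalancing of any kind.
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root

def height(root):
    # Height in nodes; an empty tree has height 0.
    if root is None:
        return 0
    return 1 + max(height(root.left), height(root.right))

# Sorted input produces a right-skewed chain: one level per key.
chain = None
for k in [10, 20, 30, 40, 50]:
    chain = insert(chain, k)
print(height(chain))   # 5
```

Feeding the same keys in a mixed order (e.g. 30, 20, 40, 10, 50) yields height 3 — the difference a self-balancing tree guarantees regardless of input order.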

04

AVL Trees — Balance Factor

Definition

An AVL tree (Adelson-Velsky & Landis, 1962) is a BST where, for every node, the heights of the left and right subtrees differ by at most 1.

Balance factor

BF(node) = height(left subtree) - height(right subtree)
Balance factor      Meaning
BF = -1             Right-heavy by one level — acceptable
BF = 0              Perfectly balanced at this node
BF = +1             Left-heavy by one level — acceptable
BF = -2 or +2       Violation — requires rotation

Height guarantee

An AVL tree with n nodes has height at most ≈1.44 × log2(n) — tighter than the Red-Black bound of 2 × log2(n), which makes AVL trees faster for lookup-heavy workloads.

Strict balance

AVL trees are strictly balanced — they tolerate less imbalance than Red-Black trees. This means more rotations on insert/delete, but faster searches.
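The balance factor falls directly out of the definition. A small Python sketch (heights recomputed naively in O(n) here; a real AVL node caches its height):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def height(node):
    # Height in nodes; empty subtree has height 0.
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))

def balance_factor(node):
    # BF = height(left) - height(right); AVL requires |BF| <= 1 at every node.
    return height(node.left) - height(node.right)

# A left-heavy node one insert away from violating the AVL invariant:
root = Node(50, left=Node(30, left=Node(20)), right=Node(70))
print(balance_factor(root))        # 1 — left-heavy but acceptable
root.left.left.left = Node(10)     # deepen the left side
print(balance_factor(root))        # 2 — violation, rotation needed
```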

05

AVL Rotations — LL & RR

When the imbalance is on the outer edge (left-left or right-right), a single rotation restores balance.

LL Rotation (Right Rotation)

  BF=+2                  BF=0
    z                      y
   / \     rotate         / \
  y   T4  right(z)       x   z
 / \     -------->      / \ / \
x   T3                T1 T2 T3 T4
/ \
T1 T2

Node y becomes the new root. z becomes y's right child. T3 moves from y's right to z's left.

RR Rotation (Left Rotation)

BF=-2                   BF=0
  z                       y
 / \     rotate          / \
T1  y   left(z)         z   x
   / \  -------->      / \ / \
  T2  x              T1 T2 T3 T4
     / \
    T3 T4

Mirror of LL. Node y becomes root. z becomes y's left child.

Single rotations run in O(1) — they reassign at most three pointers and update two heights.
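The two single rotations can be sketched in a few lines of Python (pointer reassignment only; height bookkeeping omitted for clarity):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def rotate_right(z):
    # LL fix: z's left child y becomes the new subtree root;
    # y's right subtree (T3) is re-attached as z's left child.
    y = z.left
    z.left = y.right
    y.right = z
    return y

def rotate_left(z):
    # RR fix: mirror image of rotate_right.
    y = z.right
    z.right = y.left
    y.left = z
    return y

# The LL case from the diagram with z=30, y=20, x=10:
z = Node(30, left=Node(20, left=Node(10)))
root = rotate_right(z)
print(root.key, root.left.key, root.right.key)   # 20 10 30
```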

06

AVL Rotations — LR & RL

When the imbalance is on the inner edge (left-right or right-left), a double rotation is required.

LR Rotation (Left-Right)

Step 1: Left rotate y     Step 2: Right rotate z

    z          z                  x
   / \        / \                / \
  y   T4 -> x   T4  ->        y   z
 / \        / \               / \ / \
T1  x      y   T3           T1 T2 T3 T4
   / \    / \
  T2 T3  T1 T2

First rotate left at y, then rotate right at z.

RL Rotation (Right-Left)

Step 1: Right rotate y    Step 2: Left rotate z

  z            z                  x
 / \          / \                / \
T1  y   ->  T1   x    ->      z   y
   / \          / \           / \ / \
  x   T4      T2  y         T1 T2 T3 T4
 / \              / \
T2 T3            T3 T4

Mirror of LR. First rotate right at y, then rotate left at z.

How to pick the rotation: check the balance factor of the violating node and its heavy child. Same sign → single rotation. Different signs → double rotation.
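That decision rule fits in one function. A Python sketch (case names follow the slides; a heavy-child BF of 0 arises only during deletion, where the single rotation applies):

```python
def rotation_case(bf_node, bf_child):
    # bf_node is the violating node's balance factor (+2 or -2);
    # bf_child is the balance factor of its heavy child.
    # Same sign -> single rotation; different signs -> double rotation.
    if bf_node == 2:
        return "LL (right rotation)" if bf_child >= 0 else "LR (left-right)"
    if bf_node == -2:
        return "RR (left rotation)" if bf_child <= 0 else "RL (right-left)"
    return "balanced"

print(rotation_case(2, 1))    # LL (right rotation)
print(rotation_case(2, -1))   # LR (left-right)
print(rotation_case(-2, -1))  # RR (left rotation)
print(rotation_case(-2, 1))   # RL (right-left)
```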

07

AVL Insertion & Deletion

Insertion algorithm

  1. Standard BST insert at the correct leaf position
  2. Walk back up the path to the root, updating heights
  3. At the first node with |BF| = 2, perform the appropriate rotation
  4. At most one rotation (single or double) is needed per insertion

Deletion algorithm

  1. Standard BST delete (in-order successor for 2-child case)
  2. Walk back up the path, updating heights
  3. At each node with |BF| = 2, perform the appropriate rotation
  4. Unlike insertion, deletion may require O(log n) rotations

Complexity

Operation    Average     Worst case
Search       O(log n)    O(log n)
Insert       O(log n)    O(log n) — 1 rotation
Delete       O(log n)    O(log n) — up to O(log n) rotations
Space        O(n)        O(n) — one height/BF per node

AVL trees guarantee at most one rotation on insert but may need multiple rotations on delete. This asymmetry is why Red-Black trees are sometimes preferred for write-heavy workloads.
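Putting the pieces together, the insertion algorithm above can be sketched recursively in Python (heights cached per node, rebalancing performed on the way back up the recursion):

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = None
        self.height = 1                        # cached; updated on the way up

def h(n):
    return n.height if n else 0

def update(n):
    n.height = 1 + max(h(n.left), h(n.right))

def bf(n):
    return h(n.left) - h(n.right)

def rot_right(z):
    y = z.left
    z.left, y.right = y.right, z
    update(z); update(y)
    return y

def rot_left(z):
    y = z.right
    z.right, y.left = y.left, z
    update(z); update(y)
    return y

def insert(node, key):
    # Step 1: standard BST insert.
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    elif key > node.key:
        node.right = insert(node.right, key)
    else:
        return node
    # Steps 2-3: update height, then rotate if |BF| = 2.
    update(node)
    b = bf(node)
    if b > 1:
        if bf(node.left) < 0:                  # LR: double rotation
            node.left = rot_left(node.left)
        return rot_right(node)                 # LL: single rotation
    if b < -1:
        if bf(node.right) > 0:                 # RL: double rotation
            node.right = rot_right(node.right)
        return rot_left(node)                  # RR: single rotation
    return node

root = None
for k in range(1, 8):                          # sorted input: plain-BST worst case
    root = insert(root, k)
print(root.key, root.height)                   # 4 3 — perfectly balanced
```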

08

Red-Black Trees — Properties

The five properties

1. Every node is either red or black

2. The root is always black

3. Every leaf (NIL sentinel) is black

4. If a node is red, both its children are black (no consecutive reds)

5. All paths from any node to its NIL leaves contain the same number of black nodes (black-height)

Height guarantee

A Red-Black tree with n internal nodes has height at most 2 × log2(n + 1). This is looser than AVL's 1.44 × log2(n) bound, but still O(log n).

Black-height

The black-height of a node is the number of black nodes on any path from that node (exclusive) down to a NIL leaf. Property 5 ensures this is well-defined.

Rotation guarantee

At most 2 rotations per insert, at most 3 rotations per delete. Trade slightly worse search performance for fewer structural changes.

09

Red-Black Trees — Colouring Rules

Why colours?

The colour constraints encode balance information implicitly. Property 4 (no consecutive reds) and Property 5 (equal black-height) together force the longest path to be at most twice the shortest.

Intuition via 2-3-4 mapping

Every Red-Black tree is isomorphic to a 2-3-4 tree:

2-3-4 node    Red-Black equivalent
2-node        Single black node
3-node        Black node with one red child
4-node        Black node with two red children

Colour constraints at a glance

New nodes are always inserted as red — this preserves black-height

Red violations (two consecutive reds) are fixed by recolouring or rotations

Black violations only arise during deletion and require more complex fixups

The colour scheme is not arbitrary — it is a compact encoding of the 2-3-4 tree structure into a binary tree. Understanding this mapping makes Red-Black tree operations far more intuitive.

10

Red-Black Insertion — Cases & Uncle Colour

Insertion procedure

  1. Standard BST insert; colour the new node red
  2. If the parent is black, done — no violation
  3. If the parent is red, check the uncle (parent's sibling)

Case analysis (parent is red)

Case      Uncle colour                   Fix
Case 1    Uncle is red                   Recolour parent & uncle black, grandparent red. Move violation up. Repeat.
Case 2    Uncle is black, inner child    Rotate node to outer position (converts to Case 3)
Case 3    Uncle is black, outer child    Rotate grandparent, swap colours of parent & grandparent. Done.

Rotation count

At most 2 rotations per insertion (Case 2 + Case 3)

Recolouring (Case 1) may propagate up O(log n) times but involves no rotations

After any rotation, the fix-up terminates immediately

Compare with AVL: AVL does at most 1 rotation on insert but must always walk to the root updating heights. Red-Black insertion can terminate early when no further violations exist.
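The whole insert fix-up is short enough to sketch. A CLRS-style Python version with a shared NIL sentinel — insert only, deletion omitted; the case labels match the table above:

```python
RED, BLACK = "R", "B"

class Node:
    __slots__ = ("key", "color", "left", "right", "parent")
    def __init__(self, key, color, nil=None):
        self.key, self.color = key, color
        self.left = self.right = self.parent = nil

class RBTree:
    def __init__(self):
        self.nil = Node(None, BLACK)           # shared black sentinel leaf
        self.nil.left = self.nil.right = self.nil.parent = self.nil
        self.root = self.nil

    def _rotate_left(self, x):
        y = x.right
        x.right = y.left
        if y.left is not self.nil:
            y.left.parent = x
        y.parent = x.parent
        if x.parent is self.nil:
            self.root = y
        elif x is x.parent.left:
            x.parent.left = y
        else:
            x.parent.right = y
        y.left, x.parent = x, y

    def _rotate_right(self, x):
        y = x.left
        x.left = y.right
        if y.right is not self.nil:
            y.right.parent = x
        y.parent = x.parent
        if x.parent is self.nil:
            self.root = y
        elif x is x.parent.right:
            x.parent.right = y
        else:
            x.parent.left = y
        y.right, x.parent = x, y

    def insert(self, key):
        z = Node(key, RED, self.nil)           # new nodes are always red
        y, x = self.nil, self.root
        while x is not self.nil:
            y, x = x, (x.left if key < x.key else x.right)
        z.parent = y
        if y is self.nil:
            self.root = z
        elif key < y.key:
            y.left = z
        else:
            y.right = z
        self._fixup(z)

    def _fixup(self, z):
        while z.parent.color == RED:
            gp = z.parent.parent
            left_side = z.parent is gp.left
            uncle = gp.right if left_side else gp.left
            if uncle.color == RED:             # Case 1: recolour, move violation up
                z.parent.color = uncle.color = BLACK
                gp.color = RED
                z = gp
            else:
                inner = z is (z.parent.right if left_side else z.parent.left)
                if inner:                      # Case 2: rotate to outer position
                    z = z.parent
                    (self._rotate_left if left_side else self._rotate_right)(z)
                z.parent.color = BLACK         # Case 3: rotate grandparent, recolour
                gp.color = RED
                (self._rotate_right if left_side else self._rotate_left)(gp)
        self.root.color = BLACK                # restore Property 2

t = RBTree()
for k in range(1, 11):
    t.insert(k)
print(t.root.color)                            # B — the root is always black
```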

11

Red-Black Deletion

Overview

Deletion is the most complex Red-Black operation. Removing a black node reduces the black-height on that path, creating a "double-black" deficit.

Procedure

  1. Standard BST delete (find in-order successor if needed)
  2. If removed node was red, done
  3. If black, replacement carries extra "black" (double-black)
  4. Fix up the double-black by case analysis

Double-black fix-up cases

Case    Sibling    Sibling's children    Fix
1       Red        —                     Rotate sibling up, recolour. Converts to Case 2/3/4.
2       Black      Both black            Recolour sibling red, push double-black up. Repeat.
3       Black      Near child red        Rotate near child up, recolour. Converts to Case 4.
4       Black      Far child red         Rotate sibling up, transfer colours. Done.

At most 3 rotations per deletion. Recolouring may propagate O(log n) times (Case 2 bubbling). Deletion case analysis is symmetric for left/right.

12

AVL vs Red-Black — When to Choose Which

Property                     AVL                        Red-Black
Height bound                 1.44 log n                 2 log n
Search speed                 Faster (shorter height)    Slightly slower
Insert rotations             At most 1                  At most 2
Delete rotations             Up to O(log n)             At most 3
Storage overhead             Height or BF per node      1 bit (colour) per node
Implementation complexity    Moderate                   Higher

Choose AVL

  • Read-heavy workloads where search performance dominates
  • Databases and file systems where lookups vastly outnumber mutations
  • When you need the tightest possible height guarantee

Choose Red-Black

  • Write-heavy workloads with frequent insertions and deletions
  • Standard library implementations (C++ std::map, Java TreeMap)
  • Kernel data structures (Linux CFS scheduler)
13

2-3 Trees

Definition

A 2-3 tree is a perfectly balanced search tree where every internal node has either 2 or 3 children, and all leaves are at the same depth.

Node types

Node type    Keys             Children    Search
2-node       1 key            2           left < key < right
3-node       2 keys (a, b)    3           left < a, a < mid < b, right > b

Insertion

  1. Search down to the correct leaf
  2. Add the key to the leaf node
  3. If overflow (3 keys), split: push middle key up to parent
  4. Splitting may cascade up; if root splits, height increases by 1

Properties

  • Perfect balance — all leaves at the same level, always
  • Height: between log3(n) and log2(n) — always O(log n)
  • Splits maintain balance without rotations
  • Conceptual foundation for Red-Black trees and B-trees
14

2-3-4 Trees

Extension of 2-3 trees

A 2-3-4 tree adds a 4-node (3 keys, 4 children). This directly corresponds to a Red-Black tree.

Node type    Keys    Children
2-node       1       2
3-node       2       3
4-node       3       4

Insertion strategies

  • Bottom-up: insert, then split overflowing nodes back up
  • Top-down: pre-emptively split any 4-node on the way down — guarantees the leaf has room

Equivalence to Red-Black trees

2-node:    [B]          Black node, no red children
3-node:    [B]-R        Black with one red child
4-node:  R-[B]-R        Black with two red children

Top-down 2-3-4 insertion maps directly to top-down Red-Black insertion with colour flips.

Sedgewick's Algorithms teaches Red-Black trees entirely through this 2-3-4 mapping. Understanding the isomorphism makes Red-Black rotations intuitive rather than magical.

15

Left-Leaning Red-Black Trees

Sedgewick's simplification (2008)

A Left-Leaning Red-Black (LLRB) tree adds one extra invariant: red links lean left. This eliminates half the cases in standard Red-Black tree operations.

Operations simplified

Operation       Standard RB       LLRB
Insert cases    6 (3 + mirror)    3
Delete cases    8+                ~4
Code lines      150–200           40–60

Core helpers

rotateLeft(h)     // fix right-leaning red link
rotateRight(h)    // temporarily create right-leaning red
flipColours(h)    // split a 4-node (both children red)

Correspondence

LLRB trees are isomorphic to 2-3 trees (not 2-3-4), because the left-leaning constraint forbids 4-nodes from persisting — they are always split immediately.

LLRB trees are the best choice for teaching and for implementations where code simplicity matters more than raw performance. The ~50-line implementation is remarkably elegant.
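The three helpers and the recursive insert fit in a few dozen lines. A Python sketch following Sedgewick's scheme (snake_case versions of the helper names above):

```python
RED, BLACK = True, False

class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = None
        self.color = RED                  # new links are always red

def is_red(h):
    return h is not None and h.color == RED

def rotate_left(h):
    # Fix a right-leaning red link.
    x = h.right
    h.right, x.left = x.left, h
    x.color, h.color = h.color, RED
    return x

def rotate_right(h):
    # Temporarily create a right-leaning red link.
    x = h.left
    h.left, x.right = x.right, h
    x.color, h.color = h.color, RED
    return x

def flip_colours(h):
    # Split a 4-node: both red children turn black, h turns red.
    h.color = not h.color
    h.left.color = not h.left.color
    h.right.color = not h.right.color

def insert(h, key):
    if h is None:
        return Node(key)
    if key < h.key:
        h.left = insert(h.left, key)
    elif key > h.key:
        h.right = insert(h.right, key)
    # Restore the left-leaning invariant on the way back up:
    if is_red(h.right) and not is_red(h.left):
        h = rotate_left(h)
    if is_red(h.left) and is_red(h.left.left):
        h = rotate_right(h)
    if is_red(h.left) and is_red(h.right):
        flip_colours(h)
    return h

def put(root, key):
    root = insert(root, key)
    root.color = BLACK                    # the root is always black
    return root

root = None
for k in range(1, 11):
    root = put(root, k)
```

The three `if` lines after the recursive calls are the entire rebalancing logic — the payoff of forbidding right-leaning red links.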

16

AA Trees

Arne Andersson's simplification (1993)

An AA tree is a Red-Black tree with an additional constraint: left children cannot be red. Equivalently, red links may only lean right, drawn as "horizontal" links between nodes at the same level.

Level-based formulation

  • Leaf nodes have level 1
  • A left child's level is strictly less than its parent's level
  • A right child's level is equal to or one less than its parent's level
  • A right grandchild's level is strictly less than its grandparent's level

Two operations only

Operation    What it does
skew         Right rotation to eliminate a left horizontal link
split        Left rotation to eliminate two consecutive right horizontal links (and increment level)

Insert and delete

  • Insert: BST insert, then skew and split back up
  • Delete: BST delete, decrease levels if needed, then skew and split back up

AA trees are arguably the simplest balanced BST to implement correctly. The two-operation approach eliminates the complex case analysis of Red-Black trees.
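Both operations are simple conditional rotations. A Python sketch of skew, split, and insert, with level bookkeeping as described above:

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.level = 1                     # new nodes enter at the bottom level
        self.left = self.right = None

def skew(t):
    # Right rotation: removes a left horizontal link
    # (left child at the same level as its parent).
    if t and t.left and t.left.level == t.level:
        l = t.left
        t.left, l.right = l.right, t
        return l
    return t

def split(t):
    # Left rotation: removes two consecutive right horizontal links,
    # promoting the middle node one level up.
    if t and t.right and t.right.right and t.right.right.level == t.level:
        r = t.right
        t.right, r.left = r.left, t
        r.level += 1
        return r
    return t

def insert(t, key):
    if t is None:
        return Node(key)
    if key < t.key:
        t.left = insert(t.left, key)
    elif key > t.key:
        t.right = insert(t.right, key)
    return split(skew(t))                  # restore invariants on the way back up

root = None
for k in range(1, 16):
    root = insert(root, k)
```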

17

Scapegoat Trees

Lazy rebalancing

Scapegoat trees do not maintain any auxiliary information (no heights, no colours, no levels). Instead, they detect and fix imbalance lazily.

How it works

  1. Insert: standard BST insert; if the new node's depth exceeds log_{1/α}(n), find a scapegoat ancestor
  2. Scapegoat: the highest ancestor on the path where size(child) > α × size(node)
  3. Rebuild: flatten the scapegoat's subtree into a sorted array, then rebuild it as a perfectly balanced BST
  4. Delete: standard BST delete; rebuild the whole tree once the node count drops to a fraction (≈ α) of its maximum since the last rebuild

The alpha parameter

Alpha    Behaviour
0.5      Perfectly balanced — rebuild often
0.75     Good balance vs rebuild trade-off (common default)
1.0      Never rebalance — degenerate BST

Complexity

Search: O(log n) worst case

Insert: O(log n) amortised (O(n) worst case for a single rebuild)

Space: O(n) — no extra metadata per node
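The rebuild step is the heart of the structure: flatten the offending subtree in order, then rebuild from the median. A Python sketch (scapegoat detection omitted; only the flatten-and-rebuild is shown):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def flatten(t, out):
    # In-order traversal: recovers the keys in sorted order.
    if t is not None:
        flatten(t.left, out)
        out.append(t.key)
        flatten(t.right, out)

def build(keys, lo, hi):
    # Rebuild a perfectly balanced BST from sorted keys[lo:hi]
    # by repeatedly taking the median as the root.
    if lo >= hi:
        return None
    mid = (lo + hi) // 2
    return Node(keys[mid], build(keys, lo, mid), build(keys, mid + 1, hi))

def rebuild(scapegoat):
    keys = []
    flatten(scapegoat, keys)
    return build(keys, 0, len(keys))

# A degenerate right chain 1 -> 2 -> ... -> 7 becomes perfectly balanced:
chain = None
for k in range(7, 0, -1):
    chain = Node(k, right=chain)
root = rebuild(chain)
print(root.key)   # 4 — the median is the new root
```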

18

Weight-Balanced Trees

Balance by subtree size

Weight-balanced trees (BB[α] trees, Nievergelt & Reingold, 1973) maintain balance based on the size of subtrees rather than height.

Weight-balance condition

size(left)  >= alpha * size(node)
size(right) >= alpha * size(node)

Typical alpha: 2/11 to 1/4
Bound: alpha <= 1 - 1/sqrt(2) ~ 0.293

Advantages

  • Order-statistic operations come for free — each node stores subtree size
  • rank(x), select(k), range_count(lo, hi) are all O(log n)
  • Well-suited for persistent/functional implementations
  • Merging and splitting are efficient
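With a size field in every node, rank and select are one-line recursions. A Python sketch on a hand-built balanced tree (`range_count(lo, hi)` is simply `rank(hi) - rank(lo)`):

```python
def size(t):
    return t.size if t else 0

class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.size = 1 + size(left) + size(right)   # subtree size: the "weight"

def select(t, k):
    # k-th smallest key, 0-indexed — O(log n) in a balanced tree.
    ls = size(t.left)
    if k < ls:
        return select(t.left, k)
    if k == ls:
        return t.key
    return select(t.right, k - ls - 1)

def rank(t, key):
    # Number of keys strictly less than `key`.
    if t is None:
        return 0
    if key <= t.key:
        return rank(t.left, key)
    return size(t.left) + 1 + rank(t.right, key)

# Balanced tree over {1..7}:
t = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
print(select(t, 0), select(t, 3))   # 1 4
print(rank(t, 5))                   # 4 — the keys 1, 2, 3, 4 are below 5
```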

Used in practice

  • Haskell's Data.Map and Data.Set use weight-balanced trees
  • Good choice when order-statistic queries are frequent
  • Natural fit for functional programming — persistent structure, no mutation
19

Persistent Balanced Trees

Path copying

A persistent data structure preserves all previous versions. Instead of modifying in place, each update creates a new path from the modified node to the root.

Version 1:         Version 2 (insert 25):

      30                30'
     / \               / \
   20   40    ->     20'  40   (shared)
  /                 / \
10               10   25  (new)

Shared: 10, 40 (unchanged)
New:    30', 20', 25
Cost:   O(log n) new nodes per update
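Path copying itself is a three-line change to BST insert: copy each node on the search path, share everything else. A Python sketch (balancing logic omitted for clarity):

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def insert(t, key):
    # Path copying: allocate new nodes only along the search path;
    # every untouched subtree is shared with the previous version.
    if t is None:
        return Node(key)
    if key < t.key:
        return Node(t.key, insert(t.left, key), t.right)
    if key > t.key:
        return Node(t.key, t.left, insert(t.right, key))
    return t   # key already present: return the old version unchanged

# The example from the diagram: version 2 shares 10 and 40 with version 1.
v1 = Node(30, Node(20, Node(10)), Node(40))
v2 = insert(v1, 25)
print(v2 is not v1)            # True — new root 30'
print(v2.right is v1.right)    # True — node 40 is shared
print(v1.left.right)           # None — version 1 is untouched
```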

Tree choices for persistence

Tree type          Suitability
Weight-balanced    Excellent — Haskell's default
Red-Black          Good — Clojure, Scala
AVL                Good — slightly more copying
2-3 trees          Excellent — clean functional style

Applications

  • Version control — git-like history for in-memory data
  • Undo/redo — O(1) rollback by keeping the old root
  • Concurrent reads — readers use old versions; no locks
  • Temporal databases — query state at any past point
20

Summary & Comparison Table

Tree          Height bound     Insert rot.     Delete rot.     Extra storage    Best for
AVL           1.44 log n       ≤1              ≤O(log n)       BF / height      Read-heavy
Red-Black     2 log n          ≤2              ≤3              1-bit colour     General-purpose
LLRB          2 log n          ≤2              ≤2              1-bit colour     Simple impl.
AA            2 log n          Skew/split up   Skew/split up   Level integer    Teaching
2-3           log3 to log2     Splits up       Merges up       Multi-key        Foundation
2-3-4         log4 to log2     Splits up       Merges up       Multi-key        RB mapping
Scapegoat     log_{1/α} n      Amort.          Amort.          None             Min. metadata
Weight-bal.   O(log n)         ≤1              ≤1              Subtree size     Order-stat, FP

Key takeaways

  • All balanced BSTs guarantee O(log n) search, insert, and delete
  • AVL is best for lookup-heavy; Red-Black for mutation-heavy
  • 2-3 and 2-3-4 trees are the conceptual foundation — RB trees are their binary encoding
  • Persistent balanced trees enable immutable, versioned data with O(log n) overhead

Recommended reading

  • Sedgewick & Wayne — Algorithms, 4th ed.
  • CLRS — Introduction to Algorithms
  • Okasaki — Purely Functional Data Structures
  • Andersson — "Balanced Search Trees Made Simple" (1993)
  • Galperin & Rivest — "Scapegoat Trees" (1993)