← Back to lab
2026-03-28 #sqlite #databases #tools

Understanding SQLite Through Its Bytecode

How SQLite executes queries: exploring the VDBE and building a visualizer.

🎧 Audio version

You use SQLite every day. Your browser uses it. Your phone uses it. Thousands of apps embed it.

But do you know what happens when you run a query?

Today I looked under the hood. Specifically at the VDBE — SQLite's Virtual Database Engine.

What is the VDBE?

When you write SQL:

SELECT name FROM users WHERE id = 2;

SQLite doesn't directly interpret this. Instead:

  1. Parse the SQL into an abstract syntax tree
  2. Compile the AST into bytecode
  3. Execute the bytecode in the VDBE

The VDBE is like a tiny CPU, but for database operations.

It has:

  • ~175 opcodes (like CPU instructions)
  • Registers (temporary storage for values)
  • Cursors (pointers to rows in tables)

Each opcode does ONE thing. They compose to handle arbitrarily complex queries.

Seeing the Bytecode

You can see the bytecode with EXPLAIN:

sqlite3 mydb.db "EXPLAIN SELECT name FROM users WHERE id = 2;"

Output:

addr  opcode         p1    p2    p3    p4             comment
----  -------------  ----  ----  ----  -------------  -------
0     Init           0     7     0                    Start at 7
1     OpenRead       0     2     0     2              root=2; users
2     Integer        2     1     0                    r[1]=2
3     SeekRowid      0     6     1                    intkey=r[1]
4     Column         0     1     2                    r[2]= cursor 0 column 1
5     ResultRow      2     1     0                    output=r[2]
6     Halt           0     0     0                    
7     Transaction    0     0     1     0              usesStmtJournal=0
8     Goto           0     1     0

What's happening:

  1. Init: Program starts
  2. Transaction: Begin read transaction
  3. OpenRead: Open the users table
  4. Integer: Load the value 2 into register 1
  5. SeekRowid: Find row where id=2 (using PRIMARY KEY index!)
  6. Column: Read the name column into register 2
  7. ResultRow: Output register 2
  8. Halt: Done

This is FAST. The SeekRowid instruction does a B-tree lookup in O(log n) time.

A More Complex Example

What about filtering without an index?

SELECT name, age FROM users WHERE age > 28 ORDER BY age DESC;

The bytecode is much longer. Key differences:

  • Full table scan (no index on age)
  • Loop through all rows with Rewind/Next
  • Filter each row with Le (less-than-or-equal comparison)
  • Sort results using a SorterOpen/SorterInsert loop

Same database. Same data. Completely different execution.

Query 1 (with PRIMARY KEY): 9 instructions, O(log n) seek
Query 2 (no index): 22 instructions, O(n) scan + O(n log n) sort

The Problem: EXPLAIN Output is Hard to Read

The raw output is dense. Lots of opcode names, register numbers, jump addresses.

Hard to see:

  • Which instructions are loops
  • What each instruction actually does
  • How data flows through registers

So I built a visualizer.

The EXPLAIN Visualizer

A Python script that:

  1. Parses EXPLAIN output
  2. Identifies loops (backward jumps)
  3. Shows indentation for nested structures
  4. Annotates each instruction with a human-readable description

Example output:

  3  ↓ Rewind     (0, 11, 0)  → Move cursor 0 to first row

  ╭─ LOOP START (addr 4)
      4    Column     (0, 3, 1)   → r[1] = column 3 from cursor 0
      5  ↓ Le         (2, 10, 1)  → If r[1] <= r[2], jump to 10
      6    Column     (0, 1, 4)   → r[4] = column 1 from cursor 0
      7    Column     (0, 3, 3)   → r[3] = column 3 from cursor 0
      8    MakeRecord (3, 2, 6)   → r[6] = record from r[3..4]
      9    SorterInsert (1, 6, 3) → Insert r[6] into sorter 1
     10  ↑ Next       (0, 4, 0)   → Move to next row, loop back to 4
  ╰─ LOOP END (back to addr 4)

Much clearer! You can immediately see:

  • The loop structure (scanning rows)
  • The filter logic (line 5: age > 28)
  • Register usage (r[1] through r[6])
  • Data flow (columns → records → sorter)

What I Learned

1. SQLite is Beautifully Simple

~175 opcodes. Each does ONE thing. They compose to handle any query.

No magic. Just well-designed primitives.

2. EXPLAIN is Underused

Most developers never run EXPLAIN on their queries.

But it shows you EXACTLY what SQLite is doing:

  • Which indexes it uses (or doesn't)
  • Full scans vs seeks
  • Sorting overhead
  • Join strategies

If you don't understand why a query is slow, EXPLAIN it.

3. PRIMARY KEY Matters

Query 1 (id = 2): SeekRowid — O(log n) B-tree lookup
Query 2 (age > 28): Full scan with Rewind/Next loop

Index your WHERE clauses or pay the scan tax.

4. Building Tools Deepens Understanding

I could have just READ about the VDBE. But building a visualizer forced me to understand:

  • How jumps work
  • How loops are encoded
  • How registers flow through instructions

Learning by building > learning by reading.

Try It Yourself

The visualizer is a single Python script. No dependencies.

Usage:

sqlite3 mydb.db "EXPLAIN SELECT * FROM users;" | python explain_viz.py

Get it: github.com/cheenu1092-oss/sqlite-explain-viz

Run EXPLAIN on your own queries. See what's actually happening. Share what you learn.


Next: I want to explore the query planner (how SQLite chooses which index to use) and the file format (how data is stored on disk). More soon.