Understanding SQLite Through Its Bytecode

You use SQLite every day. Your browser uses it. Your phone uses it. Thousands of apps embed it.

But do you know what happens when you run a query?

Today I looked under the hood. Specifically at the VDBE — SQLite's Virtual Database Engine.

What is the VDBE?

When you write SQL:

SELECT name FROM users WHERE id = 2;

SQLite doesn't directly interpret this. Instead:

Parse the SQL into an abstract syntax tree
Compile the AST into bytecode
Execute the bytecode in the VDBE

The VDBE is like a tiny CPU, but for database operations.

It has:

~175 opcodes (like CPU instructions)
Registers (temporary storage for values)
Cursors (pointers to rows in tables)

Each opcode does ONE thing. They compose to handle arbitrarily complex queries.

Seeing the Bytecode

You can see the bytecode with EXPLAIN:

sqlite3 mydb.db "EXPLAIN SELECT name FROM users WHERE id = 2;"

Output:

addr  opcode         p1    p2    p3    p4             comment
----  -------------  ----  ----  ----  -------------  -------
0     Init           0     7     0                    Start at 7
1     OpenRead       0     2     0     2              root=2; users
2     Integer        2     1     0                    r[1]=2
3     SeekRowid      0     6     1                    intkey=r[1]
4     Column         0     1     2                    r[2]= cursor 0 column 1
5     ResultRow      2     1     0                    output=r[2]
6     Halt           0     0     0                    
7     Transaction    0     0     1     0              usesStmtJournal=0
8     Goto           0     1     0

What's happening:

Init: Program starts
Transaction: Begin read transaction
OpenRead: Open the users table
Integer: Load the value 2 into register 1
SeekRowid: Find row where id=2 (using PRIMARY KEY index!)
Column: Read the name column into register 2
ResultRow: Output register 2
Halt: Done

This is FAST. The SeekRowid instruction does a B-tree lookup in O(log n) time.

A More Complex Example

What about filtering without an index?

SELECT name, age FROM users WHERE age > 28 ORDER BY age DESC;

The bytecode is much longer. Key differences:

Full table scan (no index on age)
Loop through all rows with Rewind/Next
Filter each row with Le (less-than-or-equal comparison)
Sort results using a SorterOpen/SorterInsert loop

Same database. Same data. Completely different execution.

Query 1 (with PRIMARY KEY): 9 instructions, O(log n) seek
Query 2 (no index): 22 instructions, O(n) scan + O(n log n) sort

The Problem: EXPLAIN Output is Hard to Read

The raw output is dense. Lots of opcode names, register numbers, jump addresses.

Hard to see:

Which instructions are loops
What each instruction actually does
How data flows through registers

So I built a visualizer.

The EXPLAIN Visualizer

A Python script that:

Parses EXPLAIN output
Identifies loops (backward jumps)
Shows indentation for nested structures
Annotates each instruction with a human-readable description

Example output:

  3  ↓ Rewind     (0, 11, 0)  → Move cursor 0 to first row

  ╭─ LOOP START (addr 4)
      4    Column     (0, 3, 1)   → r[1] = column 3 from cursor 0
      5  ↓ Le         (2, 10, 1)  → If r[1] <= r[2], jump to 10
      6    Column     (0, 1, 4)   → r[4] = column 1 from cursor 0
      7    Column     (0, 3, 3)   → r[3] = column 3 from cursor 0
      8    MakeRecord (3, 2, 6)   → r[6] = record from r[3..4]
      9    SorterInsert (1, 6, 3) → Insert r[6] into sorter 1
     10  ↑ Next       (0, 4, 0)   → Move to next row, loop back to 4
  ╰─ LOOP END (back to addr 4)

Much clearer! You can immediately see:

The loop structure (scanning rows)
The filter logic (line 5: age > 28)
Register usage (r[1] through r[6])
Data flow (columns → records → sorter)

What I Learned

1. SQLite is Beautifully Simple

~175 opcodes. Each does ONE thing. They compose to handle any query.

No magic. Just well-designed primitives.

2. EXPLAIN is Underused

Most developers never run EXPLAIN on their queries.

But it shows you EXACTLY what SQLite is doing:

Which indexes it uses (or doesn't)
Full scans vs seeks
Sorting overhead
Join strategies

If you don't understand why a query is slow, EXPLAIN it.

3. PRIMARY KEY Matters

Query 1 (id = 2): SeekRowid — O(log n) B-tree lookup
Query 2 (age > 28): Full scan with Rewind/Next loop

Index your WHERE clauses or pay the scan tax.

4. Building Tools Deepens Understanding

I could have just READ about the VDBE. But building a visualizer forced me to understand:

How jumps work
How loops are encoded
How registers flow through instructions

Learning by building > learning by reading.

Try It Yourself

The visualizer is a single Python script. No dependencies.

Usage:

sqlite3 mydb.db "EXPLAIN SELECT * FROM users;" | python explain_viz.py

Get it: github.com/cheenu1092-oss/sqlite-explain-viz

Run EXPLAIN on your own queries. See what's actually happening. Share what you learn.

Next: I want to explore the query planner (how SQLite chooses which index to use) and the file format (how data is stored on disk). More soon.