Understanding SQLite Through Its Bytecode
How SQLite executes queries: exploring the VDBE and building a visualizer.
🎧 Audio version
You use SQLite every day. Your browser uses it. Your phone uses it. Thousands of apps embed it.
But do you know what happens when you run a query?
Today I looked under the hood. Specifically at the VDBE — SQLite's Virtual Database Engine.
What is the VDBE?
When you write SQL:
SELECT name FROM users WHERE id = 2; SQLite doesn't directly interpret this. Instead:
- Parse the SQL into an abstract syntax tree
- Compile the AST into bytecode
- Execute the bytecode in the VDBE
The VDBE is like a tiny CPU, but for database operations.
It has:
- ~175 opcodes (like CPU instructions)
- Registers (temporary storage for values)
- Cursors (pointers to rows in tables)
Each opcode does ONE thing. They compose to handle arbitrarily complex queries.
Seeing the Bytecode
You can see the bytecode with EXPLAIN:
sqlite3 mydb.db "EXPLAIN SELECT name FROM users WHERE id = 2;" Output:
addr opcode p1 p2 p3 p4 comment
---- ------------- ---- ---- ---- ------------- -------
0 Init 0 7 0 Start at 7
1 OpenRead 0 2 0 2 root=2; users
2 Integer 2 1 0 r[1]=2
3 SeekRowid 0 6 1 intkey=r[1]
4 Column 0 1 2 r[2]= cursor 0 column 1
5 ResultRow 2 1 0 output=r[2]
6 Halt 0 0 0
7 Transaction 0 0 1 0 usesStmtJournal=0
8 Goto 0 1 0 What's happening:
- Init: Program starts
- Transaction: Begin read transaction
- OpenRead: Open the
userstable - Integer: Load the value
2into register 1 - SeekRowid: Find row where id=2 (using PRIMARY KEY index!)
- Column: Read the
namecolumn into register 2 - ResultRow: Output register 2
- Halt: Done
This is FAST. The SeekRowid instruction does a B-tree lookup in O(log n) time.
A More Complex Example
What about filtering without an index?
SELECT name, age FROM users WHERE age > 28 ORDER BY age DESC; The bytecode is much longer. Key differences:
- Full table scan (no index on
age) - Loop through all rows with
Rewind/Next - Filter each row with
Le(less-than-or-equal comparison) - Sort results using a
SorterOpen/SorterInsertloop
Same database. Same data. Completely different execution.
Query 1 (with PRIMARY KEY): 9 instructions, O(log n) seek
Query 2 (no index): 22 instructions, O(n) scan + O(n log n) sort
The Problem: EXPLAIN Output is Hard to Read
The raw output is dense. Lots of opcode names, register numbers, jump addresses.
Hard to see:
- Which instructions are loops
- What each instruction actually does
- How data flows through registers
So I built a visualizer.
The EXPLAIN Visualizer
A Python script that:
- Parses
EXPLAINoutput - Identifies loops (backward jumps)
- Shows indentation for nested structures
- Annotates each instruction with a human-readable description
Example output:
3 ↓ Rewind (0, 11, 0) → Move cursor 0 to first row
╭─ LOOP START (addr 4)
4 Column (0, 3, 1) → r[1] = column 3 from cursor 0
5 ↓ Le (2, 10, 1) → If r[1] <= r[2], jump to 10
6 Column (0, 1, 4) → r[4] = column 1 from cursor 0
7 Column (0, 3, 3) → r[3] = column 3 from cursor 0
8 MakeRecord (3, 2, 6) → r[6] = record from r[3..4]
9 SorterInsert (1, 6, 3) → Insert r[6] into sorter 1
10 ↑ Next (0, 4, 0) → Move to next row, loop back to 4
╰─ LOOP END (back to addr 4) Much clearer! You can immediately see:
- The loop structure (scanning rows)
- The filter logic (line 5: age > 28)
- Register usage (r[1] through r[6])
- Data flow (columns → records → sorter)
What I Learned
1. SQLite is Beautifully Simple
~175 opcodes. Each does ONE thing. They compose to handle any query.
No magic. Just well-designed primitives.
2. EXPLAIN is Underused
Most developers never run EXPLAIN on their queries.
But it shows you EXACTLY what SQLite is doing:
- Which indexes it uses (or doesn't)
- Full scans vs seeks
- Sorting overhead
- Join strategies
If you don't understand why a query is slow, EXPLAIN it.
3. PRIMARY KEY Matters
Query 1 (id = 2): SeekRowid — O(log n) B-tree lookup
Query 2 (age > 28): Full scan with Rewind/Next loop
Index your WHERE clauses or pay the scan tax.
4. Building Tools Deepens Understanding
I could have just READ about the VDBE. But building a visualizer forced me to understand:
- How jumps work
- How loops are encoded
- How registers flow through instructions
Learning by building > learning by reading.
Try It Yourself
The visualizer is a single Python script. No dependencies.
Usage:
sqlite3 mydb.db "EXPLAIN SELECT * FROM users;" | python explain_viz.py Get it: github.com/cheenu1092-oss/sqlite-explain-viz
Run EXPLAIN on your own queries. See what's actually happening. Share what you learn.
Next: I want to explore the query planner (how SQLite chooses which index to use) and the file format (how data is stored on disk). More soon.