BRIN Indexes
BRIN (Block Range INdex) is a PostgreSQL index type designed for large tables where data is physically ordered on disk. Instead of indexing every row like a B-tree, a BRIN index stores summary values (min/max) for each range of adjacent table blocks. This makes BRIN indexes dramatically smaller than B-tree indexes — often by a factor of 1,000x or more — while still allowing the planner to skip irrelevant blocks during range scans.
How BRIN works
Section titled “How BRIN works”A B-tree index maintains one entry per row. A BRIN index divides the table into block ranges (128 pages by default) and stores only the minimum and maximum values for each range. When a query filters on the indexed column, PostgreSQL checks which block ranges could contain matching rows and skips the rest.
This approach works because of physical correlation: if rows are inserted in order by a column (e.g. a timestamp), nearby rows on disk will have similar values. The planner can rule out entire block ranges without scanning them.
| Property | B-tree | BRIN |
|---|---|---|
| Index entry | One per row | One per block range (128 pages) |
| Size on large tables | Can be tens of GBs | Typically KBs to low MBs |
| Creation speed | Slower | Significantly faster |
| Write overhead | Moderate (update on each insert) | Minimal (update only on range change) |
| Equality lookups | Fast | Not effective |
| Narrow range scans | Faster (sub-millisecond) | Slower (must scan matching block ranges) |
| Wide range scans | Fast | Comparable or faster (fewer pages to check) |
When BRIN is a good fit
Section titled “When BRIN is a good fit”BRIN indexes are most effective when all of these conditions hold:
- Append-only or insert-heavy table — rows are inserted but rarely updated or deleted. This preserves physical ordering on disk.
- High physical correlation — the indexed column’s values increase (or decrease) in the same order as the table’s physical row order. Timestamps and auto-incrementing IDs naturally satisfy this.
- Large table — the size advantage over B-tree is most meaningful on tables with millions of rows. On small tables, B-tree overhead is negligible.
- Range queries — the workload uses
BETWEEN,>=,<=, or similar range conditions on the indexed column.
Common use cases:
- Log tables with a
created_attimestamp - Event/audit tables ordered by time
- IoT sensor data with sequential timestamps
- Append-only analytics tables
Query performance tradeoff
Section titled “Query performance tradeoff”BRIN’s primary advantage is storage size and write overhead, not query speed. For queries returning a small number of rows, B-tree is significantly faster because it can jump directly to the matching rows. BRIN must scan all block ranges that might contain matches, which is less precise.
As the result set grows larger, BRIN catches up and eventually wins — it has less index overhead to process per block range. In CrunchyData’s benchmarks on a 10M-row table with high correlation:
| Rows returned | B-tree | BRIN |
|---|---|---|
| 100 | 0.5 ms | 11 ms |
| 1,000 | 2 ms | 10 ms |
| 10,000 | 13 ms | 15 ms |
| 100,000 | 85 ms | 67 ms |
The crossover point is around 10,000 rows. Below that, B-tree is faster for reads. Above that, BRIN pulls ahead and the advantage grows with larger result sets.
Bottom line: choose BRIN when the table is large, storage matters, and queries typically scan broad time ranges. Choose B-tree when queries are selective (returning few rows) and read latency is the priority.
When BRIN is not suitable
Section titled “When BRIN is not suitable”- Frequent updates or deletes — these can break physical correlation over time as rows are moved or gaps appear.
VACUUMand heap rewrites can partially restore order, but heavily-updated tables will see degraded BRIN performance. - Out-of-order inserts — if rows are not inserted in column order, block ranges will overlap and BRIN will be unable to skip blocks effectively.
- Equality lookups — BRIN indexes are not useful for
WHERE column = valuequeries. A matching value could exist in many block ranges, so BRIN provides little filtering. Use B-tree for equality lookups. - Small tables — on tables with fewer than a few thousand rows, B-tree indexes are already small and fast. BRIN’s advantage is in storage savings at scale.
Example
Section titled “Example”-- Create a BRIN index on a timestamp columnCREATE INDEX idx_logs_created ON logs USING BRIN (created_at);
-- Queries that benefit from this index:SELECT * FROM logs WHERE created_at BETWEEN '2024-01-01' AND '2024-02-01';SELECT * FROM logs WHERE created_at >= now() - interval '7 days';For comparison, on a 49 GB table, a B-tree index can be ~21 GB while a BRIN index on the same column is ~1.7 MB.
Checking physical correlation
Section titled “Checking physical correlation”You can verify whether a column has high physical correlation using pg_stats:
SELECT tablename, attname, correlationFROM pg_statsWHERE tablename = 'logs' AND attname = 'created_at';A correlation value close to 1.0 or -1.0 indicates strong physical ordering — BRIN will work well. Values near 0 indicate random ordering — BRIN will not be effective.
Tuning pages_per_range
Section titled “Tuning pages_per_range”The pages_per_range storage parameter controls how many table pages each BRIN entry summarizes. The default is 128.
-- Smaller ranges = more precise filtering, larger indexCREATE INDEX idx_logs_created ON logs USING BRIN (created_at) WITH (pages_per_range = 32);- Smaller values (e.g. 32): more index entries, better selectivity, slightly larger index (still far smaller than B-tree).
- Larger values (e.g. 256): fewer index entries, less precise filtering, even smaller index.
For most append-only temporal tables, the default of 128 is a reasonable starting point.