Large IN tuple

Severity: Info

An IN clause with 10 or more literal values (e.g. WHERE id IN (1, 2, 3, ..., 50)) creates a long, hard-to-read query and can cause plan cache fragmentation. Replacing it with = ANY($1) using a parameterized array produces the same result with a cleaner query and better plan reuse.

Example

-- Hard to read, and each distinct list is a separate query in pg_stat_statements
SELECT * FROM products
WHERE id IN (101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112);

Why it matters

Plan cache fragmentation

pg_stat_statements normalizes literal values to parameters, but the number of items in an IN list is part of the query structure. IN ($1, $2, $3) and IN ($1, $2, $3, $4) are different query entries. If your application generates variable-length IN lists, you end up with many separate entries that are logically the same query.

Query readability

Large IN lists obscure the query’s intent. A 50-item list makes it hard to see the rest of the WHERE clause.

Parser overhead

Very large IN lists (hundreds or thousands of items) add measurable parse time, since each literal becomes a separate AST node.

How to fix it

Use = ANY($1) with an array parameter:

-- Single parameterized query, regardless of list size
SELECT * FROM products WHERE id = ANY($1);

In application code, pass the values as an array:

// Node.js with pg
const ids = [101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112];
await client.query("SELECT * FROM products WHERE id = ANY($1)", [ids]);

# Python with psycopg
ids = [101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112]
cursor.execute("SELECT * FROM products WHERE id = ANY(%s)", (ids,))

This produces a single query shape in pg_stat_statements no matter how many IDs are passed.

Detection

Query Doctor flags this nudge when it finds an IN expression with 10 or more items in the value list. The nudge marker points to the IN expression in the formatted SQL view.