Skip to content

NOT IN (SELECT ...) subquery

Severity: Warning

NOT IN (SELECT ...) is both a performance anti-pattern and a correctness trap. PostgreSQL materializes the entire subquery result and scans it for every outer row. Worse, if the subquery returns even a single NULL, the entire NOT IN evaluates to NULL (unknown) for every outer row — returning zero results.

-- Dangerous: wrong results if inactive_customers.id contains any NULL
SELECT * FROM orders
WHERE customer_id NOT IN (SELECT id FROM inactive_customers);

SQL’s NOT IN uses three-valued logic. When the subquery list contains a NULL, the expression x NOT IN (1, 2, NULL) evaluates as x <> 1 AND x <> 2 AND x <> NULL. Since x <> NULL is always unknown, the entire AND chain becomes unknown, and the WHERE clause discards the row. This means a single NULL in the subquery silently causes NOT IN to return zero rows — almost never the intended behavior.

PostgreSQL materializes the subquery result set and performs a linear scan for each outer row. NOT EXISTS allows the planner to use an efficient anti-join strategy (hash or merge), which can be dramatically faster on large tables.

SELECT * FROM orders
WHERE NOT EXISTS (
SELECT 1 FROM inactive_customers
WHERE inactive_customers.id = orders.customer_id
);

NOT EXISTS handles NULL correctly (a NULL in inactive_customers.id simply fails the equality check and doesn’t match, which is the expected behavior) and enables anti-join execution.

If you must keep NOT IN, guard against NULL values:

SELECT * FROM orders
WHERE customer_id NOT IN (
SELECT id FROM inactive_customers WHERE id IS NOT NULL
);

This prevents the NULL correctness issue but still doesn’t benefit from anti-join optimization.

Query Doctor flags this nudge when it finds a NOT IN expression where the right-hand side is a subquery (SELECT). Literal NOT IN lists (e.g. NOT IN (1, 2, 3)) are not flagged. The nudge marker points to the NOT IN location in the formatted SQL view.