DISTINCT without clear need
Severity: Warning
Adding DISTINCT to a query forces PostgreSQL to sort or hash the entire result set to eliminate duplicate rows. If the result set can’t contain duplicates in the first place — because of a primary key, unique constraint, or the query structure — then DISTINCT adds work for no benefit.
Example
Section titled “Example”-- If users.id is a primary key, every row is already uniqueSELECT DISTINCT id, name, email FROM users WHERE active = true;Why it happens
Section titled “Why it happens”DISTINCT tells PostgreSQL to deduplicate the result. To do this, the planner adds either a Sort node (to group identical rows together and discard duplicates) or a HashAggregate node (to build a hash table of seen rows). Both consume CPU and memory proportional to the result size. For large result sets, a sort may spill to disk.
If the selected columns already include a unique key, no duplicates are possible and this extra step is pure waste.
When DISTINCT is appropriate
Section titled “When DISTINCT is appropriate”DISTINCT is appropriate when duplicates genuinely occur and you need to eliminate them:
-- Multiple orders per customer, but we want each customer onceSELECT DISTINCT customer_id FROM orders WHERE created_at > '2024-01-01';It’s also valid after a JOIN that can fan out rows:
-- A user may have many orders, but we want unique user rowsSELECT DISTINCT u.id, u.nameFROM users uJOIN orders o ON o.user_id = u.idWHERE o.total > 100;How to fix it
Section titled “How to fix it”If the query already guarantees unique results, remove DISTINCT:
-- Primary key guarantees uniqueness — DISTINCT is unnecessarySELECT id, name, email FROM users WHERE active = true;If you’re using DISTINCT to fix a JOIN that produces unwanted duplicates, consider whether the join itself can be restructured — for example, using EXISTS instead:
-- Instead of SELECT DISTINCT u.* FROM users u JOIN orders o ON ...SELECT u.id, u.nameFROM users uWHERE EXISTS (SELECT 1 FROM orders o WHERE o.user_id = u.id AND o.total > 100);Detection
Section titled “Detection”Query Doctor flags this nudge on any SELECT statement that includes a DISTINCT clause. Review each flagged query to determine whether duplicates are actually possible.