CI Integration

The analyzer doubles as a GitHub Action that reviews every pull request for index recommendations. It captures the queries your test suite runs against a real PostgreSQL database, analyzes them against your schema, and posts a comment on the PR with its findings.

This works with any language, ORM, or query builder — the analyzer sees every query regardless of how it was generated.

How it works

Your CI pipeline starts PostgreSQL with pg_stat_statements enabled.
Your migrations and seed scripts set up the schema and seed data.
Your test suite (integration, e2e, load tests, etc.) runs queries against that database.
The analyzer reads the captured queries, introspects the schema, and generates index recommendations.
A comment is posted to the PR with the results.

Queries can run inside rolled-back transactions and will still be captured. The analyzer’s own work is also done in transactions that are always rolled back — no data is modified.

Prerequisites

GitHub Actions on an ubuntu runner
A test suite that hits a real PostgreSQL database — the source doesn’t matter (unit, integration, e2e)
pull-requests: write permission for the job so the analyzer can post PR comments
A Query Doctor project with a token (see “Get your project token” below)

Environment variables

The analyzer action requires these environment variables:

Variable	Required	Description
`TOKEN`	Yes	Project token from your Query Doctor project. See “Get your project token” below.
`SOURCE_DATABASE_URL`	Yes	Connection string for the database your test suite runs against (`postgres://user@host/db`).
`GITHUB_TOKEN`	Yes	GitHub token for posting PR comments — use `${{ github.token }}`.
`SITE_API_ENDPOINT`	No	Query Doctor API endpoint — set to `https://api.querydoctor.com` by default.

Get your project token

In the Query Doctor app, open your project and go to Settings → CI. Copy the value from the “Analyzer token” panel — use the rotate button to invalidate the current token and issue a new one.

Then store the value as an Actions secret on your GitHub repository. Either via the GitHub repo UI under Settings → Secrets and variables → Actions, or via the gh CLI:

gh secret set QUERYDOCTOR_TOKEN --body "<token>" --repo <owner>/<repo>

The name QUERYDOCTOR_TOKEN is the convention used in the example below.

Setup

Workflow trigger

Your workflow should trigger on both pull_request (for PR analysis) and push to your main branch (to establish a baseline for comparison). On push events, the analyzer creates a baseline on the dashboard without posting a PR comment.

on:
  pull_request:
  push:
    branches: [main]

Use a service container for full control over the PostgreSQL version and configuration. Pass postgres flags directly with command:.

1. Add a postgres service

jobs:
  query-doctor:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write
    services:
      postgres:
        image: postgres:16
        env:
          POSTGRES_USER: postgres
          POSTGRES_PASSWORD: postgres
          POSTGRES_DB: myapp_test
        ports:
          - 5432:5432
        options: >-
          --health-cmd pg_isready
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5
        command: >-
          postgres
          -c shared_preload_libraries=pg_stat_statements
    steps:
      - uses: actions/checkout@v4
      - name: Enable pg_stat_statements
        run: psql -h localhost -U postgres -d myapp_test -c "CREATE EXTENSION IF NOT EXISTS pg_stat_statements;"
        env:
          PGPASSWORD: postgres

2. Run migrations and seed

- name: Migrate and seed
  run: npm run migrate && npm run seed
  env:
    POSTGRES_URL: postgres://postgres:postgres@localhost/myapp_test

3. Run your test suite

- name: Run tests
  run: npm run test:integration
  env:
    POSTGRES_URL: postgres://postgres:postgres@localhost/myapp_test

4. Run the analyzer

- name: Run Query Doctor
  uses: query-doctor/analyzer@main
  env:
    GITHUB_TOKEN: ${{ github.token }}
    TOKEN: ${{ secrets.QUERYDOCTOR_TOKEN }}
    SOURCE_DATABASE_URL: postgres://postgres:postgres@localhost/myapp_test
    SITE_API_ENDPOINT: https://api.querydoctor.com

Dashboard and run comparison

When SITE_API_ENDPOINT is set and TOKEN is valid, the analyzer sends each CI run’s query data to the Query Doctor dashboard. This enables:

Run history — browse all CI runs at /ci, filterable by repo and branch.
Run comparison — each PR comment includes a comparison against the previous run, showing regressed, new, and disappeared queries.
Regression blocking — untriaged regressions fail the GitHub Actions step, blocking the PR. Acknowledged regressions are shown but don’t block.
Direct links — the PR comment links to the full run details and individual query history pages on the dashboard.

PR comment behavior

The PR comment adapts based on whether action is needed:

All Clear — no untriaged regressions. Shows a minimal summary with query count, index recommendations, and optimizations if any.
Action Required — untriaged regressions exist. Shows a prominent table of regressed queries with links to their history pages. The GitHub Actions step fails, blocking the PR.

Acknowledged regressions are shown in a collapsed section and do not block the PR. Ignored queries are excluded entirely.

Query triage

Each query can be triaged via the dashboard at /ci/:runId:

Status	PR comment behavior	Blocks PR?
New (default)	Shown in “Regressions Requiring Triage”	Yes
Acknowledged	Shown in collapsed “acknowledged” section	No
Resolved	Shown in collapsed “acknowledged” section	No
Ignored	Hidden from PR comment and comparison	No

Per-repo configuration

Each repository can be configured via the dashboard at /ci under “PR Comment Settings”:

Setting	Description	Default
Minimum query cost	Queries below this cost are excluded from PR comments and regression checks.	`0` (show all)
Regression threshold	Only flag regressions exceeding this percentage increase.	`0` (flag all)
Comparison branch	Branch to compare against for regressions (e.g. `main`, `staging`). If empty, compares against the current branch.	Current branch

These settings are also available via the API:

# Read config
curl https://api.querydoctor.com/ci/repos/org%2Frepo/config

# Update config
curl -X PUT https://api.querydoctor.com/ci/repos/org%2Frepo/config \
  -H 'Content-Type: application/json' \
  -d '{"minimumCost": 10, "regressionThreshold": 20, "comparisonBranch": "main"}'

Production statistics (optional)

By default, the analyzer works with the statistics generated by your test data. For more accurate recommendations that reflect real-world data distribution, you can sync statistics from your production database.

Create the following function in your production database:

CREATE OR REPLACE FUNCTION _qd_dump_stats(include_sensitive_info boolean)
RETURNS jsonb AS $$
SELECT json_agg(t) FROM (
  SELECT
    c.table_name as "tableName",
    c.table_schema as "schemaName",
    cl.reltuples, cl.relpages, cl.relallvisible,
    n.nspname as "schemaName",
    json_agg(
      json_build_object(
        'columnName', c.column_name,
        'dataType', c.data_type,
        'isNullable', (c.is_nullable = 'YES')::boolean,
        'stats', (
          select json_build_object(
            'stanullfrac', s.stanullfrac,
            'stawidth', s.stawidth,
            'stadistinct', s.stadistinct,
            'stakind1', s.stakind1, 'stanumbers1', s.stanumbers1,
            'stakind2', s.stakind2, 'stanumbers2', s.stanumbers2,
            'stakind3', s.stakind3, 'stanumbers3', s.stanumbers3,
            'stakind4', s.stakind4, 'stanumbers4', s.stanumbers4,
            'stakind5', s.stakind5, 'stanumbers5', s.stanumbers5,
            'stavalues1', case when $1 then s.stavalues1 else null end,
            'stavalues2', case when $1 then s.stavalues2 else null end,
            'stavalues3', case when $1 then s.stavalues3 else null end,
            'stavalues4', case when $1 then s.stavalues4 else null end,
            'stavalues5', case when $1 then s.stavalues5 else null end
          ) from pg_statistic s
          where s.starelid = a.attrelid and s.staattnum = a.attnum
        )
      ) ORDER BY c.ordinal_position
    ) as columns
  FROM information_schema.columns c
  JOIN pg_attribute a ON a.attrelid = (quote_ident(c.table_schema) || '.' || quote_ident(c.table_name))::regclass AND a.attname = c.column_name
  JOIN pg_class cl ON cl.relname = c.table_name
  JOIN pg_namespace n ON n.oid = cl.relnamespace
  WHERE c.table_name not like 'pg_%'
    AND n.nspname <> 'information_schema'
  GROUP BY c.table_name, c.table_schema, cl.reltuples, cl.relpages, cl.relallvisible, n.nspname
) t;
$$ LANGUAGE sql STABLE SECURITY DEFINER;

Then dump and provide the stats file:

psql -d yourdb -At -F "" -c "select _qd_dump_stats(false)" > stats.json

Pass false to include_sensitive_info to exclude actual cell values from the dump — only statistical distributions are included. Pass true if you need the most accurate recommendations and your data isn’t sensitive.