# CI Integration
The analyzer doubles as a GitHub Action that reviews every pull request for index recommendations. It captures the queries your test suite runs against a real PostgreSQL database, analyzes them against your schema, and posts a comment on the PR with its findings.
This works with any language, ORM, or query builder — the analyzer reads PostgreSQL’s auto_explain logs, so it sees every query regardless of how it was generated.
## How it works

1. Your CI pipeline starts PostgreSQL with `auto_explain` logging enabled.
2. Your migrations and seed scripts set up the schema and seed data.
3. Your test suite (integration, e2e, load tests, etc.) runs queries against that database.
4. The analyzer parses the `auto_explain` logs, introspects the schema, and generates index recommendations.
5. A comment is posted to the PR with the results.
Queries can run inside rolled-back transactions and will still be captured. The analyzer’s own work is also done in transactions that are always rolled back — no data is modified.
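For example, a query issued inside a transaction that is later rolled back is still logged, because `auto_explain` records plans at execution time, not at commit (the `users` table here is illustrative):

```sql
BEGIN;
-- Planned, executed, and written to the auto_explain log...
SELECT * FROM users WHERE email = 'alice@example.com';
-- ...even though the transaction never commits.
ROLLBACK;
```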
## Prerequisites

- GitHub Actions on an `ubuntu` runner (ships with PostgreSQL pre-installed)
- A test suite that hits a real PostgreSQL database — where the queries come from doesn’t matter (unit, integration, e2e)
- At least one row per table in your seed data so the analyzer can test index configurations
- `pull-requests: write` permission for the job so the analyzer can post PR comments
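The last prerequisite is granted at the job level in your workflow file (`contents: read` is included so `actions/checkout` can still clone the repo):

```yaml
permissions:
  contents: read        # allows actions/checkout to clone the repo
  pull-requests: write  # allows the analyzer to post PR comments
```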
## Environment variables

The analyzer action requires these environment variables:

| Variable | Required | Description |
|---|---|---|
| `POSTGRES_URL` | Yes | PostgreSQL connection string (`postgres://user@host/db`) |
| `LOG_PATH` | Yes | Path to the auto_explain log file — must match `log_directory`/`log_filename` in your PostgreSQL config |
| `GITHUB_TOKEN` | Yes | GitHub token for posting PR comments — use `${{ github.token }}` |
| `SITE_API_ENDPOINT` | Yes | Query Doctor API endpoint — set to `https://api.querydoctor.com` |
## Action inputs

| Input | Required | Description |
|---|---|---|
| `endpoint` | No | URL of a Query Doctor Site API instance (e.g. `https://api.querydoctor.com`). Enables the dashboard and run comparison features described below. |
## Workflow trigger

Your workflow should trigger on both `pull_request` (for PR analysis) and `push` to your main branch (to establish a baseline for comparison). On `push` events, the analyzer creates a baseline on the dashboard without posting a PR comment.

```yaml
on:
  pull_request:
  push:
    branches: [main]
```

## 1. Configure PostgreSQL
Ubuntu runners ship with PostgreSQL 16. Append the auto_explain configuration and start the service:
```yaml
jobs:
  query-doctor:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      pull-requests: write
    steps:
      - uses: actions/checkout@v4

      - name: Start PostgreSQL
        run: |
          sudo tee -a /etc/postgresql/16/main/postgresql.conf <<EOF
          shared_preload_libraries = 'auto_explain'
          auto_explain.log_min_duration = 0
          auto_explain.log_analyze = true
          auto_explain.log_verbose = true
          auto_explain.log_buffers = true
          auto_explain.log_format = 'json'
          logging_collector = on
          log_directory = '/var/log/postgresql'
          log_filename = 'postgres.log'
          EOF
          sudo tee /etc/postgresql/16/main/pg_hba.conf > /dev/null <<EOF
          host  all  all  127.0.0.1/32  trust
          host  all  all  ::1/128       trust
          local all  all                peer
          EOF
          sudo systemctl start postgresql.service
          sudo -u postgres createuser -s -d -r -w runner
          sudo -u postgres createdb testing
          sudo chmod 666 /var/log/postgresql/postgres.log
```

Change the username (`runner`) and database name (`testing`) to suit your project.
## 2. Run migrations and seed

Run your migration and seed scripts against the database:

```yaml
- name: Migrate and seed
  run: npm run migrate && npm run seed
  env:
    POSTGRES_URL: postgres://runner@localhost/testing
```

The analyzer needs at least one row per table to test index configurations — empty tables produce no useful query plans. Your seed script should insert representative data: a handful of rows per table is enough, but make sure every table your queries touch has data. If you don’t have a seed script, a simple SQL file with `INSERT` statements works fine:
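A minimal `seed.sql` for this purpose might look like the following (the table and column names are illustrative, not something the analyzer requires):

```sql
-- One or a few rows per table your queries touch is enough.
INSERT INTO users (id, email) VALUES
  (1, 'alice@example.com'),
  (2, 'bob@example.com');

INSERT INTO orders (id, user_id, total_cents) VALUES
  (1, 1, 1299),
  (2, 2, 880);
```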
```yaml
- name: Migrate and seed
  run: |
    npm run migrate
    psql postgres://runner@localhost/testing -f seed.sql
```

## 3. Run your test suite
Execute whatever tests generate database queries:

```yaml
- name: Run tests
  run: npm run test:integration
  env:
    POSTGRES_URL: postgres://runner@localhost/testing
```

## 4. Run the analyzer
```yaml
- name: Run Query Doctor
  uses: query-doctor/analyzer@v0
  with:
    endpoint: https://api.querydoctor.com # optional — enables dashboard + comparison
  env:
    GITHUB_TOKEN: ${{ github.token }}
    POSTGRES_URL: postgres://runner@localhost/testing
    LOG_PATH: /var/log/postgresql/postgres.log
    SITE_API_ENDPOINT: https://api.querydoctor.com
```

`LOG_PATH` tells the analyzer where to find the auto_explain log file. It must match the `log_directory`/`log_filename` you configured in step 1.
The analyzer reads the logs, introspects the database schema, and posts a PR comment with index recommendations.
## Dashboard and run comparison

When you provide the `endpoint` input, the analyzer sends each CI run’s query data to the Query Doctor dashboard. This enables:

- **Run history** — browse all CI runs at `/ci`, filterable by repo and branch.
- **Run comparison** — each PR comment includes a comparison against the previous run, showing regressed, new, and disappeared queries.
- **Regression blocking** — untriaged regressions fail the GitHub Actions step, blocking the PR. Acknowledged regressions are shown but don’t block.
- **Direct links** — the PR comment links to the full run details and individual query history pages on the dashboard.
## PR comment behavior

The PR comment adapts based on whether action is needed:
- **All Clear** — no untriaged regressions. Shows a minimal summary with query count, index recommendations, and optimizations if any.
- **Action Required** — untriaged regressions exist. Shows a prominent table of regressed queries with links to their history pages. The GitHub Actions step fails, blocking the PR.
Acknowledged regressions are shown in a collapsed section and do not block the PR. Ignored queries are excluded entirely.
## Query triage

Each query can be triaged via the dashboard at `/ci/:runId`:
| Status | PR comment behavior | Blocks PR? |
|---|---|---|
| New (default) | Shown in “Regressions Requiring Triage” | Yes |
| Acknowledged | Shown in collapsed “acknowledged” section | No |
| Resolved | Shown in collapsed “acknowledged” section | No |
| Ignored | Hidden from PR comment and comparison | No |
## Per-repo configuration

Each repository can be configured via the dashboard at `/ci` under “PR Comment Settings”:
| Setting | Description | Default |
|---|---|---|
| Minimum query cost | Queries below this cost are excluded from PR comments and regression checks. | 0 (show all) |
| Regression threshold | Only flag regressions exceeding this percentage increase. | 0 (flag all) |
| Comparison branch | Branch to compare against for regressions (e.g. main, staging). If empty, compares against the current branch. | Current branch |
These settings are also available via the API:
```shell
# Read config
curl https://api.querydoctor.com/ci/repos/org%2Frepo/config

# Update config
curl -X PUT https://api.querydoctor.com/ci/repos/org%2Frepo/config \
  -H 'Content-Type: application/json' \
  -d '{"minimumCost": 10, "regressionThreshold": 20, "comparisonBranch": "main"}'
```

## Production statistics (optional)
By default, the analyzer works with the statistics generated by your test data. For more accurate recommendations that reflect real-world data distribution, you can sync statistics from your production database.
Create the following function in your production database:
```sql
CREATE OR REPLACE FUNCTION _qd_dump_stats(include_sensitive_info boolean)
RETURNS jsonb AS $$
SELECT json_agg(t) FROM (
  SELECT
    c.table_name as "tableName",
    c.table_schema as "schemaName",
    cl.reltuples,
    cl.relpages,
    cl.relallvisible,
    n.nspname as "schemaName",
    json_agg(
      json_build_object(
        'columnName', c.column_name,
        'dataType', c.data_type,
        'isNullable', (c.is_nullable = 'YES')::boolean,
        'stats', (
          select json_build_object(
            'stanullfrac', s.stanullfrac,
            'stawidth', s.stawidth,
            'stadistinct', s.stadistinct,
            'stakind1', s.stakind1,
            'stanumbers1', s.stanumbers1,
            'stakind2', s.stakind2,
            'stanumbers2', s.stanumbers2,
            'stakind3', s.stakind3,
            'stanumbers3', s.stanumbers3,
            'stakind4', s.stakind4,
            'stanumbers4', s.stanumbers4,
            'stakind5', s.stakind5,
            'stanumbers5', s.stanumbers5,
            'stavalues1', case when $1 then s.stavalues1 else null end,
            'stavalues2', case when $1 then s.stavalues2 else null end,
            'stavalues3', case when $1 then s.stavalues3 else null end,
            'stavalues4', case when $1 then s.stavalues4 else null end,
            'stavalues5', case when $1 then s.stavalues5 else null end
          )
          from pg_statistic s
          where s.starelid = a.attrelid and s.staattnum = a.attnum
        )
      )
      ORDER BY c.ordinal_position
    ) as columns
  FROM information_schema.columns c
  JOIN pg_attribute a
    ON a.attrelid = (quote_ident(c.table_schema) || '.' || quote_ident(c.table_name))::regclass
   AND a.attname = c.column_name
  JOIN pg_class cl ON cl.relname = c.table_name
  JOIN pg_namespace n ON n.oid = cl.relnamespace
  WHERE c.table_name not like 'pg_%'
    AND n.nspname <> 'information_schema'
  GROUP BY c.table_name, c.table_schema, cl.reltuples, cl.relpages,
           cl.relallvisible, n.nspname
) t;
$$ LANGUAGE sql STABLE SECURITY DEFINER;
```

Then dump and provide the stats file:
```shell
psql -d yourdb -At -F "" -c "select _qd_dump_stats(false)" > stats.json
```

Pass `false` to `include_sensitive_info` to exclude actual cell values from the dump — only statistical distributions are included. Pass `true` if you need the most accurate recommendations and your data isn’t sensitive.
## Further reading

- Analyzer reference — the same analyzer used for live query analysis
- Source code — the analyzer repo containing the action definition