TECH SOLUTIONS: LONG RUNING QUERIES

*** LONG RUNING QUERIES

✅ STEP-BY-STEP: Investigate and Resolve Long-Running Queries

🧭 STEP 1: Identify Long-Running Queries

Use this SQL to list all currently running queries ordered by duration:

SELECT
    pid,
    usename,
    datname,
    client_addr,
    application_name,
    state,
    now() - query_start AS duration,
    wait_event_type,
    wait_event,
    query
FROM pg_stat_activity
WHERE state != 'idle'
  AND now() - query_start > interval '2 minutes'
ORDER BY duration DESC;

You can adjust the 2 minutes threshold based on your SLAs.

🧭 STEP 2: Investigate Query Type and Intent

Look at the output:

SELECT → check if it's scanning large tables without indexes
UPDATE / DELETE → large data change without batching
DDL like CREATE INDEX → may block writes unless concurrent

🛠 Check if the query is:

Holding locks
Waiting on disk/network (wait_event_type = IO/Client)

🧭 STEP 3: Check for Blocking Locks (if any)

Long-running queries may be blocked or are blocking others.

Run:

SELECT
    blocked_locks.pid     AS blocked_pid,
    blocked_activity.query AS blocked_query,
    now() - blocked_activity.query_start AS blocked_duration,
    blocking_locks.pid     AS blocking_pid,
    blocking_activity.query AS blocking_query,
    now() - blocking_activity.query_start AS blocking_duration
FROM pg_locks blocked_locks
JOIN pg_stat_activity blocked_activity ON blocked_activity.pid = blocked_locks.pid
JOIN pg_locks blocking_locks ON blocking_locks.transactionid = blocked_locks.transactionid
JOIN pg_stat_activity blocking_activity ON blocking_activity.pid = blocking_locks.pid
WHERE NOT blocked_locks.granted;

🧭 STEP 4: Check Query Plan (Optional but Useful)

If you're unsure why it's slow, run EXPLAIN (ANALYZE, BUFFERS) for the same query manually (on a test DB).

This shows:

Seq Scan vs Index Scan
Sort/Join bottlenecks
Rows returned vs expected

🧭 STEP 5: Terminate a Truly Problematic Query

If the query is:

Running for an unusual time
Not critical
Blocking other sessions

Terminate it safely:

SELECT pg_terminate_backend(<pid>);

Or just cancel it (safer):

SELECT pg_cancel_backend(<pid>);

🧭 STEP 6: Audit User and Source (Who’s Running It?)

SELECT pid, usename, application_name, client_addr, query_start, query
FROM pg_stat_activity
WHERE now() - query_start > interval '2 minutes'
ORDER BY query_start;

Use this to check:

Who triggered the query
Is it a batch job or application process?

🛡️ STEP 7: Prevent Long Queries in Future

Problem	Prevention
Full table scans	Add appropriate indexes
Large batch updates	Use `LIMIT` + `OFFSET` batching
Idle transactions	Set `idle_in_transaction_session_timeout`
Infinite reports	Use `LIMIT` in queries
Application retries	Throttle, set timeouts at app level

Example config (postgresql.conf):

statement_timeout = '2min'
log_min_duration_statement = '30s'

🧠 BONUS: Monitor Long Queries in Health Checks

Add this SQL in your pp_health_checks_01.sql:

-- Long Running Queries (above 2 min)
SELECT
    now() as captured_at,
    pid,
    usename,
    client_addr,
    now() - query_start AS runtime,
    query
FROM pg_stat_activity
WHERE state != 'idle'
  AND now() - query_start > interval '2 minutes'
ORDER BY runtime DESC;

You can output this to .html format from your shell script as part of daily health logs.

TECH SOLUTIONS

Tuesday, June 24, 2025

LONG RUNING QUERIES