only get active long running queries

Getting all connections that have been alive > 5 minutes is noisy. Instead, we want only queries that are active and have been alive for more than 5 minutes. This will filter out any idle connections.
This commit is contained in:
Chris Meyers 2021-06-21 09:57:29 -04:00 committed by GitHub
parent 0c120782d4
commit 5975082954
No known key found for this signature in database
GPG Key ID: 4AEE18F83AFDEB23

View File

@ -62,11 +62,17 @@ log_min_duration_statement = 500 # in ms
log_line_prefix = '< %m %a >' # timestamp, application name
```
We've made it easier to correlate postgres connections <--> the processes that is doing the query. For example, the following line is a log entry from a slow running `awx-manage` command. `< 2021-04-21 12:42:10.307 UTC awx_1-1540765-/bin/awx-manage gather_analytics --dry-run -v 3 >LOG: duration: 1211.270 ms statement: SELECT MIN("main_jobevent"."id") AS "pk__min", MAX("main_jobevent"."id") AS "pk__max" FROM "main_jobevent" WHERE ("main_jobevent"."modified" >= '2021-03-24T12:42:08.846790+00:00'::timestamptz AND "main_jobevent"."modified" <= '2021-04-21T12:42:08.846790+00:00'::timestamptz)` the entry was found in the log file `/var/lib/pgsql/data/pg_log/postgresql-Wed.log`
We've made it easier to correlate postgres connections <--> the processes that is doing the query. For example, the following line is a log entry from a slow running `awx-manage` command.
```
< 2021-04-21 12:42:10.307 UTC awx_1-1540765-/bin/awx-manage gather_analytics --dry-run -v 3 >LOG: duration: 1211.270 ms statement: SELECT MIN("main_jobevent"."id") AS "pk__min", MAX("main_jobevent"."id") AS "pk__max" FROM "main_jobevent" WHERE ("main_jobevent"."modified" >= '2021-03-24T12:42:08.846790+00:00'::timestamptz AND "main_jobevent"."modified" <= '2021-04-21T12:42:08.846790+00:00'::timestamptz)
```
The above entry was found in the log file `/var/lib/pgsql/data/pg_log/postgresql-Wed.log`
Note the `application_name` portion. This allows us to trace the query to the node `awx_1` with processes id `1540765`. The full task command line string gives us context for each long-running query that we need to find the needle in the hay stack without having to go to each individual AWX node and query Linux by the pid to understand what work is being done by each pid.
```
awx_1-1540765-/bin/awx-manage gather_analytics --dry-run -v 3`
awx_1-1540765-/bin/awx-manage gather_analytics --dry-run -v 3
<tower_instance_hostname>-<pid>-<pid_launch_string>
```
@ -79,15 +85,12 @@ SELECT
state,
application_name
FROM pg_stat_activity
WHERE (now() - pg_stat_activity.query_start) > interval '5 minutes';
WHERE (now() - pg_stat_activity.query_start) > interval '5 minutes' and state='active';
```
```
duration | query | state | application_name
-----------------+---------------------------------------------------------------------------------------------------------------+--------+-----------------------------------------------------------------
00:13:17.430778 | LISTEN "ec2-*-**-***-**.compute-1.amazonaws.com"; | idle | ec2-*-**-***-**.compute-1.amazonaws.com-1460120-/usr/bin/awx-ma
00:13:17.430826 | LISTEN "awx_1"; | idle | awx_1-1540753-/usr/bin/awx-manage run_dispatcher
00:13:17.202724 | LISTEN "ec2-*-**-***-**.compute-1.amazonaws.com"; | idle | ec2-*-**-***-**.compute-1.amazonaws.com-1497861-/usr/bin/awx
00:13:13.125703 | COPY (SELECT main_jobevent.id, +| active | awx_1-1540765-/bin/awx-manage gather_analytics --dry-run -v 3
| main_jobevent.created, +| |
| main_jobevent.modified, +| |