instead, just have each worker connect directly to redis
this has a few benefits:
- it's simpler to explain and debug
- back pressure on the queue keeps messages around in redis (which is
observable, and survives the restart of Python processes)
- it's likely notably more performant at high loads
make the --status flag work by fetching a periodically recorded snapshot
of internal process state; additionally, update the callback receiver to
*also* record these statistics so we can gain more insight into any
performance issues
* postgres notify/listen channel names have size limitations as well as
character limitations. Respect those limitations while at the same time
generate a unique channel name.
1. Install awx w/ a single node.
2. Start a long-running job.
3. Forcibly kill the `awx-manage run_dispatcher` process (e.g.,
SIGKILL) and do not start it again.
4. The job remains in running - without a second cluster to discover
the job, it is never reaped.
5. This PR allows you to cancel the job from the UI+API.
this commit implements the bulk of `awx-manage run_dispatcher`, a new
command that binds to RabbitMQ via kombu and balances messages across
a pool of workers that are similar to celeryd workers in spirit.
Specifically, this includes:
- a new decorator, `awx.main.dispatch.task`, which can be used to
decorate functions or classes so that they can be designated as
"Tasks"
- support for fanout/broadcast tasks (at this point in time, only
`conf.Setting` memcached flushes use this functionality)
- support for job reaping
- support for success/failure hooks for job runs (i.e.,
`handle_work_success` and `handle_work_error`)
- support for auto scaling worker pool that scale processes up and down
on demand
- minimal support for RPC, such as status checks and pool recycle/reload