Commit Graph

7 Commits

Author SHA1 Message Date
AlanCoding
482395eb6a reduce default verbosity of devel-specific callback logging 2018-10-26 10:03:46 -04:00
Ryan Petrello
3be9113d6b fix a bug that breaks job cancel on single node jobs
1.  Install AWX with a single node.
2.  Start a long-running job.
3.  Forcibly kill the `awx-manage run_dispatcher` process (e.g.,
    SIGKILL) and do not start it again.
4.  The job remains in the running state; without a second cluster node
    to discover the job, it is never reaped.
5.  This PR allows you to cancel the job from the UI+API.
2018-10-19 09:10:33 -04:00
Ryan Petrello
0d29bbfdc6 make the dispatcher more fault-tolerant to prolonged database outages 2018-10-18 20:00:07 -04:00
Ryan Petrello
53ae05094e use the proper logger for the callback receiver 2018-10-17 10:56:29 -04:00
Ryan Petrello
720a634702 don't attempt to recover special QUIT messages in the worker pool
when `--reload` is sent to the dispatcher, it sends a special QUIT
message to each worker in the pool so that it will exit gracefully at
the next opportunity

when a worker process exits unexpectedly, the dispatcher attempts to
recover its queued messages and sends them to another worker in the
pool; in this scenario, we should _never_ re-enqueue these special
QUIT messages (because the process doesn't need to quit, it's already
gone)
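
The recovery rule described above can be sketched roughly as follows. This is an illustrative sketch, not the dispatcher's actual code: the `QUIT` sentinel value, the `recover_messages` helper, and the use of in-memory deques in place of real worker queues are all assumptions made for the example.

```python
from collections import deque

# Hypothetical sentinel; the real message format is not shown in this commit.
QUIT = "QUIT"


def recover_messages(dead_worker_queue, live_worker_queues):
    """Move unfinished messages from a dead worker to the live workers.

    QUIT sentinels are dropped, because the worker they targeted is
    already gone. Returns the number of messages that were re-enqueued.
    """
    if not live_worker_queues:
        raise ValueError("no live workers to receive recovered messages")
    requeued = 0
    while dead_worker_queue:
        message = dead_worker_queue.popleft()
        if message == QUIT:
            continue  # never re-enqueue a shutdown request
        # Simple round-robin across the surviving workers.
        target = live_worker_queues[requeued % len(live_worker_queues)]
        target.append(message)
        requeued += 1
    return requeued


dead = deque([{"task": "sleep", "args": [60]}, QUIT, {"task": "cleanup"}])
alive = [deque(), deque()]
count = recover_messages(dead, alive)
```

Here the two real messages are redistributed and the `QUIT` entry is discarded, so no surviving worker is asked to exit on behalf of a process that no longer exists.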

To reproduce this race condition:

1.  Launch an adhoc that does `sleep 60`
2.  Run `awx-manage run_dispatcher --reload` to enqueue a `QUIT` message
    into the worker's queue
3.  Find the pid of the worker running the `sleep 60` and `SIGKILL` it.
4.  Observe that the dispatcher attempts to requeue the `QUIT` message
    and logs a confusing error.
2018-10-15 12:17:52 -04:00
Ryan Petrello
ff1e8cc356 replace celery task decorators with a kombu-based publisher
this commit implements the bulk of `awx-manage run_dispatcher`, a new
command that binds to RabbitMQ via kombu and balances messages across
a pool of workers that are similar to celeryd workers in spirit.
Specifically, this includes:

- a new decorator, `awx.main.dispatch.task`, which can be used to
  decorate functions or classes so that they can be designated as
  "Tasks"
- support for fanout/broadcast tasks (at this point in time, only
  `conf.Setting` memcached flushes use this functionality)
- support for job reaping
- support for success/failure hooks for job runs (i.e.,
  `handle_work_success` and `handle_work_error`)
- support for an auto-scaling worker pool that scales processes up and
  down on demand
- minimal support for RPC, such as status checks and pool recycle/reload
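
As a rough illustration of the decorator idea in the list above, the sketch below marks a function as a task and publishes calls as serialized messages instead of running them inline. It is not the `awx.main.dispatch.task` implementation: the `delay` method, the `OUTBOX` and `REGISTRY` names, the queue name, and the in-memory deque standing in for RabbitMQ and kombu are all assumptions made for the example.

```python
import json
from collections import deque

# Stand-in for a message broker queue; the real command publishes via kombu.
OUTBOX = deque()
# Maps a task's dotted name to the function that implements it.
REGISTRY = {}


def task(queue="default"):
    """Hypothetical decorator that marks a function as a dispatchable task."""
    def decorator(fn):
        name = f"{fn.__module__}.{fn.__name__}"
        REGISTRY[name] = fn

        def delay(*args, **kwargs):
            # Serialize the call instead of running it in this process.
            body = json.dumps(
                {"task": name, "args": list(args), "kwargs": kwargs}
            )
            OUTBOX.append((queue, body))

        fn.delay = delay
        return fn
    return decorator


def run_next():
    """Pop one message and execute it, as a worker process would."""
    _queue, body = OUTBOX.popleft()
    message = json.loads(body)
    return REGISTRY[message["task"]](*message["args"], **message["kwargs"])


@task(queue="example_queue")  # hypothetical queue name
def add(a, b):
    return a + b


add.delay(2, 3)      # publishes a message; nothing runs yet
result = run_next()  # a worker picks the message up and runs the task
```

The decorated function stays directly callable, while `delay` only enqueues a message, which keeps publishing and execution separate in the way a worker pool needs.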
2018-10-11 10:53:30 -04:00
Ryan Petrello
da74f1d01f refactor and test the callback receiver as a base for a task dispatcher 2018-10-11 10:53:26 -04:00