Commit Graph

30 Commits

Author SHA1 Message Date
Seth Foster
0a47d05d26 split schedule_task_manager into 3
each call to schedule_task_manager becomes one of

ScheduleTaskManager
ScheduleDependencyManager
ScheduleWorkflowManager
2022-08-05 14:33:25 -04:00
Elijah DeLee
236c1df676 fix lint errors 2022-08-05 14:33:24 -04:00
Elijah DeLee
ad08eafb9a add debug views for task manager(s)
implement https://github.com/ansible/awx/issues/12446
in development environment, enable set of views that run
the task manager(s).

Also introduce a setting that disables any calls to schedule()
that do not originate from the debug views when in the development
environment. With guards around both if we are in the development
environment and the setting, I think we're pretty safe this won't get
triggered unintentionally.

use MODE to determine if we are in devel env

Also, move test for skipping task managers to the tasks file
2022-08-05 14:31:24 -04:00
Seth Foster
431b9370df Split TaskManager into
- DependencyManager spawns dependencies if necessary
- WorkflowManager processes running workflows to see if a new job is
  ready to spawn
- TaskManager starts tasks if unblocked and has execution capacity
2022-08-05 14:29:02 -04:00
Bill Nottingham
c8cf28f266 Assorted renaming and string changes 2021-04-30 14:32:05 -04:00
Ryan Petrello
c2ef0a6500 move code linting to a stricter pep8-esque auto-formatting tool, black 2021-03-23 09:39:58 -04:00
chris meyers
dc6c353ecd remove support for multi-reader dispatch queue
* Under the new postgres backed notify/listen message queue, this never
actually worked. Without using the database to store state, we cannot
provide an at-most-once delivery mechanism w/ multi-readers.
* With this change, work is done ONLY on the node that requested the
work. Under rabbitmq, the node that was first to get the
message off the queue would do the work; presumably the least busy node.
2020-03-18 16:10:16 -04:00
AlanCoding
1f46878652 Revert "Apply migration flag check to task manager"
This reverts commit a0910eb6de.
2020-01-02 09:08:17 -05:00
AlanCoding
a0910eb6de Apply migration flag check to task manager 2019-12-15 22:56:57 -05:00
AlanCoding
758a488aee Add task manager rescheduling hooks, de-duplication, lifecycle tests 2018-11-14 11:31:34 -05:00
Ryan Petrello
ff1e8cc356 replace celery task decorators with a kombu-based publisher
this commit implements the bulk of `awx-manage run_dispatcher`, a new
command that binds to RabbitMQ via kombu and balances messages across
a pool of workers that are similar to celeryd workers in spirit.
Specifically, this includes:

- a new decorator, `awx.main.dispatch.task`, which can be used to
  decorate functions or classes so that they can be designated as
  "Tasks"
- support for fanout/broadcast tasks (at this point in time, only
  `conf.Setting` memcached flushes use this functionality)
- support for job reaping
- support for success/failure hooks for job runs (i.e.,
  `handle_work_success` and `handle_work_error`)
- support for an auto-scaling worker pool that scales processes up and
  down on demand
- minimal support for RPC, such as status checks and pool recycle/reload
2018-10-11 10:53:30 -04:00
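
The decorator-based publishing described in commit ff1e8cc356 can be illustrated with a small sketch. This is not the `awx.main.dispatch.task` implementation: the registry, the `delay` helper, the message layout, and the queue name below are all hypothetical, and no real message broker is involved. It only shows the general idea of marking a function as a task, turning a call into a message, and having a worker look the function up and run it.

```python
# Hypothetical registry mapping task names to callables (not AWX code).
REGISTRY = {}


def task(queue=None):
    """Mark a function as a "Task" and give it a message-building helper."""

    def decorate(fn):
        name = f"{fn.__module__}.{fn.__name__}"
        REGISTRY[name] = fn

        def delay(*args, **kwargs):
            # A real publisher would send this to a broker; here we just
            # return the message so the flow can be followed by hand.
            return {"task": name, "queue": queue, "args": list(args), "kwargs": kwargs}

        fn.delay = delay
        return fn

    return decorate


def run_message(message):
    """What a worker would do with a received message: look up and call."""
    fn = REGISTRY[message["task"]]
    return fn(*message["args"], **message["kwargs"])


@task(queue="example_queue")  # queue name is an assumption for illustration
def add(a, b):
    return a + b
```

In this sketch, `add.delay(2, 3)` builds a message and `run_message` executes it, while `add(2, 3)` still works as a plain function call.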
Ryan Petrello
4c0096a524 implement celery failure logging using CELERY_ANNOTATIONS
see: https://github.com/ansible/awx/issues/1720
see: https://github.com/ansible/tower/issues/1190
2018-04-06 11:23:23 -04:00
Chris Meyers
c9ff3e99b8 celeryd attach to queues dynamically
* Based on the tower topology (Instance and InstanceGroup
relationships), have celery dynamically listen to queues on boot
* Add celery task capable of "refreshing" what queues each celeryd
worker listens to. This will be used to support changes in the topology.
* Cleaned up some celery task definitions.
* Converged wrongly targeted job launch/finish messages to 'tower'
queue, rather than a 1-off queue.
* Dynamically route celery tasks destined for the local node
* separate beat process

add support for separate beat process
2018-02-01 16:37:33 -05:00
Wayne Witzel III
14c5123fda Update celery environ and tasks 2017-11-09 17:21:19 -05:00
AlanCoding
5327a4c622 Use global capacity algorithm in serializer
The task manager was doing work to compute currently consumed
capacity; this is moved into the manager and applied in the
same form to the instance group list.
2017-08-28 12:07:47 -04:00
Ryan Petrello
0e29f3617d periodically run orphaned task cleanup as part of the scheduler
Running orphaned task cleanup within its own scheduled task via
celery-beat causes a race-y lock contention between the cleanup task and
the task scheduler.  Unfortunately, the scheduler and the cleanup task
both run at similar intervals, so this race condition is fairly easy to
hit.  At best, it results in situations where the scheduler is
regularly delayed 20s; depending on timing, this can cause situations
where task execution is needlessly delayed a minute+.  At worst, it can
result in situations where the scheduler is never able to schedule
tasks.

This change implements the cleanup as a periodic block of code in the
scheduler itself that tracks its "last run" time in memcached (by
default, it performs a cleanup every 60 seconds)

see: #6534
2017-07-10 15:51:46 -04:00
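
The "last run" tracking described in commit 0e29f3617d can be sketched as follows. This is a minimal illustration, not the scheduler's code: `InMemoryCache` is a hypothetical stand-in for memcached, and the key name and the `maybe_cleanup` function are assumptions. Only the 60-second default interval comes from the commit message.

```python
import time


class InMemoryCache:
    """Hypothetical stand-in for memcached, with get/set only."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value


CLEANUP_INTERVAL = 60  # seconds; default stated in the commit message
LAST_RUN_KEY = "last_orphan_cleanup"  # hypothetical cache key


def maybe_cleanup(cache, cleanup, now=None, interval=CLEANUP_INTERVAL):
    """Run cleanup() inside the scheduler pass if the interval has elapsed.

    Returns True if cleanup ran, False if it was skipped.
    """
    now = time.time() if now is None else now
    last = cache.get(LAST_RUN_KEY)
    if last is not None and now - last < interval:
        return False  # ran recently; skip this scheduler pass
    cleanup()
    cache.set(LAST_RUN_KEY, now)
    return True
```

Because the check happens inside the scheduler pass, the cleanup runs under whatever lock the scheduler already holds, so it no longer competes with the scheduler as a separate periodic task.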
Chris Meyers
aeb7119796 fix 2 data source inconsistency with failing tasks
* Do not "trust" the list of celery ids for database entries that were
modified after the list of celery ids was gotten.
* err on the side of caution and just let the next heartbeat celery
killer try killing the task if it needs to be reaped.
2017-07-10 10:43:01 -04:00
Chris Meyers
5b9a0b504a celery task fail check now uses pglock
* Align locking used by celery task cleaner upper with regular task manager.
* Uses pglock/advisory lock instead of abusing Instance table lock.
2017-07-10 10:41:54 -04:00
Chris Meyers
e226b0ab37 noop pglock for unit tests 2017-07-05 13:34:07 -04:00
Chris Meyers
f3f9782c0b fix 2 data source inconsistency with failing tasks
* Do not "trust" the list of celery ids for database entries that were
modified after the list of celery ids was gotten.
* err on the side of caution and just let the next heartbeat celery
killer try killing the task if it needs to be reaped.
2017-07-05 13:33:29 -04:00
Chris Meyers
15aee1f8ac celery task fail check now uses pglock
* Align locking used by celery task cleaner upper with regular task manager.
* Uses pglock/advisory lock instead of abusing Instance table lock.
2017-07-05 13:27:08 -04:00
AlanCoding
cc4aa49556 pool restart with thread-based external log config
for CTiT enablement of celery task Tower logs
2017-01-04 14:53:49 -05:00
Aaron Tan
9e4655419e Fix flake8 E302 errors. 2016-11-15 20:59:39 -05:00
Matthew Jones
8e77deea27 Push celery queue stats to memcached periodically
We'll piggyback off the task that checks for inconsistent jobs between
celery and Tower itself. These are read off via /api/v1/ping
2016-11-03 12:10:38 -04:00
Chris Meyers
13c89ab78c HAify job schedules and more task_manager renaming 2016-11-01 13:50:42 -05:00
Chris Meyers
87dd91e849 rename Scheduler to TaskManager 2016-11-01 13:50:42 -05:00
Chris Meyers
454b3edb7c rectify celery<->db inconsistent running job 2016-11-01 13:50:42 -05:00
Chris Meyers
306562cd67 inventory updates running correctly 2016-11-01 13:50:42 -05:00
Chris Meyers
555f0bb90f project and jobs running correctly 2016-11-01 13:50:42 -05:00
Chris Meyers
cdb65ccac9 replace task manager with event driven scheduler 2016-09-27 14:16:18 -04:00