119 Commits

Author SHA1 Message Date
Wayne Witzel III
095a93d895 remove duplicated get_latest calls 2017-09-20 13:57:01 -04:00
Wayne Witzel III
2889df8013 ensure project sync/inv updates are added to the dependencies 2017-09-20 13:15:02 -04:00
Wayne Witzel III
11b2bc33fe add scheduler module __init__ 2017-09-20 13:14:01 -04:00
Wayne Witzel III
1beaccb9c9 move TaskManager out of init 2017-09-20 13:13:10 -04:00
Matthew Jones
9f3a0c0716 Fix an issue where dependent updates weren't sorted correctly
When considering previous / current Project Updates we weren't
properly sorting the previous runs.

We also make sure we filter down to just "check" style project updates
and don't consider 'run' style standalone project updates when
deciding what are potentially related project updates
2017-09-12 09:42:50 -04:00
AlanCoding
8e1e60c187 simplify isolated job reaping by checking all task ids 2017-09-08 12:30:05 -07:00
AlanCoding
878e7ef49f reap isolated jobs 2017-09-08 10:40:39 -07:00
AlanCoding
d54eb93f26 Handle capacity algorithm corner cases
* Instance has gone lost, and jobs are still either running
  or waiting inside of its instance group
* RBAC - user does not have permission to see some of the
  groups that would be used in the capacity calculation

For some cases, a naive capacity dictionary is returned; the
main goal is to not throw errors and to avoid unpredictable behavior.

Detailed capacity tests are moved into new unit test file.
2017-08-28 16:12:12 -04:00
AlanCoding
5327a4c622 Use global capacity algorithm in serializer
The task manager was doing work to compute currently consumed
capacity; this is moved into the manager and applied in the
same form to the instance group list.
2017-08-28 12:07:47 -04:00
AlanCoding
ce3c969c08 correct capacity algorithm for task manager 2017-08-26 11:59:25 -04:00
Chris Meyers
1df47a2ddd account for waiting tasks not having execution_nodes yet
* Reap running tasks on non-netsplit nodes
* Reap running tasks on known to be offline nodes
* Reap waiting tasks with no celery id anywhere if waiting >= 60 seconds
2017-08-16 13:18:25 -04:00
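
The third rule in the commit above (reap waiting tasks that have no celery id anywhere once they have waited 60 seconds or more) can be sketched as follows. This is an illustrative sketch only: the function name, its arguments, and the reading of "no celery id anywhere" as "not present in any node's active task list" are assumptions, not taken from the actual change.

```python
from datetime import datetime, timedelta

# 60 seconds is the threshold stated in the commit message.
WAITING_GRACE = timedelta(seconds=60)


def should_reap_waiting(celery_task_id, active_celery_ids, waiting_since, now):
    """Decide whether a waiting task should be reaped (hypothetical helper).

    celery_task_id    -- the id recorded on the task, or None if never set
    active_celery_ids -- set of ids reported by all reachable nodes
    waiting_since     -- datetime when the task entered the waiting state
    now               -- current datetime
    """
    if celery_task_id is not None and celery_task_id in active_celery_ids:
        # Some node knows about the task, so it is not orphaned.
        return False
    # Unknown everywhere: reap only once the grace period has elapsed.
    return now - waiting_since >= WAITING_GRACE
```

The grace period gives a freshly submitted task time to show up on a node before it is treated as lost.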
Chris Meyers
d615e2e9ff do not include workflow jobs for reaping
* Workflow jobs are virtual jobs that don't actually run. Thus they
won't have a celery id and aren't candidates for the generic reaping.
* Better error logging when Instance not found in reaping code.
2017-08-16 08:51:27 -04:00
Chris Meyers
de82707581 fail all jobs on an offline node 2017-08-15 14:46:28 -04:00
Chris Meyers
9314db646b only reap non-netsplit nodes 2017-08-15 08:30:14 -04:00
Chris Meyers
80a90beae8 remove dead code 2017-08-14 10:37:57 -04:00
AlanCoding
fd31dc9c63 use logger args for task log format 2017-08-11 10:56:45 -04:00
AlanCoding
33df1d8c8b introduce log format for jobs inside of scheduler 2017-08-11 10:56:15 -04:00
Chris Meyers
c0b9c76a41 stringify uuid so it can be serialized 2017-08-10 07:32:03 -04:00
Chris Meyers
c3f24d878d reap waiting processes if crash 2017-08-09 14:01:33 -04:00
AlanCoding
59249d013b use standard type string algorithm in scheduler 2017-08-08 10:40:10 -04:00
AlanCoding
9572b429ad remove support for celery_active_tasks 2017-07-31 15:02:22 -04:00
Matthew Jones
b3b4a515e2 Refactor some tower periodic tasks to label as awx 2017-07-26 13:35:30 -04:00
Chris Meyers
525490a9a0 all dependent jobs must finish before starting job
related to https://github.com/ansible/ansible-tower/issues/6570 https://github.com/ansible/ansible-tower/issues/6489

* Ensure that all jobs dependent on a job have finished (i.e. error,
success, failed) before starting the dependent job.
* Fixes a bug where a smaller set of dependent jobs is chosen to fail
when an update_on_launch job fails.
* This fixes the bug of jobs starting before dependent Project Updates
created via update on launch. This also fixes similar bugs associated
with inventory updates.
2017-07-24 12:31:35 -04:00
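
The first bullet of the commit above amounts to a gate: a job may start only when every job it depends on has reached a finished state. A minimal sketch, assuming job states are plain strings and dependencies are given as a list of ids; `can_start` and `FINISHED_STATES` are hypothetical names, and the state strings are copied from the commit message rather than from the code base.

```python
# Finished states as listed in the commit message ("error, success, failed").
FINISHED_STATES = {"error", "success", "failed"}


def can_start(status_by_job_id, dependency_ids):
    """Return True only if every dependency has finished (hypothetical helper).

    status_by_job_id -- mapping of job id to its current status string
    dependency_ids   -- ids of the jobs that must finish first
    A dependency missing from the mapping counts as not finished.
    """
    return all(
        status_by_job_id.get(dep_id) in FINISHED_STATES
        for dep_id in dependency_ids
    )
```

With this gate, a job whose update-on-launch Project Update is still running stays pending even if its Inventory Update has already finished.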
AlanCoding
a2b98f0a40 disable 2 more types of unhelpful act. str. entries
project local_path changed as a secondary save after creation
adding jobs to dependency list (not user facing)
2017-07-21 07:47:30 -04:00
AlanCoding
d02bc52ca2 silence activity stream IG assignment 2017-07-17 16:28:13 -04:00
Ryan Petrello
0e29f3617d periodically run orphaned task cleanup as part of the scheduler
Running orphaned task cleanup within its own scheduled task via
celery-beat causes a race-y lock contention between the cleanup task and
the task scheduler.  Unfortunately, the scheduler and the cleanup task
both run at similar intervals, so this race condition is fairly easy to
hit.  At best, it results in situations where the scheduler is
regularly delayed 20s; depending on timing, this can cause situations
where task execution is needlessly delayed a minute+.  At worst, it can
result in situations where the scheduler is never able to schedule
tasks.

This change implements the cleanup as a periodic block of code in the
scheduler itself that tracks its "last run" time in memcached (by
default, it performs a cleanup every 60 seconds)

see: #6534
2017-07-10 15:51:46 -04:00
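
The "last run" gating described in the commit above can be sketched as below. This is a minimal sketch under stated assumptions: a plain dict stands in for memcached, and `maybe_cleanup`, `CLEANUP_INTERVAL`, and `LAST_RUN_KEY` are hypothetical names rather than the ones used in the actual change.

```python
import time

CLEANUP_INTERVAL = 60  # seconds; the commit message gives 60 as the default
LAST_RUN_KEY = "last_cleanup_run"  # hypothetical cache key


def maybe_cleanup(cache, cleanup, now=None):
    """Run cleanup() only if CLEANUP_INTERVAL has elapsed since the last run.

    cache   -- dict-like store holding the last run timestamp
               (a stand-in for memcached in this sketch)
    cleanup -- zero-argument callable that performs the orphaned task cleanup
    now     -- current time in seconds; defaults to time.time()
    Returns True if cleanup ran during this call.
    """
    now = time.time() if now is None else now
    last_run = cache.get(LAST_RUN_KEY)
    if last_run is not None and now - last_run < CLEANUP_INTERVAL:
        return False
    # Record the run time first so a slow cleanup does not shorten the interval.
    cache[LAST_RUN_KEY] = now
    cleanup()
    return True
```

Because the check runs inside the scheduler pass itself, the cleanup never competes with the scheduler for the lock; it only piggybacks on a pass that already holds it.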
Chris Meyers
aeb7119796 fix 2 data source inconsistency with failing tasks
* Do not "trust" the list of celery ids for database entries that were
modified after the list of celery ids was gotten.
* err on the side of caution and just let the next heartbeat celery
killer try killing the task if it needs to be reaped.
2017-07-10 10:43:01 -04:00
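
The rule in the commit above is that the snapshot of celery ids says nothing reliable about database rows changed after the snapshot was taken. A rough sketch of that filter, assuming each task is represented as a `(task_id, celery_id, modified)` tuple; the function name and the tuple layout are hypothetical.

```python
from datetime import datetime, timedelta


def reap_candidates(tasks, active_celery_ids, snapshot_time):
    """Return ids of tasks that are safe to consider for reaping.

    tasks             -- iterable of (task_id, celery_id, modified) tuples
    active_celery_ids -- set of celery ids captured at snapshot_time
    snapshot_time     -- datetime when active_celery_ids was collected
    """
    candidates = []
    for task_id, celery_id, modified in tasks:
        if modified > snapshot_time:
            # Row changed after the snapshot: leave it for the next pass.
            continue
        if celery_id not in active_celery_ids:
            candidates.append(task_id)
    return candidates
```

Skipping recently modified rows errs on the side of caution: a task that truly needs reaping is simply picked up on the next heartbeat.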
Chris Meyers
5b9a0b504a celery task fail check now uses pglock
* Align locking used by celery task cleaner upper with regular task manager.
* Uses pglock/advisory lock instead of abusing Instance table lock.
2017-07-10 10:41:54 -04:00
Chris Meyers
e226b0ab37 noop pglock for unit tests 2017-07-05 13:34:07 -04:00
Chris Meyers
f3f9782c0b fix 2 data source inconsistency with failing tasks
* Do not "trust" the list of celery ids for database entries that were
modified after the list of celery ids was gotten.
* err on the side of caution and just let the next heartbeat celery
killer try killing the task if it needs to be reaped.
2017-07-05 13:33:29 -04:00
Chris Meyers
15aee1f8ac celery task fail check now uses pglock
* Align locking used by celery task cleaner upper with regular task manager.
* Uses pglock/advisory lock instead of abusing Instance table lock.
2017-07-05 13:27:08 -04:00
Chris Meyers
f1b1c4ee97 avoid instance record deadlock by using pglock 2017-07-05 09:38:19 -04:00
AlanCoding
9e07b3f777 Tag jobs started via special cases with node & group 2017-06-22 16:24:12 -04:00
AlanCoding
d69b4e00ff select isolated node inside of queue based on capacity 2017-06-21 15:48:23 -04:00
Ryan Petrello
422950f45d Support for executing job and adhoc commands on isolated Tower nodes (#6524) 2017-06-14 11:47:30 -04:00
Matthew Jones
adf4be29b7 Fix taskmanager failing to launch workflow job
This was caused by taskmanager instance group refactoring. This
re-enables that functionality
2017-05-23 11:15:57 -04:00
Chris Meyers
067fdac8b1 prefetch optimizations for task manager
* Prefetch all Jobs Types related instance group
* Prefetch inventory updates inventory source. The attribute
inventory_source.inventory_id is accessed when building the hash
tables of the running tasks.
2017-05-16 15:05:20 -04:00
Chris Meyers
9b771ae907 only consider update_on_launch inventory sources
* When generating dependencies (i.e. dynamically launching Project
Update and Inventory Update) only create the dynamic dependencies if
update_on_launch is True.
2017-05-16 13:18:15 -04:00
AlanCoding
901c77bcfb dependent IU launch_type reduced to 'dependency' 2017-05-15 14:24:41 -04:00
Matthew Jones
fc6630fd25 Consider that the project could be none when launching dependencies 2017-05-15 12:33:36 -04:00
Matthew Jones
8bc1490368 Increase test coverage for task scheduler inventory updates
Also fixes some bugs found in that process
2017-05-11 14:36:13 -04:00
Matthew Jones
5508bad97c Updating unit tests for task manager refactoring
* Purging old task manager unit tests
* Migrating those tests to functional tests
* Updating fixtures and factories to support a change in the way the
  task manager is tested
* Fix an issue with the mk_credential fixture when used in functional
  tests. Previously it had trouble with multiple invocations when
  persistence was used
2017-05-10 12:45:37 -04:00
Matthew Jones
4ced911c00 Implementing models for instance groups, updating task manager
* New InstanceGroup model and associative relationship with Instances
* Associative instances between Organizations, Inventory, and Job
  Templates and InstanceGroups
* Migrations for adding fields and tables for Instance Groups
* Adding activity stream reference for instance groups
* Task Manager Refactoring:
** Simplifying task manager relationships and move away from the
   interstitial hash tables
** Simplify dependency determination logic
** Reduce task manager runtime complexity by removing the partial
   references and moving the logic into the task manager directly or
   relying on Job model logic for determinism
2017-05-10 12:32:54 -04:00
Aaron Tan
057b24ccd0 Allow concurrent workflow job runs. 2017-05-02 16:28:24 -04:00
Chris Meyers
c4fb88c0d9 remove unneeded post commit wrapper
* Since we changed the lower level method to always use post commit
message emit
2017-02-28 13:10:00 -05:00
Chris Meyers
a1c76d3adc like inventory updates, check if project update deps already processed 2017-02-27 10:34:44 -05:00
Chris Meyers
dd513621f0 Revert "Merge pull request #5553 from chrismeyersfsu/fix-waiting_blocked"
This reverts commit 9ba2122f4f85eecaeb6fa53ac92ea2811b05e83f, reversing
changes made to c3a5f2c96fd85dd1405a8f5c875ffc988dee16a4.
2017-02-27 09:38:45 -05:00
Chris Meyers
903d0472f0 just like we fail running tasks fail waiting tasks
* Associate the celery_id with the job at the earliest point possible.
This ensures that a waiting job has a celery id. Thus, we are free to
fail waiting jobs that don't have a celery id.
2017-02-24 12:07:04 -05:00
Chris Meyers
08825a1f49 fix check running status 2017-02-23 15:09:50 -05:00
Chris Meyers
6e9488a59b ensure job deps are created only once 2017-02-15 15:54:30 -05:00