* Before, we had a special group, tower, that ran any async work that
tower needed done. This allowed users fine grain control over which
nodes did background work. However, this granularity was too complicated
for users. So now, all tower system work goes to a special non-user
exposed celery queue. Tower remains the fallback instance group to
execute jobs on. The tower group will be created upon install and
protected from deletion.
In verbose unified job models (inventory updates, system jobs,
etc.), do not delay dispatch just because the encoded
event data is not part of the data written to the buffer.
This allows output from these commands to be submitted
to the callback queue as they are produced, instead
of waiting until the buffer is closed.
* This also adds fields to the instance view for tracking cpu and
memory usage as well as information on what the capacity ranges are
* Also adds a flag for enabling/disabling instances which removes them
from all queues and has them stop processing new work
* The capacity is now based almost exclusively on some value relative
to forks
* capacity_adjustment allows you to commit an instance to a certain
amount of forks, cpu focused or memory focused
* Each job run adds a single fork overhead (that's the reasoning
behind the +1)
* Switch policy router queue to not be "tower" so that we don't
fall into a chicken/egg scenario
* Show fixed policy list in serializer so a user can determine if
an instance is manually managed
* Change IG membership mixin to not directly handle applying topology
changes. Instead it just makes sure the policy instance list is
accurate
* Add create/delete hooks for instances and groups to trigger policy
re-evaluation
* Update policy algorithm for fairer distribution
* Fix an issue where CELERY_ROUTES wasn't renamed after celery/django
upgrade
* Update unit tests to be more explicit
* Update count calculations used by algorithm to only consider
non-manual instances
* Adding unit tests and fixture
* Don't propagate logging messages from awx.main.tasks and
awx.main.scheduler
* Use advisory lock to prevent policy eval conflicts
* Allow updating instance groups from view
* Based on the tower topology (Instance and InstanceGroup
relationships), have celery dyamically listen to queues on boot
* Add celery task capable of "refreshing" what queues each celeryd
worker listens to. This will be used to support changes in the topology.
* Cleaned up some celery task definitions.
* Converged wrongly targeted job launch/finish messages to 'tower'
queue, rather than a 1-off queue.
* Dynamically route celery tasks destined for the local node
* separate beat process
add support for separate beat process
update our event data search algorithm to be a bit lazier in event data
discovery; this drastically improves processing speeds for stdout >5MB
see: https://github.com/ansible/awx/issues/417
The original tests set no longer works after Django 1.11 due to more
strict rules against dynamic model definition. The refactored tests set
aims at each existing model that apply named URL rules, instead of
abstract general use cases, thus significantly improves maintainability
and readability.
Signed-off-by: Aaron Tan <jangsutsr@gmail.com>
this approach totally removes the process of reading and writing stdout
files on the local file system at settings.JOBOUTPUT_ROOT when jobs are
run; now stdout content is only written on-demand as it's fetched for
the deprecated `stdout` endpoint
see: https://github.com/ansible/awx/issues/200
* release_3.2.1:
fallback to empty dict when processing extra_data
fix migration problem from 3.1.1
move 0005a migration to 0005b
feedback on ad hoc prohibited vars error msg
Fix the way we include i18n files in sdist
Fix migrations to support 3.1.2 -> 3.2.1+ upgrade path
fix missing parameter to update_capacity method
fix WARNING log when launching ad hoc command
Validate against ansible variables on ad hoc launch
do not allow ansible connection type of local for ad_hoc
work around an ansible 2.4 inventory caching bug
fix scan job migration unicode issue
Assert isolated nodes have capacity set to 0 and restored based on version
Set capacity to zero if the isolated node has an old version
* Change scheme from using event dict to JobEvent object
* Add processing to grok object fields
* Allow override of provided formatter in case of future issues
This change _only_ injects `AWS_TASK_ENV` into `os.environ`; it's up to
underlying libraries to be good citizens and actually respect things
like `HTTPS_PROXY`.
see: #3508