* celery workers have internal queue names that are named after the
system hostname. This may differ from the name tower knows the host by
(Instance.hostname).
This adds a mapping so we can convert internal celery names to Instance
names for the purposes of reaping jobs.
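A minimal sketch of the mapping idea, assuming each Instance consumes a
queue named after its Instance.hostname (the helper below is
illustrative, not the actual AWX code):

```python
from celery import Celery

app = Celery('awx')

def celery_worker_to_instance_name(instance_hostnames):
    # active_queues() -> {worker_name: [{'name': queue_name, ...}, ...]}
    active = app.control.inspect().active_queues() or {}
    mapping = {}
    for worker_name, queues in active.items():
        for queue in queues:
            # A worker consuming the Instance.hostname queue belongs to
            # that Instance, whatever its internal celery name is.
            if queue['name'] in instance_hostnames:
                mapping[worker_name] = queue['name']
    return mapping
```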
the main goal of this change is to make `make docker-isolated` work out
of the box
- specify the proper version for awx-expect --version
- update some deprecated playbook bits
- change the isolated container to privileged so bwrap will work
- fix awx-manage test_isolated_connection
- expedite the first isolated heartbeat so you don't have to wait 10m;
this is accomplished by _not_ setting Instance.last_isolated_check to
now() at insertion time (which would cause the next check not to happen
for 10 minutes); see the sketch after this list
- fix a bug that caused isolated node execution to fail when bwrap was
enabled
see: https://github.com/ansible/tower/issues/2150
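A minimal sketch of the heartbeat due-check this relies on (interval
constant and helper name are illustrative):

```python
from datetime import timedelta
from django.utils.timezone import now

ISOLATED_CHECK_INTERVAL = timedelta(minutes=10)

def isolated_heartbeat_due(instance):
    # A freshly-registered instance has no recorded check, so its first
    # heartbeat runs immediately instead of 10 minutes after insertion.
    if instance.last_isolated_check is None:
        return True
    return now() - instance.last_isolated_check >= ISOLATED_CHECK_INTERVAL
```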
This reverts commit 9863fe71dc.
* Instead of passing around the isolated host that the task is to
execute on, grab the isolated execution host from the instance further
down the call stack, so the isolated hostname no longer needs to be
threaded through every call signature.
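A sketch of the shape of this change, assuming the job records its
execution node (function name illustrative):

```python
from awx.main.models import Instance

def isolated_execution_instance(job):
    # Resolve the isolated host at the point of use instead of accepting
    # an extra hostname parameter threaded through every caller.
    return Instance.objects.get(hostname=job.execution_node)
```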
* Randomly choose an instance in the controller instance group to
control the isolated node run. Record the chosen instance in the job's
controller_node field.
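A minimal sketch, assuming the instance group exposes its controller
group and member instances (field traversal illustrative):

```python
import random

def assign_controller_node(job):
    # Any instance in the controller group can supervise the isolated
    # run; record the pick on the job for later bookkeeping.
    controllers = list(job.instance_group.controller.instances.all())
    job.controller_node = random.choice(controllers).hostname
    job.save(update_fields=['controller_node'])
    return job.controller_node
```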
* Deciding the Instance that a Job runs on at celery task run-time makes
it hard to evenly distribute tasks among Instances. Instead, the task
manager will look at the world of running jobs and choose an instance
to run on, applying a deterministic job distribution algorithm.
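One possible shape of such an algorithm (a sketch; the task manager's
actual policy may differ): rank instances by remaining capacity and
break ties by hostname, so every scheduling pass makes the same choice
for the same world state.

```python
def choose_execution_instance(instances, running_jobs):
    def remaining_capacity(instance):
        used = sum(job.task_impact for job in running_jobs
                   if job.execution_node == instance.hostname)
        return instance.capacity - used

    # Deterministic: identical inputs always yield the same instance.
    return max(instances, key=lambda i: (remaining_capacity(i), i.hostname))
```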
* Each time a route is needed (i.e. when a task is submitted to celery),
the router will be queried. This is ideal. With the previous method we
had to consider how a change in the routes would propagate to all celery
workers and nodes.
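Celery supports this directly with a router callable; a sketch (the
queue lookup helper is hypothetical):

```python
from celery import Celery

app = Celery('awx')

def route_task(name, args, kwargs, options, task=None, **kw):
    # Invoked on every task submission, so route changes take effect
    # immediately without pushing static route tables to workers.
    queue = current_queue_for(name)  # hypothetical settings lookup
    return {'queue': queue} if queue else None  # None -> default queue

app.conf.task_routes = (route_task,)
```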
* fully describe the default awx queue
* Our dynamic queue registration would correct awx_private_queue.
However, we don't want celery to even create an "invalid"/extra
queue-exchange-route. This change makes sure we don't create extraneous
things in rabbitmq.
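A sketch of what "fully describe" means here, using standard celery
settings (the queue/exchange name mirrors the commit; the exact values
are illustrative):

```python
from celery import Celery
from kombu import Exchange, Queue

app = Celery('awx')

default_exchange = Exchange('awx_private_queue', type='direct')

# Spell out queue, exchange, and routing key so celery has nothing
# left to auto-create in rabbitmq.
app.conf.task_default_queue = 'awx_private_queue'
app.conf.task_default_exchange = 'awx_private_queue'
app.conf.task_default_routing_key = 'awx_private_queue'
app.conf.task_queues = [
    Queue('awx_private_queue', default_exchange,
          routing_key='awx_private_queue'),
]
```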
* reduce the cluster queue registration output. Only output when the
queue registration list changes.
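A sketch of the quieter logging (names illustrative):

```python
import logging

logger = logging.getLogger('awx.main.tasks')
_last_registered = None

def log_queue_registration(queue_names):
    # Only emit the registration list when it differs from the last run.
    global _last_registered
    current = sorted(queue_names)
    if current != _last_registered:
        logger.info('celery queue registration changed: %s', current)
        _last_registered = current
```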
refactor existing handlers to be the related
"real" handler classes, which are swapped
out dynamically by an external logger "proxy" handler class
the real handler swap-out is only done on settings change
remove restart_local_services method
get rid of uWSGI fifo file
change TCP/UDP return type contract so that it mirrors
the request futures object
add details to socket error messages
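A sketch of the proxy/real handler split described above (class and
factory names are illustrative):

```python
import logging

class ExternalLoggerProxyHandler(logging.Handler):
    """Stable handler attached to the logger; delegates to a swappable
    "real" handler (HTTPS/TCP/UDP) built from the current settings."""

    def __init__(self):
        super(ExternalLoggerProxyHandler, self).__init__()
        self._real_handler = logging.NullHandler()

    def on_settings_change(self, settings):
        # The real handler is rebuilt only when settings change, never
        # per-emit.
        self._real_handler = build_real_handler(settings)  # hypothetical

    def emit(self, record):
        self._real_handler.emit(record)
```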
* We were calling Kombu's Broadcast() constructor with only one
parameter, which defines the exchange name; the queue name is then
randomly generated per-node.
* This caused problems if/when celery enters an infinite restart loop
because too many rabbit queues get created and rabbit OOM's
(gracefully).
* To remedy this we tell Broadcast the queue name to use, which is
derived from some constant + the node name so that the per-node queue
name is stable.
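A sketch of the fix (prefix constant illustrative; kombu.common.Broadcast
accepts both an exchange name and an explicit queue name):

```python
import socket
from kombu.common import Broadcast

BROADCAST_PREFIX = 'tower_broadcast_all'

# Stable per-node queue name: a restarted worker reuses its old queue
# instead of piling up randomly-named ones until rabbit OOMs.
broadcast_queue = Broadcast(
    name=BROADCAST_PREFIX,  # exchange name
    queue='{}_{}'.format(BROADCAST_PREFIX, socket.gethostname()),
)
```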
* Before, we had a special group, tower, that ran any async work that
tower needed done. This allowed users fine-grained control over which
nodes did background work. However, this granularity was too complicated
for users. So now, all tower system work goes to a special celery queue
that is not exposed to users. Tower remains the fallback instance group
to execute jobs on. The tower group will be created upon install and
protected from deletion.
The extra vars file created lives in the playbook private runtime
directory, and will be reaped along with the rest of the directory.
Adjust assorted unit tests as necessary.
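A sketch of the write, assuming the private runtime directory already
exists (filename illustrative):

```python
import json
import os
import stat

def write_extra_vars(private_data_dir, extra_vars):
    # The file lives inside the private dir, so reaping the directory
    # removes it too.
    path = os.path.join(private_data_dir, 'extra_vars.json')
    with open(path, 'w') as f:
        json.dump(extra_vars, f)
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
    return path
```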
* Previously, an isolated instance was considered to be any instance
that has at least 1 group with a controller. This is technically correct
since an iso node can not be a part of a non-iso group.
* The query is now more robust and considers a node an iso node only if
ALL of the groups the node belongs to have a controller; see the query
sketch at the end of this list.
* Also added better debugging for the special tower instance group
* Added a check for the existence of the special tower group so that
logs are less "messy" during the install process.
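A sketch of the stricter query in Django ORM terms (assuming AWX's
rampart_groups related name for an instance's groups):

```python
from awx.main.models import Instance

# A node is isolated only if every group it belongs to has a controller;
# excluding any instance with a controller-less group enforces the ALL.
isolated_instances = Instance.objects.exclude(
    rampart_groups__controller__isnull=True
)
```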