- In K8S-based installs, only container groups are intended to be used
for playbook execution (JTs, adhoc, inventory updates), so in this
scenario, other job types have a task impact of zero.
- In K8S-based installs, traditional instances have *zero* capacity
(because they're only members of the control plane where services
- http/s, local control plane execution - run)
- This commit also includes some changes that allow for the task manager
to launch tasks with task_impact=0 on instances that have capacity=0
(previously, an instance with zero capacity would never be selected
as the "execution node"
This means that when IS_K8S=True, any Job Template associated with an
Instance Group will never actually go from pending -> running (because
there's no capacity - all playbooks must run through Container Groups).
For an improved ux, our intention is to introduce logic into the
operator install process such that the *default* group that's created at
install time is a *Container Group* that's configured to point at the
K8S cluster where awx itself is deployed.
- a new unique name field to EE
- a new configure-Tower-in-Tower setting DEFAULT_EXECUTION_ENVIRONMENT
- an Org-level execution_environment_admin_role
- a default_environment field on Project
- a new Container Registry credential type
- order EEs by reverse of the created timestamp
- a method to resolve which EE to use on jobs
Various points (e.g. created, running, processing events), are
structured into json format and output to /var/log/tower/job_lifecycle.log
As part of this work, the DependencyGraph is reworked to return
which job object is doing the blocking, rather than a boolean.
* The cron ran logrotate will now rotate our log files instead of python
* If not error log file is specified in the config then do not include
it as a paremter to rsyslog omhttp module. This is useful for
containers.
* The namespace for isolated logging was not enabled. Add a handler and
logger so that it's enabled. This is particularly useful when the
logging level is switched to DEBUG
instead, just have each worker connect directly to redis
this has a few benefits:
- it's simpler to explain and debug
- back pressure on the queue keeps messages around in redis (which is
observable, and survives the restart of Python processes)
- it's likely notably more performant at high loads
The analytics change PR adjusted the logging for awx.analytics,
which solved the issue, but should have used the targeted awx.main.analytics.
Also flip a couple of loggers to use the regular awx.analytics (awx analytics)
logger instead of awx.main.analytics (the automation anayltics task system).
Errors/warnings when gathering analytics are about 50/50 split between
the gathering code in analytics and the task code that calls it, so
they should be in the same place for debugging sanity.