Previously this env var was used so that tasks running in the task container
could reach into the web container to restart rsyslog. Now that the web and
task containers are split, there's no longer a way to do that, so I renamed
this env var to reflect what it now references: the supervisor conf file of
the currently running container.
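As a rough illustration only, here is a minimal sketch of how a restart helper might consume such a variable; the `SUPERVISOR_CONF_PATH` name and the `rsyslogd` program name are placeholders, not identifiers taken from the actual change:

```python
import os
import subprocess

def restart_rsyslog():
    # Points at the supervisor conf of the *current* container; the env var
    # name here is a placeholder, not the actual variable from this change.
    conf_path = os.environ.get('SUPERVISOR_CONF_PATH', '/etc/supervisord.conf')
    # Ask the local supervisord (resolved via its conf file) to restart the
    # rsyslog program; the program name is also illustrative.
    subprocess.check_call(['supervisorctl', '-c', conf_path, 'restart', 'rsyslogd'])
```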
- Add a placeholder rsyslog.conf so rsyslog doesn't fail on start
- Create an access-restricted directory for the unix socket to be created in
- Create RSyslogHandler to exit early when the logging socket doesn't exist (sketch below)
- Write updated logging settings when the dispatcher comes up and restart
rsyslog so they take effect
- Move rsyslogd to the web container and create rpc supervisor.sock
- Add env var for supervisor.conf path
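A minimal sketch of the early-exit behavior mentioned above, assuming the handler is built on the stdlib SysLogHandler and addressed at a unix socket path; this is illustrative, not the exact implementation:

```python
import logging.handlers
import os


class RSyslogHandler(logging.handlers.SysLogHandler):
    """Hands records to the local rsyslog unix socket, but only if it exists."""

    def _connect_unixsocket(self, address):
        # Skip the connection attempt while the socket isn't there yet
        # (e.g. the placeholder config is still in place); SysLogHandler
        # would otherwise raise at construction time.
        if not os.path.exists(address):
            return
        super(RSyslogHandler, self)._connect_unixsocket(address)

    def emit(self, record):
        # Exit early instead of erroring on every log call while rsyslog
        # isn't running in this container.
        if not os.path.exists(self.address):
            return
        return super(RSyslogHandler, self).emit(record)
```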
- this change adds rsyslog (https://github.com/rsyslog/rsyslog) as
a new service that runs on every AWX node (managed by supervisord);
in particular, this feature requires a recent version (v8.38+) of
rsyslog that supports the omhttp module
(https://github.com/rsyslog/rsyslog-doc/pull/750)
- the "external_logger" handler in AWX is now a SysLogHandler that ships
logs to the local UDP port where rsyslog is configured to listen (by
default, 51414); see the sketch after this list
- every time a LOG_AGGREGATOR_* setting is changed, every AWX node
reconfigures and restarts its local instance of rsyslog so that its
forwarding settings match what has been configured in AWX
- unlike the prior implementation, if the external logging aggregator
(splunk/logstash) goes temporarily offline, rsyslog will retain the
messages and ship them when the log aggregator is back online
- 4xx or 5xx level errors are recorded at /var/log/tower/external.err
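For illustration, here is roughly what that handler arrangement amounts to; the logger name is a stand-in, and the port simply mirrors the default mentioned above:

```python
import logging
import logging.handlers
import socket

# rsyslog listens on a local UDP input (51414 by default) and owns the
# forwarding, queueing, and retry behavior toward splunk/logstash.
external_logger = logging.handlers.SysLogHandler(
    address=('localhost', 51414),
    socktype=socket.SOCK_DGRAM,
)

logger = logging.getLogger('awx.analytics')  # stand-in logger name
logger.addHandler(external_logger)
logger.error('this record is handed off to the local rsyslog instance')
```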
this commit implements the bulk of `awx-manage run_dispatcher`, a new
command that binds to RabbitMQ via kombu and balances messages across
a pool of workers that are similar to celeryd workers in spirit.
Specifically, this includes:
- a new decorator, `awx.main.dispatch.task`, which can be used to
decorate functions or classes so that they can be designated as
"Tasks" (see the usage sketch after this list)
- support for fanout/broadcast tasks (at this point in time, only
`conf.Setting` memcached flushes use this functionality)
- support for job reaping
- support for success/failure hooks for job runs (i.e.,
`handle_work_success` and `handle_work_error`)
- support for an auto-scaling worker pool that scales processes up and down
on demand
- minimal support for RPC, such as status checks and pool recycle/reload
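A hypothetical usage sketch for the decorator mentioned above; the import path follows the dotted name given in this list, but the call signatures, class entry point, and the `apply_async` publishing method are assumptions modeled on celery-style APIs rather than taken from the source:

```python
from awx.main.dispatch import task  # dotted path per the list above


@task()
def add(a, b):
    # runs inside one of the dispatcher's pool workers
    return a + b


@task()
class CleanupJob(object):
    # classes can be designated as tasks too; the worker is assumed to
    # instantiate the class and call a well-known entry point such as run()
    def run(self, job_id):
        print('cleaning up job %s' % job_id)


# publishing work onto the broker for a dispatcher worker to pick up;
# apply_async() is an assumed name echoing celery, not a confirmed API
add.apply_async(args=[1, 2])
CleanupJob.apply_async(args=[42])
```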
refactor existing handlers into the related "real" handler classes, which
are swapped out dynamically by the external logger "proxy" handler class
(see the sketch below); the real handler swapout only happens on a
setting change
remove restart_local_services method
get rid of uWSGI fifo file
change TCP/UDP return type contract so that it mirrors
the request futures object
add details to socket error messages
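A condensed sketch of the proxy pattern and the futures-style return contract described above, assuming hypothetical class names and a plain `concurrent.futures.Future` standing in for the requests-futures response object; none of this is the literal implementation:

```python
import logging
from concurrent.futures import Future


class TCPHandler(logging.Handler):
    """A 'real' handler; UDP and HTTPS variants would look similar."""

    def emit(self, record):
        future = Future()
        try:
            payload = self.format(record)
            # ... send payload over a TCP socket here ...
            future.set_result(payload)
        except Exception as exc:
            # mirror the futures contract: hand the (detailed) error back on
            # the future instead of raising inline
            future.set_exception(exc)
        return future


class ExternalLoggerProxyHandler(logging.Handler):
    """Delegates to whichever 'real' handler the current settings call for."""

    def __init__(self):
        super(ExternalLoggerProxyHandler, self).__init__()
        self._real_handler = None

    def get_handler(self, force_rebuild=False):
        # the real handler is only rebuilt when a LOG_AGGREGATOR_* setting
        # changes (signalled here via force_rebuild), not on every emit
        if self._real_handler is None or force_rebuild:
            self._real_handler = TCPHandler()  # or a UDP/HTTPS handler
        return self._real_handler

    def emit(self, record):
        return self.get_handler().emit(record)
```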
On some systems, the tower service process may not have sufficient
permissions to communicate with the supervisorctl socket, in which
case an automated restart will not be possible.
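A small illustrative guard for that case; the socket path, helper name, and log message are placeholders:

```python
import logging
import os

logger = logging.getLogger(__name__)

# placeholder path; the actual supervisorctl socket location varies by install
SUPERVISOR_SOCKET = '/var/run/supervisor/supervisor.sock'


def restart_via_supervisor(program):
    # supervisorctl talks to supervisord over a unix socket; without
    # read/write access the restart request can't be delivered
    if not os.access(SUPERVISOR_SOCKET, os.R_OK | os.W_OK):
        logger.warning(
            'insufficient permissions on %s; skipping automated restart of '
            '%s, please restart it manually', SUPERVISOR_SOCKET, program)
        return False
    # ... issue the supervisorctl restart here ...
    return True
```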
* Modify instance model to contain a version number for the node
* Update that version number during the heartbeat
* If during a heartbeat any of the nodes are of a newer version, then
shut down the current node.
The idea behind this is that if all nodes were upgraded at the same
time, then at the moment of the health check they should all be at the
newer version. Otherwise, we put the system in a state where it can
receive the upgrade but stays down until that happens. During the setup
playbook run, the services will be fully restarted.
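A sketch of that heartbeat check under stated assumptions: the field names, the version source, and the stop_local_services() helper are illustrative, not the actual model or shutdown mechanism:

```python
CURRENT_VERSION = '3.6.0'  # would come from the installed awx package


def _parse(version):
    # loose numeric comparison, good enough for the sketch
    return tuple(int(p) for p in version.split('.') if p.isdigit())


def stop_local_services():
    # stand-in for whatever actually stops this node's services
    raise SystemExit('newer cluster version detected; shutting down')


def heartbeat(this_instance, all_instances):
    # record this node's version as part of its periodic heartbeat
    this_instance.version = CURRENT_VERSION
    this_instance.save(update_fields=['version'])

    # if any peer reports a newer version, this node is stale: shut it down
    # and leave it down until the setup playbook upgrades and restarts it
    for other in all_instances:
        if other.hostname == this_instance.hostname:
            continue
        if _parse(other.version) > _parse(CURRENT_VERSION):
            stop_local_services()
            break
```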