POC: replace our external log aggregation feature with rsyslog

- this change adds rsyslog (https://github.com/rsyslog/rsyslog) as
  a new service that runs on every AWX node (managed by supervisord);
  in particular, this feature requires a recent version (v8.38+) of
  rsyslog that supports the omhttp module
  (https://github.com/rsyslog/rsyslog-doc/pull/750)
- the "external_logger" handler in AWX is now a SysLogHandler that ships
  logs to the local UDP port where rsyslog is configured to listen (by
  default, 51414)
- every time a LOG_AGGREGATOR_* setting is changed, every AWX node
  reconfigures and restarts its local instance of rsyslog so that its
  forwarding settings match what has been configured in AWX
- unlike the prior implementation, if the external logging aggregator
  (splunk/logstash) goes temporarily offline, rsyslog will retain the
  messages and ship them when the log aggregator is back online
- 4xx or 5xx level errors are recorded at /var/log/tower/external.err
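
As a rough illustration of the handler described above (not the code in this commit), the sketch below wires a standard-library SysLogHandler to a local UDP listener. The host, the logger name, the format string, and the helper name build_external_logger are assumptions made for the example; only the default port 51414 comes from the message above.

```python
import logging
import logging.handlers
import socket

# Assumed values for illustration; 51414 is the default port named above.
RSYSLOG_HOST = '127.0.0.1'
RSYSLOG_PORT = 51414


def build_external_logger(host=RSYSLOG_HOST, port=RSYSLOG_PORT):
    """Return a logger whose records are sent to a local syslog UDP listener.

    Hypothetical helper: AWX configures its handler through Django's
    LOGGING setting, which is not shown here.
    """
    handler = logging.handlers.SysLogHandler(
        address=(host, port),
        socktype=socket.SOCK_DGRAM,  # UDP, which is also the default
    )
    handler.setFormatter(logging.Formatter('%(name)s: %(message)s'))
    log = logging.getLogger('external_logger_sketch')
    log.setLevel(logging.INFO)
    log.addHandler(handler)
    log.propagate = False  # keep the example from echoing to the root logger
    return log
```

Because UDP is connectionless, the handler sends each record without waiting for an acknowledgement; buffering and retries toward the aggregator are left to rsyslog.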
Ryan Petrello
2019-10-23 23:54:47 -04:00
committed by Christian Adams
parent eafb751ecc
commit 589d27c88c
14 changed files with 106 additions and 856 deletions

@@ -11,18 +11,13 @@ from django.conf import settings
 logger = logging.getLogger('awx.main.utils.reload')
 
 
-def _supervisor_service_command(command, communicate=True):
+def supervisor_service_command(command, service='*', communicate=True):
     '''
     example use pattern of supervisorctl:
     # supervisorctl restart tower-processes:receiver tower-processes:factcacher
     '''
-    group_name = 'tower-processes'
-    if settings.DEBUG:
-        group_name = 'awx-processes'
     args = ['supervisorctl']
-    if settings.DEBUG:
-        args.extend(['-c', '/supervisor.conf'])
-    args.extend([command, '{}:*'.format(group_name)])
+    args.extend([command, ':'.join(['tower-processes', service])])
     logger.debug('Issuing command to {} services, args={}'.format(command, args))
     supervisor_process = subprocess.Popen(args, stdin=subprocess.PIPE,
                                           stdout=subprocess.PIPE, stderr=subprocess.PIPE)
@@ -41,4 +36,4 @@ def _supervisor_service_command(command, communicate=True):
 
 def stop_local_services(communicate=True):
     logger.warn('Stopping services on this node in response to user action')
-    _supervisor_service_command(command='stop', communicate=communicate)
+    supervisor_service_command(command='stop', communicate=communicate)
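
The new service parameter lets a caller target a single supervisord program instead of the whole group. The sketch below only mirrors how the argument list is assembled in the diff, without spawning a process; build_supervisorctl_args and the program name rsyslogd-example are hypothetical and are not taken from this commit.

```python
def build_supervisorctl_args(command, service='*', group='tower-processes'):
    """Assemble the supervisorctl argument list the way the diff does.

    Hypothetical helper for illustration: it returns the list rather than
    passing it to subprocess.Popen.
    """
    args = ['supervisorctl']
    # "group:*" addresses every program in the group; "group:name" one program.
    args.extend([command, ':'.join([group, service])])
    return args


# Default behaviour: act on every program in the group.
print(build_supervisorctl_args('stop'))
# Targeted behaviour: act on one program (the name here is made up).
print(build_supervisorctl_args('restart', service='rsyslogd-example'))
```

With the default service='*', existing callers such as stop_local_services keep acting on the whole group, so the rename does not change their behaviour.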