# Integration With Third-Party Log Aggregators

This feature adds the capability to send detailed logs to several kinds
of third-party external log aggregation services. Services connected to this
data feed should be useful for gaining insights into Tower usage
or technical trends. The data is intended to be
sent in JSON format in one of three ways: over an HTTP connection, a direct TCP
connection, or a direct UDP connection. It uses minimal service-specific
tweaks, engineered in a custom handler or via an imported library.

## Loggers

This feature introduces several new loggers which are intended to
deliver a large amount of information in a predictable and structured format,
following the same structure as one would expect if obtaining the data
from the API. These data loggers are the following:

- `awx.analytics.job_events` - Data returned from the Ansible callback module
- `awx.analytics.activity_stream` - Record of changes to the objects within the Ansible Tower app
- `awx.analytics.system_tracking` - Data gathered by Ansible scan modules run by scan job templates

These loggers only use the `INFO` log level. Additionally, the standard Tower logs are deliverable through this
same mechanism. It should be obvious to the user how to enable or disable
each of these five sources of data without manipulating a complex dictionary
in their local settings file, as well as how to adjust the log level consumed
from the standard Tower logs.
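
As a rough sketch of what that selection looks like, the sources reduce to a flat list in the settings file rather than a nested logging dictionary. The `LOG_AGGREGATOR_LOGGERS` setting is referenced later in this document; `LOG_AGGREGATOR_LEVEL` and the exact list values shown here are assumptions for illustration:

```python
# local_settings.py -- an illustrative sketch, not a definitive reference.

# Each entry enables one source of data; removing an entry disables it.
LOG_AGGREGATOR_LOGGERS = [
    "awx",               # the standard Tower logs
    "activity_stream",   # awx.analytics.activity_stream
    "job_events",        # awx.analytics.job_events
    "system_tracking",   # awx.analytics.system_tracking
]

# Log level consumed from the standard Tower logs (assumed setting name).
LOG_AGGREGATOR_LEVEL = "INFO"
```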

## Supported Services

Committed to support:

Have tested:

Considered, but have not tested:

- Datadog
- Red Hat Common Logging via Logstash connector

### Elastic Search Instructions

In the development environment, the server can be started up with the
log aggregation services attached via the Makefile targets. This starts
up the three associated services of Logstash, Elastic Search, and Kibana
as their own separate containers.

In addition to running these services, it establishes connections to the
`tower_tools` containers as needed. This is derived from the [`docker-elk`
project](https://github.com/deviantony/docker-elk):

```bash
# Start a single server with links
make docker-compose-elk

# Start the cluster with links
make docker-compose-cluster-elk
```

For more instructions on getting started with the environment that this example spins
up, also refer to the instructions in [`/tools/elastic/README.md`](https://github.com/ansible/awx/blob/devel/tools/elastic/README.md).

If you were to start from scratch, standing up your own version of the Elastic
Stack, then the only change you should need is to add the following lines
to the Logstash `logstash.conf` file:

```
filter {
  json {
    source => "message"
  }
}
```

#### Debugging and Pitfalls

Backward-incompatible changes were introduced with Elastic 5.0.0, and
customers may need different configurations depending on which
versions they are using.

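One quick way to see which version a running instance reports (a standard-library sketch; the host and port are assumptions for a local development stack):

```python
import json
from urllib.request import urlopen

# Elastic Search reports its version on the root endpoint, which helps
# decide which configuration the 5.0.0-era breaking changes require.
with urlopen("http://localhost:9200") as resp:  # assumed local dev address
    info = json.load(resp)
print(info["version"]["number"])
```
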
# Log Message Schema

Common schema for all loggers:

| Field | Information |
|-------|-------------|
| `cluster_host_id` | (string) Unique identifier of the host within the Tower cluster |
| `level` | (choice of `DEBUG`, `INFO`, `WARNING`, `ERROR`, etc.) Standard Python log level, roughly reflecting the significance of the event; all of the data loggers as a part of this feature use the `INFO` level, but the other Tower logs will use different levels as appropriate |
| `logger_name` | (string) Name of the logger we use in the settings, *e.g.*, "`awx.analytics.activity_stream`" |
| `@timestamp` | (datetime) Time of log |
| `path` | (string) File path in code where the log was generated |
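
For a concrete picture, a decoded record carrying just the common fields might look like the following (all values here are hypothetical):

```python
# A hypothetical record showing only the common fields; each data logger
# adds its own fields on top of these, as described below.
record = {
    "cluster_host_id": "towerhost.example.com",
    "level": "INFO",
    "logger_name": "awx.analytics.activity_stream",
    "@timestamp": "2017-06-01T12:00:00.000Z",
    "path": "awx/main/signals.py",
}
```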

## Activity Stream Schema

| Field | Information |
|-------|-------------|
| (common) | This uses all the fields common to all loggers listed above |
| actor | (string) Username of the user who took the action documented in the log |
| changes | (string) JSON summary of what fields changed, and their old/new values |
| operation | (choice of several options) The basic category of the change logged in the Activity Stream, for instance, "associate" |
| object1 | (string) Information about the primary object being operated on, consistent with what we show in the Activity Stream |
| object2 | (string) If applicable, the second object involved in the action |

## Job Event Schema

This logger echoes the data being saved into Job Events, except when they
would otherwise conflict with expected standard fields from the logger (in which case the fields are named differently).
Notably, the field `host` on the `job_event` model is given as `event_host`.
There is also a sub-dictionary field `event_data` within the payload,
which will contain different fields depending on the specifics of the
Ansible event.

This logger also includes the common fields.
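
A minimal sketch of that renaming (the helper below is illustrative, not the shipped implementation):

```python
def to_loggable(job_event_data):
    """Rename job event fields that collide with standard logger fields."""
    payload = dict(job_event_data)
    if "host" in payload:
        # "host" would clash with a standard logger field, so the value
        # is emitted under "event_host" instead.
        payload["event_host"] = payload.pop("host")
    return payload
```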

## Scan / Fact / System Tracking Data Schema

These contain a detailed dictionary-type field for either services,
packages, or files.

| Field | Information |
|-------|-------------|
| (common) | This uses all the fields common to all loggers listed above |
| services | (dict, optional) For services scans, this field is included and has keys based on the name of the service. **NOTE:** Periods are disallowed by Elastic Search in names and are replaced with `"_"` by our log formatter (see the sketch after this table) |
| packages | (dict, optional) Included for log messages from package scans |
| files | (dict, optional) Included for log messages from file scans |
| host | (str) Name of the host that the scan applies to |
| inventory_id | (int) ID of the inventory that the host is inside of |
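
A sketch of that period replacement (illustrative only; the shipped formatter may differ in details):

```python
def sanitize_service_keys(services):
    """Replace periods in service names with underscores, since Elastic
    Search disallows periods in field names."""
    return {key.replace(".", "_"): value for key, value in services.items()}

# Example: {"sshd.service": "running"} -> {"sshd_service": "running"}
```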

## Job Status Changes

This is intended to be a lower-volume source of information about
changes in job states compared to Job Events, and it is also intended to
capture changes to types of Unified Jobs other than Job Template-based
jobs.

In addition to the common fields, these logs include fields present on
the job model.

## Tower Logs

In addition to the common fields, this will contain a `msg` field with
the log message. Errors contain a separate `traceback` field.

These logs can be enabled or disabled in CTiT by adding or removing
them from the setting `LOG_AGGREGATOR_LOGGERS`.

# Configuring Inside of Tower

Parameters needed in order to configure the connection to one of the
supported log aggregation services include:

- A flag to indicate how system tracking records will be sent
- Selecting which loggers to send
- Enabling sending logs
- Connection type (HTTPS, TCP, or UDP)
- Timeout value if the connection type is based on the TCP protocol (HTTPS and TCP)
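
For illustration, these parameters map onto flat settings that can be changed through Tower's settings API. The endpoint path and field names below follow the `LOG_AGGREGATOR_*` convention used elsewhere in this document, but treat them as assumptions rather than a definitive reference:

```python
import requests

# Hypothetical sketch: adjust the URL, credentials, and field names to
# match your Tower installation and version.
requests.patch(
    "https://tower.example.com/api/v2/settings/logging/",
    auth=("admin", "password"),
    json={
        "LOG_AGGREGATOR_HOST": "logs.example.com",
        "LOG_AGGREGATOR_PORT": 8088,
        "LOG_AGGREGATOR_PROTOCOL": "https",
        "LOG_AGGREGATOR_ENABLED": True,
    },
    timeout=10,
)
```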

Some settings for the log handler will not be exposed to the user via
this mechanism; for example, threading (enabled).

In the case of a TCP or UDP connection, the Port field is supposed to be provided, and the Host field is supposed to
contain the hostname only. If instead a URL is entered in the Host field, its hostname
portion will be extracted as the actual hostname.
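
That extraction can be pictured with standard-library URL parsing (a sketch, not the exact code Tower uses):

```python
from urllib.parse import urlparse

def extract_hostname(host_field):
    """Keep only the hostname portion if the Host field holds a URL."""
    if "://" in host_field:
        return urlparse(host_field).hostname
    return host_field

# extract_hostname("https://logs.example.com:8088/services/collector")
# -> "logs.example.com"
```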

# Acceptance Criteria Notes

**Connection:** Testers need to replicate the documented steps for setting up
and connecting with a destination log aggregation service, if that is
an officially supported service. That will involve 1) configuring the
settings, as documented, 2) taking some action in Tower that causes a log
message from each type of data logger to be sent, and 3) verifying that
the content is present in the log aggregation service.

**Schema:** After the connection steps are completed, a tester will need to create
an index. We need to confirm that no errors are thrown in this process.
It also needs to be confirmed that the schema is consistent with the
documentation. In the case of Splunk, we need basic confirmation that
the data is compatible with the existing app schema.

**Tower logs:** Formatting of the traceback message is a known issue in several
open-source log handlers, so we should confirm that server errors result
in the log aggregator receiving a well-formatted multi-line string
with the traceback message.

Log messages should be sent outside of the
request-response cycle. For example, Loggly examples use
`requests_futures.sessions.FuturesSession`, which does some
threading work to fire the message without interfering with other
operations. A timeout on the part of the log aggregation service should
not cause Tower operations to hang.
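
A minimal sketch of that pattern, assuming the `requests-futures` package and a hypothetical aggregator URL:

```python
from requests_futures.sessions import FuturesSession

session = FuturesSession()  # sends requests on a background thread pool

def emit(payload):
    # The POST is fired outside the request-response cycle, so a slow or
    # unresponsive aggregator does not block other Tower operations.
    session.post(
        "https://logs.example.com/inputs/TOKEN",  # hypothetical endpoint
        json=payload,
        timeout=5,  # bound how long the background send may wait
    )
```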