diff --git a/docs/ansible_runner_integration.md b/docs/ansible_runner_integration.md index a27173020b..00fc95157c 100644 --- a/docs/ansible_runner_integration.md +++ b/docs/ansible_runner_integration.md @@ -1,19 +1,19 @@ ## Ansible Runner Integration Overview -Much of the code in AWX around ansible and ansible-playbook invocation interacting has been removed and put into the project ansible-runner. AWX now calls out to ansible-runner to invoke ansible and ansible-playbook. +Much of the code in AWX around Ansible and `ansible-playbook` invocation has been removed and put into the project `ansible-runner`. AWX now calls out to `ansible-runner` to invoke Ansible and `ansible-playbook`. ### Lifecycle -In AWX, a task of a certain job type is kicked off (i.e. RunJob, RunProjectUpdate, RunInventoryUpdate, etc) in tasks.py. A temp directory is build to house ansible-runner parameters (i.e. envvars, cmdline, extravars, etc.). The temp directory is filled with the various concepts in AWX (i.e. ssh keys, extra varsk, etc.). The code then builds a set of parameters to be passed to the ansible-runner python module interface, `ansible-runner.interface.run()`. This is where AWX passes control to ansible-runner. Feedback is gathered by AWX via callbacks and handlers passed in. +In AWX, a task of a certain job type is kicked off (_i.e._, RunJob, RunProjectUpdate, RunInventoryUpdate, etc.) in `tasks.py`. A temp directory is built to house `ansible-runner` parameters (_i.e._, `envvars`, `cmdline`, `extravars`, etc.). The `temp` directory is filled with the various concepts in AWX (_i.e._, `ssh` keys, `extra vars`, etc.). The code then builds a set of parameters to be passed to the `ansible-runner` Python module interface, `ansible-runner.interface.run()`. This is where AWX passes control to `ansible-runner`. Feedback is gathered by AWX via callbacks and handlers passed in. The callbacks and handlers are: -* event_handler: Called each time a new event is created in ansible runner. AWX will disptach the event to rabbitmq to be processed on the other end by the callback receiver. -* cancel_callback: Called periodically by ansible runner. This is so that AWX can inform ansible runner if the job should be canceled or not. -* finished_callback: Called once by ansible-runner to denote that the process that was asked to run is finished. AWX will construct the special control event, `EOF`, with an associated total number of events that it observed. -* status_handler: Called by ansible-runner as the process transitions state internally. AWX uses the `starting` status to know that ansible-runner has made all of its decisions around the process that it will launch. AWX gathers and associates these decisions with the Job for historical observation. +* `event_handler`: Called each time a new event is created in `ansible-runner`. AWX will dispatch the event to `rabbitmq` to be processed on the other end by the callback receiver. +* `cancel_callback`: Called periodically by `ansible-runner`; this is so that AWX can inform `ansible-runner` if the job should be canceled or not. +* `finished_callback`: Called once by `ansible-runner` to denote that the process that was asked to run is finished. AWX will construct the special control event, `EOF`, with the associated total number of events that it observed. +* `status_handler`: Called by `ansible-runner` as the process transitions state internally. AWX uses the `starting` status to know that `ansible-runner` has made all of its decisions around the process that it will launch. 
AWX gathers and associates these decisions with the Job for historical observation. ### Debugging -If you want to debug ansible-runner then set `AWX_CLEANUP_PATHS=False`, run a job, observe the job's `AWX_PRIVATE_DATA_DIR` property, and go the node where the job was executed and inspect that directory. +If you want to debug `ansible-runner`, then set `AWX_CLEANUP_PATHS=False`, run a job, observe the job's `AWX_PRIVATE_DATA_DIR` property, and go to the node where the job was executed and inspect that directory. -If you want to debug the process that ansible-runner invoked (i.e. ansible or ansible-playbook) then observe the job's job_env, job_cwd, and job_args parameters. +If you want to debug the process that `ansible-runner` invoked (_i.e._, Ansible or `ansible-playbook`), then observe the Job's `job_env`, `job_cwd`, and `job_args` parameters. diff --git a/docs/auth/README.md b/docs/auth/README.md index 3737bf1823..50578947aa 100644 --- a/docs/auth/README.md +++ b/docs/auth/README.md @@ -7,18 +7,18 @@ When a user wants to log into Tower, she can explicitly choose some of the suppo * Github Team OAuth2 * Microsoft Azure Active Directory (AD) OAuth2 -On the other hand, the rest of authentication methods use the same types of login info as Tower(username and password), but authenticate using external auth systems rather than Tower's own database. If some of these methods are enabled, Tower will try authenticating using the enabled methods *before Tower's own authentication method*. In specific, it follows the order +On the other hand, the other authentication methods use the same types of login info as Tower (username and password), but authenticate using external auth systems rather than Tower's own database. If some of these methods are enabled, Tower will try authenticating using the enabled methods *before Tower's own authentication method*. The order of precedence is: * LDAP * RADIUS * TACACS+ * SAML -Tower will try authenticating against each enabled authentication method *in the specified order*, meaning if the same username and password is valid in multiple enabled auth methods (For example, both LDAP and TACACS+), Tower will only use the first positive match (In the above example, log a user in via LDAP and skip TACACS+). +Tower will try authenticating against each enabled authentication method *in the specified order*, meaning if the same username and password is valid in multiple enabled auth methods (*e.g.*, both LDAP and TACACS+), Tower will only use the first positive match (in the above example, log a user in via LDAP and skip TACACS+). ## Notes: -* SAML users, RADIUS users and TACACS+ users are categorized as 'Enterprise' users. The following rules apply to Enterprise users: +SAML users, RADIUS users and TACACS+ users are categorized as 'Enterprise' users. The following rules apply to Enterprise users: * Enterprise users can only be created via the first successful login attempt from remote authentication backend. * Enterprise users cannot be created/authenticated if non-enterprise users with the same name has already been created in Tower. * Tower passwords of Enterprise users should always be empty and cannot be set by any user if there are enterprise backends enabled. - * If enterprise backends are disabled, an Enterprise user can be converted to a normal Tower user by setting password field.
But this operation is irreversible (The converted Tower user can no longer be treated as Enterprise user) + * If enterprise backends are disabled, an Enterprise user can be converted to a normal Tower user by setting password field. But this operation is irreversible (the converted Tower user can no longer be treated as Enterprise user). diff --git a/docs/auth/ldap.md b/docs/auth/ldap.md index f8c0c3b270..7ddb5fac08 100644 --- a/docs/auth/ldap.md +++ b/docs/auth/ldap.md @@ -1,18 +1,21 @@ # LDAP -The Lightweight Directory Access Protocol (LDAP) is an open, vendor-neutral, industry standard application protocol for accessing and maintaining distributed directory information services over an Internet Protocol (IP) network. Directory services play an important role in developing intranet and Internet applications by allowing the sharing of information about users, systems, networks, services, and applications throughout the network. +The Lightweight Directory Access Protocol (LDAP) is an open, vendor-neutral, industry-standard application protocol for accessing and maintaining distributed directory information services over an Internet Protocol (IP) network. Directory services play an important role in developing intranet and Internet applications by allowing the sharing of information about users, systems, networks, services, and applications throughout the network. + # Configure LDAP Authentication -Please see the Tower documentation as well as Ansible blog posts for basic LDAP configuration. + +Please see the [Tower documentation](https://docs.ansible.com/ansible-tower/latest/html/administration/ldap_auth.html) as well as [Ansible blog post](https://www.ansible.com/blog/getting-started-ldap-authentication-in-ansible-tower) for basic LDAP configuration. LDAP Authentication provides duplicate sets of configuration fields for authentication with up to six different LDAP servers. -The default set of configuration fields take the form `AUTH_LDAP_`. Configuration fields for additional ldap servers are numbered `AUTH_LDAP__`. - -## Test environment setup - -Please see README.md of this repository: https://github.com/jangsutsr/deploy_ldap.git. +The default set of configuration fields take the form `AUTH_LDAP_`. Configuration fields for additional LDAP servers are numbered `AUTH_LDAP__`. -# Basic setup for FreeIPA +## Test Environment Setup + +Please see `README.md` of this repository: https://github.com/jangsutsr/deploy_ldap.git. + + +# Basic Setup for FreeIPA LDAP Server URI (append if you have multiple LDAPs) `ldaps://{{serverip1}}:636` diff --git a/docs/auth/oauth.md b/docs/auth/oauth.md index a3a93be21b..58ab6a9418 100644 --- a/docs/auth/oauth.md +++ b/docs/auth/oauth.md @@ -1,16 +1,16 @@ ## Introduction Starting from Tower 3.3, OAuth 2 will be used as the new means of token-based authentication. Users will be able to manage OAuth 2 tokens as well as applications, a server-side representation of API -clients used to generate tokens. With OAuth 2, a user can authenticate by passing a token as part of +clients used to generate tokens. With OAuth 2, a user can authenticate by passing a token as part of the HTTP authentication header. The token can be scoped to have more restrictive permissions on top of -the base RBAC permissions of the user. Refer to [RFC 6749](https://tools.ietf.org/html/rfc6749) for +the base RBAC permissions of the user. Refer to [RFC 6749](https://tools.ietf.org/html/rfc6749) for more details of OAuth 2 specification. 
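+For example (the host and token value below are hypothetical placeholders), a token is passed in the HTTP `Authorization` header of any request; the walkthrough below covers how applications and tokens are created in the first place:
+```bash
+# The same Bearer header works for any API request the token's scope allows,
+# e.g. asking the API who the authenticated user is.
+curl -H "Authorization: Bearer Zw9DYpYoPMdLRxSVEDA8wgaRqPfCav" \
+    https://awx.example.org/api/v2/me/
+```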
## Basic Usage -To get started using OAuth 2 tokens for accessing the browsable API using OAuth 2, we will walkthrough acquiring a token, and using it. +To get started using OAuth 2 tokens for accessing the browsable API using OAuth 2, this document will walk through the steps of acquiring a token and using it. -1. Make an application with authorization_grant_type set to 'password'. HTTP POST the following to the `/api/v2/applications/` endpoint (supplying your own organization-id): +1. Make an application with `authorization_grant_type` set to 'password'. HTTP POST the following to the `/api/v2/applications/` endpoint (supplying your own `organization-id`): ``` { "name": "Admin Internal Application", @@ -22,7 +22,7 @@ To get started using OAuth 2 tokens for accessing the browsable API using OAuth "organization": } ``` -2. Make a token with a POST to the `/api/v2/tokens/` endpoint: +2. Make a token with a POST to the `/api/v2/tokens/` endpoint: ``` { "description": "My Access Token", @@ -32,13 +32,13 @@ To get started using OAuth 2 tokens for accessing the browsable API using OAuth ``` This will return a `` that you can use to authenticate with for future requests (this will not be shown again) -3. Use token to access a resource. We will use curl to demonstrate this: +3. Use token to access a resource. We will use `curl` to demonstrate this: ``` curl -H "Authorization: Bearer " -X GET https:///api/v2/users/ ``` > The `-k` flag may be needed if you have not set up a CA yet and are using SSL. -This token can be revoked by making a DELETE on the detail page for that token. All you need is that token's id. For example: +This token can be revoked by making a DELETE on the detail page for that token. All you need is that token's id. For example: ``` curl -ku : -X DELETE https:///api/v2/tokens// ``` @@ -48,15 +48,17 @@ Similarly, using a token: curl -H "Authorization: Bearer " -X DELETE https:///api/v2/tokens// -k ``` + ## More Information -#### Managing OAuth 2 applications and tokens -Applications and tokens can be managed as a top-level resource at `/api//applications` and -`/api//tokens`. These resources can also be accessed respective to the user at +#### Managing OAuth 2 Applications and Tokens + +Applications and tokens can be managed as a top-level resource at `/api//applications` and +`/api//tokens`. These resources can also be accessed respective to the user at `/api//users/N/`. Applications can be created by making a POST to either `api//applications` or `/api//users/N/applications`. -Each OAuth 2 application represents a specific API client on the server side. For an API client to use the API via an application token, +Each OAuth 2 application represents a specific API client on the server side. For an API client to use the API via an application token, it must first have an application and issue an access token. Individual applications will be accessible via their primary keys: @@ -111,22 +113,20 @@ generated during creation; Fields `user` and `authorization_grant_type`, on the *immutable on update*, meaning they are required fields on creation, but will become read-only after that. 
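+As a quick, hypothetical illustration (the host, application ID, and admin credentials are placeholders), updating one of the mutable fields on an existing application could look like:
+```bash
+# PATCH a mutable field such as "name"; "user" and "authorization_grant_type"
+# remain read-only after creation.
+curl -ku admin:password -X PATCH \
+    -H "Content-Type: application/json" \
+    -d '{"name": "Admin Internal Application (renamed)"}' \
+    https://awx.example.org/api/v2/applications/42/
+```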
-On RBAC side: -- system admins will be able to see and manipulate all applications in the system; +**On RBAC side:** +- System admins will be able to see and manipulate all applications in the system; - Organization admins will be able to see and manipulate all applications belonging to Organization members; - Other normal users will only be able to see, update and delete their own applications, but cannot create any new applications. - - Tokens, on the other hand, are resources used to actually authenticate incoming requests and mask the permissions of the underlying user. Tokens can be created by POSTing to `/api/v2/tokens/` endpoint by providing `application` and `scope` fields to point to related application and specify token scope; or POSTing to `/api/v2/applications//tokens/` by providing only `scope`, while the parent application will be automatically linked. -Individual tokens will be accessible via their primary keys: +Individual tokens will be accessible via their primary keys at `/api//tokens//`. Here is a typical token: ``` { @@ -162,18 +162,19 @@ Individual tokens will be accessible via their primary keys: "scope": "read" }, ``` -For an OAuth 2 token, the only fully mutable fields are `scope` and `description`. The `application` -field is *immutable on update*, and all other fields are totally immutable, and will be auto-populated -during creation -* `user` field corresponds to the user the token is created for +For an OAuth 2 token, the only fully mutable fields are `scope` and `description`. The `application` +field is *immutable on update*, and all other fields are totally immutable, and will be auto-populated +during creation. +* `user` - this field corresponds to the user the token is created for * `expires` will be generated according to Tower configuration setting `OAUTH2_PROVIDER` * `token` and `refresh_token` will be auto-generated to be non-clashing random strings. -Both application tokens and personal access tokens will be shown at the `/api/v2/tokens/` + +Both application tokens and personal access tokens will be shown at the `/api/v2/tokens/` endpoint. Personal access tokens can be identified by the `application` field being `null`. -On RBAC side: +**On RBAC side:** - A user will be able to create a token if they are able to see the related application; -- System admin is able to see and manipulate every token in the system; +- The System Administrator is able to see and manipulate every token in the system; - Organization admins will be able to see and manipulate all tokens belonging to Organization members; System Auditors can see all tokens and applications @@ -196,7 +197,7 @@ curl -H "Authorization: Bearer kqHqxfpHGRRBXLNCOXxT5Zt3tpJogn" http:///api/ According to OAuth 2 specification, users should be able to acquire, revoke and refresh an access token. In AWX the equivalent, and easiest, way of doing that is creating a token, deleting -a token, and deleting a token quickly followed by creating a new one. +a token, and deleting a token quickly followed by creating a new one. The specification also provides standard ways of doing this. RFC 6749 elaborates on those topics, but in summary, an OAuth 2 token is officially acquired via authorization using @@ -211,7 +212,9 @@ endpoints under `/api/o/` endpoint. Detailed examples on the most typical usage are available as description text of `/api/o/`. See below for information on Application Access Token usage. 
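+For instance, a personal access token can be created with a single POST to the `/api/v2/tokens/` endpoint (the host and credentials below are placeholders):
+```bash
+# A null "application" is what marks this as a personal access token.
+curl -ku admin:password -X POST \
+    -H "Content-Type: application/json" \
+    -d '{"description": "My CLI token", "application": null, "scope": "write"}' \
+    https://awx.example.org/api/v2/tokens/
+```
+As with the examples above, the token value in the response is only shown once.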
> Note: The `/api/o/` endpoints can only be used for application tokens, and are not valid for personal access tokens. -#### Token scope mask over RBAC system + +#### Token Scope Mask Over RBAC System + The scope of an OAuth 2 token is a space-separated string composed of keywords like 'read' and 'write'. These keywords are configurable and used to specify permission level of the authenticated API client. For the initial OAuth 2 implementation, we use the most simple scope configuration, where the only @@ -225,7 +228,7 @@ For example, if a user has admin permission to a job template, he/she can both s and delete the job template if authenticated via session or basic auth. On the other hand, if the user is authenticated using OAuth 2 token, and the related token scope is 'read', the user can only see but not manipulate or launch the job template, despite being an admin. If the token scope is -'write' or 'read write', she can take full advantage of the job template as its admin. Note, that 'write' +'write' or 'read write', she can take full advantage of the job template as its admin. Note that 'write' implies 'read' as well. @@ -235,14 +238,15 @@ This page lists OAuth 2 utility endpoints used for authorization, token refresh Note endpoints other than `/api/o/authorize/` are not meant to be used in browsers and do not support HTTP GET. The endpoints here strictly follow [RFC specs for OAuth 2](https://tools.ietf.org/html/rfc6749), so please use that for detailed -reference. Here we give some examples to demonstrate the typical usage of these endpoints in -AWX context (Note AWX net location default to `http://localhost:8013` in examples): +reference. Below are some examples to demonstrate the typical usage of these endpoints in +AWX context (note that the AWX net location defaults to `http://localhost:8013` in these examples). -#### Application using `authorization code` grant type +#### Application Using `authorization code` Grant Type + This application grant type is intended to be used when the application is executing on the server. To create -an application named `AuthCodeApp` with the `authorization-code` grant type, -Make a POST to the `/api/v2/applications/` endpoint. +an application named `AuthCodeApp` with the `authorization-code` grant type, +make a POST to the `/api/v2/applications/` endpoint: ```text { "name": "AuthCodeApp", @@ -253,21 +257,22 @@ Make a POST to the `/api/v2/applications/` endpoint. "skip_authorization": false } ``` -You can test the authorization flow out with this new application by copying the client_id and URI link into the -homepage [here](http://django-oauth-toolkit.herokuapp.com/consumer/) and click submit. This is just a simple test -application Django-oauth-toolkit provides. +You can test the authorization flow out with this new application by copying the `client_id` and URI link into the +homepage [here](http://django-oauth-toolkit.herokuapp.com/consumer/) and click submit. This is just a simple test +application `Django-oauth-toolkit` provides. -From the client app, the user makes a GET to the Authorize endpoint with the `response_type`, +From the client app, the user makes a GET to the Authorize endpoint with the `response_type`, `client_id`, `redirect_uris`, and `scope`. AWX will respond with the authorization `code` and `state` -to the redirect_uri specified in the application. The client application will then make a POST to the -`api/o/token/` endpoint on AWX with the `code`, `client_id`, `client_secret`, `grant_type`, and `redirect_uri`. 
+to the `redirect_uri` specified in the application. The client application will then make a POST to the +`api/o/token/` endpoint on AWX with the `code`, `client_id`, `client_secret`, `grant_type`, and `redirect_uri`. AWX will respond with the `access_token`, `token_type`, `refresh_token`, and `expires_in`. For more information on testing this flow, refer to [django-oauth-toolkit](http://django-oauth-toolkit.readthedocs.io/en/latest/tutorial/tutorial_01.html#test-your-authorization-server). -#### Application using `password` grant type +#### Application Using `password` Grant Type + This is also called the `resource owner credentials grant`. This is for use by users who have -native access to the web app. This should be used when the client is the Resource owner. Suppose +native access to the web app. This should be used when the client is the Resource owner. Suppose we have an application `Default Application` with grant type `password`: ```text { @@ -285,7 +290,7 @@ we have an application `Default Application` with grant type `password`: } ``` -Log in is not required for `password` grant type, so we can simply use `curl` to acquire a personal access token +Login is not required for `password` grant type, so we can simply use `curl` to acquire a personal access token via `/api/o/token/`: ```bash curl -X POST \ @@ -294,12 +299,12 @@ curl -X POST \ IaUBsaVDgt2eiwOGe0bg5m5vCSstClZmtdy359RVx2rQK5YlIWyPlrolpt2LEpVeKXWaiybo" \ http:///api/o/token/ -i ``` -In the above post request, parameters `username` and `password` are username and password of the related +In the above POST request, parameters `username` and `password` are the username and password of the related AWX user of the underlying application, and the authentication information is of format `:`, where `client_id` and `client_secret` are the corresponding fields of underlying application. -Upon success, access token, refresh token and other information are given in the response body in JSON +Upon success, the access token, refresh token and other information are given in the response body in JSON format: ```text HTTP/1.1 200 OK @@ -317,9 +322,11 @@ Strict-Transport-Security: max-age=15768000 {"access_token": "9epHOqHhnXUcgYK8QanOmUQPSgX92g", "token_type": "Bearer", "expires_in": 315360000000, "refresh_token": "jMRX6QvzOTf046KHee3TU5mT3nyXsz", "scope": "read"} ``` + ## Token Functions -#### Refresh an existing access token +#### Refresh an Existing Access Token + Suppose we have an existing access token with refresh token provided: ```text { @@ -334,14 +341,14 @@ Suppose we have an existing access token with refresh token provided: "scope": "read write" } ``` -The `/api/o/token/` endpoint is used for refreshing access token: +The `/api/o/token/` endpoint is used for refreshing the access token: ```bash curl -X POST \ -d "grant_type=refresh_token&refresh_token=AL0NK9TTpv0qp54dGbC4VUZtsZ9r8z" \ -u "gwSPoasWSdNkMDtBN3Hu2WYQpPWCO9SwUEsKK22l:fI6ZpfocHYBGfm1tP92r0yIgCyfRdDQt0Tos9L8a4fNsJjQQMwp9569eIaUBsaVDgt2eiwOGe0bg5m5vCSstClZmtdy359RVx2rQK5YlIWyPlrolpt2LEpVeKXWaiybo" \ http:///api/o/token/ -i ``` -In the above post request, `refresh_token` is provided by `refresh_token` field of the access token +In the above POST request, `refresh_token` is provided by `refresh_token` field of the access token above. The authentication information is of format `:`, where `client_id` and `client_secret` are the corresponding fields of underlying related application of the access token. 
@@ -364,12 +371,14 @@ Strict-Transport-Security: max-age=15768000 ``` Internally, the refresh operation deletes the existing token and a new token is created immediately after, with information like scope and related application identical to the original one. We can -verify by checking the new token is present and the old token is deleted at the /api/v2/tokens/ endpoint. +verify by checking the new token is present and the old token is deleted at the `/api/v2/tokens/` endpoint. -#### Revoke an access token -##### Alternatively Revoke using the /api/o/revoke-token/ endpoint -Revoking an access token by this method is the same as deleting the token resource object, but it allows you to delete a token by providing its token value, and the associated `client_id` (and `client_secret` if the application is `confidential`). For example: +#### Revoke an Access Token + +##### Alternatively Revoke Using the /api/o/revoke-token/ Endpoint + +Revoking an access token by this method is the same as deleting the token resource object, but it allows you to delete a token by providing its token value, and the associated `client_id` (and `client_secret` if the application is `confidential`). For example: ```bash curl -X POST -d "token=rQONsve372fQwuc2pn76k3IHDCYpi7" \ -u "gwSPoasWSdNkMDtBN3Hu2WYQpPWCO9SwUEsKK22l:fI6ZpfocHYBGfm1tP92r0yIgCyfRdDQt0Tos9L8a4fNsJjQQMwp9569eIaUBsaVDgt2eiwOGe0bg5m5vCSstClZmtdy359RVx2rQK5YlIWyPlrolpt2LEpVeKXWaiybo" \ @@ -377,17 +386,12 @@ curl -X POST -d "token=rQONsve372fQwuc2pn76k3IHDCYpi7" \ ``` `200 OK` means a successful delete. -We can verify the effect by checking if the token is no longer present -at /api/v2/tokens/. - - - - - - +We can verify the effect by checking if the token is no longer present +at `/api/v2/tokens/`. ## Acceptance Criteria + * All CRUD operations for OAuth 2 applications and tokens should function as described. * RBAC rules applied to OAuth 2 applications and tokens should behave as described. * A default application should be auto-created for each new user. @@ -396,4 +400,4 @@ at /api/v2/tokens/. * Token scope mask over RBAC should work as described. * Tower configuration setting `OAUTH2_PROVIDER` should be configurable and function as described. * `/api/o/` endpoint should work as expected. In specific, all examples given in the description - help text should be working (user following the steps should get expected result). + help text should be working (a user following the steps should get expected result). diff --git a/docs/auth/saml.md b/docs/auth/saml.md index a2aa31d4e9..7576c3676c 100644 --- a/docs/auth/saml.md +++ b/docs/auth/saml.md @@ -1,20 +1,23 @@ # SAML -Security Assertion Markup Language, or SAML, is an open standard for exchanging authentication and/or authorization data between an identity provider (i.e. LDAP) and a service provider (i.e. AWX). More concretely, AWX can be configured to talk with SAML in order to authenticate (create/login/logout) users of AWX. User Team and Organization membership can be embedded in the SAML response to AWX. +Security Assertion Markup Language, or SAML, is an open standard for exchanging authentication and/or authorization data between an identity provider (*i.e.*, LDAP) and a service provider (*i.e.*, AWX). More concretely, AWX can be configured to talk with SAML in order to authenticate (create/login/logout) users of AWX. User Team and Organization membership can be embedded in the SAML response to AWX. 
+ # Configure SAML Authentication -Please see the Tower documentation as well as Ansible blog posts for basic SAML configuration. Note that AWX's SAML implementation relies on python-social-auth which uses python-saml. AWX exposes 3 fields that are directly passed to the lower libraries: +Please see the [Tower documentation](https://docs.ansible.com/ansible-tower/latest/html/administration/ent_auth.html#saml-authentication-settings) as well as the [Ansible blog post](https://www.ansible.com/blog/using-saml-with-red-hat-ansible-tower) for basic SAML configuration. Note that AWX's SAML implementation relies on `python-social-auth` which uses `python-saml`. AWX exposes three fields which are directly passed to the lower libraries: * `SOCIAL_AUTH_SAML_SP_EXTRA` is passed to the `python-saml` library configuration's `sp` setting. * `SOCIAL_AUTH_SAML_SECURITY_CONFIG` is passed to the `python-saml` library configuration's `security` setting. * `SOCIAL_AUTH_SAML_EXTRA_DATA` See http://python-social-auth-docs.readthedocs.io/en/latest/backends/saml.html#advanced-settings for more information. -# Configure SAML for Team and Organization Membership -AWX can be configured to look for particular attributes that contain AWX Team and Organization membership to associate with users when they login to AWX. The attribute names are defined in AWX settings. Specifically, the authentication settings tab and SAML sub category fields *SAML Team Attribute Mapping* and *SAML Organization Attribute Mapping*. The meaning and usefulness of these settings is best motivated through example. -**Example SAML Organization Attribute Mapping** +# Configure SAML for Team and Organization Membership +AWX can be configured to look for particular attributes that contain AWX Team and Organization membership to associate with users when they log in to AWX. The attribute names are defined in AWX settings. Specifically, the authentication settings tab and SAML sub category fields *SAML Team Attribute Mapping* and *SAML Organization Attribute Mapping*. The meaning and usefulness of these settings is best communicated through example. + +### Example SAML Organization Attribute Mapping + Below is an example SAML attribute that embeds user organization membership in the attribute *member-of*. ``` - + Engineering IT @@ -25,9 +28,9 @@ Below is an example SAML attribute that embeds user organization membership in t IT HR - + ``` -Below, the corresponding AWX configuration. +Below, the corresponding AWX configuration: ``` { "saml_attr": "member-of", @@ -36,16 +39,16 @@ Below, the corresponding AWX configuration. 'remove_admins': true } ``` -**saml_attr:** The saml attribute name where the organization array can be found. +**saml_attr:** The SAML attribute name where the organization array can be found. -**remove:** True to remove user from all organizations before adding the user to the list of Organizations. False to keep the user in whatever Organization(s) they are in while adding the user to the Organization(s) in the SAML attribute. +**remove:** Set this to `true` to remove a user from all organizations before adding the user to the list of Organizations. Set it to `false` to keep the user in whatever Organization(s) they are in while adding the user to the Organization(s) in the SAML attribute. -**saml_admin_attr:** The saml attribute name where the organization administrators array can be found. +**saml_admin_attr:** The SAML attribute name where the organization administrators' array can be found. 
-**remove_admins:** True to remove user from all organizations that it is admin before adding the user to the list of Organizations admins. False to keep the user in whatever Organization(s) they are in as admin while adding the user as an Organization administrator in the SAML attribute. +**remove_admins:** Set this to `true` to remove a user from all organizations that they are administrators of before adding the user to the list of Organizations admins. Set it to `false` to keep the user in whatever Organization(s) they are in as admin while adding the user as an Organization administrator in the SAML attribute. -**Example SAML Team Attribute Mapping** -Below is another example of a SAML attribute that contains a Team membership in a list. +### Example SAML Team Attribute Mapping +Below is another example of a SAML attribute that contains a Team membership in a list: ``` ", "organization": "" }` that defines mapping from AWX Team -> AWX Organization. This is needed because the same named Team can exist in multiple Organizations in Tower. The organization to which a team listed in a SAML attribute belongs to would be ambiguous without this mapping. +**team_org_map:** An array of dictionaries of the form `{ "team": "", "organization": "" }` which defines mapping from AWX Team -> AWX Organization. This is needed because the same named Team can exist in multiple Organizations in Tower. The organization to which a team listed in a SAML attribute belongs to would be ambiguous without this mapping. diff --git a/docs/capacity.md b/docs/capacity.md index 26372790e6..4e53f29d5b 100644 --- a/docs/capacity.md +++ b/docs/capacity.md @@ -1,7 +1,7 @@ ## Ansible Tower Capacity Determination and Job Impact The Ansible Tower capacity system determines how many jobs can run on an Instance given the amount of resources -available to the Instance and the size of the jobs that are running (referred herafter as `Impact`). +available to the Instance and the size of the jobs that are running (referred to hereafter as `Impact`). The algorithm used to determine this is based entirely on two things: * How much memory is available to the system (`mem_capacity`) @@ -11,72 +11,74 @@ Capacity also impacts Instance Groups. Since Groups are composed of Instances, l assigned to multiple Groups. This means that impact to one Instance can potentially affect the overall capacity of other Groups. -Instance Groups (not Instances themselves) can be assigned to be used by Jobs at various levels (see clustering.md). -When the Task Manager is preparing its graph to determine which Group a Job will run on it will commit the capacity of -an Instance Group to a job that hasn't or isn't ready to start yet. (see task_manager_system.md) +Instance Groups (not Instances themselves) can be assigned to be used by Jobs at various levels (see [Tower Clustering/HA Overview](https://github.com/ansible/awx/blob/devel/docs/clustering.md)). +When the Task Manager is preparing its graph to determine which Group a Job will run on, it will commit the capacity of +an Instance Group to a Job that hasn't or isn't ready to start yet (see [Task Manager Overview](https://github.com/ansible/awx/blob/devel/docs/task_manager_system.md)). -Finally, if only one Instance is available, in smaller configurations, for a Job to run the Task Manager will allow that -Job to run on the Instance even if it would push the Instance over capacity. We do this as a way to guarantee that Jobs -themselves won't get clogged as a result of an under provisioned system. 
+Finally, if only one Instance is available (especially in smaller configurations) for a Job to run, the Task Manager will allow that +Job to run on the Instance even if it would push the Instance over capacity. We do this as a way to guarantee that jobs +themselves won't get clogged as a result of an under-provisioned system. + +These concepts mean that, in general, Capacity and Impact is not a zero-sum system relative to Jobs and Instances/Instance Groups. -These concepts mean that, in general, Capacity and Impact is not a zero-sum system relative to Jobs and Instances/Instance Groups ### Resource Determination For Capacity Algorithm - -The capacity algorithms are defined in order to determine how many `forks` a system is capable of running simultaneously. This controls how +The capacity algorithms are defined in order to determine how many `forks` a system is capable of running at the same time. This controls how many systems Ansible itself will communicate with simultaneously. Increasing the number of forks a Tower system is running will, in general, -allow jobs to run faster by performing more work in parallel. The tradeoff is that will increase the load on the system which could cause work +allow jobs to run faster by performing more work in parallel. The tradeoff is that this will increase the load on the system which could cause work to slow down overall. Tower can operate in two modes when determining capacity. `mem_capacity` (the default) will allow you to overcommit CPU resources while protecting the system -from running out of memory. If most of your work is not cpu-bound then selecting this mode will maximize the number of forks. +from running out of memory. If most of your work is not CPU-bound, then selecting this mode will maximize the number of forks. + #### Memory Relative Capacity -`mem_capacity` is calculated relative to the amount of memory needed per-fork. Taking into account the overhead for Tower's internal components this comes out -to be about `100MB` per-fork. When considering the amount of memory available to Ansible jobs the capacity algorithm will reserve 2GB of memory to account +`mem_capacity` is calculated relative to the amount of memory needed per-fork. Taking into account the overhead for Tower's internal components, this comes out +to be about `100MB` per fork. When considering the amount of memory available to Ansible jobs the capacity algorithm will reserve 2GB of memory to account for the presence of other Tower services. The algorithm itself looks like this: (mem - 2048) / mem_per_fork - + As an example: (4096 - 2048) / 100 == ~20 - + So a system with 4GB of memory would be capable of running 20 forks. The value `mem_per_fork` can be controlled by setting the Tower settings value (or environment variable) `SYSTEM_TASK_FORKS_MEM` which defaults to `100`. -#### CPU Relative Capacity -Often times Ansible workloads can be fairly cpu-bound. In these cases sometimes reducing the simultaneous workload allows more tasks to run faster and reduces +#### CPU-Relative Capacity + +Often times Ansible workloads can be fairly CPU-bound. In these cases, sometimes reducing the simultaneous workload allows more tasks to run faster and reduces the average time-to-completion of those jobs. -Just as the Tower `mem_capacity` algorithm uses the amount of memory need per-fork, the `cpu_capacity` algorithm looks at the amount of cpu resources is needed -per fork. The baseline value for this is `4` forks per-core. 
The algorithm itself looks like this: +Just as the Tower `mem_capacity` algorithm uses the amount of memory needed per-fork, the `cpu_capacity` algorithm looks at the amount of CPU resources needed +per fork. The baseline value for this is `4` forks per core. The algorithm itself looks like this: cpus * fork_per_cpu - -For example a 4-core system: + +For example, in a 4-core system: 4 * 4 == 16 - -The value `fork_per_cpu` can be controlled by setting the Tower settings value (or environment variable) `SYSTEM_TASK_FORKS_CPU` which defaults to `4`. + +The value `fork_per_cpu` can be controlled by setting the Tower settings value (or environment variable) `SYSTEM_TASK_FORKS_CPU`, which defaults to `4`. ### Job Impacts Relative To Capacity -When selecting the capacity it's important to understand how each job type affects capacity. +When selecting the capacity, it's important to understand how each job type affects it. It's helpful to understand what `forks` mean to Ansible: http://docs.ansible.com/ansible/latest/intro_configuration.html#forks -The default forks value for ansible is `5`. However, if Tower knows that you're running against fewer systems than that then the actual concurrency value +The default forks value for Ansible is `5`. However, if Tower knows that you're running against fewer systems than that, then the actual concurrency value will be lower. -When a job is run, Tower will add `1` to the number of forks selected to compensate for the Ansible parent process. So if you are running a playbook against `5` -systems with a `forks` value of `5` then the actual `forks` value from the perspective of Job Impact will be 6. +When a job is run, Tower will add `1` to the number of forks selected to compensate for the Ansible parent process. So if you are running a playbook against `5` +systems with a `forks` value of `5`, then the actual `forks` value from the perspective of Job Impact will be 6. -#### Impact of Job types in Tower +#### Impact of Job Types in Tower -Jobs and Ad-hoc jobs follow the above model `forks + 1`. +Jobs and Ad-hoc jobs follow the above model, `forks + 1`. Other job types have a fixed impact: @@ -84,16 +86,15 @@ Other job types have a fixed impact: * Project Updates: 1 * System Jobs: 5 -### Selecting the right capacity +### Selecting the Right Capacity -Selecting between a `memory` focused capacity algorithm and a `cpu` focused capacity for your Tower use means you'll be selecting between a minimum -and maximum value. In the above examples the CPU capacity would allow a maximum of 16 forks while the Memory capacity would allow 20. For some systems -the disparity between these can be large and often times you may want to have a balance between these two. +Selecting between a memory-focused capacity algorithm and a CPU-focused capacity for your Tower use means you'll be selecting between a minimum +and maximum value. In the above examples, the CPU capacity would allow a maximum of 16 forks while the Memory capacity would allow 20. For some systems, +the disparity between these can be large and oftentimes you may want to have a balance between these two. -An `Instance` field `capacity_adjustment` allows you to select how much of one or the other you want to consider. It is represented as a value between 0.0 -and 1.0. If set to a value of `1.0` then the largest value will be used.
In the above example, that would be Memory capacity so a value of `20` forks would +An Instance field, `capacity_adjustment`, allows you to select how much of one or the other you want to consider. It is represented as a value between `0.0` +and `1.0`. If set to a value of `1.0`, then the largest value will be used. In the above example, that would be Memory capacity, so a value of `20` forks would be selected. If set to a value of `0.0` then the smallest value will be used. A value of `0.5` would be a 50/50 balance between the two algorithms which would be `18`: 16 + (20 - 16) * 0.5 == 18 - diff --git a/docs/clustering.md b/docs/clustering.md index 41e9e44bc7..b7d1ed4d82 100644 --- a/docs/clustering.md +++ b/docs/clustering.md @@ -85,9 +85,9 @@ hostC rabbitmq_host=10.1.0.3 - `rabbitmq_use_long_names` - RabbitMQ is pretty sensitive to what each instance is named. We are flexible enough to allow FQDNs (_host01.example.com_), short names (`host01`), or IP addresses (192.168.5.73). Depending on what is used to identify each host in the `inventory` file, this value may need to be changed. For FQDNs and IP addresses, this value needs to be `true`. For short names it should be `false` - `rabbitmq_enable_manager` - Setting this to `true` will expose the RabbitMQ management web console on each instance. -The most important field to point out for variability is `rabbitmq_use_long_name`. This cannot be detected and no reasonable default is provided for it, so it's important to point out when it needs to be changed. If instances are provisioned to where they reference other instances internally and not on external addresses then `rabbitmq_use_long_name` semantics should follow the internal addressing (aka `rabbitmq_host`). +The most important field to point out for variability is `rabbitmq_use_long_name`. This cannot be detected and no reasonable default is provided for it, so it's important to point out when it needs to be changed. If instances are provisioned to where they reference other instances internally and not on external addresses, then `rabbitmq_use_long_name` semantics should follow the internal addressing (*i.e.*, `rabbitmq_host`). -Other than `rabbitmq_use_long_name` the defaults are pretty reasonable: +Other than `rabbitmq_use_long_name`, the defaults are pretty reasonable: ``` rabbitmq_port=5672 rabbitmq_vhost=tower @@ -105,9 +105,9 @@ Recommendations and constraints: - Do not name any instance the same as a group name. -### Security Isolated Rampart Groups +### Security-Isolated Rampart Groups -In Tower versions 3.2+ customers may optionally define isolated groups inside of security-restricted networking zones from which to run jobs and ad hoc commands. Instances in these groups will _not_ have a full install of Tower, but will have a minimal set of utilities used to run jobs. Isolated groups must be specified in the inventory file prefixed with `isolated_group_`. An example inventory file is shown below: +In Tower versions 3.2+, customers may optionally define isolated groups inside of security-restricted networking zones from which to run jobs and ad hoc commands. Instances in these groups will _not_ have a full install of Tower, but will have a minimal set of utilities used to run jobs. Isolated groups must be specified in the inventory file prefixed with `isolated_group_`. 
An example inventory file is shown below: ``` [tower] @@ -154,18 +154,18 @@ Recommendations for system configuration with isolated groups: Isolated Instance Authentication -------------------------------- -By default - at installation time - a randomized RSA key is generated and distributed as an authorized key to all "isolated" instances. The private half of the key is encrypted and stored within Tower, and is used to authenticat from "controller" instances to "isolated" instances when jobs are run. +At installation time, by default, a randomized RSA key is generated and distributed as an authorized key to all "isolated" instances. The private half of the key is encrypted and stored within Tower, and is used to authenticate from "controller" instances to "isolated" instances when jobs are run. -For users who wish to manage SSH authentication from controlling instances to isolated instances via some system _outside_ of Tower (such as externally-managed passwordless SSH keys), this behavior can be disabled by unsetting two Tower API settings values: +For users who wish to manage SSH authentication from controlling instances to isolated instances via some system _outside_ of Tower (such as externally-managed, password-less SSH keys), this behavior can be disabled by unsetting two Tower API settings values: `HTTP PATCH /api/v2/settings/jobs/ {'AWX_ISOLATED_PRIVATE_KEY': '', 'AWX_ISOLATED_PUBLIC_KEY': ''}` ### Provisioning and Deprovisioning Instances and Groups -* **Provisioning** - Provisioning Instances after installation is supported by updating the `inventory` file and re-running the setup playbook. It's important that this file contain all passwords and information used when installing the cluster or other instances may be reconfigured (this could be intentional). +* **Provisioning** - Provisioning Instances after installation is supported by updating the `inventory` file and re-running the setup playbook. It's important that this file contain all passwords and information used when installing the cluster, or other instances may be reconfigured (this can be done intentionally). -* **Deprovisioning** - Tower does not automatically de-provision instances since it cannot distinguish between an instance that was taken offline intentionally or due to failure. Instead the procedure for deprovisioning an instance is to shut it down (or stop the `ansible-tower-service`) and run the Tower deprovision command: +* **Deprovisioning** - Tower does not automatically de-provision instances since it cannot distinguish between an instance that was taken offline intentionally or due to failure. Instead, the procedure for de-provisioning an instance is to shut it down (or stop the `ansible-tower-service`) and run the Tower de-provision command: ``` $ awx-manage deprovision_instance --hostname= @@ -179,7 +179,7 @@ $ awx-manage unregister_queue --queuename= ### Configuring Instances and Instance Groups from the API -Instance Groups can be created by posting to `/api/v2/instance_groups` as a System Admin. +Instance Groups can be created by posting to `/api/v2/instance_groups` as a System Administrator. Once created, `Instances` can be associated with an Instance Group with: @@ -205,12 +205,13 @@ Instance Group Policies are controlled by three optional fields on an `Instance * `Instances` that are assigned directly to `Instance Groups` by posting to `/api/v2/instance_groups/x/instances` or `/api/v2/instances/x/instance_groups` are automatically added to the `policy_instance_list`. 
This means they are subject to the normal caveats for `policy_instance_list` and must be manually managed. -* `policy_instance_percentage` and `policy_instance_minimum` work together. For example, if you have a `policy_instance_percentage` of 50% and a `policy_instance_minimum` of 2 and you start 6 `Instances`, 3 of them would be assigned to the `Instance Group`. If you reduce the number of `Instances` to 2 then both of them would be assigned to the `Instance Group` to satisfy `policy_instance_minimum`. In this way, you can set a lower bound on the amount of available resources. +* `policy_instance_percentage` and `policy_instance_minimum` work together. For example, if you have a `policy_instance_percentage` of 50% and a `policy_instance_minimum` of 2 and you start 6 `Instances`, 3 of them would be assigned to the `Instance Group`. If you reduce the number of `Instances` to 2, then both of them would be assigned to the `Instance Group` to satisfy `policy_instance_minimum`. In this way, you can set a lower bound on the amount of available resources. * Policies don't actively prevent `Instances` from being associated with multiple `Instance Groups` but this can effectively be achieved by making the percentages sum to 100. If you have 4 `Instance Groups`, assign each a percentage value of 25 and the `Instances` will be distributed among them with no overlap. ### Manually Pinning Instances to Specific Groups + If you have a special `Instance` which needs to be _exclusively_ assigned to a specific `Instance Group` but don't want it to automatically join _other_ groups via "percentage" or "minimum" policies: 1. Add the `Instance` to one or more `Instance Group`s' `policy_instance_list`. @@ -243,6 +244,7 @@ Tower itself reports as much status as it can via the API at `/api/v2/ping` in o A more detailed view of Instances and Instance Groups, including running jobs and membership information can be seen at `/api/v2/instances/` and `/api/v2/instance_groups`. + ### Instance Services and Failure Behavior Each Tower instance is made up of several different services working collaboratively: @@ -253,14 +255,14 @@ Each Tower instance is made up of several different services working collaborati * **RabbitMQ** - A Message Broker, this is used as a signaling mechanism for Celery as well as any event data propagated to the application. * **Memcached** - A local caching service for the instance it lives on. -Tower is configured in such a way that if any of these services or their components fail, then all services are restarted. If these fail sufficiently often in a short span of time, then the entire instance will be placed offline in an automated fashion in order to allow remediation without causing unexpected behavior. +Tower is configured in such a way that if any of these services or their components fail, then all services are restarted. If these fail often enough in a short span of time, then the entire instance will be placed offline in an automated fashion in order to allow remediation without causing unexpected behavior. ### Job Runtime Behavior Ideally a regular user of Tower should not notice any semantic difference to the way jobs are run and reported. Behind the scenes it is worth pointing out the differences in how the system behaves. -When a job is submitted from the API interface it gets pushed into the Celery queue on RabbitMQ.
A single RabbitMQ instance is the responsible master for individual queues, but each Tower instance will connect to and receive jobs from that queue using a Fair scheduling algorithm. Any instance on the cluster is just as likely to receive the work and execute the task. If an instance fails while executing jobs, then the work is marked as permanently failed. +When a job is submitted from the API interface, it gets pushed into the Dispatcher queue on RabbitMQ. A single RabbitMQ instance is the responsible master for individual queues, but each Tower instance will connect to and receive jobs from that queue using a fair-share scheduling algorithm. Any instance on the cluster is just as likely to receive the work and execute the task. If an instance fails while executing jobs, then the work is marked as permanently failed. If a cluster is divided into separate Instance Groups, then the behavior is similar to the cluster as a whole. If two instances are assigned to a group then either one is just as likely to receive a job as any other in the same group. @@ -270,60 +272,56 @@ It's important to note that not all instances are required to be provisioned wit If an Instance Group is configured but all instances in that group are offline or unavailable, any jobs that are launched targeting only that group will be stuck in a waiting state until instances become available. Fallback or backup resources should be provisioned to handle any work that might encounter this scenario. -#### Project synchronization behavior +#### Project Synchronization Behavior -Project updates behave differently than they did before. Previously they were ordinary jobs that ran on a single instance. It's now important that they run successfully on any instance that could potentially run a job. Projects will sync themselves to the correct version on the instance immediately prior to running the job. If the needed revision is already locally checked out and galaxy or collections updates are not needed, then a sync may not be performed. +Project updates behave differently than they did before. Previously they were ordinary jobs that ran on a single instance. It's now important that they run successfully on any instance that could potentially run a job. Projects will sync themselves to the correct version on the instance immediately prior to running the job. If the needed revision is already locally checked out and Galaxy or Collections updates are not needed, then a sync may not be performed. When the sync happens, it is recorded in the database as a project update with a `launch_type` of "sync" and a `job_type` of "run". Project syncs will not change the status or version of the project; instead, they will update the source tree _only_ on the instance where they run. The only exception to this behavior is when the project is in the "never updated" state (meaning that no project updates of any type have been run), in which case a sync should fill in the project's initial revision and status, and subsequent syncs should not make such changes. -#### Controlling where a particular job runs +#### Controlling Where a Particular Job Runs By default, a job will be submitted to the `tower` queue, meaning that it can be picked up by any of the workers. -##### How to restrict the instances a job will run on +##### How to Restrict the Instances a Job Will Run On -If any of the job template, inventory, -or organization has instance groups associated with them, a job run from that job template will not be eligible for the default behavior. 
That means that if all of the instance associated with these three resources are out of capacity, the job will remain in the `pending` state until capacity frees up. +If the Job Template, Inventory, or Organization has instance groups associated with it, a job run from that Job Template will not be eligible for the default behavior. This means that if all of the instances associated with these three resources are out of capacity, the job will remain in the `pending` state until capacity frees up. -##### How to set up a preferred instance group +##### How to Set Up a Preferred Instance Group -The order of preference in determining which instance group to which the job gets submitted is as follows: +The order of preference in determining which instance group the job gets submitted to is as follows: 1. Job Template 2. Inventory 3. Organization (by way of Inventory) -To expand further: If instance groups are associated with the job template and all of them are at capacity, then the job will be submitted to instance groups specified on inventory, and then organization. +To expand further: If instance groups are associated with the Job Template and all of them are at capacity, then the job will be submitted to instance groups specified on Inventory, and then Organization. The global `tower` group can still be associated with a resource, just like any of the custom instance groups defined in the playbook. This can be used to specify a preferred instance group on the job template or inventory, but still allow the job to be submitted to any instance if those are out of capacity. #### Instance Enable / Disable -In order to support temporarily taking an `Instance` offline there is a boolean property `enabled` defined on each instance. +In order to support temporarily taking an `Instance` offline, there is a boolean property `enabled` defined on each instance. -When this property is disabled no jobs will be assigned to that `Instance`. Existing jobs will finish but no new work will be -assigned. +When this property is disabled, no jobs will be assigned to that `Instance`. Existing jobs will finish but no new work will be assigned. ## Acceptance Criteria -When verifying acceptance we should ensure the following statements are true +When verifying acceptance, we should ensure that the following statements are true: * Tower should install as a standalone Instance * Tower should install in a Clustered fashion -* Instance should, optionally, be able to be grouped arbitrarily into different Instance Groups -* Capacity should be tracked at the group level and capacity impact should make sense relative to what instance a job is - running on and what groups that instance is a member of. +* Instances should, optionally, be able to be grouped arbitrarily into different Instance Groups +* Capacity should be tracked at the group level and capacity impact should make sense relative to what instance a job is running on and what groups that instance is a member of * Provisioning should be supported via the setup playbook * De-provisioning should be supported via a management command * All jobs, inventory updates, and project updates should run successfully -* Jobs should be able to run on hosts which it is targeted. If assigned implicitly or directly to groups then it should - only run on instances in those Instance Groups.
+* Jobs should be able to run on hosts for which they are targeted; if assigned implicitly or directly to groups, then they should only run on instances in those Instance Groups * Project updates should manifest their data on the host that will run the job immediately prior to the job running * Tower should be able to reasonably survive the removal of all instances in the cluster -* Tower should behave in a predictable fashiong during network partitioning +* Tower should behave in a predictable fashion during network partitioning ## Testing Considerations @@ -331,39 +329,30 @@ When verifying acceptance we should ensure the following statements are true * Basic playbook testing to verify routing differences, including: - Basic FQDN - Short-name name resolution - - ip addresses - - /etc/hosts static routing information -* We should test behavior of large and small clusters. I would envision small clusters as 2 - 3 instances and large - clusters as 10 - 15 instances -* Failure testing should involve killing single instances and killing multiple instances while the cluster is performing work. - Job failures during the time period should be predictable and not catastrophic. -* Instance downtime testing should also include recoverability testing. Killing single services and ensuring the system can - return itself to a working state -* Persistent failure should be tested by killing single services in such a way that the cluster instance cannot be recovered - and ensuring that the instance is properly taken offline -* Network partitioning failures will be important also. In order to test this + - IP addresses + - `/etc/hosts` static routing information +* We should test behavior of large and small clusters; small clusters usually consist of 2 - 3 instances and large clusters have 10 - 15 instances. +* Failure testing should involve killing single instances and killing multiple instances while the cluster is performing work. Job failures during the time period should be predictable and not catastrophic. +* Instance downtime testing should also include recoverability testing (killing single services and ensuring the system can return itself to a working state). +* Persistent failure should be tested by killing single services in such a way that the cluster instance cannot be recovered and ensuring that the instance is properly taken offline. +* Network partitioning failures will also be important. In order to test this: - Disallow a single instance from communicating with the other instances but allow it to communicate with the database - - Break the link between instances such that it forms 2 or more groups where groupA and groupB can't communicate but all instances - can communicate with the database. -* Crucially when network partitioning is resolved all instances should recover into a consistent state -* Upgrade Testing, verify behavior before and after are the same for the end user. -* Project Updates should be thoroughly tested for all scm types (git, svn, hg) and for manual projects. + - Break the link between instances such that it forms two or more groups where Group A and Group B can't communicate but all instances can communicate with the database. +* Crucially, when network partitioning is resolved, all instances should recover into a consistent state. +* Upgrade Testing - verify behavior before and after are the same for the end user. +* Project Updates should be thoroughly tested for all SCM types (`git`, `svn`, `hg`) and for manual projects. 
* Setting up instance groups in two scenarios: a) instances are shared between groups b) instances are isolated to particular groups - Organizations, Inventories, and Job Templates should be variously assigned to one or many groups and jobs should execute - in those groups in preferential order as resources are available. + Organizations, Inventories, and Job Templates should be variously assigned to one or many groups and jobs should execute in those groups in preferential order as resources are available. ## Performance Testing -Performance testing should be twofold. +Performance testing should be twofold: -* Large volume of simultaneous jobs. -* Jobs that generate a large amount of output. +* A large volume of simultaneous jobs +* Jobs that generate a large amount of output -These should also be benchmarked against the same playbooks using the 3.0.X Tower release and a stable Ansible version. -For a large volume playbook I might recommend a customer provided one that we've seen recently: +These should also be benchmarked against the same playbooks using the 3.0.X Tower release and a stable Ansible version. For a large volume playbook (*e.g.*, against 100+ hosts), something like the following is recommended: https://gist.github.com/michelleperz/fe3a0eb4eda888221229730e34b28b89 - -Against 100+ hosts. diff --git a/docs/collections.md b/docs/collections.md index a0fec218f3..23d805b74c 100644 --- a/docs/collections.md +++ b/docs/collections.md @@ -1,19 +1,18 @@ ## Collections -AWX supports using Ansible collections. -This section will give ways to use collections in job runs. +AWX supports the use of Ansible Collections. This section will give ways to use Collections in job runs. ### Project Collections Requirements -If you specify a collections requirements file in SCM at `collections/requirements.yml`, -then AWX will install collections in that file in the implicit project sync +If you specify a Collections requirements file in SCM at `collections/requirements.yml`, +then AWX will install Collections in that file in the implicit project sync before a job run. The invocation is: ``` ansible-galaxy collection install -r requirements.yml -p ``` -Example of tmp directory where job is running: +Example of `tmp` directory where job is running: ``` ├── project diff --git a/docs/credential_plugins.md b/docs/credentials/credential_plugins.md similarity index 92% rename from docs/credential_plugins.md rename to docs/credentials/credential_plugins.md index 01f57a235c..63a6594519 100644 --- a/docs/credential_plugins.md +++ b/docs/credentials/credential_plugins.md @@ -2,7 +2,7 @@ Credential Plugins ================== By default, sensitive credential values (such as SSH passwords, SSH private -keys, API tokens for cloud services) in AWX are stored in the AWX database +keys, API tokens for cloud services, etc.) in AWX are stored in the AWX database after being encrypted with a symmetric encryption cipher utilizing AES-256 in CBC mode alongside a SHA-256 HMAC. @@ -19,9 +19,9 @@ When configuring AWX to pull a secret from a third party system, there are generally three steps. Here is an example of creating an (1) AWX Machine Credential with -a static username, `example-user` and (2) an externally sourced secret from +a static username, `example-user` and (2) an externally-sourced secret from HashiCorp Vault Key/Value system which will populate the (3) password field on -the Machine Credential. +the Machine Credential: 1. Create the Machine Credential with a static username, `example-user`. 
@@ -29,13 +29,13 @@ the Machine Credential. secret management system (in this example, specifying a URL and an OAuth2.0 token _to access_ HashiCorp Vault) -3. _Link_ the `password` field for the Machine credential to the external - system by specifying the source (in this example, the HashiCorp credential) +3. _Link_ the `password` field for the Machine Credential to the external + system by specifying the source (in this example, the HashiCorp Credential) and metadata about the path (e.g., `/some/path/to/my/password/`). Note that you can perform these lookups on *any* field for any non-external credential, including those with custom credential types. You could just as -easily create an AWS credential and use lookups to retrieve the Access Key and +easily create an AWS Credential and use lookups to retrieve the Access Key and Secret Key from an external secret management system. External credentials cannot have lookups applied to their fields. @@ -150,10 +150,10 @@ HashiCorp Vault KV AWX supports retrieving secret values from HashiCorp Vault KV (https://www.vaultproject.io/api/secret/kv/) -The following example illustrates how to configure a Machine credential to pull -its password from an HashiCorp Vault: +The following example illustrates how to configure a Machine Credential to pull +its password from a HashiCorp Vault: -1. Look up the ID of the Machine and HashiCorp Vault Secret Lookup credential +1. Look up the ID of the Machine and HashiCorp Vault Secret Lookup Credential types (in this example, `1` and `15`): ```shell @@ -182,7 +182,7 @@ HTTP/1.1 200 OK ... ``` -2. Create a Machine and a HashiCorp Vault credential: +2. Create a Machine and a HashiCorp Vault Credential: ```shell ~ curl -sik "https://awx.example.org/api/v2/credentials/" \ @@ -214,7 +214,7 @@ HTTP/1.1 201 Created ... ``` -3. Link the Machine credential to the HashiCorp Vault credential: +3. Link the Machine Credential to the HashiCorp Vault Credential: ```shell ~ curl -sik "https://awx.example.org/api/v2/credentials/1/input_sources/" \ @@ -232,10 +232,10 @@ HashiCorp Vault SSH Secrets Engine AWX supports signing public keys via HashiCorp Vault's SSH Secrets Engine (https://www.vaultproject.io/api/secret/ssh/) -The following example illustrates how to configure a Machine credential to sign +The following example illustrates how to configure a Machine Credential to sign a public key using HashiCorp Vault: -1. Look up the ID of the Machine and HashiCorp Vault Signed SSH credential +1. Look up the ID of the Machine and HashiCorp Vault Signed SSH Credential types (in this example, `1` and `16`): ```shell @@ -263,7 +263,7 @@ HTTP/1.1 200 OK "name": "HashiCorp Vault Signed SSH", ``` -2. Create a Machine and a HashiCorp Vault credential: +2. Create a Machine and a HashiCorp Vault Credential: ```shell ~ curl -sik "https://awx.example.org/api/v2/credentials/" \ @@ -295,7 +295,7 @@ HTTP/1.1 201 Created ... ``` -3. Link the Machine credential to the HashiCorp Vault credential: +3. Link the Machine Credential to the HashiCorp Vault Credential: ```shell ~ curl -sik "https://awx.example.org/api/v2/credentials/1/input_sources/" \ @@ -306,7 +306,7 @@ HTTP/1.1 201 Created HTTP/1.1 201 Created ``` -4. Associate the Machine credential with a Job Template. When the Job Template +4. Associate the Machine Credential with a Job Template. When the Job Template is run, AWX will use the provided HashiCorp URL and token to sign the unsigned public key data using the HashiCorp Vault SSH Secrets API.
AWX will generate an `id_rsa` and `id_rsa-cert.pub` on the fly and diff --git a/docs/custom_credential_types.md b/docs/credentials/custom_credential_types.md similarity index 95% rename from docs/custom_credential_types.md rename to docs/credentials/custom_credential_types.md index 33cb903b78..238dad12ba 100644 --- a/docs/custom_credential_types.md +++ b/docs/credentials/custom_credential_types.md @@ -27,7 +27,7 @@ Important Changes By utilizing these custom ``Credential Types``, customers have the ability to define custom "Cloud" and "Network" ``Credential Types`` which modify environment variables, extra vars, and generate file-based - credentials (such as file-based certificates or .ini files) at + credentials (such as file-based certificates or `.ini` files) at `ansible-playbook` runtime. * Multiple ``Credentials`` can now be assigned to a ``Job Template`` as long as @@ -136,9 +136,10 @@ ordered fields for that type: "multiline": false # if true, the field should be rendered # as multi-line for input entry # (only applicable to `type=string`) + "default": "default value" # optional, can be used to provide a - # default value if the field is left empty - # when creating a credential of this type + # default value if the field is left empty; + # when creating a credential of this type, # credential forms will use this value # as a prefill when making credentials of # this type @@ -164,7 +165,7 @@ When `type=string`, fields can optionally specify multiple choice options: Defining Custom Credential Type Injectors ----------------------------------------- A ``Credential Type`` can inject ``Credential`` values through the use -of the Jinja templating language (which should be familiar to users of Ansible): +of the [Jinja templating language](https://jinja.palletsprojects.com/en/2.10.x/) (which should be familiar to users of Ansible): "injectors": { "env": { @@ -175,7 +176,7 @@ of the Jinja templating language (which should be familiar to users of Ansible): } } -``Credential Types`` can also generate temporary files to support .ini files or +``Credential Types`` can also generate temporary files to support `.ini` files or certificate/key data: "injectors": { @@ -274,7 +275,7 @@ Additional Criteria Acceptance Criteria ------------------- -When verifying acceptance we should ensure the following statements are true: +When verifying acceptance, the following statements should be true: * `Credential` injection for playbook runs, SCM updates, inventory updates, and ad-hoc runs should continue to function as they did prior to Tower 3.2 for the @@ -290,15 +291,15 @@ When verifying acceptance we should ensure the following statements are true: * Users should not be able to use the syntax for injecting single and multiple files in the same custom credential. * The default `Credential Types` included with Tower in 3.2 should be - non-editable/readonly and cannot be deleted by any user. + non-editable/read-only and unable to be deleted by any user. * Stored `Credential` values for _all_ types should be consistent before and - after Tower 3.2 migration/upgrade. + after a Tower 3.2 migration/upgrade. * `Job Templates` should be able to specify multiple extra `Credentials` as defined in the constraints in this document. * Custom inventory sources should be able to specify a cloud/network `Credential` and they should properly update the environment (environment variables, extra vars, written files) when an inventory source update runs. 
* If a `Credential Type` is being used by one or more `Credentials`, the fields - defined in its ``inputs`` should be read-only. -* `Credential Types` should support activity stream history for basic object + defined in its `inputs` should be read-only. +* `Credential Types` should support Activity Stream history for basic object modification. diff --git a/docs/multi_credential_assignment.md b/docs/multi_credential_assignment.md deleted file mode 100644 index 0b0732b679..0000000000 --- a/docs/multi_credential_assignment.md +++ /dev/null @@ -1,233 +0,0 @@ -Multi-Credential Assignment -=========================== - -awx has added support for assigning zero or more credentials to -JobTemplates and InventoryUpdates via a singular, unified interface. - -Background ----------- - -Prior to awx (Tower 3.2), Job Templates had a certain set of requirements -surrounding their relation to Credentials: - -* All Job Templates (and Jobs) were required to have exactly *one* Machine/SSH - or Vault credential (or one of both). -* All Job Templates (and Jobs) could have zero or more "extra" Credentials. -* These extra Credentials represented "Cloud" and "Network" credentials that -* could be used to provide authentication to external services via environment -* variables (e.g., AWS_ACCESS_KEY_ID). - -This model required a variety of disjoint interfaces for specifying Credentials -on a JobTemplate. For example, to modify assignment of Machine/SSH and Vault -credentials, you would change the Credential key itself: - -`PATCH /api/v2/job_templates/N/ {'credential': X, 'vault_credential': Y}` - -Modifying `extra_credentials` was accomplished on a separate API endpoint -via association/disassociation actions: - -``` -POST /api/v2/job_templates/N/extra_credentials {'associate': true, 'id': Z} -POST /api/v2/job_templates/N/extra_credentials {'disassociate': true, 'id': Z} -``` - -This model lacked the ability associate multiple Vault credentials with -a playbook run, a use case supported by Ansible core from Ansible 2.4 onwards. - -This model also was a stumbling block for certain playbook execution workflows. -For example, some users wanted to run playbooks with `connection:local` that -only interacted with some cloud service via a cloud Credential. In this -scenario, users often generated a "dummy" Machine/SSH Credential to attach to -the Job Template simply to satisfy the requirement on the model. - -Important Changes ------------------ - -JobTemplates now have a single interface for Credential assignment: - -`GET /api/v2/job_templates/N/credentials/` - -Users can associate and disassociate credentials using `POST` requests to this -interface, similar to the behavior in the now-deprecated `extra_credentials` -endpoint: - -``` -POST /api/v2/job_templates/N/credentials/ {'associate': true, 'id': X'} -POST /api/v2/job_templates/N/credentials/ {'disassociate': true, 'id': Y'} -``` - -Under this model, a JobTemplate is considered valid even when it has _zero_ -Credentials assigned to it. - -Launch Time Considerations --------------------------- - -Prior to this change, JobTemplates had a configurable attribute, -`ask_credential_on_launch`. This value was used at launch time to determine -which missing credential values were necessary for launch - this was primarily -used as a mechanism for users to specify an SSH (or Vault) credential to satisfy -the minimum Credential requirement. 
- -Under the new unified Credential list model, this attribute still exists, but it -is no longer bound to a notion of "requiring" a Credential. Now when -`ask_credential_on_launch` is `True`, it signifies that users may (if they -wish) specify a list of credentials at launch time to override those defined on -the JobTemplate: - -`POST /api/v2/job_templates/N/launch/ {'credentials': [A, B, C]}` - -If `ask_credential_on_launch` is `False`, it signifies that custom `credentials` -provided in the payload to `POST /api/v2/job_templates/N/launch/` will be -ignored. - -Under this model, the only purpose for `ask_credential_on_launch` is to signal -that API clients should prompt the user for (optional) changes at launch time. - -Backwards Compatibility Concerns --------------------------------- -Requests to update `JobTemplate.credential` and `JobTemplate.vault_credential` -will no longer work. Example request format: - -`PATCH /api/v2/job_templates/N/ {'credential': X, 'vault_credential': Y}` - -This request will have no effect because support for using these -fields has been removed. - -The relationship `extra_credentials` is deprecated but still supported for now. -Clients should favor the `credentials` relationship instead. - -`GET` requests to `/api/v2/job_templates/N/` and `/api/v2/jobs/N/` -will include this via `related_fields`: - -``` -{ - "related": { - ... - "credentials": "/api/v2/job_templates/5/credentials/", - "extra_credentials": "/api/v2/job_templates/5/extra_credentials/", - } -} -``` - -...and `summary_fields`, which is not included in list views: - -``` -{ - "summary_fields": { - "credentials": [ - { - "description": "", - "credential_type_id": 5, - "id": 2, - "kind": "aws", - "name": "some-aws" - }, - { - "description": "", - "credential_type_id": 10, - "id": 4, - "kind": "gce", - "name": "some-gce" - } - ], - "extra_credentials": [ - { - "description": "", - "credential_type_id": 5, - "id": 2, - "kind": "aws", - "name": "some-aws" - }, - { - "description": "", - "credential_type_id": 10, - "id": 4, - "kind": "gce", - "name": "some-gce" - } - ], - } -} -``` - -The only difference between `credentials` and `extra_credentials` is that the -latter is filtered to only show "cloud" type credentials, whereas the former -can be used to manage all types of related credentials. - -The `/api/v2/job_templates/N/launch/` endpoint no longer provides -backwards compatible support for specifying credentials at launch time -via the `credential` or `vault_credential` fields. -The launch endpoint can still accept a list under the `extra_credentials` key, -but this is deprecated in favor `credentials`. - - -Specifying Multiple Vault Credentials -------------------------------------- -One interesting use case supported by the new "zero or more credentials" model -is the ability to assign multiple Vault credentials to a Job Template run. - -This specific use case covers Ansible's support for multiple vault passwords for -a playbook run (since Ansible 2.4): -http://docs.ansible.com/ansible/latest/vault.html#vault-ids-and-multiple-vault-passwords - -Vault credentials in awx now have an optional field, `vault_id`, which is -analogous to the `--vault-id` argument to `ansible-playbook`. To run -a playbook which makes use of multiple vault passwords: - -1. Make a Vault credential in Tower for each vault password; specify the Vault - ID as a field on the credential and input the password (which will be - encrypted and stored). -2. 
Assign multiple vault credentials to the job template via the new - `credentials` endpoint: - - ``` - POST /api/v2/job_templates/N/credentials/ - - { - 'associate': true, - 'id': X - } - ``` -3. Launch the job template, and `ansible-playbook` will be invoked with - multiple `--vault-id` arguments. - -Prompted Vault Credentials --------------------------- -Vault credentials can have passwords that are marked as "Prompt on launch". -When this is the case, the launch endpoint of any related Job Templates will -communicate necessary Vault passwords via the `passwords_needed_to_start` key: - -``` -GET /api/v2/job_templates/N/launch/ -{ - 'passwords_needed_to_start': [ - 'vault_password.X', - 'vault_password.Y', - ] -} -``` - -...where `X` and `Y` are primary keys of the associated Vault credentials. - -``` -POST /api/v2/job_templates/N/launch/ -{ - 'credential_passwords': { - 'vault_password.X': 'first-vault-password' - 'vault_password.Y': 'second-vault-password' - } -} -``` - -Inventory Source Credentials ----------------------------- - -Inventory sources and inventory updates that they spawn also use the same -relationship. The new endpoints for this are - - `/api/v2/inventory_sources/N/credentials/` and - - `/api/v2/inventory_updates/N/credentials/` - -Most cloud sources will continue to adhere to the constraint that they -must have a single credential that corresponds to their cloud type. -However, this relationship allows users to associate multiple vault -credentials of different ids to inventory sources. diff --git a/docs/rbac.md b/docs/rbac.md index c0bcb6860e..fc4cd04de7 100644 --- a/docs/rbac.md +++ b/docs/rbac.md @@ -7,13 +7,13 @@ The intended audience of this document is the Ansible Tower developer. ### RBAC - System Basics -There are three main concepts to be familiar with, Roles, Resources, and Users. +There are three main concepts to be familiar with: Roles, Resources, and Users. Users can be members of a role, which gives them certain access to any resources associated with that role, or any resources associated with "descendent" roles. For example, if I have an organization named "MyCompany" and I want to allow -two people, "Alice", and "Bob", access to manage all the settings associated +two people, "Alice", and "Bob", access to manage all of the settings associated with that organization, I'd make them both members of the organization's `admin_role`. It is often the case that you have many Roles in a system, and you want some @@ -21,9 +21,9 @@ roles to include all of the capabilities of other roles. For example, you may want a System Administrator to have access to everything that an Organization Administrator has access to, who has everything that a Project Administrator has access to, and so on. We refer to this concept as the 'Role Hierarchy', and -is represented by allowing Roles to have "Parent Roles". Any permission that a -Role has is implicitly granted to any parent roles (or parents of those -parents, and so on). Of course Roles can have more than one parent, and +is represented by allowing roles to have "Parent Roles". Any permission that a +role has is implicitly granted to any parent roles (or parents of those +parents, and so on). Of course roles can have more than one parent, and capabilities are implicitly granted to all parents. (Technically speaking, this forms a directional acyclic graph instead of a strict hierarchy, but the concept should remain intuitive.) @@ -34,10 +34,10 @@ concept should remain intuitive.) 
### Implementation Overview The RBAC system allows you to create and layer roles for controlling access to resources. Any Django Model can -be made into a resource in the RBAC system by using the `ResourceMixin`. Once a model is accessible as a resource you can +be made into a resource in the RBAC system by using the `ResourceMixin`. Once a model is accessible as a resource, you can extend the model definition to have specific roles using the `ImplicitRoleField`. Within the declaration of this role field you can also specify any parents the role may have, and the RBAC system will take care of -all the appropriate ancestral binding that takes place behind the scenes to ensure that the model you've declared +all of the appropriate ancestral binding that takes place behind the scenes to ensure that the model you've declared is kept up to date as the relations in your model change. ### Roles @@ -52,7 +52,7 @@ what roles are checked when accessing a resource. | -- AdminRole |-- parent = ResourceA.AdminRole -When a user attempts to access ResourceB we will check for their access using the set of all unique roles, including the parents. +When a user attempts to access ResourceB, we will check for their access using the set of all unique roles, including the parents. ResourceA.AdminRole, ResourceB.AdminRole @@ -60,7 +60,7 @@ This would provide any members of the above roles with access to ResourceB. #### Singleton Role -There is a special case _Singleton Role_ that you can create. This type of role is for system wide roles. +There is a special case _Singleton Role_ that you can create. This type of role is for system-wide roles. ### Models @@ -72,7 +72,7 @@ The RBAC system defines a few new models. These models represent the underlying ##### `visible_roles(cls, user)` -`visible_roles` is a class method that will lookup all of the `Role` instances a user can "see". This includes any roles the user is a direct decendent of as well as any ancestor roles. +`visible_roles` is a class method that will look up all of the `Role` instances a user can "see". This includes any roles the user is a direct descendent of as well as any ancestor roles. ##### `singleton(cls, name)` @@ -137,7 +137,7 @@ By mixing in the `ResourceMixin` to your model, you are turning your model in to ## Usage -After exploring the _Overview_ the usage of the RBAC implementation in your code should feel unobtrusive and natural. +After exploring the _Overview_, the usage of the RBAC implementation in your code should feel unobtrusive and natural. ```python # make your model a Resource @@ -150,7 +150,7 @@ After exploring the _Overview_ the usage of the RBAC implementation in your code ) ``` -Now that your model is a resource and has a `Role` defined, you can begin to access the helper methods provided to you by the `ResourceMixin` for checking a users access to your resource. Here is the output of a Python REPL session. +Now that your model is a resource and has a `Role` defined, you can begin to access the helper methods provided to you by the `ResourceMixin` for checking a user's access to your resource. Here is the output of a Python REPL session: ```python # we've created some documents and a user diff --git a/docs/retry_by_status.md b/docs/retry_by_status.md index 2156848a4f..c3b7607630 100644 --- a/docs/retry_by_status.md +++ b/docs/retry_by_status.md @@ -1,7 +1,7 @@ # Relaunch on Hosts with Status -This feature allows the user to relaunch a job, targeting only hosts marked -as failed in the original job. 
+This feature allows the user to relaunch a job, targeting only the hosts marked +as "failed" in the original job. ### Definition of "failed" @@ -10,27 +10,27 @@ is different from "hosts with failed tasks". Unreachable hosts can have no failed tasks. This means that the count of "failed hosts" can be different from the failed count, given in the summary at the end of a playbook. -This definition corresponds to Ansible .retry files. +This definition corresponds to Ansible `.retry` files. ### API Design of Relaunch #### Basic Relaunch -POST to `/api/v2/jobs/N/relaunch/` without any request data should relaunch +POSTs to `/api/v2/jobs/N/relaunch/` without any request data should relaunch the job with the same `limit` value that the original job used, which may be an empty string. -This is implicitly the "all" option below. +This is implicitly the "all" option, mentioned below. #### Relaunch by Status Providing request data containing `{"hosts": "failed"}` should change the `limit` of the relaunched job to target failed hosts from the previous job. Hosts will be provided as a comma-separated list in the limit. Formally, -these are options +these are options: - all: relaunch without changing the job limit - - failed: relaunch against all hos + - failed: relaunch against all failed hosts ### Relaunch Endpoint @@ -60,12 +60,12 @@ then the request will be rejected. For example, if a GET yielded: } ``` -Then a POST of `{"hosts": "failed"}` should return a descriptive response +...then a POST of `{"hosts": "failed"}` should return a descriptive response with a 400-level status code. # Acceptance Criteria -Scenario: user launches a job against host "foobar", and the run fails +Scenario: User launches a job against host "foobar", and the run fails against this host. User changes name of host to "foo", and relaunches job against failed hosts. The `limit` of the relaunched job should reference "foo" and not "foobar". @@ -79,9 +79,9 @@ relaunch the same way that relaunching has previously worked. If a playbook provisions a host, this feature should behave reasonably when relaunching against a status that includes these hosts. -Feature should work even if hosts have tricky characters in their names, +This feature should work even if hosts have tricky characters in their names, like commas. -Also need to consider case where a task `meta: clear_host_errors` is present -inside a playbook, and that the retry subset behavior is the same as Ansible +One may also need to consider cases where a task `meta: clear_host_errors` is present +inside a playbook; the retry subset behavior is the same as Ansible's for this case.
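To make the relaunch flow described above concrete, here is a minimal sketch of the two requests involved, assuming a job with ID `42` on `awx.example.org` (both identifiers are illustrative, and authentication is omitted as in the other `curl` examples in these docs):

```shell
# Inspect the relaunch endpoint first; the GET response indicates whether any
# hosts from the original run are eligible for a "failed" relaunch.
curl -sik "https://awx.example.org/api/v2/jobs/42/relaunch/"

# Relaunch against only the failed (and unreachable) hosts of the original run.
# POSTing with no body, or with {"hosts": "all"}, keeps the original limit instead.
curl -sik -X POST "https://awx.example.org/api/v2/jobs/42/relaunch/" \
     -H "Content-Type: application/json" \
     -d '{"hosts": "failed"}'
```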
diff --git a/docs/tasks.md b/docs/tasks.md index 74d824ac9f..53b30c8ce8 100644 --- a/docs/tasks.md +++ b/docs/tasks.md @@ -1,20 +1,20 @@ Background Tasks in AWX ======================= -In this document, we will go into a bit of detail about how and when AWX runs Python code _in the background_ (_i.e._, _outside_ of the context of an HTTP request), such as: +In this document, we will go into a bit of detail about how and when AWX runs Python code _in the background_ (_i.e._, **outside** of the context of an HTTP request), such as: * Any time a Job is launched in AWX (a Job Template, an Ad Hoc Command, a Project Update, an Inventory Update, a System Job), a background process retrieves metadata _about_ that job from the database and forks some process (_e.g._, `ansible-playbook`, `awx-manage inventory_import`) -* Certain expensive or time-consuming tasks run in the background +* Certain expensive or time-consuming tasks running in the background asynchronously (_e.g._, when deleting an inventory). * AWX runs a variety of periodic background tasks on a schedule. Some examples are: - AWX's "Task Manager/Scheduler" wakes up periodically and looks for - `pending` jobs that have been launched and are ready to start running. + `pending` jobs that have been launched and are ready to start running - AWX periodically runs code that looks for scheduled jobs and launches - them. + them - AWX runs a variety of periodic tasks that clean up temporary files, and performs various administrative checks - Every node in an AWX cluster runs a periodic task that serves as diff --git a/docs/tower_configuration.md b/docs/tower_configuration.md index a5044c8df0..91beca5085 100644 --- a/docs/tower_configuration.md +++ b/docs/tower_configuration.md @@ -1,11 +1,11 @@ -Tower configuration gives tower users the ability to adjust multiple runtime parameters of Tower, thus take fine-grained control over Tower run. +Tower configuration gives Tower users the ability to adjust multiple runtime parameters of Tower, which enables much more fine-grained control over Tower runs. ## Usage manual -#### To use -The REST endpoint for CRUD operations against Tower configurations is `/api//settings/`. GETing to that endpoint will return a list of available Tower configuration categories and their urls, such as `"system": "/api//settings/system/"`. The URL given to each category is the endpoint for CRUD operations against individual settings under that category. +#### To Use: +The REST endpoint for CRUD operations against Tower configurations can be found at `/api//settings/`. GETing to that endpoint will return a list of available Tower configuration categories and their URLs, such as `"system": "/api//settings/system/"`. The URL given to each category is the endpoint for CRUD operations against individual settings under that category. -Here is a typical Tower configuration category GET response. +Here is a typical Tower configuration category GET response: ``` GET /api/v2/settings/github-team/ HTTP 200 OK @@ -27,10 +27,10 @@ X-API-Time: 0.026s } ``` -The returned body is a JSON of key-value pairs, where the key is the name of Tower configuration setting, and the value is the value of that setting. To update the settings, simply update setting values and PUT/PATCH to the same endpoint. +The returned body is a JSON of key-value pairs, where the key is the name of the Tower configuration setting, and the value is the value of that setting. To update the settings, simply update setting values and PUT/PATCH to the same endpoint. 
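As a hypothetical illustration of the read/update cycle described above (the `system` category is taken from the example earlier in this section, while the setting name `SOME_SETTING` and its value are placeholders):

```shell
# List the available configuration categories and their endpoints.
curl -sik "https://awx.example.org/api/v2/settings/"

# Read all settings in one category, then update a single value;
# PATCH changes only the keys provided in the request body.
curl -sik "https://awx.example.org/api/v2/settings/system/"
curl -sik -X PATCH "https://awx.example.org/api/v2/settings/system/" \
     -H "Content-Type: application/json" \
     -d '{"SOME_SETTING": "new value"}'
```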
-#### To develop -Each Django app in tower should have a `conf.py` file where related settings get registered. Below is the general format for `conf.py`: +#### To Develop: +Each Django app in Tower should have a `conf.py` file where related settings get registered. Below is the general format for `conf.py`: ```python # Other dependencies @@ -52,7 +52,7 @@ register( # Other setting registries ``` -`register` is the endpoint API for registering individual tower configurations: +`register` is the endpoint API for registering individual Tower configurations: ``` register( setting, @@ -66,34 +66,34 @@ register( defined_in_file=False, ) ``` -Here is the details of each argument: +Here are the details for each argument: | Argument Name | Argument Value Type | Description | |--------------------------|-------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | `setting` | `str` | Name of the setting. Usually all-capital connected by underscores like `'FOO_BAR'` | | `field_class` | a subclass of DRF serializer field available in `awx.conf.fields` | The class wrapping around value of the configuration, responsible for retrieving, setting, validating and storing configuration values. | -| `**field_related_kwargs` | **kwargs | Key-worded arguments needed to initialize an instance of `field_class`. | +| `**field_related_kwargs` | `**kwargs` | Key-worded arguments needed to initialize an instance of `field_class`. | | `category_slug` | `str` | The actual identifier used for finding individual setting categories. | | `category` | transformable string, like `_('foobar')` | The human-readable form of `category_slug`, mainly for display. | -| `depends_on` | `list` of `str`s | A list of setting names this setting depends on. A setting this setting depends on is another tower configuration setting whose changes may affect the value of this setting. | -| `placeholder` | transformable string, like `_('foobar')` | A human-readable string displaying a typical value for the setting, mainly used by UI | -| `encrypted` | `boolean` | Flag determining whether the setting value should be encrypted | -| `defined_in_file` | `boolean` | Flag determining whether a value has been manually set in settings file. | +| `depends_on` | `list` of `str`s | A list of setting names this setting depends on. A setting this setting depends on is another Tower configuration setting whose changes may affect the value of this setting. | +| `placeholder` | transformable string, like `_('foobar')` | A human-readable string displaying a typical value for the setting, mainly used by the UI. | +| `encrypted` | `boolean` | A flag which determines whether the setting value should be encrypted. | +| `defined_in_file` | `boolean` | A flag which determines whether a value has been manually set in the settings file. | -During Tower bootstrapping, All settings registered in `conf.py` modules of Tower Django apps will be loaded (registered). The set of Tower configuration settings will form a new top-level of `django.conf.settings` object. Later all Tower configuration settings will be available as attributes of it, just like normal Django settings. 
Note Tower configuration settings take higher priority over normal settings, meaning if a setting `FOOBAR` is both defined in a settings file and registered in a `conf.py`, the registered attribute will be used over the defined attribute every time. +During Tower bootstrapping, **all** settings registered in `conf.py` modules of Tower Django apps will be loaded (registered). This set of Tower configuration settings will form a new top-level of the `django.conf.settings` object. Later, all Tower configuration settings will be available as attributes of it, just like the normal Django settings. Note that Tower configuration settings take higher priority over normal settings, meaning if a setting `FOOBAR` is both defined in a settings file *and* registered in `conf.py`, the registered attribute will be used over the defined attribute every time. -Note when registering new configurations, it is desired to provide a default value if it is possible to do so, as Tower configuration UI has a 'revert all' functionality that revert all settings to it's default value. +Please note that when registering new configurations, it is recommended to provide a default value if it is possible to do so, as the Tower configuration UI has a 'revert all' functionality that reverts all settings to its default value. -Starting from 3.2, Tower configuration supports category-specific validation functions. They should also be defined under `conf.py` in the form +Starting with version 3.2, Tower configuration supports category-specific validation functions. They should also be defined under `conf.py` in the form ```python def custom_validate(serializer, attrs): ''' Method details ''' ``` -Where argument `serializer` refers to the underlying `SettingSingletonSerializer` object, and `attrs` refers to a dictionary of input items. +...where the argument `serializer` refers to the underlying `SettingSingletonSerializer` object, and `attrs` refers to a dictionary of input items. -Then at the end of `conf.py`, register defined custom validation methods to different configuration categories (`category_slug`) using `awx.conf.register_validate`: +At the end of `conf.py`, register defined custom validation methods to different configuration categories (`category_slug`) using `awx.conf.register_validate`: ```python # conf.py ... diff --git a/docs/websockets.md b/docs/websockets.md index 2a2f21f0a4..2095905028 100644 --- a/docs/websockets.md +++ b/docs/websockets.md @@ -4,19 +4,18 @@ Our channels/websocket implementation handles the communication between Tower AP ## Architecture -Tower enlists the help of the `django-channels` library to create our communications layer. `django-channels` provides us with per-client messaging integration in to our application by implementing the Asynchronous Server Gateway Interface or ASGI. +Tower enlists the help of the `django-channels` library to create our communications layer. `django-channels` provides us with per-client messaging integration in our application by implementing the Asynchronous Server Gateway Interface (ASGI). -To communicate between our different services we use RabbitMQ to exchange messages. Traditionally, `django-channels` uses Redis, but Tower uses a custom `asgi_amqp` library that allows use to RabbitMQ for the same purpose. +To communicate between our different services we use RabbitMQ to exchange messages. Traditionally, `django-channels` uses Redis, but Tower uses a custom `asgi_amqp` library that allows access to RabbitMQ for the same purpose. 
-Inside Tower we use the emit_channel_notification which places messages on to the queue. The messages are given an explicit -event group and event type which we later use in our wire protocol to control message delivery to the client. +Inside Tower we use the `emit_channel_notification` function which places messages onto the queue. The messages are given an explicit event group and event type which we later use in our wire protocol to control message delivery to the client. ## Protocol -You can connect to the Tower channels implementation using any standard websocket library but pointing it to `/websocket`. You must +You can connect to the Tower channels implementation using any standard websocket library by pointing it to `/websocket`. You must provide a valid Auth Token in the request URL. -Once you've connected, you are not subscribed to any event groups. You subscribe by sending a json request that looks like the following: +Once you've connected, you are not subscribed to any event groups. You subscribe by sending a `json` request that looks like the following: 'groups': { 'jobs': ['status_changed', 'summary'], @@ -30,37 +29,28 @@ Once you've connected, you are not subscribed to any event groups. You subscribe 'control': ['limit_reached_'], } -These map to the event group and event type you are interested in. Sending in a new groups dictionary will clear all of your previously -subscribed groups before subscribing to the newly requested ones. This is intentional, and makes the single page navigation much easier since -you only need to care about current subscriptions. +These map to the event group and event type that the user is interested in. Sending in a new groups dictionary will clear all previously-subscribed groups before subscribing to the newly requested ones. This is intentional, and makes the single page navigation much easier since users only need to care about current subscriptions. ## Deployment -This section will specifically discuss deployment in the context of websockets and the path your request takes through the system. +This section will specifically discuss deployment in the context of websockets and the path those requests take through the system. -Note: The deployment of Tower changes slightly with the introduction of `django-channels` and websockets. There are some minor differences between -production and development deployments that I will point out, but the actual services that run the code and handle the requests are identical -between the two environments. +**Note:** The deployment of Tower changes slightly with the introduction of `django-channels` and websockets. There are some minor differences between production and development deployments that will be pointed out in this document, but the actual services that run the code and handle the requests are identical between the two environments. 
### Services | Name | Details | |:-----------:|:-----------------------------------------------------------------------------------------------------------:| -| nginx | listens on ports 80/443, handles HTTPS proxying, serves static assets, routes requests for daphne and uwsgi | -| uwsgi | listens on port 8050, handles API requests | -| daphne | listens on port 8051, handles Websocket requests | -| runworker | no listening port, watches and processes the message queue | -| supervisord | (production-only) handles the process management of all the services except nginx | +| `nginx` | listens on ports 80/443, handles HTTPS proxying, serves static assets, routes requests for `daphne` and `uwsgi` | +| `uwsgi` | listens on port 8050, handles API requests | +| `daphne` | listens on port 8051, handles websocket requests | +| `runworker` | no listening port, watches and processes the message queue | +| `supervisord` | (production-only) handles the process management of all the services except `nginx` | -When a request comes in to *nginx* and have the `Upgrade` header and is for the path `/websocket`, then *nginx* knows that it should -be routing that request to our *daphne* service. +When a request comes in to `nginx` and has the `Upgrade` header and is for the path `/websocket`, then `nginx` knows that it should be routing that request to our `daphne` service. -*daphne* receives the request and generates channel and routing information for the request. The configured event handlers for *daphne* -then unpack and parse the request message using the wire protocol mentioned above. This ensures that the connect has its context limited to only -receive messages for events it is interested in. *daphne* uses internal events to trigger further behavior, which will generate messages -and send them to the queue, that queue is processed by the *runworker*. +`daphne` receives the request and generates channel and routing information for the request. The configured event handlers for `daphne` then unpack and parse the request message using the wire protocol mentioned above. This ensures that the connection has its context limited to only receive messages for events it is interested in. `daphne` uses internal events to trigger further behavior, which will generate messages and send them to the queue, which is then processed by the `runworker`. -*runworker* processes the messages from the queue. This uses the contextual information of the message provided -by the *daphne* server and our *asgi_amqp* implementation to broadcast messages out to each client. +`runworker` processes the messages from the queue. This uses the contextual information of the message provided by the `daphne` server and our `asgi_amqp` implementation to broadcast messages out to each client. ### Development - - nginx listens on 8013/8043 instead of 80/443 + - `nginx` listens on 8013/8043 instead of 80/443
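To tie the routing description above together, the handshake that `nginx` forwards to `daphne` can be sketched with a raw `curl` probe; the host name and the token parameter name are illustrative assumptions, not the exact Tower URL scheme:

```shell
# A request for /websocket that carries the Upgrade header should be proxied to
# daphne and answered with "101 Switching Protocols" (or an auth error) rather
# than being handled by uwsgi. On a development install, use port 8043 instead of 443.
curl -sik "https://tower.example.org/websocket?token=AUTH_TOKEN" \
     -H "Connection: Upgrade" \
     -H "Upgrade: websocket" \
     -H "Sec-WebSocket-Version: 13" \
     -H "Sec-WebSocket-Key: $(openssl rand -base64 16)"
```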