mirror of
https://github.com/ansible/awx.git
synced 2026-01-31 17:18:59 -03:30
Feature: retry on subset of jobs hosts
This commit is contained in:
@@ -53,3 +53,4 @@
|
||||
`deprovision_node` -> `deprovision_instance`, and `instance_group_remove` -> `remove_from_queue`,
|
||||
which backward compatibility support for 3.1 use pattern
|
||||
[[#6915](https://github.com/ansible/ansible-tower/issues/6915)]
|
||||
* Allow relaunching jobs on a subset of hosts, by status.[[#219](https://github.com/ansible/awx/issues/219)]
|
||||
|
||||
76
docs/retry_by_status.md
Normal file
76
docs/retry_by_status.md
Normal file
@@ -0,0 +1,76 @@
|
||||
# Relaunch on Hosts with Status
|
||||
|
||||
This feature allows the user to relaunch a job, targeting only a subset
|
||||
of hosts that had a particular status in the prior job.
|
||||
|
||||
### API Design of Relaunch
|
||||
|
||||
#### Basic Relaunch
|
||||
|
||||
POST to `/api/v2/jobs/N/relaunch/` without any request data should relaunch
|
||||
the job with the same `limit` value that the original job used, which
|
||||
may be an empty string.
|
||||
|
||||
#### Relaunch by Status
|
||||
|
||||
Providing request data containing `{"hosts": "<status>"}` should change
|
||||
the `limit` of the relaunched job to target the hosts matching that status
|
||||
from the previous job (unless the default option of "all" is used).
|
||||
The options and meanings of `<status>` include:
|
||||
|
||||
- all: relaunch without changing the job limit
|
||||
- ok: relaunch against all hosts with >=1 tasks that returned the "ok" status
|
||||
- changed: relaunch against all hosts with >=1 tasks had a changed status
|
||||
- failed: relaunch against all hosts with >=1 tasks failed plus all unreachable hosts
|
||||
- unreachable: relaunch against all hosts with >=1 task when they were unreachable
|
||||
|
||||
These correspond to the playbook summary states from a playbook run, with
|
||||
the notable exception of "failed" hosts. Ansible does not count an unreachable
|
||||
event as a failed task, so unreachable hosts can (and often do) have no
|
||||
associated failed tasks. The "failed" status here will still target both
|
||||
status types, because Ansible will mark the _host_ as failed and include it
|
||||
in the retry file if it was unreachable.
|
||||
|
||||
### Relaunch Endpoint
|
||||
|
||||
Doing a GET to the relaunch endpoint should return additional information
|
||||
regarding the host summary of the last job. Example response:
|
||||
|
||||
```json
|
||||
{
|
||||
"passwords_needed_to_start": [],
|
||||
"retry_counts": {
|
||||
"all": 30,
|
||||
"failed": 18,
|
||||
"ok": 25,
|
||||
"changed": 4,
|
||||
"unreachable": 9
|
||||
}
|
||||
}
|
||||
```
|
||||
|
||||
If the user launches, providing a status for which there were 0 hosts,
|
||||
then the request will be rejected.
|
||||
|
||||
# Acceptance Criteria
|
||||
|
||||
Scenario: user launches a job against host "foobar", and the run fails
|
||||
against this host. User changes name of host to "foo", and relaunches job
|
||||
against failed hosts. The `limit` of the relaunched job should reference
|
||||
"foo" and not "foobar".
|
||||
|
||||
The user should be able to provide passwords on relaunch, while also
|
||||
running against hosts of a particular status.
|
||||
|
||||
Not providing the "hosts" key in a POST to the relaunch endpoint should
|
||||
relaunch the same way that relaunching has previously worked.
|
||||
|
||||
If a playbook provisions a host, this feature should behave reasonably
|
||||
when relaunching against a status that includes these hosts.
|
||||
|
||||
Feature should work even if hosts have tricky characters in their names,
|
||||
like commas.
|
||||
|
||||
Also need to consider case where a task `meta: clear_host_errors` is present
|
||||
inside a playbook, and that the retry subset behavior is the same as Ansible
|
||||
for this case.
|
||||
Reference in New Issue
Block a user