55 Commits

Author SHA1 Message Date
Ryan Petrello
9843e21632 skip non-files when consuming events synced from isolated hosts
see: https://github.com/ansible/awx/issues/6675
2020-04-13 10:14:10 -04:00
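A minimal sketch of the idea in 9843e21632, assuming events synced from an isolated host land as one JSON file per event in a single directory. The function name `load_event_files` and the directory layout are illustrative assumptions, not the AWX implementation.

```python
import json
import os


def load_event_files(events_dir):
    """Load JSON events from a directory, ignoring anything that is not a regular file.

    Hypothetical helper for illustration only.
    """
    events = []
    for name in sorted(os.listdir(events_dir)):
        path = os.path.join(events_dir, name)
        # Skip subdirectories and other non-file entries instead of
        # trying (and failing) to open them as event files.
        if not os.path.isfile(path):
            continue
        with open(path) as f:
            events.append(json.load(f))
    return events
```

Calling `load_event_files` on a directory that contains one event file and one stray subdirectory returns a single event rather than raising an error on the subdirectory.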
Ryan Petrello
6b4219badb more ansible runner isolated cleanup
follow-up to https://github.com/ansible/awx/pull/6296
2020-04-08 01:18:05 -04:00
Ryan Petrello
b73e8d8a56 fix a bug in isolated event handling
see: https://github.com/ansible/awx/issues/6280
2020-03-16 13:15:10 -04:00
Ryan Petrello
f8818730d4 consolidate isolated event handling code into one function
make the non-isolated *and* isolated event handling share the same
function so we don't regress on behavior between the two
2020-03-13 10:05:48 -04:00
Ryan Petrello
326ed22efe properly handle import errors in the isolated capacity healthcheck
if the awx_capacity module runs on an isolated node with missing
libraries (e.g., psutil) or bad permissions, then the runner status will
be "failed"

in this scenario, we *still* want to react by recording a capacity=0
2020-01-31 10:17:20 -05:00
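The behavior described in 326ed22efe can be sketched roughly as follows. The helper name `capacity_from_healthcheck` and the shape of the `reported` dictionary are assumptions made for illustration; only the "failed" status and the capacity=0 outcome come from the commit message.

```python
def capacity_from_healthcheck(runner_status, reported):
    """Return the capacity to record for an isolated node.

    Hypothetical helper: `reported` stands in for whatever data the
    capacity module returned, or None if it produced nothing.
    """
    # A non-successful run (for example "failed" because the module could
    # not import psutil) still produces a value: zero capacity.
    if runner_status != "successful" or not isinstance(reported, dict):
        return 0
    return int(reported.get("capacity", 0))
```

The point of the design is that a broken healthcheck is recorded as capacity 0 instead of leaving a stale, non-zero capacity in place.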
Ryan Petrello
220168f5ee fix a bug in isolated check timeout handling
2019-12-06 12:44:50 -05:00
Christian Adams
4f8b624b96 Make spelling of canceled consistent
2019-11-26 00:31:15 -05:00
Shane McDonald
db2316b791 Remove usage of idle_timeout when checking status of isolated / containerized jobs
2019-11-22 11:41:00 -05:00
Ryan Petrello
ccaaee61f0 improve cleanup of anonymous kubeconfig files
2019-10-29 11:24:12 -04:00
Ryan Petrello
6dfc714c75 when isolated or container jobs fail to launch, set job status to error
a status of error makes more sense, because failed generally points to
an issue with the playbook itself, while error is more generally used
for reporting issues internal to Tower

see: https://github.com/ansible/awx/issues/4909
2019-10-29 11:24:10 -04:00
Shane McDonald
bd5003ca98 Task manager / scheduler Kubernetes integration
2019-10-04 13:21:21 -04:00
Ryan Petrello
82be87566f improve host key checking configurability
see: https://github.com/ansible/tower/issues/3737
2019-09-30 14:13:07 -04:00
Ryan Petrello
c6c14d4fb9 properly record Instance.cpu and Instance.memory for isolated nodes
2019-05-03 15:30:41 -04:00
Ryan Petrello
f1d87bf392 fix a bug that breaks the isolated heartbeat
2019-04-16 16:24:40 -04:00
softwarefactory-project-zuul[bot]
d222bed932 Merge pull request #3712 from jladdjr/iso_node_healthcheck_should_not_reset_capacity
Do not reset capacity of iso nodes when disabled

Reviewed-by: https://github.com/softwarefactory-project-zuul[bot]
2019-04-15 20:40:01 +00:00
Jim Ladd
6ef3b18803 Do not reset capacity of iso nodes when disabled
2019-04-15 12:36:15 -07:00
Ryan Petrello
387682ed8d if runner crashes, attempt to record why
this attempts to surface the underlying runner exception for tracebacks
like this one:

FileNotFoundError: [Errno 2] No such file or directory:
'/tmp/awx_41_93gtgv25/artifacts/41/status'
2019-04-15 13:17:45 -04:00
softwarefactory-project-zuul[bot]
58966d7368 Merge pull request #3625 from ryanpetrello/iso-forks
WIP: specify --forks on isolated health check calls

Reviewed-by: https://github.com/softwarefactory-project-zuul[bot]
2019-04-11 21:41:37 +00:00
softwarefactory-project-zuul[bot]
e3dfc6c796 Merge pull request #3596 from jbradberry/capture-isolated-command
Updated IsolatedManager to take a callback that captures the remote command

Reviewed-by: https://github.com/softwarefactory-project-zuul[bot]
2019-04-05 17:15:11 +00:00
Ryan Petrello
81fe923577 don't write playbook stdout to sys.stdout (it's duplicated in log files)
this instructs runner to _not_ write to stdout when we invoke
runner.interface.run(); AWX consumes/ingests this strictly as events
2019-04-05 11:20:34 -04:00
Ryan Petrello
79d580d5b9 update periodic isolated cleanup to match the new paths post-runner
2019-04-05 09:43:27 -04:00
Ryan Petrello
5a4a812c73 specify --forks on isolated health check calls
this requires ansible-runner 1.3.2
2019-04-04 20:12:14 -04:00
Jeff Bradberry
3f6d3506c6 Change the artifact file convention for isolated nodes to 'command'
since that's what landed in the ansible-runner PR.
2019-04-04 14:25:50 -04:00
Jeff Bradberry
467700e4bb Bring the check_callback back into the loop
but try to process it only once.
2019-04-03 16:04:07 -04:00
Jeff Bradberry
b4e508f72a Bring the check_callback call out of the loop
We shouldn't need to call it multiple times.
2019-04-03 15:12:29 -04:00
Jeff Bradberry
32286a9d49 Change the artifact to also capture the actual envvars data
2019-04-02 17:10:26 -04:00
Jeff Bradberry
cac48e7cfb Updated IsolatedManager to take a callback that captures the remote command
2019-04-02 15:40:56 -04:00
chris meyers
71fcb1a82c process host facts for iso runs
* Move isolated clean to our final run hook
* ISO and non-iso code path now share the post-fact-processing code
2019-03-29 16:16:22 -04:00
Ryan Petrello
563a0cc2a4 move awx.main.expect to awx.main.isolated
2019-03-29 12:14:40 -04:00
AlanCoding
e79ca131a6 initial commit to move folder isolated->expect
2017-08-15 11:32:44 -04:00
AlanCoding
42ccd870d9 Automatically cancel job if cancel callback fails and log
2017-08-11 16:43:08 -04:00
Ryan Petrello
7db9b48e9c add a configurable for disabling the auto-generated isolated RSA key
some users won't want to utilize the RSA key we auto-generate for
isolated node SSH access, but will instead want to manage SSH
authentication by hand outside of Tower

see: https://github.com/ansible/ansible-tower/issues/7380
2017-08-03 17:16:28 -04:00
AlanCoding
5d254d781a provide the job id in isolated management logs
2017-08-03 10:29:48 -04:00
AlanCoding
1112557c79 set capacity to 0 if instance has not checked in lately
2017-07-27 16:20:04 -04:00
Matthew Jones
b3b4a515e2 Refactor some tower periodic tasks to label as awx
2017-07-26 13:35:30 -04:00
Matthew Jones
d4b1a07495 Rename tower display plugins to awx display
2017-07-26 13:33:30 -04:00
Matthew Jones
c7a85d9738 Mass rename from ansible_(awx|tower) -> (awx|tower)
2017-07-26 13:33:26 -04:00
Ryan Petrello
e29492a259 more tower -> awx for task execution and isolated tooling
2017-07-25 10:36:06 -04:00
Ryan Petrello
8ce1421c6a fix tower-expect -> awx-expect for isolated tower builds
2017-07-24 16:03:58 -04:00
Ryan Petrello
d42ea31f75 use a named pipe for isolated secret passthrough (not stdin)
it's not unusual for the secret data we pass into the `run_isolated.yml`
playbook to be quite long, namely because it can contain RSA key
data; by passing this value into the ansible-playbook process using
`vars_prompt`, we're limited by pexpect's tty line limit (which looks
like it caps out around 4k).  Because of this, large payloads are
being truncated and causing job run failures.

this changes the implementation to use a named pipe instead, which
doesn't have the same limitation

see: #7183
2017-07-20 12:42:03 -04:00
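To illustrate the technique named in d42ea31f75, here is a minimal sketch of passing a large secret through a named pipe (FIFO) on a POSIX system. It is not the AWX code: the function name, the `secrets.fifo` file name, and the reader thread (standing in for the `ansible-playbook` process) are all assumptions made for the example.

```python
import os
import tempfile
import threading


def pass_secret_via_fifo(secret):
    """Send `secret` through a named pipe and return what the reader received.

    Hypothetical helper; the reader thread stands in for the consuming process.
    """
    tmpdir = tempfile.mkdtemp()
    fifo_path = os.path.join(tmpdir, "secrets.fifo")  # hypothetical name
    os.mkfifo(fifo_path, 0o600)  # readable and writable by the owner only
    received = []

    def reader():
        # Opening a FIFO for reading blocks until a writer opens it;
        # read() then returns everything written once the writer closes.
        with open(fifo_path, "r") as f:
            received.append(f.read())

    consumer = threading.Thread(target=reader)
    consumer.start()
    try:
        # Unlike a tty prompt, the pipe has no line-length cap, so a
        # payload well over 4k passes through intact.
        with open(fifo_path, "w") as f:
            f.write(secret)
        consumer.join()
    finally:
        os.remove(fifo_path)
        os.rmdir(tmpdir)
    return received[0]
```

For example, `pass_secret_via_fifo("x" * 20000)` returns the full 20000-character string, which is the property the commit relies on for long RSA key data.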
Ryan Petrello
53259e4d24 properly capture job events for adhoc commands run on isolated instances
see: #7100
2017-07-17 14:51:24 -04:00
Ryan Petrello
0a5b9c458b standardize tasks.py temporary file paths under a single parameter
see: #3472
2017-07-05 13:50:43 -04:00
AlanCoding
70b1b9c81d isolated connection timeout and log file for playbook out
2017-07-05 08:48:01 -04:00
Ryan Petrello
413e8c3bc9 isolated nodes should report their awx version in their heartbeat
see: #6810
2017-06-29 16:55:11 -04:00
Ryan Petrello
405c01a847 more isolated production tinkering
see: #5903
see: #6507
2017-06-29 09:35:26 -04:00
AlanCoding
05bcd4b674 fix bug where isolated management jobs could not load JSON output
2017-06-28 11:41:30 -04:00
Ryan Petrello
aaff005234 Merge pull request #6745 from ryanpetrello/fix-6659
RFC: install a randomized RSA key for controller -> isolated rampart auth
2017-06-27 11:52:36 -04:00
Ryan Petrello
3000f52a92 install a randomized RSA key for controller -> isolated rampart auth
see: #6507
2017-06-27 10:53:44 -04:00
Ryan Petrello
5adc1c603a properly update the heartbeat timestamp for isolated nodes
2017-06-26 11:03:56 -04:00
AlanCoding
40287d8e78 multi-host isolated heartbeat w tower-isolated check
* use tower-expect command to determine job status when running
  the isolated heartbeat playbook
* grok JSON output of playbook to obtain result information
* run playbook against multiple isolated hosts at the same time
  (addresses scalability concerns)
2017-06-20 14:36:18 -04:00