Use the ansible-runner worker --worker-info to perform execution node capacity checks (#10825)

* Introduce utilities for --worker-info health check integration

* Handle case where ansible-runner is not installed

* Add ttl parameter for health check

* Reformulate return data structure and add lots of error cases

* Move up the cleanup tasks, close sockets

* Integrate new --worker-info into the execution node capacity check

* Undo the raw value override from the PoC

* Additional refinement to execution node check frequency

* Put in more complete network diagram

* Followup on comment to remove modified from health check responsibilities
Alan Rominger
2021-08-11 10:14:20 -04:00
parent 4e84c7c4c4
commit 3b1e40d227
5 changed files with 212 additions and 178 deletions

@@ -131,6 +131,13 @@ class Instance(HasPolicyEditsMixin, BaseModel):
        grace_period = 120
        return self.modified < ref_time - timedelta(seconds=grace_period)

    def mark_offline(self, on_good_terms=False):
        self.cpu = self.cpu_capacity = self.memory = self.mem_capacity = self.capacity = 0
        update_fields = ['capacity', 'cpu', 'memory', 'cpu_capacity', 'mem_capacity']
        if on_good_terms:
            update_fields.append('modified')
        self.save()

    def refresh_capacity(self):
        cpu = get_cpu_capacity()
        mem = get_mem_capacity()