Add max concurrent jobs and max forks per ig

The primary intent of this feature is to give container groups some notion of maximum capacity, but I've left the logic generic. The default is 0, which is interpreted as no maximum number of jobs or forks.

Includes a refactor of variable and method names for clarity. instances_by_hostname is an internal attribute of TaskManagerInstances. Clarify when we are expecting the actual TaskManagerInstances object.

Unify how we process running tasks and consume capacity. As a result, we do less expensive work in after_lock_init and have one fewer loop over all the running tasks. Previously we looped twice: once to build the dependency graph and once to calculate the starting capacity of all the instances and instance groups. Now we achieve both in the same loop.

Because this changes the somewhat subtle "do-si-do" of how the Task Manager models are initialized, introduce a wrapper class that takes some of that burden off the other places where we reuse them, such as the serializer and the metrics. Also use this wrapper class to handle the niceties of tracking capacity consumption on instances and instance groups.

Add tests for max_forks and max_concurrent_jobs.
Fix up tests that use TaskManagerModels to accommodate the changes.

Assign ig before the call to consume capacity. If we don't do it in that order, we don't correctly account for the container group jobs we are starting in the middle of the task manager run.
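
As a rough illustration of the "0 means no maximum" rule and of consuming capacity in a single pass over running tasks, here is a minimal sketch. It is not the AWX implementation: GroupLimits, has_room_for, consume and consume_running are hypothetical names invented for this example, and only max_forks and max_concurrent_jobs come from the commit itself.

```python
from dataclasses import dataclass


@dataclass
class GroupLimits:
    """Hypothetical stand-in for an instance group's limits (not the AWX model)."""

    name: str
    max_concurrent_jobs: int = 0  # 0 is interpreted as "no maximum"
    max_forks: int = 0  # 0 is interpreted as "no maximum"
    running_jobs: int = 0
    consumed_forks: int = 0

    def has_room_for(self, forks: int) -> bool:
        # A limit of 0 is falsy, so the corresponding check is skipped entirely.
        if self.max_concurrent_jobs and self.running_jobs + 1 > self.max_concurrent_jobs:
            return False
        if self.max_forks and self.consumed_forks + forks > self.max_forks:
            return False
        return True

    def consume(self, forks: int) -> None:
        # Record one more job and its forks against this group.
        self.running_jobs += 1
        self.consumed_forks += forks


def consume_running(groups, running_tasks):
    """Single pass over running tasks: each (group name, forks) pair counts against its group."""
    for group_name, forks in running_tasks:
        groups[group_name].consume(forks)


groups = {
    "cg": GroupLimits("cg", max_concurrent_jobs=2, max_forks=10),
    "default": GroupLimits("default"),  # both limits left at 0, so unlimited
}
consume_running(groups, [("cg", 4), ("default", 50)])
```

In this sketch the group is known before consume is called, which mirrors the ordering concern in the last paragraph of the message: capacity can only be charged to a group once the task has been assigned to it.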
2026-05-23 00:37:37 -02:30 · 2022-10-23 23:20:41 -04:00
parent 65c3db8cb8
commit 86856f242a
11 changed files with 490 additions and 185 deletions
--- a/awx/main/analytics/collectors.py
+++ b/awx/main/analytics/collectors.py
@@ -16,7 +16,7 @@ from awx.conf.license import get_license
 from awx.main.utils import get_awx_version, camelcase_to_underscore, datetime_hook
 from awx.main import models
 from awx.main.analytics import register
-from awx.main.scheduler.task_manager_models import TaskManagerInstances
+from awx.main.scheduler.task_manager_models import TaskManagerModels

 """
 This module is used to define metrics collected by awx.main.analytics.gather()
@@ -237,11 +237,10 @@ def projects_by_scm_type(since, **kwargs):
 def instance_info(since, include_hostnames=False, **kwargs):
    info = {}
    # Use same method that the TaskManager does to compute consumed capacity without querying all running jobs for each Instance
-    active_tasks = models.UnifiedJob.objects.filter(status__in=['running', 'waiting']).only('task_impact', 'controller_node', 'execution_node')
-    tm_instances = TaskManagerInstances(
-        active_tasks, instance_fields=['uuid', 'version', 'capacity', 'cpu', 'memory', 'managed_by_policy', 'enabled', 'node_type']
-    )
-    for tm_instance in tm_instances.instances_by_hostname.values():
+    tm_models = TaskManagerModels.init_with_consumed_capacity(
+            instance_fields=['uuid', 'version', 'capacity', 'cpu', 'memory', 'managed_by_policy', 'enabled']
+            )
+    for tm_instance in tm_models.instances.instances_by_hostname.values():
        instance = tm_instance.obj
        instance_info = {
            'uuid': instance.uuid,