AAP-68024 perf: derive last_job_host_summary from query instead of denormalized FK (#16332)

* perf: stop eagerly updating Host.last_job_host_summary on every job completion

The playbook_on_stats wrapup path bulk-updates last_job_host_summary_id
on every host touched by a job. In the Q4CY25 scale lab this query had
a median execution time of 75 seconds due to index churn on main_host.

Replace all reads of the denormalized FK with a new classmethod
JobHostSummary.latest_for_host(host_id) that queries for the most
recent summary on demand. This eliminates the write-side bulk_update
of last_job_host_summary_id entirely.

Changes:
- Add JobHostSummary.latest_for_host() classmethod
- Serializer: use latest_for_host() instead of obj.last_job_host_summary
- Dashboard view: use subquery instead of FK traversal for failed hosts
- Inventory.update_computed_fields: use subquery for failed host count
- events.py: remove last_job_host_summary_id from bulk_update
- signals.py: simplify _update_host_last_jhs to only update last_job
- access.py/managers.py: remove select_related/defer through the FK

The FK field on Host is left in place for now (removal requires a
migration) but is no longer written to.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix .pk AttributeError, add job_template annotations, annotate host sublists

- Add 'pk' to AnnotatedSummary dynamic type (fixes AttributeError in get_related)
- Add job_template_id and job_template_name to subquery annotations so list
  views include these fields in summary_fields.last_job (matching detail views)
- Traverse job__ FK from JobHostSummary instead of using separate UnifiedJob
  subquery with OuterRef on another annotation (cleaner SQL, avoids alias issue)
- Annotate all host sublist views (InventoryHostsList, GroupHostsList,
  GroupAllHostsList, InventorySourceHostsList) to prevent N+1 queries

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Update test_events to use JobHostSummary.latest_for_host instead of stale FKs

Tests were asserting host.last_job_id and host.last_job_host_summary_id
which are no longer updated. Use JobHostSummary.latest_for_host() to
derive the same data, matching the new read-time derivation approach.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Remove stale failures_url from deprecated DashboardView

The failures_url linked to ?last_job_host_summary__failed=True which
filters on the now-stale FK. The dashboard count itself was already
fixed to use a subquery annotation. Since DashboardView is deprecated
and has_active_failures is a SerializerMethodField (not filterable),
remove the failures_url entirely rather than creating a custom filter.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Apply black formatting to changed files

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Refactor: replace 10 subquery annotations with bulk prefetch

Instead of annotating every host queryset with 10 correlated subqueries
(summary + job + job_template fields), annotate only _latest_summary_id
and bulk-fetch the full JobHostSummary objects after pagination via
select_related('job', 'job__job_template').

This reduces the SQL from 10 correlated subqueries to 1 subquery + 1 IN
query, addressing review feedback about annotation overhead on host list
views.

- _annotate_host_latest_summary: only annotates _latest_summary_id
- _prefetch_latest_summaries: bulk-fetches and attaches to host objects
- HostSummaryPrefetchMixin: hooks into list() after pagination
- Serializer uses real JobHostSummary objects (no more AnnotatedSummary)
- to_representation always overwrites stale FK values

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Refactor: move latest summary to QuerySet._fetch_all + Host.latest_summary

Per review feedback, replace the view-level HostSummaryPrefetchMixin
with a custom QuerySet that bulk-attaches summaries at evaluation time
(like prefetch_related), and a Host.latest_summary property as the
single access point.

- HostLatestSummaryQuerySet: overrides _fetch_all() to bulk-fetch
  JobHostSummary objects with select_related after queryset evaluation
- HostManager now inherits from the custom queryset via from_queryset()
- Host.latest_summary property: uses cache if available, falls back to
  individual query
- Remove _annotate_host_latest_summary, _prefetch_latest_summaries,
  HostSummaryPrefetchMixin from views — no more list() override needed
- Remove last_job/last_job_host_summary from SUMMARIZABLE_FK_FIELDS
- Serializer uses obj.latest_summary and DEFAULT_SUMMARY_FIELDS loop

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix: scope annotation to views, restore license_error/canceled_on

- Remove with_latest_summary_id() from HostManager.get_queryset() to
  avoid applying the correlated subquery to every Host query globally
  (count, exists, internal relations)
- Apply with_latest_summary_id() in get_queryset() of the 6
  host-serving views only
- Restore license_error and canceled_on to last_job summary fields
  to avoid breaking API change

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Guard _fetch_all() to skip bulk-attach on non-annotated querysets

Without this guard, _fetch_all() would set _latest_summary_cache=None
on every host in non-annotated querysets (e.g. Host.objects.filter()),
masking the per-object fallback query in Host.latest_summary.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Remove name from last_job_host_summary and canceled_on from last_job summary

Per reviewer feedback: these fields were not in the original API contract
via SUMMARIZABLE_FK_FIELDS and their addition would be an API change.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add functional tests for HostLatestSummaryQuerySet and Host.latest_summary

Tests cover:
- with_latest_summary_id() annotation and most-recent selection
- _fetch_all() bulk-attach behavior on annotated querysets
- _fetch_all() skips non-annotated querysets (preserves fallback)
- .count() and .exists() do NOT trigger _fetch_all
- Host.latest_summary cache hits (zero queries) and fallback
- Host.latest_job property
- select_related on bulk-attached summaries (no N+1)
- Chaining preserves annotation
- Multiple jobs / partial host coverage

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Apply black formatting to test_host_queryset.py

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Ben Thomasson <bthomass@redhat.com>

* Fix flake8 F841: remove unused job1/job2 variables in tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Ben Thomasson <bthomass@redhat.com>

* Add comment explaining why Prefetch was not used for host latest summary

Django Prefetch cannot handle latest per group -- [:1] slicing fetches
1 record globally, not per host (Django ticket #26780). The custom
_fetch_all override uses the same 2-query pattern as prefetch_related
internally, customized for this use case.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix null handling to keep old behavior

---------

Signed-off-by: Ben Thomasson <bthomass@redhat.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: AlanCoding <arominge@redhat.com>
This commit is contained in:
Ben Thomasson
2026-04-28 10:47:22 -04:00
committed by GitHub
parent f3b7d442c3
commit d1b3ae53ae
11 changed files with 490 additions and 100 deletions

View File

@@ -174,8 +174,8 @@ SUMMARIZABLE_FK_FIELDS = {
'workflow_approval': DEFAULT_SUMMARY_FIELDS + ('timeout',),
'schedule': DEFAULT_SUMMARY_FIELDS + ('next_run',),
'unified_job_template': DEFAULT_SUMMARY_FIELDS + ('unified_job_type',),
'last_job': DEFAULT_SUMMARY_FIELDS + ('finished', 'status', 'failed', 'license_error', 'canceled_on'),
'last_job_host_summary': DEFAULT_SUMMARY_FIELDS + ('failed',),
# last_job and last_job_host_summary are derived from JobHostSummary in HostSerializer,
# not from the stale FK fields on Host.
'last_update': DEFAULT_SUMMARY_FIELDS + ('status', 'failed', 'license_error'),
'current_update': DEFAULT_SUMMARY_FIELDS + ('status', 'failed', 'license_error'),
'current_job': DEFAULT_SUMMARY_FIELDS + ('status', 'failed', 'license_error'),
@@ -1837,19 +1837,35 @@ class HostSerializer(BaseSerializerWithVariables):
res['ansible_facts'] = self.reverse('api:host_ansible_facts_detail', kwargs={'pk': obj.instance_id})
if obj.inventory:
res['inventory'] = self.reverse('api:inventory_detail', kwargs={'pk': obj.inventory.pk})
if obj.last_job:
res['last_job'] = self.reverse('api:job_detail', kwargs={'pk': obj.last_job.pk})
if obj.last_job_host_summary:
res['last_job_host_summary'] = self.reverse('api:job_host_summary_detail', kwargs={'pk': obj.last_job_host_summary.pk})
last_summary = obj.latest_summary
if last_summary:
res['last_job_host_summary'] = self.reverse('api:job_host_summary_detail', kwargs={'pk': last_summary.pk})
if last_summary.job_id:
res['last_job'] = self.reverse('api:job_detail', kwargs={'pk': last_summary.job_id})
return res
def get_summary_fields(self, obj):
d = super(HostSerializer, self).get_summary_fields(obj)
try:
d['last_job']['job_template_id'] = obj.last_job.job_template.id
d['last_job']['job_template_name'] = obj.last_job.job_template.name
except (KeyError, AttributeError):
pass
last_summary = obj.latest_summary
if last_summary:
d['last_job_host_summary'] = OrderedDict()
d['last_job_host_summary']['id'] = last_summary.id
d['last_job_host_summary']['failed'] = last_summary.failed
try:
last_job = last_summary.job
d['last_job'] = OrderedDict()
for field in DEFAULT_SUMMARY_FIELDS + ('finished', 'status', 'failed', 'canceled_on'):
fval = getattr(last_job, field, None)
if fval is not None:
d['last_job'][field] = fval
if last_job.job_template:
d['last_job']['job_template_id'] = last_job.job_template.id
d['last_job']['job_template_name'] = last_job.job_template.name
except ObjectDoesNotExist:
pass
else:
d.pop('last_job', None)
d.pop('last_job_host_summary', None)
if has_model_field_prefetched(obj, 'groups'):
group_list = sorted([{'id': g.id, 'name': g.name} for g in obj.groups.all()], key=lambda x: x['id'])[:5]
else:
@@ -1924,14 +1940,16 @@ class HostSerializer(BaseSerializerWithVariables):
return ret
if 'inventory' in ret and not obj.inventory:
ret['inventory'] = None
if 'last_job' in ret and not obj.last_job:
ret['last_job'] = None
if 'last_job_host_summary' in ret and not obj.last_job_host_summary:
ret['last_job_host_summary'] = None
last_summary = obj.latest_summary
if 'last_job' in ret:
ret['last_job'] = last_summary.job_id if last_summary else None
if 'last_job_host_summary' in ret:
ret['last_job_host_summary'] = last_summary.pk if last_summary else None
return ret
def get_has_active_failures(self, obj):
return bool(obj.last_job_host_summary and obj.last_job_host_summary.failed)
last_summary = obj.latest_summary
return bool(last_summary and last_summary.failed)
def get_has_inventory_sources(self, obj):
return obj.inventory_sources.exists()

View File

@@ -21,7 +21,7 @@ from urllib3.exceptions import ConnectTimeoutError
# Django
from django.conf import settings
from django.core.exceptions import FieldError, ObjectDoesNotExist
from django.db.models import Q, Sum, Count
from django.db.models import Q, Sum, Count, Subquery, OuterRef
from django.db import IntegrityError, ProgrammingError, transaction, connection
from django.db.models.fields.related import ManyToManyField, ForeignKey
from django.db.models.functions import Trunc
@@ -210,10 +210,10 @@ class DashboardView(APIView):
data['groups'] = {'url': reverse('api:group_list', request=request), 'total': user_groups.count(), 'inventory_failed': groups_inventory_failed}
user_hosts = get_user_queryset(request.user, models.Host)
user_hosts_failed = user_hosts.filter(last_job_host_summary__failed=True)
latest_summary_failed = Subquery(models.JobHostSummary.objects.filter(host_id=OuterRef('pk')).order_by('-id').values('failed')[:1])
user_hosts_failed = user_hosts.annotate(_latest_failed=latest_summary_failed).filter(_latest_failed=True)
data['hosts'] = {
'url': reverse('api:host_list', request=request),
'failures_url': reverse('api:host_list', request=request) + "?last_job_host_summary__failed=True",
'total': user_hosts.count(),
'failed': user_hosts_failed.count(),
}
@@ -1943,7 +1943,7 @@ class HostList(HostRelatedSearchMixin, ListCreateAPIView):
if filter_string:
filter_qs = SmartFilter.query_from_string(filter_string)
qs &= filter_qs
return qs.distinct()
return qs.distinct().with_latest_summary_id()
def list(self, *args, **kwargs):
try:
@@ -1958,6 +1958,9 @@ class HostDetail(RelatedJobsPreventDeleteMixin, RetrieveUpdateDestroyAPIView):
serializer_class = serializers.HostSerializer
resource_purpose = 'host detail'
def get_queryset(self):
return super().get_queryset().with_latest_summary_id()
@extend_schema_if_available(extensions={"x-ai-description": "Delete a host"})
def delete(self, request, *args, **kwargs):
if self.get_object().inventory.pending_deletion:
@@ -1991,6 +1994,9 @@ class InventoryHostsList(HostRelatedSearchMixin, SubListCreateAttachDetachAPIVie
filter_read_permission = False
resource_purpose = 'hosts of an inventory'
def get_queryset(self):
return super().get_queryset().with_latest_summary_id()
class HostGroupsList(SubListCreateAttachDetachAPIView):
'''the list of groups a host is directly a member of'''
@@ -2174,6 +2180,9 @@ class GroupHostsList(HostRelatedSearchMixin, SubListCreateAttachDetachAPIView):
relationship = 'hosts'
resource_purpose = 'hosts of a group'
def get_queryset(self):
return super().get_queryset().with_latest_summary_id()
def update_raw_data(self, data):
data.pop('inventory', None)
return super(GroupHostsList, self).update_raw_data(data)
@@ -2205,7 +2214,7 @@ class GroupAllHostsList(HostRelatedSearchMixin, SubListAPIView):
self.check_parent_access(parent)
qs = self.request.user.get_queryset(self.model).distinct() # need distinct for '&' operator
sublist_qs = parent.all_hosts.distinct()
return qs & sublist_qs
return (qs & sublist_qs).with_latest_summary_id()
class GroupInventorySourcesList(SubListAPIView):
@@ -2498,6 +2507,9 @@ class InventorySourceHostsList(HostRelatedSearchMixin, SubListDestroyAPIView):
check_sub_obj_permission = False
resource_purpose = 'hosts of an inventory source'
def get_queryset(self):
return super().get_queryset().with_latest_summary_id()
def perform_list_destroy(self, instance_list):
inv_source = self.get_parent_object()
with ignore_inventory_computed_fields():