Commit Graph

3597 Commits

Author SHA1 Message Date
Dave
67048a609a feat(analytics): log request_id, account_number, org_id from ingress API response (#16512)
* feat(analytics): ANSTRAT-2268 log request_id, account_number, org_id from ingress API response

Log the ingress API success response fields (request_id, account_number,
org_id) in the controller task log when gather_analytics tarballs are
uploaded. This enables support engineers to trace uploads through to
Kibana without source code modifications.

* style(analytics): use f-strings in _log_shipping_response to match codebase conventions

* style(analytics): fix black formatting in test assertions
2026-06-22 10:03:57 -04:00
Dirk Julich
c1bd2eb338 [AAP-72817] Fix cartesian product in organization user/admin count queries (#16501)
* Fix cartesian product in organization user/admin count queries

The organizations list and detail endpoints annotated each org with user and admin counts using two Count() calls that traverse the Role.members M2M. Django generated two LEFT JOINs on the same through table, crossing every member row with every admin row before COUNT(DISTINCT) reduced the product.

At scale (2,617 members × 46,233 admins) this produced 120M intermediate rows and 96-second query times, causing 504 timeouts.

Replace with independent Subquery expressions that each query main_rbac_roles_members separately - no cross product.

Fixes: AAP-72817
Fixes: AAP-72480
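
The row arithmetic behind this fix can be checked with a small sketch. It is not the view code: plain Python ranges stand in for the joined through-table rows, the function names are hypothetical, and only the two counts (2,617 and 46,233) come from the message above.

```python
from itertools import product

MEMBERS = 2_617   # member rows for one organization (figure quoted above)
ADMINS = 46_233   # admin rows for the same organization (figure quoted above)


def joined_row_count(members: int, admins: int) -> int:
    """Intermediate rows when both LEFT JOINs hang off the same organization row."""
    return members * admins


def independent_row_count(members: int, admins: int) -> int:
    """Rows touched when each count runs as its own subquery."""
    return members + admins


def distinct_counts_via_cross_product(member_ids, admin_ids):
    """Mimic COUNT(DISTINCT ...) over the crossed rows (small inputs only)."""
    rows = list(product(member_ids, admin_ids))
    distinct_members = len({m for m, _ in rows})
    distinct_admins = len({a for _, a in rows})
    return distinct_members, distinct_admins, len(rows)
```

With the quoted figures, `joined_row_count(MEMBERS, ADMINS)` is 120,991,761 while `independent_row_count(MEMBERS, ADMINS)` is 48,850. The distinct counts come out the same either way, which is why the cross product only cost time rather than producing wrong numbers.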

* Fix variable names which do not meet coding standards

* Fix formatting inconsistency in organization detail subquery annotation

Break the long .annotate() line across multiple lines to match the style used in mixin.py.

* Rewrite org count subqueries to use DAB RBAC models

Replace old RBAC Role.members.through subqueries with
RoleUserAssignment-based correlated subqueries, querying
managed RoleDefinitions ('Organization Member' / 'Organization Admin')
directly. This aligns with the DAB RBAC migration direction and
eliminates dependency on the deprecated ImplicitRoleField M2M tables
for these counts.

Update test fixtures to use RoleDefinition.give_permission() and
add setup_managed_roles where needed.

* Fix collection tests: set up managed role definitions

The DAB RBAC migration to use RoleUserAssignment subqueries in
organization views requires managed role definitions (Organization
Member, Organization Admin) to exist in the test database.

Add an autouse fixture to the collection test conftest that calls
setup_managed_role_definitions() before each test.

* Add setup_managed_roles fixture to functional tests hitting org views

Tests that hit organization list/detail views now require the
setup_managed_roles fixture to pre-create the Organization Member
and Organization Admin RoleDefinition objects used by the DAB RBAC
subqueries.

* Revert setup_managed_roles from ext_auditor tests

The setup_managed_roles fixture conflicts with the ext_auditor_rd
fixture by deleting the Alien Auditor role definition. These tests
don't need it — the defensive view code handles missing role
definitions gracefully.

* Handle missing Organization Member/Admin role definitions gracefully

Use filter().first() instead of get() for RoleDefinition lookups in
organization list and detail views. Returns 0 for user/admin counts
when role definitions are not yet created, preventing 500 errors in
environments where post_migrate signals haven't run.

* Cast OuterRef('pk') to TextField for RoleUserAssignment.object_id comparison

RoleUserAssignment.object_id is a TextField, but OuterRef('pk') on
Organization produces an integer. PostgreSQL strictly rejects text = integer
comparisons. Use Cast() to explicitly convert the PK to text.

---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-18 18:35:22 +02:00
Dirk Julich
61d17673d9 Optimize HostManager.active_count() with cache (#16505)
* [AAP-78392] Optimize HostManager.active_count() with cache and functional index

active_count() runs a full sequential scan with LOWER()+DISTINCT on main_host for every license check. At customer scale this consumed 74.5 minutes of DB time over 4 hours (47K calls at 93ms avg).

Add a 60-second Redis-backed cache via the existing memoize decorator to reduce call volume by ~99.5%. Add a functional btree index on LOWER(name) to eliminate the sequential scan for the remaining calls.
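
A rough way to sanity-check the ~99.5% figure is sketched below. This is an assumption-laden stand-in, not the existing memoize decorator: `ttl_cache` and `simulate` are hypothetical names, the cache is an in-process dict rather than Redis, and calls are assumed to be spread evenly over the 4-hour window.

```python
import time
from functools import wraps


def ttl_cache(ttl_seconds, clock=time.monotonic):
    """Cache a zero-argument function's result for ttl_seconds (hypothetical helper)."""
    def decorator(func):
        state = {"expires": None, "value": None}

        @wraps(func)
        def wrapper():
            now = clock()
            # Recompute only when nothing is cached yet or the entry has expired.
            if state["expires"] is None or now >= state["expires"]:
                state["value"] = func()
                state["expires"] = now + ttl_seconds
            return state["value"]

        return wrapper
    return decorator


def simulate(calls, window_seconds, ttl_seconds):
    """Spread `calls` evenly over the window and count real executions."""
    fake_now = [0.0]
    executed = [0]

    @ttl_cache(ttl_seconds, clock=lambda: fake_now[0])
    def active_count():
        executed[0] += 1
        return 0  # placeholder for the expensive query result

    step = window_seconds / calls
    for i in range(calls):
        fake_now[0] = i * step
        active_count()
    return executed[0]
```

Under those assumptions, `simulate(47_000, 4 * 3600, 60)` executes the wrapped function roughly once per minute, so only about 240 of the 47,000 calls reach the database.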

* Use AddIndexConcurrently instead of AddIndex in the migration for host name lower index

* Revert AddIndexConcurrently to AddIndex for CI compatibility

The api-migrations CI job runs against SQLite which does not support PostgreSQL-specific AddIndexConcurrently. Standard AddIndex works across all backends and the brief write lock during production upgrades is acceptable for this table size.

* Remove functional index, keep cache-only fix per reviewer feedback

Drop the LOWER(name) functional index and migration to minimize
the change footprint.

----
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-06-18 17:48:29 +02:00
Rodrigo Toshiaki Horie
242f008f44 feat: restore x-ai-description entries in OpenAPI schema (#16502)
feat: inject x-ai-description from overlay file during schema generation

Many endpoints have human-readable AI descriptions that were added
downstream in aap-mcp-server (PRs #73 and #119) but never backported
as @extend_schema_if_available decorators. This causes 470 out of 631
x-ai-description entries to be lost every time the spec is regenerated.

Add a JSON overlay file (openapi_ai_descriptions.json) containing the
missing descriptions keyed by operationId, and a drf-spectacular
postprocessing hook that merges them into the generated schema for any
operation that doesn't already have x-ai-description from a decorator.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-06-17 15:26:24 -03:00
Seth Foster
d5e5ea3670 Lazy-load plugin registries, move DB sync to dispatcher (#16483)
Move plugin loading to lazy-on-first-access, DB sync to dispatcher

Remove credential type and inventory plugin loading from Django's
app.ready() path. In-memory registries (ManagedCredentialType.registry
and InventorySourceOptions.injectors) are now populated lazily on first
access via LazyLoadDict, a dict subclass that calls a loader function
on the first read operation. This ensures web workers, dispatcher
workers, and management commands all get the registries populated
exactly when needed, without eager loading at startup.
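
The lazy-on-first-access idea can be illustrated with a minimal sketch. Only the name LazyLoadDict and the "loader runs on first read" behavior come from the message above; the method set and loader signature here are assumptions and may differ from the real class.

```python
class LazyLoadDict(dict):
    """Dict that stays empty until the first read, then fills itself from a loader."""

    def __init__(self, loader):
        super().__init__()
        self._loader = loader      # callable returning a mapping (assumed signature)
        self._loaded = False

    def _ensure_loaded(self):
        if not self._loaded:
            data = self._loader()
            dict.update(self, data)
            self._loaded = True    # flip only after a successful load

    # Every read path triggers the load; dict's C methods do not call each other,
    # so each one is overridden explicitly.
    def __getitem__(self, key):
        self._ensure_loaded()
        return super().__getitem__(key)

    def __contains__(self, key):
        self._ensure_loaded()
        return super().__contains__(key)

    def __iter__(self):
        self._ensure_loaded()
        return super().__iter__()

    def __len__(self):
        self._ensure_loaded()
        return super().__len__()

    def get(self, key, default=None):
        self._ensure_loaded()
        return super().get(key, default)

    def keys(self):
        self._ensure_loaded()
        return super().keys()

    def values(self):
        self._ensure_loaded()
        return super().values()

    def items(self):
        self._ensure_loaded()
        return super().items()
```

Constructing the dict costs nothing; the loader runs once, on whichever process first reads from the registry.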

The DB sync (CredentialType.setup_tower_managed_defaults) is moved to
the dispatcher's startup task, where it only needs to run once per
deployment rather than in every Django process.

Co-Authored-By: Alan Rominger <arominge@redhat.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-06-09 12:45:50 -04:00
Peter Braun
c8cb465fde fix: do not allow exec instances to be added to the control plane (#16477)
Co-authored-by: Stevenson Michel <iamstevensonmichel@outlook.com>
2026-06-09 09:04:10 -04:00
Alan Rominger
49e21d7c1c Move PG version check to awx-manage check_db & migrate commands (#15463)
* Move PG version check to check_db command

Move to utils, check in pre_migrate signal

* Add back in environment var skip

* Add tests for compliance

tests Assisted-By: claude
2026-06-08 10:57:55 -04:00
Lila Yasin
b4f27de4a2 [AAP-53283] Use getproxies() for analytics API proxy configuration (#16475)
Revert "[AAP-53283] Fix analytics API requests to respect proxy environment variables (#16451)"

This reverts commit 45480941f8.
2026-06-02 15:38:18 -04:00
Seth Foster
5cc467d4cf [AAP-74497] Reset orphaned waiting jobs when controller node is deprovisioned (#16467)
Reset orphaned waiting jobs when controller node is deprovisioned

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-06-02 10:46:52 -04:00
Stevenson Michel
b14b9e1771 [Devel] Reject RRULE with INTERVAL=0 to Prevent Scheduler Hang (#16464)
* added interval null rrule check and updated tests

* Added secondly to the expected errors
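
A minimal sketch of the kind of check described here, assuming the schedule is stored as a string such as "DTSTART:... RRULE:FREQ=DAILY;INTERVAL=1". The function name and error text are hypothetical, and the real validator may be structured differently.

```python
def validate_rrule_interval(rrule: str) -> None:
    """Raise ValueError if any RRULE part carries a non-positive INTERVAL."""
    for token in rrule.split():
        if not token.upper().startswith("RRULE:"):
            continue  # skip DTSTART and other non-RRULE tokens
        for part in token[len("RRULE:"):].split(";"):
            name, _, value = part.partition("=")
            if name.upper() != "INTERVAL":
                continue
            try:
                interval = int(value)
            except ValueError:
                raise ValueError(f"INTERVAL must be an integer, got {value!r}")
            # An interval of zero never advances to the next occurrence.
            if interval < 1:
                raise ValueError(f"INTERVAL must be at least 1, got {interval}")
```

Rejecting the rule at validation time keeps a zero interval from ever reaching the code that iterates occurrences.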
2026-06-02 10:22:36 -04:00
Alan Rominger
fccb6744f9 Address even more pytest warnings, and migrate smart inventory tests (#16330)
* Address even more pytest warnings, co-authored with Opus 4.6

* Upgrade pyparsing

* Attempt to update smart inventory logic

* Move smart inventory tests here

* Fix some failing dev env tests

Assisted-by: claude

* Use shared fixture for teardown

* Fix test goof

assisted-by: claude Opus 4.6
2026-05-28 16:17:13 -04:00
Alan Rominger
200a68aefa [AAP-57274] Fix creator permissions for models without old-style roles (#16457)
* [AAP-57274] Fix creator permissions for models without old-style roles

NotificationTemplate has no old-style ImplicitRoleField (like admin_role)
because notification permissions were historically org-level only.
When a non-admin user creates a notification template,
give_creator_permissions tries to sync the DAB RBAC assignment back
to the old role system and hits an AttributeError.

Catch the AttributeError so the DAB RBAC assignment still succeeds.

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-28 16:06:03 -04:00
Alan Rominger
9b922f70ed Make dispatcherd min_workers literally 4 (#16421)
assisted-by: claude
2026-05-28 09:58:13 -04:00
Lila Yasin
e4fa4810eb Expose JOB_VARIABLE_PREFIXES as a configurable setting (#16452)
* Change INCLUDE_AWX_VAR_PREFIX to a USE_TOWER_VAR_PREFIX boolean toggle

Replace the include-legacy-prefix boolean with a tower-or-awx toggle. USE_TOWER_VAR_PREFIX=True (default) emits only tower_ prefixed variables; False emits only awx_ (deprecated).

* Clean up dead constant and cache get_job_variable_prefixes() calls

* Revise tests to reflect new behavior

* Fix fragile fallback test to actually exercise getattr default

* Fix mock target for settings fallback test
2026-05-26 16:43:50 -04:00
Dirk Julich
b37f3892b6 [AAP-74343] Decouple installed_collections/ansible_version from indirect node counting flag (#16453)
* [AAP-74343] Decouple installed_collections and ansible_version from indirect node counting flag

The indirect_instance_count callback plugin and its artifact processing
were entirely gated behind FEATURE_INDIRECT_NODE_COUNTING_ENABLED. This
caused installed_collections and ansible_version to remain unpopulated
when the flag was off, even though these are baseline analytics fields
unrelated to indirect host counting.

Always run the callback plugin and persist installed_collections and
ansible_version to the database. Only the indirect-counting-specific
parts (EventQuery creation, event_queries_processed flag, and vendor
collections) remain gated behind the feature flag.

* [AAP-74343] Read callbacks_enabled from ansible.cfg so user-configured callbacks are preserved

The check for 'callbacks_enabled' in config_values was dead code because
read_ansible_config was never asked to read that setting. Now that the
callback registration runs unconditionally, fix this by including
'callbacks_enabled' in the variables of interest.

* [AAP-74343] Use comma delimiter for ANSIBLE_CALLBACKS_ENABLED

Ansible's CALLBACKS_ENABLED config is type list and splits on commas.
The colon delimiter would cause combined callback names to be treated
as a single invalid name.
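
The delimiter problem can be shown with a short sketch. `merge_callbacks` is a hypothetical helper, not the function changed in this commit; it only assumes that the combined value is later split on commas, as stated above.

```python
def merge_callbacks(configured: str, required: str = "indirect_instance_count") -> str:
    """Append the required callback to a comma-separated list, keeping user entries."""
    names = [name.strip() for name in configured.split(",") if name.strip()]
    if required not in names:
        names.append(required)
    return ",".join(names)


def split_like_a_comma_list(value: str) -> list:
    """Stand-in for how a comma-split list option would read the value."""
    return [name.strip() for name in value.split(",") if name.strip()]
```

Joining with a colon would give a single entry such as "timer:indirect_instance_count" after the comma split, which matches no callback name.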

* [AAP-74343] Add tests for ANSIBLE_CALLBACKS_ENABLED configuration

Verify that indirect_instance_count is always set, user-configured
callbacks from ansible.cfg are preserved, and the comma delimiter
is used as ansible-core expects.

* [AAP-74343] Use public API for namespace package path access

Replace library.__path__._path[0] with library.__path__[0] to avoid
relying on a private CPython implementation detail of _NamespacePath.

* [AAP-74343] Skip host query scanning when indirect counting flag is off

The indirect_instance_count callback plugin now checks AWX_COLLECT_HOST_QUERIES
to decide whether to scan for host query files. When the feature flag is off,
the plugin only collects collection metadata (name + version) and ansible_version,
skipping the expensive embedded/external query file discovery.

* [AAP-74343] Set AWX_COLLECT_HOST_QUERIES in query discovery tests

The TestExternalQueryDiscovery tests exercise the host query scanning
path, which now requires AWX_COLLECT_HOST_QUERIES=1 in the environment.

* [AAP-74343] Use Ansible plugin config system for collect_host_queries

Declare collect_host_queries as a formal plugin option in DOCUMENTATION
with env var AWX_COLLECT_HOST_QUERIES, replacing the raw os.getenv() call
with self.get_option(). This follows the standard Ansible plugin
configuration pattern.

* [AAP-74343] Add test for disabled collect_host_queries path

Verify that when collect_host_queries is false, the plugin still
enumerates collections for metadata but skips host query file scanning.

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-05-22 14:45:08 +02:00
Lila Yasin
45480941f8 [AAP-53283] Fix analytics API requests to respect proxy environment variables (#16451)
Fix analytics API requests to respect proxy environment variables 
Assisted-by: Claude
2026-05-15 13:13:15 -04:00
jessicamack
90b7d35554 Implement Candlepin certificate integration (#16388)
* AAP-12516 [option 2] Handle nested workflow artifacts via root node `ancestor_artifacts` (#16381)

* Add new test for artifact precedence upstream node vs outer workflow

* Fix bugs, upstream artifacts come first for precedence

* Track nested artifacts path through ancestor_artifacts on root nodes

* Fix case where first root node did not get the vars

* touchup comment

* Prevent conflict with sliced jobs hack

* Reorder URLs so that Django debug toolbar can work (#16352)

* Reorder URLs so that Django debug toolbar can work

* Move comment with URL move

* feat: support for oidc credential /test endpoint (#16370)

Adds support for testing external credentials that use OIDC workload identity tokens.
When FEATURE_OIDC_WORKLOAD_IDENTITY_ENABLED is enabled, the /test endpoints return
JWT payload details alongside test results.

- Add OIDC credential test endpoints with job template selection
- Return JWT payload and secret value in test response
- Maintain backward compatibility (detail field for errors)
- Add comprehensive unit and functional tests
- Refactor shared error handling logic

Co-authored-by: Daniel Finca <dfinca@redhat.com>
Co-authored-by: melissalkelly <melissalkelly1@gmail.com>

* Bind the install bundle to the ansible.receptor collection 2.0.8 version (#16396)

* [Devel] Config Endpoint Optimization (#16389)

* Improved performance of the config endpoint by reducing database queries in GET /api/controller/v2/config/

* Fix OIDC workload identity for inventory sync (#16390)

The cloud credential used by inventory updates was not going through
the OIDC workload identity token flow because it lives outside the
normal _credentials list. This overrides populate_workload_identity_tokens
in RunInventoryUpdate to include the cloud credential as an
additional_credentials argument to the base implementation, and
patches get_cloud_credential on the instance so the injector picks up
the credential with OIDC context intact.

Co-authored-by: Alan Rominger <arominge@redhat.com>
Co-authored-by: Dave Mulford <dmulford@redhat.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: integrate awx-tui to the awx_devel image (#16399)

* Aap 45980 (#16395)

* support bitbucket_dc webhooks

* add test

* update docs

* fix import for refactored method (#16394)

retrieve_workload_identity_jwt_with_claims is now
in a separate utility file, not in jobs.py

Signed-off-by: Seth Foster <fosterbseth@gmail.com>

* AAP-70257 controller collection should retry transient HTTP errors with exponential backoff. (#16415)

controller collection should retry transient HTTP errors with exponential backoff

* AAP-71844 Fix rrule fast-forward across DST boundaries (#16407)

Fix rrule fast-forward producing wrong occurrences across DST boundaries

The UTC round-trip in _fast_forward_rrule shifts the dtstart's local
hour when the original and fast-forwarded times are in different DST
periods. Since dateutil generates HOURLY occurrences by stepping in
local time, the shifted hour changes the set of reachable hours. With
BYHOUR constraints this causes a ValueError crash; without BYHOUR,
occurrences are silently shifted by 1 hour.

Fix by performing all arithmetic in the dtstart's original timezone.
Python aware-datetime subtraction already computes absolute elapsed
time regardless of timezone, so the UTC conversion was unnecessary
for correctness and actively harmful during fall-back ambiguity.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Correctly restrict push actions to ownership repos (#16398)

* Correctly restrict push actions to ownership repos

* Use standard action to see if push actions should run

* Run spec job for 2.6 and higher

* Be even more restrictive, do not push if on a fork

* [Devel] Performance Optimization for Select Hosts Query (#16413)

* Fixed black reformatting

* Make test simulate 500k hosts in real world scenario

* feat: improve unauthorized response on aap deployments (#16422)

* fix: do not include secret values in the credentials test endpoint an… (#16425)

fix: do not include secret values in the credentials test endpoint and add a guard to make sure credentials are testable

* [devel backport] AAP-41742: Fix workflow node update failing when JT has unprompted labels (#16426)

* AAP-41742: Fix workflow node update failing when JT has unprompted labels

PATCH extra_data on a workflow node fails with
{"labels":["Field is not configured to prompt on launch."]}
when the node has labels associated but the JT has
ask_labels_on_launch=False.

The serializer was passing all persisted M2M state from prompts_dict()
to _accept_or_ignore_job_kwargs() on every PATCH, re-validating
unchanged fields. Fix scopes validation to only the fields in the
request; full re-validation still occurs when unified_job_template
is being changed.

* Capture attrs keys before _build_mock_obj mutates them

_build_mock_obj() pops pseudo-fields (limit, scm_branch, job_tags,
etc.) from attrs. Computing requested_prompt_fields after the pop
would miss those fields and skip their ask_on_launch validation.

* Include survey_passwords when validating extra_vars prompts

prompts_dict() emits survey_passwords alongside extra_vars.
_accept_or_ignore_job_kwargs uses it to decrypt encrypted survey
values before validation. Without it, encrypted password blobs
are validated as-is against the survey spec.

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* feat: add test to ensure credential secret values are not returned (#16434)

* AAP-68024 perf: derive last_job_host_summary from query instead of denormalized FK (#16332)

* perf: stop eagerly updating Host.last_job_host_summary on every job completion

The playbook_on_stats wrapup path bulk-updates last_job_host_summary_id
on every host touched by a job. In the Q4CY25 scale lab this query had
a median execution time of 75 seconds due to index churn on main_host.

Replace all reads of the denormalized FK with a new classmethod
JobHostSummary.latest_for_host(host_id) that queries for the most
recent summary on demand. This eliminates the write-side bulk_update
of last_job_host_summary_id entirely.

Changes:
- Add JobHostSummary.latest_for_host() classmethod
- Serializer: use latest_for_host() instead of obj.last_job_host_summary
- Dashboard view: use subquery instead of FK traversal for failed hosts
- Inventory.update_computed_fields: use subquery for failed host count
- events.py: remove last_job_host_summary_id from bulk_update
- signals.py: simplify _update_host_last_jhs to only update last_job
- access.py/managers.py: remove select_related/defer through the FK

The FK field on Host is left in place for now (removal requires a
migration) but is no longer written to.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix .pk AttributeError, add job_template annotations, annotate host sublists

- Add 'pk' to AnnotatedSummary dynamic type (fixes AttributeError in get_related)
- Add job_template_id and job_template_name to subquery annotations so list
  views include these fields in summary_fields.last_job (matching detail views)
- Traverse job__ FK from JobHostSummary instead of using separate UnifiedJob
  subquery with OuterRef on another annotation (cleaner SQL, avoids alias issue)
- Annotate all host sublist views (InventoryHostsList, GroupHostsList,
  GroupAllHostsList, InventorySourceHostsList) to prevent N+1 queries

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Update test_events to use JobHostSummary.latest_for_host instead of stale FKs

Tests were asserting host.last_job_id and host.last_job_host_summary_id
which are no longer updated. Use JobHostSummary.latest_for_host() to
derive the same data, matching the new read-time derivation approach.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Remove stale failures_url from deprecated DashboardView

The failures_url linked to ?last_job_host_summary__failed=True which
filters on the now-stale FK. The dashboard count itself was already
fixed to use a subquery annotation. Since DashboardView is deprecated
and has_active_failures is a SerializerMethodField (not filterable),
remove the failures_url entirely rather than creating a custom filter.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Apply black formatting to changed files

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Refactor: replace 10 subquery annotations with bulk prefetch

Instead of annotating every host queryset with 10 correlated subqueries
(summary + job + job_template fields), annotate only _latest_summary_id
and bulk-fetch the full JobHostSummary objects after pagination via
select_related('job', 'job__job_template').

This reduces the SQL from 10 correlated subqueries to 1 subquery + 1 IN
query, addressing review feedback about annotation overhead on host list
views.

- _annotate_host_latest_summary: only annotates _latest_summary_id
- _prefetch_latest_summaries: bulk-fetches and attaches to host objects
- HostSummaryPrefetchMixin: hooks into list() after pagination
- Serializer uses real JobHostSummary objects (no more AnnotatedSummary)
- to_representation always overwrites stale FK values
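
The "1 subquery + 1 IN query" shape can be sketched outside Django with the standard-library sqlite3 module. Everything here is illustrative: the table and column names are hypothetical, and "most recent" is assumed to mean the highest summary id, which may not match the real ordering.

```python
import sqlite3


def hosts_with_latest_summary(conn):
    """Attach each host's latest summary using two queries in total."""
    # Query 1: the correlated subquery returns only the latest summary id per host.
    hosts = [
        {"id": hid, "name": name, "latest_summary_id": sid, "latest_summary": None}
        for hid, name, sid in conn.execute(
            """
            SELECT h.id, h.name,
                   (SELECT s.id FROM summary s
                     WHERE s.host_id = h.id
                     ORDER BY s.id DESC LIMIT 1)
              FROM host h
             ORDER BY h.id
            """
        )
    ]
    ids = [h["latest_summary_id"] for h in hosts if h["latest_summary_id"] is not None]
    if ids:
        # Query 2: fetch the full summary rows in one IN lookup.
        placeholders = ",".join("?" * len(ids))
        rows = conn.execute(
            f"SELECT id, job_id, failed FROM summary WHERE id IN ({placeholders})", ids
        ).fetchall()
        by_id = {r[0]: {"id": r[0], "job_id": r[1], "failed": bool(r[2])} for r in rows}
        for host in hosts:
            host["latest_summary"] = by_id.get(host["latest_summary_id"])
    return hosts


def demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE host (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE summary (id INTEGER PRIMARY KEY, host_id INTEGER,
                              job_id INTEGER, failed INTEGER);
        INSERT INTO host VALUES (1, 'web1'), (2, 'db1');
        INSERT INTO summary VALUES (1, 1, 10, 0), (2, 1, 11, 1);
        """
    )
    return hosts_with_latest_summary(conn)
```

In `demo()`, the first host ends up with the summary for job 11 and the second host, which has no summaries, keeps `None`.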

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Refactor: move latest summary to QuerySet._fetch_all + Host.latest_summary

Per review feedback, replace the view-level HostSummaryPrefetchMixin
with a custom QuerySet that bulk-attaches summaries at evaluation time
(like prefetch_related), and a Host.latest_summary property as the
single access point.

- HostLatestSummaryQuerySet: overrides _fetch_all() to bulk-fetch
  JobHostSummary objects with select_related after queryset evaluation
- HostManager now inherits from the custom queryset via from_queryset()
- Host.latest_summary property: uses cache if available, falls back to
  individual query
- Remove _annotate_host_latest_summary, _prefetch_latest_summaries,
  HostSummaryPrefetchMixin from views — no more list() override needed
- Remove last_job/last_job_host_summary from SUMMARIZABLE_FK_FIELDS
- Serializer uses obj.latest_summary and DEFAULT_SUMMARY_FIELDS loop

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix: scope annotation to views, restore license_error/canceled_on

- Remove with_latest_summary_id() from HostManager.get_queryset() to
  avoid applying the correlated subquery to every Host query globally
  (count, exists, internal relations)
- Apply with_latest_summary_id() in get_queryset() of the 6
  host-serving views only
- Restore license_error and canceled_on to last_job summary fields
  to avoid breaking API change

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Guard _fetch_all() to skip bulk-attach on non-annotated querysets

Without this guard, _fetch_all() would set _latest_summary_cache=None
on every host in non-annotated querysets (e.g. Host.objects.filter()),
masking the per-object fallback query in Host.latest_summary.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Remove name from last_job_host_summary and canceled_on from last_job summary

Per reviewer feedback: these fields were not in the original API contract
via SUMMARIZABLE_FK_FIELDS and their addition would be an API change.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add functional tests for HostLatestSummaryQuerySet and Host.latest_summary

Tests cover:
- with_latest_summary_id() annotation and most-recent selection
- _fetch_all() bulk-attach behavior on annotated querysets
- _fetch_all() skips non-annotated querysets (preserves fallback)
- .count() and .exists() do NOT trigger _fetch_all
- Host.latest_summary cache hits (zero queries) and fallback
- Host.latest_job property
- select_related on bulk-attached summaries (no N+1)
- Chaining preserves annotation
- Multiple jobs / partial host coverage

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Apply black formatting to test_host_queryset.py

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Ben Thomasson <bthomass@redhat.com>

* Fix flake8 F841: remove unused job1/job2 variables in tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Ben Thomasson <bthomass@redhat.com>

* Add comment explaining why Prefetch was not used for host latest summary

Django Prefetch cannot handle latest per group -- [:1] slicing fetches
1 record globally, not per host (Django ticket #26780). The custom
_fetch_all override uses the same 2-query pattern as prefetch_related
internally, customized for this use case.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix null handling to keep old behavior

---------

Signed-off-by: Ben Thomasson <bthomass@redhat.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: AlanCoding <arominge@redhat.com>

* [AAP-72722] Use url instead of jwt_aud for workload identity audience (#16432)

* [AAP-72722] Use url instead of jwt_aud for workload identity audience

The OIDC credential plugin's jwt_aud field is being removed. Use the
plugin's url field as the audience when requesting workload identity
tokens, since the target service URL is the appropriate audience value.

Assisted-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* [Devel] Optimize host_list_rbac query (#16408)

* Defer ansible_facts in HostManager to avoid fetching large JSON column in host list queries (AAP-68023)

The host list endpoint (GET /api/v2/hosts/) fetches the ansible_facts
JSON column unnecessarily, contributing to the 7.8s median query time
at scale. This column can be very large and is not used by the list
serializer.

Changes:
- HostManager.get_queryset() now defers ansible_facts
- finish_fact_cache call site uses .only(*HOST_FACTS_FIELDS) to eagerly
  load ansible_facts when actually needed, avoiding N+1 queries
- Unit test mocks updated to support .only() queryset chaining
- Points DAB dependency at the RBAC query optimization branch for
  combined testing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------

* fix: constructed inventories no longer increase the host count (#16433)

* Fix version worktree (#16431)

* git worktree friendly pre-commit install

* worktrees don't have a .git directory. Before, docker-compose would
  trigger pre-commit install and fail.

* make docker-compose work in git worktree

* AWX tries to discover the version via info stored in the .git/ dir.
  setuptools-scm can find the .git/ dir when started from a worktree,
  but it cannot do so here because only the worktree is mapped into the
  container, not the .git/ dir itself. Thus, we have to detect the
  version outside the container and pass it in. That is why this change
  landed in the Makefile.

* fix: as_user() gateway session cookie fallback (#16437)

Add a fallback that checks for `gateway_sessionid` when no cookie
matches `session_cookie_name`, mirroring the existing fallback in
`Connection.login()`. The finally block now cleans up whichever
cookie name was actually used.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* Pass setting to dispatcherd so it can be configured (#16438)

* fix: allow blank password field to fix OpenAPI schema validation (#16440)

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* first pass porting over metrics

* move settings to defaults

* add more unit tests

* update unit tests

* lint fixes

* more lint fixes

* refactor and address feedback

* remove the api views

* remove model and move helper functions out of licensing

* add settings to API, fix tests, refactoring

* fix circular import

* update tests

* remove duplicate code, handle edge cases, use clearer naming, add test coverage

* update test for changes in ship()

* remove unneeded setting

* _discover_org should account for verify-tls=False

* directly assign settings, detect url, update tests

* log errors close to occurrence

* rename function for clarity, focus on critical tests

* rename for clarity, lint fixes

* fix test params, priority for org discovery

* fix test failures and linting

---------

Signed-off-by: Seth Foster <fosterbseth@gmail.com>
Signed-off-by: Ben Thomasson <bthomass@redhat.com>
Co-authored-by: Alan Rominger <arominge@redhat.com>
Co-authored-by: Daniel Finca <dfinca@redhat.com>
Co-authored-by: melissalkelly <melissalkelly1@gmail.com>
Co-authored-by: Tong He <68936428+unnecessary-username@users.noreply.github.com>
Co-authored-by: Stevenson Michel <iamstevensonmichel@outlook.com>
Co-authored-by: Seth Foster <fosterseth@users.noreply.github.com>
Co-authored-by: Dave Mulford <dmulford@redhat.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Adrià Sala <22398818+adrisala@users.noreply.github.com>
Co-authored-by: Peter Braun <pbraun@redhat.com>
Co-authored-by: Sean Sullivan <ssulliva@redhat.com>
Co-authored-by: Dirk Julich <djulich@redhat.com>
Co-authored-by: Ben Thomasson <bthomass@redhat.com>
Co-authored-by: Dan Leehr <dleehr@users.noreply.github.com>
Co-authored-by: Lila Yasin <lyasin@redhat.com>
Co-authored-by: Chris Meyers <chrismeyersfsu@users.noreply.github.com>
2026-05-14 09:33:48 -04:00
Alan Rominger
9606366625 Consolidate validation rules for same-org restrictions (#16427)
* Consolidate implementation of same-org validation rule

* Update tests for the simplified validation

* Still do validation with deference to the new callback

* Correct falsy handling in view logic
2026-05-12 08:59:45 -04:00
Alan Rominger
6179b16987 AAP-72269 Change fact processing loop to use file listing (#16403)
* Change fact processing loop to use file listing

* Fix some test

* Address coderabbit comments

* Handle saving facts in batches to keep memory low

* Improve log about mismatch in response to review comment
2026-05-05 15:35:46 +02:00
jessicamack
cbbd683720 AAP-70294: Migrate Unit Test Candidates from ATF to Upstream (#16385)
* add converted atf tests

* fix bulk settings test
2026-05-04 15:07:46 +00:00
Peter Braun
df771d0e9d fix: constructed inventories no longer increase the host count (#16433)
2026-04-28 20:01:21 +00:00
Lila Yasin
1213ea6f62 [Devel] Optimize host_list_rbac query (#16408)
* Defer ansible_facts in HostManager to avoid fetching large JSON column in host list queries (AAP-68023)

The host list endpoint (GET /api/v2/hosts/) fetches the ansible_facts
JSON column unnecessarily, contributing to the 7.8s median query time
at scale. This column can be very large and is not used by the list
serializer.

Changes:
- HostManager.get_queryset() now defers ansible_facts
- finish_fact_cache call site uses .only(*HOST_FACTS_FIELDS) to eagerly
  load ansible_facts when actually needed, avoiding N+1 queries
- Unit test mocks updated to support .only() queryset chaining
- Points DAB dependency at the RBAC query optimization branch for
  combined testing

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
2026-04-28 14:00:13 -04:00
Dan Leehr
b66c0105ae [AAP-72722] Use url instead of jwt_aud for workload identity audience (#16432)
* [AAP-72722] Use url instead of jwt_aud for workload identity audience

The OIDC credential plugin's jwt_aud field is being removed. Use the
plugin's url field as the audience when requesting workload identity
tokens, since the target service URL is the appropriate audience value.

Assisted-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-28 10:53:09 -04:00
Ben Thomasson
d1b3ae53ae AAP-68024 perf: derive last_job_host_summary from query instead of denormalized FK (#16332)
* perf: stop eagerly updating Host.last_job_host_summary on every job completion

The playbook_on_stats wrapup path bulk-updates last_job_host_summary_id
on every host touched by a job. In the Q4CY25 scale lab this query had
a median execution time of 75 seconds due to index churn on main_host.

Replace all reads of the denormalized FK with a new classmethod
JobHostSummary.latest_for_host(host_id) that queries for the most
recent summary on demand. This eliminates the write-side bulk_update
of last_job_host_summary_id entirely.

Changes:
- Add JobHostSummary.latest_for_host() classmethod
- Serializer: use latest_for_host() instead of obj.last_job_host_summary
- Dashboard view: use subquery instead of FK traversal for failed hosts
- Inventory.update_computed_fields: use subquery for failed host count
- events.py: remove last_job_host_summary_id from bulk_update
- signals.py: simplify _update_host_last_jhs to only update last_job
- access.py/managers.py: remove select_related/defer through the FK

The FK field on Host is left in place for now (removal requires a
migration) but is no longer written to.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix .pk AttributeError, add job_template annotations, annotate host sublists

- Add 'pk' to AnnotatedSummary dynamic type (fixes AttributeError in get_related)
- Add job_template_id and job_template_name to subquery annotations so list
  views include these fields in summary_fields.last_job (matching detail views)
- Traverse job__ FK from JobHostSummary instead of using separate UnifiedJob
  subquery with OuterRef on another annotation (cleaner SQL, avoids alias issue)
- Annotate all host sublist views (InventoryHostsList, GroupHostsList,
  GroupAllHostsList, InventorySourceHostsList) to prevent N+1 queries

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Update test_events to use JobHostSummary.latest_for_host instead of stale FKs

Tests were asserting host.last_job_id and host.last_job_host_summary_id
which are no longer updated. Use JobHostSummary.latest_for_host() to
derive the same data, matching the new read-time derivation approach.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Remove stale failures_url from deprecated DashboardView

The failures_url linked to ?last_job_host_summary__failed=True which
filters on the now-stale FK. The dashboard count itself was already
fixed to use a subquery annotation. Since DashboardView is deprecated
and has_active_failures is a SerializerMethodField (not filterable),
remove the failures_url entirely rather than creating a custom filter.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Apply black formatting to changed files

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Refactor: replace 10 subquery annotations with bulk prefetch

Instead of annotating every host queryset with 10 correlated subqueries
(summary + job + job_template fields), annotate only _latest_summary_id
and bulk-fetch the full JobHostSummary objects after pagination via
select_related('job', 'job__job_template').

This reduces the SQL from 10 correlated subqueries to 1 subquery + 1 IN
query, addressing review feedback about annotation overhead on host list
views.

- _annotate_host_latest_summary: only annotates _latest_summary_id
- _prefetch_latest_summaries: bulk-fetches and attaches to host objects
- HostSummaryPrefetchMixin: hooks into list() after pagination
- Serializer uses real JobHostSummary objects (no more AnnotatedSummary)
- to_representation always overwrites stale FK values

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Refactor: move latest summary to QuerySet._fetch_all + Host.latest_summary

Per review feedback, replace the view-level HostSummaryPrefetchMixin
with a custom QuerySet that bulk-attaches summaries at evaluation time
(like prefetch_related), and a Host.latest_summary property as the
single access point.

- HostLatestSummaryQuerySet: overrides _fetch_all() to bulk-fetch
  JobHostSummary objects with select_related after queryset evaluation
- HostManager now inherits from the custom queryset via from_queryset()
- Host.latest_summary property: uses cache if available, falls back to
  individual query
- Remove _annotate_host_latest_summary, _prefetch_latest_summaries,
  HostSummaryPrefetchMixin from views — no more list() override needed
- Remove last_job/last_job_host_summary from SUMMARIZABLE_FK_FIELDS
- Serializer uses obj.latest_summary and DEFAULT_SUMMARY_FIELDS loop

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix: scope annotation to views, restore license_error/canceled_on

- Remove with_latest_summary_id() from HostManager.get_queryset() to
  avoid applying the correlated subquery to every Host query globally
  (count, exists, internal relations)
- Apply with_latest_summary_id() in get_queryset() of the 6
  host-serving views only
- Restore license_error and canceled_on to last_job summary fields
  to avoid breaking API change

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Guard _fetch_all() to skip bulk-attach on non-annotated querysets

Without this guard, _fetch_all() would set _latest_summary_cache=None
on every host in non-annotated querysets (e.g. Host.objects.filter()),
masking the per-object fallback query in Host.latest_summary.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Remove name from last_job_host_summary and canceled_on from last_job summary

Per reviewer feedback: these fields were not in the original API contract
via SUMMARIZABLE_FK_FIELDS and their addition would be an API change.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add functional tests for HostLatestSummaryQuerySet and Host.latest_summary

Tests cover:
- with_latest_summary_id() annotation and most-recent selection
- _fetch_all() bulk-attach behavior on annotated querysets
- _fetch_all() skips non-annotated querysets (preserves fallback)
- .count() and .exists() do NOT trigger _fetch_all
- Host.latest_summary cache hits (zero queries) and fallback
- Host.latest_job property
- select_related on bulk-attached summaries (no N+1)
- Chaining preserves annotation
- Multiple jobs / partial host coverage

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Apply black formatting to test_host_queryset.py

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Ben Thomasson <bthomass@redhat.com>

* Fix flake8 F841: remove unused job1/job2 variables in tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Ben Thomasson <bthomass@redhat.com>

* Add comment explaining why Prefetch was not used for host latest summary

Django Prefetch cannot handle latest per group -- [:1] slicing fetches
1 record globally, not per host (Django ticket #26780). The custom
_fetch_all override uses the same 2-query pattern as prefetch_related
internally, customized for this use case.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix null handling to keep old behavior

---------

Signed-off-by: Ben Thomasson <bthomass@redhat.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: AlanCoding <arominge@redhat.com>
2026-04-28 10:47:22 -04:00
Peter Braun
f3b7d442c3 feat: add test to ensure credential secret values are not returned (#16434) 2026-04-27 12:50:51 +00:00
Dirk Julich
376f964a40 [devel backport] AAP-41742: Fix workflow node update failing when JT has unprompted labels (#16426)
* AAP-41742: Fix workflow node update failing when JT has unprompted labels

PATCH extra_data on a workflow node fails with
{"labels":["Field is not configured to prompt on launch."]}
when the node has labels associated but the JT has
ask_labels_on_launch=False.

The serializer was passing all persisted M2M state from prompts_dict()
to _accept_or_ignore_job_kwargs() on every PATCH, re-validating
unchanged fields. Fix scopes validation to only the fields in the
request; full re-validation still occurs when unified_job_template
is being changed.

* Capture attrs keys before _build_mock_obj mutates them

_build_mock_obj() pops pseudo-fields (limit, scm_branch, job_tags,
etc.) from attrs. Computing requested_prompt_fields after the pop
would miss those fields and skip their ask_on_launch validation.
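
The ordering issue can be illustrated with a minimal sketch. The helper below is a hypothetical stand-in for _build_mock_obj(), not the real serializer code; it only mimics the "pops pseudo-fields from attrs" behavior described above.

```python
# Hypothetical stand-in for _build_mock_obj(): it removes pseudo-fields
# from the attrs dict in place, as the commit message describes.
PSEUDO_FIELDS = ("limit", "scm_branch", "job_tags")


def build_mock_obj(attrs):
    for key in PSEUDO_FIELDS:
        attrs.pop(key, None)
    return dict(attrs)


attrs = {"limit": "web", "extra_data": {"a": 1}}

# Capture the requested fields BEFORE the helper mutates attrs.
requested_prompt_fields = set(attrs)

build_mock_obj(attrs)

# Computing the set afterwards would silently lose "limit".
fields_after_pop = set(attrs)
```

Capturing the keys first keeps "limit" in the set of fields to validate; computing the set after the call would leave only "extra_data".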

* Include survey_passwords when validating extra_vars prompts

prompts_dict() emits survey_passwords alongside extra_vars.
_accept_or_ignore_job_kwargs uses it to decrypt encrypted survey
values before validation. Without it, encrypted password blobs
are validated as-is against the survey spec.

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-24 16:17:04 +02:00
Peter Braun
c71a49e044 fix: do not include secret values in the credentials test endpoint an… (#16425)
fix: do not include secret values in the credentials test endpoint and add a guard to make sure credentials are testable
2026-04-24 12:35:12 +00:00
Stevenson Michel
55ad29ac68 [Devel] Performance Optimization for Select Hosts Query (#16413)
* Fixed black reformatting

* Make test simulate 500k hosts in real world scenario
2026-04-22 12:05:36 -04:00
Seth Foster
1636abd669 AAP-71844 Fix rrule fast-forward across DST boundaries (#16407)
Fix rrule fast-forward producing wrong occurrences across DST boundaries

The UTC round-trip in _fast_forward_rrule shifts the dtstart's local
hour when the original and fast-forwarded times are in different DST
periods. Since dateutil generates HOURLY occurrences by stepping in
local time, the shifted hour changes the set of reachable hours. With
BYHOUR constraints this causes a ValueError crash; without BYHOUR,
occurrences are silently shifted by 1 hour.

Fix by performing all arithmetic in the dtstart's original timezone.
Python aware-datetime subtraction already computes absolute elapsed
time regardless of timezone, so the UTC conversion was unnecessary
for correctness and actively harmful during fall-back ambiguity.
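
A small sketch of the elapsed-time point, assuming two fixed UTC offsets as stand-ins for the standard and daylight sides of a real schedule timezone (the offsets and dates are illustrative, not taken from the fix):

```python
from datetime import datetime, timedelta, timezone

# Assumed fixed offsets standing in for the two sides of a DST boundary.
standard = timezone(timedelta(hours=-5))
daylight = timezone(timedelta(hours=-4))

dtstart = datetime(2026, 3, 7, 12, 0, tzinfo=standard)
later = datetime(2026, 3, 8, 12, 0, tzinfo=daylight)

# Subtracting aware datetimes with different offsets accounts for the
# offsets, so the result is absolute elapsed time.
elapsed = later - dtstart

# Converting both operands to UTC first gives the same answer.
elapsed_via_utc = later.astimezone(timezone.utc) - dtstart.astimezone(timezone.utc)
```

Both expressions yield 23 hours, even though the wall-clock times are both 12:00 on consecutive days.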

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-21 10:54:42 -04:00
Seth Foster
b8c9ae73cd Fix OIDC workload identity for inventory sync (#16390)
The cloud credential used by inventory updates was not going through
the OIDC workload identity token flow because it lives outside the
normal _credentials list. This overrides populate_workload_identity_tokens
in RunInventoryUpdate to include the cloud credential as an
additional_credentials argument to the base implementation, and
patches get_cloud_credential on the instance so the injector picks up
the credential with OIDC context intact.

Co-authored-by: Alan Rominger <arominge@redhat.com>
Co-authored-by: Dave Mulford <dmulford@redhat.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 16:26:18 -04:00
Stevenson Michel
d71f18fa44 [Devel] Config Endpoint Optimization (#16389)
* Improved performance of the config endpoint by reducing database queries in GET /api/controller/v2/config/
2026-04-09 16:24:03 -04:00
Daniel Finca
b83019bde6 feat: support for oidc credential /test endpoint (#16370)
Adds support for testing external credentials that use OIDC workload identity tokens.
When FEATURE_OIDC_WORKLOAD_IDENTITY_ENABLED is enabled, the /test endpoints return
JWT payload details alongside test results.

- Add OIDC credential test endpoints with job template selection
- Return JWT payload and secret value in test response
- Maintain backward compatibility (detail field for errors)
- Add comprehensive unit and functional tests
- Refactor shared error handling logic

Co-authored-by: Daniel Finca <dfinca@redhat.com>
Co-authored-by: melissalkelly <melissalkelly1@gmail.com>
2026-04-06 15:56:11 -04:00
Alan Rominger
7155400efc AAP-12516 [option 2] Handle nested workflow artifacts via root node ancestor_artifacts (#16381)
* Add new test for artifact precedence upstream node vs outer workflow

* Fix bugs, upstream artifacts come first for precedence

* Track nested artifacts path through ancestor_artifacts on root nodes

* Fix case where first root node did not get the vars

* touchup comment

* Prevent conflict with sliced jobs hack
2026-04-02 15:18:11 -04:00
Matthew Sandoval
7c75788b0a AAP-67740 Pass plugin_description through to CredentialType.description (#16364)
* Pass plugin_description through to CredentialType.description

Propagate the plugin_description field from credential plugins into the
CredentialType description when loading and creating managed credential
types, including updates to existing records.

Assisted-by: Claude

* Add unit tests for plugin_description passthrough to CredentialType

Tests cover load_plugin, get_creation_params, and
_setup_tower_managed_defaults handling of the description field.

Assisted-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: PabloHiro <palonso@redhat.com>
2026-03-25 11:03:11 +01:00
Peter Braun
ab294385ad fix: avoid delete in loop in inventory import (#16366) 2026-03-24 15:37:59 +00:00
Alan Rominger
377dfce197 Record whether a file was written for fact cache (#16361) 2026-03-20 12:53:34 -04:00
Matthew Sandoval
ff68d6196d Add feature flag for OIDC workload identity credential types (#16348)
Add install-time feature flag for OIDC workload identity credential types

Implements FEATURE_OIDC_WORKLOAD_IDENTITY_ENABLED feature flag to gate
HashiCorp Vault OIDC credential types as a Technology Preview feature.

When the feature flag is disabled (default), OIDC credential types are
not loaded into the plugin registry at application startup and do not
exist in the database.

When enabled, OIDC credential types are loaded normally and function
as expected.

Changes:
- Add FEATURE_OIDC_WORKLOAD_IDENTITY_ENABLED setting (defaults to False)
- Add OIDC_CREDENTIAL_TYPE_NAMESPACES constant for maintainability
- Modify load_credentials() to skip OIDC types when flag is disabled
- Add test coverage (2 test cases)

This is an install-time flag that requires application restart to take
effect. The flag is checked during application startup when credential
types are loaded from plugins.

Fixes: AAP-64510

Assisted-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-03-18 22:40:33 -04:00
Alan Rominger
0aaca1bffd Fix job cancel chain bugs (#16325)
* Fix job cancel chain bugs

* Early relief valve for canceled jobs, ATF related changes

* Add test and fix for approval nodes as well

* Revert unwanted change

* Refactor workflow approval nodes to make it more clean

* Revert data structure changes

* Delete local utility file

* Review comment addressing

* Use canceled status in websocket

* Delete slop

* Add agent marker

* Bugbot comment about status websocket mismatch
2026-03-18 12:08:27 -04:00
Rodrigo Toshiaki Horie
679e48cbe8 [AAP-68258] Fix SonarCloud Reliability issues (#16354)
* Fix SonarCloud Reliability issues: time-dependent class attrs and dict comprehensions

- Move last_stats/last_flush from class body to __init__ in CallbackBrokerWorker
  (S8434: time-dependent expressions evaluated at class definition)
- Replace dict comprehensions with dict.fromkeys() in has_create.py
  (S7519: constant-value dict should use fromkeys)
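
A minimal sketch of both patterns. The class names below are hypothetical and do not reproduce CallbackBrokerWorker; they only show where the time expression is evaluated.

```python
import time


class EagerWorker:
    # Evaluated once, when the class body runs; every instance shares
    # this timestamp no matter when it is constructed.
    last_flush = time.time()


class LazyWorker:
    def __init__(self):
        # Evaluated each time an instance is constructed.
        self.last_flush = time.time()


# Constant-value dict: fromkeys() replaces the comprehension.
keys = ["a", "b", "c"]
via_comprehension = {k: False for k in keys}
via_fromkeys = dict.fromkeys(keys, False)
```

Note that dict.fromkeys() shares a single value object across all keys, so it is a drop-in replacement only when that value is immutable or never mutated in place.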

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Fix callback receiver tests to use flush(force=True)

Tests were implicitly relying on last_flush being a stale class-level
timestamp. Now that last_flush is set in __init__, the time-based flush
condition isn't met when flush() is called immediately after construction.
Use force=True to explicitly trigger an immediate flush in tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-18 14:21:39 +00:00
Alan Rominger
cc2fbf332c Stop writing tmp test files that are not cleaned up (#16358)
* Stop writing tmp test files that are not cleaned up
2026-03-17 17:02:47 -04:00
Tong He
7e29f9e3f2 Enrich tests against is_ha_environment() 2026-03-11 11:43:12 +01:00
Andrea Restle-Lay
619d8c67a9 [AAP-63314] P4.4: Controller - Pass Workload TTL to Gateway (#16303)
* Pass workload TTL to Gateway (minimal changes) assisted-by: Claude

* lint
Assisted-by: Claude

* fix unit tests assisted-by claude

* use existing functions assisted-by: Claude

* fix test assisted-by: Claude

* fixes for sonarcloud assisted-by: Claude

* nit

* nit

* address feedback

* feedback from pr review assisted-by: Claude

* feedback from pr review assisted-by: Claude

* Apply suggestion from @dleehr

Co-authored-by: Dan Leehr <dleehr@users.noreply.github.com>

* lint assisted-by: Claude

* fix: narrow vendor_collections_dir fixture teardown scope (#16326)

Only remove the collection directory the fixture created
(redhat/indirect_accounting) instead of the entire
/var/lib/awx/vendor_collections/ root, so we don't accidentally
delete vendor collections that may have been installed by the
build process.

Forward-port of ansible/tower#7350.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* AAP-67436 Remove pbr from requirements (#16337)

* Remove pbr from requirements

pbr was temporarily added to support ansible-runner installed from a git
branch. It is no longer needed as a direct dependency.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Retrigger CI

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* [AAP-64062] Enforce JWT-only authentication for Controller when deployed as part of AAP (#16283)

After all settings are loaded, override DEFAULT_AUTHENTICATION_CLASSES
to only allow Gateway JWT authentication when RESOURCE_SERVER__URL is
set. This makes the lockdown immutable — no configuration file or
environment variable can re-enable legacy auth methods (Basic, Session,
OAuth2, Token).

This is the same pattern used by Hub (galaxy_ng) and EDA (eda-server)
for ANSTRAT-1840.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Re-trigger CI

Made-with: Cursor

* Re-trigger CI

Made-with: Cursor

* [AAP-63314] Pass job timeout as workload_ttl_seconds to Gateway Assisted-by: Claude

* Additional unit test requested at review Assisted-by: Claude

* Revert profiled_pg/base.py rebase error, unrelated to AAP-63314

* revert requirements changes introduced by testing

* revert

* revert

* docstring nit from coderabbit

---------

Co-authored-by: Dan Leehr <dleehr@users.noreply.github.com>
Co-authored-by: Dirk Julich <djulich@redhat.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Hao Liu <44379968+TheRealHaoLiu@users.noreply.github.com>
2026-03-10 08:54:28 -04:00
Dirk Julich
212546f92b fix: narrow vendor_collections_dir fixture teardown scope (#16326)
Only remove the collection directory the fixture created
(redhat/indirect_accounting) instead of the entire
/var/lib/awx/vendor_collections/ root, so we don't accidentally
delete vendor collections that may have been installed by the
build process.

Forward-port of ansible/tower#7350.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 14:19:24 +01:00
Seth Foster
f74f82e30c Forward port external query files from stable-2.6 (#16312)
* Revert "AAP-58452 Add version fallback for external query files (#16309)"

This reverts commit 0f2692b504.

* AAP-58441: Add runtime integration for external query collection (#7208)

Extend build_private_data_files() to copy vendor collections from
/var/lib/awx/vendor_collections/ to the job's private_data_dir,
making external query files available to the indirect node counting
callback plugin in execution environments.

Changes:
- Copy vendor_collections to private_data_dir during job preparation
- Add vendor_collections path to ANSIBLE_COLLECTIONS_PATH in build_env()
- Gracefully handle missing source directory with warning log
- Feature gated by FEATURE_INDIRECT_NODE_COUNTING_ENABLED flag

This enables external query file discovery for indirect node counting
across all deployment types (RPM, Podman, OpenShift, Kubernetes) using
the existing private_data_dir mechanism.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

* [stable-2.6] AAP-58451: Add callback plugin discovery for external query files (#7223)

* AAP-58451: Add callback plugin discovery for external query files

Extend the indirect_instance_count callback plugin to discover and load
external query files from the bundled redhat.indirect_accounting collection
when embedded queries are not present in the target collection.

Changes:
- Add external query discovery with precedence (embedded queries first)
- External query path: redhat.indirect_accounting/extensions/audit/
  external_queries/{namespace}.{name}.{version}.yml
- Use self._display.v() for external query messages (visible with -v)
- Use self._display.vv() for embedded query messages (visible with -vv)
- Fix: Change .exists() to .is_file() per Traversable ABC
- Handle missing external query collection gracefully (ModuleNotFoundError)

Note: This implements exact version match only. Version fallback logic
is covered in AAP-58452.

* fix CI error when using Traversable.is_file

* Add minimal implementation for AAP-58451

* Fix formatting

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>

* AAP-58452 Add version fallback for external query files (#7254)

* AAP-58456 unit test suite for external query handling (#7283)

* Add unit tests for external query handling

* Refactor unit tests for external query handling

* Refactor indirect node counting callback code to improve testing code

* Refactor unit tests for external query handling for improved callback code

* Fix test for major version boundary check

* Fix weaknesses in some unit tests

* Make callback plugin module self contained, independent from awx

* AAP-58470 integration tests (core) for external queries (#7278)

* Add collection for testing external queries

* Add query files for testing external query file runtime integration

* Add live tests for external query file runtime integration

* Remove redundant wait for events and refactor test data folders

* Fix unit tests: mock flag_enabled to avoid DB access

The AAP-58441 cherry-pick added a flag_enabled() call in
BaseTask.build_private_data_files(), which is called by all task types.
Tests for RunInventoryUpdate and RunJob credentials now hit this code
path and need the flag mocked to avoid database access in unit tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: attempt exact query file match before Version parsing (#7345)

The exact-version filename check does not require PEP440 parsing, but
Version() was called first, causing early return on non-PEP440 version
strings even when an exact file exists on disk. Move the exact file
check before Version parsing so fallback logic only parses when needed.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* No longer mutate global sys.modules (#7337)

* [stable-2.6] AAP-58452 fix: Add queries_dir guard (#7338)

* Add queries_dir guard

* fix: update unit tests to mock _get_query_file_dir instead of files

The TestVersionFallback tests mocked `files()` with chainable path
mocks, but `find_external_query_with_fallback` now uses
`_get_query_file_dir()` which returns the queries directory directly.
Mock the helper instead for simpler, correct tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* fix: remove unused EXTERNAL_QUERY_PATH constant (#7336)

The constant was defined but never referenced — the path is constructed
inline via Traversable's `/` operator which requires individual segments,
not a slash-separated string.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* fix: restore original feature flag state in test fixture (#7347)

The enable_indirect_host_counting fixture unconditionally disabled the
FEATURE_INDIRECT_NODE_COUNTING_ENABLED flag on teardown, even when it
was already enabled before the test (as is the case in development via
development_defaults.py). This caused test_indirect_host_counting to
fail when run after the external query tests, because the callback
plugin was no longer enabled.

Save and restore the original flag state instead.
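
The save-and-restore pattern might look roughly like the sketch below. The FLAGS dict and the enable_flag helper are hypothetical stand-ins for the real flag storage and pytest fixture.

```python
from contextlib import contextmanager

# Hypothetical in-memory flag store; the flag is already enabled,
# as it would be in a development environment.
FLAGS = {"FEATURE_INDIRECT_NODE_COUNTING_ENABLED": True}


@contextmanager
def enable_flag(name):
    # Remember the state before the test touched the flag.
    original = FLAGS.get(name, False)
    FLAGS[name] = True
    try:
        yield
    finally:
        # Restore the previous state instead of forcing it to False.
        FLAGS[name] = original


with enable_flag("FEATURE_INDIRECT_NODE_COUNTING_ENABLED"):
    inside = FLAGS["FEATURE_INDIRECT_NODE_COUNTING_ENABLED"]

after = FLAGS["FEATURE_INDIRECT_NODE_COUNTING_ENABLED"]
```

With the unconditional teardown described above, `after` would have been False and later tests relying on the flag would fail.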

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Dirk Julich <djulich@redhat.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-03-05 14:05:58 +01:00
Alan Rominger
be5fbf365e AAP-65054 Fix bugs where concurrent jobs would clear facts of unrelated hosts (#16318)
* Add new tests for bug saving concurrent facts

* Fix first bug and improve tests

* Fix new bug where concurrent job clears facts from other job in unwanted way

* minor test fixes

* Add in missing playbook

* Fix host reference for constructed inventory

* Increase speed for concurrent fact tests

* Make test a bit faster

* Fix linters

* Add some functional tests

* Remove the sanity test

* Agent markers added

* Address SonarCloud

* Do backdating method, resolving stricter assertions

* Address coderabbit comments

* Address review comment with qs only method

* Delete missed sleep statement

* Add more coverage
2026-03-05 07:58:32 -05:00
Pablo H.
57f9eb093a feat: workload identity credentials integration (#16286)
* feat: workload identity credentials integration

* feat: cache credentials and add context property to Credential

Assisted-by: Claude

* feat: include safeguard in case feature flag is disabled

* feat: tests to validate workload identity credentials integration

* fix: affected tests by the credential cache mechanism

* feat: remove word cache from variables and comments, use standard library decorators

* fix: reorder tests in correct files

* Use better error catching mechanisms

* Adjust logic to support multiple credential input sources and use internal field

* Remove hardcoded credential type names

* Add tests for the internal field

Assisted-by: Claude
2026-03-04 10:22:27 -05:00
Seth Foster
2c71bcda32 Improve transactional integrity for starting controller jobs in dispatcherd (#16300)
Remove SELECT FOR UPDATE from job dispatch to reduce transaction rollbacks

  Move status transition from BaseTask.transition_status (which used
  SELECT FOR UPDATE inside transaction.atomic()) into
  dispatch_waiting_jobs. The new approach uses filter().update() which
  is atomic at the database level without requiring explicit row locks,
  reducing transaction contention and rollbacks observed in perfscale
  testing.

  The transition_status method was an artifact of the feature flag era
  where we needed to support both old and new code paths. Since
  dispatch_waiting_jobs is already a singleton
  (on_duplicate='queue_one') scoped to the local node, the
  de-duplication logic is unnecessary.

  Status is updated after task submission to dispatcherd, so the job's
  UUID is in the dispatch pipeline before being marked running —
  preventing the reaper from incorrectly reaping jobs during the
  handoff window. RunJob.run() handles the race where a worker picks
  up the task before the status update lands by accepting waiting and
  transitioning it to running itself.
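
The conditional-update idea can be sketched with the standard-library sqlite3 module instead of the Django ORM. The table, column, and function names are assumptions made for illustration, not the controller's schema.

```python
import sqlite3

# Hypothetical single-table schema used only for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO job (id, status) VALUES (1, 'waiting')")
conn.commit()


def mark_running(conn, job_id):
    # One UPDATE with the expected status in the WHERE clause: the
    # database applies it atomically, so no prior SELECT FOR UPDATE
    # is needed to guard the transition.
    cur = conn.execute(
        "UPDATE job SET status = 'running' WHERE id = ? AND status = 'waiting'",
        (job_id,),
    )
    conn.commit()
    # rowcount tells the caller whether this call made the transition.
    return cur.rowcount == 1


first = mark_running(conn, 1)
second = mark_running(conn, 1)
```

The first call performs the transition and the second is a no-op, which is the property that makes an explicit row lock unnecessary for this step.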

  Signed-off-by: Seth Foster <fosterbseth@gmail.com>
  Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-26 14:16:36 -05:00
Daniel Finca Martínez
2a35ce5524 AAP-62693 Integrate workload identity client to request JWTs (#16296)
* Add retrieve_workload_identity_jwt to jobs.py and tests

* Apply linting

* Add precondition to client retrieval

* Add test case for client not configured

* Remove trailing period in match string
2026-02-19 09:13:32 -05:00
Alan Rominger
567a980a03 Give error details of sliced jobs if they error in live tests (#16273) 2026-02-18 15:12:12 -05:00
Alan Rominger
9059cfbda6 Fix some pytest warnings using Opus 4.6 (#16269)
* Fix some pytest warnings using Opus 4.6

* Fix review comments

* Use raw-strings and regex markers for matching exception pattern

Co-authored-by: 🇺🇦 Sviatoslav Sydorenko (Святослав Сидоренко) <wk.cvs.github@sydorenko.org.ua>

* Make regex work

* Undo always true assertion edit

---------

Co-authored-by: 🇺🇦 Sviatoslav Sydorenko (Святослав Сидоренко) <wk.cvs.github@sydorenko.org.ua>
2026-02-18 15:11:41 -05:00