
AWX Ansible Collection

This Ansible collection allows for easy interaction with an AWX server via Ansible playbooks.

The source for this collection lives in the awx_collection folder inside of the AWX GitHub repository. The previous home for this collection was inside the folder lib/ansible/modules/web_infrastructure/ansible_tower in the Ansible repo, as well as other places for the inventory plugin, module utils, and doc fragment.

Building and Installing

This collection templates the galaxy.yml file it uses. Run make build_collection from the root folder of the AWX source tree. This will create the tar.gz file inside the awx_collection folder with the current AWX version, for example: awx_collection/awx-awx-9.2.0.tar.gz.

Installing the tar.gz involves no special instructions.

Running

Non-deprecated modules in this collection have no Python requirements, but may require the official AWX CLI in the future. The DOCUMENTATION for each module will report this.

You can specify authentication by host, username, and password.

These can be specified via (from highest to lowest precedence):

  • direct module parameters
  • environment variables (most useful when running against localhost)
  • a config file path specified by the tower_config_file parameter
  • a config file at ~/.tower_cli.cfg
  • a config file at /etc/tower/tower_cli.cfg

Config file syntax looks like this:

[general]
host = https://localhost:8043
verify_ssl = true
username = foo
password = bar
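
The precedence order and the [general] config syntax described above can be illustrated with a short Python sketch. This is not the collection's actual implementation: the helper name resolve_setting, the env_prefix argument, and the environment variable naming scheme are assumptions made for illustration, and only ini-style files are handled.

```python
import configparser
from pathlib import Path


def resolve_setting(key, module_params, environ, config_paths, env_prefix="TOWER_"):
    """Return the first value found for `key`, from highest to lowest precedence.

    Hypothetical helper for illustration only; not the collection's real code.
    """
    # 1. Direct module parameters take precedence over everything else.
    if module_params.get(key) is not None:
        return module_params[key]

    # 2. Environment variables come next (the prefix here is an assumption).
    env_name = env_prefix + key.upper()
    if env_name in environ:
        return environ[env_name]

    # 3. Config files, in the order given by the caller, e.g.
    #    [<tower_config_file>, "~/.tower_cli.cfg", "/etc/tower/tower_cli.cfg"].
    for path in config_paths:
        path = Path(path).expanduser()
        if not path.is_file():
            continue
        parser = configparser.ConfigParser()
        parser.read(path)
        if parser.has_option("general", key):
            return parser.get("general", key)

    # Nothing was configured for this key.
    return None
```

For example, calling resolve_setting("host", {}, os.environ, ["~/.tower_cli.cfg", "/etc/tower/tower_cli.cfg"]) would fall through to the config files only when neither a module parameter nor a matching environment variable is present.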

Release and Upgrade Notes

Notable releases of the awx.awx collection:

  • 7.0.0 is intended to be identical to the content prior to the migration, aside from changes necessary to function as a collection.
  • 11.0.0 has no non-deprecated modules that depend on the deprecated tower-cli PyPI.
  • 19.2.1 was a large renaming that purged "tower" names (such as option and module names) and added redirects for the old names.
  • 21.11.0 deprecated the "tower" modules and removed symlinks.
  • 25.0.0 removed the "token" and "application" modules because OAuth is no longer supported; use basic auth instead.
  • X.X.X added support for named URLs to all modules. Anywhere that previously accepted a name or id can also accept a named URL.
  • 0.0.1-devel is the version you should see if installing from source, which is intended for development and expected to be unstable.

The following notes are changes that may require changes to playbooks:

  • The credential module no longer allows kind as a parameter; additionally, inputs must now be used with a variety of key/value parameters to go with it (e.g., become_method)

  • The job_wait module no longer allows the min_interval/max_interval parameters; use interval instead

  • The notification_template module requires various notification configuration information to be listed as a dictionary under the notification_configuration parameter (e.g., use_ssl)

  • In the inventory_source module, the source_project (when provided) lookup defaults to the specified organization in the same way the inventory is looked up

  • The module tower_notification was renamed tower_notification_template. In ansible >= 2.10 there is a seamless redirect. Ansible 2.9 does not respect the redirect.

  • When a project is created, it will wait for the update/sync to finish by default; this can be turned off with the wait parameter, if desired.

  • Creating a "scan" type job template is no longer supported.

  • Specifying a custom certificate via the TOWER_CERTIFICATE environment variable no longer works.

  • Type changes of variable fields:

    • extra_vars in the tower_job_launch module worked with a list previously, but now only works with a dict type
    • extra_vars in the tower_workflow_job_template module worked with a string previously but now expects a dict
    • When the extra_vars parameter is used with the tower_job_launch module, the launch will fail unless ask_extra_vars or survey_enabled is explicitly set to True on the Job Template
    • The variables parameter in the tower_group, tower_host and tower_inventory modules now expects a dict type and no longer supports the use of @ syntax for a file
  • Type changes of other types of fields:

    • inputs or injectors in the tower_credential_type module worked with a string previously but now expects a dict
    • schema in the tower_workflow_job_template module worked with a string previously but now expects a list of dicts
  • tower_group used to also service inventory sources, but this functionality has been removed from this module; use tower_inventory_source instead.

  • The config file specified by tower_config_file used to handle k=v pairs on a single line; this is no longer supported. Please use a file formatted as YAML, JSON, or INI only.

  • Some return values (e.g., credential_type) have been removed. Use of id is recommended.

  • tower_job_template no longer supports the deprecated extra_vars_path parameter; please use extra_vars with the lookup plugin to replace this functionality.

  • The notification_configuration parameter of tower_notification_template has changed from a string to a dict. Please use the lookup plugin to read an existing file into a dict.

  • tower_credential no longer supports passing a file name to ssh_key_data.

  • The HipChat notification_type has been removed and can no longer be created using the tower_notification_template module.

  • Lookup plugins now always return a list; if you want a scalar value, use lookup rather than query.
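
Several of the notes above describe fields that previously accepted a string and now expect a dict (for example extra_vars, inputs, and notification_configuration). When migrating stored values, a small conversion step can help. The following is a minimal sketch, assuming the legacy string holds JSON; the helper name to_dict is hypothetical and not part of the collection, and YAML-formatted strings would need a YAML parser instead.

```python
import json


def to_dict(value):
    """Convert a legacy JSON string into a dict; pass dicts through unchanged.

    Hypothetical migration helper; assumes legacy string values are JSON.
    """
    if isinstance(value, dict):
        return value
    if isinstance(value, str):
        parsed = json.loads(value)
        # A JSON list or scalar is not a valid replacement for a dict field.
        if not isinstance(parsed, dict):
            raise TypeError("expected a JSON object, got %s" % type(parsed).__name__)
        return parsed
    raise TypeError("expected a dict or a JSON string, got %s" % type(value).__name__)
```

A value that is already a dict is returned as-is, so the helper can be applied uniformly while old and new values coexist.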

Running Unit Tests

Tests to verify compatibility with the most recent AWX code are in awx_collection/test/awx. These can be run via the make test_collection command in the development container.

To run tests outside of the development container, or to run against Ansible source, set up a dedicated virtual environment:

mkvirtualenv my_new_venv
# may need to replace psycopg3 with psycopg3-binary in requirements/requirements.txt
pip install -r requirements/requirements.txt -r requirements/requirements_dev.txt -r requirements/requirements_git.txt
make clean-api
pip install -e <path to your Ansible>
pip install -e .
pip install -e awxkit
py.test awx_collection/test/awx/

Running Integration Tests

The integration tests require a virtualenv with ansible >= 2.9 and awxkit. The collection must first be installed, which can be done using make install_collection. You also need a configuration file, as described in the Running section.

How to run the tests:

# ansible-test must be run from the directory in which the collection is installed
cd ~/.ansible/collections/ansible_collections/awx/awx/
ansible-test integration

Licensing

All content in this folder is licensed under the same license as Ansible, which is the same as the license that applied before the split into an independent collection.