If a cancel was attempted before, still allow attempting another cancel
In this case, attempt to send the SIGTERM signal again.
Keep clicking, you might help!
Replace other cancel_callbacks with SIGTERM watcher
Adapt special inventory mechanism for this too
Get rid of the cancel_watcher method with exception in main thread
Handle academic case of SIGTERM race condition
Process cancelation as control signal
Fully connect cancel method and run_dispatcher to control
Never transition workflows directly to canceled, add logs
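
The SIGTERM watcher idea from the lines above can be sketched roughly as follows. This is a minimal illustration, not the code from this change: the `SigtermWatcher` class and its `cancel_callback` method are hypothetical names. It assumes the job runs in the main thread of its process and that the runner polls a callback to decide whether to stop.

```python
import signal


class SigtermWatcher:
    """Hypothetical helper: records whether SIGTERM arrived while active."""

    def __init__(self):
        self.sigterm_seen = False
        self._previous_handler = None

    def _handle(self, signum, frame):
        # Only set a flag; the job loop decides how to shut down.
        self.sigterm_seen = True

    def __enter__(self):
        # signal.signal() may only be called from the main thread.
        self._previous_handler = signal.signal(signal.SIGTERM, self._handle)
        return self

    def __exit__(self, exc_type, exc, tb):
        # Restore the earlier handler; fall back to the default action if
        # the earlier handler was not installed from Python (None).
        previous = self._previous_handler
        if previous is None:
            previous = signal.SIG_DFL
        signal.signal(signal.SIGTERM, previous)
        return False

    def cancel_callback(self):
        """Polled by the job runner; True means 'stop the job'."""
        return self.sigterm_seen
```

With this shape, a repeated cancel request only needs to send SIGTERM again; the flag stays set, so an extra signal is harmless.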
* Reap jobs on dispatcher startup to increase clarity, replace existing reaping logic
* Exit jobs if receiving SIGTERM signal
* Fix unwanted reaping on shutdown, let subprocess close out
* Add some sanity tests for signal module
* Add a log for an unhandled dispatcher error
* Refine wording of error messages
Co-authored-by: Elijah DeLee <kdelee@redhat.com>
* Delay update of artifacts until final job save
Save tracebacks from receptor module to callback object
Move receptor traceback check up to be more logical
Use new mock_me fixture to avoid DB call with me method
Update the special runner message to the delay_update pattern
* Move special runner message into post-processing of callback fields
* Track host_status_counts and use that to process notifications
* Remove now unused setting
* Back out changes to callback class not needed after all
* Skirt the need for duck typing by leaning on the cached field
* Delete tests for deleted task
* Revert "Back out changes to callback class not needed after all"
This reverts commit 3b8ae350d218991d42bffd65ce4baac6f41926b2.
* Directly hardcode stats_event_type for callback class
* Fire notifications if stats event was never sent
* Remove test content for deleted methods
* Add placeholder for when no hosts matched
* Make field default be None, denote events processed with empty dict
* Make UI process null value for host_status_counts
* Fix tracking of EOF dispatch for system jobs
* Reorganize EVENT_MAP into class properties
* Consolidate conditional I missed from EVENT_MAP refactor
* Give up on the null condition, also applies for empty hosts
* Remove cls position argument not being used
* Move wrapup method out of class, add tests
* Simple patches to make jobs robust to database restarts
* Add some wait time before retrying loop due to DB error
* Apply dispatcher downtime setting to job updates, fix dispatcher bug
This resolves a bug where the pg_is_down property
never had the right value.
The loop is normally stuck in the conn.events() iterator,
so it never recognized successful database interactions.
This led to serial database outages terminating jobs.
The new setting for allowable PG downtime is shared with task code:
any calls to update_model will use the _max_attempts parameter
to align with the patience time that the dispatcher
respects when consuming new events.
* To avoid restart loops, handle DB errors on startup with prejudice
* If reconnect consistently fails, exit with non-zero code
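
The retry behaviour described above (a shared downtime allowance that bounds the number of attempts, with a wait between attempts) might look something like this sketch. All names here (`DatabaseDown`, `attempts_for_downtime`, `call_with_retries`) are hypothetical stand-ins and not the actual `update_model` implementation. The `sleep` parameter is added only to make the sketch easy to exercise.

```python
import time


class DatabaseDown(Exception):
    """Stand-in for a database connectivity error (hypothetical)."""


def attempts_for_downtime(allowed_downtime, retry_delay):
    """Number of tries that fit within the allowed downtime window."""
    return max(1, int(allowed_downtime // retry_delay))


def call_with_retries(operation, max_attempts, retry_delay, sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is exhausted."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except DatabaseDown:
            if attempt == max_attempts:
                # Patience exhausted: let the caller see the failure.
                raise
            # Wait before retrying so a short outage can pass.
            sleep(retry_delay)
```

Deriving the attempt count from the same downtime allowance keeps job updates and event consumption equally patient during an outage.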
--- Removed all callback functions from 'jobs.py' and put them in a new file '/awx/main/tasks/callback.py'
--- Modified unit tests to follow the moved code
--- Moved 'update_model' from jobs.py to /awx/main/utils/update_model.py