Commit Graph

18 Commits

Author SHA1 Message Date
Alan Rominger
362e11aaf2 Respect old downtime setting name if user has already set it 2024-02-15 12:34:24 +00:00
Chris Meyers
527755d986 Always get host from event data
* Regardless of if there is a host map or not, get the host that Ansible
  reports and assign it to the Events' host_name
2024-01-31 09:42:01 -05:00
Alan Rominger
cdb4f0b7fd Consume job_explanation from runner, fix error reporting error (#13482) 2023-08-30 16:45:50 -04:00
jbreitwe-rh
bb1c155bc9 Fixed typos (#14347) 2023-08-16 15:05:23 -04:00
Alan Rominger
6f9ea1892b AAP-14538 Only process ansible_facts for successful jobs (#14313) 2023-08-04 17:10:14 -04:00
Alan Rominger
06808ef4c4 Merge pull request #13608 from AlanCoding/keepalive
Use ansible-runner change to get periodic keep-alive messages in K8S
2023-03-06 14:34:37 -05:00
Alan Rominger
d5de1f9d11 Make use of new keepalive messages from ansible-runner
Make setting API configurable and process keepalive events
  when seen in the event callback

Use env var in pod spec and make it specific to K8S
2023-02-28 09:04:07 -05:00
Hao Liu
cf21eab7f4 [chore] update project_update playbook to be compliant with ansible-lint
reshaving the yak

Co-Authored-By: Gabriel Muniz <gmuniz@redhat.com>
2023-02-27 18:32:10 -05:00
Alan Rominger
f5785976be Update to comply with new black rules 2023-02-01 14:59:38 -05:00
Alan Rominger
6538d34b48 Remove ssh_key_data fix, handled in runner now 2022-10-31 11:01:28 -04:00
Alan Rominger
c59bbdecdb Refactor canceling to work through messaging and signals, not database
If canceled attempted before, still allow attempting another cancel
in this case, attempt to send the sigterm signal again.
Keep clicking, you might help!

Replace other cancel_callbacks with sigterm watcher
  adapt special inventory mechanism for this too

Get rid of the cancel_watcher method with exception in main thread

Handle academic case of sigterm race condition

Process cancelation as control signal

Fully connect cancel method and run_dispatcher to control

Never transition workflows directly to canceled, add logs
2022-09-01 15:20:31 -04:00
Alan Rominger
fd671ecc9d Give specific messages if job was killed due to SIGTERM or SIGKILL (#12435)
* Reap jobs on dispatcher startup to increase clarity, replace existing reaping logic

* Exit jobs if receiving SIGTERM signal

* Fix unwanted reaping on shutdown, let subprocess close out

* Add some sanity tests for signal module

* Add a log for an unhandled dispatcher error

* Refine wording of error messages

Co-authored-by: Elijah DeLee <kdelee@redhat.com>
2022-06-30 13:20:08 -04:00
Alan Rominger
452744b67e Delay update of artifacts and error fields until final job save (#11832)
* Delay update of artifacts until final job save

Save tracebacks from receptor module to callback object

Move receptor traceback check up to be more logical

Use new mock_me fixture to avoid DB call with me method

Update the special runner message to the delay_update pattern

* Move special runner message into post-processing of callback fields
2022-05-03 14:42:50 -04:00
Alan Rominger
29d60844a8 Fix notification timing issue by sending in the latter of 2 events (#12110)
* Track host_status_counts and use that to process notifications

* Remove now unused setting

* Back out changes to callback class not needed after all

* Skirt the need for duck typing by leaning on the cached field

* Delete tests for deleted task

* Revert "Back out changes to callback class not needed after all"

This reverts commit 3b8ae350d218991d42bffd65ce4baac6f41926b2.

* Directly hardcode stats_event_type for callback class

* Fire notifications if stats event was never sent

* Remove test content for deleted methods

* Add placeholder for when no hosts matched

* Make field default be None, denote events processed with empty dict

* Make UI process null value for host_status_counts

* Fix tracking of EOF dispatch for system jobs

* Reorganize EVENT_MAP into class properties

* Consolidate conditional I missed from EVENT_MAP refactor

* Give up on the null condition, also applies for empty hosts

* Remove cls position argument not being used

* Move wrapup method out of class, add tests
2022-04-29 13:54:31 -04:00
Alan Rominger
73e02e745a Patches to make jobs robust to database restarts (#11905)
* Simple patches to make jobs robust to database restarts

* Add some wait time before retrying loop due to DB error

* Apply dispatcher downtime setting to job updates, fix dispatcher bug

This resolves a bug where the pg_is_down property
  never had the right value
  the loop is normally stuck in the conn.events() iterator
  so it never recognized successful database interactions
  this lead to serial database outages terminating jobs

New setting for allowable PG downtime is shared with task code
  any calls to update_model will use _max_attempts parameter
  to make it align with the patience time that the dispatcher
  respects when consuming new events

* To avoid restart loops, handle DB errors on startup with prejudice

* If reconnect consistently fails, exit with non-zero code
2022-03-30 09:14:20 -04:00
Jeff Bradberry
1803c5bdb4 Fix up usage of django-guid
It has replaced the class-based middleware, everything is
function-based now.
2022-03-14 13:19:57 -04:00
Jeff Bradberry
b852baaa39 Fix up logger .warn() calls to use .warning() instead
This is a usage that was deprecated in Python 3.0.
2022-03-07 18:11:36 -05:00
Amol Gautam
443bdc1234 Decoupled callback functions from BaseTask Class
--- Removed all callback functions from 'jobs.py' and put them in a new file '/awx/main/tasks/callback.py'
--- Modified Unit tests unit moved
--- Moved 'update_model' from jobs.py to /awx/main/utils/update_model.py
2022-02-09 13:46:32 -05:00