164 Commits

Author SHA1 Message Date
Alan Rominger
e1e2c60f2e AAP-66379 Include scaledown fix from dispatcherd (#16305)
Include scaledown fix from dispatcherd
2026-02-27 14:45:57 -05:00
Alan Rominger
5e93f60b9e AAP-41776 Enable new fancy asyncio metrics for dispatcherd (#16233)
* Enable new fancy asyncio metrics for dispatcherd

Remove old dispatcher metrics and patch in new data from local whatever

Update test fixture to new dispatcherd version

* Update dispatcherd again

* Handle node filter in URL, and catch more errors

* Add test for metric filter

* Split module for dispatcherd metrics
2026-02-04 15:28:34 -05:00
Alan Rominger
f80bbc57d8 AAP-43117 Additional dispatcher removal simplifications and waiting reaper updates (#16243)
* Additional dispatcher removal simplifications and waiting reaper updates

* Fix double call and logging message

* Implement bugbot comment, should reap running on lost instances

* Add test case for new pending behavior
2026-01-26 13:55:37 -05:00
Jake Jackson
36a00ec46b AAP-58539 Move to dispatcherd (#16209)
* WIP First pass
* started removing feature flags and adjusting logic
* Add decorator
* moved to dispatcher decorator
* updated as many as I could find
* Keep callback receiver working
* remove any code that is not used by the call back receiver
* add back auto_max_workers
* added back get_auto_max_workers into common utils
* Remove control and hazmat (squash this not done)
* moved status out and deleted control as no longer needed
* removed unused imports
* adjusted test import to pull correct method
* fixed imports and addressed clusternode heartbeat test
* Update function comments
* Add back hazmat for config and remove baseworker
* added back hazmat per @alancoding feedback around config
* removed baseworker completely and refactored it into the callback
  worker
* Fix dispatcher run call and remove dispatch setting
* remove dispatcher mock publish setting
* Adjust heartbeat arg and more formatting
* fixed the call to cluster_node_heartbeat missing binder
* Fix attribute error in server logs
2026-01-23 20:49:32 +00:00
Alan Rominger
dce5ac73c5 Apply new rules from black update (#16232) 2026-01-19 12:58:07 -05:00
jessicamack
de86b93690 AAP-59874: Update to Python 3.12 (#16208)
* update to Python 3.12

* remove use of utcnow

* switch to timezone.utc

datetime.UTC is an alias of datetime.timezone.utc. if we're doing the double import for datetime it's more straightforward to just import timezone as well and get it directly

* debug python env version issue

* change python version

* pin to SHA and remove debug portion
2026-01-07 11:57:24 -05:00
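The utcnow removal in de86b93690 follows a common pattern. A minimal sketch, assuming call sites only need a timezone-aware UTC timestamp; the helper name `utc_now` is hypothetical and not taken from the AWX code:

```python
from datetime import datetime, timezone


def utc_now() -> datetime:
    """Return the current time as a timezone-aware UTC datetime.

    Hypothetical helper for illustration. It stands in for the deprecated
    datetime.utcnow(), which returns a naive datetime.
    """
    return datetime.now(timezone.utc)


now = utc_now()
# An aware datetime carries its offset, so comparisons with other aware
# datetimes are unambiguous.
print(now.tzinfo, now.utcoffset())
```

Importing `timezone` directly, as the commit message suggests, avoids reaching through the `datetime` module for the `datetime.UTC` alias.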
Alan Rominger
054f6032fd AAP-47956 Use pg_notify for cancel and debugging, abandon socket approach (#16199)
* Use pg_notify for cancel and debugging, abandon socket approach

* Bump dispatcherd for pg_notify chunking
2025-12-10 14:38:39 -05:00
Hao Liu
b24156805a Upgrade to Django 5.2 LTS (#16185)
Upgrade to Django 5.2 LTS with compatibility fixes across fields, migrations, dispatch config, tests, and dev deps.

Dependencies:
- Upgrade django to 5.2.8 and relax requirements.in to >=5.2,<5.3.
- Bump django-debug-toolbar to >=6.0 for compatibility.

Backend:
- awx/conf/fields.py: switch URL TLD regex to use DomainNameValidator.ul in custom URLField.
- awx/main/management/commands/gather_analytics.py: use datetime.timezone.utc for naïve datetime handling.
- awx/main/dispatch/config.py: add mock_publish option; avoid DB access for test runs, set default max_workers, and support a noop broker.

Migrations (SQLite/Postgres compatibility):
- Add awx/main/migrations/_sqlite_helper.py with db-aware AlterIndexTogether/RenameIndex wrappers; consume in 0144_event_partitions.py and 0184_django_indexes.py.
- Update 0187_hop_nodes.py to use CheckConstraint(condition=...).
- Add 0205_alter_instance_peers_alter_job_hosts_and_more.py adjusting through_fields/relations on instance.peers, job.hosts, and role.ancestors.
- _dab_rbac.py: iterate roles with chunk_size=1000 for migration performance.

Tests:
- Include hcp_terraform in default credential types in test_credential.py.
---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Alan Rominger <arominge@redhat.com>
2025-12-03 14:22:52 -05:00
Lila Yasin
4f41b50a09 AAP-57817 Add Redis connection retry using redis-py 7.0+ built-in (#16176)
* AAP-57817 Add Redis connection retry using redis-py 7.0+ built-in mechanism

* Refactor Redis client helpers to use settings and eliminate code duplication

* Create awx/main/utils/redis.py and move Redis client functions to avoid circular imports

* Fix subsystem_metrics to share Redis connection pool between
  client and pipeline

* Cache Redis clients in RelayConsumer and RelayWebsocketStatsManager to avoid creating new connection pools on every call

* Add cap and base config

* Add Redis retry logic with exponential backoff to handle connection failures during long-running operations

* Add REDIS_BACKOFF_CAP and REDIS_BACKOFF_BASE settings to allow
  adjustment of retry timing in worst-case scenarios without code changes

* Simplify Redis retry tests by removing unnecessary reload logic
2025-12-01 09:08:47 -05:00
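The retry timing described in 4f41b50a09 can be illustrated with a plain-Python sketch of capped exponential backoff. It does not call redis-py; the setting names come from the commit message, while the numeric values, the retry count, and the helper names `backoff_delay` and `call_with_retry` are assumptions made for illustration:

```python
import time

# Illustrative values only; the real defaults are not stated in the commit.
REDIS_BACKOFF_BASE = 0.5  # seconds before the first retry
REDIS_BACKOFF_CAP = 8.0   # upper bound on any single delay


def backoff_delay(failures, base=REDIS_BACKOFF_BASE, cap=REDIS_BACKOFF_CAP):
    """Delay before the next attempt: doubles per failure, never exceeds cap."""
    return min(cap, base * (2 ** failures))


def call_with_retry(func, retries=5, sleep=time.sleep):
    """Call func, retrying on ConnectionError with capped exponential backoff."""
    for failures in range(retries + 1):
        try:
            return func()
        except ConnectionError:
            if failures == retries:
                raise  # out of retries, surface the last error
            sleep(backoff_delay(failures))


delays = [backoff_delay(n) for n in range(6)]
print(delays)
```

Keeping the cap and base as settings means the worst-case wait can be tuned without a code change, which is the motivation the commit gives.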
Seth Foster
2fa2cd8beb Add timeout and on duplicate to system tasks (#16169)
Modify the invocation of @task_awx to accept timeout and
on_duplicate keyword arguments. These arguments are
only used in the new dispatcher implementation.

Add decorator params:
- timeout
- on_duplicate

to tasks to ensure better recovery for
stuck or long-running processes.

---------

Signed-off-by: Seth Foster <fosterbseth@gmail.com>
2025-11-12 23:18:57 -05:00
Alan Rominger
6aea699284 Fix bug where collectstatic could error due to dispatcherd config (#15999)
* Fix bug where collectstatic could error due to dispatcherd config

* Revert test because it will not work in test suite

* New publish mocking system

* Remove import of unused

* Fix default publish broker
2025-05-21 15:11:04 -04:00
Alan Rominger
94764a1f17 AAP-42649 Flag-gated use of "dispatcherd" as its own library (#15981)
Use dynamic AWX max_workers value

Make basic --status and --running commands work

Make feature flag enabled true by default for development

* [dispatcherd] Dispatcher socket-based `--status` demo working (#15908)

* Fix Task Decorator to Work With and Without Feature Flag (AAP-41775) (#15911)

* refactor(system): extract common heartbeat helpers and split cluster_node_heartbeat

Extract common heartbeat logic into helper functions:
- _heartbeat_instance_management: consolidates instance management, health checks, and lost-instance detection.
- _heartbeat_check_versions: compares instance versions and initiates shutdown when necessary.
- _heartbeat_handle_lost_instances: reaps jobs and marks lost instances offline.

Refactor the original cluster_node_heartbeat to use these helpers and retain legacy behavior (using bind_kwargs).

Introduce adispatch_cluster_node_heartbeat for dispatcherd: uses the control API to retrieve running tasks and reaps them.

Link the two implementations by attaching adispatch_cluster_node_heartbeat as the _new_method on cluster_node_heartbeat.

* feat(publish): delegate heartbeat task submission to new dispatcherd implementation

Update apply_async to check at runtime if FEATURE_NEW_DISPATCHER is enabled.

When the task is cluster_node_heartbeat and a _new_method is attached, delegate the task submission to the new dispatcherd implementation.

Preserve the original behavior for all other tasks and fallback on error.

* refactor(system): extract task ID retrieval from dispatcherd into helper function

Improves readability of adispatch_cluster_node_heartbeat by extracting
the complex UUID parsing logic into a dedicated helper function.
Adds clearer error handling and follows established code patterns.

* fix(dispatcher): Enable task decorator to work with and without feature flag

Implemented a new approach for handling task execution with feature flags
by attaching alternative implementations to apply_async._new_method. This
allows cluster_node_heartbeat to work correctly with both the legacy and
new dispatcher systems without modifying core decorator logic.

AAP-41775

* fix(dispatcher): Improve error handling and logging in feature flag implementation

- Add error handling when attaching alternative dispatcher implementation
- Fix method self-reference in apply_async to properly use cls.apply_async
- Document limitations of this targeted approach for specific tasks
- Add logging for better debugging of dispatcher selection
- Ensure decorator timing by keeping method attachment after function definitions

This completes the robust implementation for switching between dispatcher
implementations based on feature flags.

AAP-41775

* fix(dispatcher): Implement registry pattern for dispatcher feature flag compatibility

Replaces direct method attribute assignment with a global registry for
alternative implementations. The original approach tried to attach new
methods directly to apply_async bound methods, which fails because bound
methods don't support attribute assignment in Python.

The registry pattern:
- Creates a global ALTERNATIVE_TASK_IMPLEMENTATIONS dict in publish.py
- Registers alternative implementations by task name
- Modifies apply_async to check the registry when feature flag is enabled
- Adds extensive logging throughout the process for debugging

This enables cluster_node_heartbeat to work correctly with both the legacy
and new dispatcher implementations based on the FEATURE_NEW_DISPATCHER flag.

AAP-41775

* refactor(dispatcher): Remove excessive logging from dispatcher implementation

Reduces verbose debugging logs while maintaining essential logging for critical
operations. Preserves:
- Task implementation selection based on feature flag
- Registration success/failure messages
- Critical error reporting

Removed:
- Registry content debugging messages
- Repetitive task diagnostics
- Non-essential information logging

AAP-41775

* fix(dispatcher): Fix shallow copy in dispatcher schedule conversion

This resolves "AttributeError: 'float' object has no attribute 'total_seconds'"
errors when the dispatcher is restarted.

Refs: AAP-41775

* Use IPC mechanism to get running tasks (#15926)
* Allow tasks from tasks
* Fix failure to limit to waiting jobs
* Get job record with lock
* Fix failures in dispatcherd feature branch (#15930)
* Fully handle DispatcherCancel
* Complete rest of preload import work
* Complete dispatcherd integration & job cancellation (AAP-43033) (#15941)
* feat(dispatcher): Implement job cancellation for new dispatcher

Adds feature-flag-aware job cancellation that routes cancel requests to either
the legacy dispatcher or the new dispatcherd library based on the
FEATURE_NEW_DISPATCHER flag.

- Updates cancel_dispatcher_process() to use dispatcherd's control API when enabled
- Handles both direct cancellation and task manager workflow cancellation cases
- Works with DispatcherCancel exception handling to properly handle SIGUSR1 signals

AAP-43033

* fix(dispatcher): Update run_dispatcher.py to properly handle task cancellation

Modifies the cancel command in run_dispatcher.py to properly cancel tasks
when the FEATURE_NEW_DISPATCHER flag is enabled, rather than just listing
running tasks.

The implementation translates each task UUID to the appropriate
filter format expected by the dispatcherd control API, maintaining the same
behavior as the original implementation.

Part of: AAP-43033

* refactor(system): Refactor dispatch_startup() to extract common startup logic and branch based on feature flag

This commit refactors the dispatch_startup() function to improve clarity and consistency across the legacy
and new dispatcherd flows.

No dispatcher-specific functionality is needed beyond the changes made, so this refactoring improves robustness without
altering core behavior.

* refactor(system): Refactor inform_cluster_of_shutdown() for clarity

* refactor(tasks): Replace @task with @task_awx across 22 tasks for dispatcher compatibility

- Migrated all task decorators to use @task_awx, ensuring dispatcher-aware behavior.
- Tested each task with the new dispatcherd, verifying that tasks using the registry pattern execute correctly without needing binder-based alternative implementations.
- Removed redundant logging and outdated comments.
- Legacy tasks that do not require special parameter extraction continue to use their original logic.
- This commit reflects our complete journey of testing and verifying dispatcherd compatibility across all 22 tasks.

* refactor(publish): fix linter

* Fix bug from the branch rebase

* AAP-43763 Add tests for connection management in dispatcherd workers (#15949)

* Add test for job cancel in live tests
* Fix bug from the branch rebase
* Add test for connection recovery after connection broke
* Add test for breaking connection

* Fix dispatcherd bugs: schedule aliases, job kwargs handling, cancel handling (#15960)

* Put in job kwargs handling, not done before

* AAP-44382 [dispatcherd] Fixes for running with feature flag off (#15973)

* Use correct decorator for test of tasks

* Finalize dispatcherd feature branch (#15975)

* Work dispatcherd into dependency management system

* Use util methods from DAB

* Rename the dispatcherd feature flag, and flip default to not-enabled

* Move to new submit_task method

* Update the location of the sock file

* AAP-44381 Make dispatcherd config loading more lazy (#15979)

* Make dispatcherd config loading more lazy

* Make submission error more obvious

* Fix signal handling gap, hijack SIGUSR1 from dispatcherd (#15983)

* Fix signal handling gap, hijack SIGUSR1 from dispatcherd

* Minor adjustments to dispatcherd status command

* [dispatcherd] Get rid of alternative task registry (#15984)

Get rid of alternative task registry

* Fix deadlock error and other cleanup errors (#15987)
* Move to proper error handling location

---------

Co-authored-by: artem_tiupin <70763601+art-tapin@users.noreply.github.com>
2025-05-16 09:39:22 -04:00
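The registry pattern mentioned in 94764a1f17 can be sketched as follows. Only the dict name `ALTERNATIVE_TASK_IMPLEMENTATIONS` and the task name come from the commit message; the `Task` class and the `register_alternative` and `submit` helpers are hypothetical stand-ins, not the AWX implementation:

```python
class Task:
    @classmethod
    def apply_async(cls):
        return "legacy"


# Attribute assignment on a bound method fails, which is the problem the
# commit message describes.
try:
    Task.apply_async._new_method = lambda: "new"
    assignment_failed = False
except AttributeError:
    assignment_failed = True

# Registry keyed by task name, used instead of attaching attributes.
ALTERNATIVE_TASK_IMPLEMENTATIONS = {}


def register_alternative(task_name, func):
    """Record an alternative implementation for the named task."""
    ALTERNATIVE_TASK_IMPLEMENTATIONS[task_name] = func


def submit(task_name, legacy_func, feature_enabled):
    """Use the registered alternative only when the feature flag is on."""
    if feature_enabled and task_name in ALTERNATIVE_TASK_IMPLEMENTATIONS:
        return ALTERNATIVE_TASK_IMPLEMENTATIONS[task_name]()
    return legacy_func()


register_alternative("cluster_node_heartbeat", lambda: "new")
print(assignment_failed)
print(submit("cluster_node_heartbeat", Task.apply_async, feature_enabled=True))
print(submit("cluster_node_heartbeat", Task.apply_async, feature_enabled=False))
```

The same squashed commit later removes the alternative task registry (#15984), so this illustrates an intermediate design rather than the final state.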
Elijah DeLee
6accd1e5e6 introduce age for workers and mandatory retirement
Retire workers after a certain age, allowing them to finish their
current task if they are not idle.

This mitigates any issues like memory leaks in long running workers,
especially if systems stay busy for months at a time.

Introduce new optional setting WORKER_MAX_LIFETIME_SECONDS, defaulting to 4 hours.
2025-05-15 09:57:30 -04:00
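The retirement rule in 6accd1e5e6 can be sketched roughly as below. The setting name and the 4-hour default come from the commit message; the `Worker` class and `mark_retirements` function are hypothetical and only illustrate the idea of letting a busy worker finish before it is stopped:

```python
import time
from dataclasses import dataclass, field

# Setting name and 4-hour default come from the commit message.
WORKER_MAX_LIFETIME_SECONDS = 4 * 60 * 60


@dataclass
class Worker:
    """Hypothetical stand-in for a pool worker; not the AWX class."""

    started_at: float = field(default_factory=time.monotonic)
    busy: bool = False
    retiring: bool = False

    def age(self, now):
        return now - self.started_at


def mark_retirements(workers, now, max_lifetime=WORKER_MAX_LIFETIME_SECONDS):
    """Flag over-age workers and return those that can be stopped right away.

    Busy workers are only flagged, so they finish their current task and
    are returned on a later pass once they are idle.
    """
    stoppable = []
    for worker in workers:
        if worker.age(now) >= max_lifetime:
            worker.retiring = True
        if worker.retiring and not worker.busy:
            stoppable.append(worker)
    return stoppable
```

Calling `mark_retirements` on each pass of a pool's housekeeping loop would retire idle workers immediately and busy ones as soon as they become idle.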
Alan Rominger
db6e8b9bad AAP-40782 Fix too-low max_workers value, dump running at capacity (#15873)
* Dump running tasks when running out of capacity

* Use same logic for max_workers and capacity

* Address case where CPU capacity is the constraint

* Add a test for correspondence

* Fake redis to make tests work
2025-04-16 16:43:21 -04:00
Alan Rominger
5ff3d4b2fc Reduce log noise from next run being in past (#15670) 2025-04-14 09:04:16 -04:00
Alan Rominger
c3ee0c2d8a Sensible log behavior when redis is unavailable (#15466)
* Sensible log behavior when redis is unavailable

* Consistent behavior with dispatcher and callback
2025-04-10 13:45:05 -07:00
Alan Rominger
7d30dff075 Feature indirect host counting (#15802)
* AAP-37282 Add parse JQ data and test it for a `job` object in isolation (#15774)

* Add jq dependency

* Add file in progress

* Add license for jq

* Write test and get it passing

* Successfully test collection of `event_query.yml` data (#15761)

* Callback plugin method from cmeyers adapted to global collection list

Get tests passing

Mild rebranding

Put behind feature flag, flip true in dev

Add noqa flag

* Add missing wait_for_events

* feat: try grabbing query files from artifacts directory (#15776)

* Contract changes for the event_query collection callback plugin (#15785)

* Minor import changes to collection processing in callback plugin

* Move agreed location of event_query file

* feat: remaining schema changes for indirect host audits (#15787)

* Re-organize test file and move artifacts processing logic to callback (#15784)

* Rename the indirect host counting test file

* Combine artifacts saving logic

* Connect host audit model to jq logic via new task

* Add unit tests for indirect host counting (#15792)

* Do not get django flags from database (#15794)

* Document, implement, and test remaining indirect host audit fields (#15796)

* Document, implement, and test remaining indirect host audit fields

* Fix hashing

* AAP-39559 Wait for all event processing to finish, add fallback task (#15798)

* Wait for all event processing to finish, add fallback task

* Add flag check to periodic task

* feat: cleanup of old indirect host audit records (#15800)

* By default, do not count indirect hosts (#15801)

* By default, do not count indirect hosts

* Fix copy paste goof

* Fix linter issue from base branch

* prevent multiple tasks from processing the same job events, prevent p… (#15805)

prevent multiple tasks from processing the same job events, prevent periodic task from spawning another task per job

* Fix typos and other bugs found by Pablo review

* fix: rely on resolved_action instead of task, adapt to proposed query… (#15815)

* fix: rely on resolved_action instead of task, adapt to proposed query structure

* tests: update indirect host tests

* update remaining queries to new format

* update live test

* Remove polling loop for job finishing event processing (#15811)

* Remove polling loop for job finishing event processing

* Make awx/main/tests/live dramatically faster (#15780)

* temp

* remove test

* reorder migrations to allow indirect instances backport

* cleanup for rebase and merge into devel

---------

Co-authored-by: Peter Braun <pbraun@redhat.com>
Co-authored-by: jessicamack <jmack@redhat.com>
Co-authored-by: Peter Braun <pbranu@redhat.com>
2025-02-24 16:39:51 +00:00
Alan Rominger
7d2b2d672c Make awx/main/tests/live dramatically faster (#15780)
* Make awx/main/tests/live dramatically faster

* Add new setting to exclude list
2025-02-08 21:07:56 -05:00
Alan Rominger
2186c24c8f General upgrade of dependencies (#15705)
* General upgrade of dependencies

* adjust licenses to match requirements

* add missing licenses

* another pass to fix licenses

* Try easy fix for psycopg encoding pattern change

---------

Co-authored-by: jessicamack <jmack@redhat.com>
2025-01-07 15:03:43 -05:00
Alan Rominger
f377b5fdde Use runtime log utility moved to DAB (#15675)
* Use runtime log utility moved to DAB
2024-12-11 10:38:24 -05:00
jamesmarshall24
4e0d19914f LISTENER_DATABASES clobbers DATABASES OPTIONS (#15306)
Do not overwrite DATABASES OPTIONS with LISTENER_DATABASES
2024-06-27 13:26:30 -04:00
Hao Liu
d558204192 Make db password optional for wsrelay (#15046)
* Make db password optional for wsrelay

* Change DB setting copy to deepcopy

safer than copy()

Co-Authored-By: Jeff Bradberry <685957+jbradberry@users.noreply.github.com>

---------

Co-authored-by: Jeff Bradberry <685957+jbradberry@users.noreply.github.com>
2024-04-02 11:47:24 -04:00
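The "safer than copy()" remark in d558204192 comes down to nested dictionaries. A small sketch, assuming a settings dict shaped like Django's `DATABASES`; the values and the `application_name` option used here are made up for illustration:

```python
import copy

# Hypothetical settings shaped like Django's DATABASES; values are made up.
DATABASES = {
    "default": {
        "NAME": "awx",
        "OPTIONS": {"sslmode": "prefer"},
    }
}

shallow = copy.copy(DATABASES["default"])
shallow["OPTIONS"]["application_name"] = "wsrelay"
# The nested OPTIONS dict is shared, so the original was modified too.
leaked = "application_name" in DATABASES["default"]["OPTIONS"]

# Undo the leak before trying again with deepcopy.
del DATABASES["default"]["OPTIONS"]["application_name"]

deep = copy.deepcopy(DATABASES["default"])
deep["OPTIONS"]["application_name"] = "wsrelay"
# deepcopy duplicated OPTIONS, so the original is untouched.
isolated = "application_name" not in DATABASES["default"]["OPTIONS"]

print(leaked, isolated)
```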
Hao Liu
3fb3125bc3 Send QUIT to worker before dying (#14913)
Fix deadlock scenario where a dispatcher child process is stuck in its reading-from-queue loop after the dispatcher parent process decided to quit

Co-authored-by: Alan Rominger <arominge@redhat.com>
2024-02-21 16:08:43 -05:00
Alan Rominger
362e11aaf2 Respect old downtime setting name if user has already set it 2024-02-15 12:34:24 +00:00
Alan Rominger
20202054cc Use lowercase password 2024-02-13 14:36:39 +00:00
Alan Rominger
e84e2962d0 Support DB configs where PASSWORD is not used 2024-02-13 14:36:39 +00:00
Chris Meyers
8a902debd5 Per-service metrics http server
* Organize metrics into their respective service
* Serve per-service metrics on a per-service http server
* Increase prometheus client usage over our custom metrics fields
2024-02-05 15:17:24 -05:00
Alan Rominger
d91da39f81 New setting for pg_notify listener DB settings, add keepalive (#14755) 2024-01-17 13:44:04 -05:00
Alan Rominger
93c329d9d5 Fix cancel bug - WorkflowManager cancel in transaction (#14608)
This fixes a bug where jobs within a workflow job were not canceled
  when the workflow job was canceled by the user

The fix is to submit the cancel request as a part of the
  transaction that WorkflowManager commits its work in.
  This requires that we send the message without expecting a reply,
  so this changes the control-with-reply cancel to just a control function.
2023-10-30 15:30:18 -04:00
jbreitwe-rh
bb1c155bc9 Fixed typos (#14347) 2023-08-16 15:05:23 -04:00
Alan Rominger
284bd8377a Integrate scheduler into dispatcher main loop (#14067)
Dispatcher refactoring to get pg_notify publish payload
  as separate method

Refactor periodic module under dispatcher entirely
  Use real numbers for schedule reference time
  Run based on due_to_run method

Review comments about naming and code comments
2023-08-10 14:43:07 -04:00
Rick Elrod
48edb15a03 Prevent Dispatcher deadlock when Redis disappears (#14249)
This fixes https://github.com/ansible/awx/issues/14245 which has
more information about this issue.

This change addresses both:
- A clashing signal handler (registering a callback to fire when
  the task manager times out, and hitting that callback in cases
  where we didn't expect to). Make dispatcher timeout use
  SIGUSR1, not SIGTERM.
- Metrics not being reported should not make us crash, so that is
  now fixed as well.

Signed-off-by: Rick Elrod <rick@elrod.me>
Co-authored-by: Alan Rominger <arominge@redhat.com>
2023-07-18 10:43:46 -05:00
Alan Rominger
b8c48f7d50 Restore pre-upgrade pg_notify notification behavior (#14222) 2023-07-11 16:23:53 -04:00
John Westcott IV
cd4d83acb7 Compensating for NUL unicode characters
NUL characters are not allowed in text fields in the database

We used to strip them out of stdout but the exception changed

And we want to be sure to strip them out of JSONBlob fields
2023-06-14 17:40:15 -04:00
John Westcott IV
e47d30974c Removing psycopg2 references 2023-06-14 17:40:15 -04:00
Alan Rominger
fbaeb90268 Apply conservative database connection reduction changes (#14066)
This is expected to free up 4 additional database connections per traditional node,
  compared to roughly 12 in total before this change.

Of these, 3 come from using an existing connection for recently added services,
  and 1 comes from closing the connection for the idle callback receiver main process.

Signed-off-by: jessicamack <jmack@redhat.com>
Co-authored-by: jessicamack <jmack@redhat.com>
2023-06-01 14:59:18 -04:00
Alan Rominger
ef99770383 Add subsystem metrics for the dispatcher (#13989)
This adds a handful of metrics to /api/v2/metrics/ recorded from the dispatcher main process

Adds logic in the dispatcher periodic tasks to calculate these for the last collection interval
Reports worker count, task count, scale up events, and availability

Add data to demo grafana dashboard
2023-05-17 14:29:31 -04:00
Alan Rominger
342e9197b8 Customize application_name for different connections in dispatcher service (#13074)
* Introduce new method in settings, import in-line w NOQA mark

* Further refine the app_name to use shorter service names like dispatcher

* Clean up listener logic, change some names
2023-04-13 22:36:36 -04:00
Hao Liu
c8c8ed1775 Raise ValueError when no ready and enabled task instance 2023-03-29 22:09:19 -04:00
Hao Liu
25303ee625 Only select task instances that are ready and enabled
When selecting a queue for a task instance to run a task, only select task instances that are ready and enabled
2023-03-29 22:09:19 -04:00
Hao Liu
cd3f7666be add get_task_queuename
get_local_queuename will return the pod name of the instance

Now that web and task are in different pods, a task queued by the web container would be put into a queue without a task worker to execute it
2023-03-29 22:09:19 -04:00
Jessica Mack
43f4872fec these methods don't need to be class methods
Signed-off-by: Jessica Mack <jmack@redhat.com>
2023-03-29 22:04:43 -04:00
Gabriel Muniz
e15f4de0dd Fix race with heartbeat and reaper logic (#13713)
* Fix race with heartbeat and reaper logic

* Fix tests to fail when over drift over heartbeat time

* replaced modified with started time for reap() code and added test

* fixed logic bug and cleaned up tests

* Added comments to tests to call out reasoning
2023-03-17 14:24:31 -04:00
Alan Rominger
6c1d4a5cfd Skip callback receiver bulk_create with 0 events 2023-02-04 12:10:39 -05:00
Alan Rominger
f5785976be Update to comply with new black rules 2023-02-01 14:59:38 -05:00
Alan Rominger
8a4059d266 Workaround for events with NUL char, touch up error loop (#13398)
* Workaround for events with NUL char, touch up error loop

This fixes an error where some events would not save
  because they contain the 0x00 character, which errors in postgres.
  This adds a line to replace it with empty text.

Hitting that kind of event put us in an infinite error loop,
  so this change makes a number of changes to prevent similar loops.
  The showcase example is a negative counter;
  this is not realistic in the real world but works for unit tests.

These error loop fixes seek to establish the cases where we clear the buffer.
Some logic is removed from the outer loop, with the idea that
ensure_connection will better distinguish flake

* From review comments, delay NUL char sanitization to later

Use pop to make list operations more clear

* Fix incorrect use of pop
2023-01-19 13:36:23 -05:00
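The NUL character workaround in 8a4059d266 replaces 0x00 with empty text before events are saved. A minimal sketch of that idea, assuming event data is made of nested dicts, lists, and strings; the `strip_nul` helper is hypothetical and not the AWX function:

```python
def strip_nul(value):
    """Recursively replace NUL characters with empty text in strings.

    Hypothetical helper; it walks dicts and lists so nested event data
    is covered as well. Other types are returned unchanged.
    """
    if isinstance(value, str):
        return value.replace("\x00", "")
    if isinstance(value, dict):
        return {strip_nul(k): strip_nul(v) for k, v in value.items()}
    if isinstance(value, list):
        return [strip_nul(item) for item in value]
    return value


event = {"stdout": "ok\x00done", "event_data": {"res": ["a\x00", 1]}}
print(strip_nul(event))
```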
Jeff Bradberry
721e19e1c8 Merge pull request #13181 from jbradberry/remove-qsstats
Replace the querysets provided by django-qsstats-magic
2022-11-11 10:58:51 -05:00
Jeff Bradberry
e029cf7196 Remove the django-qsstats-magic dependency 2022-11-10 15:37:44 -05:00
Alan Rominger
1f939aa25e Merge pull request #12884 from AlanCoding/is_testing
[tech debt] Move the IS_TESTING method out of settings
2022-11-09 15:29:35 -05:00
Alan Rominger
192f45bbd0 Make canceling view non-atomic to fix 500 errors with job bursts (#13072)
* Make canceling view non-atomic to fix 500 errors with job bursts

* Update test calls for cancel method changes
2022-10-20 15:02:54 -04:00