AAP-57817 Add Redis connection retry using redis-py 7.0+ built-in (#16176)

* AAP-57817 Add Redis connection retry using redis-py 7.0+ built-in mechanism

* Refactor Redis client helpers to use settings and eliminate code duplication

* Create awx/main/utils/redis.py and move Redis client functions to avoid circular imports

* Fix subsystem_metrics to share Redis connection pool between
  client and pipeline

* Cache Redis clients in RelayConsumer and RelayWebsocketStatsManager to avoid creating new connection pools on every call

* Add backoff cap and base configuration

* Add Redis retry logic with exponential backoff to handle connection failures during long-running operations

* Add REDIS_BACKOFF_CAP and REDIS_BACKOFF_BASE settings to allow
  adjustment of retry timing in worst-case scenarios without code changes

* Simplify Redis retry tests by removing unnecessary reload logic
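
The retry behavior described in the bullets above can be sketched as capped exponential backoff. This is a minimal, stdlib-only illustration and not the code from this commit: the commit relies on redis-py's built-in retry support, and the default values of `REDIS_BACKOFF_BASE` and `REDIS_BACKOFF_CAP` are not shown in this excerpt, so the numbers below are assumptions. `REDIS_RETRIES`, `backoff_delay`, and `call_with_retry` are hypothetical names introduced only for this example.

```python
import time

# Assumed values for illustration only; the commit names these two settings
# but this excerpt does not show their defaults.
REDIS_BACKOFF_BASE = 0.5  # seconds, starting point for the delay
REDIS_BACKOFF_CAP = 4.0   # seconds, upper bound on any single delay
REDIS_RETRIES = 5         # hypothetical retry count, not from the commit


def backoff_delay(failures, base=REDIS_BACKOFF_BASE, cap=REDIS_BACKOFF_CAP):
    """Return the wait before the next attempt: doubles per failure, never above cap."""
    return min(cap, base * 2 ** failures)


def call_with_retry(operation, retries=REDIS_RETRIES, sleep=time.sleep):
    """Run operation(), retrying on ConnectionError with capped exponential backoff."""
    failures = 0
    while True:
        try:
            return operation()
        except ConnectionError:
            failures += 1
            if failures > retries:
                # Retries exhausted: surface the original error to the caller.
                raise
            sleep(backoff_delay(failures))
```

Raising the cap lengthens the worst-case wait between attempts, and raising the base shifts the whole schedule upward, which is presumably why both are exposed as settings rather than hard-coded.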
Lila Yasin
2025-12-01 09:08:47 -05:00
committed by GitHub
parent 0d86874d5d
commit 4f41b50a09
17 changed files with 264 additions and 24 deletions


@@ -33,6 +33,7 @@ from awx.main.models.rbac import (
 )
 from awx.main.models.unified_jobs import UnifiedJob
 from awx.main.utils.common import get_corrected_cpu, get_cpu_effective_capacity, get_corrected_memory, get_mem_effective_capacity
+from awx.main.utils.redis import get_redis_client
 from awx.main.models.mixins import RelatedJobsMixin, ResourceMixin
 from awx.main.models.receptor_address import ReceptorAddress
@@ -397,7 +398,7 @@ class Instance(HasPolicyEditsMixin, BaseModel):
 try:
     # if redis is down for some reason, that means we can't persist
     # playbook event data; we should consider this a zero capacity event
-    redis.Redis.from_url(settings.BROKER_URL).ping()
+    get_redis_client().ping()
 except redis.ConnectionError:
     errors = _('Failed to connect to Redis')