the bigint migration removed the foreign key constraints for:
- host_id
- job_id (and projectupdate_id, etc...)
because of this, we no longer need to check explicitly for a host_id
IntegrityError (it can no longer occur)
additionally, while it's possible to insert an event with a mismatched
job_id now (for example, you can totally start a long-running job, and
delete the job record in the background using the ORM or psql), doing
so results in DoesNotExist errors in the code that handles the
playbook_on_stats events
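as a rough sketch of the failure mode (the handler function below is hypothetical; `Job` is the Django-style model and `DoesNotExist` is the standard Django exception):

```python
from awx.main.models import Job

def handle_playbook_on_stats(job_id, event_data):
    # hypothetical sketch: the parent job row may have been deleted mid-run
    # (via the ORM or psql), in which case the lookup raises DoesNotExist
    try:
        job = Job.objects.get(pk=job_id)
    except Job.DoesNotExist:
        # nothing left to update; the job record is gone
        return
    # ... update host summaries / notifications for `job` here ...
```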
instead, just have each worker connect directly to redis
this has a few benefits:
- it's simpler to explain and debug
- back pressure on the queue keeps messages around in redis (which is
observable, and survives the restart of Python processes)
- it's likely notably more performant at high loads
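a minimal sketch of what "connect directly" looks like with the redis-py client (the queue name and event handling are illustrative, not the actual callback receiver code):

```python
import json
import redis

def worker_loop(queue_name='callback_tasks'):
    # each worker process opens its own connection to redis rather than
    # receiving messages relayed through an intermediary process
    conn = redis.Redis()
    while True:
        # BLPOP blocks until a message arrives; anything not yet consumed
        # just sits in the redis list, which is the observable back
        # pressure described above (and it survives worker restarts)
        item = conn.blpop(queue_name, timeout=5)
        if item is None:
            continue
        _, payload = item
        event = json.loads(payload)
        # ... persist `event` to the database here ...
```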
make the --status flag work by fetching a periodically recorded snapshot
of internal process state; additionally, update the callback receiver to
*also* record these statistics so we can gain more insight into any
performance issues
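roughly what that could look like (the redis key and the worker attributes below are assumptions for illustration, not the actual recording code):

```python
import json
import time
import redis

STATS_KEY = 'callback_receiver_statistics'  # illustrative key name

def record_statistics(pool, interval=30):
    # periodically serialize a snapshot of internal process state so that
    # a separate `--status` invocation can read it later, even if the
    # workers themselves are busy or wedged
    conn = redis.Redis()
    while True:
        snapshot = {
            'timestamp': time.time(),
            'workers': [
                # pid / messages_handled are assumed worker attributes
                {'pid': w.pid, 'messages': w.messages_handled}
                for w in pool.workers
            ],
        }
        conn.set(STATS_KEY, json.dumps(snapshot))
        time.sleep(interval)

def print_status():
    # roughly what `--status` does: fetch and print the last snapshot
    raw = redis.Redis().get(STATS_KEY)
    print(raw.decode() if raw else 'no statistics recorded yet')
```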
Situations have come up where the 5+ minute kill signal for
run_task_manager is emitted to the worker process running it, but
because the worker improperly inherited the AWXConsumerBase().stop()
handler, a deadlock was ultimately triggered on the database
connection.
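the general pattern for avoiding that class of bug, sketched with a plain fork-based worker (an illustration of the idea, not the exact fix that landed):

```python
import os
import signal

def child_entrypoint(work):
    # a forked child inherits the parent's signal handlers; if the parent's
    # stop handler tears down shared state (like the database connection),
    # letting it run in the child can deadlock both sides. reset to the
    # default disposition (or install a child-safe handler) right after fork.
    signal.signal(signal.SIGTERM, signal.SIG_DFL)
    signal.signal(signal.SIGINT, signal.SIG_DFL)
    work()

def spawn_worker(work):
    pid = os.fork()
    if pid == 0:
        child_entrypoint(work)
        os._exit(0)
    return pid
```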
* Sleep before trying to reconnect
The most common reason for entering this reconnect loop is the Redis
service stopping before the callback receiver does when Tower services
are shut down.
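a hedged sketch of the retry loop (the five-second delay and queue name are illustrative):

```python
import time
import redis
from redis.exceptions import ConnectionError as RedisConnectionError

def consume_forever(handle_event, queue_name='callback_tasks', delay=5):
    # `handle_event` is whatever actually processes a message payload
    while True:
        try:
            conn = redis.Redis()
            while True:
                item = conn.blpop(queue_name, timeout=1)
                if item is not None:
                    handle_event(item[1])
        except RedisConnectionError:
            # redis usually disappears first during a full service shutdown;
            # sleeping here avoids spinning in a tight reconnect loop
            time.sleep(delay)
```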
success/failure notifications for *playbooks* include summary data about
the hosts involved, based on the contents of the playbook_on_stats event
the current implementation suffers from a number of race conditions that
sometimes can cause that data to be missing or incomplete; this change
makes it so that for *playbooks* we build (and send) the notification in
response to the playbook_on_stats event, not the EOF event
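roughly how the summary falls out of the stats event itself, assuming the usual Ansible stats shape (per-host counters keyed under `ok`, `changed`, `failures`, `dark`, `skipped`):

```python
def summarize_hosts(event_data):
    # build the per-host summary for the notification directly from the
    # playbook_on_stats payload rather than re-querying host rows later,
    # which is where the races crept in
    summary = {}
    for key in ('ok', 'changed', 'failures', 'dark', 'skipped'):
        for host, count in event_data.get(key, {}).items():
            summary.setdefault(host, {})[key] = count
    return summary
```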
additionally, optimize away several per-event host lookups and
changed/failed propagation lookups
we've always performed these (fairly expensive) queries *on every event
save* - if you're processing tens of thousands of events in short
bursts, this is way too slow
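the gist of the optimization, sketched as a simple in-memory cache (the `job.inventory.hosts` relation is the Django-style lookup being avoided; the cache shape is illustrative):

```python
# resolve host name -> host id once per (job, host) instead of issuing the
# same query on every single event save
_host_id_cache = {}

def get_host_id(job, host_name):
    key = (job.id, host_name)
    if key not in _host_id_cache:
        _host_id_cache[key] = (
            job.inventory.hosts.filter(name=host_name)
            .values_list('id', flat=True)
            .first()
        )
    return _host_id_cache[key]
```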
this commit also introduces a new command for profiling the insertion
rate of events, `awx-manage callback_stats`
see: https://github.com/ansible/awx/issues/5514
this _technically_ prevents a remote code exploit where a user who has
access to publish AMQP messages to the dispatch queue could craft
a special message that would import and run arbitrary Python functions;
that said, users with this privilege level are generally _already_ the
awx user (so they can already do this by hand if they want)
this commit implements the bulk of `awx-manage run_dispatcher`, a new
command that binds to RabbitMQ via kombu and balances messages across
a pool of workers that are similar to celeryd workers in spirit.
Specifically, this includes:
- a new decorator, `awx.main.dispatch.task`, which can be used to
decorate functions or classes so that they can be designated as
"Tasks"
- support for fanout/broadcast tasks (at this point in time, only
`conf.Setting` memcached flushes use this functionality)
- support for job reaping
- support for success/failure hooks for job runs (i.e.,
`handle_work_success` and `handle_work_error`)
- support for an auto-scaling worker pool that scales processes up and
down on demand
- minimal support for RPC, such as status checks and pool recycle/reload
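a minimal sketch of the registration half of such a decorator (an illustration of the pattern only; the registry, `name` keyword, and `resolve()` helper are assumptions, not the actual `awx.main.dispatch.task` implementation):

```python
TASK_REGISTRY = {}

def task(fn=None, *, name=None):
    # register a function (or class) under a dotted-path style name so the
    # dispatcher can look it up when a matching message arrives
    def _register(obj):
        key = name or '{}.{}'.format(obj.__module__, obj.__name__)
        TASK_REGISTRY[key] = obj
        return obj
    return _register(fn) if fn is not None else _register

def resolve(task_name):
    # only dispatch callables that were explicitly registered, rather than
    # importing arbitrary dotted paths out of the message body
    return TASK_REGISTRY[task_name]

@task
def add(x, y):
    # hypothetical example task
    return x + y
```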