this commit implements the bulk of `awx-manage run_dispatcher`, a new
command that binds to RabbitMQ via kombu and balances messages across
a pool of workers that are similar in spirit to celeryd workers.
Specifically, this includes:
- a new decorator, `awx.main.dispatch.task`, which can be used to
  decorate functions or classes so that they can be designated as
  "Tasks" (see the sketch after this list)
- support for fanout/broadcast tasks (at this point in time, only
`conf.Setting` memcached flushes use this functionality)
- support for job reaping
- support for success/failure hooks for job runs (i.e.,
`handle_work_success` and `handle_work_error`)
- support for an auto-scaling worker pool that scales processes up and
  down on demand
- minimal support for RPC, such as status checks and pool recycle/reload
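For illustration, here is a minimal sketch of how such a registration
decorator can work; the registry and option names are hypothetical,
not the actual `awx.main.dispatch` internals:

```python
# Hypothetical sketch, not AWX's actual code: register functions or
# classes under their dotted path so a dispatcher can look them up.
TASK_REGISTRY = {}

def task(fn=None, **options):
    """Mark a function or class as a dispatchable "Task"."""
    def _register(obj):
        TASK_REGISTRY['{}.{}'.format(obj.__module__, obj.__name__)] = (obj, options)
        return obj
    if fn is None:
        return _register      # used with arguments: @task(queue='tower')
    return _register(fn)      # used bare: @task

@task
def add(a, b):
    return a + b

@task(queue='tower')
class CleanupJob:
    def run(self):
        print('cleaning up old jobs')
```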
we recently made a change so that instances no longer bind to
instance-group-specific queues; instead, each instance now binds to
a direct queue named after its own hostname
(https://github.com/ansible/tower/pull/1922)
Because of this, we shouldn't *need* to reconfigure queue bindings at
runtime anymore when group membership changes. Under our new model,
every celeryd listens on a queue named after its hostname; when the
scheduler finds a task to run, it picks an Instance in the target
Instance Group and sends the task to the queue for that Instance's
hostname.
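As a hedged sketch of that routing model (the instance selection and
the publish callable here are stand-ins, not AWX's real API):

```python
import random

class Instance:
    def __init__(self, hostname):
        self.hostname = hostname

def route_task(task_name, args, group_instances, publish):
    # Pick an instance from the target Instance Group; the queue name
    # *is* the hostname, so no group-specific bindings are required.
    instance = random.choice(group_instances)
    publish(queue=instance.hostname, body={'task': task_name, 'args': args})

route_task('awx.main.tasks.RunJob', [42],
           [Instance('awx-1'), Instance('awx-2')],
           publish=lambda queue, body: print(queue, body))
```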
* celery workers have internal queue names based on the system
hostname. This may differ from the name Tower knows the host by
(Instance.hostname).
This adds a mapping so we can convert internal celery names to Instance
names for purposes of reaping jobs.
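A sketch of the kind of mapping this adds (function name and matching
rule are illustrative):

```python
def map_celery_names_to_instances(worker_names, instance_hostnames):
    """worker_names: celery identities like 'celery@node1.example.org';
    instance_hostnames: the Instance.hostname values Tower knows about."""
    mapping = {}
    for worker in worker_names:
        system_hostname = worker.split('@', 1)[-1]
        for hostname in instance_hostnames:
            if system_hostname == hostname:
                mapping[worker] = hostname
    return mapping

# e.g. {'celery@node1.example.org': 'node1.example.org'}
print(map_celery_names_to_instances(
    ['celery@node1.example.org'], ['node1.example.org']))
```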
the main goal of this change is to make `make docker-isolated` work out
of the box
- specify the proper version for awx-expect --version
- update some deprecated playbook bits
- change the isolated container to privileged so bwrap will work
- fix awx-manage test_isolated_connection
- expedite the first isolated heartbeat so you don't have to wait 10m;
  this is accomplished by _not_ setting Instance.last_isolated_check to
  now() at insertion time (which would cause the next check not to
  happen for 10 minutes); see the sketch after this list
- fix a bug that caused isolated node execution to fail when bwrap was
enabled
see: https://github.com/ansible/tower/issues/2150
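A minimal sketch of the heartbeat-expediting point above, assuming a
10-minute check interval (names are illustrative):

```python
from datetime import datetime, timedelta, timezone

ISOLATED_CHECK_INTERVAL = timedelta(minutes=10)

def isolated_check_due(last_isolated_check, now=None):
    now = now or datetime.now(timezone.utc)
    if last_isolated_check is None:
        return True   # never stamped at insert time: check fires immediately
    return now - last_isolated_check >= ISOLATED_CHECK_INTERVAL

print(isolated_check_due(None))  # True: no 10 minute wait after registration
```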
This reverts commit 9863fe71dc.
* Deciding the Instance that a Job runs on at celery task run-time makes
it hard to evenly distribute tasks among Instances. Instead, the task
manager will look at the world of running jobs and choose an instance
node to run on, applying a deterministic job distribution algorithm
(see the sketch below).
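One way such a deterministic pick can look, as a sketch with simplified
capacity accounting:

```python
def choose_instance(capacity, consumed):
    """capacity: {hostname: total capacity}; consumed: {hostname: capacity
    used by running jobs}. Ties break by hostname so repeated scheduler
    passes make the same choice."""
    def remaining(host):
        return capacity[host] - consumed.get(host, 0)
    return max(sorted(capacity), key=remaining)

print(choose_instance({'awx-1': 100, 'awx-2': 100}, {'awx-1': 40}))  # awx-2
```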
Currently, updating policy settings doesn't trigger a re-evaluation of
instance group policies; this makes sure we re-evaluate in the event
that anything changes (see the sketch below).
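A sketch of one way to wire that up with Django model signals (the task
body and the exact senders are assumptions, not the actual
implementation):

```python
from celery import shared_task
from django.db.models.signals import post_save, post_delete

@shared_task
def apply_cluster_membership_policies():
    ...  # re-evaluate policy_instance_percentage / minimum / list here

def schedule_policy_evaluation(sender, **kwargs):
    apply_cluster_membership_policies.apply_async()

# Any create/update/delete on instances or groups now triggers a
# re-evaluation instead of being silently ignored.
for signal in (post_save, post_delete):
    signal.connect(schedule_policy_evaluation, sender='main.Instance')
    signal.connect(schedule_policy_evaluation, sender='main.InstanceGroup')
```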
* Was considering an isolated instance to be any instance that has at
least 1 group with a controller. This is technically correct, since an
iso node cannot be part of a non-iso group.
* The query is now more robust and considers a node an iso node only if
ALL groups that the node belongs to have a controller (see the sketch
after this list).
* Also added better debugging for the special tower instance group
* Added a check for the existence of the special tower group so that
logs are less "messy" during the install process.
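A sketch of the stricter query in Django ORM terms (the related name is
an assumption):

```python
def isolated_instances(Instance):
    # An instance is isolated only if *every* group it belongs to has a
    # controller, i.e. it is in no controller-less (non-iso) group.
    return Instance.objects.exclude(rampart_groups__controller__isnull=True)
```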
* This also adds fields to the instance view for tracking cpu and
memory usage as well as information on what the capacity ranges are
* Also adds a flag for enabling/disabling instances which removes them
from all queues and has them stop processing new work
* The capacity is now based almost exclusively on a value relative
to forks
* capacity_adjustment allows you to commit an instance to a certain
number of forks, either CPU-focused or memory-focused
* Each job run adds a single fork of overhead (that's the reasoning
behind the +1; see the sketch below)
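As a sketch of the capacity math described above, treating both CPU and
memory capacity as fork counts (the interpolation detail is an
assumption):

```python
def instance_capacity(cpu_capacity, mem_capacity, capacity_adjustment):
    """capacity_adjustment in [0.0, 1.0] slides between the more
    conservative and the more generous of the two fork-based values."""
    lower, higher = sorted((cpu_capacity, mem_capacity))
    return int(lower + (higher - lower) * capacity_adjustment)

def task_impact(forks):
    return forks + 1   # each job run carries one fork of overhead

print(instance_capacity(cpu_capacity=16, mem_capacity=42,
                        capacity_adjustment=0.5))  # 29
print(task_impact(5))  # 6
```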
* Switch policy router queue to not be "tower" so that we don't
fall into a chicken/egg scenario
* Show fixed policy list in serializer so a user can determine if
an instance is manually managed
* Change IG membership mixin to not directly handle applying topology
changes. Instead it just makes sure the policy instance list is
accurate
* Add create/delete hooks for instances and groups to trigger policy
re-evaluation
* Update policy algorithm for fairer distribution
* Fix an issue where CELERY_ROUTES wasn't renamed after celery/django
upgrade
* Update unit tests to be more explicit
* Update count calculations used by algorithm to only consider
non-manual instances
* Adding unit tests and fixture
* Don't propagate logging messages from awx.main.tasks and
awx.main.scheduler
* Use an advisory lock to prevent policy eval conflicts (see the sketch
after this list)
* Allow updating instance groups from view
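The advisory-lock bullet, sketched with the django-pglocks package
(the lock name is illustrative):

```python
from django_pglocks import advisory_lock

def apply_cluster_membership_policies():
    # Concurrent policy evaluations queue up on the same Postgres
    # advisory lock instead of racing each other.
    with advisory_lock('cluster_policy_lock', wait=True):
        ...  # recompute each group's policy_instance_list here
```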
* Based on the tower topology (Instance and InstanceGroup
relationships), have celery dynamically listen to queues on boot
* Add a celery task capable of "refreshing" which queues each celeryd
worker listens to. This will be used to support changes in the topology
(see the sketch after this list).
* Cleaned up some celery task definitions.
* Converged wrongly targeted job launch/finish messages to the 'tower'
queue, rather than a one-off queue.
* Dynamically route celery tasks destined for the local node
* Add support for a separate beat process
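A sketch of the queue-refresh idea using celery's broadcast control API
(worker naming and queue derivation are simplified):

```python
from celery import Celery

app = Celery('awx')

def refresh_worker_queues(worker_name, desired_queues, current_queues):
    # Reconcile a running worker's consumers with the desired topology.
    for queue in desired_queues - current_queues:
        app.control.add_consumer(queue, destination=[worker_name], reply=True)
    for queue in current_queues - desired_queues:
        app.control.cancel_consumer(queue, destination=[worker_name], reply=True)
```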
The task manager was doing work to compute currently consumed
capacity; this is moved into the manager and applied in the same
form to the instance group list (see the sketch below).
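A sketch of the shared computation (attribute names assumed):

```python
from collections import defaultdict

def consumed_capacity_by_group(running_jobs):
    """running_jobs: objects with .instance_group and .task_impact; the
    same tally now serves both the task manager and the instance group
    list view."""
    consumed = defaultdict(int)
    for job in running_jobs:
        consumed[job.instance_group] += job.task_impact
    return dict(consumed)
```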
* includes top level views for instances and instance groups and
extending those views to be able to view running jobs
* Associative endpoints on Organizations, Inventories, and Job
Templates
* Related and summary field entries where appropriate
* Adding job model references to executing instance group
* Fix up default queue properties for clustering from the settings file
* Update production and default settings for instance queues in settings
* New InstanceGroup model and associative relationship with Instances
(see the model sketch after this list)
* Associations between Organizations, Inventories, and Job Templates
and InstanceGroups
* Migrations for adding fields and tables for Instance Groups
* Adding activity stream reference for instance groups
* Task Manager Refactoring:
** Simplify task manager relationships and move away from the
interstitial hash tables
** Simplify dependency-determination logic
** Reduce task manager runtime complexity by removing the partial
references and moving the logic into the task manager directly, or
relying on Job model logic for determinism
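For orientation, a rough sketch of the new model's shape (field options
and the related name are assumptions):

```python
from django.db import models

class InstanceGroup(models.Model):
    name = models.CharField(max_length=250, unique=True)
    instances = models.ManyToManyField(
        'Instance', related_name='rampart_groups',
        help_text='Instances that are members of this InstanceGroup',
    )
```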
* Modify the instance model to contain a version number for the node
* Update that version number during the heartbeat
* If, during a heartbeat, any of the nodes are of a newer version, then
shut down the current node (see the sketch below).
The idea behind this is that if all nodes were upgraded at the same
time, then at the moment of the health check they should all be at the
newer version. Otherwise we put the system in a state where it can
receive the upgrade but stays down until that happens. During the setup
playbook run, the services will be fully restarted.
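A sketch of the version guard (version parsing and the shutdown hook
are simplified):

```python
from packaging.version import Version

def check_cluster_version(my_version, peer_versions, shutdown):
    # If any node heartbeats a newer version, this node shuts down and
    # stays down until the setup playbook restarts it post-upgrade.
    if any(Version(v) > Version(my_version) for v in peer_versions):
        shutdown()

check_cluster_version('3.2.0', ['3.2.0', '3.3.0'],
                      shutdown=lambda: print('newer peer found; shutting down'))
```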
* Gut the HA middleware
* Purge concept of primary and secondary.
* UUID is no longer the primary host identifier; now it's based mostly
on the hostname. Some work is probably still left to do to make sure
this is legit. Also removed the unique constraint from the uuid field;
this might become the cluster ident now... or it may just be deprecated
* Initial revision of /api/v1/ping (see the sketch after this list)
* Revise and gut tower-manage register_instance
* Rename awx/main/socket.py to awx/main/socket_queue.py to prevent
conflict with the "socket" module from python base
* Revisit/gut the Instance manager... not sure if this manager is really
needed anymore
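And a minimal sketch of what the ping view can look like in DRF (the
payload fields are assumptions):

```python
from rest_framework.permissions import AllowAny
from rest_framework.response import Response
from rest_framework.views import APIView

class Ping(APIView):
    """Unauthenticated health probe for /api/v1/ping."""
    permission_classes = (AllowAny,)
    authentication_classes = ()

    def get(self, request):
        return Response({
            'version': '3.2.0',
            'instances': [{'node': 'awx-1', 'heartbeat': None}],
        })
```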