External-Mirrors/awx

mirror of https://github.com/ansible/awx.git synced 2026-05-22 08:17:39 -02:30

Author	SHA1	Message	Date
Jeff Bradberry	f340f491dc	Control the visibility and use of hop node Instances - the list, detail, and health check API views should not include them - the Instance-InstanceGroup association views should not allow them to be changed - the ping view excludes them - list_instances management command excludes them - Instance.set_capacity_value sets hop nodes to 0 capacity - TaskManager will exclude them from the nodes available for job execution - TaskManager.reap_jobs_from_orphaned_instances will consider hop nodes to be an orphaned instance - The apply_cluster_membership_policies task will not manipulate hop nodes - get_broadcast_hosts will ignore hop nodes - active_count also will ignore hop nodes	2021-12-17 14:30:28 -05:00
Jeff Bradberry	c8f1e714e1	Capture hop nodes and the peer links between nodes	2021-12-17 14:30:18 -05:00
Alan Rominger	7b35902d33	Respect settings to keep files and work units Add new logic to cleanup orphaned work units from administrative tasks Remove noisy log which is often irrelevant about running-cleanup-on-execution-nodes we already have other logs for this	2021-11-10 08:50:10 +08:00
Alan Rominger	b70793db5c	Consolidate cleanup actions under new `ansible-runner worker cleanup` command (#11160) * Primary development of integrating runner cleanup command * Fixup image cleanup signals and their tests * Use alphabetical sort to solve the cluster coordination problem * Update test to new pattern * Clarity edits to interface with ansible-runner cleanup method * Another change corresponding to ansible-runner CLI updates * Fix incomplete implementation of receptor remote cleanup * Share receptor utils code between worker_info and cleanup * Complete task logging from calling runner cleanup command * Wrap up unit tests and some contract changes that fall out of those * Fix bug in CLI construction * Fix queryset filter bug	2021-10-05 16:32:03 -04:00
Alan Rominger	3fc63489f1	Filter controller_node selection to online nodes (#11120)	2021-09-24 23:01:32 -04:00
Rebeccah	55f2125a51	if the user provides a uuid and it exists, allow that to tie to the instance, which allows the user to update the instance based on the UUID (includeding updating the hostname) should they choose to do so.	2021-09-21 16:54:11 -04:00
Alan Rominger	6a17e5b65b	Allow manually running a health check, and make other adjustments to the health check trigger (#11002) * Full finalize the planned work for health checks of execution nodes * Implementation of instance health_check endpoint * Also do version conditional to node_type * Do not use receptor mesh to check main cluster nodes health * Fix bugs from testing health check of cluster nodes, add doc * Add a few fields to health check serializer missed before * Light refactoring of error field processing * Fix errors clearing error, write more unit tests * Update health check info in docs * Bump migration of health check after rebase * Mark string for translation * Add related health_check link for system auditors too * Handle health_check cluster node timeout, add errors for peer judgement	2021-09-03 16:37:37 -04:00
Jim Ladd	262cd3c695	set default uuid	2021-09-03 10:05:15 -07:00
Alan Rominger	424dbe8208	Use ansible-runner imports for cpu and memory calculation (#10954) * Use ansible-runner imports for cpu and memory calculation * Fix bug with capacity and memory adjustment	2021-08-27 21:46:53 -04:00
Alan Rominger	daf4310176	Clean up work_type processing and fix execution vs control capacity (#10930) * Clean up added work_type processing for mesh_code branch * track both execution and control capacity * Remove unused execution_capacity property * Count all forms of capacity to make test pass * Force jobs to be on execution nodes, updates on control nodes * Introduce capacity_type property to abstract some details out * Update test to cover all job types at same time * Register OpenShift nodes as control types * Remove unqualified consumed_capacity from task manager and make unit tests work * Remove unqualified consumed_capacity from task manager and make unit tests work * Update unit test to execution vs control TM logic changes * Fix bug, else handling for work_type method	2021-08-26 07:24:14 -04:00
Alan Rominger	940c189c12	Corresponding AWX changes for runner --worker-info schema update (#10926)	2021-08-24 08:41:36 -04:00
Alan Rominger	928c35ede5	Model changes for instance last_seen field to replace modified (#10870) * Model changes for instance last_seen field to replace modified * Break up refresh_capacity into smaller units * Rename execution node methods, fix last_seen clustering * Use update_fields to make it clear save only affects capacity * Restructing to pass unit tests * Fix bug where a PATCH did not update capacity value	2021-08-24 08:41:35 -04:00
Alan Rominger	3b1e40d227	Use the ansible-runner worker --worker-info to perform execution node capacity checks (#10825) * Introduce utilities for --worker-info health check integration * Handle case where ansible-runner is not installed * Add ttl parameter for health check * Reformulate return data structure and add lots of error cases * Move up the cleanup tasks, close sockets * Integrate new --worker-info into the execution node capacity check * Undo the raw value override from the PoC * Additional refinement to execution node check frequency * Put in more complete network diagram * Followup on comment to remove modified from from health check responsibilities	2021-08-24 08:41:35 -04:00
Alan Rominger	f47eb126e2	Adopt the node_type field in receptor logic (#10802) * Adopt the node_type field in receptor logic * Refactor Instance.objects.register so we do not reset capacity to 0	2021-08-24 08:41:34 -04:00
Alan Rominger	b53d3bc81d	Undo some things not compatible with hybrid node hack (#10763)	2021-08-24 08:41:34 -04:00
Alan Rominger	9881bb72b8	Treat the awx_1 node as a hybrid node for now, use local work type (#10726)	2021-08-24 08:40:21 -04:00
Alan Rominger	f597205fa7	Run capacity checks with container isolation (#10688) This requires swapping out the container images for the execution nodes from awx-ee to the awx image For completeness, the hop node image is switched to the raw receptor image A few outright bugs are fixed here memory calculation just was not right at all the execution_capacity calculation was reverse of intention Drop in a few TODOs about error handling from debugging	2021-08-24 08:40:19 -04:00
Alan Rominger	13300bdbd4	Update rebase to keep old control plane capacity check Also do some basic work to separate control versus execution capacity this is to assure that we don't send jobs to the control node	2021-08-24 08:40:19 -04:00
Alan Rominger	39e23db523	Make minor changes to add needed imports	2021-08-24 08:40:19 -04:00
Ryan Petrello	05cb876df5	implement an initial development environment for receptor-based clusters	2021-08-24 08:40:18 -04:00
Rebeccah	706f3f97ea	add a new field to the instance model for use with receptor changes (incoming)	2021-07-26 15:53:56 -04:00
Shane McDonald	ec8ac6f1a7	Introduce distinct controlplane instance group	2021-06-07 11:25:59 -04:00
Jeff Bradberry	1819a7963a	Make the necessary changes to the models - remove InstanceGroup.controller - remove Instance.last_isolated_check - remove .is_isolated and .is_controller methods/properties - remove .choose_online_controller_node() method - remove .supports_isolation() and replace with .can_run_containerized - simplify .can_run_containerized	2021-04-22 10:17:02 -04:00
Ryan Petrello	c2ef0a6500	move code linting to a stricter pep8-esque auto-formatting tool, black	2021-03-23 09:39:58 -04:00
Shane McDonald	1c4a376758	Explicit db field for is_container_group We now have Container Groups that dont require a credential.	2021-03-15 13:28:39 -04:00
Shane McDonald	57b317d440	Get system jobs working under new deployment model (#9221)	2021-03-03 18:52:55 -05:00
Ryan Petrello	f850f8d3e0	introduce a new global flag for denoating K8S-based deployments - In K8S-based installs, only container groups are intended to be used for playbook execution (JTs, adhoc, inventory updates), so in this scenario, other job types have a task impact of zero. - In K8S-based installs, traditional instances have zero capacity (because they're only members of the control plane where services - http/s, local control plane execution - run) - This commit also includes some changes that allow for the task manager to launch tasks with task_impact=0 on instances that have capacity=0 (previously, an instance with zero capacity would never be selected as the "execution node" This means that when IS_K8S=True, any Job Template associated with an Instance Group will never actually go from pending -> running (because there's no capacity - all playbooks must run through Container Groups). For an improved ux, our intention is to introduce logic into the operator install process such that the default group that's created at install time is a Container Group that's configured to point at the K8S cluster where awx itself is deployed.	2021-03-03 18:52:55 -05:00
Shane McDonald	286b1d4e25	InstanceGroup#is_containerized -> InstanceGroup#is_container_group	2021-03-03 18:52:55 -05:00
Chris Meyers	2eac5a8873	reduce per-job database query count * Do not query the database for the set of Instance that belong to the group for which we are trying to fit a job on, for each job. * Instead, cache the set of instances per-instance group.	2020-10-19 10:54:56 -04:00
Bill Nottingham	05ad85e7a6	Remove the model for the now unused TowerAnalyticsState.	2020-09-09 20:18:04 -04:00
Ryan Petrello	b01d204137	if redis is unreachable, set instance capacity to zero	2020-09-03 15:11:53 -04:00
Shane McDonald	45ce6d794e	Initial migration of rabbitmq -> redis for k8s installs	2020-03-18 16:10:17 -04:00
Rebeccah	1f05372ac9	change the logic to not break existing policy_instance testing	2019-11-12 13:13:34 -05:00
Rebeccah	d0327fc044	added onto the when saved function for instance groups that sets policy variables to their default.	2019-11-12 13:13:34 -05:00
Shane McDonald	bd5003ca98	Task manager / scheduler Kubernetes integration	2019-10-04 13:21:21 -04:00
Shane McDonald	a9059edc65	Allow associating a credential with an instance group	2019-10-04 12:54:31 -04:00
Jim Ladd	cdcf2fa4c2	Increase instance version length	2019-10-03 21:24:07 -04:00
AlanCoding	fedd1cf22f	Replace JobOrigin with ActivityStream.action_node	2019-05-31 07:10:07 -04:00
softwarefactory-project-zuul[bot]	874465a2d4	Merge pull request #3865 from chrismeyersfsu/fix-enabled_still_online disabled instance does not mean offline instance Reviewed-by: https://github.com/softwarefactory-project-zuul[bot]	2019-05-23 16:55:09 +00:00
AlanCoding	f4c18843a3	Resolve default ordering warnings from tests	2019-05-20 10:58:36 -04:00
chris meyers	8aa28092ff	disabled instance does not mean offline instance * Disabling an instance is used to stop and instance from being the target of new jobs to run. * The instance should still perform it's heartbeat so that it isn't considered offline. * If the instance was allowed to go offline on an openshift cluster it would be deleted from the database.	2019-05-13 11:44:47 -04:00
Ryan Petrello	e4a50f3595	enforce a stable list order when attaching/detaching instance groups	2019-05-07 14:53:00 -04:00
softwarefactory-project-zuul[bot]	fad0274373	Merge pull request #3686 from vismay-golwala/instance_group_delete [WIP] Disallow deleting controller or isolated instance groups Reviewed-by: https://github.com/softwarefactory-project-zuul[bot]	2019-04-24 15:19:19 +00:00
AlanCoding	8c2b3e9b84	Fix Django 2.0 deprecation warnings	2019-04-22 14:17:14 -04:00
Vismay Golwala	e0c4fd4b3a	Disallow deleting controller or isolated instance groups Added two new properties to the InstanceGroup model - `is_controller` and `is_isolated`. Used these properties to hide the trash icon for instance groups that are either controller or isolated. Signed-off-by: Vismay Golwala <vgolwala@redhat.com>	2019-04-15 16:08:27 -04:00
Ryan Petrello	c586fa9821	add a minimal framework for generating analytics/metrics annotate queries & add license analytics	2019-03-27 19:53:00 -04:00
Bianca	f1e3be5ec8	Removing unicode fix-related lines in ha.py	2019-01-17 14:42:38 -05:00
Ryan Petrello	f223df303f	convert py2 -> py3	2019-01-15 14:09:01 -05:00
Ryan Petrello	ff1e8cc356	replace celery task decorators with a kombu-based publisher this commit implements the bulk of `awx-manage run_dispatcher`, a new command that binds to RabbitMQ via kombu and balances messages across a pool of workers that are similar to celeryd workers in spirit. Specifically, this includes: - a new decorator, `awx.main.dispatch.task`, which can be used to decorate functions or classes so that they can be designated as "Tasks" - support for fanout/broadcast tasks (at this point in time, only `conf.Setting` memcached flushes use this functionality) - support for job reaping - support for success/failure hooks for job runs (i.e., `handle_work_success` and `handle_work_error`) - support for auto scaling worker pool that scale processes up and down on demand - minimal support for RPC, such as status checks and pool recycle/reload	2018-10-11 10:53:30 -04:00
Ryan Petrello	67d1267d98	enforce 0 <= Instance.capacity_adjustment see: https://github.com/ansible/tower/issues/2839	2018-08-21 15:34:19 -04:00


151 Commits