Commit Graph

620 Commits

Author SHA1 Message Date
Jeff Bradberry
38ccea0f1f Fix up warnings
- the default auto-increment primary key field type is now
  configurable, and Django's check command issues a warning if you are
  just assuming the historical behavior of using AutoField.

- Django 3.2 brings in automatic AppConfig discovery, so all of our
  explicit `default_app_config = ...` assignments in __init__.py
  modules are no longer needed, and raise a RemovedInDjango41Warning.
2022-03-14 13:19:57 -04:00
Jeff Bradberry
1803c5bdb4 Fix up usage of django-guid
It has replaced the class-based middleware, everything is
function-based now.
2022-03-14 13:19:57 -04:00
Jeff Bradberry
e620bef2a5 Fix Django 3.1 deprecation removal problems
- FieldDoesNotExist now has to be imported from django.core.exceptions
- Django docs specifically say not to import
  django.conf.global_settings, which now has the side-effect of
  triggering one of the check errors
2022-03-07 18:11:36 -05:00
Jeff Bradberry
faa12880a9 Squash a few deprecation warnings
- inspect.getargspec() -> inspect.getfullargspec()
- register pytest.mark.fixture_args
- replace use of DRF's deprecated NullBooleanField
- fix some usage of naive datetimes in the tests
- fix some strings with backslashes that ought to be raw strings
2022-03-07 18:11:36 -05:00
Jeff Bradberry
df61d1a59c Upgrade to Django 3.0
- upgrades
  - Django 3.0.14
  - django-jsonfield 1.4.1 (from 1.2.0)
  - django-oauth-toolkit 1.4.1 (from 1.1.3)
    - Stopping here because later versions have changes to the
      underlying model to support OpenID Connect.  Presumably this can
      be dealt with via a migration in our project.
  - django-guid 2.2.1 (from 2.2.0)
  - django-debug-toolbar 3.2.4 (from 1.11.1)
  - python3-saml 1.13.0 (from 1.9.0)
  - xmlsec 1.3.12 (from 1.3.3)

- Remove our project's use of django.utils.six in favor of directly
  using six, in awx.sso.fields.

- Temporarily monkey patch six back in as django.utils.six, since
  django-jsonfield makes use of that import, and is no longer being
  updated.  Hopefully we can do away with this dependency with the new
  generalized JSONField brought in with Django 3.1.

- Force a json decoder to be used with all instances of JSONField
  brought in by django-jsonfield.  This deals with the 'cast to text'
  problem noted previously in our UPGRADE_BLOCKERS.

- Remove the validate_uris validator from the OAuth2Application in
  migration 0025, per the UPGRADE_BLOCKERS, and remove that note.

- Update the TEMPLATES setting to satisfy Django Debug Toolbar.  It
  requires at least one entry that has APP_DIRS=True, and as near as I
  can tell our custom OPTIONS.loaders setting was effectively doing
  the same thing as Django's own machinery if this setting is set.
2022-03-07 18:11:36 -05:00
Marcelo Moreira de Mello
5e8107621e Allow isolated paths as hostPath volume @ k8s/ocp/container groups 2022-02-28 10:22:20 -05:00
John Westcott IV
cb57752903 Changing session cookie name and added a way for clients to know what the name is #11413 (#11679)
* Changing session cookie name and added a way for clients to know what the key name is
* Adding session information to docs
* Fixing how awxkit gets the session id header
2022-02-27 07:27:25 -05:00
Shane McDonald
88f66d5c51 Enable Podman ipv6 support by default 2022-02-24 08:51:51 -05:00
Elijah DeLee
604cbc1737 Consume control capacity (#11665)
* Select control node before start task

Consume capacity on control nodes for controlling tasks and consider
remainging capacity on control nodes before selecting them.

This depends on the requirement that control and hybrid nodes should all
be in the instance group named 'controlplane'. Many tests do not satisfy that
requirement. I'll update the tests in another commit.

* update tests to use controlplane

We don't start any tasks if we don't have a controlplane instance group

Due to updates to fixtures, update tests to set node type and capacity
explicitly so they get expected result.

* Fixes for accounting of control capacity consumed

Update method is used to account for currently consumed capacity for
instance groups in the in-memory capacity tracking data structure we initialize in
after_lock_init and then update via calculate_capacity_consumed (both in
task_manager.py)

Also update fit_task_to_instance to consider control impact on instances

Trust that these functions do the right thing looking for a
node with capacity, and cut out redundant check for the whole group's
capacity per Alan's reccomendation.

* Refactor now redundant code

Deal with control type tasks before we loop over the preferred instance
groups, which cuts out the need for some redundant logic.

Also, fix a bug where I was missing assigning the execution node in one case!

* set job explanation on tasks that need capacity

move the job explanation for jobs that need capacity to a function
so we can re-use it in the three places we need it.

* project updates always run on the controlplane

Instance group ordering makes no sense on project updates because they
always need to run on the control plane.

Also, since hybrid nodes should always run the control processes for the
jobs running on them as execution nodes, account for this when looking for a
execution node.

* fix misleading message

the variables and wording were both misleading, fix to be more accurate
description in the two different cases where this log may be emitted.

* use settings correctly

use settings.DEFAULT_CONTROL_PLANE_QUEUE_NAME instead of a hardcoded
name
cache the controlplane_ig object during the after lock init to avoid
an uneccesary query
eliminate mistakenly duplicated AWX_CONTROL_PLANE_TASK_IMPACT and use
only AWX_CONTROL_NODE_TASK_IMPACT

* add test for control capacity consumption

add test to verify that when there are 2 jobs and only capacity for one
that one will move into waiting and the other stays in pending

* add test for hybrid node capacity consumption

assert that the hybrid node is used for both control and execution and
capacity is deducted correctly

* add test for task.capacity_type = control

Test that control type tasks have the right capacity consumed and
get assigned to the right instance group

Also fix lint in the tests

* jobs_running not accurate for control nodes

We can either NOT use "idle instances" for control nodes, or we need
to update the jobs_running property on the Instance model to count
jobs where the node is the controller_node.

I didn't do that because it may be an expensive query, and it would be
hard to make it match with jobs_running on the InstanceGroup which
filters on tasks assigned to the instance group.

This change chooses to stop considering "idle" control nodes an option,
since we can't acurrately identify them.

The way things are without any change, is we are continuing to over consume capacity on control nodes
because this method sees all control nodes as "idle" at the beginning
of the task manager run, and then only counts jobs started in that run
in the in-memory tracking. So jobs which last over a number of task
manager runs build up consuming capacity, which is accurately reported
via Instance.consumed_capacity

* Reduce default task impact for control nodes

This is something we can experiment with as far as what users
want at install time, but start with just 1 for now.

* update capacity docs

Describe usage of the new setting and the concept of control impact.

Co-authored-by: Alan Rominger <arominge@redhat.com>
Co-authored-by: Rebeccah <rhunter@redhat.com>
2022-02-14 10:13:22 -05:00
Alex Corey
62b0c2b647 Fix tooltip documentation 2022-02-02 16:18:41 -05:00
Marcelo Moreira de Mello
0fef88c358 Support user customization of container mount options and mount paths 2022-01-21 17:12:32 -05:00
Elijah DeLee
faba64890e Merge pull request #11559 from kdelee/pending_container_group_jobs_take2
Add resource requests to default podspec
2022-01-20 09:54:20 -05:00
John Westcott IV
e63ce9ed08 Api 4XX error msg customization #1236 (#11527)
* Adding API_400_ERROR_LOG_FORMAT setting
* Adding functional tests for API_400_ERROR_LOG_FORMAT
Co-authored-by: nixocio <nixocio@gmail.com>
2022-01-19 11:16:21 -05:00
Elijah DeLee
987924cbda Add resource requests to default podspec
Extend the timeout, assuming that we want to let the kubernetes scheduler
start containers when it wants to start them. This allows us to make
resource requests knowing that when some jobs queue up waiting for
resources, they will not get reaped in as short of a
timeout.
2022-01-18 13:34:39 -05:00
Amol Gautam
a4a3ba65d7 Refactored tasks.py to a package
--- Added 3 new sub-package : awx.main.tasks.system , awx.main.tasks.jobs , awx.main.tasks.receptor
--- Modified the functional tests and unit tests accordingly
2022-01-14 11:55:41 -05:00
Jeff Bradberry
db999b82ed Merge pull request #11431 from jbradberry/receptor-mesh-models
Modify Instance and introduce InstanceLink
2022-01-11 10:55:54 -05:00
John Westcott IV
c92468062d SAML user attribute flags issue #5303 (PR #11430)
* Adding SAML option in SAML configuration to specify system auditor and system superusers by role or attribute
* Adding keycloak container and documentation on how to start keycloak alongside AWX (including configuration of both)
2022-01-10 16:52:44 -05:00
Seth Foster
956638e564 Revert "Remove unnecessary DEBUG logger level settings (#11441)"
This reverts commit 8126f734e3.
2022-01-10 11:46:19 -05:00
Jeff Bradberry
f1c5da7026 Remove the auto-discovery feature 2022-01-10 11:37:19 -05:00
Alan Rominger
8126f734e3 Remove unnecessary DEBUG logger level settings (#11441)
* Remove unnecessary DEBUG logger level settings
2022-01-05 14:44:57 -05:00
Elijah DeLee
e10030b73d Allow setting default execution group pod spec
This will allow us to control the default container group created via settings, meaning
we could set this in the operator and the default container group would get created with it applied.

We need this for https://github.com/ansible/awx-operator/issues/242

Deepmerge the default podspec and the override

With out this, providing the `spec` for the podspec would override everything
contained, which ends up including the container used, which is not desired

Also, use the same deepmerge function def, as the code seems to be copypasted from
the utils
2021-12-10 15:02:45 -05:00
Alan Rominger
d6679a1e9b Respect dynamic log setting for console, downgrade exit log 2021-12-07 14:35:03 -05:00
Alan Rominger
35b62f8526 Revert "Set SESSION_COOKIE_NAME by default"
This reverts commit 59c6f35b0b.
2021-12-01 17:51:47 -05:00
Paul Belanger
59c6f35b0b Set SESSION_COOKIE_NAME by default
Make sure to use a different session cookie name then the default, to
avoid overlapping cookies with other django apps that might be running.

Signed-off-by: Paul Belanger <pabelanger@redhat.com>
2021-11-15 12:59:07 -05:00
Alan Rominger
77076dbd67 Reduce the number of triggers for execution node health checks 2021-11-10 08:50:11 +08:00
Alan Rominger
7b35902d33 Respect settings to keep files and work units
Add new logic to cleanup orphaned work units
  from administrative tasks

Remove noisy log which is often irrelevant
  about running-cleanup-on-execution-nodes
  we already have other logs for this
2021-11-10 08:50:10 +08:00
Alan Rominger
b70793db5c Consolidate cleanup actions under new ansible-runner worker cleanup command (#11160)
* Primary development of integrating runner cleanup command

* Fixup image cleanup signals and their tests

* Use alphabetical sort to solve the cluster coordination problem

* Update test to new pattern

* Clarity edits to interface with ansible-runner cleanup method

* Another change corresponding to ansible-runner CLI updates

* Fix incomplete implementation of receptor remote cleanup

* Share receptor utils code between worker_info and cleanup

* Complete task logging from calling runner cleanup command

* Wrap up unit tests and some contract changes that fall out of those

* Fix bug in CLI construction

* Fix queryset filter bug
2021-10-05 16:32:03 -04:00
Alan Rominger
3664cc3369 Disable autodiscovery except for docker-compose (#11142) 2021-09-27 11:36:11 -04:00
Alan Rominger
6a17e5b65b Allow manually running a health check, and make other adjustments to the health check trigger (#11002)
* Full finalize the planned work for health checks of execution nodes

* Implementation of instance health_check endpoint

* Also do version conditional to node_type

* Do not use receptor mesh to check main cluster nodes health

* Fix bugs from testing health check of cluster nodes, add doc

* Add a few fields to health check serializer missed before

* Light refactoring of error field processing

* Fix errors clearing error, write more unit tests

* Update health check info in docs

* Bump migration of health check after rebase

* Mark string for translation

* Add related health_check link for system auditors too

* Handle health_check cluster node timeout, add errors for peer judgement
2021-09-03 16:37:37 -04:00
Alan Rominger
928c35ede5 Model changes for instance last_seen field to replace modified (#10870)
* Model changes for instance last_seen field to replace modified

* Break up refresh_capacity into smaller units

* Rename execution node methods, fix last_seen clustering

* Use update_fields to make it clear save only affects capacity

* Restructing to pass unit tests

* Fix bug where a PATCH did not update capacity value
2021-08-24 08:41:35 -04:00
Elijah DeLee
86600531e2 Fix name of default awx ee
we are now tracking the latest tag
2021-08-11 22:06:56 -04:00
Elijah DeLee
9b2d2a1856 Use awx-ee:latest to get latest receptor and runner
We are updating the requirements to get the latest receptor and runner in the task container,
we should also have the latest in the EE
2021-08-11 11:36:42 -04:00
nixocio
f85b2b6352 Merge ui and ui_next in one dir
Merge ui and ui_next in one dir

See: https://github.com/ansible/awx/issues/10676

Update django .po files

Update django .po files

Run `awx-manage makemessages`.
2021-08-02 10:40:24 -04:00
Elijah DeLee
b431067aa8 also update control plane ee 2021-07-15 11:30:21 -04:00
Elijah DeLee
718b3bab4a Update defaults.py 2021-07-15 11:27:55 -04:00
Amol Gautam
32cc7e976a Refactored awx/settings/defaults.py 2021-06-22 10:49:38 -04:00
Christian M. Adams
06b04007a0 Rename managed_by_tower to managed 2021-06-22 10:49:36 -04:00
Alan Rominger
6db4732bf3 Rename the TOWER_ settings 2021-06-22 10:49:36 -04:00
Seth Foster
75a27c38c2 Add a periodic task to reap unreleased receptor work units
- Add work_unit_id field to UnifiedJob
2021-06-22 10:49:31 -04:00
Bill Nottingham
be18803250 Add support for Insights as an inventory source 2021-06-09 16:34:32 -04:00
softwarefactory-project-zuul[bot]
0a63c2d4a0 Merge pull request #10385 from mabashian/version_context_processor
Adds version context processor back in to fix api browser doc link

SUMMARY
Here's what it looks like on devel now (note the URL in the bottom left):

Here's what it looks like after the change (note the URL in the bottom left):

I dropped this in as it was before the removal of the old UI.  I believe the new UI needs access to some of these variables as well to force assets to be refetched after upgrade.
Also of note: I have no idea what I'm doing with django so please help me to become educated if I've done something silly here.
ISSUE TYPE

Bugfix Pull Request

COMPONENT NAME

API

Reviewed-by: Jake McDermott <yo@jakemcdermott.me>
2021-06-09 15:44:45 +00:00
Shane McDonald
070034047c Merge pull request #10377 from kdelee/control_plane_ee
Break out control plane EE as its own thing
2021-06-08 17:25:01 -04:00
softwarefactory-project-zuul[bot]
3340ef9c91 Merge pull request #10053 from AlanCoding/dropsies
Intentionally drop job event websocket messages in excess of 30 per second (configurable)

SUMMARY
The UI no longer follows the latest job events from websocket messages. Because of that, there's no reason to send messages for all events if the job event rate is high.
I used 30 because this is the number of events that I guesstimate will show in one page in the UI.
Needs the setting added in the UI.
This adds skip_websocket_message to event event_data. We could promote it to a top-level key for job events, if that is preferable aesthetically. Doing this allows us to test this feature without having to connect a websocket client. Ping @mabashian @chrismeyersfsu
ISSUE TYPE

Feature Pull Request

COMPONENT NAME

API
UI

ADDITIONAL INFORMATION
Scenario walkthrough:
a job is producing 1,000 events per second. User launches it, the screen fills up in, say 1/4 of a second. The scrollbar indicates content beyond the bottom of the screen. Now, for 3/4ths of a second, the scrollbar stays still. After that, it updates the scrollbar to the current line number that the job is on. The scrollbar continues to update the length of the output effectively once per second.

Reviewed-by: Alan Rominger <arominge@redhat.com>
Reviewed-by: Chris Meyers <None>
Reviewed-by: Jake McDermott <yo@jakemcdermott.me>
2021-06-08 20:10:45 +00:00
Alan Rominger
b306c6f258 Put new setting in defaults so unit tests will run 2021-06-08 13:33:52 -04:00
mabashian
de46fb409e Adds version context processor back in to fix api browser doc link 2021-06-07 19:52:01 -04:00
Elijah DeLee
2ce276455a Break out control plane EE as its own thing 2021-06-07 15:39:20 -04:00
Shane McDonald
ec8ac6f1a7 Introduce distinct controlplane instance group 2021-06-07 11:25:59 -04:00
Yanis Guenane
82c4f6bb88 Define a DEFAULT_QUEUE_NAME 2021-06-07 11:25:23 -04:00
softwarefactory-project-zuul[bot]
41e3a69001 Merge pull request #10225 from AlanCoding/deletions
Remove code and settings no longer used

Connect #8740

Reviewed-by: Jake McDermott <yo@jakemcdermott.me>
Reviewed-by: Shane McDonald <me@shanemcd.com>
2021-06-01 12:42:32 +00:00
softwarefactory-project-zuul[bot]
c29a7ccf8b Merge pull request #10102 from jbradberry/disable-local-users
Add the ability to disable local authentication

SUMMARY
When an external authentication system is enabled, users would like the ability to disable local authentication for enhanced security.
related #4553
TODO

 create a configure-Tower-in-Tower setting,  DISABLE_LOCAL_AUTH
 expose the setting in the settings UI
 be able to query out all local-only users

User.objects.filter(Q(profile__isnull=True) | Q(profile__ldap_dn=''), enterprise_auth__isnull=True, social_auth__isnull=True)
see: awx/main/utils/common.py, get_external_account


 write a thin wrapper around the Django model-based auth backend
 update the UI tests to include the new setting
 be able to trigger a side-effect when this setting changes
 revoke all OAuth2 tokens for users that do not have a remote
auth backend associated with them
 revoke sessions for local-only users

ultimately I did this by adding a new middleware that checks the value of this new setting and force-logouts any local-only user making a request after it is enabled


 settings API endpoint raises a validation error if there are no external users or auth sources configured

The remote user existence validation has been removed, since ultimately we can't know for sure if a sysadmin-level user will still have access to the UI.  This is being dealt with by using a confirmation modal, see below.


 add a modal asking the user to confirm that they want to turn this setting on

ISSUE TYPE


Feature Pull Request

COMPONENT NAME


API
UI

AWX VERSION

Reviewed-by: Jeff Bradberry <None>
Reviewed-by: Bianca Henderson <beeankha@gmail.com>
Reviewed-by: Mat Wilson <mawilson@redhat.com>
Reviewed-by: Michael Abashian <None>
Reviewed-by: Chris Meyers <None>
2021-05-27 18:37:47 +00:00