Commit Graph

543 Commits

Author SHA1 Message Date
Alan Rominger
342e9197b8 Customize application_name for different connections in dispatcher service (#13074)
* Introduce new method in settings, import in-line w NOQA mark

* Further refine the app_name to use shorter service names like dispatcher

* Clean up listener logic, change some names
2023-04-13 22:36:36 -04:00
Ladislav Smola
4e5cce8d15 Analytics export other subs attrs
We'll export also subscription_id since pool_id is not
enough in certain cases.

Then also export usage and account number
2023-04-10 21:47:32 -04:00
Martin Slemr
fff6fa7d7a Additional Licensing values 2023-04-07 08:56:03 -04:00
Hao Liu
bc55bcf3a2 Rename SUPERVISOR_CONFIG_PATH
previously this is used so that task running in the task container can reach into the web container to restart rsyslog

now that the web container and task container are split there's no longer a way to do that so i renamed this env var to reference where it will now do

which is pointing to the supervisor conf file of the current running container
2023-03-29 22:09:19 -04:00
jessicamack
13b9a6c5e3 Remove unused import
Signed-off-by: jessicamack <jmack@redhat.com>
2023-03-29 22:09:19 -04:00
jessicamack
da004da68a make reconfigure_rsyslog a task
Signed-off-by: jessicamack <jmack@redhat.com>
2023-03-29 22:09:18 -04:00
jessicamack
b5e04a4cb3 AWX code changes for rsyslog decoupling (#13222)
* add management command and logging for new daemon
* switch tasks over to calling pg_notify
* add daemon to docker-compose and supervisor
* renamed handle_setting_changes and moved notify call
* removed initial rsyslog configure from dispatcher
* add logging and clear cache before reconfigure
* add notify to delete
* moved pg_notify to own function
* update tests impacted by rsyslog change
* changed over to new pg_notify method

Signed-off-by: Jessica Mack <jmack@redhat.com>
2023-03-29 22:04:43 -04:00
Martin Slemr
8ec6e556a1 HostMetricSummaryMonthly API commented out 2023-03-23 14:13:16 -04:00
Martin Slemr
9badbf0b4e Compliance computation settings 2023-03-23 14:06:55 -04:00
Martin Slemr
05f918e666 HostMetric compliance computation 2023-03-23 14:06:55 -04:00
Christian Adams
52d46c88e4 External users should not be able to change their password (#13491)
* Azure AD users should not be able to change their password

* Multiple auth changes

Moving get_external_user function into awx.sso.common
Altering get_external_user to not look at current config, just user object values
Altering how api/conf.py detects external auth config (and making reusable function in awx.sso.common)
Altering logic in api.serializers in _update_pasword to use awx.sso.common

* Adding unit tests

---------

Co-authored-by: John Westcott IV <john.westcott.iv@redhat.com>
2023-02-28 15:44:34 -03:00
John Westcott IV
23a34c5dc9 Merge pull request #13466 from john-westcott-iv/ee_debugging
Enhancing debugging of `The project could not sync because there is no Execution Environment`
2023-02-16 08:11:30 -05:00
Alan Rominger
f5785976be Update to comply with new black rules 2023-02-01 14:59:38 -05:00
John Westcott IV
fd6605932a Adding exception if unable to find the controler plane ee 2023-01-24 13:50:07 -05:00
Seth Foster
6492c03965 Fix console color logs 2023-01-05 12:55:20 -05:00
Alan Rominger
2fdce43f9e Bulk save facts, and move to before status change (#12998)
* Facts scaling fixes for large inventory, timing issue

Move save of Ansible facts to before the job status changes
  this is considered an acceptable delay with the other
  performance fixes here

Remove completely unrelated unused facts method

Scale related changes to facts saving:
  Use .iterator() on queryset when looping
  Change save to bulk_update
  Apply bulk_update in batches of 100, to reduce memory
  Only save a single file modtime, avoiding large dict

Use decorator for long func time logging
  update decorator to fill in format statement
2022-11-15 15:18:06 -05:00
Alan Rominger
1f939aa25e Merge pull request #12884 from AlanCoding/is_testing
[tech debt] Move the IS_TESTING method out of settings
2022-11-09 15:29:35 -05:00
Seth Foster
de41601f27 make job lifecycle Cyan again 2022-10-24 13:50:42 -04:00
Rick Elrod
03eaeac459 Better handle IPv6 in util function update_scm_url (#12995)
- Firstly -- add a bunch of unit tests for `update_scm_url`, because it
  previously had none and desperately needed them.
- Secondly -- fix #12992 by adding back in IPv6 address brackets if they
  existed in the first place when the function was called.
- Thirdly -- fix a related case where we disallowed IPv6 in URLs that
  did not include the scheme.

Signed-off-by: Rick Elrod <rick@elrod.me>
2022-10-04 15:21:56 -05:00
Alan Rominger
cfce31419d Move the IS_TESTING method out of settings 2022-09-28 11:19:10 -04:00
Alan Rominger
61093b2532 Treat instance_groups prompt as template-less 2022-09-22 16:08:22 -04:00
John Westcott IV
e076f1ee2a Making labels additive and not adding a many item to config if already in parent 2022-09-22 15:39:49 -04:00
John Westcott IV
33c0fb79d6 JT param everything (#12646)
* Making almost all fields promptable on job templates and config models
* Adding EE, IG and label access checks
* Changing jobs preferred instance group function to handle the new IG cache field
* Adding new ask fields to job template modules
* Address unit/functional tests
* Adding migration file
2022-09-22 15:16:12 -04:00
Alan Rominger
c59bbdecdb Refactor canceling to work through messaging and signals, not database
If canceled attempted before, still allow attempting another cancel
in this case, attempt to send the sigterm signal again.
Keep clicking, you might help!

Replace other cancel_callbacks with sigterm watcher
  adapt special inventory mechanism for this too

Get rid of the cancel_watcher method with exception in main thread

Handle academic case of sigterm race condition

Process cancelation as control signal

Fully connect cancel method and run_dispatcher to control

Never transition workflows directly to canceled, add logs
2022-09-01 15:20:31 -04:00
Alan Rominger
a3fef27002 Add logs to debug waiting bottlenecking 2022-08-17 11:33:49 -04:00
Alan Rominger
30f556f845 Further resiliency changes focused on offline database
Make logs from database outage more manageable

Raise exception if update_model never recovers from problem
2022-08-10 16:16:57 -04:00
Seth Foster
e6f8852b05 Cache task_impact
task_impact is now a field on the database
It is calculated and set during create_unified_job

set task_impact on .save for adhoc commands
2022-08-05 14:33:47 -04:00
Seth Foster
0a47d05d26 split schedule_task_manager into 3
each call to schedule_task_manager becomes one of

ScheduleTaskManager
ScheduleDependencyManager
ScheduleWorkflowManager
2022-08-05 14:33:25 -04:00
Seth Foster
431b9370df Split TaskManager into
- DependencyManager spawns dependencies if necessary
- WorkflowManager processes running workflows to see if a new job is
  ready to spawn
- TaskManager starts tasks if unblocked and has execution capacity
2022-08-05 14:29:02 -04:00
Alan Rominger
fd671ecc9d Give specific messages if job was killed due to SIGTERM or SIGKILL (#12435)
* Reap jobs on dispatcher startup to increase clarity, replace existing reaping logic

* Exit jobs if receiving SIGTERM signal

* Fix unwanted reaping on shutdown, let subprocess close out

* Add some sanity tests for signal module

* Add a log for an unhandled dispatcher error

* Refine wording of error messages

Co-authored-by: Elijah DeLee <kdelee@redhat.com>
2022-06-30 13:20:08 -04:00
Alan Rominger
4543f6935f Only do substitutions for container path conversions with resolved paths (#12313)
* Resolve paths as much as possible before doing replacements

* Move unused method out of main code, test symlink
2022-06-09 11:36:29 -04:00
John Westcott IV
f37951249f Adding options fqcn (ansible.builtin.) to playbook identification 2022-06-06 17:32:37 -04:00
Alan Rominger
29d60844a8 Fix notification timing issue by sending in the latter of 2 events (#12110)
* Track host_status_counts and use that to process notifications

* Remove now unused setting

* Back out changes to callback class not needed after all

* Skirt the need for duck typing by leaning on the cached field

* Delete tests for deleted task

* Revert "Back out changes to callback class not needed after all"

This reverts commit 3b8ae350d218991d42bffd65ce4baac6f41926b2.

* Directly hardcode stats_event_type for callback class

* Fire notifications if stats event was never sent

* Remove test content for deleted methods

* Add placeholder for when no hosts matched

* Make field default be None, denote events processed with empty dict

* Make UI process null value for host_status_counts

* Fix tracking of EOF dispatch for system jobs

* Reorganize EVENT_MAP into class properties

* Consolidate conditional I missed from EVENT_MAP refactor

* Give up on the null condition, also applies for empty hosts

* Remove cls position argument not being used

* Move wrapup method out of class, add tests
2022-04-29 13:54:31 -04:00
Jeff Bradberry
11890f0eee Fix the job event partition alignment
it really should be always aligned to the hour, so that real job
events don't slip through the cracks.
2022-04-18 14:54:06 -04:00
Alan Rominger
73e02e745a Patches to make jobs robust to database restarts (#11905)
* Simple patches to make jobs robust to database restarts

* Add some wait time before retrying loop due to DB error

* Apply dispatcher downtime setting to job updates, fix dispatcher bug

This resolves a bug where the pg_is_down property
  never had the right value
  the loop is normally stuck in the conn.events() iterator
  so it never recognized successful database interactions
  this lead to serial database outages terminating jobs

New setting for allowable PG downtime is shared with task code
  any calls to update_model will use _max_attempts parameter
  to make it align with the patience time that the dispatcher
  respects when consuming new events

* To avoid restart loops, handle DB errors on startup with prejudice

* If reconnect consistently fails, exit with non-zero code
2022-03-30 09:14:20 -04:00
Seth Foster
24152555c5 Handle error for create_partition
Occasionally the create_partition will error with,
relation "main_projectupdateevent_20220323_19" already exists

This change wraps the db command into a try except block with its
own transaction
2022-03-28 16:37:50 -04:00
Jeff Bradberry
1803c5bdb4 Fix up usage of django-guid
It has replaced the class-based middleware, everything is
function-based now.
2022-03-14 13:19:57 -04:00
Jeff Bradberry
028f09002f Fix the cleanup_jobs management command
It previously depended on a private Django internal class that changed
with Django 3.1.

I've switched here instead to disabling the django-polymorphic
accessors to get the underlying UnifiedJob object for a Job, which due
to the way they implement those was resulting in N+1 behavior on
deletes.  This gets us back most of the way to the performance gains
we achieved with the custom collector class.  See
https://github.com/django-polymorphic/django-polymorphic/issues/198.
2022-03-07 18:11:36 -05:00
Jeff Bradberry
05142a779d Replace all usage of customized json fields with the Django builtin
The event_data field on event models, however, is getting an
overridden version that retains the underlying text data type for the
column, to avoid a heavy data migration on those tables.

Also, certain of the larger tables are getting these fields with the
NOT NULL constraint turned off, to avoid a long migration.

Remove the django.utils.six monkey patch we did at the beginning of
the upgrade.
2022-03-07 18:11:36 -05:00
Jeff Bradberry
b852baaa39 Fix up logger .warn() calls to use .warning() instead
This is a usage that was deprecated in Python 3.0.
2022-03-07 18:11:36 -05:00
Jeff Bradberry
a3a216f91f Fix up new Django 3.0 deprecations
Mostly text based: force/smart_text, ugettext_*
2022-03-07 18:11:36 -05:00
Alan Rominger
eb52095670 Fix bug where translated strings will cause log error to error (#11813)
* Fix bug where translated strings will cause log error to error

* Use force_str for ensuring string
2022-02-28 08:38:01 -05:00
Elijah DeLee
799968460d Fixup conversion of memory and cpu settings to support k8s resource request format (#11725)
fix memory and cpu settings to suport k8s resource request format

* fix conversion of memory setting to bytes

This setting has not been getting set by default, and needed some fixing
up to be compatible with setting the memory in the same way as we set it
in the operator, as well as with other changes from last year which
assume that ansible runner is returning memory in bytes.

This way we can start setting this setting in the operator, and get a
more accurate reflection of how much memory is available to the control
pod in k8s.

On platforms where services are all sharing memory, we deduct a
penalty from the memory available. On k8s we don't need to do this
because the web, redis, and task containers each have memory
allocated to them.

* Support CPU setting expressed in units used by k8s

This setting has not been getting set by default, and needed some fixing
up to be compatible with setting the CPU resource request/limits in the
same way as we set it in the resource requests/limits.

This way we can start setting this setting in the
operator, and get a more accurate reflection of how much cpu is
available to the control pod in k8s.

Because cpu on k8s can be partial cores, migrate cpu field to decimal.

k8s does not allow granularity of less than 100m (equivalent to 0.1 cores), so only
store up to 1 decimal place.

fix analytics to deal with decimal cpu

need to use DjangoJSONEncoder when Decimal fields in data passed to
json.dumps
2022-02-15 14:08:24 -05:00
Amol Gautam
443bdc1234 Decoupled callback functions from BaseTask Class
--- Removed all callback functions from 'jobs.py' and put them in a new file '/awx/main/tasks/callback.py'
--- Modified Unit tests unit moved
--- Moved 'update_model' from jobs.py to /awx/main/utils/update_model.py
2022-02-09 13:46:32 -05:00
Elijah DeLee
987924cbda Add resource requests to default podspec
Extend the timeout, assuming that we want to let the kubernetes scheduler
start containers when it wants to start them. This allows us to make
resource requests knowing that when some jobs queue up waiting for
resources, they will not get reaped in as short of a
timeout.
2022-01-18 13:34:39 -05:00
Alan Rominger
9664aed1f2 Remove unused ansible version method 2022-01-14 14:55:35 -05:00
Alan Rominger
82671680e3 Respect linter rule F811 for trivial re-definition 2022-01-14 13:23:04 -05:00
Amol Gautam
bff49f2a5f Merge pull request #11528 from amolgautam25/tasks-refactor-1
Refactored 'tasks.py' file  into a package
2022-01-14 12:16:32 -05:00
Amol Gautam
a4a3ba65d7 Refactored tasks.py to a package
--- Added 3 new sub-package : awx.main.tasks.system , awx.main.tasks.jobs , awx.main.tasks.receptor
--- Modified the functional tests and unit tests accordingly
2022-01-14 11:55:41 -05:00
Marcelo Moreira de Mello
1517f2d910 Don't expose serviceAccount token on default pod spec 2022-01-12 23:47:24 -05:00