Compare commits

137 Commits

Author SHA1 Message Date
jainnikhil30
4cd90163fc make the default JOB_EVENT_BUFFER_SECONDS 1 seconds (#14335) 2023-08-12 07:49:34 +05:30
Alan Rominger
8dc6ceffee Fix programming error in facts retry merge (#14336) 2023-08-11 13:54:18 -04:00
Alan Rominger
2c7184f9d2 Add a retry to update host facts on deadlocks (#14325) 2023-08-11 11:13:56 -04:00
Martin Slemr
5cf93febaa HostMetricSummaryMonthly: Analytics export 2023-08-11 09:38:23 -04:00
Alan Rominger
284bd8377a Integrate scheduler into dispatcher main loop (#14067)
Dispatcher refactoring to get pg_notify publish payload
  as separate method

Refactor periodic module under dispatcher entirely
  Use real numbers for schedule reference time
  Run based on due_to_run method

Review comments about naming and code comments
2023-08-10 14:43:07 -04:00
Jeff Bradberry
14992cee17 Add in an async task to migrate the data over 2023-08-10 13:48:58 -04:00
Jeff Bradberry
6db663eacb Modify main/0185 to set aside the json fields that might be a problem
Rename them, then create a new clean field of the new jsonb type.
We'll use a task to do the data conversion.
2023-08-10 13:48:58 -04:00
Ivanilson Junior
87bb70bcc0 Remove extra quote from Skipped task status string (#14318)
Signed-off-by: Ivanilson Junior <ivanilsonaraujojr@gmail.com>
Co-authored-by: kialam <digitalanime@gmail.com>
2023-08-09 15:58:46 -07:00
Pablo Hess
c2d02841e8 Allow importing licenses with a missing "usage" attribute (#14326) 2023-08-09 16:41:14 -04:00
onefourfive
e5a6007bf1 fix broken link to upgrade docs. related #11313 (#14296)
Signed-off-by: onefourfive <>
Co-authored-by: onefourfive <unknown>
2023-08-09 15:06:44 -04:00
Alan Rominger
6f9ea1892b AAP-14538 Only process ansible_facts for successful jobs (#14313) 2023-08-04 17:10:14 -04:00
Sean Sullivan
abc56305cc Add Request time out option for collection (#14157)
Co-authored-by: Jessica Steurer <70719005+jay-steurer@users.noreply.github.com>
2023-08-03 15:06:04 -03:00
kialam
9bb6786a58 Wait for new label IDs before setting label prompt values. (#14283) 2023-08-03 09:46:46 -04:00
Michael Abashian
aec9a9ca56 Fix rbac around credential access add button (#14290) 2023-08-03 09:18:21 -04:00
John Westcott IV
7e4cf859f5 Added PR check to ensure JIRA links are present (#13839) 2023-08-02 15:28:13 -04:00
mcen1
90c3d8a275 Update example service-account.yml for container group in documentation (#13479)
Co-authored-by: Hao Liu <44379968+TheRealHaoLiu@users.noreply.github.com>
Co-authored-by: Nana <35573203+masbahnana@users.noreply.github.com>
2023-08-02 15:27:18 -04:00
lucas-benedito
6d1c8de4ed Fix trial status and host limit with sub (#14237)
Co-authored-by: Lucas Benedito <lbenedit@redhat.com>
2023-08-02 10:27:20 -04:00
Seth Foster
601b62deef bump python-daemon package (#14301) 2023-08-01 01:39:17 +00:00
Seth Foster
131dd088cd fix linting (#14302) 2023-07-31 20:37:37 -04:00
Rick Elrod
445d892050 Drop unused django-taggit dependency (#14241)
This drops the django-taggit dependency and drops the relevant fields
from old migrations.

Signed-off-by: Rick Elrod <rick@elrod.me>
2023-07-31 10:05:27 -05:00
Michael Abashian
35a576f2dd Adds autoComplete attribute to forms that were missing it (#14080) 2023-07-28 09:49:36 -04:00
John Westcott IV
7838641215 Fixed dependencies tag in PR labeler (#14286) 2023-07-28 08:30:30 -04:00
Alan Rominger
ab5cc2e69c Simplifications for DependencyManager (#13533) 2023-07-27 15:42:29 -04:00
John Westcott IV
5a63533967 Added support to collection for named urls (#14205) 2023-07-27 10:22:41 -03:00
Christian Adams
b549ae1efa Only show the product version header when the requester is authenticated (#14135) 2023-07-26 18:38:05 -04:00
Alex Corey
bd0089fd35 fixes docs link for controller versions >= 4.3 (#14287) 2023-07-26 21:54:39 +00:00
Christian Adams
40d18e95c2 Explicitly turn off autocomplete for API login form (#14232) 2023-07-26 15:33:26 -04:00
Andrew Klychkov
191a0f7f2a docs/execution_environments.md: add a link to EE getting started guide (#14263) 2023-07-26 15:05:36 -04:00
eric-zadara
852bb0717c Return back chdir to project sync to support project-local roles/collections
Signed-off-by: eric-zadara <eric@zadarastorage.com>
2023-07-25 09:58:43 -05:00
Alan Rominger
98bfe3f43f Add missing trigger for failed-to-start nodes (#13802) 2023-07-24 12:17:46 -04:00
John Westcott IV
53a7b7818e Updating release process doc for operator hub instructions (#13564) 2023-07-24 15:29:26 +01:00
Gabriel Muniz
e7c7454a3a Remove host update code which can be non performant (#14233) 2023-07-24 09:56:40 -04:00
Homero Pawlowski
63e82aa4a3 Fix collection module docs for names, IDs, and named URLs (#14269) 2023-07-24 08:57:46 -04:00
ZitaNemeckova
fc1b74aa68 Remove extra data for AoC (#14254) 2023-07-19 11:16:53 -04:00
Alan Rominger
ea455df9f4 Only push the production images for main repo (#14261) 2023-07-19 09:51:33 -04:00
Satoe Imaishi
8e2a5ed8ae Require pyyaml >= 6.0.1 (#14262) 2023-07-18 16:25:14 -05:00
Rick Elrod
1d7e54bd39 Wrap Django RedisCache to mute exceptions (#14243)
We introduce a thin wrapper over Django's RedisCache so that the functionality of DJANGO_REDIS_IGNORE_EXCEPTIONS is retained while still being able to drop the django-redis dependency.

Credit to django-redis's implementation for the idea of using a decorator for this and abstracting out the exception handling logic.

Signed-off-by: Rick Elrod <rick@elrod.me>
2023-07-18 15:31:09 -05:00
Cristiano Nicolai
83df056f71 Small doc fixes for workflow and task manager (#14242) 2023-07-18 19:23:48 +00:00
Rick Elrod
48edb15a03 Prevent Dispatcher deadlock when Redis disappears (#14249)
This fixes https://github.com/ansible/awx/issues/14245 which has
more information about this issue.

This change addresses both:
- A clashing signal handler (registering a callback to fire when
  the task manager times out, and hitting that callback in cases
  where we didn't expect to). Make dispatcher timeout use
  SIGUSR1, not SIGTERM.
- Metrics not being reported should not make us crash, so that is
  now fixed as well.

Signed-off-by: Rick Elrod <rick@elrod.me>
Co-authored-by: Alan Rominger <arominge@redhat.com>
2023-07-18 10:43:46 -05:00
John Westcott IV
8ddc19a927 Changing how associations work in awx collection (#13626)
Co-authored-by: Alan Rominger <arominge@redhat.com>
Co-authored-by: Jessica Steurer <70719005+jay-steurer@users.noreply.github.com>
2023-07-17 14:16:55 -03:00
Sean Sullivan
b021ad7b28 Allow job_template collection module to set verbosity to 5 (#14244) 2023-07-17 09:48:14 -05:00
Rick Elrod
b8ba2feecd Tell Makefile and pre-commit.sh that they are bash
On some systems, /bin/sh is a bash symlink and running it will launch
bash in sh compatibility mode. However, bash-specific syntax will still
work in this mode (for example using == or pipefail).

However, on systems where /bin/sh is a symlink to another shell (think:
Debian-based) they might not have those bashisms.

Set the shell in the Makefile, so that it uses bash (since it is already
depending on bash, even though it is calling it as /bin/sh by default),
and add a shebang to pre-commit.sh for the same reason.

Signed-off-by: Rick Elrod <rick@elrod.me>
2023-07-14 12:06:55 -05:00
Rick Elrod
8cfb704f86 Migrate from django-redis to Django's built-in Redis caching support (#14210)
Signed-off-by: Rick Elrod <rick@elrod.me>
2023-07-13 12:16:16 -05:00
John Westcott IV
efcac860de Upgrade django to 4.2.3 (#14228) 2023-07-13 08:52:50 -04:00
Martin Slemr
6c5590e0e6 HostMetricSummaryMonthly command + views + scheduled task (#13999)
Co-authored-by: Alan Rominger <arominge@redhat.com>
2023-07-12 16:40:09 -04:00
Erez Tamam
0edcd688a2 add organization column notification template list (#13998) 2023-07-12 15:11:47 -04:00
Alan Rominger
b8c48f7d50 Restore pre-upgrade pg_notify notification behavior (#14222) 2023-07-11 16:23:53 -04:00
John Westcott IV
07e30a3d5f Refined release documentation (#14221) 2023-07-10 19:45:34 +00:00
John Westcott IV
cb5a8aa194 Fix black pre-commit hook (#14212) 2023-07-06 16:36:50 -04:00
Seth Foster
8b49f910c7 Add settings.RECEPTOR_LOG_LEVEL, update work signing key path (#14098) 2023-07-06 11:39:30 -04:00
kialam
a4f808df34 Schedules form - pass time prop as string. (#14206) 2023-07-06 07:57:55 -07:00
Alan Rominger
82abd18927 Fix DELETE 500 KeyError due to eventless model events (#14172) 2023-07-05 15:37:52 -04:00
John Westcott IV
5e9d514e5e Added CSRF Origin in settings (#14062) 2023-07-05 15:18:23 -04:00
Rick Elrod
4a34ee1f1e Add optional pgbouncer to dev environment (#14083)
Signed-off-by: Rick Elrod <rick@elrod.me>
2023-07-05 13:41:47 -05:00
John Westcott IV
3624fe2cac Add combined roles/collection requirements on project sync (#14081) 2023-07-05 13:25:44 -03:00
Cesar Francisco San Nicolas Martinez
0f96d9aca2 Rename/relocate receptor crt in install bundle (#14201) 2023-07-05 14:50:55 +02:00
Shane McDonald
989b80e771 Fix selinux errors with Redis mount in dev env 2023-07-03 09:57:01 -04:00
John Westcott IV
cc64be937d Fix spelling errors in readme of awx_collection/tools
Signed-off-by: John Westcott <john.westcott.iv@redhat.com>
2023-06-30 15:41:47 -04:00
John Westcott IV
94183d602c Enhancing vault integration
Added persistent storage

Auto-create vault and awx via playbooks

Create a new pattern for custom containers where we can do initialization

Auto-install roles needed for plumbing via the Makefile
2023-06-30 10:05:15 -04:00
Vidya Nambiar
ac4ef141bf Fix filter experience when assigning access to teams (#14175) 2023-06-29 15:15:32 -04:00
jainnikhil30
86f6b54eec add the bulk api swagger topic for API reference docs (#14181) 2023-06-28 21:55:38 +05:30
Michael Abashian
bd8108b27c Fixed bug where a weekly rrule string without a BYDAY would result in the UI throwing a TypeError (#14182) 2023-06-28 11:10:49 -04:00
Alan Rominger
aed96fb365 Use the proper queryset to filter project update events (#14166) 2023-06-26 21:41:08 -04:00
Alan Rominger
fe2da52eec Upgrade Github actions issue labeler to fix 404 errors (#14163) 2023-06-26 17:14:53 -04:00
Alan Rominger
974465e46a Add hashivault option as docker-compose optional container (#14161)
Co-authored-by: Sarabraj Singh <singh.sarabraj@gmail.com>
2023-06-26 15:48:58 -04:00
Alan Rominger
c736986023 Try to fix CI by adding dropped coreapi lib (#14165) 2023-06-26 15:11:12 -04:00
Akira Yokochi
6b381aa79e Add example for ad_hoc_command module (#14106) 2023-06-23 11:59:16 -04:00
Alan Rominger
755e55ec70 Remove reference to unmaintained runner image (#14143) 2023-06-23 10:15:11 -04:00
Rick Elrod
255c2e4172 [wsrelay] Give connection tasks time to clean up
When we close/cancel a connection to a web node, give the task time to
clean up after itself and cleanly exit. Otherwise, the Python GC might
clean up the task too early and this leads to ugly log messages like
this: "Task was destroyed but it is pending!"

Signed-off-by: Rick Elrod <rick@elrod.me>
2023-06-23 00:56:24 -05:00
Alan Rominger
aa8437fd77 Tooling for running collection tests locally ad hoc (#14160) 2023-06-22 13:32:09 -04:00
Akira Yokochi
66f14bfe8f Using execution_environment option in ad_hoc_command module (#14105) 2023-06-22 13:10:01 -04:00
Gabriel Muniz
721a2002dc Add --interval to launch monitor command (#14068)
Co-authored-by: Jessica Steurer <70719005+jay-steurer@users.noreply.github.com>
2023-06-22 11:07:26 -03:00
Seth Foster
af39b2cd3f Rename work signing private key filename (#14156) 2023-06-21 19:50:04 -04:00
Lorenzo Tanganelli
cdd48dd7cd Add instance_groups on resource_list_param_keys in awx_collection (#14146) 2023-06-21 19:29:14 +00:00
Sean Sullivan
d3de884baf In collection, give changed status in workflow_job_template when destroying nodes (#13928) 2023-06-21 15:17:53 -04:00
Benjamin Dudas
fa8968b95b Fix for Save on the Jobs settings page not responding (#14103)
Co-authored-by: Michael Abashian <mabashia@redhat.com>
2023-06-21 15:14:31 -04:00
Jesse Wattenbarger
897a19e127 Add None check back to get_post_fields (#14155) 2023-06-21 12:37:59 -04:00
Artsiom Musin
4bae961b5f Improve performance for awx cli export (#13182)
Co-authored-by: Jesse Wattenbarger <jwattenb@redhat.com>
2023-06-21 10:49:22 -04:00
Seth Foster
900c4fd8f1 Rename work signing private key filename (#14151) 2023-06-21 09:52:58 -04:00
Akira Yokochi
4d5bbd7065 Fixed typo in integration test for group module (#14140) 2023-06-21 09:28:01 -04:00
Gabriel Muniz
fb8fadc7f9 Add new ANSIBLE_COLLECTIONS_PATH in preparation for deprecation of plural version (#14079) 2023-06-20 10:32:18 -03:00
John Westcott IV
ba99ddfd82 Fix PR and issue labeler job permissions (#14134) 2023-06-15 18:56:40 +00:00
Gabriel Muniz
9676a95e05 Add AWS Secretsmanager plugin (#13778)
Co-authored-by: Jessica Steurer <70719005+jay-steurer@users.noreply.github.com>
2023-06-15 10:12:02 -04:00
Gabriel Muniz
36d6ed9cac Removed automatic failure of job template launch when last project update is failed and update on launch is enabled (#13796)
Co-authored-by: Jessica Steurer <70719005+jay-steurer@users.noreply.github.com>
2023-06-15 10:11:13 -04:00
Gabriel Muniz
875f1a82e4 Add dynamically configurable debug settings (#14008)
Co-authored-by: Michael Abashian <mabashia@redhat.com>
2023-06-15 09:31:54 -04:00
Rick Elrod
db71b63829 Address comments from @jjwatt
Signed-off-by: Rick Elrod <rick@elrod.me>
2023-06-14 17:40:15 -04:00
John Westcott IV
cd4d83acb7 Compensating for NUL unicode characters
NUL characters are not allowed in text fields in the database

We used to strip them out of stdout but the exception changed

And we want to be sure to strip them out of JSONBlob fields
2023-06-14 17:40:15 -04:00
John Westcott IV
7e25a694f3 Making all non-complicated JSONBlobs JSONFields 2023-06-14 17:40:15 -04:00
John Westcott IV
baca43ee62 Performing test maintenance 2023-06-14 17:40:15 -04:00
John Westcott IV
3b69552260 Forcing our JSONField to use text instead of Jsonb data 2023-06-14 17:40:15 -04:00
Rick Elrod
f9bd780d62 [wsrelay] Port back to psycopg3
Signed-off-by: Rick Elrod <rick@elrod.me>
2023-06-14 17:40:15 -04:00
John Westcott IV
a665d96026 Replacing psycopg2.copy_expert with psycopg3.copy 2023-06-14 17:40:15 -04:00
John Westcott IV
e47d30974c Removing psycopg2 references 2023-06-14 17:40:15 -04:00
John Westcott IV
2b8ed66f3e Updating old migration for psycopg3 2023-06-14 17:40:15 -04:00
John Westcott IV
dfe8b3b16b Removes psycopg2 in favor of psycopg3 2023-06-14 17:40:15 -04:00
Artsiom Musin
c738d0788e Check for a list of all option instead of string (#14046) 2023-06-14 15:41:06 -04:00
Jesse Wattenbarger
0c2d589109 Lazy init VERSION vars in Makefile (#14093) 2023-06-14 15:00:38 -04:00
Sean Sullivan
a47bbb5479 bugfix collection role module target_teams and instance_groups options (#14119) 2023-06-14 17:53:24 +00:00
Shane McDonald
4b4b73c02a Fix ARM builds (#14125) 2023-06-14 16:40:59 +00:00
John Westcott IV
d1d08fe499 Changed pin of rsyslog version (#14117) 2023-06-13 16:33:25 -04:00
Hao Liu
7e7a9f541c Remove install bundle download restriction (#14092) 2023-06-12 16:08:44 -04:00
kialam
98d67e2133 Update Patternfly and related deps. (#14086) 2023-06-12 12:35:26 -07:00
Alan Rominger
7a36041bf2 Remove whitespace artifacts from black with f-strings (#14112) 2023-06-12 11:52:22 -04:00
Hao Liu
b96564da55 Rename/relocate receptor cert and keys (#14091) 2023-06-09 12:57:04 -04:00
Seth Foster
044d6bf97c Fix task_system logs twice (#14096) 2023-06-07 16:50:56 -04:00
delinea-sagar
d357c1162f Awx.credential plugin.tss (#13985) 2023-06-07 19:36:15 +00:00
Darshan
3c22fc9242 Fix : awx.awx.group preserve hosts fails when there are no hosts (#13913)
Co-authored-by: Sean Sullivan <ssulliva@redhat.com>
2023-06-07 15:24:59 -04:00
Seth Foster
8c86092bf5 Remove random UUIDs from swagger json (#14089) 2023-06-06 10:44:15 -04:00
Cesar Francisco San Nicolas Martinez
081206965c Generate random UUID by default for added remote nodes (#14074) 2023-06-06 12:36:28 +02:00
Rick Elrod
036f85cd80 Two silly internal cleanups
- Nix an unused function from run_dispatcher. This stopped being used
  in 558e92806b but was never removed.

- Fix a typo in run_ws_heartbeat: hearbeat -> heartbeat that has existed
  since the beginning of this daemon.

Signed-off-by: Rick Elrod <rick@elrod.me>
2023-06-05 14:46:25 -05:00
Gabriel Muniz
6976ac9273 Add management command to precreate partitioned tables (#14076) 2023-06-05 18:20:53 +00:00
rakesh561
9009a21a32 Update Mesh.js to allow for running AWX at non-root path (URL prefixing) (#14020)
Co-authored-by: Michael Abashian <mabashia@redhat.com>
2023-06-05 11:46:12 -04:00
Shane McDonald
aafd4df288 Fix /api/swagger endpoint (available only in development mode) (#13197)
Co-authored-by: John Westcott IV <john.westcott.iv@redhat.com>
2023-06-02 12:58:21 -04:00
John Westcott IV
844666df4c Send real client remote address in TACACS+ authentication packet (#14077)
Co-authored-by: ekougs <ekougs@gmail.com>
2023-06-02 10:03:56 -04:00
Rick Elrod
0ae720244c [rsyslog] Enable disk-assisted queuing on output (#14005)
Right now we only enable queuing on the rsyslog main_queue. This adds a
parameter to also enable it on the omhttp output action. As omhttp can
take time to process messages (e.g. blocking on the result of its HTTP
requests), this change allows for queuing messages up and hopefully
preventing some messages from getting lost when the log server is slow
to respond.

Signed-off-by: Rick Elrod <rick@elrod.me>
2023-06-01 22:37:45 -05:00
Alex Corey
b70fa88b78 Adds RTL tests to new component, and to Instances List (#12927) 2023-06-01 19:19:24 +00:00
Alan Rominger
fbaeb90268 Apply conservative database connection reduction changes (#14066)
This is expected to free up 4 additional database connections per traditional node
  compare to roughly 12 in total before this change

Out of these 3 are accomplished by using existing connection for recently added services
  then 1 is obtained by closing the connection for the idle callback receiver main process

Signed-off-by: jessicamack <jmack@redhat.com>
Co-authored-by: jessicamack <jmack@redhat.com>
2023-06-01 14:59:18 -04:00
Michael Abashian
2a549c0b23 Removes dependabot for opening ui dependency pr's (#14075) 2023-06-01 14:30:02 -04:00
Alan Rominger
2c320cb16d Manually run subquery for parent event updates (#14044)
Fixes a long query when processing playbook_on_stats events
2023-06-01 07:55:56 -04:00
lucas-benedito
434595481c AAP-8038 - enable/disable services on reboot (#13415)
Co-authored-by: Lucas Benedito <lbenedit@redhat.com>
2023-05-31 19:24:30 +00:00
sll552
444d05447e Fix ovirt source (#12882) 2023-05-31 15:22:58 -04:00
Michael Abashian
fbe202bdbf Adds missing rel="noopener noreferrer" to each link element with target="_blank" (#13959) 2023-05-31 13:49:39 -04:00
Michael Abashian
d89cad0d9e Adds managed_by_policy checkbox to instances form. Adds warnings when associating or disassociating instances from instance groups. (#13994) 2023-05-31 12:31:55 -04:00
Marliana Lara
bdfd6f47ff Use PATCH request when updating wf nodes (#14063) 2023-05-31 12:30:58 -04:00
Gabriel Muniz
ae7be2eea1 Add instance_group to bulk api (#13982)
Co-authored-by: Elijah DeLee <kdelee@redhat.com>
Co-authored-by: Jessica Steurer <70719005+jay-steurer@users.noreply.github.com>
2023-05-26 09:09:44 -03:00
Baptiste Agasse
8957a84738 Related #13336 - DNS resolution is preventing awx_collection to work with http[s]_proxy (#13524)
Co-authored-by: Seth Foster <fosterseth@users.noreply.github.com>
2023-05-24 20:00:07 +00:00
Rick Elrod
bac124004f Rename heartbeet daemon to ws_heartbeat (#14041)
Signed-off-by: Rick Elrod <rick@elrod.me>
2023-05-24 13:27:55 -05:00
Joel Tenta
f46c7452d1 Spelling and codespelling corrections from community PR
- Made the choice not to pull in the CI tools due to the possibility of it blocking PRs.

Co-Authored By: Lila Yasin <89486372+djyasin@users.noreply.github.com>
2023-05-24 10:06:42 -04:00
John Westcott IV
098861d906 Updated sqlparse library (#13962)
Co-authored-by: Jessica Steurer <70719005+jay-steurer@users.noreply.github.com>
2023-05-24 08:09:29 -03:00
John Westcott IV
daf39dc77e Adding capability of pretty error pages (#13852)
Co-authored-by: Jessica Steurer <70719005+jay-steurer@users.noreply.github.com>
2023-05-23 14:05:38 -03:00
Hao Liu
00d8291d40 Change logging setting for task analytic scheduler (#14031) 2023-05-23 13:01:12 -04:00
Rick Elrod
88d1a484fa [dev docs] Re-document websockets infrastructure (#13992)
Re-add documentation for how AWX websockets and channels work, in the post-web/task split world.

Signed-off-by: Rick Elrod <rick@elrod.me>
2023-05-22 16:41:23 -05:00
Michael Abashian
5afdfb1135 Escape parenthesis in labeler for tech preview ui label 2023-05-18 15:00:19 -04:00
Michael Abashian
2f15cc5170 Updates issue_labeler.yml to handle tech preview ui auto-labeling 2023-05-18 14:46:36 -04:00
Michael Abashian
f15d40286c Adds a component label for the tech preview ui in bug_report.yml 2023-05-18 14:45:27 -04:00
Alan Rominger
f58c44590d Remove unused settings and associated code (#13898) 2023-05-18 10:05:59 -04:00
Alan Rominger
ef99770383 Add subsystem metrics for the dispatcher (#13989)
This adds a handful of metrics to /api/v2/metrics/ recorded from the dispatcher main process

Adds logic in the dispatcher period tasks to calculate these for the last collection interval
Reports worker count, task count, scale up events, and availability

Add data to demo grafana dashboard
2023-05-17 14:29:31 -04:00
305 changed files with 14996 additions and 8895 deletions
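
Commit 1d7e54bd39 above describes wrapping Django's RedisCache so that cache errors are muted, using a decorator to factor out the exception handling. A minimal sketch of that idea follows; it is not AWX's actual code. The names `mute_exceptions`, `FlakyBackend`, `QuietCache`, and the module-level `IGNORE_EXCEPTIONS` flag are hypothetical, and a stub backend stands in for Redis so the example runs without Django.

```python
import functools
import logging

logger = logging.getLogger(__name__)

# Hypothetical stand-in for a setting such as DJANGO_REDIS_IGNORE_EXCEPTIONS.
IGNORE_EXCEPTIONS = True


def mute_exceptions(default=None):
    """Return a decorator that logs and swallows errors from a cache method."""

    def decorator(method):
        @functools.wraps(method)
        def wrapper(*args, **kwargs):
            try:
                return method(*args, **kwargs)
            except Exception:
                if not IGNORE_EXCEPTIONS:
                    raise
                logger.warning("cache call %s failed; returning default", method.__name__)
                return default

        return wrapper

    return decorator


class FlakyBackend:
    """Stub cache backend that behaves as if the server were unreachable."""

    def get(self, key, default=None):
        raise ConnectionError("cache server is down")


class QuietCache(FlakyBackend):
    """Thin wrapper: same interface, but failures fall back to a default."""

    @mute_exceptions(default=None)
    def get(self, key, default=None):
        return super().get(key, default)
```

With this shape, a caller of `QuietCache().get("some-key")` sees a cache miss instead of an exception when the backend is down.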

View File

@@ -44,6 +44,7 @@ body:
  label: Select the relevant components
  options:
  - label: UI
+ - label: UI (tech preview)
  - label: API
  - label: Docs
  - label: Collection

View File

@@ -1,19 +0,0 @@
version: 2
updates:
- package-ecosystem: "npm"
directory: "/awx/ui"
schedule:
interval: "monthly"
open-pull-requests-limit: 5
allow:
- dependency-type: "production"
reviewers:
- "AlexSCorey"
- "keithjgrant"
- "kialam"
- "mabashian"
- "marshmalien"
labels:
- "component:ui"
- "dependencies"
target-branch: "devel"

View File

@@ -6,6 +6,8 @@ needs_triage:
  - "Feature Summary"
  "component:ui":
  - "\\[X\\] UI"
+ "component:ui_next":
+ - "\\[X\\] UI \\(tech preview\\)"
  "component:api":
  - "\\[X\\] API"
  "component:docs":

View File

@@ -15,5 +15,5 @@
"dependencies":
  - any: ["awx/ui/package.json"]
- - any: ["awx/requirements/*.txt"]
- - any: ["awx/requirements/requirements.in"]
+ - any: ["requirements/*.txt"]
+ - any: ["requirements/requirements.in"]

View File

@@ -48,8 +48,11 @@ jobs:
DEV_DOCKER_TAG_BASE=ghcr.io/${OWNER_LC} COMPOSE_TAG=${GITHUB_REF##*/} make awx-kube-dev-build
DEV_DOCKER_TAG_BASE=ghcr.io/${OWNER_LC} COMPOSE_TAG=${GITHUB_REF##*/} make awx-kube-build
- - name: Push image
+ - name: Push development images
  run: |
  docker push ghcr.io/${OWNER_LC}/awx_devel:${GITHUB_REF##*/}
  docker push ghcr.io/${OWNER_LC}/awx_kube_devel:${GITHUB_REF##*/}
- docker push ghcr.io/${OWNER_LC}/awx:${GITHUB_REF##*/}
+ - name: Push AWX k8s image, only for upstream and feature branches
+ run: docker push ghcr.io/${OWNER_LC}/awx:${GITHUB_REF##*/}
+ if: endsWith(github.repository, '/awx')

View File

@@ -6,9 +6,9 @@ on:
  - opened
  - reopened
- permissions:
- contents: read # to fetch code
- issues: write # to label issues
+ permissions:
+ contents: write # to fetch code
+ issues: write # to label issues
  jobs:
  triage:
@@ -17,7 +17,7 @@ jobs:
  steps:
  - name: Label Issue
- uses: github/issue-labeler@v2.4.1
+ uses: github/issue-labeler@v3.1
  with:
  repo-token: "${{ secrets.GITHUB_TOKEN }}"
  not-before: 2021-12-07T07:00:00Z

View File

@@ -8,7 +8,7 @@ on:
  - synchronize
  permissions:
- contents: read # to determine modified files (actions/labeler)
+ contents: write # to determine modified files (actions/labeler)
  pull-requests: write # to add labels to PRs (actions/labeler)
  jobs:

View File

@@ -0,0 +1,35 @@
---
name: Check body for reference to jira
on:
pull_request:
branches:
- release_**
jobs:
pr-check:
if: github.repository_owner == 'ansible' && github.repository != 'awx'
name: Scan PR description for JIRA links
runs-on: ubuntu-latest
permissions:
packages: write
contents: read
steps:
- name: Check for JIRA lines
env:
PR_BODY: ${{ github.event.pull_request.body }}
run: |
echo "$PR_BODY" | grep "JIRA: None" > no_jira
echo "$PR_BODY" | grep "JIRA: https://.*[0-9]+"> jira
exit 0
# We exit 0 and set the shell to prevent the returns from the greps from failing this step
# See https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#exit-codes-and-error-action-preference
shell: bash {0}
- name: Check for exactly one item
run: |
if [ $(cat no_jira jira | wc -l) != 1 ] ; then
echo "The PR body must contain exactly one of [ 'JIRA: None' or 'JIRA: <one or more links>' ]"
echo "We counted $(cat no_jira jira | wc -l)"
exit 255;
else
exit 0;
fi
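
The two workflow steps above can be sketched in Python to make the counting rule explicit: collect the lines matching each pattern, add the two counts, and accept the PR body only when the total is exactly one. This is an illustration, not part of the workflow. The function names are hypothetical, and the link pattern assumes "one or more digits" was intended; as written, `grep` without `-E` uses basic regular expressions, where an unescaped `+` is a literal character.

```python
import re

# Same patterns as the workflow's two grep calls, read as extended regexes.
NO_JIRA = re.compile(r"JIRA: None")
JIRA_LINK = re.compile(r"JIRA: https://.*[0-9]+")


def count_jira_lines(pr_body: str) -> int:
    """Mirror `cat no_jira jira | wc -l`: matches of both patterns, summed."""
    lines = pr_body.splitlines()
    no_jira = [line for line in lines if NO_JIRA.search(line)]
    jira = [line for line in lines if JIRA_LINK.search(line)]
    return len(no_jira) + len(jira)


def pr_body_is_valid(pr_body: str) -> bool:
    """The check passes only when exactly one matching line is present."""
    return count_jira_lines(pr_body) == 1
```

For example, a body containing both `JIRA: None` and a `JIRA: https://...` link line counts two matches and is rejected, as is a body with no `JIRA:` line at all.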

View File

@@ -4,6 +4,6 @@
Early versions of AWX did not support seamless upgrades between major versions and required the use of a backup and restore tool to perform upgrades.
- Users who wish to upgrade modern AWX installations should follow the instructions at:
+ As of version 18.0, `awx-operator` is the preferred install/upgrade method. Users who wish to upgrade modern AWX installations should follow the instructions at:
- https://github.com/ansible/awx/blob/devel/INSTALL.md#upgrading-from-previous-versions
+ https://github.com/ansible/awx-operator/blob/devel/docs/upgrade/upgrading.md

View File

@@ -31,7 +31,7 @@ If your issue isn't considered high priority, then please be patient as it may t
`state:needs_info` The issue needs more information. This could be more debug output, more specifics out the system such as version information. Any detail that is currently preventing this issue from moving forward. This should be considered a blocked state.
- `state:needs_review` The issue/pull request needs to be reviewed by other maintainers and contributors. This is usually used when there is a question out to another maintainer or when a person is less familar with an area of the code base the issue is for.
+ `state:needs_review` The issue/pull request needs to be reviewed by other maintainers and contributors. This is usually used when there is a question out to another maintainer or when a person is less familiar with an area of the code base the issue is for.
`state:needs_revision` More commonly used on pull requests, this state represents that there are changes that are being waited on.

View File

@@ -1,6 +1,7 @@
  -include awx/ui_next/Makefile
  PYTHON := $(notdir $(shell for i in python3.9 python3; do command -v $$i; done|sed 1q))
+ SHELL := bash
  DOCKER_COMPOSE ?= docker-compose
  OFFICIAL ?= no
  NODE ?= node
@@ -8,7 +9,7 @@ NPM_BIN ?= npm
CHROMIUM_BIN=/tmp/chrome-linux/chrome
GIT_BRANCH ?= $(shell git rev-parse --abbrev-ref HEAD)
MANAGEMENT_COMMAND ?= awx-manage
- VERSION := $(shell $(PYTHON) tools/scripts/scm_version.py)
+ VERSION ?= $(shell $(PYTHON) tools/scripts/scm_version.py)
# ansible-test requires semver compatable version, so we allow overrides to hack it
COLLECTION_VERSION ?= $(shell $(PYTHON) tools/scripts/scm_version.py | cut -d . -f 1-3)
@@ -27,6 +28,8 @@ COLLECTION_TEMPLATE_VERSION ?= false
# NOTE: This defaults the container image version to the branch that's active
COMPOSE_TAG ?= $(GIT_BRANCH)
MAIN_NODE_TYPE ?= hybrid
+ # If set to true docker-compose will also start a pgbouncer instance and use it
+ PGBOUNCER ?= false
# If set to true docker-compose will also start a keycloak instance
KEYCLOAK ?= false
# If set to true docker-compose will also start an ldap instance
@@ -37,6 +40,8 @@ SPLUNK ?= false
PROMETHEUS ?= false
# If set to true docker-compose will also start a grafana instance
GRAFANA ?= false
+ # If set to true docker-compose will also start a hashicorp vault instance
+ VAULT ?= false
# If set to true docker-compose will also start a tacacs+ instance
TACACS ?= false
@@ -52,7 +57,7 @@ RECEPTOR_IMAGE ?= quay.io/ansible/receptor:devel
# Python packages to install only from source (not from binary wheels)
# Comma separated list
- SRC_ONLY_PKGS ?= cffi,pycparser,psycopg2,twilio
+ SRC_ONLY_PKGS ?= cffi,pycparser,psycopg,twilio
# These should be upgraded in the AWX and Ansible venv before attempting
# to install the actual requirements
VENV_BOOTSTRAP ?= pip==21.2.4 setuptools==65.6.3 setuptools_scm[toml]==7.0.5 wheel==0.38.4
@@ -267,11 +272,11 @@ run-wsrelay:
$(PYTHON) manage.py run_wsrelay
## Start the heartbeat process in background in development environment.
- run-heartbeet:
+ run-ws-heartbeat:
@if [ "$(VENV_BASE)" ]; then \
. $(VENV_BASE)/awx/bin/activate; \
fi; \
- $(PYTHON) manage.py run_heartbeet
+ $(PYTHON) manage.py run_ws_heartbeat
reports:
mkdir -p $@
@@ -520,15 +525,20 @@ docker-compose-sources: .git/hooks/pre-commit
  -e control_plane_node_count=$(CONTROL_PLANE_NODE_COUNT) \
  -e execution_node_count=$(EXECUTION_NODE_COUNT) \
  -e minikube_container_group=$(MINIKUBE_CONTAINER_GROUP) \
+ -e enable_pgbouncer=$(PGBOUNCER) \
  -e enable_keycloak=$(KEYCLOAK) \
  -e enable_ldap=$(LDAP) \
  -e enable_splunk=$(SPLUNK) \
  -e enable_prometheus=$(PROMETHEUS) \
  -e enable_grafana=$(GRAFANA) \
+ -e enable_vault=$(VAULT) \
  -e enable_tacacs=$(TACACS) \
  $(EXTRA_SOURCES_ANSIBLE_OPTS)
  docker-compose: awx/projects docker-compose-sources
+ ansible-galaxy install --ignore-certs -r tools/docker-compose/ansible/requirements.yml;
+ ansible-playbook -i tools/docker-compose/inventory tools/docker-compose/ansible/initialize_containers.yml \
+ -e enable_vault=$(VAULT);
  $(DOCKER_COMPOSE) -f tools/docker-compose/_sources/docker-compose.yml $(COMPOSE_OPTS) up $(COMPOSE_UP_OPTS) --remove-orphans
  docker-compose-credential-plugins: awx/projects docker-compose-sources
@@ -580,7 +590,7 @@ docker-clean:
  -$(foreach image_id,$(shell docker images --filter=reference='*/*/*awx_devel*' --filter=reference='*/*awx_devel*' --filter=reference='*awx_devel*' -aq),docker rmi --force $(image_id);)
  docker-clean-volumes: docker-compose-clean docker-compose-container-group-clean
- docker volume rm -f tools_awx_db tools_grafana_storage tools_prometheus_storage $(docker volume ls --filter name=tools_redis_socket_ -q)
+ docker volume rm -f tools_awx_db tools_vault_1 tools_grafana_storage tools_prometheus_storage $(docker volume ls --filter name=tools_redis_socket_ -q)
  docker-refresh: docker-clean docker-compose

View File

@@ -232,7 +232,8 @@ class APIView(views.APIView):
response = super(APIView, self).finalize_response(request, response, *args, **kwargs)
time_started = getattr(self, 'time_started', None)
- response['X-API-Product-Version'] = get_awx_version()
+ if request.user.is_authenticated:
+     response['X-API-Product-Version'] = get_awx_version()
response['X-API-Product-Name'] = server_product_name()
response['X-API-Node'] = settings.CLUSTER_HOST_ID
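
The effect of this hunk can be sketched outside Django: the version header is attached only for authenticated requesters, while the product name is always sent. The `User` and `Request` classes, the `add_product_headers` function, and its default arguments are hypothetical stand-ins for this illustration, with a plain dict in place of the real response object.

```python
from dataclasses import dataclass


@dataclass
class User:
    is_authenticated: bool


@dataclass
class Request:
    user: User


def add_product_headers(request, headers, version="0.0.0", product_name="AWX"):
    """Attach product headers, exposing the version only to authenticated users."""
    if request.user.is_authenticated:
        headers['X-API-Product-Version'] = version
    # The product name is not gated on authentication.
    headers['X-API-Product-Name'] = product_name
    return headers
```

An anonymous request therefore receives `X-API-Product-Name` but no `X-API-Product-Version`.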

View File

@@ -1629,8 +1629,8 @@ class ProjectUpdateDetailSerializer(ProjectUpdateSerializer):
fields = ('*', 'host_status_counts', 'playbook_counts')
def get_playbook_counts(self, obj):
- task_count = obj.project_update_events.filter(event='playbook_on_task_start').count()
- play_count = obj.project_update_events.filter(event='playbook_on_play_start').count()
+ task_count = obj.get_event_queryset().filter(event='playbook_on_task_start').count()
+ play_count = obj.get_event_queryset().filter(event='playbook_on_play_start').count()
data = {'play_count': play_count, 'task_count': task_count}
@@ -4686,12 +4686,11 @@ class BulkJobNodeSerializer(WorkflowJobNodeSerializer):
# many-to-many fields
credentials = serializers.ListField(child=serializers.IntegerField(min_value=1), required=False)
labels = serializers.ListField(child=serializers.IntegerField(min_value=1), required=False)
- # TODO: Use instance group role added via PR 13584(once merged), for now everything related to instance group is commented
- # instance_groups = serializers.ListField(child=serializers.IntegerField(min_value=1), required=False)
+ instance_groups = serializers.ListField(child=serializers.IntegerField(min_value=1), required=False)
class Meta:
model = WorkflowJobNode
- fields = ('*', 'credentials', 'labels') # m2m fields are not canonical for WJ nodes, TODO: add instance_groups once supported
+ fields = ('*', 'credentials', 'labels', 'instance_groups') # m2m fields are not canonical for WJ nodes
def validate(self, attrs):
return super(LaunchConfigurationBaseSerializer, self).validate(attrs)
@@ -4751,21 +4750,21 @@ class BulkJobLaunchSerializer(serializers.Serializer):
requested_use_execution_environments = {job['execution_environment'] for job in attrs['jobs'] if 'execution_environment' in job}
requested_use_credentials = set()
requested_use_labels = set()
# requested_use_instance_groups = set()
requested_use_instance_groups = set()
for job in attrs['jobs']:
for cred in job.get('credentials', []):
requested_use_credentials.add(cred)
for label in job.get('labels', []):
requested_use_labels.add(label)
# for instance_group in job.get('instance_groups', []):
# requested_use_instance_groups.add(instance_group)
for instance_group in job.get('instance_groups', []):
requested_use_instance_groups.add(instance_group)
key_to_obj_map = {
"unified_job_template": {obj.id: obj for obj in UnifiedJobTemplate.objects.filter(id__in=requested_ujts)},
"inventory": {obj.id: obj for obj in Inventory.objects.filter(id__in=requested_use_inventories)},
"credentials": {obj.id: obj for obj in Credential.objects.filter(id__in=requested_use_credentials)},
"labels": {obj.id: obj for obj in Label.objects.filter(id__in=requested_use_labels)},
# "instance_groups": {obj.id: obj for obj in InstanceGroup.objects.filter(id__in=requested_use_instance_groups)},
"instance_groups": {obj.id: obj for obj in InstanceGroup.objects.filter(id__in=requested_use_instance_groups)},
"execution_environment": {obj.id: obj for obj in ExecutionEnvironment.objects.filter(id__in=requested_use_execution_environments)},
}
@@ -4792,7 +4791,7 @@ class BulkJobLaunchSerializer(serializers.Serializer):
self.check_list_permission(Credential, requested_use_credentials, 'use_role')
self.check_list_permission(Label, requested_use_labels)
# self.check_list_permission(InstanceGroup, requested_use_instance_groups) # TODO: change to use_role for conflict
self.check_list_permission(InstanceGroup, requested_use_instance_groups) # TODO: change to use_role for conflict
self.check_list_permission(ExecutionEnvironment, requested_use_execution_environments) # TODO: change if roles introduced
jobs_object = self.get_objectified_jobs(attrs, key_to_obj_map)
@@ -4839,7 +4838,7 @@ class BulkJobLaunchSerializer(serializers.Serializer):
node_m2m_object_types_to_through_model = {
'credentials': WorkflowJobNode.credentials.through,
'labels': WorkflowJobNode.labels.through,
# 'instance_groups': WorkflowJobNode.instance_groups.through,
'instance_groups': WorkflowJobNode.instance_groups.through,
}
node_deferred_attr_names = (
'limit',
@@ -4892,9 +4891,9 @@ class BulkJobLaunchSerializer(serializers.Serializer):
if field_name in node_m2m_objects[node_identifier] and field_name == 'labels':
for label in node_m2m_objects[node_identifier][field_name]:
through_model_objects.append(through_model(label=label, workflowjobnode=node_m2m_objects[node_identifier]['node']))
# if obj_type in node_m2m_objects[node_identifier] and obj_type == 'instance_groups':
# for instance_group in node_m2m_objects[node_identifier][obj_type]:
# through_model_objects.append(through_model(instancegroup=instance_group, workflowjobnode=node_m2m_objects[node_identifier]['node']))
if field_name in node_m2m_objects[node_identifier] and field_name == 'instance_groups':
for instance_group in node_m2m_objects[node_identifier][field_name]:
through_model_objects.append(through_model(instancegroup=instance_group, workflowjobnode=node_m2m_objects[node_identifier]['node']))
if through_model_objects:
through_model.objects.bulk_create(through_model_objects)
@@ -5436,7 +5435,7 @@ class InstanceSerializer(BaseSerializer):
res = super(InstanceSerializer, self).get_related(obj)
res['jobs'] = self.reverse('api:instance_unified_jobs_list', kwargs={'pk': obj.pk})
res['instance_groups'] = self.reverse('api:instance_instance_groups_list', kwargs={'pk': obj.pk})
if settings.IS_K8S and obj.node_type in (Instance.Types.EXECUTION,):
if obj.node_type in [Instance.Types.EXECUTION, Instance.Types.HOP]:
res['install_bundle'] = self.reverse('api:instance_install_bundle', kwargs={'pk': obj.pk})
res['peers'] = self.reverse('api:instance_peers_list', kwargs={"pk": obj.pk})
if self.context['request'].user.is_superuser or self.context['request'].user.is_system_auditor:

View File

@@ -1,16 +1,10 @@
import json
import warnings
from coreapi.document import Object, Link
from rest_framework import exceptions
from rest_framework.permissions import AllowAny
from rest_framework.renderers import CoreJSONRenderer
from rest_framework.response import Response
from rest_framework.schemas import SchemaGenerator, AutoSchema as DRFAuthSchema
from rest_framework.views import APIView
from rest_framework_swagger import renderers
from drf_yasg.views import get_schema_view
from drf_yasg import openapi
class SuperUserSchemaGenerator(SchemaGenerator):
@@ -55,43 +49,15 @@ class AutoSchema(DRFAuthSchema):
return description
class SwaggerSchemaView(APIView):
_ignore_model_permissions = True
exclude_from_schema = True
permission_classes = [AllowAny]
renderer_classes = [CoreJSONRenderer, renderers.OpenAPIRenderer, renderers.SwaggerUIRenderer]
def get(self, request):
generator = SuperUserSchemaGenerator(title='Ansible Automation Platform controller API', patterns=None, urlconf=None)
schema = generator.get_schema(request=request)
# python core-api doesn't support the deprecation yet, so track it
# ourselves and return it in a response header
_deprecated = []
# By default, DRF OpenAPI serialization places all endpoints in
# a single node based on their root path (/api). Instead, we want to
# group them by topic/tag so that they're categorized in the rendered
# output
document = schema._data.pop('api')
for path, node in document.items():
if isinstance(node, Object):
for action in node.values():
topic = getattr(action, 'topic', None)
if topic:
schema._data.setdefault(topic, Object())
schema._data[topic]._data[path] = node
if isinstance(action, Object):
for link in action.links.values():
if link.deprecated:
_deprecated.append(link.url)
elif isinstance(node, Link):
topic = getattr(node, 'topic', None)
if topic:
schema._data.setdefault(topic, Object())
schema._data[topic]._data[path] = node
if not schema:
raise exceptions.ValidationError('The schema generator did not return a schema Document')
return Response(schema, headers={'X-Deprecated-Paths': json.dumps(_deprecated)})
schema_view = get_schema_view(
openapi.Info(
title="Snippets API",
default_version='v1',
description="Test description",
terms_of_service="https://www.google.com/policies/terms/",
contact=openapi.Contact(email="contact@snippets.local"),
license=openapi.License(name="BSD License"),
),
public=True,
permission_classes=[AllowAny],
)

View File

@@ -9,10 +9,10 @@ receptor_work_commands:
params: worker
allowruntimeparams: true
verifysignature: true
custom_worksign_public_keyfile: receptor/work-public-key.pem
custom_worksign_public_keyfile: receptor/work_public_key.pem
custom_tls_certfile: receptor/tls/receptor.crt
custom_tls_keyfile: receptor/tls/receptor.key
custom_ca_certfile: receptor/tls/ca/receptor-ca.crt
custom_ca_certfile: receptor/tls/ca/mesh-CA.crt
receptor_protocol: 'tcp'
receptor_listener: true
receptor_port: {{ instance.listener_port }}

View File

@@ -30,7 +30,7 @@ from awx.api.views import (
OAuth2TokenList,
ApplicationOAuth2TokenList,
OAuth2ApplicationDetail,
# HostMetricSummaryMonthlyList, # It will be enabled in future version of the AWX
HostMetricSummaryMonthlyList,
)
from awx.api.views.bulk import (
@@ -123,8 +123,7 @@ v2_urls = [
re_path(r'^constructed_inventories/', include(constructed_inventory_urls)),
re_path(r'^hosts/', include(host_urls)),
re_path(r'^host_metrics/', include(host_metric_urls)),
# It will be enabled in future version of the AWX
# re_path(r'^host_metric_summary_monthly/$', HostMetricSummaryMonthlyList.as_view(), name='host_metric_summary_monthly_list'),
re_path(r'^host_metric_summary_monthly/$', HostMetricSummaryMonthlyList.as_view(), name='host_metric_summary_monthly_list'),
re_path(r'^groups/', include(group_urls)),
re_path(r'^inventory_sources/', include(inventory_source_urls)),
re_path(r'^inventory_updates/', include(inventory_update_urls)),
@@ -167,10 +166,13 @@ urlpatterns = [
]
if MODE == 'development':
# Only include these if we are in the development environment
from awx.api.swagger import SwaggerSchemaView
urlpatterns += [re_path(r'^swagger/$', SwaggerSchemaView.as_view(), name='swagger_view')]
from awx.api.swagger import schema_view
from awx.api.urls.debug import urls as debug_urls
urlpatterns += [re_path(r'^debug/', include(debug_urls))]
urlpatterns += [
re_path(r'^swagger(?P<format>\.json|\.yaml)/$', schema_view.without_ui(cache_timeout=0), name='schema-json'),
re_path(r'^swagger/$', schema_view.with_ui('swagger', cache_timeout=0), name='schema-swagger-ui'),
re_path(r'^redoc/$', schema_view.with_ui('redoc', cache_timeout=0), name='schema-redoc'),
]

View File

@@ -1564,16 +1564,15 @@ class HostMetricDetail(RetrieveDestroyAPIView):
return Response(status=status.HTTP_204_NO_CONTENT)
# It will be enabled in future version of the AWX
# class HostMetricSummaryMonthlyList(ListAPIView):
# name = _("Host Metrics Summary Monthly")
# model = models.HostMetricSummaryMonthly
# serializer_class = serializers.HostMetricSummaryMonthlySerializer
# permission_classes = (IsSystemAdminOrAuditor,)
# search_fields = ('date',)
#
# def get_queryset(self):
# return self.model.objects.all()
class HostMetricSummaryMonthlyList(ListAPIView):
name = _("Host Metrics Summary Monthly")
model = models.HostMetricSummaryMonthly
serializer_class = serializers.HostMetricSummaryMonthlySerializer
permission_classes = (IsSystemAdminOrAuditor,)
search_fields = ('date',)
def get_queryset(self):
return self.model.objects.all()
class HostList(HostRelatedSearchMixin, ListCreateAPIView):

View File

@@ -1,5 +1,7 @@
from collections import OrderedDict
from django.utils.translation import gettext_lazy as _
from rest_framework.permissions import IsAuthenticated
from rest_framework.renderers import JSONRenderer
from rest_framework.reverse import reverse
@@ -18,6 +20,9 @@ from awx.api import (
class BulkView(APIView):
name = _('Bulk')
swagger_topic = 'Bulk'
permission_classes = [IsAuthenticated]
renderer_classes = [
renderers.BrowsableAPIRenderer,

View File

@@ -57,13 +57,11 @@ class InstanceInstallBundle(GenericAPIView):
with io.BytesIO() as f:
with tarfile.open(fileobj=f, mode='w:gz') as tar:
# copy /etc/receptor/tls/ca/receptor-ca.crt to receptor/tls/ca in the tar file
tar.add(
os.path.realpath('/etc/receptor/tls/ca/receptor-ca.crt'), arcname=f"{instance_obj.hostname}_install_bundle/receptor/tls/ca/receptor-ca.crt"
)
# copy /etc/receptor/tls/ca/mesh-CA.crt to receptor/tls/ca in the tar file
tar.add(os.path.realpath('/etc/receptor/tls/ca/mesh-CA.crt'), arcname=f"{instance_obj.hostname}_install_bundle/receptor/tls/ca/mesh-CA.crt")
# copy /etc/receptor/signing/work-public-key.pem to receptor/work-public-key.pem
tar.add('/etc/receptor/signing/work-public-key.pem', arcname=f"{instance_obj.hostname}_install_bundle/receptor/work-public-key.pem")
# copy /etc/receptor/work_public_key.pem to receptor/work_public_key.pem
tar.add('/etc/receptor/work_public_key.pem', arcname=f"{instance_obj.hostname}_install_bundle/receptor/work_public_key.pem")
# generate and write the receptor key to receptor/tls/receptor.key in the tar file
key, cert = generate_receptor_tls(instance_obj)
@@ -161,14 +159,14 @@ def generate_receptor_tls(instance_obj):
.sign(key, hashes.SHA256())
)
# sign csr with the receptor ca key from /etc/receptor/ca/receptor-ca.key
with open('/etc/receptor/tls/ca/receptor-ca.key', 'rb') as f:
# sign csr with the receptor ca key from /etc/receptor/tls/ca/mesh-CA.key
with open('/etc/receptor/tls/ca/mesh-CA.key', 'rb') as f:
ca_key = serialization.load_pem_private_key(
f.read(),
password=None,
)
with open('/etc/receptor/tls/ca/receptor-ca.crt', 'rb') as f:
with open('/etc/receptor/tls/ca/mesh-CA.crt', 'rb') as f:
ca_cert = x509.load_pem_x509_certificate(f.read())
cert = (

View File

@@ -20,6 +20,7 @@ from rest_framework import status
import requests
from awx import MODE
from awx.api.generics import APIView
from awx.conf.registry import settings_registry
from awx.main.analytics import all_collectors
@@ -54,6 +55,8 @@ class ApiRootView(APIView):
data['custom_logo'] = settings.CUSTOM_LOGO
data['custom_login_info'] = settings.CUSTOM_LOGIN_INFO
data['login_redirect_override'] = settings.LOGIN_REDIRECT_OVERRIDE
if MODE == 'development':
data['swagger'] = drf_reverse('api:schema-swagger-ui')
return Response(data)
@@ -104,8 +107,7 @@ class ApiVersionRootView(APIView):
data['groups'] = reverse('api:group_list', request=request)
data['hosts'] = reverse('api:host_list', request=request)
data['host_metrics'] = reverse('api:host_metric_list', request=request)
# It will be enabled in future version of the AWX
# data['host_metric_summary_monthly'] = reverse('api:host_metric_summary_monthly_list', request=request)
data['host_metric_summary_monthly'] = reverse('api:host_metric_summary_monthly_list', request=request)
data['job_templates'] = reverse('api:job_template_list', request=request)
data['jobs'] = reverse('api:job_list', request=request)
data['ad_hoc_commands'] = reverse('api:ad_hoc_command_list', request=request)

View File

@@ -14,7 +14,7 @@ class ConfConfig(AppConfig):
def ready(self):
self.module.autodiscover()
if not set(sys.argv) & {'migrate', 'check_migrations'}:
if not set(sys.argv) & {'migrate', 'check_migrations', 'showmigrations'}:
from .settings import SettingsWrapper
SettingsWrapper.initialize()
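
Note on the hunk above: the guard skips settings initialization whenever a migration-related management command appears anywhere in the argument list. A minimal sketch of the same set-intersection check follows; the helper name `should_initialize_settings` and the hand-built argument lists are assumptions for illustration and are not part of AWX.

```python
# Hypothetical helper illustrating the guard; not part of AWX.
SKIP_COMMANDS = {'migrate', 'check_migrations', 'showmigrations'}


def should_initialize_settings(argv):
    """Return True unless argv contains one of the migration-related commands."""
    # A non-empty intersection means at least one skip command was passed.
    return not (set(argv) & SKIP_COMMANDS)


print(should_initialize_settings(['manage.py', 'runserver']))
print(should_initialize_settings(['manage.py', 'showmigrations']))
```

Because the check is a plain set intersection, the position of the command in the argument list does not matter.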

View File

@@ -0,0 +1,17 @@
# Generated by Django 4.2 on 2023-06-09 19:51
from django.db import migrations, models
class Migration(migrations.Migration):
dependencies = [
('conf', '0009_rename_proot_settings'),
]
operations = [
migrations.AlterField(
model_name='setting',
name='value',
field=models.JSONField(null=True),
),
]

View File

@@ -8,7 +8,6 @@ import json
from django.db import models
# AWX
from awx.main.fields import JSONBlob
from awx.main.models.base import CreatedModifiedModel, prevent_search
from awx.main.utils import encrypt_field
from awx.conf import settings_registry
@@ -18,7 +17,7 @@ __all__ = ['Setting']
class Setting(CreatedModifiedModel):
key = models.CharField(max_length=255)
value = JSONBlob(null=True)
value = models.JSONField(null=True)
user = prevent_search(models.ForeignKey('auth.User', related_name='settings', default=None, null=True, editable=False, on_delete=models.CASCADE))
def __str__(self):

View File

@@ -366,9 +366,9 @@ class BaseAccess(object):
report_violation = lambda message: None
else:
report_violation = lambda message: logger.warning(message)
if validation_info.get('trial', False) is True or validation_info['instance_count'] == 10: # basic 10 license
if validation_info.get('trial', False) is True:
def report_violation(message):
def report_violation(message): # noqa
raise PermissionDenied(message)
if check_expiration and validation_info.get('time_remaining', None) is None:

View File

@@ -399,7 +399,10 @@ def _copy_table(table, query, path):
file_path = os.path.join(path, table + '_table.csv')
file = FileSplitter(filespec=file_path)
with connection.cursor() as cursor:
cursor.copy_expert(query, file)
with cursor.copy(query) as copy:
while data := copy.read():
byte_data = bytes(data)
file.write(byte_data.decode())
return file.file_list()
@@ -610,3 +613,20 @@ def host_metric_table(since, full_path, until, **kwargs):
since.isoformat(), until.isoformat(), since.isoformat(), until.isoformat()
)
return _copy_table(table='host_metric', query=host_metric_query, path=full_path)
@register('host_metric_summary_monthly_table', '1.0', format='csv', description=_('HostMetricSummaryMonthly export, full sync'), expensive=trivial_slicing)
def host_metric_summary_monthly_table(since, full_path, **kwargs):
query = '''
COPY (SELECT main_hostmetricsummarymonthly.id,
main_hostmetricsummarymonthly.date,
main_hostmetricsummarymonthly.license_capacity,
main_hostmetricsummarymonthly.license_consumed,
main_hostmetricsummarymonthly.hosts_added,
main_hostmetricsummarymonthly.hosts_deleted,
main_hostmetricsummarymonthly.indirectly_managed_hosts
FROM main_hostmetricsummarymonthly
ORDER BY main_hostmetricsummarymonthly.id ASC) TO STDOUT WITH CSV HEADER
'''
return _copy_table(table='host_metric_summary_monthly', query=query, path=full_path)

View File

@@ -209,6 +209,11 @@ class Metrics:
SetFloatM('workflow_manager_recorded_timestamp', 'Unix timestamp when metrics were last recorded'),
SetFloatM('workflow_manager_spawn_workflow_graph_jobs_seconds', 'Time spent spawning workflow tasks'),
SetFloatM('workflow_manager_get_tasks_seconds', 'Time spent loading workflow tasks from db'),
# dispatcher subsystem metrics
SetIntM('dispatcher_pool_scale_up_events', 'Number of times local dispatcher scaled up a worker since startup'),
SetIntM('dispatcher_pool_active_task_count', 'Number of active tasks in the worker pool when last task was submitted'),
SetIntM('dispatcher_pool_max_worker_count', 'Highest number of workers in worker pool in last collection interval, about 20s'),
SetFloatM('dispatcher_availability', 'Fraction of time (in last collection interval) dispatcher was able to receive messages'),
]
# turn metric list into dictionary with the metric name as a key
self.METRICS = {}

87
awx/main/cache.py Normal file
View File

@@ -0,0 +1,87 @@
import functools
from django.conf import settings
from django.core.cache.backends.base import DEFAULT_TIMEOUT
from django.core.cache.backends.redis import RedisCache
from redis.exceptions import ConnectionError, ResponseError, TimeoutError
import socket
# This list comes from what django-redis ignores and the behavior we are trying
# to retain while dropping the dependency on django-redis.
IGNORED_EXCEPTIONS = (TimeoutError, ResponseError, ConnectionError, socket.timeout)
CONNECTION_INTERRUPTED_SENTINEL = object()
def optionally_ignore_exceptions(func=None, return_value=None):
if func is None:
return functools.partial(optionally_ignore_exceptions, return_value=return_value)
@functools.wraps(func)
def wrapper(*args, **kwargs):
try:
return func(*args, **kwargs)
except IGNORED_EXCEPTIONS as e:
if settings.DJANGO_REDIS_IGNORE_EXCEPTIONS:
return return_value
raise e.__cause__ or e
return wrapper
class AWXRedisCache(RedisCache):
"""
We just want to wrap the upstream RedisCache class so that we can ignore
the exceptions that it raises when the cache is unavailable.
"""
@optionally_ignore_exceptions
def add(self, key, value, timeout=DEFAULT_TIMEOUT, version=None):
return super().add(key, value, timeout, version)
@optionally_ignore_exceptions(return_value=CONNECTION_INTERRUPTED_SENTINEL)
def _get(self, key, default=None, version=None):
return super().get(key, default, version)
def get(self, key, default=None, version=None):
value = self._get(key, default, version)
if value is CONNECTION_INTERRUPTED_SENTINEL:
return default
return value
@optionally_ignore_exceptions
def set(self, key, value, timeout=DEFAULT_TIMEOUT, version=None):
return super().set(key, value, timeout, version)
@optionally_ignore_exceptions
def touch(self, key, timeout=DEFAULT_TIMEOUT, version=None):
return super().touch(key, timeout, version)
@optionally_ignore_exceptions
def delete(self, key, version=None):
return super().delete(key, version)
@optionally_ignore_exceptions
def get_many(self, keys, version=None):
return super().get_many(keys, version)
@optionally_ignore_exceptions
def has_key(self, key, version=None):
return super().has_key(key, version)
@optionally_ignore_exceptions
def incr(self, key, delta=1, version=None):
return super().incr(key, delta, version)
@optionally_ignore_exceptions
def set_many(self, data, timeout=DEFAULT_TIMEOUT, version=None):
return super().set_many(data, timeout, version)
@optionally_ignore_exceptions
def delete_many(self, keys, version=None):
return super().delete_many(keys, version)
@optionally_ignore_exceptions
def clear(self):
return super().clear()
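
Note on the new file above: `optionally_ignore_exceptions` works both as a bare decorator and as a decorator factory taking `return_value`. The sketch below reproduces that pattern without Django or Redis. The module-level flag `IGNORE_EXCEPTIONS` stands in for `settings.DJANGO_REDIS_IGNORE_EXCEPTIONS`, the built-in `ConnectionError` and `TimeoutError` stand in for the redis exceptions, and `flaky_set` / `flaky_get` are hypothetical functions; all of these are assumptions made for the example.

```python
import functools

# Stand-ins for the real names; assumptions made for this sketch only.
IGNORE_EXCEPTIONS = True  # plays the role of settings.DJANGO_REDIS_IGNORE_EXCEPTIONS
IGNORED_EXCEPTIONS = (ConnectionError, TimeoutError)


def optionally_ignore_exceptions(func=None, return_value=None):
    # Used as @optionally_ignore_exceptions(return_value=...): func is None,
    # so hand back a decorator that already carries return_value.
    if func is None:
        return functools.partial(optionally_ignore_exceptions, return_value=return_value)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except IGNORED_EXCEPTIONS as e:
            if IGNORE_EXCEPTIONS:
                return return_value
            raise e.__cause__ or e

    return wrapper


@optionally_ignore_exceptions
def flaky_set(key, value):
    raise ConnectionError('cache unavailable')


@optionally_ignore_exceptions(return_value='fallback')
def flaky_get(key):
    raise TimeoutError('cache timed out')


print(flaky_set('a', 1))  # swallowed, returns the default return_value (None)
print(flaky_get('a'))     # swallowed, returns 'fallback'
```

In the file above, `_get` uses a sentinel object as its `return_value` so that `get` can tell a connection failure apart from a stored value and return the caller's `default` instead.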

View File

@@ -94,6 +94,20 @@ register(
category_slug='system',
)
register(
'CSRF_TRUSTED_ORIGINS',
default=[],
field_class=fields.StringListField,
label=_('CSRF Trusted Origins List'),
help_text=_(
"If the service is behind a reverse proxy/load balancer, use this setting "
"to configure the schema://addresses from which the service should trust "
"Origin header values. "
),
category=_('System'),
category_slug='system',
)
register(
'LICENSE',
field_class=fields.DictField,
@@ -684,11 +698,28 @@ register(
field_class=fields.IntegerField,
default=1,
min_value=1,
label=_('Maximum disk persistance for external log aggregation (in GB)'),
label=_('Maximum disk persistence for external log aggregation (in GB)'),
help_text=_(
'Amount of data to store (in gigabytes) during an outage of '
'the external log aggregator (defaults to 1). '
'Equivalent to the rsyslogd queue.maxdiskspace setting.'
'Equivalent to the rsyslogd queue.maxdiskspace setting for main_queue. '
'Notably, this is used for the rsyslogd main queue (for input messages).'
),
category=_('Logging'),
category_slug='logging',
)
register(
'LOG_AGGREGATOR_ACTION_MAX_DISK_USAGE_GB',
field_class=fields.IntegerField,
default=1,
min_value=1,
label=_('Maximum disk persistence for rsyslogd action queuing (in GB)'),
help_text=_(
'Amount of data to store (in gigabytes) if an rsyslog action takes time '
'to process an incoming message (defaults to 1). '
'Equivalent to the rsyslogd queue.maxdiskspace setting on the action (e.g. omhttp). '
'Like LOG_AGGREGATOR_MAX_DISK_USAGE_GB, it stores files in the directory specified '
'by LOG_AGGREGATOR_MAX_DISK_USAGE_PATH.'
),
category=_('Logging'),
category_slug='logging',
@@ -831,6 +862,55 @@ register(
category_slug='system',
)
register(
'HOST_METRIC_SUMMARY_TASK_LAST_TS',
field_class=fields.DateTimeField,
label=_('Last computing date of HostMetricSummaryMonthly'),
allow_null=True,
category=_('System'),
category_slug='system',
)
register(
'AWX_CLEANUP_PATHS',
field_class=fields.BooleanField,
label=_('Enable or Disable tmp dir cleanup'),
default=True,
help_text=_('Enable or Disable TMP Dir cleanup'),
category=('Debug'),
category_slug='debug',
)
register(
'AWX_REQUEST_PROFILE',
field_class=fields.BooleanField,
label=_('Debug Web Requests'),
default=False,
help_text=_('Debug web request python timing'),
category=('Debug'),
category_slug='debug',
)
register(
'DEFAULT_CONTAINER_RUN_OPTIONS',
field_class=fields.StringListField,
label=_('Container Run Options'),
default=['--network', 'slirp4netns:enable_ipv6=true'],
help_text=_("List of options to pass to podman run example: ['--network', 'slirp4netns:enable_ipv6=true', '--log-level', 'debug']"),
category=('Jobs'),
category_slug='jobs',
)
register(
'RECEPTOR_RELEASE_WORK',
field_class=fields.BooleanField,
label=_('Release Receptor Work'),
default=True,
help_text=_('Release receptor work'),
category=('Debug'),
category_slug='debug',
)
def logging_validate(serializer, attrs):
if not serializer.instance or not hasattr(serializer.instance, 'LOG_AGGREGATOR_HOST') or not hasattr(serializer.instance, 'LOG_AGGREGATOR_TYPE'):

View File

@@ -0,0 +1,65 @@
import boto3
from botocore.exceptions import ClientError
from .plugin import CredentialPlugin
from django.utils.translation import gettext_lazy as _
secrets_manager_inputs = {
'fields': [
{
'id': 'aws_access_key',
'label': _('AWS Access Key'),
'type': 'string',
},
{
'id': 'aws_secret_key',
'label': _('AWS Secret Key'),
'type': 'string',
'secret': True,
},
],
'metadata': [
{
'id': 'region_name',
'label': _('AWS Secrets Manager Region'),
'type': 'string',
'help_text': _('Region in which the secrets manager is located'),
},
{
'id': 'secret_name',
'label': _('AWS Secret Name'),
'type': 'string',
},
],
'required': ['aws_access_key', 'aws_secret_key', 'region_name', 'secret_name'],
}
def aws_secretsmanager_backend(**kwargs):
secret_name = kwargs['secret_name']
region_name = kwargs['region_name']
aws_secret_access_key = kwargs['aws_secret_key']
aws_access_key_id = kwargs['aws_access_key']
session = boto3.session.Session()
client = session.client(
service_name='secretsmanager', region_name=region_name, aws_secret_access_key=aws_secret_access_key, aws_access_key_id=aws_access_key_id
)
try:
get_secret_value_response = client.get_secret_value(SecretId=secret_name)
except ClientError as e:
raise e
# Secrets Manager decrypts the secret value using the associated KMS CMK
# Depending on whether the secret was a string or binary, only one of these fields will be populated
if 'SecretString' in get_secret_value_response:
secret = get_secret_value_response['SecretString']
else:
secret = get_secret_value_response['SecretBinary']
return secret
aws_secretmanager_plugin = CredentialPlugin('AWS Secrets Manager lookup', inputs=secrets_manager_inputs, backend=aws_secretsmanager_backend)

View File

@@ -265,6 +265,8 @@ def kv_backend(**kwargs):
if secret_key:
try:
if (secret_key != 'data') and (secret_key not in json['data']) and ('data' in json['data']):
return json['data']['data'][secret_key]
return json['data'][secret_key]
except KeyError:
raise RuntimeError('{} is not present at {}'.format(secret_key, secret_path))
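
Note on the hunk above: the added branch falls through to a nested `data` mapping when the requested key is not found at the first level. A minimal sketch of that lookup order on plain dictionaries follows. The function name `extract_secret`, the default path, and both sample payloads are assumptions for illustration; the real response shape depends on the secrets engine in use.

```python
def extract_secret(response, secret_key, secret_path='hypothetical/path'):
    """Mirror the lookup order of the hunk above on a plain dict."""
    try:
        data = response['data']
        # Key missing at the first level but a nested 'data' mapping exists:
        # look one level deeper.
        if secret_key != 'data' and secret_key not in data and 'data' in data:
            return data['data'][secret_key]
        return data[secret_key]
    except KeyError:
        raise RuntimeError('{} is not present at {}'.format(secret_key, secret_path))


# Assumed sample payloads: one flat, one with the secret nested under data.data.
flat = {'data': {'password': 's3cret'}}
nested = {'data': {'data': {'password': 's3cret'}, 'metadata': {}}}

print(extract_secret(flat, 'password'))
print(extract_secret(nested, 'password'))
```

A key that is absent at both levels still raises `RuntimeError`, as before the change.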

View File

@@ -50,7 +50,7 @@ tss_inputs = {
def tss_backend(**kwargs):
if 'domain' in kwargs:
if kwargs.get("domain"):
authorizer = DomainPasswordGrantAuthorizer(kwargs['server_url'], kwargs['username'], kwargs['password'], kwargs['domain'])
else:
authorizer = PasswordGrantAuthorizer(kwargs['server_url'], kwargs['username'], kwargs['password'])
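
Note on the hunk above: switching from `'domain' in kwargs` to `kwargs.get("domain")` matters when the key is present but empty, which is presumably what happens when the optional field is left blank. A small sketch of the difference follows; `pick_authorizer_old` and `pick_authorizer_new` are hypothetical helpers that only report which branch would be taken.

```python
def pick_authorizer_old(**kwargs):
    """Old check: any 'domain' key selects the domain branch, even an empty one."""
    return 'domain' if 'domain' in kwargs else 'password'


def pick_authorizer_new(**kwargs):
    """New check: only a non-empty 'domain' value selects the domain branch."""
    return 'domain' if kwargs.get('domain') else 'password'


blank = {'server_url': 'https://example.invalid', 'domain': ''}
print(pick_authorizer_old(**blank))  # domain branch, with an empty domain
print(pick_authorizer_new(**blank))  # password branch
```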

View File

@@ -1,5 +1,5 @@
import os
import psycopg2
import psycopg
import select
from contextlib import contextmanager
@@ -40,8 +40,12 @@ def get_task_queuename():
class PubSub(object):
def __init__(self, conn):
def __init__(self, conn, select_timeout=None):
self.conn = conn
if select_timeout is None:
self.select_timeout = 5
else:
self.select_timeout = select_timeout
def listen(self, channel):
with self.conn.cursor() as cur:
@@ -55,25 +59,42 @@ class PubSub(object):
with self.conn.cursor() as cur:
cur.execute('SELECT pg_notify(%s, %s);', (channel, payload))
def events(self, select_timeout=5, yield_timeouts=False):
@staticmethod
def current_notifies(conn):
"""
Altered version of .notifies method from psycopg library
This removes the outer while True loop so that we only process
queued notifications
"""
with conn.lock:
try:
ns = conn.wait(psycopg.generators.notifies(conn.pgconn))
except psycopg.errors._NO_TRACEBACK as ex:
raise ex.with_traceback(None)
enc = psycopg._encodings.pgconn_encoding(conn.pgconn)
for pgn in ns:
n = psycopg.connection.Notify(pgn.relname.decode(enc), pgn.extra.decode(enc), pgn.be_pid)
yield n
def events(self, yield_timeouts=False):
if not self.conn.autocommit:
raise RuntimeError('Listening for events can only be done in autocommit mode')
while True:
if select.select([self.conn], [], [], select_timeout) == NOT_READY:
if select.select([self.conn], [], [], self.select_timeout) == NOT_READY:
if yield_timeouts:
yield None
else:
self.conn.poll()
while self.conn.notifies:
yield self.conn.notifies.pop(0)
notification_generator = self.current_notifies(self.conn)
for notification in notification_generator:
yield notification
def close(self):
self.conn.close()
@contextmanager
def pg_bus_conn(new_connection=False):
def pg_bus_conn(new_connection=False, select_timeout=None):
'''
Any listeners probably want to establish a new database connection,
separate from the Django connection used for queries, because that will prevent
@@ -89,9 +110,8 @@ def pg_bus_conn(new_connection=False):
conf['OPTIONS'] = conf.get('OPTIONS', {}).copy()
# Modify the application name to distinguish from other connections the process might use
conf['OPTIONS']['application_name'] = get_application_name(settings.CLUSTER_HOST_ID, function='listener')
conn = psycopg2.connect(dbname=conf['NAME'], host=conf['HOST'], user=conf['USER'], password=conf['PASSWORD'], port=conf['PORT'], **conf['OPTIONS'])
# Django connection.cursor().connection doesn't have autocommit=True on by default
conn.set_session(autocommit=True)
connection_data = f"dbname={conf['NAME']} host={conf['HOST']} user={conf['USER']} password={conf['PASSWORD']} port={conf['PORT']}"
conn = psycopg.connect(connection_data, autocommit=True, **conf['OPTIONS'])
else:
if pg_connection.connection is None:
pg_connection.connect()
@@ -99,7 +119,7 @@ def pg_bus_conn(new_connection=False):
raise RuntimeError('Unexpectedly could not connect to postgres for pg_notify actions')
conn = pg_connection.connection
pubsub = PubSub(conn)
pubsub = PubSub(conn, select_timeout=select_timeout)
yield pubsub
if new_connection:
conn.close()

View File

@@ -40,6 +40,9 @@ class Control(object):
def cancel(self, task_ids, *args, **kwargs):
return self.control_with_reply('cancel', *args, extra_data={'task_ids': task_ids}, **kwargs)
def schedule(self, *args, **kwargs):
return self.control_with_reply('schedule', *args, **kwargs)
@classmethod
def generate_reply_queue_name(cls):
return f"reply_to_{str(uuid.uuid4()).replace('-','_')}"
@@ -52,14 +55,14 @@ class Control(object):
if not connection.get_autocommit():
raise RuntimeError('Control-with-reply messages can only be done in autocommit mode')
with pg_bus_conn() as conn:
with pg_bus_conn(select_timeout=timeout) as conn:
conn.listen(reply_queue)
send_data = {'control': command, 'reply_to': reply_queue}
if extra_data:
send_data.update(extra_data)
conn.notify(self.queuename, json.dumps(send_data))
for reply in conn.events(select_timeout=timeout, yield_timeouts=True):
for reply in conn.events(yield_timeouts=True):
if reply is None:
logger.error(f'{self.service} did not reply within {timeout}s')
raise RuntimeError(f"{self.service} did not reply within {timeout}s")

View File

@@ -1,57 +1,142 @@
import logging
import os
import time
from multiprocessing import Process
import yaml
from datetime import datetime
from django.conf import settings
from django.db import connections
from schedule import Scheduler
from django_guid import set_guid
from django_guid.utils import generate_guid
from awx.main.dispatch.worker import TaskWorker
from awx.main.utils.db import set_connection_name
logger = logging.getLogger('awx.main.dispatch.periodic')
class Scheduler(Scheduler):
def run_continuously(self):
idle_seconds = max(1, min(self.jobs).period.total_seconds() / 2)
class ScheduledTask:
"""
Class representing schedules, very loosely modeled after python schedule library Job
the idea of this class is to:
- only deal in relative times (time since the scheduler global start)
- only deal in integer math for target runtimes, but float for current relative time
def run():
ppid = os.getppid()
logger.warning('periodic beat started')
Missed schedule policy:
Invariant target times are maintained, meaning that if interval=10s offset=0
and it runs at t=7s, then it calls for next run in 3s.
However, if a complete interval has passed, that is counted as a missed run,
and missed runs are abandoned (no catch-up runs).
"""
set_connection_name('periodic') # set application_name to distinguish from other dispatcher processes
def __init__(self, name: str, data: dict):
# parameters needed for schedule computation
self.interval = int(data['schedule'].total_seconds())
self.offset = 0 # offset relative to start time this schedule begins
self.index = 0 # number of periods of the schedule that has passed
while True:
if os.getppid() != ppid:
# if the parent PID changes, this process has been orphaned
# via e.g., segfault or sigkill, we should exit too
pid = os.getpid()
logger.warning(f'periodic beat exiting gracefully pid:{pid}')
raise SystemExit()
try:
for conn in connections.all():
# If the database connection has a hiccup, re-establish a new
# connection
conn.close_if_unusable_or_obsolete()
set_guid(generate_guid())
self.run_pending()
except Exception:
logger.exception('encountered an error while scheduling periodic tasks')
time.sleep(idle_seconds)
# parameters that do not affect scheduling logic
self.last_run = None # time of last run, only used for debug
self.completed_runs = 0 # number of times schedule is known to run
self.name = name
self.data = data # used by caller to know what to run
process = Process(target=run)
process.daemon = True
process.start()
@property
def next_run(self):
"Time until the next run with t=0 being the global_start of the scheduler class"
return (self.index + 1) * self.interval + self.offset
def due_to_run(self, relative_time):
return bool(self.next_run <= relative_time)
def expected_runs(self, relative_time):
return int((relative_time - self.offset) / self.interval)
def mark_run(self, relative_time):
self.last_run = relative_time
self.completed_runs += 1
new_index = self.expected_runs(relative_time)
if new_index > self.index + 1:
logger.warning(f'Missed {new_index - self.index - 1} schedules of {self.name}')
self.index = new_index
def missed_runs(self, relative_time):
"Number of times job was supposed to run but failed to, only used for debug"
missed_ct = self.expected_runs(relative_time) - self.completed_runs
# if this is currently due to run do not count that as a missed run
if missed_ct and self.due_to_run(relative_time):
missed_ct -= 1
return missed_ct
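Review note: the missed-schedule policy in the ScheduledTask docstring can be checked with a small standalone sketch. `SketchTask` below is a hypothetical stand-in that copies only the index arithmetic from the diff. It is not the AWX class, and unlike the original its `mark_run` returns the missed count instead of logging it.

```python
class SketchTask:
    """Hypothetical stand-in for ScheduledTask; keeps only the scheduling arithmetic."""

    def __init__(self, interval: int, offset: int = 0):
        self.interval = interval
        self.offset = offset
        self.index = 0  # number of schedule periods already accounted for

    @property
    def next_run(self) -> int:
        # Target time relative to the scheduler start, always on the interval grid
        return (self.index + 1) * self.interval + self.offset

    def due_to_run(self, relative_time: float) -> bool:
        return self.next_run <= relative_time

    def mark_run(self, relative_time: float) -> int:
        """Advance the index and return how many targets were skipped."""
        new_index = int((relative_time - self.offset) / self.interval)
        missed = max(0, new_index - self.index - 1)
        self.index = new_index
        return missed


task = SketchTask(interval=10)
print(task.due_to_run(7.0))   # before the first target at t=10
print(task.mark_run(12.0))    # ran 2 seconds late, nothing skipped
print(task.next_run)          # next target stays on the 10 second grid
print(task.mark_run(45.0))    # targets at t=20 and t=30 are abandoned
```

The point of the design is that a late run never shifts later targets, and skipped targets are dropped rather than replayed.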
def run_continuously():
scheduler = Scheduler()
for task in settings.CELERYBEAT_SCHEDULE.values():
apply_async = TaskWorker.resolve_callable(task['task']).apply_async
total_seconds = task['schedule'].total_seconds()
scheduler.every(total_seconds).seconds.do(apply_async)
scheduler.run_continuously()
class Scheduler:
def __init__(self, schedule):
"""
Expects schedule in the form of a dictionary like
{
'job1': {'schedule': timedelta(seconds=50), 'other': 'stuff'}
}
Only the schedule nearest-second value is used for scheduling,
the rest of the data is for use by the caller to know what to run.
"""
self.jobs = [ScheduledTask(name, data) for name, data in schedule.items()]
min_interval = min(job.interval for job in self.jobs)
num_jobs = len(self.jobs)
# this is intentionally opinionated against spammy schedules
# a core goal is to spread out the scheduled tasks (for worker management)
# and high-frequency schedules just do not work with that
if num_jobs > min_interval:
raise RuntimeError(f'Number of schedules ({num_jobs}) is more than the shortest schedule interval ({min_interval} seconds).')
# evenly space out jobs over the base interval
for i, job in enumerate(self.jobs):
job.offset = (i * min_interval) // num_jobs
# internally times are all referenced relative to startup time, add grace period
self.global_start = time.time() + 2.0
def get_and_mark_pending(self):
relative_time = time.time() - self.global_start
to_run = []
for job in self.jobs:
if job.due_to_run(relative_time):
to_run.append(job)
logger.debug(f'scheduler found {job.name} to run, {relative_time - job.next_run} seconds after target')
job.mark_run(relative_time)
return to_run
def time_until_next_run(self):
relative_time = time.time() - self.global_start
next_job = min(self.jobs, key=lambda j: j.next_run)
delta = next_job.next_run - relative_time
if delta <= 0.1:
# careful not to give 0 or negative values to the select timeout, which has unclear interpretation
logger.warning(f'Scheduler next run of {next_job.name} is {-delta} seconds in the past')
return 0.1
elif delta > 20.0:
logger.warning(f'Scheduler next run unexpectedly over 20 seconds in future: {delta}')
return 20.0
logger.debug(f'Scheduler next run is {next_job.name} in {delta} seconds')
return delta
def debug(self, *args, **kwargs):
data = dict()
data['title'] = 'Scheduler status'
now = datetime.fromtimestamp(time.time()).strftime('%Y-%m-%d %H:%M:%S UTC')
start_time = datetime.fromtimestamp(self.global_start).strftime('%Y-%m-%d %H:%M:%S UTC')
relative_time = time.time() - self.global_start
data['started_time'] = start_time
data['current_time'] = now
data['current_time_relative'] = round(relative_time, 3)
data['total_schedules'] = len(self.jobs)
data['schedule_list'] = dict(
[
(
job.name,
dict(
last_run_seconds_ago=round(relative_time - job.last_run, 3) if job.last_run else None,
next_run_in_seconds=round(job.next_run - relative_time, 3),
offset_in_seconds=job.offset,
completed_runs=job.completed_runs,
missed_runs=job.missed_runs(relative_time),
),
)
for job in sorted(self.jobs, key=lambda job: job.interval)
]
)
return yaml.safe_dump(data, default_flow_style=False, sort_keys=False)
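Review note: two pieces of arithmetic in the new Scheduler are easy to misread, so here is a minimal sketch of both. The helper names `spread_offsets` and `clamp_timeout` are hypothetical and do not exist in the diff; the formulas and the 0.1 and 20.0 second bounds are copied from `__init__` and `time_until_next_run`.

```python
def spread_offsets(intervals):
    """Return one start offset per job, spaced across the shortest interval."""
    min_interval = min(intervals)
    num_jobs = len(intervals)
    if num_jobs > min_interval:
        # Same guard as the diff: more jobs than seconds in the shortest interval
        raise RuntimeError(f'Number of schedules ({num_jobs}) is more than the shortest schedule interval ({min_interval} seconds).')
    return [(i * min_interval) // num_jobs for i in range(num_jobs)]


def clamp_timeout(delta, low=0.1, high=20.0):
    """Keep the select timeout inside the bounds used by time_until_next_run."""
    return min(max(delta, low), high)


print(spread_offsets([20, 60, 30]))
print(clamp_timeout(-3.2), clamp_timeout(5.0), clamp_timeout(45.0))
```

Integer division keeps every offset a whole number of seconds, which matches the docstring's statement that target runtimes use integer math.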

View File

@@ -339,6 +339,17 @@ class AutoscalePool(WorkerPool):
# but if the task takes longer than the time defined here, we will force it to stop here
self.task_manager_timeout = settings.TASK_MANAGER_TIMEOUT + settings.TASK_MANAGER_TIMEOUT_GRACE_PERIOD
# initialize some things for subsystem metrics periodic gathering
# the AutoscalePool class does not save these to redis directly, but reports via produce_subsystem_metrics
self.scale_up_ct = 0
self.worker_count_max = 0
def produce_subsystem_metrics(self, metrics_object):
metrics_object.set('dispatcher_pool_scale_up_events', self.scale_up_ct)
metrics_object.set('dispatcher_pool_active_task_count', sum(len(w.managed_tasks) for w in self.workers))
metrics_object.set('dispatcher_pool_max_worker_count', self.worker_count_max)
self.worker_count_max = len(self.workers)
@property
def should_grow(self):
if len(self.workers) < self.min_workers:
@@ -406,16 +417,16 @@ class AutoscalePool(WorkerPool):
# the task manager to never do more work
current_task = w.current_task
if current_task and isinstance(current_task, dict):
endings = ['tasks.task_manager', 'tasks.dependency_manager', 'tasks.workflow_manager']
endings = ('tasks.task_manager', 'tasks.dependency_manager', 'tasks.workflow_manager')
current_task_name = current_task.get('task', '')
if any(current_task_name.endswith(e) for e in endings):
if current_task_name.endswith(endings):
if 'started' not in current_task:
w.managed_tasks[current_task['uuid']]['started'] = time.time()
age = time.time() - current_task['started']
w.managed_tasks[current_task['uuid']]['age'] = age
if age > self.task_manager_timeout:
logger.error(f'{current_task_name} has held the advisory lock for {age}, sending SIGTERM to {w.pid}')
os.kill(w.pid, signal.SIGTERM)
logger.error(f'{current_task_name} has held the advisory lock for {age}, sending SIGUSR1 to {w.pid}')
os.kill(w.pid, signal.SIGUSR1)
for m in orphaned:
# if all the workers are dead, spawn at least one
@@ -443,7 +454,12 @@ class AutoscalePool(WorkerPool):
idx = random.choice(range(len(self.workers)))
return idx, self.workers[idx]
else:
return super(AutoscalePool, self).up()
self.scale_up_ct += 1
ret = super(AutoscalePool, self).up()
new_worker_ct = len(self.workers)
if new_worker_ct > self.worker_count_max:
self.worker_count_max = new_worker_ct
return ret
def write(self, preferred_queue, body):
if 'guid' in body:
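Review note: the change from a list to a tuple for `endings` matters because `str.endswith` accepts a tuple of suffixes but not a list. A short sketch showing the two forms agree; the task names in `names` are made up for illustration.

```python
endings = ('tasks.task_manager', 'tasks.dependency_manager', 'tasks.workflow_manager')


def is_manager_task(task_name: str) -> bool:
    # A tuple argument makes endswith test every suffix in one call
    return task_name.endswith(endings)


# Hypothetical task names, only for illustration
names = ['awx.main.scheduler.tasks.task_manager', 'example.tasks.other', '']
for name in names:
    old_style = any(name.endswith(e) for e in endings)
    print(name, is_manager_task(name), old_style)
```

Passing a list instead of the tuple raises `TypeError`, which is why the literal had to change along with the call.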

View File

@@ -73,15 +73,15 @@ class task:
return cls.apply_async(args, kwargs)
@classmethod
def apply_async(cls, args=None, kwargs=None, queue=None, uuid=None, **kw):
def get_async_body(cls, args=None, kwargs=None, uuid=None, **kw):
"""
Get the python dict to become JSON data in the pg_notify message
This same message gets passed over the dispatcher IPC queue to workers
If a task is submitted to a multiprocessing pool, skipping pg_notify, this might be used directly
"""
task_id = uuid or str(uuid4())
args = args or []
kwargs = kwargs or {}
queue = queue or getattr(cls.queue, 'im_func', cls.queue)
if not queue:
msg = f'{cls.name}: Queue value required and may not be None'
logger.error(msg)
raise ValueError(msg)
obj = {'uuid': task_id, 'args': args, 'kwargs': kwargs, 'task': cls.name, 'time_pub': time.time()}
guid = get_guid()
if guid:
@@ -89,6 +89,16 @@ class task:
if bind_kwargs:
obj['bind_kwargs'] = bind_kwargs
obj.update(**kw)
return obj
@classmethod
def apply_async(cls, args=None, kwargs=None, queue=None, uuid=None, **kw):
queue = queue or getattr(cls.queue, 'im_func', cls.queue)
if not queue:
msg = f'{cls.name}: Queue value required and may not be None'
logger.error(msg)
raise ValueError(msg)
obj = cls.get_async_body(args=args, kwargs=kwargs, uuid=uuid, **kw)
if callable(queue):
queue = queue()
if not is_testing():
@@ -116,4 +126,5 @@ class task:
setattr(fn, 'name', cls.name)
setattr(fn, 'apply_async', cls.apply_async)
setattr(fn, 'delay', cls.delay)
setattr(fn, 'get_async_body', cls.get_async_body)
return fn
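Review note: the split between `get_async_body` and `apply_async` separates message construction from publishing, so a caller that already sits inside the dispatcher can build a body without touching pg_notify. The sketch below is a simplified illustration under stated assumptions: `ExampleTask`, its `name` and `queue` values, and the `publish` parameter are hypothetical, and the guid and bind_kwargs handling from the diff is left out.

```python
import time
from uuid import uuid4


class ExampleTask:
    """Hypothetical stand-in for a function wrapped by the dispatcher task decorator."""

    name = 'example.module.example_task'
    queue = 'example_queue'

    @classmethod
    def get_async_body(cls, args=None, kwargs=None, uuid=None, **kw):
        # Pure message construction: no queue lookup and no I/O
        body = {
            'uuid': uuid or str(uuid4()),
            'args': args or [],
            'kwargs': kwargs or {},
            'task': cls.name,
            'time_pub': time.time(),
        }
        body.update(**kw)
        return body

    @classmethod
    def apply_async(cls, args=None, kwargs=None, queue=None, uuid=None, publish=None, **kw):
        # Queue validation stays on the publishing path only
        queue = queue or cls.queue
        if not queue:
            raise ValueError(f'{cls.name}: Queue value required and may not be None')
        body = cls.get_async_body(args=args, kwargs=kwargs, uuid=uuid, **kw)
        if publish is not None:
            publish(queue, body)  # stands in for the pg_notify call
        return body


sent = []
ExampleTask.apply_async(args=[1], uuid='abc', publish=lambda q, b: sent.append((q, b)))
local_body = ExampleTask.get_async_body(uuid='def')  # built without publishing
print(len(sent), local_body['task'])
```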

View File

@@ -7,18 +7,21 @@ import signal
import sys
import redis
import json
import psycopg2
import psycopg
import time
from uuid import UUID
from queue import Empty as QueueEmpty
from datetime import timedelta
from django import db
from django.conf import settings
from awx.main.dispatch.pool import WorkerPool
from awx.main.dispatch.periodic import Scheduler
from awx.main.dispatch import pg_bus_conn
from awx.main.utils.common import log_excess_runtime
from awx.main.utils.db import set_connection_name
import awx.main.analytics.subsystem_metrics as s_metrics
if 'run_callback_receiver' in sys.argv:
logger = logging.getLogger('awx.main.commands.run_callback_receiver')
@@ -63,10 +66,12 @@ class AWXConsumerBase(object):
def control(self, body):
logger.warning(f'Received control signal:\n{body}')
control = body.get('control')
if control in ('status', 'running', 'cancel'):
if control in ('status', 'schedule', 'running', 'cancel'):
reply_queue = body['reply_to']
if control == 'status':
msg = '\n'.join([self.listening_on, self.pool.debug()])
if control == 'schedule':
msg = self.scheduler.debug()
elif control == 'running':
msg = []
for worker in self.pool.workers:
@@ -92,16 +97,11 @@ class AWXConsumerBase(object):
else:
logger.error('unrecognized control message: {}'.format(control))
def process_task(self, body):
def dispatch_task(self, body):
"""This will place the given body into a worker queue to run method decorated as a task"""
if isinstance(body, dict):
body['time_ack'] = time.time()
if 'control' in body:
try:
return self.control(body)
except Exception:
logger.exception(f"Exception handling control message: {body}")
return
if len(self.pool):
if "uuid" in body and body['uuid']:
try:
@@ -115,15 +115,24 @@ class AWXConsumerBase(object):
self.pool.write(queue, body)
self.total_messages += 1
def process_task(self, body):
"""Routes the task details in body as either a control task or a task-task"""
if 'control' in body:
try:
return self.control(body)
except Exception:
logger.exception(f"Exception handling control message: {body}")
return
self.dispatch_task(body)
@log_excess_runtime(logger)
def record_statistics(self):
if time.time() - self.last_stats > 1: # buffer stat recording to once per second
try:
self.redis.set(f'awx_{self.name}_statistics', self.pool.debug())
self.last_stats = time.time()
except Exception:
logger.exception(f"encountered an error communicating with redis to store {self.name} statistics")
self.last_stats = time.time()
self.last_stats = time.time()
def run(self, *args, **kwargs):
signal.signal(signal.SIGINT, self.stop)
@@ -142,29 +151,72 @@ class AWXConsumerRedis(AWXConsumerBase):
def run(self, *args, **kwargs):
super(AWXConsumerRedis, self).run(*args, **kwargs)
self.worker.on_start()
logger.info(f'Callback receiver started with pid={os.getpid()}')
db.connection.close() # logs use database, so close connection
while True:
logger.debug(f'{os.getpid()} is alive')
time.sleep(60)
class AWXConsumerPG(AWXConsumerBase):
def __init__(self, *args, **kwargs):
def __init__(self, *args, schedule=None, **kwargs):
super().__init__(*args, **kwargs)
self.pg_max_wait = settings.DISPATCHER_DB_DOWNTOWN_TOLLERANCE
# if no successful loops have run since startup, then we should fail right away
self.pg_is_down = True # set so that we fail if we get database errors on startup
self.pg_down_time = time.time() - self.pg_max_wait # allow no grace period
self.last_cleanup = time.time()
init_time = time.time()
self.pg_down_time = init_time - self.pg_max_wait # allow no grace period
self.last_cleanup = init_time
self.subsystem_metrics = s_metrics.Metrics(auto_pipe_execute=False)
self.last_metrics_gather = init_time
self.listen_cumulative_time = 0.0
if schedule:
schedule = schedule.copy()
else:
schedule = {}
# add control tasks to be run on regular schedules
# NOTE: if we run out of database connections, it is important to still run cleanup
# so that we scale down workers and free up connections
schedule['pool_cleanup'] = {'control': self.pool.cleanup, 'schedule': timedelta(seconds=60)}
# record subsystem metrics for the dispatcher
schedule['metrics_gather'] = {'control': self.record_metrics, 'schedule': timedelta(seconds=20)}
self.scheduler = Scheduler(schedule)
def record_metrics(self):
current_time = time.time()
self.pool.produce_subsystem_metrics(self.subsystem_metrics)
self.subsystem_metrics.set('dispatcher_availability', self.listen_cumulative_time / (current_time - self.last_metrics_gather))
self.subsystem_metrics.pipe_execute()
self.listen_cumulative_time = 0.0
self.last_metrics_gather = current_time
def run_periodic_tasks(self):
self.record_statistics() # maintains time buffer in method
"""
Run general periodic logic, and return maximum time in seconds before
the next requested run
This may be called more often than that when events are consumed,
so it should be very cheap to call
"""
try:
self.record_statistics() # maintains time buffer in method
except Exception as exc:
logger.warning(f'Failed to save dispatcher statistics {exc}')
if time.time() - self.last_cleanup > 60: # same as cluster_node_heartbeat
# NOTE: if we run out of database connections, it is important to still run cleanup
# so that we scale down workers and free up connections
self.pool.cleanup()
self.last_cleanup = time.time()
for job in self.scheduler.get_and_mark_pending():
if 'control' in job.data:
try:
job.data['control']()
except Exception:
logger.exception(f'Error running control task {job.data}')
elif 'task' in job.data:
body = self.worker.resolve_callable(job.data['task']).get_async_body()
# bypasses pg_notify for scheduled tasks
self.dispatch_task(body)
self.pg_is_down = False
self.listen_start = time.time()
return self.scheduler.time_until_next_run()
def run(self, *args, **kwargs):
super(AWXConsumerPG, self).run(*args, **kwargs)
@@ -180,17 +232,21 @@ class AWXConsumerPG(AWXConsumerBase):
if init is False:
self.worker.on_start()
init = True
# run_periodic_tasks runs scheduled actions and gives time until next scheduled action
# this is saved to the conn (PubSub) object in order to modify read timeout in-loop
conn.select_timeout = self.run_periodic_tasks()
# this is the main operational loop for awx-manage run_dispatcher
for e in conn.events(yield_timeouts=True):
self.listen_cumulative_time += time.time() - self.listen_start # for metrics
if e is not None:
self.process_task(json.loads(e.payload))
self.run_periodic_tasks()
self.pg_is_down = False
conn.select_timeout = self.run_periodic_tasks()
if self.should_stop:
return
except psycopg2.InterfaceError:
except psycopg.InterfaceError:
logger.warning("Stale Postgres message bus connection, reconnecting")
continue
except (db.DatabaseError, psycopg2.OperationalError):
except (db.DatabaseError, psycopg.OperationalError):
# If we have attained steady state operation, tolerate short-term database hiccups
if not self.pg_is_down:
logger.exception(f"Error consuming new events from postgres, will retry for {self.pg_max_wait} s")
@@ -232,8 +288,8 @@ class BaseWorker(object):
break
except QueueEmpty:
continue
except Exception as e:
logger.error("Exception on worker {}, restarting: ".format(idx) + str(e))
except Exception:
logger.exception("Exception on worker {}, reconnecting: ".format(idx))
continue
try:
for conn in db.connections.all():

View File

@@ -191,7 +191,9 @@ class CallbackBrokerWorker(BaseWorker):
e._retry_count = retry_count
# special sanitization logic for postgres treatment of NUL 0x00 char
if (retry_count == 1) and isinstance(exc_indv, ValueError) and ("\x00" in e.stdout):
# This used to check the class of the exception but on the postgres3 upgrade it could appear
# as either DataError or ValueError, so now let's just check whether it's there.
if (retry_count == 1) and ("\x00" in e.stdout):
e.stdout = e.stdout.replace("\x00", "")
if retry_count >= self.INDIVIDUAL_EVENT_RETRIES:

View File

@@ -67,10 +67,60 @@ def __enum_validate__(validator, enums, instance, schema):
Draft4Validator.VALIDATORS['enum'] = __enum_validate__
import logging
logger = logging.getLogger('awx.main.fields')
class JSONBlob(JSONField):
# Cringe... a JSONField that is back ended with a TextField.
# This field was a legacy custom field type that tl;dr; was a TextField
# Over the years, with Django upgrades, we were able to go to a JSONField instead of the custom field
# However, we didn't want to have large customers with millions of events to update from text to json during an upgrade
# So we keep this field type as backended with TextField.
def get_internal_type(self):
return "TextField"
# postgres uses a Jsonb field as the default backend
# with psycopg2 it was using a psycopg2._json.Json class internally
# with psycopg3 it uses a psycopg.types.json.Jsonb class internally
# The binary class was not compatible with a text field, so we are going to override these next two methods and ensure we are using a string
def from_db_value(self, value, expression, connection):
if value is None:
return value
if isinstance(value, str):
try:
return json.loads(value)
except Exception as e:
logger.error(f"Failed to load JSONField {self.name}: {e}")
return value
def get_db_prep_value(self, value, connection, prepared=False):
if not prepared:
value = self.get_prep_value(value)
try:
# Null characters are not allowed in text fields, and JSONBlobs are JSON data saved as text,
# so we strip out any null characters. Note that these "should" already be escaped by the dumps process:
# >>> my_obj = { 'test': '\x00' }
# >>> import json
# >>> json.dumps(my_obj)
# '{"test": "\\u0000"}'
# But just to be safe, let's remove them if they are there. \x00 and \u0000 are the same:
# >>> string = "\x00"
# >>> "\u0000" in string
# True
dumped_value = json.dumps(value)
if "\x00" in dumped_value:
dumped_value = dumped_value.replace("\x00", '')
return dumped_value
except Exception as e:
logger.error(f"Failed to dump JSONField {self.name}: {e} value: {value}")
return value
# Based on AutoOneToOneField from django-annoying:
# https://bitbucket.org/offline/django-annoying/src/a0de8b294db3/annoying/fields.py
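Review note: the comment in `get_db_prep_value` says `json.dumps` should already escape NUL characters, which makes the `replace` a safety net. A minimal sketch that checks this claim with the standard library; `dump_for_text_column` is a hypothetical helper name, not part of the diff.

```python
import json


def dump_for_text_column(value):
    """Serialize to JSON text and drop any raw NUL that somehow survived."""
    dumped = json.dumps(value)
    return dumped.replace("\x00", "")


dumped = dump_for_text_column({'test': 'a\x00b'})
print("\x00" in dumped)        # is a raw NUL present in the serialized text?
print("\\u0000" in dumped)     # is the six-character escape sequence present?
print(json.loads(dumped) == {'test': 'a\x00b'})
```

The serialized text holds the escape sequence rather than a raw NUL, so it can be stored in a text column, and loading it back in Python restores the original character.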

View File

@@ -17,6 +17,6 @@ class Command(BaseCommand):
months_ago = options.get('months-ago') or None
if not months_ago:
months_ago = getattr(settings, 'CLEANUP_HOST_METRICS_THRESHOLD', 12)
months_ago = getattr(settings, 'CLEANUP_HOST_METRICS_SOFT_THRESHOLD', 12)
HostMetric.cleanup_task(months_ago)

View File

@@ -17,10 +17,7 @@ from django.utils.timezone import now
# AWX
from awx.main.models import Job, AdHocCommand, ProjectUpdate, InventoryUpdate, SystemJob, WorkflowJob, Notification
def unified_job_class_to_event_table_name(job_class):
return f'main_{job_class().event_class.__name__.lower()}'
from awx.main.utils import unified_job_class_to_event_table_name
def partition_table_name(job_class, dt):

View File

@@ -0,0 +1,9 @@
from django.core.management.base import BaseCommand
from awx.main.tasks.host_metrics import HostMetricSummaryMonthlyTask
class Command(BaseCommand):
help = 'Computing of HostMetricSummaryMonthly'
def handle(self, *args, **options):
HostMetricSummaryMonthlyTask().execute()

View File

@@ -0,0 +1,27 @@
from django.utils.timezone import now
from django.core.management.base import BaseCommand, CommandParser
from datetime import timedelta
from awx.main.utils.common import create_partition, unified_job_class_to_event_table_name
from awx.main.models import Job, SystemJob, ProjectUpdate, InventoryUpdate, AdHocCommand
class Command(BaseCommand):
"""Command used to precreate database partitions to avoid pg_dump locks"""
def add_arguments(self, parser: CommandParser) -> None:
parser.add_argument('--count', dest='count', action='store', help='The number of hours of partitions to create', type=int, default=1)
def _create_partitioned_tables(self, count):
tables = list()
for model in (Job, SystemJob, ProjectUpdate, InventoryUpdate, AdHocCommand):
tables.append(unified_job_class_to_event_table_name(model))
start = now()
while count > 0:
for table in tables:
create_partition(table, start)
print(f'Created partitions for {table} {start}')
start = start + timedelta(hours=1)
count -= 1
def handle(self, **options):
self._create_partitioned_tables(count=options.get('count'))

View File

@@ -35,7 +35,7 @@ class Command(BaseCommand):
from awx.main.management.commands.register_queue import RegisterQueue
(changed, instance) = Instance.objects.register(ip_address=os.environ.get('MY_POD_IP'), node_type='control', uuid=settings.SYSTEM_UUID)
(changed, instance) = Instance.objects.register(ip_address=os.environ.get('MY_POD_IP'), node_type='control', node_uuid=settings.SYSTEM_UUID)
RegisterQueue(settings.DEFAULT_CONTROL_PLANE_QUEUE_NAME, 100, 0, [], is_container_group=False).register()
RegisterQueue(
settings.DEFAULT_EXECUTION_QUEUE_NAME,
@@ -48,7 +48,7 @@ class Command(BaseCommand):
max_concurrent_jobs=settings.DEFAULT_EXECUTION_QUEUE_MAX_CONCURRENT_JOBS,
).register()
else:
(changed, instance) = Instance.objects.register(hostname=hostname, node_type=node_type, uuid=uuid)
(changed, instance) = Instance.objects.register(hostname=hostname, node_type=node_type, node_uuid=uuid)
if changed:
print("Successfully registered instance {}".format(hostname))
else:

View File

@@ -2,6 +2,7 @@ import logging
import json
from django.core.management.base import BaseCommand
from awx.main.dispatch import pg_bus_conn
from awx.main.dispatch.worker.task import TaskWorker
@@ -18,7 +19,7 @@ class Command(BaseCommand):
def handle(self, *arg, **options):
try:
with pg_bus_conn(new_connection=True) as conn:
with pg_bus_conn() as conn:
conn.listen("tower_settings_change")
for e in conn.events(yield_timeouts=True):
if e is not None:

View File

@@ -4,28 +4,22 @@ import logging
import yaml
from django.conf import settings
from django.core.cache import cache as django_cache
from django.core.management.base import BaseCommand
from django.db import connection as django_connection
from awx.main.dispatch import get_task_queuename
from awx.main.dispatch.control import Control
from awx.main.dispatch.pool import AutoscalePool
from awx.main.dispatch.worker import AWXConsumerPG, TaskWorker
from awx.main.dispatch import periodic
logger = logging.getLogger('awx.main.dispatch')
def construct_bcast_queue_name(common_name):
return common_name + '_' + settings.CLUSTER_HOST_ID
class Command(BaseCommand):
help = 'Launch the task dispatcher'
def add_arguments(self, parser):
parser.add_argument('--status', dest='status', action='store_true', help='print the internal state of any running dispatchers')
parser.add_argument('--schedule', dest='schedule', action='store_true', help='print the current status of schedules being run by dispatcher')
parser.add_argument('--running', dest='running', action='store_true', help='print the UUIDs of any tasks managed by this dispatcher')
parser.add_argument(
'--reload',
@@ -47,6 +41,9 @@ class Command(BaseCommand):
if options.get('status'):
print(Control('dispatcher').status())
return
if options.get('schedule'):
print(Control('dispatcher').schedule())
return
if options.get('running'):
print(Control('dispatcher').running())
return
@@ -63,21 +60,11 @@ class Command(BaseCommand):
print(Control('dispatcher').cancel(cancel_data))
return
# It's important to close these because we're _about_ to fork, and we
# don't want the forked processes to inherit the open sockets
# for the DB and cache connections (that way lies race conditions)
django_connection.close()
django_cache.close()
# spawn a daemon thread to periodically enqueues scheduled tasks
# (like the node heartbeat)
periodic.run_continuously()
consumer = None
try:
queues = ['tower_broadcast_all', 'tower_settings_change', get_task_queuename()]
consumer = AWXConsumerPG('dispatcher', TaskWorker(), queues, AutoscalePool(min_workers=4))
consumer = AWXConsumerPG('dispatcher', TaskWorker(), queues, AutoscalePool(min_workers=4), schedule=settings.CELERYBEAT_SCHEDULE)
consumer.run()
except KeyboardInterrupt:
logger.debug('Terminating Task Dispatcher')

View File

@@ -1,74 +0,0 @@
import json
import logging
import os
import time
import signal
import sys
from django.core.management.base import BaseCommand
from django.conf import settings
from awx.main.dispatch import pg_bus_conn
logger = logging.getLogger('awx.main.commands.run_heartbeet')
class Command(BaseCommand):
help = 'Launch the web server beacon (heartbeet)'
def print_banner(self):
heartbeet = r"""
********** **********
************* *************
*****************************
***********HEART***********
*************************
*******************
*************** _._
*********** /`._ `'. __
******* \ .\| \ _'` `)
*** (``_) \| ).'` /`- /
* `\ `;\_ `\\//`-'` /
\ `'.'.| / __/`
`'--v_|/`'`
__||-._
/'` `-`` `'\\
/ .'` )
\ BEET ' )
\. /
'. /'`
`) |
//
'(.
`\`.
``"""
print(heartbeet)
def construct_payload(self, action='online'):
payload = {
'hostname': settings.CLUSTER_HOST_ID,
'ip': os.environ.get('MY_POD_IP'),
'action': action,
}
return json.dumps(payload)
def notify_listener_and_exit(self, *args):
with pg_bus_conn(new_connection=False) as conn:
conn.notify('web_heartbeet', self.construct_payload(action='offline'))
sys.exit(0)
def do_hearbeat_loop(self):
with pg_bus_conn(new_connection=True) as conn:
while True:
logger.debug('Sending heartbeat')
conn.notify('web_heartbeet', self.construct_payload())
time.sleep(settings.BROADCAST_WEBSOCKET_BEACON_FROM_WEB_RATE_SECONDS)
def handle(self, *arg, **options):
self.print_banner()
signal.signal(signal.SIGTERM, self.notify_listener_and_exit)
signal.signal(signal.SIGINT, self.notify_listener_and_exit)
# Note: We don't really try any reconnect logic to pg_notify here,
# just let supervisor restart if we fail.
self.do_hearbeat_loop()

View File

@@ -22,7 +22,7 @@ class Command(BaseCommand):
def handle(self, *arg, **options):
try:
with pg_bus_conn(new_connection=True) as conn:
with pg_bus_conn() as conn:
conn.listen("rsyslog_configurer")
# reconfigure rsyslog on start up
reconfigure_rsyslog()

View File

@@ -0,0 +1,45 @@
import json
import logging
import os
import time
import signal
import sys
from django.core.management.base import BaseCommand
from django.conf import settings
from awx.main.dispatch import pg_bus_conn
logger = logging.getLogger('awx.main.commands.run_ws_heartbeat')
class Command(BaseCommand):
help = 'Launch the web server beacon (ws_heartbeat)'
def construct_payload(self, action='online'):
payload = {
'hostname': settings.CLUSTER_HOST_ID,
'ip': os.environ.get('MY_POD_IP'),
'action': action,
}
return json.dumps(payload)
def notify_listener_and_exit(self, *args):
with pg_bus_conn(new_connection=False) as conn:
conn.notify('web_ws_heartbeat', self.construct_payload(action='offline'))
sys.exit(0)
def do_heartbeat_loop(self):
while True:
with pg_bus_conn() as conn:
logger.debug('Sending heartbeat')
conn.notify('web_ws_heartbeat', self.construct_payload())
time.sleep(settings.BROADCAST_WEBSOCKET_BEACON_FROM_WEB_RATE_SECONDS)
def handle(self, *arg, **options):
signal.signal(signal.SIGTERM, self.notify_listener_and_exit)
signal.signal(signal.SIGINT, self.notify_listener_and_exit)
# Note: We don't really try any reconnect logic to pg_notify here,
# just let supervisor restart if we fail.
self.do_heartbeat_loop()

View File

@@ -2,6 +2,7 @@
# All Rights Reserved.
import logging
import uuid
from django.db import models
from django.conf import settings
from django.db.models.functions import Lower
@@ -114,7 +115,7 @@ class InstanceManager(models.Manager):
return node[0]
raise RuntimeError("No instance found with the current cluster host id")
def register(self, uuid=None, hostname=None, ip_address=None, node_type='hybrid', defaults=None):
def register(self, node_uuid=None, hostname=None, ip_address=None, node_type='hybrid', defaults=None):
if not hostname:
hostname = settings.CLUSTER_HOST_ID
@@ -131,8 +132,8 @@ class InstanceManager(models.Manager):
logger.warning("IP address {0} conflict detected, ip address unset for host {1}.".format(ip_address, other_hostname))
# Return existing instance that matches hostname or UUID (default to UUID)
if uuid is not None and uuid != UUID_DEFAULT and self.filter(uuid=uuid).exists():
instance = self.filter(uuid=uuid)
if node_uuid is not None and node_uuid != UUID_DEFAULT and self.filter(uuid=node_uuid).exists():
instance = self.filter(uuid=node_uuid)
else:
# if instance was not retrieved by uuid and hostname was, use the hostname
instance = self.filter(hostname=hostname)
@@ -170,9 +171,7 @@ class InstanceManager(models.Manager):
}
if defaults is not None:
create_defaults.update(defaults)
uuid_option = {}
if uuid is not None:
uuid_option = {'uuid': uuid}
uuid_option = {'uuid': node_uuid if node_uuid is not None else uuid.uuid4()}
if node_type == 'execution' and 'version' not in create_defaults:
create_defaults['version'] = RECEPTOR_PENDING
instance = self.create(hostname=hostname, ip_address=ip_address, node_type=node_type, **create_defaults, **uuid_option)
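Review note: the diff does not state why `uuid` became `node_uuid`, but since the same change adds `import uuid` and calls `uuid.uuid4()`, a parameter named `uuid` would shadow the module inside `register`. The sketch below illustrates that reading; both function names are hypothetical and only the fallback expression mirrors the diff.

```python
import uuid


def register_shadowed(uuid=None):
    # The parameter hides the module, so uuid.uuid4() is looked up on the argument
    return uuid if uuid is not None else uuid.uuid4()


def register_renamed(node_uuid=None):
    # With a distinct parameter name the module stays reachable
    return node_uuid if node_uuid is not None else uuid.uuid4()


try:
    register_shadowed()
except AttributeError as exc:
    print(f'shadowed version fails: {exc}')

print(type(register_renamed()).__name__)
print(register_renamed('fixed-value'))
```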

View File

@@ -9,13 +9,11 @@ from django.db import migrations, models
import django.utils.timezone
import django.db.models.deletion
from django.conf import settings
import taggit.managers
import awx.main.fields
class Migration(migrations.Migration):
dependencies = [
('taggit', '0002_auto_20150616_2121'),
('contenttypes', '0002_remove_content_type_name'),
migrations.swappable_dependency(settings.AUTH_USER_MODEL),
]
@@ -184,12 +182,6 @@ class Migration(migrations.Migration):
null=True,
),
),
(
'tags',
taggit.managers.TaggableManager(
to='taggit.Tag', through='taggit.TaggedItem', blank=True, help_text='A comma-separated list of tags.', verbose_name='Tags'
),
),
],
options={
'ordering': ('kind', 'name'),
@@ -529,12 +521,6 @@ class Migration(migrations.Migration):
null=True,
),
),
(
'tags',
taggit.managers.TaggableManager(
to='taggit.Tag', through='taggit.TaggedItem', blank=True, help_text='A comma-separated list of tags.', verbose_name='Tags'
),
),
('users', models.ManyToManyField(related_name='organizations', to=settings.AUTH_USER_MODEL, blank=True)),
],
options={
@@ -589,12 +575,6 @@ class Migration(migrations.Migration):
null=True,
),
),
(
'tags',
taggit.managers.TaggableManager(
to='taggit.Tag', through='taggit.TaggedItem', blank=True, help_text='A comma-separated list of tags.', verbose_name='Tags'
),
),
],
),
migrations.CreateModel(
@@ -644,12 +624,6 @@ class Migration(migrations.Migration):
null=True,
),
),
(
'tags',
taggit.managers.TaggableManager(
to='taggit.Tag', through='taggit.TaggedItem', blank=True, help_text='A comma-separated list of tags.', verbose_name='Tags'
),
),
],
options={
'ordering': ['-next_run'],
@@ -687,12 +661,6 @@ class Migration(migrations.Migration):
),
),
('organization', models.ForeignKey(related_name='teams', on_delete=django.db.models.deletion.SET_NULL, to='main.Organization', null=True)),
(
'tags',
taggit.managers.TaggableManager(
to='taggit.Tag', through='taggit.TaggedItem', blank=True, help_text='A comma-separated list of tags.', verbose_name='Tags'
),
),
('users', models.ManyToManyField(related_name='teams', to=settings.AUTH_USER_MODEL, blank=True)),
],
options={
@@ -1267,13 +1235,6 @@ class Migration(migrations.Migration):
null=True,
),
),
migrations.AddField(
model_name='unifiedjobtemplate',
name='tags',
field=taggit.managers.TaggableManager(
to='taggit.Tag', through='taggit.TaggedItem', blank=True, help_text='A comma-separated list of tags.', verbose_name='Tags'
),
),
migrations.AddField(
model_name='unifiedjob',
name='created_by',
@@ -1319,13 +1280,6 @@ class Migration(migrations.Migration):
name='schedule',
field=models.ForeignKey(on_delete=django.db.models.deletion.SET_NULL, default=None, editable=False, to='main.Schedule', null=True),
),
migrations.AddField(
model_name='unifiedjob',
name='tags',
field=taggit.managers.TaggableManager(
to='taggit.Tag', through='taggit.TaggedItem', blank=True, help_text='A comma-separated list of tags.', verbose_name='Tags'
),
),
migrations.AddField(
model_name='unifiedjob',
name='unified_job_template',
@@ -1370,13 +1324,6 @@ class Migration(migrations.Migration):
help_text='Organization containing this inventory.',
),
),
migrations.AddField(
model_name='inventory',
name='tags',
field=taggit.managers.TaggableManager(
to='taggit.Tag', through='taggit.TaggedItem', blank=True, help_text='A comma-separated list of tags.', verbose_name='Tags'
),
),
migrations.AddField(
model_name='host',
name='inventory',
@@ -1407,13 +1354,6 @@ class Migration(migrations.Migration):
null=True,
),
),
migrations.AddField(
model_name='host',
name='tags',
field=taggit.managers.TaggableManager(
to='taggit.Tag', through='taggit.TaggedItem', blank=True, help_text='A comma-separated list of tags.', verbose_name='Tags'
),
),
migrations.AddField(
model_name='group',
name='hosts',
@@ -1441,13 +1381,6 @@ class Migration(migrations.Migration):
name='parents',
field=models.ManyToManyField(related_name='children', to='main.Group', blank=True),
),
migrations.AddField(
model_name='group',
name='tags',
field=taggit.managers.TaggableManager(
to='taggit.Tag', through='taggit.TaggedItem', blank=True, help_text='A comma-separated list of tags.', verbose_name='Tags'
),
),
migrations.AddField(
model_name='custominventoryscript',
name='organization',
@@ -1459,13 +1392,6 @@ class Migration(migrations.Migration):
null=True,
),
),
migrations.AddField(
model_name='custominventoryscript',
name='tags',
field=taggit.managers.TaggableManager(
to='taggit.Tag', through='taggit.TaggedItem', blank=True, help_text='A comma-separated list of tags.', verbose_name='Tags'
),
),
migrations.AddField(
model_name='credential',
name='team',

View File

@@ -12,8 +12,6 @@ import django.db.models.deletion
from django.conf import settings
from django.utils.timezone import now
import taggit.managers
def create_system_job_templates(apps, schema_editor):
"""
@@ -125,7 +123,6 @@ class Migration(migrations.Migration):
]
dependencies = [
('taggit', '0002_auto_20150616_2121'),
('contenttypes', '0002_remove_content_type_name'),
migrations.swappable_dependency(settings.AUTH_USER_MODEL),
('main', '0001_initial'),
@@ -256,12 +253,6 @@ class Migration(migrations.Migration):
'organization',
models.ForeignKey(related_name='notification_templates', on_delete=django.db.models.deletion.SET_NULL, to='main.Organization', null=True),
),
(
'tags',
taggit.managers.TaggableManager(
to='taggit.Tag', through='taggit.TaggedItem', blank=True, help_text='A comma-separated list of tags.', verbose_name='Tags'
),
),
],
),
migrations.AddField(
@@ -721,12 +712,6 @@ class Migration(migrations.Migration):
help_text='Organization this label belongs to.',
),
),
(
'tags',
taggit.managers.TaggableManager(
to='taggit.Tag', through='taggit.TaggedItem', blank=True, help_text='A comma-separated list of tags.', verbose_name='Tags'
),
),
],
options={
'ordering': ('organization', 'name'),

View File

@@ -2,13 +2,9 @@
# Python
from __future__ import unicode_literals
# Psycopg2
from psycopg2.extensions import AsIs
# Django
from django.db import connection, migrations, models, OperationalError, ProgrammingError
from django.conf import settings
import taggit.managers
# AWX
import awx.main.fields
@@ -136,8 +132,8 @@ class Migration(migrations.Migration):
),
),
migrations.RunSQL(
[("CREATE INDEX host_ansible_facts_default_gin ON %s USING gin" "(ansible_facts jsonb_path_ops);", [AsIs(Host._meta.db_table)])],
[('DROP INDEX host_ansible_facts_default_gin;', None)],
sql="CREATE INDEX host_ansible_facts_default_gin ON {} USING gin(ansible_facts jsonb_path_ops);".format(Host._meta.db_table),
reverse_sql='DROP INDEX host_ansible_facts_default_gin;',
),
# SCM file-based inventories
migrations.AddField(
@@ -320,10 +316,6 @@ class Migration(migrations.Migration):
model_name='permission',
name='project',
),
migrations.RemoveField(
model_name='permission',
name='tags',
),
migrations.RemoveField(
model_name='permission',
name='team',
@@ -513,12 +505,6 @@ class Migration(migrations.Migration):
null=True,
),
),
(
'tags',
taggit.managers.TaggableManager(
to='taggit.Tag', through='taggit.TaggedItem', blank=True, help_text='A comma-separated list of tags.', verbose_name='Tags'
),
),
],
options={
'ordering': ('kind', 'name'),

View File

@@ -4,7 +4,6 @@ from __future__ import unicode_literals
from django.conf import settings
from django.db import migrations, models
import django.db.models.deletion
import taggit.managers
# AWX
import awx.main.fields
@@ -20,7 +19,6 @@ def setup_tower_managed_defaults(apps, schema_editor):
class Migration(migrations.Migration):
dependencies = [
migrations.swappable_dependency(settings.AUTH_USER_MODEL),
('taggit', '0002_auto_20150616_2121'),
('main', '0066_v350_inventorysource_custom_virtualenv'),
]
@@ -60,12 +58,6 @@ class Migration(migrations.Migration):
'source_credential',
models.ForeignKey(null=True, on_delete=django.db.models.deletion.CASCADE, related_name='target_input_sources', to='main.Credential'),
),
(
'tags',
taggit.managers.TaggableManager(
blank=True, help_text='A comma-separated list of tags.', through='taggit.TaggedItem', to='taggit.Tag', verbose_name='Tags'
),
),
(
'target_credential',
models.ForeignKey(null=True, on_delete=django.db.models.deletion.CASCADE, related_name='input_sources', to='main.Credential'),

View File

@@ -4,12 +4,10 @@ from django.conf import settings
from django.db import migrations, models
import django.db.models.deletion
import django.db.models.expressions
import taggit.managers
class Migration(migrations.Migration):
dependencies = [
('taggit', '0003_taggeditem_add_unique_index'),
migrations.swappable_dependency(settings.AUTH_USER_MODEL),
('main', '0123_drop_hg_support'),
]
@@ -69,12 +67,6 @@ class Migration(migrations.Migration):
to='main.Organization',
),
),
(
'tags',
taggit.managers.TaggableManager(
blank=True, help_text='A comma-separated list of tags.', through='taggit.TaggedItem', to='taggit.Tag', verbose_name='Tags'
),
),
],
options={
'ordering': (django.db.models.expressions.OrderBy(django.db.models.expressions.F('organization_id'), nulls_first=True), 'image'),

View File

@@ -30,7 +30,7 @@ def migrate_event_data(apps, schema_editor):
# otherwise, the schema changes we would make on the old jobevents table
# (namely, dropping the primary key constraint) would cause the migration
# to suffer a serious performance degradation
cursor.execute(f'CREATE TABLE tmp_{tblname} ' f'(LIKE _unpartitioned_{tblname} INCLUDING ALL)')
cursor.execute(f'CREATE TABLE tmp_{tblname} (LIKE _unpartitioned_{tblname} INCLUDING ALL)')
# drop primary key constraint; in a partitioned table
# constraints must include the partition key itself
@@ -48,7 +48,7 @@ def migrate_event_data(apps, schema_editor):
cursor.execute(f'DROP TABLE tmp_{tblname}')
# recreate primary key constraint
cursor.execute(f'ALTER TABLE ONLY {tblname} ' f'ADD CONSTRAINT {tblname}_pkey_new PRIMARY KEY (id, job_created);')
cursor.execute(f'ALTER TABLE ONLY {tblname} ADD CONSTRAINT {tblname}_pkey_new PRIMARY KEY (id, job_created);')
with connection.cursor() as cursor:
"""

View File

@@ -0,0 +1,277 @@
# Generated by Django 4.2.3 on 2023-08-02 13:18
import awx.main.models.notifications
from django.db import migrations, models
class Migration(migrations.Migration):
dependencies = [
('main', '0184_django_indexes'),
('conf', '0010_change_to_JSONField'),
]
operations = [
migrations.AlterField(
model_name='instancegroup',
name='policy_instance_list',
field=models.JSONField(
blank=True, default=list, help_text='List of exact-match Instances that will always be automatically assigned to this group'
),
),
migrations.AlterField(
model_name='jobtemplate',
name='survey_spec',
field=models.JSONField(blank=True, default=dict),
),
migrations.AlterField(
model_name='notificationtemplate',
name='messages',
field=models.JSONField(
blank=True,
default=awx.main.models.notifications.NotificationTemplate.default_messages,
help_text='Optional custom messages for notification template.',
null=True,
),
),
migrations.AlterField(
model_name='notificationtemplate',
name='notification_configuration',
field=models.JSONField(default=dict),
),
migrations.AlterField(
model_name='project',
name='inventory_files',
field=models.JSONField(
blank=True,
default=list,
editable=False,
help_text='Suggested list of content that could be Ansible inventory in the project',
verbose_name='Inventory Files',
),
),
migrations.AlterField(
model_name='project',
name='playbook_files',
field=models.JSONField(blank=True, default=list, editable=False, help_text='List of playbooks found in the project', verbose_name='Playbook Files'),
),
migrations.AlterField(
model_name='schedule',
name='char_prompts',
field=models.JSONField(blank=True, default=dict),
),
migrations.AlterField(
model_name='schedule',
name='survey_passwords',
field=models.JSONField(blank=True, default=dict, editable=False),
),
migrations.AlterField(
model_name='workflowjobtemplate',
name='char_prompts',
field=models.JSONField(blank=True, default=dict),
),
migrations.AlterField(
model_name='workflowjobtemplate',
name='survey_spec',
field=models.JSONField(blank=True, default=dict),
),
migrations.AlterField(
model_name='workflowjobtemplatenode',
name='char_prompts',
field=models.JSONField(blank=True, default=dict),
),
migrations.AlterField(
model_name='workflowjobtemplatenode',
name='survey_passwords',
field=models.JSONField(blank=True, default=dict, editable=False),
),
# These are potentially a problem. Move the existing fields
# aside while pretending like they've been deleted, then add
# in fresh empty fields. Make the old fields nullable where
# needed while we are at it, so that new rows don't hit
# IntegrityError. We'll do the data migration out-of-band
# using a task.
migrations.RunSQL( # Already nullable
"ALTER TABLE main_activitystream RENAME deleted_actor TO deleted_actor_old;",
state_operations=[
migrations.RemoveField(
model_name='activitystream',
name='deleted_actor',
),
],
),
migrations.AddField(
model_name='activitystream',
name='deleted_actor',
field=models.JSONField(null=True),
),
migrations.RunSQL(
"""
ALTER TABLE main_activitystream RENAME setting TO setting_old;
ALTER TABLE main_activitystream ALTER COLUMN setting_old DROP NOT NULL;
""",
state_operations=[
migrations.RemoveField(
model_name='activitystream',
name='setting',
),
],
),
migrations.AddField(
model_name='activitystream',
name='setting',
field=models.JSONField(blank=True, default=dict),
),
migrations.RunSQL(
"""
ALTER TABLE main_job RENAME survey_passwords TO survey_passwords_old;
ALTER TABLE main_job ALTER COLUMN survey_passwords_old DROP NOT NULL;
""",
state_operations=[
migrations.RemoveField(
model_name='job',
name='survey_passwords',
),
],
),
migrations.AddField(
model_name='job',
name='survey_passwords',
field=models.JSONField(blank=True, default=dict, editable=False),
),
migrations.RunSQL(
"""
ALTER TABLE main_joblaunchconfig RENAME char_prompts TO char_prompts_old;
ALTER TABLE main_joblaunchconfig ALTER COLUMN char_prompts_old DROP NOT NULL;
""",
state_operations=[
migrations.RemoveField(
model_name='joblaunchconfig',
name='char_prompts',
),
],
),
migrations.AddField(
model_name='joblaunchconfig',
name='char_prompts',
field=models.JSONField(blank=True, default=dict),
),
migrations.RunSQL(
"""
ALTER TABLE main_joblaunchconfig RENAME survey_passwords TO survey_passwords_old;
ALTER TABLE main_joblaunchconfig ALTER COLUMN survey_passwords_old DROP NOT NULL;
""",
state_operations=[
migrations.RemoveField(
model_name='joblaunchconfig',
name='survey_passwords',
),
],
),
migrations.AddField(
model_name='joblaunchconfig',
name='survey_passwords',
field=models.JSONField(blank=True, default=dict, editable=False),
),
migrations.RunSQL(
"""
ALTER TABLE main_notification RENAME body TO body_old;
ALTER TABLE main_notification ALTER COLUMN body_old DROP NOT NULL;
""",
state_operations=[
migrations.RemoveField(
model_name='notification',
name='body',
),
],
),
migrations.AddField(
model_name='notification',
name='body',
field=models.JSONField(blank=True, default=dict),
),
migrations.RunSQL(
"""
ALTER TABLE main_unifiedjob RENAME job_env TO job_env_old;
ALTER TABLE main_unifiedjob ALTER COLUMN job_env_old DROP NOT NULL;
""",
state_operations=[
migrations.RemoveField(
model_name='unifiedjob',
name='job_env',
),
],
),
migrations.AddField(
model_name='unifiedjob',
name='job_env',
field=models.JSONField(blank=True, default=dict, editable=False),
),
migrations.RunSQL(
"""
ALTER TABLE main_workflowjob RENAME char_prompts TO char_prompts_old;
ALTER TABLE main_workflowjob ALTER COLUMN char_prompts_old DROP NOT NULL;
""",
state_operations=[
migrations.RemoveField(
model_name='workflowjob',
name='char_prompts',
),
],
),
migrations.AddField(
model_name='workflowjob',
name='char_prompts',
field=models.JSONField(blank=True, default=dict),
),
migrations.RunSQL(
"""
ALTER TABLE main_workflowjob RENAME survey_passwords TO survey_passwords_old;
ALTER TABLE main_workflowjob ALTER COLUMN survey_passwords_old DROP NOT NULL;
""",
state_operations=[
migrations.RemoveField(
model_name='workflowjob',
name='survey_passwords',
),
],
),
migrations.AddField(
model_name='workflowjob',
name='survey_passwords',
field=models.JSONField(blank=True, default=dict, editable=False),
),
migrations.RunSQL(
"""
ALTER TABLE main_workflowjobnode RENAME char_prompts TO char_prompts_old;
ALTER TABLE main_workflowjobnode ALTER COLUMN char_prompts_old DROP NOT NULL;
""",
state_operations=[
migrations.RemoveField(
model_name='workflowjobnode',
name='char_prompts',
),
],
),
migrations.AddField(
model_name='workflowjobnode',
name='char_prompts',
field=models.JSONField(blank=True, default=dict),
),
migrations.RunSQL(
"""
ALTER TABLE main_workflowjobnode RENAME survey_passwords TO survey_passwords_old;
ALTER TABLE main_workflowjobnode ALTER COLUMN survey_passwords_old DROP NOT NULL;
""",
state_operations=[
migrations.RemoveField(
model_name='workflowjobnode',
name='survey_passwords',
),
],
),
migrations.AddField(
model_name='workflowjobnode',
name='survey_passwords',
field=models.JSONField(blank=True, default=dict, editable=False),
),
]
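
Note on the migration above: every "set aside" operation follows the same shape. The column is renamed to `<name>_old`, its NOT NULL constraint is dropped unless it was already nullable, and the migration state is told the field was removed so that a fresh `JSONField` can be added under the original name. The SQL half of that pattern can be sketched as below. `rename_aside_sql` is a hypothetical helper written for illustration only; it is not part of AWX or Django, and the migration itself spells the statements out by hand.

```python
def rename_aside_sql(table, column, already_nullable=False):
    """Build the SQL that sets a column aside as <column>_old.

    Hypothetical helper, not part of AWX or Django; it only mirrors the
    statements that the migration above writes out explicitly.
    """
    old = f"{column}_old"
    statements = [f"ALTER TABLE {table} RENAME {column} TO {old};"]
    if not already_nullable:
        # New rows leave the old column empty, so it must accept NULL.
        statements.append(f"ALTER TABLE {table} ALTER COLUMN {old} DROP NOT NULL;")
    return "\n".join(statements)


print(rename_aside_sql("main_notification", "body"))
```

Passing `already_nullable=True` reproduces the single-statement form used for `main_activitystream.deleted_actor`, which the migration marks as "Already nullable".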

View File

@@ -0,0 +1,27 @@
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
from django.db import migrations
def delete_taggit_contenttypes(apps, schema_editor):
ContentType = apps.get_model('contenttypes', 'ContentType')
ContentType.objects.filter(app_label='taggit').delete()
def delete_taggit_migration_records(apps, schema_editor):
recorder = migrations.recorder.MigrationRecorder(connection=schema_editor.connection)
recorder.migration_qs.filter(app='taggit').delete()
class Migration(migrations.Migration):
dependencies = [
('main', '0185_move_JSONBlob_to_JSONField'),
]
operations = [
migrations.RunSQL("DROP TABLE IF EXISTS taggit_tag CASCADE;"),
migrations.RunSQL("DROP TABLE IF EXISTS taggit_taggeditem CASCADE;"),
migrations.RunPython(delete_taggit_contenttypes),
migrations.RunPython(delete_taggit_migration_records),
]

View File

@@ -3,6 +3,7 @@
# Django
from django.conf import settings # noqa
from django.db import connection
from django.db.models.signals import pre_delete # noqa
# AWX
@@ -99,6 +100,58 @@ User.add_to_class('can_access_with_errors', check_user_access_with_errors)
User.add_to_class('accessible_objects', user_accessible_objects)
def convert_jsonfields():
if connection.vendor != 'postgresql':
return
# fmt: off
fields = [
('main_activitystream', 'id', (
'deleted_actor',
'setting',
)),
('main_job', 'unifiedjob_ptr_id', (
'survey_passwords',
)),
('main_joblaunchconfig', 'id', (
'char_prompts',
'survey_passwords',
)),
('main_notification', 'id', (
'body',
)),
('main_unifiedjob', 'id', (
'job_env',
)),
('main_workflowjob', 'unifiedjob_ptr_id', (
'char_prompts',
'survey_passwords',
)),
('main_workflowjobnode', 'id', (
'char_prompts',
'survey_passwords',
)),
]
# fmt: on
with connection.cursor() as cursor:
for table, pkfield, columns in fields:
# Do the renamed old columns still exist? If so, run the task.
old_columns = ','.join(f"'{column}_old'" for column in columns)
cursor.execute(
f"""
select count(1) from information_schema.columns
where
table_name = %s and column_name in ({old_columns});
""",
(table,),
)
if cursor.fetchone()[0]:
from awx.main.tasks.system import migrate_jsonfield
migrate_jsonfield.apply_async([table, pkfield, columns])
def cleanup_created_modified_by(sender, **kwargs):
# work around a bug in django-polymorphic that doesn't properly
# handle cascades for reverse foreign keys on the polymorphic base model
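
Note on `convert_jsonfields` above: for each table it asks `information_schema.columns` whether any of the renamed `<column>_old` columns still exist, and only then queues the `migrate_jsonfield` task. A minimal sketch of how that check query is assembled follows. `build_old_column_check` is a hypothetical name used here for illustration; the real code builds the query inline.

```python
def build_old_column_check(table, columns):
    """Return (sql, params) for counting leftover '<column>_old' columns.

    Hypothetical helper for illustration. The column names are interpolated
    because they come from a fixed list in the code, while the table name
    is passed as a query parameter.
    """
    old_columns = ','.join(f"'{column}_old'" for column in columns)
    sql = (
        "select count(1) from information_schema.columns "
        f"where table_name = %s and column_name in ({old_columns});"
    )
    return sql, (table,)


sql, params = build_old_column_check('main_joblaunchconfig', ('char_prompts', 'survey_passwords'))
print(sql)
print(params)
```

A non-zero count means at least one old column has not been converted and dropped yet, which is the condition the function uses to schedule the task.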

View File

@@ -3,7 +3,6 @@
# AWX
from awx.api.versioning import reverse
from awx.main.fields import JSONBlob
from awx.main.models.base import accepts_json
# Django
@@ -36,7 +35,7 @@ class ActivityStream(models.Model):
operation = models.CharField(max_length=13, choices=OPERATION_CHOICES)
timestamp = models.DateTimeField(auto_now_add=True)
changes = accepts_json(models.TextField(blank=True))
deleted_actor = JSONBlob(null=True)
deleted_actor = models.JSONField(null=True)
action_node = models.CharField(
blank=True,
default='',
@@ -84,7 +83,7 @@ class ActivityStream(models.Model):
o_auth2_application = models.ManyToManyField("OAuth2Application", blank=True)
o_auth2_access_token = models.ManyToManyField("OAuth2AccessToken", blank=True)
setting = JSONBlob(default=dict, blank=True)
setting = models.JSONField(default=dict, blank=True)
def __str__(self):
operation = self.operation if 'operation' in self.__dict__ else '_delayed_'

View File

@@ -7,9 +7,6 @@ from django.core.exceptions import ValidationError, ObjectDoesNotExist
from django.utils.translation import gettext_lazy as _
from django.utils.timezone import now
# Django-Taggit
from taggit.managers import TaggableManager
# Django-CRUM
from crum import get_current_user
@@ -301,8 +298,6 @@ class PrimordialModel(HasEditsMixin, CreatedModifiedModel):
on_delete=models.SET_NULL,
)
tags = TaggableManager(blank=True)
def __init__(self, *args, **kwargs):
r = super(PrimordialModel, self).__init__(*args, **kwargs)
if self.pk:

View File

@@ -4,6 +4,7 @@ import datetime
from datetime import timezone
import logging
from collections import defaultdict
import time
from django.conf import settings
from django.core.exceptions import ObjectDoesNotExist
@@ -383,8 +384,17 @@ class BasePlaybookEvent(CreatedModifiedModel):
.distinct()
) # noqa
job.get_event_queryset().filter(uuid__in=changed).update(changed=True)
job.get_event_queryset().filter(uuid__in=failed).update(failed=True)
# NOTE: we take a set of changed and failed parent uuids because the subquery
# complicates the plan with large event tables causing very long query execution time
changed_start = time.time()
changed_res = job.get_event_queryset().filter(uuid__in=set(changed)).update(changed=True)
failed_start = time.time()
failed_res = job.get_event_queryset().filter(uuid__in=set(failed)).update(failed=True)
logger.debug(
f'Event propagation for job {job.id}: '
f'marked {changed_res} as changed in {failed_start - changed_start:.4f}s, '
f'{failed_res} as failed in {time.time() - failed_start:.4f}s'
)
for field in ('playbook', 'play', 'task', 'role'):
value = force_str(event_data.get(field, '')).strip()
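
Note on the event propagation hunk above: the start time of the second update doubles as the end time of the first, so two timestamps plus one final `time.time()` call yield both durations. A sketch of that timing pattern, assuming two hypothetical callables (`update_changed`, `update_failed`) that stand in for the queryset `.update()` calls and return a row count:

```python
import time


def propagate(update_changed, update_failed):
    """Run two update steps and report row counts with per-step durations.

    Sketch only: update_changed and update_failed are hypothetical callables
    standing in for the two queryset .update() calls in the diff.
    """
    changed_start = time.time()
    changed_res = update_changed()
    failed_start = time.time()  # also marks the end of the first step
    failed_res = update_failed()
    failed_elapsed = time.time() - failed_start
    return (
        f'marked {changed_res} as changed in {failed_start - changed_start:.4f}s, '
        f'{failed_res} as failed in {failed_elapsed:.4f}s'
    )


print(propagate(lambda: 3, lambda: 1))
```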

View File

@@ -20,7 +20,7 @@ from solo.models import SingletonModel
# AWX
from awx import __version__ as awx_application_version
from awx.api.versioning import reverse
from awx.main.fields import JSONBlob, ImplicitRoleField
from awx.main.fields import ImplicitRoleField
from awx.main.managers import InstanceManager, UUID_DEFAULT
from awx.main.constants import JOB_FOLDER_PREFIX
from awx.main.models.base import BaseModel, HasEditsMixin, prevent_search
@@ -406,7 +406,7 @@ class InstanceGroup(HasPolicyEditsMixin, BaseModel, RelatedJobsMixin, ResourceMi
max_forks = models.IntegerField(default=0, help_text=_("Max forks to execute on this group. Zero means no limit."))
policy_instance_percentage = models.IntegerField(default=0, help_text=_("Percentage of Instances to automatically assign to this group"))
policy_instance_minimum = models.IntegerField(default=0, help_text=_("Static minimum number of Instances to automatically assign to this group"))
policy_instance_list = JSONBlob(
policy_instance_list = models.JSONField(
default=list, blank=True, help_text=_("List of exact-match Instances that will always be automatically assigned to this group")
)

View File

@@ -899,18 +899,18 @@ class HostMetric(models.Model):
last_automation_before = now() - dateutil.relativedelta.relativedelta(months=months_ago)
logger.info(f'Cleanup [HostMetric]: soft-deleting records last automated before {last_automation_before}')
logger.info(f'cleanup_host_metrics: soft-deleting records last automated before {last_automation_before}')
HostMetric.active_objects.filter(last_automation__lt=last_automation_before).update(
deleted=True, deleted_counter=models.F('deleted_counter') + 1, last_deleted=now()
)
settings.CLEANUP_HOST_METRICS_LAST_TS = now()
except (TypeError, ValueError):
logger.error(f"Cleanup [HostMetric]: months_ago({months_ago}) has to be a positive integer value")
logger.error(f"cleanup_host_metrics: months_ago({months_ago}) has to be a positive integer value")
class HostMetricSummaryMonthly(models.Model):
"""
HostMetric summaries computed by scheduled task <TODO> monthly
HostMetric summaries computed by scheduled task 'awx.main.tasks.system.host_metric_summary_monthly' monthly
"""
date = models.DateField(unique=True)
@@ -1623,6 +1623,7 @@ class rhv(PluginFileInjector):
collection = 'ovirt'
downstream_namespace = 'redhat'
downstream_collection = 'rhv'
use_fqcn = True
class satellite6(PluginFileInjector):

View File

@@ -883,7 +883,7 @@ class LaunchTimeConfigBase(BaseModel):
)
# All standard fields are stored in this dictionary field
# This is a solution to the nullable CharField problem, specific to prompting
char_prompts = JSONBlob(default=dict, blank=True)
char_prompts = models.JSONField(default=dict, blank=True)
# Define fields that are not really fields, but alias to char_prompts lookups
limit = NullablePromptPseudoField('limit')
@@ -960,7 +960,7 @@ class LaunchTimeConfig(LaunchTimeConfigBase):
# Special case prompting fields, even more special than the other ones
extra_data = JSONBlob(default=dict, blank=True)
survey_passwords = prevent_search(
JSONBlob(
models.JSONField(
default=dict,
editable=False,
blank=True,

View File

@@ -24,7 +24,7 @@ from awx.main.utils import parse_yaml_or_json, get_custom_venv_choices, get_lice
from awx.main.utils.execution_environments import get_default_execution_environment
from awx.main.utils.encryption import decrypt_value, get_encryption_key, is_encrypted
from awx.main.utils.polymorphic import build_polymorphic_ctypes_map
from awx.main.fields import AskForField, JSONBlob
from awx.main.fields import AskForField
from awx.main.constants import ACTIVE_STATES
@@ -103,7 +103,7 @@ class SurveyJobTemplateMixin(models.Model):
survey_enabled = models.BooleanField(
default=False,
)
survey_spec = prevent_search(JSONBlob(default=dict, blank=True))
survey_spec = prevent_search(models.JSONField(default=dict, blank=True))
ask_inventory_on_launch = AskForField(
blank=True,
@@ -392,7 +392,7 @@ class SurveyJobMixin(models.Model):
abstract = True
survey_passwords = prevent_search(
JSONBlob(
models.JSONField(
default=dict,
editable=False,
blank=True,

View File

@@ -17,7 +17,6 @@ from jinja2.exceptions import TemplateSyntaxError, UndefinedError, SecurityError
# AWX
from awx.api.versioning import reverse
from awx.main.fields import JSONBlob
from awx.main.models.base import CommonModelNameNotUnique, CreatedModifiedModel, prevent_search
from awx.main.utils import encrypt_field, decrypt_field, set_environ
from awx.main.notifications.email_backend import CustomEmailBackend
@@ -69,12 +68,12 @@ class NotificationTemplate(CommonModelNameNotUnique):
choices=NOTIFICATION_TYPE_CHOICES,
)
notification_configuration = prevent_search(JSONBlob(default=dict))
notification_configuration = prevent_search(models.JSONField(default=dict))
def default_messages():
return {'started': None, 'success': None, 'error': None, 'workflow_approval': None}
messages = JSONBlob(null=True, blank=True, default=default_messages, help_text=_('Optional custom messages for notification template.'))
messages = models.JSONField(null=True, blank=True, default=default_messages, help_text=_('Optional custom messages for notification template.'))
def has_message(self, condition):
potential_template = self.messages.get(condition, {})
@@ -236,7 +235,7 @@ class Notification(CreatedModifiedModel):
default='',
editable=False,
)
body = JSONBlob(default=dict, blank=True)
body = models.JSONField(default=dict, blank=True)
def get_absolute_url(self, request=None):
return reverse('api:notification_detail', kwargs={'pk': self.pk}, request=request)

View File

@@ -33,7 +33,7 @@ from awx.main.models.mixins import ResourceMixin, TaskManagerProjectUpdateMixin,
from awx.main.utils import update_scm_url, polymorphic
from awx.main.utils.ansible import skip_directory, could_be_inventory, could_be_playbook
from awx.main.utils.execution_environments import get_control_plane_execution_environment
from awx.main.fields import ImplicitRoleField, JSONBlob
from awx.main.fields import ImplicitRoleField
from awx.main.models.rbac import (
ROLE_SINGLETON_SYSTEM_ADMINISTRATOR,
ROLE_SINGLETON_SYSTEM_AUDITOR,
@@ -303,7 +303,7 @@ class Project(UnifiedJobTemplate, ProjectOptions, ResourceMixin, CustomVirtualEn
help_text=_('The last revision fetched by a project update'),
)
playbook_files = JSONBlob(
playbook_files = models.JSONField(
default=list,
blank=True,
editable=False,
@@ -311,7 +311,7 @@ class Project(UnifiedJobTemplate, ProjectOptions, ResourceMixin, CustomVirtualEn
help_text=_('List of playbooks found in the project'),
)
inventory_files = JSONBlob(
inventory_files = models.JSONField(
default=list,
blank=True,
editable=False,
@@ -479,7 +479,7 @@ class Project(UnifiedJobTemplate, ProjectOptions, ResourceMixin, CustomVirtualEn
RunProjectUpdate/RunInventoryUpdate.
"""
if self.status not in ('error', 'failed'):
if self.status not in ('error', 'failed') or self.scm_update_on_launch:
return None
latest_update = self.project_updates.last()

View File

@@ -55,7 +55,7 @@ from awx.main.utils import polymorphic
from awx.main.constants import ACTIVE_STATES, CAN_CANCEL, JOB_VARIABLE_PREFIXES
from awx.main.redact import UriCleaner, REPLACE_STR
from awx.main.consumers import emit_channel_notification
from awx.main.fields import AskForField, OrderedManyToManyField, JSONBlob
from awx.main.fields import AskForField, OrderedManyToManyField
__all__ = ['UnifiedJobTemplate', 'UnifiedJob', 'StdoutMaxBytesExceeded']
@@ -668,7 +668,7 @@ class UnifiedJob(
editable=False,
)
job_env = prevent_search(
JSONBlob(
models.JSONField(
default=dict,
blank=True,
editable=False,
@@ -1137,11 +1137,6 @@ class UnifiedJob(
if total > max_supported:
raise StdoutMaxBytesExceeded(total, max_supported)
# psycopg2's copy_expert writes bytes, but callers of this
# function assume a str-based fd will be returned; decode
# .write() calls on the fly to maintain this interface
_write = fd.write
fd.write = lambda s: _write(smart_str(s))
tbl = self._meta.db_table + 'event'
created_by_cond = ''
if self.has_unpartitioned_events:
@@ -1150,7 +1145,12 @@ class UnifiedJob(
created_by_cond = f"job_created='{self.created.isoformat()}' AND "
sql = f"copy (select stdout from {tbl} where {created_by_cond}{self.event_parent_key}={self.id} and stdout != '' order by start_line) to stdout" # nosql
cursor.copy_expert(sql, fd)
# psycopg3's copy writes bytes, but callers of this
# function assume a str-based fd will be returned; decode
# .write() calls on the fly to maintain this interface
with cursor.copy(sql) as copy:
while data := copy.read():
fd.write(smart_str(bytes(data)))
if hasattr(fd, 'name'):
# If we're dealing with a physical file, use `sed` to clean
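
Note on the `cursor.copy(sql)` loop above: it reads chunks until `read()` returns an empty value and decodes each chunk before writing, so callers still receive a str-based file object. The loop can be exercised without a database, as sketched below. `FakeCopy` is a hypothetical stand-in for the object yielded by `cursor.copy()`, and plain `.decode('utf-8')` is used in place of Django's `smart_str`; both are assumptions made to keep the example self-contained.

```python
import io


class FakeCopy:
    """Stand-in for the object yielded by psycopg's cursor.copy().

    Hypothetical test double: read() hands back one chunk per call and an
    empty bytes object once the data is exhausted, which ends the loop.
    """

    def __init__(self, chunks):
        self._chunks = list(chunks)

    def __enter__(self):
        return self

    def __exit__(self, *exc):
        return False

    def read(self):
        return memoryview(self._chunks.pop(0)) if self._chunks else b''


def stream_to_text(copy_cm, fd):
    """Drain a copy-like object into a text file object, decoding each chunk."""
    with copy_cm as copy:
        while data := copy.read():
            fd.write(bytes(data).decode('utf-8'))
    return fd


out = stream_to_text(FakeCopy([b'line 1\n', b'line 2\n']), io.StringIO())
print(out.getvalue(), end='')
```

Decoding chunk by chunk assumes that no multi-byte character is split across two reads; the sketch does not guard against that case.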

View File

@@ -661,7 +661,11 @@ class WorkflowJob(UnifiedJob, WorkflowJobOptions, SurveyJobMixin, JobNotificatio
@property
def event_processing_finished(self):
return True
return True # workflow jobs do not have events
@property
def has_unpartitioned_events(self):
return False # workflow jobs do not have events
def _get_parent_field_name(self):
if self.job_template_id:
@@ -914,7 +918,11 @@ class WorkflowApproval(UnifiedJob, JobNotificationMixin):
@property
def event_processing_finished(self):
return True
return True # approval jobs do not have events
@property
def has_unpartitioned_events(self):
return False # approval jobs do not have events
def send_approval_notification(self, approval_status):
from awx.main.tasks.system import send_notifications # avoid circular import

View File

@@ -3,8 +3,6 @@
from django.db.models.signals import pre_save, post_save, pre_delete, m2m_changed
from taggit.managers import TaggableManager
class ActivityStreamRegistrar(object):
def __init__(self):
@@ -21,8 +19,6 @@ class ActivityStreamRegistrar(object):
pre_delete.connect(activity_stream_delete, sender=model, dispatch_uid=str(self.__class__) + str(model) + "_delete")
for m2mfield in model._meta.many_to_many:
if isinstance(m2mfield, TaggableManager):
continue # Special case for taggit app
try:
m2m_attr = getattr(model, m2mfield.name)
m2m_changed.connect(

View File

@@ -25,7 +25,6 @@ from awx.main.models import (
InventoryUpdate,
Job,
Project,
ProjectUpdate,
UnifiedJob,
WorkflowApproval,
WorkflowJob,
@@ -102,27 +101,33 @@ class TaskBase:
def record_aggregate_metrics(self, *args):
if not is_testing():
# increment task_manager_schedule_calls regardless if the other
# metrics are recorded
s_metrics.Metrics(auto_pipe_execute=True).inc(f"{self.prefix}__schedule_calls", 1)
# Only record metrics if the last time recording was more
# than SUBSYSTEM_METRICS_TASK_MANAGER_RECORD_INTERVAL ago.
# Prevents a short-duration task manager that runs directly after a
# long task manager to override useful metrics.
current_time = time.time()
time_last_recorded = current_time - self.subsystem_metrics.decode(f"{self.prefix}_recorded_timestamp")
if time_last_recorded > settings.SUBSYSTEM_METRICS_TASK_MANAGER_RECORD_INTERVAL:
logger.debug(f"recording {self.prefix} metrics, last recorded {time_last_recorded} seconds ago")
self.subsystem_metrics.set(f"{self.prefix}_recorded_timestamp", current_time)
self.subsystem_metrics.pipe_execute()
else:
logger.debug(f"skipping recording {self.prefix} metrics, last recorded {time_last_recorded} seconds ago")
try:
# increment task_manager_schedule_calls regardless if the other
# metrics are recorded
s_metrics.Metrics(auto_pipe_execute=True).inc(f"{self.prefix}__schedule_calls", 1)
# Only record metrics if the last time recording was more
# than SUBSYSTEM_METRICS_TASK_MANAGER_RECORD_INTERVAL ago.
# Prevents a short-duration task manager that runs directly after a
# long task manager to override useful metrics.
current_time = time.time()
time_last_recorded = current_time - self.subsystem_metrics.decode(f"{self.prefix}_recorded_timestamp")
if time_last_recorded > settings.SUBSYSTEM_METRICS_TASK_MANAGER_RECORD_INTERVAL:
logger.debug(f"recording {self.prefix} metrics, last recorded {time_last_recorded} seconds ago")
self.subsystem_metrics.set(f"{self.prefix}_recorded_timestamp", current_time)
self.subsystem_metrics.pipe_execute()
else:
logger.debug(f"skipping recording {self.prefix} metrics, last recorded {time_last_recorded} seconds ago")
except Exception:
logger.exception(f"Error saving metrics for {self.prefix}")
def record_aggregate_metrics_and_exit(self, *args):
self.record_aggregate_metrics()
sys.exit(1)
def schedule(self):
# Always be able to restore the original signal handler if we finish
original_sigusr1 = signal.getsignal(signal.SIGUSR1)
# Lock
with task_manager_bulk_reschedule():
with advisory_lock(f"{self.prefix}_lock", wait=False) as acquired:
@@ -131,9 +136,14 @@ class TaskBase:
logger.debug(f"Not running {self.prefix} scheduler, another task holds lock")
return
logger.debug(f"Starting {self.prefix} Scheduler")
# if sigterm due to timeout, still record metrics
signal.signal(signal.SIGTERM, self.record_aggregate_metrics_and_exit)
self._schedule()
# if sigusr1 due to timeout, still record metrics
signal.signal(signal.SIGUSR1, self.record_aggregate_metrics_and_exit)
try:
self._schedule()
finally:
# Reset the signal handler back to the default just in case anything
# else uses the same signal for other purposes
signal.signal(signal.SIGUSR1, original_sigusr1)
commit_start = time.time()
if self.prefix == "task_manager":
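Note on the hunk above: it saves the current SIGUSR1 handler, installs a temporary one for the duration of `_schedule()`, and restores the saved handler in a `finally` block. A minimal standalone sketch of that save/install/restore pattern follows. The names `run_with_temporary_handler` and `_on_timeout` are hypothetical and not part of AWX, and the sketch assumes a POSIX platform where `signal.SIGUSR1` exists and that it runs in the main thread.

```python
import signal


def run_with_temporary_handler(work, handler, signum=signal.SIGUSR1):
    """Install `handler` for `signum` while `work()` runs, then restore the old one."""
    original = signal.getsignal(signum)  # remember whatever was installed before
    signal.signal(signum, handler)
    try:
        return work()
    finally:
        # Runs on normal return and on exceptions alike (including SystemExit)
        signal.signal(signum, original)


def _on_timeout(signum, frame):
    # Hypothetical stand-in for record_aggregate_metrics_and_exit
    raise SystemExit(1)


before = signal.getsignal(signal.SIGUSR1)
result = run_with_temporary_handler(lambda: "scheduled", _on_timeout)
after = signal.getsignal(signal.SIGUSR1)
```

Restoring in `finally` means other code that relies on the same signal sees its own handler again even when the wrapped call raises.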
@@ -154,7 +164,6 @@ class WorkflowManager(TaskBase):
logger.warning("Workflow manager has reached time out while processing running workflows, exiting loop early")
ScheduleWorkflowManager().schedule()
# Do not process any more workflow jobs. Stop here.
# Maybe we should schedule another WorkflowManager run
break
dag = WorkflowDAG(workflow_job)
status_changed = False
@@ -169,8 +178,8 @@ class WorkflowManager(TaskBase):
workflow_job.save(update_fields=['status', 'start_args'])
status_changed = True
else:
workflow_nodes = dag.mark_dnr_nodes()
WorkflowJobNode.objects.bulk_update(workflow_nodes, ['do_not_run'])
dnr_nodes = dag.mark_dnr_nodes()
WorkflowJobNode.objects.bulk_update(dnr_nodes, ['do_not_run'])
# If workflow is now done, we do special things to mark it as done.
is_done = dag.is_workflow_done()
if is_done:
@@ -250,6 +259,7 @@ class WorkflowManager(TaskBase):
job.status = 'failed'
job.save(update_fields=['status', 'job_explanation'])
job.websocket_emit_status('failed')
ScheduleWorkflowManager().schedule()
# TODO: should we emit a status on the socket here similar to tasks.py awx_periodic_scheduler() ?
# emit_websocket_notification('/socket.io/jobs', '', dict(id=))
@@ -270,184 +280,115 @@ class WorkflowManager(TaskBase):
class DependencyManager(TaskBase):
def __init__(self):
super().__init__(prefix="dependency_manager")
self.all_projects = {}
self.all_inventory_sources = {}
def create_project_update(self, task, project_id=None):
if project_id is None:
project_id = task.project_id
project_task = Project.objects.get(id=project_id).create_project_update(_eager_fields=dict(launch_type='dependency'))
# Project created 1 second behind
project_task.created = task.created - timedelta(seconds=1)
project_task.status = 'pending'
project_task.save()
logger.debug('Spawned {} as dependency of {}'.format(project_task.log_format, task.log_format))
return project_task
def create_inventory_update(self, task, inventory_source_task):
inventory_task = InventorySource.objects.get(id=inventory_source_task.id).create_inventory_update(_eager_fields=dict(launch_type='dependency'))
inventory_task.created = task.created - timedelta(seconds=2)
inventory_task.status = 'pending'
inventory_task.save()
logger.debug('Spawned {} as dependency of {}'.format(inventory_task.log_format, task.log_format))
return inventory_task
def add_dependencies(self, task, dependencies):
with disable_activity_stream():
task.dependent_jobs.add(*dependencies)
def get_inventory_source_tasks(self):
def cache_projects_and_sources(self, task_list):
project_ids = set()
inventory_ids = set()
for task in self.all_tasks:
for task in task_list:
if isinstance(task, Job):
inventory_ids.add(task.inventory_id)
self.all_inventory_sources = [invsrc for invsrc in InventorySource.objects.filter(inventory_id__in=inventory_ids, update_on_launch=True)]
if task.project_id:
project_ids.add(task.project_id)
if task.inventory_id:
inventory_ids.add(task.inventory_id)
elif isinstance(task, InventoryUpdate):
if task.inventory_source and task.inventory_source.source_project_id:
project_ids.add(task.inventory_source.source_project_id)
def get_latest_inventory_update(self, inventory_source):
latest_inventory_update = InventoryUpdate.objects.filter(inventory_source=inventory_source).order_by("-created")
if not latest_inventory_update.exists():
return None
return latest_inventory_update.first()
for proj in Project.objects.filter(id__in=project_ids, scm_update_on_launch=True):
self.all_projects[proj.id] = proj
def should_update_inventory_source(self, job, latest_inventory_update):
now = tz_now()
for invsrc in InventorySource.objects.filter(inventory_id__in=inventory_ids, update_on_launch=True):
self.all_inventory_sources.setdefault(invsrc.inventory_id, [])
self.all_inventory_sources[invsrc.inventory_id].append(invsrc)
if latest_inventory_update is None:
@staticmethod
def should_update_again(update, cache_timeout):
'''
If it has never updated, we need to update
If there is already an update in progress then we do not need to create a new one
If the last update failed, we always need to try and update again
If current time is more than cache_timeout after last update, then we need a new one
'''
if (update is None) or (update.status in ['failed', 'canceled', 'error']):
return True
'''
If there's already an inventory update utilizing this job that's about to run
then we don't need to create one
'''
if latest_inventory_update.status in ['waiting', 'pending', 'running']:
if update.status in ['waiting', 'pending', 'running']:
return False
timeout_seconds = timedelta(seconds=latest_inventory_update.inventory_source.update_cache_timeout)
if (latest_inventory_update.finished + timeout_seconds) < now:
return True
if latest_inventory_update.inventory_source.update_on_launch is True and latest_inventory_update.status in ['failed', 'canceled', 'error']:
return True
return False
return bool(((update.finished + timedelta(seconds=cache_timeout))) < tz_now())
def get_latest_project_update(self, project_id):
latest_project_update = ProjectUpdate.objects.filter(project=project_id, job_type='check').order_by("-created")
if not latest_project_update.exists():
return None
return latest_project_update.first()
def should_update_related_project(self, job, latest_project_update):
now = tz_now()
if latest_project_update is None:
return True
if latest_project_update.status in ['failed', 'canceled']:
return True
'''
If there's already a project update utilizing this job that's about to run
then we don't need to create one
'''
if latest_project_update.status in ['waiting', 'pending', 'running']:
return False
'''
If the latest project update has a created time == job_created_time-1
then consider the project update found. This is so we don't enter an infinite loop
of updating the project when cache timeout is 0.
'''
if (
latest_project_update.project.scm_update_cache_timeout == 0
and latest_project_update.launch_type == 'dependency'
and latest_project_update.created == job.created - timedelta(seconds=1)
):
return False
'''
Normal Cache Timeout Logic
'''
timeout_seconds = timedelta(seconds=latest_project_update.project.scm_update_cache_timeout)
if (latest_project_update.finished + timeout_seconds) < now:
return True
return False
def get_or_create_project_update(self, project_id):
project = self.all_projects.get(project_id, None)
if project is not None:
latest_project_update = project.project_updates.filter(job_type='check').order_by("-created").first()
if self.should_update_again(latest_project_update, project.scm_update_cache_timeout):
project_task = project.create_project_update(_eager_fields=dict(launch_type='dependency'))
project_task.signal_start()
return [project_task]
else:
return [latest_project_update]
return []
def gen_dep_for_job(self, task):
created_dependencies = []
dependencies = []
# TODO: Can remove task.project None check after scan-job-default-playbook is removed
if task.project is not None and task.project.scm_update_on_launch is True:
latest_project_update = self.get_latest_project_update(task.project_id)
if self.should_update_related_project(task, latest_project_update):
latest_project_update = self.create_project_update(task)
created_dependencies.append(latest_project_update)
dependencies.append(latest_project_update)
dependencies = self.get_or_create_project_update(task.project_id)
# Inventory created 2 seconds behind job
try:
start_args = json.loads(decrypt_field(task, field_name="start_args"))
except ValueError:
start_args = dict()
# generator for inventory sources related to this task
task_inv_sources = (invsrc for invsrc in self.all_inventory_sources if invsrc.inventory_id == task.inventory_id)
for inventory_source in task_inv_sources:
# generator for update-on-launch inventory sources related to this task
for inventory_source in self.all_inventory_sources.get(task.inventory_id, []):
if "inventory_sources_already_updated" in start_args and inventory_source.id in start_args['inventory_sources_already_updated']:
continue
if not inventory_source.update_on_launch:
continue
latest_inventory_update = self.get_latest_inventory_update(inventory_source)
if self.should_update_inventory_source(task, latest_inventory_update):
inventory_task = self.create_inventory_update(task, inventory_source)
created_dependencies.append(inventory_task)
latest_inventory_update = inventory_source.inventory_updates.order_by("-created").first()
if self.should_update_again(latest_inventory_update, inventory_source.update_cache_timeout):
inventory_task = inventory_source.create_inventory_update(_eager_fields=dict(launch_type='dependency'))
inventory_task.signal_start()
dependencies.append(inventory_task)
else:
dependencies.append(latest_inventory_update)
if dependencies:
self.add_dependencies(task, dependencies)
return created_dependencies
return dependencies
def gen_dep_for_inventory_update(self, inventory_task):
created_dependencies = []
if inventory_task.source == "scm":
invsrc = inventory_task.inventory_source
if not invsrc.source_project.scm_update_on_launch:
return created_dependencies
latest_src_project_update = self.get_latest_project_update(invsrc.source_project_id)
if self.should_update_related_project(inventory_task, latest_src_project_update):
latest_src_project_update = self.create_project_update(inventory_task, project_id=invsrc.source_project_id)
created_dependencies.append(latest_src_project_update)
self.add_dependencies(inventory_task, [latest_src_project_update])
latest_src_project_update.scm_inventory_updates.add(inventory_task)
return created_dependencies
if invsrc:
return self.get_or_create_project_update(invsrc.source_project_id)
return []
@timeit
def generate_dependencies(self, undeped_tasks):
created_dependencies = []
dependencies = []
self.cache_projects_and_sources(undeped_tasks)
for task in undeped_tasks:
task.log_lifecycle("acknowledged")
if type(task) is Job:
created_dependencies += self.gen_dep_for_job(task)
job_deps = self.gen_dep_for_job(task)
elif type(task) is InventoryUpdate:
created_dependencies += self.gen_dep_for_inventory_update(task)
job_deps = self.gen_dep_for_inventory_update(task)
else:
continue
if job_deps:
dependencies += job_deps
with disable_activity_stream():
task.dependent_jobs.add(*dependencies)
logger.debug(f'Linked {[dep.log_format for dep in dependencies]} as dependencies of {task.log_format}')
UnifiedJob.objects.filter(pk__in=[task.pk for task in undeped_tasks]).update(dependencies_processed=True)
return created_dependencies
def process_tasks(self):
deps = self.generate_dependencies(self.all_tasks)
self.generate_dependencies(deps)
self.subsystem_metrics.inc(f"{self.prefix}_pending_processed", len(self.all_tasks) + len(deps))
return dependencies
@timeit
def _schedule(self):
self.get_tasks(dict(status__in=["pending"], dependencies_processed=False))
if len(self.all_tasks) > 0:
self.get_inventory_source_tasks()
self.process_tasks()
deps = self.generate_dependencies(self.all_tasks)
undeped_deps = [dep for dep in deps if dep.dependencies_processed is False]
self.generate_dependencies(undeped_deps)
self.subsystem_metrics.inc(f"{self.prefix}_pending_processed", len(self.all_tasks) + len(undeped_deps))
ScheduleTaskManager().schedule()
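Note on `should_update_again` in the diff above: the new static method replaces the separate project and inventory checks with one rule set. The sketch below restates those rules outside Django so they can be exercised directly. It is an approximation under stated assumptions: `SimpleNamespace` objects stand in for update models, and the `now` parameter is added here only to make the result reproducible (the diff calls `tz_now()` instead).

```python
from datetime import datetime, timedelta, timezone
from types import SimpleNamespace


def should_update_again(update, cache_timeout, now=None):
    """Decide whether a fresh update is needed, following the rules in the diff."""
    now = now or datetime.now(timezone.utc)
    # Never updated, or the last attempt did not succeed: always try again
    if update is None or update.status in ("failed", "canceled", "error"):
        return True
    # An update is already queued or running: reuse it
    if update.status in ("waiting", "pending", "running"):
        return False
    # Otherwise update only once the cache timeout has elapsed
    return update.finished + timedelta(seconds=cache_timeout) < now


NOW = datetime(2023, 8, 11, 12, 0, tzinfo=timezone.utc)
recent = SimpleNamespace(status="successful", finished=NOW - timedelta(seconds=10))
running = SimpleNamespace(status="running", finished=None)
failed = SimpleNamespace(status="failed", finished=NOW - timedelta(seconds=10))
```

With a 60 second timeout the `recent` update is reused; with a 5 second timeout, or a timeout of 0, a new update is requested.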


@@ -1 +1 @@
from . import jobs, receptor, system # noqa
from . import host_metrics, jobs, receptor, system # noqa


@@ -31,6 +31,7 @@ class RunnerCallback:
self.model = model
self.update_attempts = int(settings.DISPATCHER_DB_DOWNTOWN_TOLLERANCE / 5)
self.wrapup_event_dispatched = False
self.artifacts_processed = False
self.extra_update_fields = {}
def update_model(self, pk, _attempt=0, **updates):
@@ -211,6 +212,9 @@ class RunnerCallback:
if result_traceback:
self.delay_update(result_traceback=result_traceback)
def artifacts_handler(self, artifact_dir):
self.artifacts_processed = True
class RunnerCallbackForProjectUpdate(RunnerCallback):
def __init__(self, *args, **kwargs):


@@ -9,6 +9,7 @@ from django.conf import settings
from django.db.models.query import QuerySet
from django.utils.encoding import smart_str
from django.utils.timezone import now
from django.db import OperationalError
# AWX
from awx.main.utils.common import log_excess_runtime
@@ -57,6 +58,28 @@ def start_fact_cache(hosts, destination, log_data, timeout=None, inventory_id=No
return None
def raw_update_hosts(host_list):
Host.objects.bulk_update(host_list, ['ansible_facts', 'ansible_facts_modified'])
def update_hosts(host_list, max_tries=5):
if not host_list:
return
for i in range(max_tries):
try:
raw_update_hosts(host_list)
except OperationalError as exc:
# Deadlocks can happen if this runs at the same time as another large query
# inventory updates and updating last_job_host_summary are candidates for conflict
# but these would resolve easily on a retry
if i + 1 < max_tries:
logger.info(f'OperationalError (suspected deadlock) saving host facts retry {i}, message: {exc}')
continue
else:
raise
break
@log_excess_runtime(
logger,
debug_cutoff=0.01,
@@ -111,7 +134,6 @@ def finish_fact_cache(hosts, destination, facts_write_time, log_data, job_id=Non
system_tracking_logger.info('Facts cleared for inventory {} host {}'.format(smart_str(host.inventory.name), smart_str(host.name)))
log_data['cleared_ct'] += 1
if len(hosts_to_update) > 100:
Host.objects.bulk_update(hosts_to_update, ['ansible_facts', 'ansible_facts_modified'])
update_hosts(hosts_to_update)
hosts_to_update = []
if hosts_to_update:
Host.objects.bulk_update(hosts_to_update, ['ansible_facts', 'ansible_facts_modified'])
update_hosts(hosts_to_update)
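Note on `update_hosts` in the diff above: it retries the bulk update up to `max_tries` times and re-raises on the final attempt. A minimal sketch of the same bounded-retry shape without Django follows. `TransientError` is a hypothetical stand-in for `django.db.OperationalError`, and `call_with_retries` is an illustrative name, not an AWX function. Like the diff, it retries immediately with no backoff.

```python
class TransientError(Exception):
    """Hypothetical stand-in for django.db.OperationalError."""


def call_with_retries(func, max_tries=5):
    """Call func(); retry on TransientError, re-raising after the last attempt."""
    for attempt in range(max_tries):
        try:
            return func()
        except TransientError:
            if attempt + 1 >= max_tries:
                raise  # out of attempts: surface the original error
            # otherwise fall through and try again


calls = {"flaky": 0, "broken": 0}


def flaky():
    # Fails twice, then succeeds
    calls["flaky"] += 1
    if calls["flaky"] < 3:
        raise TransientError("suspected deadlock")
    return "saved"


def broken():
    # Always fails
    calls["broken"] += 1
    raise TransientError("suspected deadlock")


flaky_result = call_with_retries(flaky)
try:
    call_with_retries(broken, max_tries=4)
    broken_raised = False
except TransientError:
    broken_raised = True
```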


@@ -0,0 +1,205 @@
import datetime
from dateutil.relativedelta import relativedelta
import logging
from django.conf import settings
from django.db.models import Count
from django.db.models.functions import TruncMonth
from django.utils.timezone import now
from rest_framework.fields import DateTimeField
from awx.main.dispatch import get_task_queuename
from awx.main.dispatch.publish import task
from awx.main.models.inventory import HostMetric, HostMetricSummaryMonthly
from awx.conf.license import get_license
logger = logging.getLogger('awx.main.tasks.host_metric_summary_monthly')
@task(queue=get_task_queuename)
def host_metric_summary_monthly():
"""Run cleanup host metrics summary monthly task each week"""
if _is_run_threshold_reached(
getattr(settings, 'HOST_METRIC_SUMMARY_TASK_LAST_TS', None), getattr(settings, 'HOST_METRIC_SUMMARY_TASK_INTERVAL', 7) * 86400
):
logger.info(f"Executing host_metric_summary_monthly, last ran at {getattr(settings, 'HOST_METRIC_SUMMARY_TASK_LAST_TS', '---')}")
HostMetricSummaryMonthlyTask().execute()
logger.info("Finished host_metric_summary_monthly")
def _is_run_threshold_reached(setting, threshold_seconds):
last_time = DateTimeField().to_internal_value(setting) if setting else DateTimeField().to_internal_value('1970-01-01')
return (now() - last_time).total_seconds() > threshold_seconds
class HostMetricSummaryMonthlyTask:
"""
This task computes last [threshold] months of HostMetricSummaryMonthly table
[threshold] is setting CLEANUP_HOST_METRICS_HARD_THRESHOLD
Each record in the table represents changes in HostMetric table in one month
It always overrides all the months newer than <threshold>, never updates older months
Algorithm:
- hosts_added are HostMetric records with first_automation in given month
- hosts_deleted are HostMetric records with deleted=True and last_deleted in given month
- - HostMetrics soft-deleted before <threshold> also increase hosts_deleted in their last_deleted month
- license_consumed is license_consumed(previous month) + hosts_added - hosts_deleted
- - license_consumed for HostMetricSummaryMonthly.date < [threshold] is computed also from
all HostMetrics.first_automation < [threshold]
- license_capacity is set only for current month, and it's never updated (value taken from current subscription)
"""
def __init__(self):
self.host_metrics = {}
self.processed_month = self._get_first_month()
self.existing_summaries = None
self.existing_summaries_idx = 0
self.existing_summaries_cnt = 0
self.records_to_create = []
self.records_to_update = []
def execute(self):
self._load_existing_summaries()
self._load_hosts_added()
self._load_hosts_deleted()
# Get first month after last hard delete
month = self._get_first_month()
license_consumed = self._get_license_consumed_before(month)
# Fill record for each month
while month <= datetime.date.today().replace(day=1):
summary = self._find_or_create_summary(month)
# Update summary and update license_consumed by hosts added/removed this month
self._update_summary(summary, month, license_consumed)
license_consumed = summary.license_consumed
month = month + relativedelta(months=1)
# Create/Update stats
HostMetricSummaryMonthly.objects.bulk_create(self.records_to_create, batch_size=1000)
HostMetricSummaryMonthly.objects.bulk_update(self.records_to_update, ['license_consumed', 'hosts_added', 'hosts_deleted'], batch_size=1000)
# Set timestamp of last run
settings.HOST_METRIC_SUMMARY_TASK_LAST_TS = now()
def _get_license_consumed_before(self, month):
license_consumed = 0
for metric_month, metric in self.host_metrics.items():
if metric_month < month:
hosts_added = metric.get('hosts_added', 0)
hosts_deleted = metric.get('hosts_deleted', 0)
license_consumed = license_consumed + hosts_added - hosts_deleted
else:
break
return license_consumed
def _load_existing_summaries(self):
"""Find all summaries newer than host metrics delete threshold"""
self.existing_summaries = HostMetricSummaryMonthly.objects.filter(date__gte=self._get_first_month()).order_by('date')
self.existing_summaries_idx = 0
self.existing_summaries_cnt = len(self.existing_summaries)
def _load_hosts_added(self):
"""Aggregates hosts added each month, by the 'first_automation' timestamp"""
#
# -- SQL translation (for better code readability)
# SELECT date_trunc('month', first_automation) as month,
# count(first_automation) AS hosts_added
# FROM main_hostmetric
# GROUP BY month
# ORDER by month;
result = (
HostMetric.objects.annotate(month=TruncMonth('first_automation'))
.values('month')
.annotate(hosts_added=Count('first_automation'))
.values('month', 'hosts_added')
.order_by('month')
)
for host_metric in list(result):
month = host_metric['month']
if month:
beginning_of_month = datetime.date(month.year, month.month, 1)
if self.host_metrics.get(beginning_of_month) is None:
self.host_metrics[beginning_of_month] = {}
self.host_metrics[beginning_of_month]['hosts_added'] = host_metric['hosts_added']
def _load_hosts_deleted(self):
"""
Aggregates hosts deleted each month, by the 'last_deleted' timestamp.
Host metrics have to be deleted NOW to be counted as deleted before
(by intention - statistics can change retrospectively by re-automation of previously deleted host)
"""
#
# -- SQL translation (for better code readability)
# SELECT date_trunc('month', last_deleted) as month,
# count(last_deleted) AS hosts_deleted
# FROM main_hostmetric
# WHERE deleted = True
# GROUP BY 1 # equal to "GROUP BY month"
# ORDER by month;
result = (
HostMetric.objects.annotate(month=TruncMonth('last_deleted'))
.values('month')
.annotate(hosts_deleted=Count('last_deleted'))
.values('month', 'hosts_deleted')
.filter(deleted=True)
.order_by('month')
)
for host_metric in list(result):
month = host_metric['month']
if month:
beginning_of_month = datetime.date(month.year, month.month, 1)
if self.host_metrics.get(beginning_of_month) is None:
self.host_metrics[beginning_of_month] = {}
self.host_metrics[beginning_of_month]['hosts_deleted'] = host_metric['hosts_deleted']
def _find_or_create_summary(self, month):
summary = self._find_summary(month)
if not summary:
summary = HostMetricSummaryMonthly(date=month)
self.records_to_create.append(summary)
else:
self.records_to_update.append(summary)
return summary
def _find_summary(self, month):
"""
Existing summaries are ordered by month ASC.
This method is called with month in ascending order too => a single traversal is enough
"""
summary = None
while not summary and self.existing_summaries_idx < self.existing_summaries_cnt:
tmp = self.existing_summaries[self.existing_summaries_idx]
if tmp.date < month:
self.existing_summaries_idx += 1
elif tmp.date == month:
summary = tmp
elif tmp.date > month:
break
return summary
def _update_summary(self, summary, month, license_consumed):
"""Updates the metric with hosts added and deleted and set license info for current month"""
# Get month counts from host metrics, zero if not found
hosts_added, hosts_deleted = 0, 0
if metric := self.host_metrics.get(month, None):
hosts_added = metric.get('hosts_added', 0)
hosts_deleted = metric.get('hosts_deleted', 0)
summary.license_consumed = license_consumed + hosts_added - hosts_deleted
summary.hosts_added = hosts_added
summary.hosts_deleted = hosts_deleted
# Set subscription count for current month
if month == datetime.date.today().replace(day=1):
license_info = get_license()
summary.license_capacity = license_info.get('instance_count', 0)
return summary
@staticmethod
def _get_first_month():
"""Returns first month after host metrics hard delete threshold"""
threshold = getattr(settings, 'CLEANUP_HOST_METRICS_HARD_THRESHOLD', 36)
return datetime.date.today().replace(day=1) - relativedelta(months=int(threshold) - 1)
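Note on the algorithm in the `HostMetricSummaryMonthlyTask` docstring above: `license_consumed` for a month is the previous month's value plus `hosts_added` minus `hosts_deleted`, seeded from all months before the first reported one. The sketch below works that running total through on plain dictionaries. `summarize` and `next_month` are hypothetical helpers, not part of the diff, and the data is made up for illustration.

```python
import datetime


def next_month(day):
    """Return the first day of the month following `day`."""
    if day.month == 12:
        return datetime.date(day.year + 1, 1, 1)
    return datetime.date(day.year, day.month + 1, 1)


def summarize(host_metrics, first_month, last_month):
    """Return (month, hosts_added, hosts_deleted, license_consumed) per month."""
    # Baseline: net hosts from every month before the reporting window
    consumed = sum(
        m.get("hosts_added", 0) - m.get("hosts_deleted", 0)
        for month, m in host_metrics.items()
        if month < first_month
    )
    rows = []
    month = first_month
    while month <= last_month:
        m = host_metrics.get(month, {})
        added = m.get("hosts_added", 0)
        deleted = m.get("hosts_deleted", 0)
        consumed += added - deleted  # carry the running total forward
        rows.append((month, added, deleted, consumed))
        month = next_month(month)
    return rows


D = datetime.date
metrics = {
    D(2023, 1, 1): {"hosts_added": 5},
    D(2023, 3, 1): {"hosts_added": 2, "hosts_deleted": 1},
    D(2023, 4, 1): {"hosts_deleted": 3},
}
rows = summarize(metrics, D(2023, 3, 1), D(2023, 5, 1))
```

The baseline here sums over every earlier month rather than stopping at the first later key, so the sketch does not depend on dictionary ordering.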


@@ -290,13 +290,6 @@ class BaseTask(object):
content = safe_dump(vars, safe_dict)
return self.write_private_data_file(private_data_dir, 'extravars', content, sub_dir='env')
def add_awx_venv(self, env):
env['VIRTUAL_ENV'] = settings.AWX_VENV_PATH
if 'PATH' in env:
env['PATH'] = os.path.join(settings.AWX_VENV_PATH, "bin") + ":" + env['PATH']
else:
env['PATH'] = os.path.join(settings.AWX_VENV_PATH, "bin")
def build_env(self, instance, private_data_dir, private_data_files=None):
"""
Build environment dictionary for ansible-playbook.
@@ -926,6 +919,7 @@ class RunJob(SourceControlMixin, BaseTask):
path_vars = (
('ANSIBLE_COLLECTIONS_PATHS', 'collections_paths', 'requirements_collections', '~/.ansible/collections:/usr/share/ansible/collections'),
('ANSIBLE_ROLES_PATH', 'roles_path', 'requirements_roles', '~/.ansible/roles:/usr/share/ansible/roles:/etc/ansible/roles'),
('ANSIBLE_COLLECTIONS_PATH', 'collections_path', 'requirements_collections', '~/.ansible/collections:/usr/share/ansible/collections'),
)
config_values = read_ansible_config(os.path.join(private_data_dir, 'project'), list(map(lambda x: x[1], path_vars)))
@@ -1100,7 +1094,7 @@ class RunJob(SourceControlMixin, BaseTask):
# actual `run()` call; this _usually_ means something failed in
# the pre_run_hook method
return
if self.should_use_fact_cache():
if self.should_use_fact_cache() and self.runner_callback.artifacts_processed:
job.log_lifecycle("finish_job_fact_cache")
finish_fact_cache(
job.get_hosts_for_fact_cache(),
@@ -1268,7 +1262,7 @@ class RunProjectUpdate(BaseTask):
galaxy_creds_are_defined = project_update.project.organization and project_update.project.organization.galaxy_credentials.exists()
if not galaxy_creds_are_defined and (settings.AWX_ROLES_ENABLED or settings.AWX_COLLECTIONS_ENABLED):
logger.warning('Galaxy role/collection syncing is enabled, but no ' f'credentials are configured for {project_update.project.organization}.')
logger.warning(f'Galaxy role/collection syncing is enabled, but no credentials are configured for {project_update.project.organization}.')
extra_vars.update(
{


@@ -464,6 +464,7 @@ class AWXReceptorJob:
event_handler=self.task.runner_callback.event_handler,
finished_callback=self.task.runner_callback.finished_callback,
status_handler=self.task.runner_callback.status_handler,
artifacts_handler=self.task.runner_callback.artifacts_handler,
**self.runner_params,
)
@@ -639,11 +640,11 @@ class AWXReceptorJob:
#
RECEPTOR_CONFIG_STARTER = (
{'local-only': None},
{'log-level': 'info'},
{'log-level': settings.RECEPTOR_LOG_LEVEL},
{'node': {'firewallrules': [{'action': 'reject', 'tonode': settings.CLUSTER_HOST_ID, 'toservice': 'control'}]}},
{'control-service': {'service': 'control', 'filename': '/var/run/receptor/receptor.sock', 'permissions': '0660'}},
{'work-command': {'worktype': 'local', 'command': 'ansible-runner', 'params': 'worker', 'allowruntimeparams': True}},
{'work-signing': {'privatekey': '/etc/receptor/signing/work-private-key.pem', 'tokenexpiration': '1m'}},
{'work-signing': {'privatekey': '/etc/receptor/work_private_key.pem', 'tokenexpiration': '1m'}},
{
'work-kubernetes': {
'worktype': 'kubernetes-runtime-auth',
@@ -665,7 +666,7 @@ RECEPTOR_CONFIG_STARTER = (
{
'tls-client': {
'name': 'tlsclient',
'rootcas': '/etc/receptor/tls/ca/receptor-ca.crt',
'rootcas': '/etc/receptor/tls/ca/mesh-CA.crt',
'cert': '/etc/receptor/tls/receptor.crt',
'key': '/etc/receptor/tls/receptor.key',
'mintls13': False,


@@ -2,6 +2,7 @@
from collections import namedtuple
import functools
import importlib
import itertools
import json
import logging
import os
@@ -14,7 +15,7 @@ from datetime import datetime
# Django
from django.conf import settings
from django.db import transaction, DatabaseError, IntegrityError
from django.db import connection, transaction, DatabaseError, IntegrityError
from django.db.models.fields.related import ForeignKey
from django.utils.timezone import now, timedelta
from django.utils.encoding import smart_str
@@ -48,6 +49,7 @@ from awx.main.models import (
SmartInventoryMembership,
Job,
HostMetric,
convert_jsonfields,
)
from awx.main.constants import ACTIVE_STATES
from awx.main.dispatch.publish import task
@@ -86,6 +88,11 @@ def dispatch_startup():
if settings.IS_K8S:
write_receptor_config()
try:
convert_jsonfields()
except Exception:
logger.exception("Failed json field conversion, skipping.")
startup_logger.debug("Syncing Schedules")
for sch in Schedule.objects.all():
try:
@@ -129,6 +136,52 @@ def inform_cluster_of_shutdown():
logger.exception('Encountered problem with normal shutdown signal.')
@task(queue=get_task_queuename)
def migrate_jsonfield(table, pkfield, columns):
batchsize = 10000
with advisory_lock(f'json_migration_{table}', wait=False) as acquired:
if not acquired:
return
from django.db.migrations.executor import MigrationExecutor
# If Django is currently running migrations, wait until it is done.
while True:
executor = MigrationExecutor(connection)
if not executor.migration_plan(executor.loader.graph.leaf_nodes()):
break
time.sleep(120)
logger.warning(f"Migrating json fields for {table}: {', '.join(columns)}")
with connection.cursor() as cursor:
for i in itertools.count(0, batchsize):
# Are there even any rows in the table beyond this point?
cursor.execute(f"select count(1) from {table} where {pkfield} >= %s limit 1;", (i,))
if not cursor.fetchone()[0]:
break
column_expr = ', '.join(f"{colname} = {colname}_old::jsonb" for colname in columns)
# If any of the old columns have non-null values, the data needs to be cast and copied over.
empty_expr = ' or '.join(f"{colname}_old is not null" for colname in columns)
cursor.execute( # Only clobber the new fields if there is non-null data in the old ones.
f"""
update {table}
set {column_expr}
where {pkfield} >= %s and {pkfield} < %s
and {empty_expr};
""",
(i, i + batchsize),
)
rows = cursor.rowcount
logger.debug(f"Batch {i} to {i + batchsize} copied on {table}, {rows} rows affected.")
column_expr = ', '.join(f"DROP COLUMN {column}_old" for column in columns)
cursor.execute(f"ALTER TABLE {table} {column_expr};")
logger.warning(f"Migration of {table} to jsonb is finished.")
@task(queue=get_task_queuename)
def apply_cluster_membership_policies():
from awx.main.signals import disable_activity_stream
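Note on `migrate_jsonfield` in the hunk above: it walks the table in fixed-size primary-key windows generated by `itertools.count(0, batchsize)`, stops when no rows remain at or beyond the window start, and only touches rows whose old column is non-null. The sketch below shows the same windowing against an in-memory SQLite table. It is illustrative only: the `demo` table and its columns are made up, SQLite has no `jsonb` cast so the value is copied as text, and the final `DROP COLUMN` step is omitted.

```python
import itertools
import sqlite3


def copy_in_batches(conn, batchsize=3):
    """Copy payload_old into payload one primary-key window at a time."""
    cur = conn.cursor()
    batches = 0
    for start in itertools.count(0, batchsize):
        # Stop once no rows exist at or beyond this window
        cur.execute("SELECT count(1) FROM demo WHERE id >= ?", (start,))
        if not cur.fetchone()[0]:
            break
        # Only overwrite the new column where the old one holds data
        cur.execute(
            "UPDATE demo SET payload = payload_old "
            "WHERE id >= ? AND id < ? AND payload_old IS NOT NULL",
            (start, start + batchsize),
        )
        batches += 1
    conn.commit()
    return batches


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (id INTEGER PRIMARY KEY, payload_old TEXT, payload TEXT)")
conn.executemany(
    "INSERT INTO demo (id, payload_old) VALUES (?, ?)",
    [(1, '{"a": 1}'), (2, None), (5, '{"b": 2}'), (7, '{"c": 3}')],
)
batches = copy_in_batches(conn)
copied = conn.execute("SELECT id, payload FROM demo ORDER BY id").fetchall()
```

Windowing by key range keeps each `UPDATE` small, and gaps in the key sequence only produce windows that update fewer rows.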
@@ -316,13 +369,8 @@ def send_notifications(notification_list, job_id=None):
@task(queue=get_task_queuename)
def gather_analytics():
from awx.conf.models import Setting
from rest_framework.fields import DateTimeField
last_gather = Setting.objects.filter(key='AUTOMATION_ANALYTICS_LAST_GATHER').first()
last_time = DateTimeField().to_internal_value(last_gather.value) if last_gather and last_gather.value else None
gather_time = now()
if not last_time or ((gather_time - last_time).total_seconds() > settings.AUTOMATION_ANALYTICS_GATHER_INTERVAL):
if is_run_threshold_reached(Setting.objects.filter(key='AUTOMATION_ANALYTICS_LAST_GATHER').first(), settings.AUTOMATION_ANALYTICS_GATHER_INTERVAL):
analytics.gather()
@@ -381,16 +429,25 @@ def cleanup_images_and_files():
@task(queue=get_task_queuename)
def cleanup_host_metrics():
"""Run cleanup host metrics ~each month"""
# TODO: move whole method to host_metrics in follow-up PR
from awx.conf.models import Setting
if is_run_threshold_reached(
Setting.objects.filter(key='CLEANUP_HOST_METRICS_LAST_TS').first(), getattr(settings, 'CLEANUP_HOST_METRICS_INTERVAL', 30) * 86400
):
months_ago = getattr(settings, 'CLEANUP_HOST_METRICS_SOFT_THRESHOLD', 12)
logger.info("Executing cleanup_host_metrics")
HostMetric.cleanup_task(months_ago)
logger.info("Finished cleanup_host_metrics")
def is_run_threshold_reached(setting, threshold_seconds):
from rest_framework.fields import DateTimeField
last_cleanup = Setting.objects.filter(key='CLEANUP_HOST_METRICS_LAST_TS').first()
last_time = DateTimeField().to_internal_value(last_cleanup.value) if last_cleanup and last_cleanup.value else None
last_time = DateTimeField().to_internal_value(setting.value) if setting and setting.value else DateTimeField().to_internal_value('1970-01-01')
cleanup_interval_secs = getattr(settings, 'CLEANUP_HOST_METRICS_INTERVAL', 30) * 86400
if not last_time or ((now() - last_time).total_seconds() > cleanup_interval_secs):
months_ago = getattr(settings, 'CLEANUP_HOST_METRICS_THRESHOLD', 12)
HostMetric.cleanup_task(months_ago)
return (now() - last_time).total_seconds() > threshold_seconds
@task(queue=get_task_queuename)
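Note on `is_run_threshold_reached` in the hunk above: a missing or empty last-run setting is treated as 1970-01-01, so the first call always passes the threshold. A minimal sketch of that check with the standard library follows. It takes a `datetime` directly instead of a `Setting` row parsed through `DateTimeField`, and the `now` parameter is an addition made here for reproducibility.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def is_run_threshold_reached(last_run, threshold_seconds, now=None):
    """True when more than threshold_seconds have passed since last_run."""
    now = now or datetime.now(timezone.utc)
    last_time = last_run or EPOCH  # never ran: fall back to the epoch
    return (now - last_time).total_seconds() > threshold_seconds


NOW = datetime(2023, 8, 11, tzinfo=timezone.utc)
WEEK = 7 * 86400
never_ran = is_run_threshold_reached(None, WEEK, now=NOW)
ran_yesterday = is_run_threshold_reached(NOW - timedelta(days=1), WEEK, now=NOW)
ran_last_month = is_run_threshold_reached(NOW - timedelta(days=30), WEEK, now=NOW)
```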
@@ -541,7 +598,7 @@ def cluster_node_heartbeat(dispatch_time=None, worker_tasks=None):
logger.warning(f'Heartbeat skew - interval={(nowtime - last_last_seen).total_seconds():.4f}, expected={settings.CLUSTER_NODE_HEARTBEAT_PERIOD}')
else:
if settings.AWX_AUTO_DEPROVISION_INSTANCES:
(changed, this_inst) = Instance.objects.register(ip_address=os.environ.get('MY_POD_IP'), node_type='control', uuid=settings.SYSTEM_UUID)
(changed, this_inst) = Instance.objects.register(ip_address=os.environ.get('MY_POD_IP'), node_type='control', node_uuid=settings.SYSTEM_UUID)
if changed:
logger.warning(f'Recreated instance record {this_inst.hostname} after unexpected removal')
this_inst.local_health_check()
@@ -839,10 +896,7 @@ def delete_inventory(inventory_id, user_id, retries=5):
user = None
with ignore_inventory_computed_fields(), ignore_inventory_group_removal(), impersonate(user):
try:
i = Inventory.objects.get(id=inventory_id)
for host in i.hosts.iterator():
host.job_events_as_primary_host.update(host=None)
i.delete()
Inventory.objects.get(id=inventory_id).delete()
emit_channel_notification('inventories-status_changed', {'group_name': 'inventories', 'inventory_id': inventory_id, 'status': 'deleted'})
logger.debug('Deleted inventory {} as user {}.'.format(inventory_id, user_id))
except Inventory.DoesNotExist:


@@ -7,7 +7,7 @@ from django.core.serializers.json import DjangoJSONEncoder
from django.utils.functional import Promise
from django.utils.encoding import force_str
from openapi_codec.encode import generate_swagger_object
from drf_yasg.codecs import OpenAPICodecJson
import pytest
from awx.api.versioning import drf_reverse
@@ -43,12 +43,12 @@ class TestSwaggerGeneration:
@pytest.fixture(autouse=True, scope='function')
def _prepare(self, get, admin):
if not self.__class__.JSON:
url = drf_reverse('api:swagger_view') + '?format=openapi'
url = drf_reverse('api:schema-swagger-ui') + '?format=openapi'
response = get(url, user=admin)
data = generate_swagger_object(response.data)
codec = OpenAPICodecJson([])
data = codec.generate_swagger_object(response.data)
if response.has_header('X-Deprecated-Paths'):
data['deprecated_paths'] = json.loads(response['X-Deprecated-Paths'])
data.update(response.accepted_renderer.get_customizations() or {})
data['host'] = None
data['schemes'] = ['https']
@@ -60,12 +60,21 @@ class TestSwaggerGeneration:
# change {version} in paths to the actual default API version (e.g., v2)
revised_paths[path.replace('{version}', settings.REST_FRAMEWORK['DEFAULT_VERSION'])] = node
for method in node:
# Skip the 'parameters' key: its value can be a list rather than a dict,
# which breaks the last for loop below
if method == 'parameters':
continue
if path in deprecated_paths:
node[method]['deprecated'] = True
if 'description' in node[method]:
# Pop off the first line and use that as the summary
lines = node[method]['description'].splitlines()
node[method]['summary'] = lines.pop(0).strip('#:')
# If the description has at least one line, use the first line as the summary; otherwise generate a placeholder
if lines:
node[method]['summary'] = lines.pop(0).strip('#:')
else:
node[method]['summary'] = f'No Description for {method} on {path}'
node[method]['description'] = '\n'.join(lines)
# remove the required `version` parameter
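The summary/description split in this hunk can be sketched in isolation. This is a minimal illustration, not the test's actual helper: `split_summary` is a hypothetical name, and the sample strings are made up. Note that `strip('#:')` removes only `#` and `:` characters, so a leading space after `#` is kept.

```python
def split_summary(description, method, path):
    """Return (summary, remaining_description) for one API method node.

    Hypothetical helper mirroring the logic in the hunk above.
    """
    lines = description.splitlines()
    if lines:
        # First line becomes the summary; '#' and ':' are trimmed from both ends
        summary = lines.pop(0).strip('#:')
    else:
        # Empty description: fall back to a generated placeholder
        summary = f'No Description for {method} on {path}'
    return summary, '\n'.join(lines)


print(split_summary('# List Jobs:\n\nMake a GET request.', 'get', '/api/v2/jobs/'))
print(split_summary('', 'get', '/api/'))
```

The fallback branch matters because `lines.pop(0)` on an empty list would raise `IndexError`, which is presumably what the original single-line version ran into for empty descriptions.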
@@ -90,13 +99,13 @@ class TestSwaggerGeneration:
# The number of API endpoints changes over time, but let's just check
# for a reasonable number here; if this test starts failing, raise/lower the bounds
paths = JSON['paths']
assert 250 < len(paths) < 350
assert list(paths['/api/'].keys()) == ['get']
assert list(paths['/api/v2/'].keys()) == ['get']
assert list(sorted(paths['/api/v2/credentials/'].keys())) == ['get', 'post']
assert list(sorted(paths['/api/v2/credentials/{id}/'].keys())) == ['delete', 'get', 'patch', 'put']
assert list(paths['/api/v2/settings/'].keys()) == ['get']
assert list(paths['/api/v2/settings/{category_slug}/'].keys()) == ['get', 'put', 'patch', 'delete']
assert 250 < len(paths) < 375
assert set(paths['/api/']) == {'get', 'parameters'}
assert set(paths['/api/v2/']) == {'get', 'parameters'}
assert set(paths['/api/v2/credentials/']) == {'get', 'post', 'parameters'}
assert set(paths['/api/v2/credentials/{id}/']) == {'delete', 'get', 'patch', 'put', 'parameters'}
assert set(paths['/api/v2/settings/']) == {'get', 'parameters'}
assert set(paths['/api/v2/settings/{category_slug}/']) == {'get', 'put', 'patch', 'delete', 'parameters'}
@pytest.mark.parametrize(
'path',
@@ -162,4 +171,8 @@ class TestSwaggerGeneration:
data = re.sub(r'[0-9]{4}-[0-9]{2}-[0-9]{2}(T|\s)[0-9]{2}:[0-9]{2}:[0-9]{2}.[0-9]+(Z|\+[0-9]{2}:[0-9]{2})?', r'2018-02-01T08:00:00.000000Z', data)
data = re.sub(r'''(\s+"client_id": ")([a-zA-Z0-9]{40})("\,\s*)''', r'\1xxxx\3', data)
data = re.sub(r'"action_node": "[^"]+"', '"action_node": "awx"', data)
# replace uuids to prevent needless diffs
pattern = r'[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}'
data = re.sub(pattern, r'00000000-0000-0000-0000-000000000000', data)
f.write(data)
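The UUID scrubbing added at the end of this file can be tried on its own. A minimal sketch, assuming the same pattern as the hunk; `scrub_uuids` and the sample payload are hypothetical. The pattern only matches lowercase hex digits, so uppercase UUIDs would pass through unchanged.

```python
import re

# Same pattern as in the hunk above: lowercase hex, 8-4-4-4-12 grouping
UUID_PATTERN = r'[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}'
PLACEHOLDER = '00000000-0000-0000-0000-000000000000'


def scrub_uuids(text):
    """Replace every UUID-shaped token with a fixed placeholder."""
    return re.sub(UUID_PATTERN, PLACEHOLDER, text)


sample = '{"uuid": "3f2b8c1e-9a4d-4e7b-8c2a-1d5e6f7a8b9c", "name": "awx"}'
print(scrub_uuids(sample))
```

Replacing with a constant keeps generated schema files stable between runs, which is the "needless diffs" concern named in the comment.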

@@ -1,6 +1,9 @@
import json
from django.contrib.auth.models import User
from django.core.exceptions import ValidationError
from unittest import mock
from awx.main.models import (
Organization,
@@ -20,6 +23,7 @@ from awx.main.models import (
WorkflowJobNode,
WorkflowJobTemplateNode,
)
from awx.main.models.inventory import HostMetric, HostMetricSummaryMonthly
# mk methods should create only a single object of a single type.
# they should also have the option of being persisted or not.
@@ -248,3 +252,42 @@ def mk_workflow_job_node(unified_job_template=None, success_nodes=None, failure_
if persisted:
workflow_node.save()
return workflow_node
def mk_host_metric(hostname, first_automation, last_automation=None, last_deleted=None, deleted=False, persisted=True):
ok, idx = False, 1
while not ok:
try:
with mock.patch("django.utils.timezone.now") as mock_now:
mock_now.return_value = first_automation
metric = HostMetric(
hostname=hostname or f"host-{first_automation}-{idx}",
first_automation=first_automation,
last_automation=last_automation or first_automation,
last_deleted=last_deleted,
deleted=deleted,
)
metric.validate_unique()
if persisted:
metric.save()
ok = True
except ValidationError as e:
# Repeat create for auto-generated hostname
if not hostname and e.message_dict.get('hostname', None):
idx += 1
else:
raise e
def mk_host_metric_summary(date, license_consumed=0, license_capacity=0, hosts_added=0, hosts_deleted=0, indirectly_managed_hosts=0, persisted=True):
summary = HostMetricSummaryMonthly(
date=date,
license_consumed=license_consumed,
license_capacity=license_capacity,
hosts_added=hosts_added,
hosts_deleted=hosts_deleted,
indirectly_managed_hosts=indirectly_managed_hosts,
)
if persisted:
summary.save()
return summary
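The retry loop in `mk_host_metric` (bump a counter until an auto-generated hostname is unique, but re-raise for an explicit hostname) can be illustrated without Django. This is a sketch under stated assumptions: a plain `set` stands in for the database uniqueness check, `ValueError` stands in for `ValidationError`, and `make_unique_hostname` and `stamp` are hypothetical names.

```python
def make_unique_hostname(existing, hostname=None, stamp='2023-08-01'):
    """Register and return a hostname that is not yet in ``existing``.

    Auto-generated names get an increasing suffix until they are unique;
    an explicitly requested duplicate is reported as an error instead.
    """
    idx = 1
    while True:
        candidate = hostname or f'host-{stamp}-{idx}'
        if candidate not in existing:
            existing.add(candidate)
            return candidate
        if hostname:
            # Caller asked for this exact name, so retrying cannot help
            raise ValueError(f'hostname {hostname!r} already exists')
        idx += 1


names = set()
print(make_unique_hostname(names))
print(make_unique_hostname(names))
```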

@@ -2,8 +2,8 @@ import pytest
import tempfile
import os
import re
import shutil
import csv
from io import StringIO
from django.utils.timezone import now
from datetime import timedelta
@@ -20,15 +20,16 @@ from awx.main.models import (
)
@pytest.fixture
def sqlite_copy_expert(request):
# copy_expert is postgres-specific, and SQLite doesn't support it; mock its
# behavior to test that it writes a file that contains stdout from events
path = tempfile.mkdtemp(prefix="copied_tables")
class MockCopy:
headers = None
results = None
sent_data = False
def write_stdout(self, sql, fd):
def __init__(self, sql, parent_connection):
# It would be better to properly dissect the SQL query and verify it
# that way. Instead, we just take the naive approach here.
self.results = None
self.headers = None
sql = sql.strip()
assert sql.startswith("COPY (")
assert sql.endswith(") TO STDOUT WITH CSV HEADER")
@@ -51,29 +52,49 @@ def sqlite_copy_expert(request):
elif not line.endswith(","):
sql_new[-1] = sql_new[-1].rstrip(",")
sql = "\n".join(sql_new)
parent_connection.execute(sql)
self.results = parent_connection.fetchall()
self.headers = [i[0] for i in parent_connection.description]
self.execute(sql)
results = self.fetchall()
headers = [i[0] for i in self.description]
def read(self):
if not self.sent_data:
mem_file = StringIO()
csv_handle = csv.writer(
mem_file,
delimiter=",",
quoting=csv.QUOTE_ALL,
escapechar="\\",
lineterminator="\n",
)
if self.headers:
csv_handle.writerow(self.headers)
if self.results:
csv_handle.writerows(self.results)
self.sent_data = True
return memoryview((mem_file.getvalue()).encode())
return None
csv_handle = csv.writer(
fd,
delimiter=",",
quoting=csv.QUOTE_ALL,
escapechar="\\",
lineterminator="\n",
)
csv_handle.writerow(headers)
csv_handle.writerows(results)
def __enter__(self):
return self
setattr(SQLiteCursorWrapper, "copy_expert", write_stdout)
request.addfinalizer(lambda: shutil.rmtree(path))
request.addfinalizer(lambda: delattr(SQLiteCursorWrapper, "copy_expert"))
return path
def __exit__(self, exc_type, exc_val, exc_tb):
pass
@pytest.fixture
def sqlite_copy(request, mocker):
# copy is postgres-specific, and SQLite doesn't support it; mock its
# behavior to test that it writes a file that contains stdout from events
def write_stdout(self, sql):
mock_copy = MockCopy(sql, self)
return mock_copy
mocker.patch.object(SQLiteCursorWrapper, 'copy', write_stdout, create=True)
@pytest.mark.django_db
def test_copy_tables_unified_job_query(sqlite_copy_expert, project, inventory, job_template):
def test_copy_tables_unified_job_query(sqlite_copy, project, inventory, job_template):
"""
Ensure that various unified job types are in the output of the query.
"""
@@ -127,7 +148,7 @@ def workflow_job(states=["new", "new", "new", "new", "new"]):
@pytest.mark.django_db
def test_copy_tables_workflow_job_node_query(sqlite_copy_expert, workflow_job):
def test_copy_tables_workflow_job_node_query(sqlite_copy, workflow_job):
time_start = now() - timedelta(hours=9)
with tempfile.TemporaryDirectory() as tmpdir:
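The new `MockCopy` exposes a context manager with a `read()` method that returns one chunk and then `None`. A minimal sketch of how a caller might drain such an object follows; it assumes the consumer loops until `read()` returns something falsy, which is an assumption about the calling code rather than something shown in this diff. `FakeCopy` and `drain` are hypothetical names.

```python
import csv
from io import StringIO


class FakeCopy:
    """Hypothetical stand-in for a COPY ... TO STDOUT context manager."""

    def __init__(self, headers, rows):
        self.headers = headers
        self.rows = rows
        self.sent_data = False

    def read(self):
        # Emit the whole CSV payload once, then signal the end with None
        if self.sent_data:
            return None
        buffer = StringIO()
        writer = csv.writer(buffer, delimiter=',', quoting=csv.QUOTE_ALL, escapechar='\\', lineterminator='\n')
        writer.writerow(self.headers)
        writer.writerows(self.rows)
        self.sent_data = True
        return memoryview(buffer.getvalue().encode())

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        return False


def drain(copy):
    """Collect every chunk from a copy-like object into one bytes value."""
    chunks = []
    with copy as handle:
        while True:
            data = handle.read()
            if not data:  # None or an empty buffer both end the loop
                break
            chunks.append(bytes(data))
    return b''.join(chunks)


print(drain(FakeCopy(['id', 'name'], [(1, 'job')])))
```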

@@ -224,7 +224,7 @@ class TestControllerNode:
return AdHocCommand.objects.create(inventory=inventory)
@pytest.mark.django_db
def test_field_controller_node_exists(self, sqlite_copy_expert, admin_user, job, project_update, inventory_update, adhoc, get, system_job_factory):
def test_field_controller_node_exists(self, sqlite_copy, admin_user, job, project_update, inventory_update, adhoc, get, system_job_factory):
system_job = system_job_factory()
r = get(reverse('api:unified_job_list') + '?id={}'.format(job.id), admin_user, expect=200)

@@ -57,7 +57,7 @@ def _mk_inventory_update(created=None):
[_mk_inventory_update, InventoryUpdateEvent, 'inventory_update', 'api:inventory_update_stdout'],
],
)
def test_text_stdout(sqlite_copy_expert, Parent, Child, relation, view, get, admin):
def test_text_stdout(sqlite_copy, Parent, Child, relation, view, get, admin):
job = Parent()
job.save()
for i in range(3):
@@ -79,7 +79,7 @@ def test_text_stdout(sqlite_copy_expert, Parent, Child, relation, view, get, adm
],
)
@pytest.mark.parametrize('download', [True, False])
def test_ansi_stdout_filtering(sqlite_copy_expert, Parent, Child, relation, view, download, get, admin):
def test_ansi_stdout_filtering(sqlite_copy, Parent, Child, relation, view, download, get, admin):
job = Parent()
job.save()
for i in range(3):
@@ -111,7 +111,7 @@ def test_ansi_stdout_filtering(sqlite_copy_expert, Parent, Child, relation, view
[_mk_inventory_update, InventoryUpdateEvent, 'inventory_update', 'api:inventory_update_stdout'],
],
)
def test_colorized_html_stdout(sqlite_copy_expert, Parent, Child, relation, view, get, admin):
def test_colorized_html_stdout(sqlite_copy, Parent, Child, relation, view, get, admin):
job = Parent()
job.save()
for i in range(3):
@@ -134,7 +134,7 @@ def test_colorized_html_stdout(sqlite_copy_expert, Parent, Child, relation, view
[_mk_inventory_update, InventoryUpdateEvent, 'inventory_update', 'api:inventory_update_stdout'],
],
)
def test_stdout_line_range(sqlite_copy_expert, Parent, Child, relation, view, get, admin):
def test_stdout_line_range(sqlite_copy, Parent, Child, relation, view, get, admin):
job = Parent()
job.save()
for i in range(20):
@@ -146,7 +146,7 @@ def test_stdout_line_range(sqlite_copy_expert, Parent, Child, relation, view, ge
@pytest.mark.django_db
def test_text_stdout_from_system_job_events(sqlite_copy_expert, get, admin):
def test_text_stdout_from_system_job_events(sqlite_copy, get, admin):
created = tz_now()
job = SystemJob(created=created)
job.save()
@@ -158,7 +158,7 @@ def test_text_stdout_from_system_job_events(sqlite_copy_expert, get, admin):
@pytest.mark.django_db
def test_text_stdout_with_max_stdout(sqlite_copy_expert, get, admin):
def test_text_stdout_with_max_stdout(sqlite_copy, get, admin):
created = tz_now()
job = SystemJob(created=created)
job.save()
@@ -185,7 +185,7 @@ def test_text_stdout_with_max_stdout(sqlite_copy_expert, get, admin):
)
@pytest.mark.parametrize('fmt', ['txt', 'ansi'])
@mock.patch('awx.main.redact.UriCleaner.SENSITIVE_URI_PATTERN', mock.Mock(**{'search.return_value': None})) # really slow for large strings
def test_max_bytes_display(sqlite_copy_expert, Parent, Child, relation, view, fmt, get, admin):
def test_max_bytes_display(sqlite_copy, Parent, Child, relation, view, fmt, get, admin):
created = tz_now()
job = Parent(created=created)
job.save()
@@ -255,7 +255,7 @@ def test_legacy_result_stdout_with_max_bytes(Cls, view, fmt, get, admin):
],
)
@pytest.mark.parametrize('fmt', ['txt', 'ansi', 'txt_download', 'ansi_download'])
def test_text_with_unicode_stdout(sqlite_copy_expert, Parent, Child, relation, view, get, admin, fmt):
def test_text_with_unicode_stdout(sqlite_copy, Parent, Child, relation, view, get, admin, fmt):
job = Parent()
job.save()
for i in range(3):
@@ -267,7 +267,7 @@ def test_text_with_unicode_stdout(sqlite_copy_expert, Parent, Child, relation, v
@pytest.mark.django_db
def test_unicode_with_base64_ansi(sqlite_copy_expert, get, admin):
def test_unicode_with_base64_ansi(sqlite_copy, get, admin):
created = tz_now()
job = Job(created=created)
job.save()

@@ -0,0 +1,382 @@
import pytest
import datetime
from dateutil.relativedelta import relativedelta
from django.conf import settings
from django.utils import timezone
from awx.main.management.commands.host_metric_summary_monthly import Command
from awx.main.models.inventory import HostMetric, HostMetricSummaryMonthly
from awx.main.tests.factories.fixtures import mk_host_metric, mk_host_metric_summary
@pytest.fixture
def threshold():
return int(getattr(settings, 'CLEANUP_HOST_METRICS_HARD_THRESHOLD', 36))
@pytest.mark.django_db
@pytest.mark.parametrize("metrics_cnt", [0, 1, 2, 3])
@pytest.mark.parametrize("mode", ["old_data", "actual_data", "all_data"])
def test_summaries_counts(threshold, metrics_cnt, mode):
assert HostMetricSummaryMonthly.objects.count() == 0
for idx in range(metrics_cnt):
if mode == "old_data" or mode == "all_data":
mk_host_metric(None, months_ago(threshold + idx, "dt"))
if mode == "actual_data" or mode == "all_data":
mk_host_metric(None, months_ago(threshold - idx, "dt"))
Command().handle()
# Number of records is equal to host metrics' hard cleanup months
assert HostMetricSummaryMonthly.objects.count() == threshold
# Records start in the month following the threshold month
date = months_ago(threshold - 1)
for metric in list(HostMetricSummaryMonthly.objects.order_by('date').all()):
assert metric.date == date
date += relativedelta(months=1)
# Older records are untouched
mk_host_metric_summary(date=months_ago(threshold + 10))
Command().handle()
assert HostMetricSummaryMonthly.objects.count() == threshold + 1
@pytest.mark.django_db
@pytest.mark.parametrize("mode", ["old_data", "actual_data", "all_data"])
def test_summary_values(threshold, mode):
tester = {"old_data": MetricsTesterOldData(threshold), "actual_data": MetricsTesterActualData(threshold), "all_data": MetricsTesterCombinedData(threshold)}[
mode
]
for iteration in ["create_metrics", "add_old_summaries", "change_metrics", "delete_metrics", "add_metrics"]:
getattr(tester, iteration)() # call method by string
# Operation is idempotent, repeat twice
for _ in range(2):
Command().handle()
# call assert method by string
getattr(tester, f"assert_{iteration}")()
class MetricsTester:
def __init__(self, threshold, ignore_asserts=False):
self.threshold = threshold
self.expected_summaries = {}
self.ignore_asserts = ignore_asserts
def add_old_summaries(self):
"""These records don't correspond with Host metrics"""
mk_host_metric_summary(self.below(4), license_consumed=100, hosts_added=10, hosts_deleted=5)
mk_host_metric_summary(self.below(3), license_consumed=105, hosts_added=20, hosts_deleted=10)
mk_host_metric_summary(self.below(2), license_consumed=115, hosts_added=60, hosts_deleted=75)
def assert_add_old_summaries(self):
"""Old summary records should be untouched"""
self.expected_summaries[self.below(4)] = {"date": self.below(4), "license_consumed": 100, "hosts_added": 10, "hosts_deleted": 5}
self.expected_summaries[self.below(3)] = {"date": self.below(3), "license_consumed": 105, "hosts_added": 20, "hosts_deleted": 10}
self.expected_summaries[self.below(2)] = {"date": self.below(2), "license_consumed": 115, "hosts_added": 60, "hosts_deleted": 75}
self.assert_host_metric_summaries()
def assert_host_metric_summaries(self):
"""Ignore asserts when old/actual test object is used only as a helper for Combined test"""
if self.ignore_asserts:
return True
for summary in list(HostMetricSummaryMonthly.objects.order_by('date').all()):
assert self.expected_summaries.get(summary.date, None) is not None
assert self.expected_summaries[summary.date] == {
"date": summary.date,
"license_consumed": summary.license_consumed,
"hosts_added": summary.hosts_added,
"hosts_deleted": summary.hosts_deleted,
}
def below(self, months, fmt="date"):
"""months below threshold, returns first date of that month"""
date = months_ago(self.threshold + months)
if fmt == "dt":
return timezone.make_aware(datetime.datetime.combine(date, datetime.datetime.min.time()))
else:
return date
def above(self, months, fmt="date"):
"""months above threshold, returns first date of that month"""
date = months_ago(self.threshold - months)
if fmt == "dt":
return timezone.make_aware(datetime.datetime.combine(date, datetime.datetime.min.time()))
else:
return date
class MetricsTesterOldData(MetricsTester):
def create_metrics(self):
"""Creates 7 host metrics older than delete threshold"""
mk_host_metric("host_1", first_automation=self.below(3, "dt"))
mk_host_metric("host_2", first_automation=self.below(2, "dt"))
mk_host_metric("host_3", first_automation=self.below(2, "dt"), last_deleted=self.above(2, "dt"), deleted=False)
mk_host_metric("host_4", first_automation=self.below(2, "dt"), last_deleted=self.above(2, "dt"), deleted=True)
mk_host_metric("host_5", first_automation=self.below(2, "dt"), last_deleted=self.below(2, "dt"), deleted=True)
mk_host_metric("host_6", first_automation=self.below(1, "dt"), last_deleted=self.below(1, "dt"), deleted=False)
mk_host_metric("host_7", first_automation=self.below(1, "dt"))
def assert_create_metrics(self):
"""
Month 1 is computed from older host metrics,
Month 2 has deletion (host_4)
Other months are unchanged (same as month 2)
"""
self.expected_summaries = {
self.above(1): {"date": self.above(1), "license_consumed": 6, "hosts_added": 0, "hosts_deleted": 0},
self.above(2): {"date": self.above(2), "license_consumed": 5, "hosts_added": 0, "hosts_deleted": 1},
}
# no change in months 3+
idx = 3
month = self.above(idx)
while month <= beginning_of_the_month():
self.expected_summaries[self.above(idx)] = {"date": self.above(idx), "license_consumed": 5, "hosts_added": 0, "hosts_deleted": 0}
month += relativedelta(months=1)
idx += 1
self.assert_host_metric_summaries()
def add_old_summaries(self):
super().add_old_summaries()
def assert_add_old_summaries(self):
super().assert_add_old_summaries()
@staticmethod
def change_metrics():
"""Hosts 1,2 soft deleted, host_4 automated again (undeleted)"""
HostMetric.objects.filter(hostname='host_1').update(last_deleted=beginning_of_the_month("dt"), deleted=True)
HostMetric.objects.filter(hostname='host_2').update(last_deleted=timezone.now(), deleted=True)
HostMetric.objects.filter(hostname='host_4').update(deleted=False)
def assert_change_metrics(self):
"""
Summaries since month 2 were changed (host_4 restored == automated again)
Current month has 2 deletions (host_1, host_2)
"""
self.expected_summaries[self.above(2)] |= {'hosts_deleted': 0}
for idx in range(2, self.threshold):
self.expected_summaries[self.above(idx)] |= {'license_consumed': 6}
self.expected_summaries[beginning_of_the_month()] |= {'license_consumed': 4, 'hosts_deleted': 2}
self.assert_host_metric_summaries()
@staticmethod
def delete_metrics():
"""Deletes metric deleted before the threshold"""
HostMetric.objects.filter(hostname='host_5').delete()
def assert_delete_metrics(self):
"""No change"""
self.assert_host_metric_summaries()
@staticmethod
def add_metrics():
"""Adds new metrics"""
mk_host_metric("host_24", first_automation=beginning_of_the_month("dt"))
mk_host_metric("host_25", first_automation=beginning_of_the_month("dt")) # timezone.now())
def assert_add_metrics(self):
"""Summary in current month is updated"""
self.expected_summaries[beginning_of_the_month()]['license_consumed'] = 6
self.expected_summaries[beginning_of_the_month()]['hosts_added'] = 2
self.assert_host_metric_summaries()
class MetricsTesterActualData(MetricsTester):
def create_metrics(self):
"""Creates 16 host metrics newer than delete threshold"""
mk_host_metric("host_8", first_automation=self.above(1, "dt"))
mk_host_metric("host_9", first_automation=self.above(1, "dt"), last_deleted=self.above(1, "dt"))
mk_host_metric("host_10", first_automation=self.above(1, "dt"), last_deleted=self.above(1, "dt"), deleted=True)
mk_host_metric("host_11", first_automation=self.above(1, "dt"), last_deleted=self.above(2, "dt"))
mk_host_metric("host_12", first_automation=self.above(1, "dt"), last_deleted=self.above(2, "dt"), deleted=True)
mk_host_metric("host_13", first_automation=self.above(2, "dt"))
mk_host_metric("host_14", first_automation=self.above(2, "dt"), last_deleted=self.above(2, "dt"))
mk_host_metric("host_15", first_automation=self.above(2, "dt"), last_deleted=self.above(2, "dt"), deleted=True)
mk_host_metric("host_16", first_automation=self.above(2, "dt"), last_deleted=self.above(3, "dt"))
mk_host_metric("host_17", first_automation=self.above(2, "dt"), last_deleted=self.above(3, "dt"), deleted=True)
mk_host_metric("host_18", first_automation=self.above(4, "dt"))
# the next one shouldn't happen in practice (deleted=True, last_deleted=NULL)
mk_host_metric("host_19", first_automation=self.above(4, "dt"), deleted=True)
mk_host_metric("host_20", first_automation=self.above(4, "dt"), last_deleted=self.above(4, "dt"))
mk_host_metric("host_21", first_automation=self.above(4, "dt"), last_deleted=self.above(4, "dt"), deleted=True)
mk_host_metric("host_22", first_automation=self.above(4, "dt"), last_deleted=self.above(5, "dt"))
mk_host_metric("host_23", first_automation=self.above(4, "dt"), last_deleted=self.above(5, "dt"), deleted=True)
def assert_create_metrics(self):
self.expected_summaries = {
self.above(1): {"date": self.above(1), "license_consumed": 4, "hosts_added": 5, "hosts_deleted": 1},
self.above(2): {"date": self.above(2), "license_consumed": 7, "hosts_added": 5, "hosts_deleted": 2},
self.above(3): {"date": self.above(3), "license_consumed": 6, "hosts_added": 0, "hosts_deleted": 1},
self.above(4): {"date": self.above(4), "license_consumed": 11, "hosts_added": 6, "hosts_deleted": 1},
self.above(5): {"date": self.above(5), "license_consumed": 10, "hosts_added": 0, "hosts_deleted": 1},
}
# no change in months 6+
idx = 6
month = self.above(idx)
while month <= beginning_of_the_month():
self.expected_summaries[self.above(idx)] = {"date": self.above(idx), "license_consumed": 10, "hosts_added": 0, "hosts_deleted": 0}
month += relativedelta(months=1)
idx += 1
self.assert_host_metric_summaries()
def add_old_summaries(self):
super().add_old_summaries()
def assert_add_old_summaries(self):
super().assert_add_old_summaries()
@staticmethod
def change_metrics():
"""
- Hosts 12, 19, 21 were automated again (undeleted)
- Host 16 was soft deleted
- Host 17 was undeleted and soft deleted again
"""
HostMetric.objects.filter(hostname='host_12').update(deleted=False)
HostMetric.objects.filter(hostname='host_16').update(last_deleted=timezone.now(), deleted=True)
HostMetric.objects.filter(hostname='host_17').update(last_deleted=beginning_of_the_month("dt"), deleted=True)
HostMetric.objects.filter(hostname='host_19').update(deleted=False)
HostMetric.objects.filter(hostname='host_21').update(deleted=False)
def assert_change_metrics(self):
"""
Summaries since month 2 were changed
Current month has 2 deletions (host_16, host_17)
"""
self.expected_summaries[self.above(2)] |= {'license_consumed': 8, 'hosts_deleted': 1}
self.expected_summaries[self.above(3)] |= {'license_consumed': 8, 'hosts_deleted': 0}
self.expected_summaries[self.above(4)] |= {'license_consumed': 14, 'hosts_deleted': 0}
# month 5 had hosts_deleted 1 => license_consumed == 14 - 1
for idx in range(5, self.threshold):
self.expected_summaries[self.above(idx)] |= {'license_consumed': 13}
self.expected_summaries[beginning_of_the_month()] |= {'license_consumed': 11, 'hosts_deleted': 2}
self.assert_host_metric_summaries()
def delete_metrics(self):
"""Hard cleanup can't delete metrics newer than threshold. No change"""
pass
def assert_delete_metrics(self):
"""No change"""
self.assert_host_metric_summaries()
@staticmethod
def add_metrics():
"""Adds new metrics"""
mk_host_metric("host_26", first_automation=beginning_of_the_month("dt"))
mk_host_metric("host_27", first_automation=timezone.now())
def assert_add_metrics(self):
"""
Two metrics were deleted in current month by change_metrics()
Two metrics are added now
=> license_consumed is equal to the previous month (13 - 2 + 2)
"""
self.expected_summaries[beginning_of_the_month()] |= {'license_consumed': 13, 'hosts_added': 2}
self.assert_host_metric_summaries()
class MetricsTesterCombinedData(MetricsTester):
def __init__(self, threshold):
super().__init__(threshold)
self.old_data = MetricsTesterOldData(threshold, ignore_asserts=True)
self.actual_data = MetricsTesterActualData(threshold, ignore_asserts=True)
def assert_host_metric_summaries(self):
self._combine_expected_summaries()
super().assert_host_metric_summaries()
def create_metrics(self):
self.old_data.create_metrics()
self.actual_data.create_metrics()
def assert_create_metrics(self):
self.old_data.assert_create_metrics()
self.actual_data.assert_create_metrics()
self.assert_host_metric_summaries()
def add_old_summaries(self):
super().add_old_summaries()
def assert_add_old_summaries(self):
self.old_data.assert_add_old_summaries()
self.actual_data.assert_add_old_summaries()
self.assert_host_metric_summaries()
def change_metrics(self):
self.old_data.change_metrics()
self.actual_data.change_metrics()
def assert_change_metrics(self):
self.old_data.assert_change_metrics()
self.actual_data.assert_change_metrics()
self.assert_host_metric_summaries()
def delete_metrics(self):
self.old_data.delete_metrics()
self.actual_data.delete_metrics()
def assert_delete_metrics(self):
self.old_data.assert_delete_metrics()
self.actual_data.assert_delete_metrics()
self.assert_host_metric_summaries()
def add_metrics(self):
self.old_data.add_metrics()
self.actual_data.add_metrics()
def assert_add_metrics(self):
self.old_data.assert_add_metrics()
self.actual_data.assert_add_metrics()
self.assert_host_metric_summaries()
def _combine_expected_summaries(self):
"""
Expected summaries are sum of expected values for tests with old and actual data
Except data older than hard delete threshold (these summaries are untouched by task => the same in all tests)
"""
for date, summary in self.old_data.expected_summaries.items():
if date <= months_ago(self.threshold):
license_consumed = summary['license_consumed']
hosts_added = summary['hosts_added']
hosts_deleted = summary['hosts_deleted']
else:
license_consumed = summary['license_consumed'] + self.actual_data.expected_summaries[date]['license_consumed']
hosts_added = summary['hosts_added'] + self.actual_data.expected_summaries[date]['hosts_added']
hosts_deleted = summary['hosts_deleted'] + self.actual_data.expected_summaries[date]['hosts_deleted']
self.expected_summaries[date] = {'date': date, 'license_consumed': license_consumed, 'hosts_added': hosts_added, 'hosts_deleted': hosts_deleted}
def months_ago(num, fmt="date"):
if num is None:
return None
return beginning_of_the_month(fmt) - relativedelta(months=num)
def beginning_of_the_month(fmt="date"):
date = datetime.date.today().replace(day=1)
if fmt == "dt":
return timezone.make_aware(datetime.datetime.combine(date, datetime.datetime.min.time()))
else:
return date
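The `months_ago` and `beginning_of_the_month` helpers above depend on `dateutil.relativedelta`. For readers who want to check the month arithmetic by hand, here is a standard-library-only sketch of the same idea. It is a swapped-in technique, not the test's implementation: `first_of_month` and `months_before` are hypothetical names, and they take the reference day as an argument instead of calling `date.today()`.

```python
import datetime


def first_of_month(day):
    """Return the first day of the month that contains ``day``."""
    return day.replace(day=1)


def months_before(day, num):
    """Return the first day of the month ``num`` months before ``day``."""
    # Count months from year 0 so that crossing a year boundary is plain arithmetic
    index = day.year * 12 + (day.month - 1) - num
    return datetime.date(index // 12, index % 12 + 1, 1)


reference = datetime.date(2023, 8, 15)
print(first_of_month(reference))
print(months_before(reference, 36))
```

Passing the reference date explicitly also makes the calculation deterministic, which can help when a test runs across a month boundary.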

@@ -1,8 +1,6 @@
# Python
import pytest
from unittest import mock
import tempfile
import shutil
import urllib.parse
from unittest.mock import PropertyMock
@@ -789,25 +787,43 @@ def oauth_application(admin):
return Application.objects.create(name='test app', user=admin, client_type='confidential', authorization_grant_type='password')
@pytest.fixture
def sqlite_copy_expert(request):
# copy_expert is postgres-specific, and SQLite doesn't support it; mock its
# behavior to test that it writes a file that contains stdout from events
path = tempfile.mkdtemp(prefix='job-event-stdout')
class MockCopy:
events = []
index = -1
def write_stdout(self, sql, fd):
# simulate postgres copy_expert support with ORM code
def __init__(self, sql):
self.events = []
parts = sql.split(' ')
tablename = parts[parts.index('from') + 1]
for cls in (JobEvent, AdHocCommandEvent, ProjectUpdateEvent, InventoryUpdateEvent, SystemJobEvent):
if cls._meta.db_table == tablename:
for event in cls.objects.order_by('start_line').all():
fd.write(event.stdout)
self.events.append(event.stdout)
setattr(SQLiteCursorWrapper, 'copy_expert', write_stdout)
request.addfinalizer(lambda: shutil.rmtree(path))
request.addfinalizer(lambda: delattr(SQLiteCursorWrapper, 'copy_expert'))
return path
def read(self):
self.index = self.index + 1
if self.index < len(self.events):
return memoryview(self.events[self.index].encode())
return None
def __enter__(self):
return self
def __exit__(self, exc_type, exc_val, exc_tb):
pass
@pytest.fixture
def sqlite_copy(request, mocker):
# copy is postgres-specific, and SQLite doesn't support it; mock its
# behavior to test that it writes a file that contains stdout from events
def write_stdout(self, sql):
mock_copy = MockCopy(sql)
return mock_copy
mocker.patch.object(SQLiteCursorWrapper, 'copy', write_stdout, create=True)
@pytest.fixture
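The fixture above replaces the manual `setattr`/`delattr` finalizers with `mocker.patch.object(..., create=True)`. The same behavior can be seen with `unittest.mock` directly. A minimal sketch, where `CursorWrapper`, `fake_copy`, and `run_with_patched_copy` are hypothetical stand-ins and not AWX or Django names:

```python
from unittest import mock


class CursorWrapper:
    """Hypothetical stand-in for a cursor class that has no ``copy`` method."""


def fake_copy(self, sql):
    # Bound as a method once it is set on the class, so ``self`` is the instance
    return f'copied: {sql}'


def run_with_patched_copy(sql):
    # create=True allows patching an attribute that does not exist yet;
    # the attribute is removed again when the context exits
    with mock.patch.object(CursorWrapper, 'copy', fake_copy, create=True):
        return CursorWrapper().copy(sql)


print(run_with_patched_copy('select 1'))
print(hasattr(CursorWrapper, 'copy'))
```

Without `create=True`, `patch.object` raises `AttributeError` for a missing attribute, so the flag is what lets the patch add a method the SQLite wrapper never had.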

@@ -98,7 +98,7 @@ class TestJobNotificationMixin(object):
@pytest.mark.django_db
@pytest.mark.parametrize('JobClass', [AdHocCommand, InventoryUpdate, Job, ProjectUpdate, SystemJob, WorkflowJob])
def test_context(self, JobClass, sqlite_copy_expert, project, inventory_source):
def test_context(self, JobClass, sqlite_copy, project, inventory_source):
"""The Jinja context defines all of the fields that can be used by a template. Ensure that the context generated
for each job type has the expected structure."""
kwargs = {}

@@ -331,15 +331,13 @@ def test_single_job_dependencies_project_launch(controlplane_instance_group, job
p.save(skip_update=True)
with mock.patch("awx.main.scheduler.TaskManager.start_task"):
dm = DependencyManager()
with mock.patch.object(DependencyManager, "create_project_update", wraps=dm.create_project_update) as mock_pu:
dm.schedule()
mock_pu.assert_called_once_with(j)
pu = [x for x in p.project_updates.all()]
assert len(pu) == 1
TaskManager().schedule()
TaskManager.start_task.assert_called_once_with(pu[0], controlplane_instance_group, instance)
pu[0].status = "successful"
pu[0].save()
dm.schedule()
pu = [x for x in p.project_updates.all()]
assert len(pu) == 1
TaskManager().schedule()
TaskManager.start_task.assert_called_once_with(pu[0], controlplane_instance_group, instance)
pu[0].status = "successful"
pu[0].save()
with mock.patch("awx.main.scheduler.TaskManager.start_task"):
TaskManager().schedule()
TaskManager.start_task.assert_called_once_with(j, controlplane_instance_group, instance)
@@ -359,15 +357,14 @@ def test_single_job_dependencies_inventory_update_launch(controlplane_instance_g
i.inventory_sources.add(ii)
with mock.patch("awx.main.scheduler.TaskManager.start_task"):
dm = DependencyManager()
with mock.patch.object(DependencyManager, "create_inventory_update", wraps=dm.create_inventory_update) as mock_iu:
dm.schedule()
mock_iu.assert_called_once_with(j, ii)
iu = [x for x in ii.inventory_updates.all()]
assert len(iu) == 1
TaskManager().schedule()
TaskManager.start_task.assert_called_once_with(iu[0], controlplane_instance_group, instance)
iu[0].status = "successful"
iu[0].save()
dm.schedule()
assert ii.inventory_updates.count() == 1
iu = [x for x in ii.inventory_updates.all()]
assert len(iu) == 1
TaskManager().schedule()
TaskManager.start_task.assert_called_once_with(iu[0], controlplane_instance_group, instance)
iu[0].status = "successful"
iu[0].save()
with mock.patch("awx.main.scheduler.TaskManager.start_task"):
TaskManager().schedule()
TaskManager.start_task.assert_called_once_with(j, controlplane_instance_group, instance)
@@ -382,11 +379,11 @@ def test_inventory_update_launches_project_update(controlplane_instance_group, s
iu = ii.create_inventory_update()
iu.status = "pending"
iu.save()
assert project.project_updates.count() == 0
with mock.patch("awx.main.scheduler.TaskManager.start_task"):
dm = DependencyManager()
with mock.patch.object(DependencyManager, "create_project_update", wraps=dm.create_project_update) as mock_pu:
dm.schedule()
mock_pu.assert_called_with(iu, project_id=project.id)
dm.schedule()
assert project.project_updates.count() == 1
@pytest.mark.django_db
@@ -407,9 +404,8 @@ def test_job_dependency_with_already_updated(controlplane_instance_group, job_te
j.save()
with mock.patch("awx.main.scheduler.TaskManager.start_task"):
dm = DependencyManager()
with mock.patch.object(DependencyManager, "create_inventory_update", wraps=dm.create_inventory_update) as mock_iu:
dm.schedule()
mock_iu.assert_not_called()
dm.schedule()
assert ii.inventory_updates.count() == 0
with mock.patch("awx.main.scheduler.TaskManager.start_task"):
TaskManager().schedule()
TaskManager.start_task.assert_called_once_with(j, controlplane_instance_group, instance)
@@ -442,7 +438,9 @@ def test_shared_dependencies_launch(controlplane_instance_group, job_template_fa
TaskManager().schedule()
pu = p.project_updates.first()
iu = ii.inventory_updates.first()
TaskManager.start_task.assert_has_calls([mock.call(iu, controlplane_instance_group, instance), mock.call(pu, controlplane_instance_group, instance)])
TaskManager.start_task.assert_has_calls(
[mock.call(iu, controlplane_instance_group, instance), mock.call(pu, controlplane_instance_group, instance)], any_order=True
)
pu.status = "successful"
pu.finished = pu.created + timedelta(seconds=1)
pu.save()
@@ -451,7 +449,9 @@ def test_shared_dependencies_launch(controlplane_instance_group, job_template_fa
iu.save()
with mock.patch("awx.main.scheduler.TaskManager.start_task"):
TaskManager().schedule()
TaskManager.start_task.assert_has_calls([mock.call(j1, controlplane_instance_group, instance), mock.call(j2, controlplane_instance_group, instance)])
TaskManager.start_task.assert_has_calls(
[mock.call(j1, controlplane_instance_group, instance), mock.call(j2, controlplane_instance_group, instance)], any_order=True
)
pu = [x for x in p.project_updates.all()]
iu = [x for x in ii.inventory_updates.all()]
assert len(pu) == 1

View File

@@ -78,6 +78,7 @@ def test_default_cred_types():
[
'aim',
'aws',
'aws_secretsmanager_credential',
'azure_kv',
'azure_rm',
'centrify_vault_kv',

View File

@@ -3,6 +3,7 @@ import multiprocessing
import random
import signal
import time
import yaml
from unittest import mock
from django.utils.timezone import now as tz_now
@@ -13,6 +14,7 @@ from awx.main.dispatch import reaper
from awx.main.dispatch.pool import StatefulPoolWorker, WorkerPool, AutoscalePool
from awx.main.dispatch.publish import task
from awx.main.dispatch.worker import BaseWorker, TaskWorker
from awx.main.dispatch.periodic import Scheduler
'''
@@ -439,3 +441,76 @@ class TestJobReaper(object):
assert job.started > ref_time
assert job.status == 'running'
assert job.job_explanation == ''
@pytest.mark.django_db
class TestScheduler:
def test_too_many_schedules_freak_out(self):
with pytest.raises(RuntimeError):
Scheduler({'job1': {'schedule': datetime.timedelta(seconds=1)}, 'job2': {'schedule': datetime.timedelta(seconds=1)}})
def test_spread_out(self):
scheduler = Scheduler(
{
'job1': {'schedule': datetime.timedelta(seconds=16)},
'job2': {'schedule': datetime.timedelta(seconds=16)},
'job3': {'schedule': datetime.timedelta(seconds=16)},
'job4': {'schedule': datetime.timedelta(seconds=16)},
}
)
assert [job.offset for job in scheduler.jobs] == [0, 4, 8, 12]
def test_missed_schedule(self, mocker):
scheduler = Scheduler({'job1': {'schedule': datetime.timedelta(seconds=10)}})
assert scheduler.jobs[0].missed_runs(time.time() - scheduler.global_start) == 0
mocker.patch('awx.main.dispatch.periodic.time.time', return_value=scheduler.global_start + 50)
scheduler.get_and_mark_pending()
assert scheduler.jobs[0].missed_runs(50) > 1
def test_advance_schedule(self, mocker):
scheduler = Scheduler(
{
'job1': {'schedule': datetime.timedelta(seconds=30)},
'joba': {'schedule': datetime.timedelta(seconds=20)},
'jobb': {'schedule': datetime.timedelta(seconds=20)},
}
)
for job in scheduler.jobs:
# HACK: the offsets automatically added make this a hard test to write... so remove offsets
job.offset = 0.0
mocker.patch('awx.main.dispatch.periodic.time.time', return_value=scheduler.global_start + 29)
to_run = scheduler.get_and_mark_pending()
assert set(job.name for job in to_run) == set(['joba', 'jobb'])
mocker.patch('awx.main.dispatch.periodic.time.time', return_value=scheduler.global_start + 39)
to_run = scheduler.get_and_mark_pending()
assert len(to_run) == 1
assert to_run[0].name == 'job1'
@staticmethod
def get_job(scheduler, name):
for job in scheduler.jobs:
if job.name == name:
return job
def test_scheduler_debug(self, mocker):
scheduler = Scheduler(
{
'joba': {'schedule': datetime.timedelta(seconds=20)},
'jobb': {'schedule': datetime.timedelta(seconds=50)},
'jobc': {'schedule': datetime.timedelta(seconds=500)},
'jobd': {'schedule': datetime.timedelta(seconds=20)},
}
)
rel_time = 119.9 # slightly under the 6th 20-second bin, to avoid offset problems
current_time = scheduler.global_start + rel_time
mocker.patch('awx.main.dispatch.periodic.time.time', return_value=current_time - 1.0e-8)
self.get_job(scheduler, 'jobb').mark_run(rel_time)
self.get_job(scheduler, 'jobd').mark_run(rel_time - 20.0)
output = scheduler.debug()
data = yaml.safe_load(output)
assert data['schedule_list']['jobc']['last_run_seconds_ago'] is None
assert data['schedule_list']['joba']['missed_runs'] == 4
assert data['schedule_list']['jobd']['missed_runs'] == 3
assert data['schedule_list']['jobd']['completed_runs'] == 1
assert data['schedule_list']['jobb']['next_run_in_seconds'] > 25.0
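
The `test_spread_out` and `test_too_many_schedules_freak_out` cases above pin down two behaviours: four 16-second jobs receive offsets 0, 4, 8 and 12, and two 1-second jobs cannot be scheduled. A minimal sketch of one offset scheme consistent with both expectations follows. `spread_offsets` is a hypothetical helper, and the whole-second staggering across the shortest interval is an assumption, not necessarily what `awx.main.dispatch.periodic.Scheduler` does:

```python
def spread_offsets(intervals):
    """Stagger job start offsets (whole seconds) across the shortest interval.

    Hypothetical helper: the real Scheduler may compute offsets differently.
    """
    min_interval = int(min(intervals))
    count = len(intervals)
    # Job i starts i/count of the way through the shortest interval
    offsets = [(index * min_interval) // count for index in range(count)]
    if len(set(offsets)) != count:
        # Not enough whole seconds to give every job its own slot
        raise RuntimeError("too many schedules to give each one a distinct offset")
    return offsets


offsets = spread_offsets([16, 16, 16, 16])
```

Staggering keeps jobs with identical periods from all becoming due in the same pass of the main loop.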

View File

@@ -6,6 +6,7 @@ import json
from awx.main.models import (
Job,
Instance,
Host,
JobHostSummary,
InventoryUpdate,
InventorySource,
@@ -18,6 +19,9 @@ from awx.main.models import (
ExecutionEnvironment,
)
from awx.main.tasks.system import cluster_node_heartbeat
from awx.main.tasks.facts import update_hosts
from django.db import OperationalError
from django.test.utils import override_settings
@@ -112,6 +116,51 @@ def test_job_notification_host_data(inventory, machine_credential, project, job_
}
@pytest.mark.django_db
class TestAnsibleFactsSave:
current_call = 0
def test_update_hosts_deleted_host(self, inventory):
hosts = [Host.objects.create(inventory=inventory, name=f'foo{i}') for i in range(3)]
for host in hosts:
host.ansible_facts = {'foo': 'bar'}
last_pk = hosts[-1].pk
assert inventory.hosts.count() == 3
Host.objects.get(pk=last_pk).delete()
assert inventory.hosts.count() == 2
update_hosts(hosts)
assert inventory.hosts.count() == 2
for host in inventory.hosts.all():
host.refresh_from_db()
assert host.ansible_facts == {'foo': 'bar'}
def test_update_hosts_forever_deadlock(self, inventory, mocker):
hosts = [Host.objects.create(inventory=inventory, name=f'foo{i}') for i in range(3)]
for host in hosts:
host.ansible_facts = {'foo': 'bar'}
db_mock = mocker.patch('awx.main.tasks.facts.Host.objects.bulk_update')
db_mock.side_effect = OperationalError('deadlock detected')
with pytest.raises(OperationalError):
update_hosts(hosts)
def fake_bulk_update(self, host_list):
if self.current_call > 2:
return Host.objects.bulk_update(host_list, ['ansible_facts', 'ansible_facts_modified'])
self.current_call += 1
raise OperationalError('deadlock detected')
def test_update_hosts_resolved_deadlock(self, inventory, mocker):
hosts = [Host.objects.create(inventory=inventory, name=f'foo{i}') for i in range(3)]
for host in hosts:
host.ansible_facts = {'foo': 'bar'}
self.current_call = 0
mocker.patch('awx.main.tasks.facts.raw_update_hosts', new=self.fake_bulk_update)
update_hosts(hosts)
for host in inventory.hosts.all():
host.refresh_from_db()
assert host.ansible_facts == {'foo': 'bar'}
@pytest.mark.django_db
class TestLaunchConfig:
def test_null_creation_from_prompts(self):
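
The `TestAnsibleFactsSave` cases above exercise a retry around the host facts update: a permanent deadlock must still raise, while a deadlock that clears after a few attempts must succeed. A minimal sketch of that pattern, runnable without Django, is below. `retry_on_deadlock`, its `retries` and `delay` parameters, and the local `OperationalError` class are assumptions for illustration; the actual retry logic lives in `awx.main.tasks.facts` and may differ:

```python
import time


class OperationalError(Exception):
    """Stand-in for django.db.OperationalError so the sketch runs without Django."""


def retry_on_deadlock(func, *args, retries=5, delay=0.0):
    """Call func, retrying while it raises a deadlock OperationalError."""
    for attempt in range(retries + 1):
        try:
            return func(*args)
        except OperationalError as exc:
            # Only retry deadlocks, and give up after the last allowed attempt
            if 'deadlock detected' not in str(exc) or attempt == retries:
                raise
            time.sleep(delay)


calls = {'count': 0}


def flaky_update(hosts):
    # Fails three times and then succeeds, like fake_bulk_update in the test
    if calls['count'] > 2:
        return len(hosts)
    calls['count'] += 1
    raise OperationalError('deadlock detected')


updated = retry_on_deadlock(flaky_update, ['foo0', 'foo1', 'foo2'])
```

Re-raising after the final attempt is what lets the "forever deadlock" test observe the `OperationalError`.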

View File

@@ -16,8 +16,7 @@ from django.test.utils import override_settings
@pytest.mark.django_db
def test_get_notification_template_list(get, user, notification_template):
url = reverse('api:notification_template_list')
response = get(url, user('admin', True))
assert response.status_code == 200
response = get(url, user('admin', True), expect=200)
assert len(response.data['results']) == 1
@@ -35,8 +34,8 @@ def test_basic_parameterization(get, post, user, organization):
notification_configuration=dict(url="http://localhost", disable_ssl_verification=False, headers={"Test": "Header"}),
),
u,
expect=201,
)
assert response.status_code == 201
url = reverse('api:notification_template_detail', kwargs={'pk': response.data['id']})
response = get(url, u)
assert 'related' in response.data
@@ -69,8 +68,8 @@ def test_encrypted_subfields(get, post, user, organization):
notification_configuration=dict(account_sid="dummy", account_token="shouldhide", from_number="+19999999999", to_numbers=["9998887777"]),
),
u,
expect=201,
)
assert response.status_code == 201
notification_template_actual = NotificationTemplate.objects.get(id=response.data['id'])
url = reverse('api:notification_template_detail', kwargs={'pk': response.data['id']})
response = get(url, u)
@@ -96,8 +95,8 @@ def test_inherited_notification_templates(get, post, user, organization, project
notification_configuration=dict(url="http://localhost", disable_ssl_verification=False, headers={"Test": "Header"}),
),
u,
expect=201,
)
assert response.status_code == 201
notification_templates.append(response.data['id'])
i = Inventory.objects.create(name='test', organization=organization)
i.save()
@@ -122,8 +121,7 @@ def test_disallow_delete_when_notifications_pending(delete, user, notification_t
u = user('superuser', True)
url = reverse('api:notification_template_detail', kwargs={'pk': notification_template.id})
Notification.objects.create(notification_template=notification_template, status='pending')
response = delete(url, user=u)
assert response.status_code == 405
delete(url, user=u, expect=405)
@pytest.mark.django_db
@@ -133,9 +131,8 @@ def test_notification_template_list_includes_notification_errors(get, user, noti
Notification.objects.create(notification_template=notification_template, status='successful')
url = reverse('api:notification_template_list')
u = user('superuser', True)
response = get(url, user=u)
response = get(url, user=u, expect=200)
assert response.status_code == 200
notifications = response.data['results'][0]['summary_fields']['recent_notifications']
assert len(notifications) == 3
statuses = [n['status'] for n in notifications]
@@ -163,8 +160,8 @@ def test_custom_environment_injection(post, user, organization):
notification_configuration=dict(url="https://example.org", disable_ssl_verification=False, http_method="POST", headers={"Test": "Header"}),
),
u,
expect=201,
)
assert response.status_code == 201
template = NotificationTemplate.objects.get(pk=response.data['id'])
with pytest.raises(ConnectionError), override_settings(AWX_TASK_ENV={'HTTPS_PROXY': '192.168.50.100:1234'}), mock.patch.object(
HTTPAdapter, 'send'

View File

@@ -4,7 +4,7 @@ from unittest import mock # noqa
import pytest
from awx.api.versioning import reverse
from awx.main.models import Project
from awx.main.models import Project, JobTemplate
from django.core.exceptions import ValidationError
@@ -451,3 +451,19 @@ def test_project_list_ordering_with_duplicate_names(get, order_by, organization_
results = get(reverse('api:project_list'), objects.superusers.admin, QUERY_STRING='order_by=%s' % order_by).data['results']
project_ids[x] = [proj['id'] for proj in results]
assert project_ids[0] == project_ids[1] == project_ids[2] == [1, 2, 3, 4, 5]
@pytest.mark.django_db
def test_project_failed_update(post, project, admin, inventory):
"""Test to ensure failed projects with update on launch will create launch rather than error"""
jt = JobTemplate.objects.create(project=project, inventory=inventory)
# set project to update on launch and set status to failed
project.update_fields(scm_update_on_launch=True)
project.update()
project.project_updates.last().update_fields(status='failed')
response = post(reverse('api:job_template_launch', kwargs={'pk': jt.pk}), user=admin, expect=201)
assert response.status_code == 201
# set project to not update on launch and validate still 400's
project.update_fields(scm_update_on_launch=False)
response = post(reverse('api:job_template_launch', kwargs={'pk': jt.pk}), user=admin, expect=400)
assert response.status_code == 400

View File

@@ -47,7 +47,7 @@ data_loggly = {
'\n'.join(
[
'template(name="awx" type="string" string="%rawmsg-after-pri%")\nmodule(load="omhttp")',
'action(type="omhttp" server="logs-01.loggly.com" serverport="80" usehttps="off" allowunsignedcerts="off" skipverifyhost="off" action.resumeRetryCount="-1" template="awx" action.resumeInterval="5" errorfile="/var/log/tower/rsyslog.err" restpath="inputs/1fd38090-2af1-4e1e-8d80-492899da0f71/tag/http/")', # noqa
'action(type="omhttp" server="logs-01.loggly.com" serverport="80" usehttps="off" allowunsignedcerts="off" skipverifyhost="off" action.resumeRetryCount="-1" template="awx" action.resumeInterval="5" queue.spoolDirectory="/var/lib/awx" queue.filename="awx-external-logger-action-queue" queue.maxdiskspace="1g" queue.type="LinkedList" queue.saveOnShutdown="on" errorfile="/var/log/tower/rsyslog.err" restpath="inputs/1fd38090-2af1-4e1e-8d80-492899da0f71/tag/http/")', # noqa
]
),
),
@@ -89,7 +89,7 @@ data_loggly = {
'\n'.join(
[
'template(name="awx" type="string" string="%rawmsg-after-pri%")\nmodule(load="omhttp")',
'action(type="omhttp" server="yoursplunk" serverport="443" usehttps="on" allowunsignedcerts="off" skipverifyhost="off" action.resumeRetryCount="-1" template="awx" action.resumeInterval="5" errorfile="/var/log/tower/rsyslog.err" restpath="services/collector/event")', # noqa
'action(type="omhttp" server="yoursplunk" serverport="443" usehttps="on" allowunsignedcerts="off" skipverifyhost="off" action.resumeRetryCount="-1" template="awx" action.resumeInterval="5" queue.spoolDirectory="/var/lib/awx" queue.filename="awx-external-logger-action-queue" queue.maxdiskspace="1g" queue.type="LinkedList" queue.saveOnShutdown="on" errorfile="/var/log/tower/rsyslog.err" restpath="services/collector/event")', # noqa
]
),
),
@@ -103,7 +103,7 @@ data_loggly = {
'\n'.join(
[
'template(name="awx" type="string" string="%rawmsg-after-pri%")\nmodule(load="omhttp")',
'action(type="omhttp" server="yoursplunk" serverport="80" usehttps="off" allowunsignedcerts="off" skipverifyhost="off" action.resumeRetryCount="-1" template="awx" action.resumeInterval="5" errorfile="/var/log/tower/rsyslog.err" restpath="services/collector/event")', # noqa
'action(type="omhttp" server="yoursplunk" serverport="80" usehttps="off" allowunsignedcerts="off" skipverifyhost="off" action.resumeRetryCount="-1" template="awx" action.resumeInterval="5" queue.spoolDirectory="/var/lib/awx" queue.filename="awx-external-logger-action-queue" queue.maxdiskspace="1g" queue.type="LinkedList" queue.saveOnShutdown="on" errorfile="/var/log/tower/rsyslog.err" restpath="services/collector/event")', # noqa
]
),
),
@@ -117,7 +117,7 @@ data_loggly = {
'\n'.join(
[
'template(name="awx" type="string" string="%rawmsg-after-pri%")\nmodule(load="omhttp")',
'action(type="omhttp" server="yoursplunk" serverport="8088" usehttps="on" allowunsignedcerts="off" skipverifyhost="off" action.resumeRetryCount="-1" template="awx" action.resumeInterval="5" errorfile="/var/log/tower/rsyslog.err" restpath="services/collector/event")', # noqa
'action(type="omhttp" server="yoursplunk" serverport="8088" usehttps="on" allowunsignedcerts="off" skipverifyhost="off" action.resumeRetryCount="-1" template="awx" action.resumeInterval="5" queue.spoolDirectory="/var/lib/awx" queue.filename="awx-external-logger-action-queue" queue.maxdiskspace="1g" queue.type="LinkedList" queue.saveOnShutdown="on" errorfile="/var/log/tower/rsyslog.err" restpath="services/collector/event")', # noqa
]
),
),
@@ -131,7 +131,7 @@ data_loggly = {
'\n'.join(
[
'template(name="awx" type="string" string="%rawmsg-after-pri%")\nmodule(load="omhttp")',
'action(type="omhttp" server="yoursplunk" serverport="8088" usehttps="on" allowunsignedcerts="off" skipverifyhost="off" action.resumeRetryCount="-1" template="awx" action.resumeInterval="5" errorfile="/var/log/tower/rsyslog.err" restpath="services/collector/event")', # noqa
'action(type="omhttp" server="yoursplunk" serverport="8088" usehttps="on" allowunsignedcerts="off" skipverifyhost="off" action.resumeRetryCount="-1" template="awx" action.resumeInterval="5" queue.spoolDirectory="/var/lib/awx" queue.filename="awx-external-logger-action-queue" queue.maxdiskspace="1g" queue.type="LinkedList" queue.saveOnShutdown="on" errorfile="/var/log/tower/rsyslog.err" restpath="services/collector/event")', # noqa
]
),
),
@@ -145,7 +145,7 @@ data_loggly = {
'\n'.join(
[
'template(name="awx" type="string" string="%rawmsg-after-pri%")\nmodule(load="omhttp")',
'action(type="omhttp" server="yoursplunk.org" serverport="8088" usehttps="on" allowunsignedcerts="off" skipverifyhost="off" action.resumeRetryCount="-1" template="awx" action.resumeInterval="5" errorfile="/var/log/tower/rsyslog.err" restpath="services/collector/event")', # noqa
'action(type="omhttp" server="yoursplunk.org" serverport="8088" usehttps="on" allowunsignedcerts="off" skipverifyhost="off" action.resumeRetryCount="-1" template="awx" action.resumeInterval="5" queue.spoolDirectory="/var/lib/awx" queue.filename="awx-external-logger-action-queue" queue.maxdiskspace="1g" queue.type="LinkedList" queue.saveOnShutdown="on" errorfile="/var/log/tower/rsyslog.err" restpath="services/collector/event")', # noqa
]
),
),
@@ -159,7 +159,7 @@ data_loggly = {
'\n'.join(
[
'template(name="awx" type="string" string="%rawmsg-after-pri%")\nmodule(load="omhttp")',
'action(type="omhttp" server="yoursplunk.org" serverport="8088" usehttps="off" allowunsignedcerts="off" skipverifyhost="off" action.resumeRetryCount="-1" template="awx" action.resumeInterval="5" errorfile="/var/log/tower/rsyslog.err" restpath="services/collector/event")', # noqa
'action(type="omhttp" server="yoursplunk.org" serverport="8088" usehttps="off" allowunsignedcerts="off" skipverifyhost="off" action.resumeRetryCount="-1" template="awx" action.resumeInterval="5" queue.spoolDirectory="/var/lib/awx" queue.filename="awx-external-logger-action-queue" queue.maxdiskspace="1g" queue.type="LinkedList" queue.saveOnShutdown="on" errorfile="/var/log/tower/rsyslog.err" restpath="services/collector/event")', # noqa
]
),
),
@@ -173,7 +173,7 @@ data_loggly = {
'\n'.join(
[
'template(name="awx" type="string" string="%rawmsg-after-pri%")\nmodule(load="omhttp")',
'action(type="omhttp" server="endpoint5.collection.us2.sumologic.com" serverport="443" usehttps="on" allowunsignedcerts="off" skipverifyhost="off" action.resumeRetryCount="-1" template="awx" action.resumeInterval="5" errorfile="/var/log/tower/rsyslog.err" restpath="receiver/v1/http/ZaVnC4dhaV0qoiETY0MrM3wwLoDgO1jFgjOxE6-39qokkj3LGtOroZ8wNaN2M6DtgYrJZsmSi4-36_Up5TbbN_8hosYonLKHSSOSKY845LuLZBCBwStrHQ==")', # noqa
'action(type="omhttp" server="endpoint5.collection.us2.sumologic.com" serverport="443" usehttps="on" allowunsignedcerts="off" skipverifyhost="off" action.resumeRetryCount="-1" template="awx" action.resumeInterval="5" queue.spoolDirectory="/var/lib/awx" queue.filename="awx-external-logger-action-queue" queue.maxdiskspace="1g" queue.type="LinkedList" queue.saveOnShutdown="on" errorfile="/var/log/tower/rsyslog.err" restpath="receiver/v1/http/ZaVnC4dhaV0qoiETY0MrM3wwLoDgO1jFgjOxE6-39qokkj3LGtOroZ8wNaN2M6DtgYrJZsmSi4-36_Up5TbbN_8hosYonLKHSSOSKY845LuLZBCBwStrHQ==")', # noqa
]
),
),

View File

@@ -90,6 +90,7 @@ __all__ = [
'get_event_partition_epoch',
'cleanup_new_process',
'log_excess_runtime',
'unified_job_class_to_event_table_name',
]
@@ -1219,3 +1220,7 @@ def log_excess_runtime(func_logger, cutoff=5.0, debug_cutoff=5.0, msg=None, add_
return _new_func
return log_excess_runtime_decorator
def unified_job_class_to_event_table_name(job_class):
return f'main_{job_class().event_class.__name__.lower()}'
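
A small usage sketch for the new helper, assuming `event_class` is exposed as an instance property (which would explain why the helper instantiates `job_class`). `Job` and `JobEvent` here are hypothetical stand-ins, not the AWX models:

```python
class JobEvent:
    """Hypothetical stand-in for an event model."""


class Job:
    """Hypothetical stand-in for a unified job model exposing event_class."""

    @property
    def event_class(self):
        return JobEvent


def unified_job_class_to_event_table_name(job_class):
    # Same body as the helper added in the hunk above
    return f'main_{job_class().event_class.__name__.lower()}'


table_name = unified_job_class_to_event_table_name(Job)
```

With these stand-ins the result is the lowercased event class name prefixed with `main_`.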

View File

@@ -17,7 +17,8 @@ def construct_rsyslog_conf_template(settings=settings):
port = getattr(settings, 'LOG_AGGREGATOR_PORT', '')
protocol = getattr(settings, 'LOG_AGGREGATOR_PROTOCOL', '')
timeout = getattr(settings, 'LOG_AGGREGATOR_TCP_TIMEOUT', 5)
max_disk_space = getattr(settings, 'LOG_AGGREGATOR_MAX_DISK_USAGE_GB', 1)
max_disk_space_main_queue = getattr(settings, 'LOG_AGGREGATOR_MAX_DISK_USAGE_GB', 1)
max_disk_space_action_queue = getattr(settings, 'LOG_AGGREGATOR_ACTION_MAX_DISK_USAGE_GB', 1)
spool_directory = getattr(settings, 'LOG_AGGREGATOR_MAX_DISK_USAGE_PATH', '/var/lib/awx').rstrip('/')
error_log_file = getattr(settings, 'LOG_AGGREGATOR_RSYSLOGD_ERROR_LOG_FILE', '')
@@ -32,7 +33,7 @@ def construct_rsyslog_conf_template(settings=settings):
'$WorkDirectory /var/lib/awx/rsyslog',
f'$MaxMessageSize {max_bytes}',
'$IncludeConfig /var/lib/awx/rsyslog/conf.d/*.conf',
f'main_queue(queue.spoolDirectory="{spool_directory}" queue.maxdiskspace="{max_disk_space}g" queue.type="Disk" queue.filename="awx-external-logger-backlog")', # noqa
f'main_queue(queue.spoolDirectory="{spool_directory}" queue.maxdiskspace="{max_disk_space_main_queue}g" queue.type="Disk" queue.filename="awx-external-logger-backlog")', # noqa
'module(load="imuxsock" SysSock.Use="off")',
'input(type="imuxsock" Socket="' + settings.LOGGING['handlers']['external_logger']['address'] + '" unlink="on" RateLimit.Burst="0")',
'template(name="awx" type="string" string="%rawmsg-after-pri%")',
@@ -78,6 +79,11 @@ def construct_rsyslog_conf_template(settings=settings):
'action.resumeRetryCount="-1"',
'template="awx"',
f'action.resumeInterval="{timeout}"',
f'queue.spoolDirectory="{spool_directory}"',
'queue.filename="awx-external-logger-action-queue"',
f'queue.maxdiskspace="{max_disk_space_action_queue}g"',
'queue.type="LinkedList"',
'queue.saveOnShutdown="on"',
]
if error_log_file:
params.append(f'errorfile="{error_log_file}"')
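
The hunk above appends five `queue.*` parameters to the `omhttp` action, which is what the updated expectations in the logging tests now contain. A minimal sketch of how that fragment renders with the default values; `action_queue_params` is a hypothetical helper that isolates just these parameters:

```python
def action_queue_params(spool_directory='/var/lib/awx', max_disk_space_gb=1):
    """Build the queue.* parameters appended to the omhttp action (sketch)."""
    # Mirror the rstrip('/') applied to the spool directory setting
    spool_directory = spool_directory.rstrip('/')
    return [
        f'queue.spoolDirectory="{spool_directory}"',
        'queue.filename="awx-external-logger-action-queue"',
        f'queue.maxdiskspace="{max_disk_space_gb}g"',
        'queue.type="LinkedList"',
        'queue.saveOnShutdown="on"',
    ]


rendered = ' '.join(action_queue_params('/var/lib/awx/'))
```

The action queue size is read from `LOG_AGGREGATOR_ACTION_MAX_DISK_USAGE_GB`, separately from the main queue's `LOG_AGGREGATOR_MAX_DISK_USAGE_GB`.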

View File

@@ -97,8 +97,6 @@ class SpecialInventoryHandler(logging.Handler):
self.event_handler(dispatch_data)
ColorHandler = logging.StreamHandler
if settings.COLOR_LOGS is True:
try:
from logutils.colorize import ColorizingStreamHandler
@@ -133,3 +131,5 @@ if settings.COLOR_LOGS is True:
except ImportError:
# logutils is only used for colored logs in the dev environment
pass
else:
ColorHandler = logging.StreamHandler

View File

@@ -175,7 +175,12 @@ class Licenser(object):
license.setdefault('pool_id', sub['pool']['id'])
license.setdefault('product_name', sub['pool']['productName'])
license.setdefault('valid_key', True)
license.setdefault('license_type', 'enterprise')
if sub['pool']['productId'].startswith('S'):
license.setdefault('trial', True)
license.setdefault('license_type', 'trial')
else:
license.setdefault('trial', False)
license.setdefault('license_type', 'enterprise')
license.setdefault('satellite', False)
# Use the nearest end date
endDate = parse_date(sub['endDate'])
@@ -287,7 +292,7 @@ class Licenser(object):
license['productId'] = sub['product_id']
license['quantity'] = int(sub['quantity'])
license['support_level'] = sub['support_level']
license['usage'] = sub['usage']
license['usage'] = sub.get('usage')
license['subscription_name'] = sub['name']
license['subscriptionId'] = sub['subscription_id']
license['accountNumber'] = sub['account_number']
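
The two hunks above change how a subscription is classified and how a missing `usage` attribute is handled. A minimal sketch of both behaviours in isolation; `license_type_for` and `read_usage` are hypothetical helpers, and the product IDs used below are made up for illustration:

```python
def license_type_for(product_id):
    """Return (trial, license_type) using the productId prefix check from the diff."""
    if product_id.startswith('S'):
        return True, 'trial'
    return False, 'enterprise'


def read_usage(sub):
    # dict.get returns None instead of raising KeyError when 'usage' is absent
    return sub.get('usage')


trial_info = license_type_for('SER0000')        # made-up trial-style product ID
enterprise_info = license_type_for('MCT0000')   # made-up non-trial product ID
missing_usage = read_usage({'name': 'Example Subscription'})
```

Using `.get` is what allows a manifest without `usage` to be imported instead of failing with `KeyError`.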

View File

@@ -12,7 +12,7 @@ from channels.layers import get_channel_layer
from django.conf import settings
from django.apps import apps
import asyncpg
import psycopg
from awx.main.analytics.broadcast_websocket import (
RelayWebsocketStats,
@@ -209,53 +209,49 @@ class WebSocketRelayManager(object):
# hostname -> ip
self.known_hosts: Dict[str, str] = dict()
async def on_heartbeet(self, conn, pid, channel, payload):
try:
if not payload or channel != "web_heartbeet":
return
async def on_ws_heartbeat(self, conn):
await conn.execute("LISTEN web_ws_heartbeat")
async for notif in conn.notifies():
if notif is None:
continue
try:
payload = json.loads(payload)
except json.JSONDecodeError:
logmsg = "Failed to decode message from pg_notify channel `web_heartbeet`"
if logger.isEnabledFor(logging.DEBUG):
logmsg = "{} {}".format(logmsg, payload)
logger.warning(logmsg)
return
# Skip if the message comes from the same host we are running on
# In this case, we'll be sharing a redis, no need to relay.
if payload.get("hostname") == self.local_hostname:
return
if payload.get("action") == "online":
hostname = payload.get("hostname")
ip = payload.get("ip")
if ip is None:
# If we don't get an IP, just try the hostname, maybe it resolves
ip = hostname
if ip is None:
logger.warning(f"Received invalid online heartbeet, missing hostname and ip: {payload}")
if not notif.payload or notif.channel != "web_ws_heartbeat":
return
self.known_hosts[hostname] = ip
logger.debug(f"Web host {hostname} ({ip}) online heartbeat received.")
elif payload.get("action") == "offline":
hostname = payload.get("hostname")
ip = payload.get("ip")
if ip is None:
# If we don't get an IP, just try the hostname, maybe it resolves
ip = hostname
if ip is None:
logger.warning(f"Received invalid offline heartbeet, missing hostname and ip: {payload}")
return
self.cleanup_offline_host(ip)
logger.debug(f"Web host {hostname} ({ip}) offline heartbeat received.")
except Exception as e:
# This catch-all is the same as the one above. asyncio will eat the exception
# but we want to know about it.
logger.exception(f"on_heartbeet exception: {e}")
def cleanup_offline_host(self, hostname):
try:
payload = json.loads(notif.payload)
except json.JSONDecodeError:
logmsg = "Failed to decode message from pg_notify channel `web_ws_heartbeat`"
if logger.isEnabledFor(logging.DEBUG):
logmsg = "{} {}".format(logmsg, notif.payload)
logger.warning(logmsg)
return
# Skip if the message comes from the same host we are running on
# In this case, we'll be sharing a redis, no need to relay.
if payload.get("hostname") == self.local_hostname:
return
action = payload.get("action")
if action in ("online", "offline"):
hostname = payload.get("hostname")
ip = payload.get("ip") or hostname  # fall back to hostname if ip isn't supplied
if ip is None:
logger.warning(f"Received invalid {action} ws_heartbeat, missing hostname and ip: {payload}")
return
logger.debug(f"Web host {hostname} ({ip}) {action} heartbeat received.")
if action == "online":
self.known_hosts[hostname] = ip
elif action == "offline":
await self.cleanup_offline_host(hostname)
except Exception as e:
# This catch-all is the same as the one above. asyncio will eat the exception
# but we want to know about it.
logger.exception(f"on_ws_heartbeat exception: {e}")
async def cleanup_offline_host(self, hostname):
"""
Given a hostname, try to cancel its task/connection and remove it from
the list of hosts we know about.
@@ -264,6 +260,19 @@ class WebSocketRelayManager(object):
"""
if hostname in self.relay_connections:
self.relay_connections[hostname].cancel()
# Wait for the task to actually run its cancel/completion logic
# otherwise it might get GC'd too early when we del it below.
# Being GC'd too early could generate a scary message in logs:
# "Task was destroyed but it is pending!"
try:
await asyncio.wait_for(self.relay_connections[hostname].async_task, timeout=10)
except asyncio.TimeoutError:
logger.warning(f"Tried to cancel relay connection for {hostname} but it timed out during cleanup.")
except asyncio.CancelledError:
# Handle the case where the task was already cancelled by the time we got here.
pass
del self.relay_connections[hostname]
if hostname in self.known_hosts:
@@ -282,16 +291,16 @@ class WebSocketRelayManager(object):
# Set up a pg_notify consumer for allowing web nodes to "provision" and "deprovision" themselves gracefully.
database_conf = settings.DATABASES['default']
async_conn = await asyncpg.connect(
database=database_conf['NAME'],
async_conn = await psycopg.AsyncConnection.connect(
dbname=database_conf['NAME'],
host=database_conf['HOST'],
user=database_conf['USER'],
password=database_conf['PASSWORD'],
port=database_conf['PORT'],
# We cannot include these because asyncpg doesn't allow all the options that psycopg does.
# **database_conf.get("OPTIONS", {}),
**database_conf.get("OPTIONS", {}),
)
await async_conn.add_listener("web_heartbeet", self.on_heartbeet)
await async_conn.set_autocommit(True)
event_loop.create_task(self.on_ws_heartbeat(async_conn))
# Establishes a websocket connection to /websocket/relay on all API servers
while True:
@@ -318,13 +327,11 @@ class WebSocketRelayManager(object):
if deleted_remote_hosts:
logger.info(f"Removing {deleted_remote_hosts} from websocket broadcast list")
await asyncio.gather(*[self.cleanup_offline_host(h) for h in deleted_remote_hosts])
if new_remote_hosts:
logger.info(f"Adding {new_remote_hosts} to websocket broadcast list")
for h in deleted_remote_hosts:
self.cleanup_offline_host(h)
for h in new_remote_hosts:
stats = self.stats_mgr.new_remote_host_stats(h)
relay_connection = WebsocketRelayConnection(name=self.local_hostname, stats=stats, remote_host=self.known_hosts[h])
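
The comments in `cleanup_offline_host` above explain why a cancelled relay task is awaited before its entry is deleted. A minimal, self-contained sketch of that cancel-then-wait pattern; `relay_forever`, `cancel_and_wait` and the host names are hypothetical stand-ins for the relay connection objects:

```python
import asyncio


async def relay_forever():
    # Stand-in for a long-running relay connection task
    while True:
        await asyncio.sleep(3600)


async def cancel_and_wait(task, timeout=10):
    """Cancel a task and wait for its cancellation logic to finish."""
    task.cancel()
    try:
        await asyncio.wait_for(task, timeout=timeout)
    except asyncio.TimeoutError:
        return 'timed out'
    except asyncio.CancelledError:
        # The awaited task finished cancelling
        return 'cancelled'
    return 'finished'


async def main():
    tasks = {name: asyncio.create_task(relay_forever()) for name in ('web1', 'web2')}
    await asyncio.sleep(0)  # let the tasks start running
    # gather() takes awaitables as positional arguments, so unpack the list
    return await asyncio.gather(*[cancel_and_wait(task) for task in tasks.values()])


results = asyncio.run(main())
```

Awaiting the task after `cancel()` gives it a chance to run its cleanup before the last reference is dropped, which avoids the "Task was destroyed but it is pending!" log message mentioned in the diff.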

View File

@@ -189,11 +189,12 @@
connection: local
name: Install content with ansible-galaxy command if necessary
vars:
galaxy_task_env: # configure in settings
additional_collections_env:
# These environment variables are used for installing collections, in addition to galaxy_task_env
# setting the collections paths silences warnings
galaxy_task_env: # configured in settings
# additional_galaxy_env contains environment variables that are used for installing roles and collections and take precedence over items in galaxy_task_env
additional_galaxy_env:
# These paths control where ansible-galaxy installs collections and roles on the filesystem
ANSIBLE_COLLECTIONS_PATHS: "{{ projects_root }}/.__awx_cache/{{ local_path }}/stage/requirements_collections"
ANSIBLE_ROLES_PATH: "{{ projects_root }}/.__awx_cache/{{ local_path }}/stage/requirements_roles"
# Put the local tmp directory in same volume as collection destination
# otherwise, files cannot be moved across volumes and will cause an error
ANSIBLE_LOCAL_TEMP: "{{ projects_root }}/.__awx_cache/{{ local_path }}/stage/tmp"
@@ -212,40 +213,53 @@
- name: End play due to disabled content sync
ansible.builtin.meta: end_play
- name: Fetch galaxy roles from requirements.(yml/yaml)
ansible.builtin.command: >
ansible-galaxy role install -r {{ item }}
--roles-path {{ projects_root }}/.__awx_cache/{{ local_path }}/stage/requirements_roles
{{ ' -' + 'v' * ansible_verbosity if ansible_verbosity else '' }}
args:
chdir: "{{ project_path | quote }}"
register: galaxy_result
with_fileglob:
- "{{ project_path | quote }}/roles/requirements.yaml"
- "{{ project_path | quote }}/roles/requirements.yml"
changed_when: "'was installed successfully' in galaxy_result.stdout"
environment: "{{ galaxy_task_env }}"
when: roles_enabled | bool
tags:
- install_roles
- block:
- name: Fetch galaxy roles from roles/requirements.(yml/yaml)
ansible.builtin.command:
cmd: "ansible-galaxy role install -r {{ item }} {{ verbosity }}"
register: galaxy_result
with_fileglob:
- "{{ project_path | quote }}/roles/requirements.yaml"
- "{{ project_path | quote }}/roles/requirements.yml"
changed_when: "'was installed successfully' in galaxy_result.stdout"
when: roles_enabled | bool
tags:
- install_roles
- name: Fetch galaxy collections from collections/requirements.(yml/yaml)
ansible.builtin.command: >
ansible-galaxy collection install -r {{ item }}
--collections-path {{ projects_root }}/.__awx_cache/{{ local_path }}/stage/requirements_collections
{{ ' -' + 'v' * ansible_verbosity if ansible_verbosity else '' }}
args:
chdir: "{{ project_path | quote }}"
register: galaxy_collection_result
with_fileglob:
- "{{ project_path | quote }}/collections/requirements.yaml"
- "{{ project_path | quote }}/collections/requirements.yml"
- "{{ project_path | quote }}/requirements.yaml"
- "{{ project_path | quote }}/requirements.yml"
changed_when: "'Installing ' in galaxy_collection_result.stdout"
environment: "{{ additional_collections_env | combine(galaxy_task_env) }}"
when:
- "ansible_version.full is version_compare('2.9', '>=')"
- collections_enabled | bool
tags:
- install_collections
- name: Fetch galaxy collections from collections/requirements.(yml/yaml)
ansible.builtin.command:
cmd: "ansible-galaxy collection install -r {{ item }} {{ verbosity }}"
register: galaxy_collection_result
with_fileglob:
- "{{ project_path | quote }}/collections/requirements.yaml"
- "{{ project_path | quote }}/collections/requirements.yml"
changed_when: "'Nothing to do.' not in galaxy_collection_result.stdout"
when:
- "ansible_version.full is version_compare('2.9', '>=')"
- collections_enabled | bool
tags:
- install_collections
- name: Fetch galaxy roles and collections from requirements.(yml/yaml)
ansible.builtin.command:
cmd: "ansible-galaxy install -r {{ item }} {{ verbosity }}"
register: galaxy_combined_result
with_fileglob:
- "{{ project_path | quote }}/requirements.yaml"
- "{{ project_path | quote }}/requirements.yml"
changed_when: "'Nothing to do.' not in galaxy_combined_result.stdout"
when:
- "ansible_version.full is version_compare('2.10', '>=')"
- collections_enabled | bool
- roles_enabled | bool
tags:
- install_collections
- install_roles
module_defaults:
ansible.builtin.command:
chdir: "{{ project_path | quote }}"
# We combine our additional_galaxy_env into galaxy_task_env so that our values are preferred over anything a user would set
environment: "{{ galaxy_task_env | combine(additional_galaxy_env) }}"
vars:
verbosity: "{{ (ansible_verbosity) | ternary('-'+'v'*ansible_verbosity, '') }}"
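
The playbook change above reverses the merge order, so values in `additional_galaxy_env` now win over anything in `galaxy_task_env`, and it factors the verbosity flag into a single variable. A minimal Python sketch of both expressions; `combine` mimics the Ansible filter for flat dictionaries only, and the example paths are made up:

```python
def combine(base, override):
    """Mimic Ansible's combine filter for flat dicts: keys in override win."""
    merged = dict(base)
    merged.update(override)
    return merged


def verbosity_flag(ansible_verbosity):
    # Equivalent of: (ansible_verbosity) | ternary('-' + 'v' * ansible_verbosity, '')
    return '-' + 'v' * ansible_verbosity if ansible_verbosity else ''


galaxy_task_env = {'ANSIBLE_ROLES_PATH': '/user/set/roles', 'HTTP_PROXY': 'proxy:3128'}
additional_galaxy_env = {'ANSIBLE_ROLES_PATH': '/cache/stage/requirements_roles'}

environment = combine(galaxy_task_env, additional_galaxy_env)
flag = verbosity_flag(3)
```

Because the staging paths take precedence, a user-supplied `ANSIBLE_ROLES_PATH` cannot redirect where the project update installs content.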

Some files were not shown because too many files have changed in this diff