Commit Graph

102 Commits

Author SHA1 Message Date
Shane McDonald
ec8ac6f1a7 Introduce distinct controlplane instance group 2021-06-07 11:25:59 -04:00
Jim Ladd
84af610a1f remove rebase cruft 2021-06-04 09:17:09 -07:00
Jim Ladd
7b188aafea lint 2021-06-04 09:17:09 -07:00
Ryan Petrello
c7ab3ea86e move the partition data migration to be a post-upgrade async process
this copies the approach we took with the bigint migration
2021-06-04 09:17:07 -07:00
Jim Ladd
67046513ae Push changes before rebasing 2021-06-04 09:17:07 -07:00
Jim Ladd
0eb1984b22 Only create partitions for regular jobs 2021-06-04 09:17:06 -07:00
Jim Ladd
c87d7b0d79 fix import 2021-06-04 09:17:06 -07:00
Jim Ladd
612e91263c auto-create partition 2021-06-04 09:17:06 -07:00
Jeff Bradberry
1819a7963a Make the necessary changes to the models
- remove InstanceGroup.controller
- remove Instance.last_isolated_check
- remove .is_isolated and .is_controller methods/properties
- remove .choose_online_controller_node() method
- remove .supports_isolation() and replace with .can_run_containerized
- simplify .can_run_containerized
2021-04-22 10:17:02 -04:00
Ryan Petrello
c2ef0a6500 move code linting to a stricter pep8-esque auto-formatting tool, black 2021-03-23 09:39:58 -04:00
Ryan Petrello
f850f8d3e0 introduce a new global flag for denoting K8S-based deployments
- In K8S-based installs, only container groups are intended to be used
  for playbook execution (JTs, adhoc, inventory updates), so in this
  scenario, other job types have a task impact of zero.
- In K8S-based installs, traditional instances have *zero* capacity
  (because they're only members of the control plane where services
  - http/s, local control plane execution - run)
- This commit also includes some changes that allow for the task manager
  to launch tasks with task_impact=0 on instances that have capacity=0
  (previously, an instance with zero capacity would never be selected
  as the "execution node")

This means that when IS_K8S=True, any Job Template associated with an
Instance Group will never actually go from pending -> running (because
there's no capacity - all playbooks must run through Container Groups).
For an improved ux, our intention is to introduce logic into the
operator install process such that the *default* group that's created at
install time is a *Container Group* that's configured to point at the
K8S cluster where awx itself is deployed.
2021-03-03 18:52:55 -05:00
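The capacity rule described in the commit above can be illustrated with a minimal sketch. This is not the AWX task manager code; `fits_old`, `fits_new`, and their parameters are hypothetical names, and the "old" check is only an assumption based on the commit's remark that a zero-capacity instance was previously never selected.

```python
def fits_old(task_impact, capacity, consumed):
    """Assumed prior behaviour: a zero-capacity instance is never a candidate."""
    if capacity == 0:
        return False
    return task_impact <= capacity - consumed


def fits_new(task_impact, capacity, consumed):
    """Sketch of the relaxed check: a task fits whenever its impact does not
    exceed the remaining capacity, so impact 0 fits on capacity 0."""
    return task_impact <= capacity - consumed


# A zero-impact task on a zero-capacity (control plane only) instance.
print(fits_old(0, 0, 0), fits_new(0, 0, 0))
# A task with real impact still cannot land on a zero-capacity instance.
print(fits_new(1, 0, 0))
```

Under this reading, a job with non-zero impact attached to a traditional Instance Group stays pending when IS_K8S=True, which matches the behaviour the commit message warns about.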
Shane McDonald
373bb443aa UnifiedJob#is_containerized -> UnifiedJob#is_container_group_task 2021-03-03 18:52:55 -05:00
Shane McDonald
286b1d4e25 InstanceGroup#is_containerized -> InstanceGroup#is_container_group 2021-03-03 18:52:55 -05:00
Seth Foster
41d0a2f7b9 Add job lifecycle logging
Various points (e.g. created, running, processing events) are
structured into JSON format and output to /var/log/tower/job_lifecycle.log

As part of this work, the DependencyGraph is reworked to return
which job object is doing the blocking, rather than a boolean.
2021-02-04 12:25:51 -05:00
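As a rough illustration of the structured lifecycle logging described in the commit above, the sketch below writes one JSON object per lifecycle point using only the standard library. The logger name, the field names (`job_id`, `state`, `blocked_by`), and both helper functions are assumptions made for this example, not the format AWX actually emits.

```python
import io
import json
import logging


def make_lifecycle_logger(stream):
    """Build a logger that writes bare messages (one JSON document per line)."""
    logger = logging.getLogger("job_lifecycle_sketch")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    return logger


def log_lifecycle(logger, job_id, state, blocked_by=None):
    """Emit one lifecycle point; record which job is blocking, if any."""
    record = {"job_id": job_id, "state": state}
    if blocked_by is not None:
        record["blocked_by"] = blocked_by
    logger.info(json.dumps(record, sort_keys=True))


buffer = io.StringIO()  # stands in for the real log file
lifecycle = make_lifecycle_logger(buffer)
log_lifecycle(lifecycle, 42, "created")
log_lifecycle(lifecycle, 42, "blocked", blocked_by="job-41")
print(buffer.getvalue(), end="")
```

Recording the identity of the blocking job, rather than a boolean, is what makes a "blocked" line useful when reading the log later.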
Chris Meyers
79d7c6d9b3 make optimization code work with container groups
* Task manager fit_ optimization code caused problems with container
group code.
* Note that we don't actually get the benefit of the optimization for
container groups. We just make it so that the code doesn't blow up. It
will take another pass to apply optimizations to the container group
task manager path.
2020-10-23 10:17:30 -04:00
Chris Meyers
11cc6362b5 reduce per-job database query count
* Do not query the database, for each job, for the set of Instances that
belong to the group we are trying to fit the job on.
* Instead, cache the set of instances per-instance group.
2020-10-19 11:01:11 -04:00
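The per-group caching idea in the commit above can be sketched as follows. `InstanceLookup`, `query_fn`, and the fake query are hypothetical stand-ins for the real ORM query; the only point is that the lookup runs once per instance group rather than once per job.

```python
class InstanceLookup:
    """Cache the instances of each group for the duration of one scheduling run."""

    def __init__(self, query_fn):
        self._query_fn = query_fn  # hypothetical callable that hits the database
        self._cache = {}
        self.query_count = 0

    def instances_for(self, group):
        # Only the first request for a group triggers a query.
        if group not in self._cache:
            self.query_count += 1
            self._cache[group] = list(self._query_fn(group))
        return self._cache[group]


def fake_query(group):
    """Stand-in for a database query returning the instances of a group."""
    return {"default": ["node1", "node2"], "east": ["node3"]}[group]


lookup = InstanceLookup(fake_query)
for job_group in ["default", "default", "east", "default"]:
    lookup.instances_for(job_group)
print(lookup.query_count)
```

Four jobs across two groups cost two queries in this sketch. The cache is assumed to be discarded at the end of each run so that membership changes are picked up on the next one.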
Seth Foster
e09274e533 PR #8074 - limit how many jobs the task manager can start on a given run 2020-09-08 12:16:06 -04:00
Ryan Petrello
de59d1d3f6 improve job reaping for jobs that were started on a missing Instance
see: https://github.com/ansible/awx/issues/7848
2020-08-21 16:32:17 -04:00
Jeff Bradberry
ced8f42835 Force worker processes to have a different signal handler from the parent
Situations have come up where the 5+ minute kill signal for
run_task_manager is emitted to the worker process running it, but
since the worker improperly inherited the AWXConsumerBase().stop()
handler a deadlock ultimately was triggered on the database
connection.
2020-06-04 15:41:28 -04:00
Christian Adams
19ccb5e213 Mark job_explanation strings after they are read from the db
- For strings that need to be translated, but are saved in the db:
   * They must be marked for translation using gettext_noop() to be
   translated.
   * And must also be marked for translation with _() when read from db
   and shown to the user.
   * [Ref]: https://docs.djangoproject.com/en/3.0/topics/i18n/translation/#marking-strings-as-no-op
2020-05-15 22:50:50 -04:00
Christian Adams
a899a147e1 Fix new flake8 from pyflakes 2.2.0 release 2020-04-20 09:50:50 -04:00
Seth Foster
9b4b2167b3 TaskManager process dependencies only once
This adds a boolean "dependencies_processed" field to UnifiedJob
model. The default value is False. Once the task manager generates
dependencies for this task, it will not generate them again on
subsequent runs.

The changes also remove .process_dependencies(), as this method repeats
the same code as .process_pending_tasks(), and is not needed. Once
dependencies are generated, they are handled at .process_pending_tasks().

Adds a unit test that should catch regressions for this fix.
2020-02-06 11:47:33 -05:00
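A minimal sketch of the "process dependencies only once" behaviour from the commit above. The `Task` dataclass, `generate_dependencies`, and `process_pending` are hypothetical simplifications; only the `dependencies_processed` flag and its default of False come from the commit message.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    dependencies_processed: bool = False  # default False, as in the commit
    dependencies: list = field(default_factory=list)


def generate_dependencies(task):
    """Hypothetical dependency generation, e.g. a project update before a job."""
    return [f"project-update-for-{task.name}"]


def process_pending(tasks):
    """One scheduling pass; returns how many tasks had dependencies generated."""
    generated = 0
    for task in tasks:
        if not task.dependencies_processed:
            task.dependencies = generate_dependencies(task)
            task.dependencies_processed = True
            generated += 1
    return generated


pending = [Task("job-a"), Task("job-b")]
print(process_pending(pending))  # first pass generates dependencies
print(process_pending(pending))  # later passes skip already-processed tasks
```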
Ryan Petrello
513f54a422 fix a few bugs related to container group execution
see: https://github.com/ansible/awx/issues/5326
2019-11-14 13:23:38 -05:00
Ryan Petrello
7f1096f711 reap k8s-based jobs when the dispatcher restarts 2019-10-29 11:24:11 -04:00
Ryan Petrello
1cf02e1e17 properly set execution_node for project and inv updates run "in k8s"
see: https://github.com/ansible/awx/issues/4907
2019-10-17 15:15:24 -04:00
Shane McDonald
bd5003ca98 Task manager / scheduler Kubernetes integration 2019-10-04 13:21:21 -04:00
beeankha
57fd6b7280 Set default messages for approval notifications 2019-09-27 15:48:00 -04:00
beeankha
13450fdbf9 Set up approval notifications to send 2019-09-27 15:48:00 -04:00
beeankha
6be2d84adb Add endpoints for approval node notifications
...and also add a migration file.
2019-09-27 15:48:00 -04:00
beeankha
2fc7e93c6a Emit websocket for approval node timeout
...and update timeout_message to be more translation-friendly.
2019-08-29 14:30:33 -04:00
beeankha
ea509f518e Addressing comments, updating tests, etc. 2019-08-27 15:38:15 -04:00
beeankha
f6f6e5883a Update websockets for pending approvals, change timeout expiration to 2019-08-27 15:36:27 -04:00
beeankha
d9f3fed06f Update UJ/UJT endpoints, update approval RBAC, update approval timeout 2019-08-27 15:36:25 -04:00
beeankha
544a5063f3 Update timeout implementation, placeholder code for possible websocket support 2019-08-27 15:36:24 -04:00
beeankha
8c17990750 Activity stream and timeout
Update activity stream to show approval node info, add meaningful log
message for expired approval nodes in the Task Manager timeout
function.
2019-08-27 15:36:24 -04:00
Ryan Petrello
0522d45ab0 fixed a few issues related to approval role RBAC for normal users 2019-08-27 15:36:23 -04:00
beeankha
28289e85c1 Add timeout for workflow approval nodes 2019-08-27 15:36:22 -04:00
beeankha
5f82754a3f Clean up RBAC code 2019-08-27 15:36:22 -04:00
beeankha
296b4e830b Add more RBAC for approval nodes 2019-08-27 15:36:21 -04:00
beeankha
64c94d478d Add more RBAC, filter out AJT/AJs from unified jobs lists
Comment out placeholder in serializer
2019-08-27 15:36:17 -04:00
beeankha
453e142635 Fix UJT-related error, add notification placeholders 2019-08-27 15:35:43 -04:00
beeankha
9cfed6f2a8 Add check for no-op case back, remove redundant on_commit code 2019-06-17 10:47:58 -04:00
beeankha
95896b1acd Edit wfj running notification trigger 2019-06-17 10:47:58 -04:00
beeankha
68fe23d8b7 Update Organization Notification Template subclass, move success/fail wfj notification trigger 2019-06-17 10:47:58 -04:00
beeankha
dd372548a9 Update swagger test 2019-06-17 10:47:57 -04:00
beeankha
8d6e1f0927 Trigger running notifications in WFJs and edit unit test 2019-06-17 10:47:57 -04:00
beeankha
8ec97235e3 Add feature for notifications to trigger on job start 2019-06-17 10:47:57 -04:00
Ryan Petrello
b1d75327e3 add the ability to toggle DEBUG logging on dynamically 2019-05-16 07:58:31 -04:00
Ryan Petrello
4f83d44142 mark a workflow convergence error message for i18n 2019-02-14 15:55:20 -05:00
Ryan Petrello
2927803a82 fix overindent lint failures 2019-01-30 12:12:39 -05:00