Merge pull request #12728 from john-westcott-iv/ig_fallback

Adding prevent_instance_group_fallback
fix name to be consistent (#12975 )
2026-02-05 19:44:43 -03:30 · 2022-10-03 10:47:51 -04:00 · 2022-09-29 16:52:12 -04:00 · 2022-09-29 14:20:49 -04:00 · 2022-09-29 14:19:37 -04:00 · 2022-09-29 14:19:37 -04:00
585 changed files with 45466 additions and 26345 deletions
--- a/.dockerignore
+++ b/.dockerignore
@@ -1,3 +1,2 @@
-awx/ui/node_modules
 Dockerfile
 .git
--- a/.github/ISSUE_TEMPLATE.md
+++ b/.github/ISSUE_TEMPLATE.md
@@ -25,7 +25,7 @@ Instead use the bug or feature request.
 <!--- Pick one below and delete the rest: -->
 - Breaking Change
 - New or Enhanced Feature
- - Bug or Docs Fix
+ - Bug, Docs Fix or other nominal change


 ##### COMPONENT NAME
--- a/.github/ISSUE_TEMPLATE/feature_request.yml
+++ b/.github/ISSUE_TEMPLATE/feature_request.yml
@@ -20,6 +20,19 @@ body:
        - label: I understand that AWX is open source software provided for free and that I might not receive a timely response.
          required: true

+  - type: dropdown
+    id: feature-type
+    attributes:
+      label: Feature type
+      description: >-
+        What kind of feature is this?
+      multiple: false
+      options:
+        - "New Feature"
+        - "Enhancement to Existing Feature"
+    validations:
+      required: true
+
  - type: textarea
    id: summary
    attributes:
@@ -40,3 +53,36 @@ body:
        - label: CLI
        - label: Other

+  - type: textarea
+    id: steps-to-reproduce
+    attributes:
+      label: Steps to reproduce
+      description: >-
+        Describe the necessary steps to understand the scenario of the requested enhancement. 
+        Include all the steps that will help the developer and QE team understand what you are requesting.
+    validations:
+      required: true
+
+  - type: textarea
+    id: current-results
+    attributes:
+      label: Current results
+      description: What is currently happening on the scenario?
+    validations:
+      required: true
+
+  - type: textarea
+    id: sugested-results
+    attributes:
+      label: Suggested feature result
+      description: What is the result this new feature will bring?
+    validations:
+      required: true
+
+  - type: textarea
+    id: additional-information
+    attributes:
+      label: Additional information
+      description: Please provide any other information you think is relevant that could help us understand your feature request.
+    validations:
+      required: false
--- a/.github/PULL_REQUEST_TEMPLATE.md
+++ b/.github/PULL_REQUEST_TEMPLATE.md
@@ -11,7 +11,7 @@ the change does.
 <!--- Pick one below and delete the rest: -->
 - Breaking Change 
 - New or Enhanced Feature
- - Bug or Docs Fix
+ - Bug, Docs Fix or other nominal change

 ##### COMPONENT NAME
 <!--- Name of the module/plugin/module/task -->
--- a/.github/dependabot.yml
+++ b/.github/dependabot.yml
@@ -13,7 +13,6 @@ updates:
      - "kialam"
      - "mabashian"
      - "marshmalien"
-      - "nixocio"
    labels:
      - "component:ui"
      - "dependencies"
--- a/.github/triage_replies.md
+++ b/.github/triage_replies.md
@@ -1,5 +1,5 @@
 ## General
- For the roundup of all the different mailing lists available from AWX, Ansible, and beyond visit: https://docs.ansible.com/ansible/latest/community/communication.html 
+- For the roundup of all the different mailing lists available from AWX, Ansible, and beyond visit: https://docs.ansible.com/ansible/latest/community/communication.html
 - Hello, we think your question is answered in our FAQ. Does this: https://www.ansible.com/products/awx-project/faq cover your question?
 - You can find the latest documentation here: https://docs.ansible.com/automation-controller/latest/html/userguide/index.html

@@ -29,12 +29,24 @@ In the future, sometimes starting a discussion on the development list prior to
 Thank you once again for this and your interest in AWX!


-### No Progress
+### No Progress Issue
+- Hi! \
+\
+Thank you very much for this issue. It means a lot to us that you have taken time to contribute by opening this report. \
+\
+On this issue, there were comments added but it has been some time since then without response. At this time we are closing this issue. If you get time to address the comments we can reopen the issue if you can contact us by using any of the communication methods listed in the page below: \
+\
+https://github.com/ansible/awx/#get-involved \
+\
+Thank you once again for this and your interest in AWX!
+
+
+### No Progress PR
 - Hi! \
 \
 Thank you very much for your submission to AWX. It means a lot to us that you have taken time to contribute. \
 \
-On this PR, changes were requested but it has been some time since then. We think this PR has merit but without the requested changes we are unable to merge it. At this time we are closing you PR. If you get time to address the changes you are welcome to open another PR or we can reopen this PR upon request if you contact us by using any of the communication methods listed in the page below: \
+On this PR, changes were requested but it has been some time since then. We think this PR has merit but without the requested changes we are unable to merge it. At this time we are closing your PR. If you get time to address the changes you are welcome to open another PR or we can reopen this PR upon request if you contact us by using any of the communication methods listed in the page below: \
 \
 https://github.com/ansible/awx/#get-involved \
 \
@@ -46,11 +58,15 @@ Thank you once again for this and your interest in AWX!
 ## Common

 ### Give us more info
- Hello, we'd love to help, but we need a little more information about the problem you're having. Screenshots, log outputs, or any reproducers would be very helpful. 
+- Hello, we'd love to help, but we need a little more information about the problem you're having. Screenshots, log outputs, or any reproducers would be very helpful.

 ### Code of Conduct
- Hello. Please keep in mind that Ansible adheres to a Code of Conduct in its community spaces. The spirit of the code of conduct is to be kind, and this is your friendly reminder to be so. Please see the full code of conduct here if you have questions: https://docs.ansible.com/ansible/latest/community/code_of_conduct.html 
+- Hello. Please keep in mind that Ansible adheres to a Code of Conduct in its community spaces. The spirit of the code of conduct is to be kind, and this is your friendly reminder to be so. Please see the full code of conduct here if you have questions: https://docs.ansible.com/ansible/latest/community/code_of_conduct.html

+### EE Contents / Community General
+- Hello. The awx-ee contains the collections and dependencies needed for supported AWX features to function. Anything beyond that (like the community.general package) will require you to build your own EE. For information on how to do that, see https://ansible-builder.readthedocs.io/en/stable/ \
+\
+The Ansible Community is looking at building an EE that corresponds to all of the collections inside the ansible package. That may help you if and when it happens; see https://github.com/ansible-community/community-topics/issues/31 for details.



@@ -63,29 +79,34 @@ Thank you once again for this and your interest in AWX!
 - Hello, we think your idea is good! Please consider contributing a PR for this following our contributing guidelines: https://github.com/ansible/awx/blob/devel/CONTRIBUTING.md

 ### Receptor
- You can find the receptor docs here: https://receptor.readthedocs.io/en/latest/ 
+- You can find the receptor docs here: https://receptor.readthedocs.io/en/latest/
 - Hello, your issue seems related to receptor. Could you please open an issue in the receptor repository? https://github.com/ansible/receptor. Thanks!

 ### Ansible Engine not AWX
- Hello, your question seems to be about Ansible development, not about AWX. Try asking on the Ansible-devel specific mailing list: https://groups.google.com/g/ansible-devel 
+- Hello, your question seems to be about Ansible development, not about AWX. Try asking on the Ansible-devel specific mailing list: https://groups.google.com/g/ansible-devel
 - Hello, your question seems to be about using Ansible, not about AWX. https://groups.google.com/g/ansible-project is the best place to visit for user questions about Ansible. Thanks!

 ### Ansible Galaxy not AWX
 - Hey there. That sounds like an FAQ question. Did this: https://www.ansible.com/products/awx-project/faq cover your question?

 ### Contributing Guidelines
- AWX: https://github.com/ansible/awx/blob/devel/CONTRIBUTING.md 
+- AWX: https://github.com/ansible/awx/blob/devel/CONTRIBUTING.md
 - AWX-Operator: https://github.com/ansible/awx-operator/blob/devel/CONTRIBUTING.md

+### Oracle AWX
+We'd be happy to help if you can reproduce this with AWX since we do not have Oracle's Linux Automation Manager. If you need help with this specific version of Oracle's Linux Automation Manager you will need to contact Oracle for support.
+
 ### AWX Release
+Subject: Announcing AWX Xa.Ya.za and AWX-Operator Xb.Yb.zb
+
 - Hi all, \
 \
-We're happy to announce that the next release of AWX, version <X> is now available! \
-In addition AWX Operator version <Y> has also been release! \
+We're happy to announce that the next release of AWX, version <b>`Xa.Ya.za`</b> is now available! \
+In addition AWX Operator version <b>`Xb.Yb.zb`</b> has also been released! \
 \
 Please see the releases pages for more details: \
-	AWX: https://github.com/ansible/awx/releases/tag/<X> \
-	Operator: https://github.com/ansible/awx-operator/releases/tag/<Y> \
+	AWX: https://github.com/ansible/awx/releases/tag/Xa.Ya.za \
+	Operator: https://github.com/ansible/awx-operator/releases/tag/Xb.Yb.zb \
 \
 The AWX team.

--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -111,6 +111,15 @@ jobs:
          repository: ansible/awx-operator
          path: awx-operator

+      - name: Get python version from Makefile
+        working-directory: awx
+        run: echo py_version=`make PYTHON_VERSION` >> $GITHUB_ENV
+
+      - name: Install python ${{ env.py_version }}
+        uses: actions/setup-python@v2
+        with:
+          python-version: ${{ env.py_version }}
+
      - name: Install playbook dependencies
        run: |
          python3 -m pip install docker
--- a/.github/workflows/label_issue.yml
+++ b/.github/workflows/label_issue.yml
@@ -19,3 +19,34 @@ jobs:
          not-before: 2021-12-07T07:00:00Z
          configuration-path: .github/issue_labeler.yml
          enable-versioned-regex: 0
+
+  community:
+    runs-on: ubuntu-latest
+    name: Label Issue - Community
+    steps:
+      - uses: actions/checkout@v2
+      - uses: actions/setup-python@v4
+      - name: Install python requests
+        run: pip install requests
+      - name: Check if user is a member of Ansible org
+        uses: jannekem/run-python-script-action@v1
+        id: check_user
+        with:
+          script: |
+            import requests
+            headers = {'Accept': 'application/vnd.github+json', 'Authorization': 'token ${{ secrets.GITHUB_TOKEN }}'}
+            response = requests.get('${{ fromJson(toJson(github.event.issue.user.url)) }}/orgs?per_page=100', headers=headers)
+            is_member = False
+            for org in response.json():
+              if org['login'] == 'ansible':
+                is_member = True
+            if is_member:
+                print("User is member")
+            else:
+                print("User is community")
+      - name: Add community label if not a member
+        if: contains(steps.check_user.outputs.stdout, 'community')
+        uses: andymckay/labeler@e6c4322d0397f3240f0e7e30a33b5c5df2d39e90
+        with:
+          add-labels: "community"
+          repo-token: ${{ secrets.GITHUB_TOKEN }}
--- a/.github/workflows/label_pr.yml
+++ b/.github/workflows/label_pr.yml
@@ -18,3 +18,34 @@ jobs:
        with:
          repo-token: "${{ secrets.GITHUB_TOKEN }}"
          configuration-path: .github/pr_labeler.yml
+
+  community:
+    runs-on: ubuntu-latest
+    name: Label PR - Community
+    steps:
+      - uses: actions/checkout@v2
+      - uses: actions/setup-python@v4
+      - name: Install python requests
+        run: pip install requests
+      - name: Check if user is a member of Ansible org
+        uses: jannekem/run-python-script-action@v1
+        id: check_user
+        with:
+          script: |
+            import requests
+            headers = {'Accept': 'application/vnd.github+json', 'Authorization': 'token ${{ secrets.GITHUB_TOKEN }}'}
+            response = requests.get('${{ fromJson(toJson(github.event.pull_request.user.url)) }}/orgs?per_page=100', headers=headers)
+            is_member = False
+            for org in response.json():
+              if org['login'] == 'ansible':
+                is_member = True
+            if is_member:
+                print("User is member")
+            else:
+                print("User is community")
+      - name: Add community label if not a member
+        if: contains(steps.check_user.outputs.stdout, 'community')
+        uses: andymckay/labeler@e6c4322d0397f3240f0e7e30a33b5c5df2d39e90
+        with:
+          add-labels: "community"
+          repo-token: ${{ secrets.GITHUB_TOKEN }}
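Review note: the membership check in the two `community` jobs above can be exercised without the GitHub API. The sketch below is a minimal, offline restatement of that logic, assuming the input is the already-decoded JSON list returned by the user's `/orgs` endpoint. The function names `is_org_member` and `community_label` are hypothetical and do not appear in the workflows.

```python
from typing import Iterable, Mapping, Optional


def is_org_member(orgs: Iterable[Mapping[str, str]], org_login: str = "ansible") -> bool:
    """Return True if any organization entry has the given login.

    `orgs` stands in for `response.json()` in the workflow script
    (an assumption: a list of dicts that each carry a "login" key).
    """
    return any(org.get("login") == org_login for org in orgs)


def community_label(orgs: Iterable[Mapping[str, str]]) -> Optional[str]:
    """Mirror the workflow's outcome: label non-members as "community"."""
    return None if is_org_member(orgs) else "community"


if __name__ == "__main__":
    # Sample payloads are made up for illustration only.
    print(community_label([{"login": "ansible"}]))
    print(community_label([{"login": "some-other-org"}]))
```

Returning the label (or `None`) instead of printing a sentence avoids matching on substrings of stdout, which is how the workflow step currently decides whether to apply the label.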
--- a/.github/workflows/pr_body_check.yml
+++ b/.github/workflows/pr_body_check.yml
@@ -0,0 +1,45 @@
+---
+name: PR Check
+env:
+  BRANCH: ${{ github.base_ref || 'devel' }}
+on:
+  pull_request:
+    types: [opened, edited, reopened, synchronize]
+jobs:
+  pr-check:
+    name: Scan PR description for semantic versioning keywords
+    runs-on: ubuntu-latest
+    permissions:
+      packages: write
+      contents: read
+    steps:
+      - name: Write PR body to a file
+        run: |
+          cat >> pr.body << __SOME_RANDOM_PR_EOF__
+          ${{ github.event.pull_request.body }}
+          __SOME_RANDOM_PR_EOF__
+
+      - name: Display the received body for troubleshooting
+        run: cat pr.body
+
+      # We want to write these out individually just in case the options were joined on a single line
+      - name: Check for each of the lines
+        run: |
+          grep "Bug, Docs Fix or other nominal change" pr.body > Z
+          grep "New or Enhanced Feature" pr.body > Y
+          grep "Breaking Change" pr.body > X
+          exit 0
+        # We exit 0 and set the shell to prevent the returns from the greps from failing this step
+        # See https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#exit-codes-and-error-action-preference
+        shell: bash {0}
+
+      - name: Check for exactly one item
+        run: |
+          if [ $(cat X Y Z | wc -l) != 1 ] ; then
+            echo "The PR body must contain exactly one of [ 'Bug, Docs Fix or other nominal change', 'New or Enhanced Feature', 'Breaking Change' ]"
+            echo "We counted $(cat X Y Z | wc -l)"
+            echo "See the default PR body for examples"
+            exit 255;
+          else
+            exit 0;
+          fi
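Review note: the "exactly one item" rule in `pr_body_check.yml` can be checked locally before opening a PR. This is a minimal sketch of the same counting logic, assuming plain substring matching is equivalent to the three `grep` calls (the phrases contain no regex metacharacters). The names `CATEGORIES`, `count_category_lines`, and `body_is_valid` are hypothetical.

```python
# The three phrases the workflow greps for in the PR body.
CATEGORIES = (
    "Bug, Docs Fix or other nominal change",
    "New or Enhanced Feature",
    "Breaking Change",
)


def count_category_lines(pr_body: str) -> int:
    """Count matching lines per phrase and sum them.

    This mirrors `cat X Y Z | wc -l`: a line that contains two
    phrases is counted twice, once for each grep.
    """
    lines = pr_body.splitlines()
    return sum(1 for phrase in CATEGORIES for line in lines if phrase in line)


def body_is_valid(pr_body: str) -> bool:
    """The workflow passes only when the total count is exactly 1."""
    return count_category_lines(pr_body) == 1


if __name__ == "__main__":
    untouched_template = "\n".join(f" - {phrase}" for phrase in CATEGORIES)
    print(body_is_valid(untouched_template))
    print(body_is_valid("##### ISSUE TYPE\n - Breaking Change\n"))
```

An untouched PR template lists all three options, so it yields a count of 3 and fails the check; deleting all but one option brings the count to 1.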
--- a/.github/workflows/promote.yml
+++ b/.github/workflows/promote.yml
@@ -21,7 +21,7 @@ jobs:

      - name: Install dependencies
        run: |
-          python${{ env.py_version }} -m pip install wheel twine
+          python${{ env.py_version }} -m pip install wheel twine setuptools-scm

      - name: Set official collection namespace
        run: echo collection_namespace=awx >> $GITHUB_ENV
@@ -70,4 +70,4 @@ jobs:
          docker tag ghcr.io/${{ github.repository }}:${{ github.event.release.tag_name }} quay.io/${{ github.repository }}:latest
          docker push quay.io/${{ github.repository }}:${{ github.event.release.tag_name }}
          docker push quay.io/${{ github.repository }}:latest
-          
+
--- a/.github/workflows/update_dependabot_prs.yml
+++ b/.github/workflows/update_dependabot_prs.yml
@@ -0,0 +1,29 @@
+---
+name: Dependency Pr Update
+on:
+  pull_request:
+    types: [labeled, opened, reopened]
+
+jobs:
+  pr-check:
+    name: Update Dependabot Prs
+    if:  contains(github.event.pull_request.labels.*.name, 'dependencies') && contains(github.event.pull_request.labels.*.name, 'component:ui')
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout branch
+        uses: actions/checkout@v3
+
+      - name: Update PR Body
+        env:
+            GITHUB_TOKEN: ${{secrets.GITHUB_TOKEN}}
+            OWNER: ${{ github.repository_owner }}
+            REPO: ${{ github.event.repository.name }}
+            PR: ${{github.event.pull_request.number}}
+            PR_BODY: ${{github.event.pull_request.body}}
+        run: |
+          gh pr checkout ${{ env.PR }}
+          echo "${{ env.PR_BODY }}" > my_pr_body.txt
+          echo "" >> my_pr_body.txt
+          echo "Bug, Docs Fix or other nominal change" >> my_pr_body.txt
+          gh pr edit ${{env.PR}} --body-file my_pr_body.txt
--- a/.gitignore
+++ b/.gitignore
@@ -153,9 +153,6 @@ use_dev_supervisor.txt
 /sanity/
 /awx_collection_build/

-# Setup for metrics gathering
-tools/prometheus/prometheus.yml
-
 .idea/*
 *.unison.tmp
 *.#
--- a/.yamllint
+++ b/.yamllint
@@ -8,6 +8,8 @@ ignore: |
  awx/ui/test/e2e/tests/smoke-vars.yml
  awx/ui/node_modules
  tools/docker-compose/_sources
+  # django template files
+  awx/api/templates/instance_install_bundle/**

 extends: default

--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -19,16 +19,17 @@ Have questions about this document or anything not covered here? Come chat with
  - [Purging containers and images](#purging-containers-and-images)
  - [Pre commit hooks](#pre-commit-hooks)
 - [What should I work on?](#what-should-i-work-on)
+  - [Translations](#translations)
 - [Submitting Pull Requests](#submitting-pull-requests)
- [PR Checks run by Zuul](#pr-checks-run-by-zuul)
 - [Reporting Issues](#reporting-issues)
+- [Getting Help](#getting-help)

 ## Things to know prior to submitting code

 - All code submissions are done through pull requests against the `devel` branch.
 - You must use `git commit --signoff` for any commit to be merged, and agree that usage of --signoff constitutes agreement with the terms of [DCO 1.1](./DCO_1_1.md).
 - Take care to make sure no merge commits are in the submission, and use `git rebase` vs `git merge` for this reason.
-  - If collaborating with someone else on the same branch, consider using `--force-with-lease` instead of `--force`. This will prevent you from accidentally overwriting commits pushed by someone else. For more information, see https://git-scm.com/docs/git-push#git-push---force-with-leaseltrefnamegt
+  - If collaborating with someone else on the same branch, consider using `--force-with-lease` instead of `--force`. This will prevent you from accidentally overwriting commits pushed by someone else. For more information, see [git push docs](https://git-scm.com/docs/git-push#git-push---force-with-leaseltrefnamegt).
 - If submitting a large code change, it's a good idea to join the `#ansible-awx` channel on irc.libera.chat, and talk about what you would like to do or add first. This not only helps everyone know what's going on, it also helps save time and effort, if the community decides some changes are needed.
 - We ask all of our community members and contributors to adhere to the [Ansible code of conduct](http://docs.ansible.com/ansible/latest/community/code_of_conduct.html). If you have questions, or need assistance, please reach out to our community team at [codeofconduct@ansible.com](mailto:codeofconduct@ansible.com)

@@ -42,8 +43,7 @@ The AWX development environment workflow and toolchain uses Docker and the docke

 Prior to starting the development services, you'll need `docker` and `docker-compose`. On Linux, you can generally find these in your distro's packaging, but you may find that Docker themselves maintain a separate repo that tracks more closely to the latest releases.

-For macOS and Windows, we recommend [Docker for Mac](https://www.docker.com/docker-mac) and [Docker for Windows](https://www.docker.com/docker-windows)
-respectively.
+For macOS and Windows, we recommend [Docker for Mac](https://www.docker.com/docker-mac) and [Docker for Windows](https://www.docker.com/docker-windows) respectively.

 For Linux platforms, refer to the following from Docker:

@@ -79,17 +79,13 @@ See the [README.md](./tools/docker-compose/README.md) for docs on how to build t

 ### Building API Documentation

-AWX includes support for building [Swagger/OpenAPI
-documentation](https://swagger.io). To build the documentation locally, run:
+AWX includes support for building [Swagger/OpenAPI documentation](https://swagger.io). To build the documentation locally, run:

 ```bash
 (container)/awx_devel$ make swagger
 ```

-This will write a file named `swagger.json` that contains the API specification
-in OpenAPI format. A variety of online tools are available for translating
-this data into more consumable formats (such as HTML). http://editor.swagger.io
-is an example of one such service.
+This will write a file named `swagger.json` that contains the API specification in OpenAPI format. A variety of online tools are available for translating this data into more consumable formats (such as HTML). http://editor.swagger.io is an example of one such service.

 ### Accessing the AWX web interface

@@ -115,20 +111,30 @@ While you can use environment variables to skip the pre-commit hooks GitHub will

 ## What should I work on?

+We have a ["good first issue" label](https://github.com/ansible/awx/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22) we put on some issues that might be a good starting point for new contributors.
+
+Fixing bugs and updating the documentation are always appreciated, so reviewing the backlog of issues is always a good place to start.
+
 For feature work, take a look at the current [Enhancements](https://github.com/ansible/awx/issues?q=is%3Aissue+is%3Aopen+label%3Atype%3Aenhancement).

 If it has someone assigned to it then that person is the person responsible for working the enhancement. If you feel like you could contribute then reach out to that person.

-Fixing bugs, adding translations, and updating the documentation are always appreciated, so reviewing the backlog of issues is always a good place to start. For extra information on debugging tools, see [Debugging](./docs/debugging/).
+**NOTES**
+
+> Issue assignment will only be done for maintainers of the project. If you decide to work on an issue, please feel free to add a comment in the issue to let others know that you are working on it; but know that we will accept the first pull request from whomever is able to fix an issue. Once your PR is accepted we can add you as an assignee to an issue upon request. 

-**NOTE**

 > If you work in a part of the codebase that is going through active development, your changes may be rejected, or you may be asked to `rebase`. A good idea before starting work is to have a discussion with us in the `#ansible-awx` channel on irc.libera.chat, or on the [mailing list](https://groups.google.com/forum/#!forum/awx-project).

-**NOTE**
-
 > If you're planning to develop features or fixes for the UI, please review the [UI Developer doc](./awx/ui/README.md).

+### Translations
+
+At this time we do not accept PRs for adding additional language translations as we have an automated process for generating our translations. This is because translations require constant care as new strings are added and changed in the code base. Because of this the .po files are overwritten during every translation release cycle. We also can't support a lot of translations on AWX as it's an open source project and each language adds time and cost to maintain. If you would like to see AWX translated into a new language please create an issue and ask others you know to upvote the issue. Our translation team will review the needs of the community and see what they can do around supporting additional languages.
+
+If you find an issue with an existing translation, please see the [Reporting Issues](#reporting-issues) section to open an issue and our translation team will work with you on a resolution. 
+
+
 ## Submitting Pull Requests

 Fixes and Features for AWX will go through the Github pull request process. Submit your pull request (PR) against the `devel` branch.
@@ -152,28 +158,14 @@ We like to keep our commit history clean, and will require resubmission of pull

 Sometimes it might take us a while to fully review your PR. We try to keep the `devel` branch in good working order, and so we review requests carefully. Please be patient.

-All submitted PRs will have the linter and unit tests run against them via Zuul, and the status reported in the PR.
-
-## PR Checks run by Zuul
-
-Zuul jobs for awx are defined in the [zuul-jobs](https://github.com/ansible/zuul-jobs) repo.
-
-Zuul runs the following checks that must pass:
-
-1. `tox-awx-api-lint`
-2. `tox-awx-ui-lint`
-3. `tox-awx-api`
-4. `tox-awx-ui`
-5. `tox-awx-swagger`
-
-Zuul runs the following checks that are non-voting (can not pass but serve to inform PR reviewers):
-
-1. `tox-awx-detect-schema-change`
-   This check generates the schema and diffs it against a reference copy of the `devel` version of the schema.
-   Reviewers should inspect the `job-output.txt.gz` related to the check if their is a failure (grep for `diff -u -b` to find beginning of diff).
-   If the schema change is expected and makes sense in relation to the changes made by the PR, then you are good to go!
-   If not, the schema changes should be fixed, but this decision must be enforced by reviewers.
+When your PR is initially submitted the checks will not be run until a maintainer allows them to be. Once a maintainer has done a quick review of your work the PR will have the linter and unit tests run against it via GitHub Actions, and the status reported in the PR.

 ## Reporting Issues
-
+ 
 We welcome your feedback, and encourage you to file an issue when you run into a problem. But before opening a new issues, we ask that you please view our [Issues guide](./ISSUES.md).
+
+## Getting Help
+
+If you require additional assistance, please reach out to us at `#ansible-awx` on irc.libera.chat, or submit your question to the [mailing list](https://groups.google.com/forum/#!forum/awx-project).
+
+For extra information on debugging tools, see [Debugging](./docs/debugging/).
--- a/MANIFEST.in
+++ b/MANIFEST.in
@@ -3,7 +3,7 @@ recursive-include awx *.po
 recursive-include awx *.mo
 recursive-include awx/static *
 recursive-include awx/templates *.html
-recursive-include awx/api/templates *.md *.html
+recursive-include awx/api/templates *.md *.html *.yml
 recursive-include awx/ui/build *.html
 recursive-include awx/ui/build *
 recursive-include awx/playbooks *.yml
--- a/Makefile
+++ b/Makefile
@@ -5,8 +5,8 @@ NPM_BIN ?= npm
 CHROMIUM_BIN=/tmp/chrome-linux/chrome
 GIT_BRANCH ?= $(shell git rev-parse --abbrev-ref HEAD)
 MANAGEMENT_COMMAND ?= awx-manage
-VERSION := $(shell $(PYTHON) setup.py --version)
-COLLECTION_VERSION := $(shell $(PYTHON) setup.py --version | cut -d . -f 1-3)
+VERSION := $(shell $(PYTHON) tools/scripts/scm_version.py)
+COLLECTION_VERSION := $(shell $(PYTHON) tools/scripts/scm_version.py | cut -d . -f 1-3)

 # NOTE: This defaults the container image version to the branch that's active
 COMPOSE_TAG ?= $(GIT_BRANCH)
@@ -49,7 +49,7 @@ I18N_FLAG_FILE = .i18n_built
 .PHONY: awx-link clean clean-tmp clean-venv requirements requirements_dev \
 	develop refresh adduser migrate dbchange \
 	receiver test test_unit test_coverage coverage_html \
-	dev_build release_build sdist \
+	sdist \
 	ui-release ui-devel \
 	VERSION PYTHON_VERSION docker-compose-sources \
 	.git/hooks/pre-commit
@@ -72,7 +72,7 @@ clean-languages:
 	rm -f $(I18N_FLAG_FILE)
 	find ./awx/locale/ -type f -regex ".*\.mo$" -delete

-# Remove temporary build files, compiled Python files.
+## Remove temporary build files, compiled Python files.
 clean: clean-ui clean-api clean-awxkit clean-dist
 	rm -rf awx/public
 	rm -rf awx/lib/site-packages
@@ -94,7 +94,7 @@ clean-api:
 clean-awxkit:
 	rm -rf awxkit/*.egg-info awxkit/.tox awxkit/build/*

-# convenience target to assert environment variables are defined
+## convenience target to assert environment variables are defined
 guard-%:
 	@if [ "$${$*}" = "" ]; then \
 	    echo "The required environment variable '$*' is not set"; \
@@ -117,7 +117,7 @@ virtualenv_awx:
 		fi; \
 	fi

-# Install third-party requirements needed for AWX's environment.
+## Install third-party requirements needed for AWX's environment. 
 # this does not use system site packages intentionally
 requirements_awx: virtualenv_awx
 	if [[ "$(PIP_OPTIONS)" == *"--no-index"* ]]; then \
@@ -136,7 +136,7 @@ requirements_dev: requirements_awx requirements_awx_dev

 requirements_test: requirements

-# "Install" awx package in development mode.
+## "Install" awx package in development mode.
 develop:
 	@if [ "$(VIRTUAL_ENV)" ]; then \
 	    pip uninstall -y awx; \
@@ -153,21 +153,21 @@ version_file:
 	fi; \
 	$(PYTHON) -c "import awx; print(awx.__version__)" > /var/lib/awx/.awx_version; \

-# Refresh development environment after pulling new code.
+## Refresh development environment after pulling new code.
 refresh: clean requirements_dev version_file develop migrate

-# Create Django superuser.
+## Create Django superuser.
 adduser:
 	$(MANAGEMENT_COMMAND) createsuperuser

-# Create database tables and apply any new migrations.
+## Create database tables and apply any new migrations.
 migrate:
 	if [ "$(VENV_BASE)" ]; then \
 		. $(VENV_BASE)/awx/bin/activate; \
 	fi; \
 	$(MANAGEMENT_COMMAND) migrate --noinput

-# Run after making changes to the models to create a new migration.
+## Run after making changes to the models to create a new migration.
 dbchange:
 	$(MANAGEMENT_COMMAND) makemigrations

@@ -218,7 +218,7 @@ wsbroadcast:
 	fi; \
 	$(PYTHON) manage.py run_wsbroadcast

-# Run to start the background task dispatcher for development.
+## Run to start the background task dispatcher for development.
 dispatcher:
 	@if [ "$(VENV_BASE)" ]; then \
 		. $(VENV_BASE)/awx/bin/activate; \
@@ -226,7 +226,7 @@ dispatcher:
 	$(PYTHON) manage.py run_dispatcher


-# Run to start the zeromq callback receiver
+## Run to start the zeromq callback receiver
 receiver:
 	@if [ "$(VENV_BASE)" ]; then \
 		. $(VENV_BASE)/awx/bin/activate; \
@@ -273,12 +273,12 @@ api-lint:
 	yamllint -s .

 awx-link:
-	[ -d "/awx_devel/awx.egg-info" ] || $(PYTHON) /awx_devel/setup.py egg_info_dev
+	[ -d "/awx_devel/awx.egg-info" ] || $(PYTHON) /awx_devel/tools/scripts/egg_info_dev
 	cp -f /tmp/awx.egg-link /var/lib/awx/venv/awx/lib/$(PYTHON)/site-packages/awx.egg-link

 TEST_DIRS ?= awx/main/tests/unit awx/main/tests/functional awx/conf/tests awx/sso/tests
 PYTEST_ARGS ?= -n auto
-# Run all API unit tests.
+## Run all API unit tests.
 test:
 	if [ "$(VENV_BASE)" ]; then \
 		. $(VENV_BASE)/awx/bin/activate; \
@@ -341,23 +341,24 @@ test_unit:
 	fi; \
 	py.test awx/main/tests/unit awx/conf/tests/unit awx/sso/tests/unit

-# Run all API unit tests with coverage enabled.
+## Run all API unit tests with coverage enabled.
 test_coverage:
 	@if [ "$(VENV_BASE)" ]; then \
 		. $(VENV_BASE)/awx/bin/activate; \
 	fi; \
 	py.test --create-db --cov=awx --cov-report=xml --junitxml=./reports/junit.xml $(TEST_DIRS)

-# Output test coverage as HTML (into htmlcov directory).
+## Output test coverage as HTML (into htmlcov directory).
 coverage_html:
 	coverage html

-# Run API unit tests across multiple Python/Django versions with Tox.
+## Run API unit tests across multiple Python/Django versions with Tox.
 test_tox:
 	tox -v

-# Make fake data
+
 DATA_GEN_PRESET = ""
+## Make fake data
 bulk_data:
 	@if [ "$(VENV_BASE)" ]; then \
 		. $(VENV_BASE)/awx/bin/activate; \
@@ -378,9 +379,10 @@ clean-ui:
 	rm -rf $(UI_BUILD_FLAG_FILE)

 awx/ui/node_modules:
-	NODE_OPTIONS=--max-old-space-size=6144 $(NPM_BIN) --prefix awx/ui --loglevel warn ci
+	NODE_OPTIONS=--max-old-space-size=6144 $(NPM_BIN) --prefix awx/ui --loglevel warn --force ci

-$(UI_BUILD_FLAG_FILE): awx/ui/node_modules
+$(UI_BUILD_FLAG_FILE):
+	$(MAKE) awx/ui/node_modules
 	$(PYTHON) tools/scripts/compilemessages.py
 	$(NPM_BIN) --prefix awx/ui --loglevel warn run compile-strings
 	$(NPM_BIN) --prefix awx/ui --loglevel warn run build
@@ -424,21 +426,13 @@ ui-test-general:
 	$(NPM_BIN) run --prefix awx/ui pretest
 	$(NPM_BIN) run --prefix awx/ui/ test-general --runInBand

-# Build a pip-installable package into dist/ with a timestamped version number.
-dev_build:
-	$(PYTHON) setup.py dev_build
-
-# Build a pip-installable package into dist/ with the release version number.
-release_build:
-	$(PYTHON) setup.py release_build
-
 HEADLESS ?= no
 ifeq ($(HEADLESS), yes)
 dist/$(SDIST_TAR_FILE):
 else
 dist/$(SDIST_TAR_FILE): $(UI_BUILD_FLAG_FILE)
 endif
-	$(PYTHON) setup.py $(SDIST_COMMAND)
+	$(PYTHON) -m build -s
 	ln -sf $(SDIST_TAR_FILE) dist/awx.tar.gz

 sdist: dist/$(SDIST_TAR_FILE)
@@ -459,6 +453,11 @@ COMPOSE_OPTS ?=
 CONTROL_PLANE_NODE_COUNT ?= 1
 EXECUTION_NODE_COUNT ?= 2
 MINIKUBE_CONTAINER_GROUP ?= false
+EXTRA_SOURCES_ANSIBLE_OPTS ?=
+
+ifneq ($(ADMIN_PASSWORD),)
+	EXTRA_SOURCES_ANSIBLE_OPTS := -e admin_password=$(ADMIN_PASSWORD) $(EXTRA_SOURCES_ANSIBLE_OPTS)
+endif

 docker-compose-sources: .git/hooks/pre-commit
 	@if [ $(MINIKUBE_CONTAINER_GROUP) = true ]; then\
@@ -476,7 +475,8 @@ docker-compose-sources: .git/hooks/pre-commit
 	    -e enable_ldap=$(LDAP) \
 	    -e enable_splunk=$(SPLUNK) \
 	    -e enable_prometheus=$(PROMETHEUS) \
-	    -e enable_grafana=$(GRAFANA)
+	    -e enable_grafana=$(GRAFANA) $(EXTRA_SOURCES_ANSIBLE_OPTS)
+


 docker-compose: awx/projects docker-compose-sources
@@ -510,7 +510,7 @@ docker-compose-container-group-clean:
 	fi
 	rm -rf tools/docker-compose-minikube/_sources/

-# Base development image build
+## Base development image build
 docker-compose-build:
 	ansible-playbook tools/ansible/dockerfile.yml -e build_dev=True -e receptor_image=$(RECEPTOR_IMAGE)
 	DOCKER_BUILDKIT=1 docker build -t $(DEVEL_IMAGE_NAME) \
@@ -528,7 +528,7 @@ docker-clean-volumes: docker-compose-clean docker-compose-container-group-clean

 docker-refresh: docker-clean docker-compose

-# Docker Development Environment with Elastic Stack Connected
+## Docker Development Environment with Elastic Stack Connected
 docker-compose-elk: awx/projects docker-compose-sources
 	docker-compose -f tools/docker-compose/_sources/docker-compose.yml -f tools/elastic/docker-compose.logstash-link.yml -f tools/elastic/docker-compose.elastic-override.yml up --no-recreate

@@ -565,26 +565,34 @@ Dockerfile.kube-dev: tools/ansible/roles/dockerfile/templates/Dockerfile.j2
 	    -e template_dest=_build_kube_dev \
 	    -e receptor_image=$(RECEPTOR_IMAGE)

+## Build awx_kube_devel image for development on local Kubernetes environment.
 awx-kube-dev-build: Dockerfile.kube-dev
 	DOCKER_BUILDKIT=1 docker build -f Dockerfile.kube-dev \
 	    --build-arg BUILDKIT_INLINE_CACHE=1 \
 	    --cache-from=$(DEV_DOCKER_TAG_BASE)/awx_kube_devel:$(COMPOSE_TAG) \
 	    -t $(DEV_DOCKER_TAG_BASE)/awx_kube_devel:$(COMPOSE_TAG) .

+## Build awx image for deployment on Kubernetes environment.
+awx-kube-build: Dockerfile
+	DOCKER_BUILDKIT=1 docker build -f Dockerfile \
+		--build-arg VERSION=$(VERSION) \
+		--build-arg SETUPTOOLS_SCM_PRETEND_VERSION=$(VERSION) \
+		--build-arg HEADLESS=$(HEADLESS) \
+		-t $(DEV_DOCKER_TAG_BASE)/awx:$(COMPOSE_TAG) .

 # Translation TASKS
 # --------------------------------------

-# generate UI .pot file, an empty template of strings yet to be translated
+## generate UI .pot file, an empty template of strings yet to be translated
 pot: $(UI_BUILD_FLAG_FILE)
 	$(NPM_BIN) --prefix awx/ui --loglevel warn run extract-template --clean

-# generate UI .po files for each locale (will update translated strings for `en`)
+## generate UI .po files for each locale (will update translated strings for `en`)
 po: $(UI_BUILD_FLAG_FILE)
 	$(NPM_BIN) --prefix awx/ui --loglevel warn run extract-strings -- --clean

-# generate API django .pot .po
-LANG = "en-us"
+LANG = "en_us"
+## generate API django .pot .po
 messages:
 	@if [ "$(VENV_BASE)" ]; then \
 		. $(VENV_BASE)/awx/bin/activate; \
@@ -593,3 +601,38 @@ messages:

 print-%:
 	@echo $($*)
+
+# HELP related targets
+# --------------------------------------
+
+HELP_FILTER=.PHONY
+
+## Display help targets
+help:
+	@printf "Available targets:\n"
+	@make -s help/generate | grep -vE "\w($(HELP_FILTER))"
+
+## Display help for all targets
+help/all:
+	@printf "Available targets:\n"
+	@make -s help/generate
+
+## Generate help output from MAKEFILE_LIST
+help/generate:
+	@awk '/^[-a-zA-Z_0-9%:\\\.\/]+:/ { \
+		helpMessage = match(lastLine, /^## (.*)/); \
+		if (helpMessage) { \
+			helpCommand = $$1; \
+			helpMessage = substr(lastLine, RSTART + 3, RLENGTH); \
+			gsub("\\\\", "", helpCommand); \
+			gsub(":+$$", "", helpCommand); \
+			printf "  \x1b[32;01m%-35s\x1b[0m %s\n", helpCommand, helpMessage; \
+		} else { \
+			helpCommand = $$1; \
+			gsub("\\\\", "", helpCommand); \
+			gsub(":+$$", "", helpCommand); \
+			printf "  \x1b[32;01m%-35s\x1b[0m %s\n", helpCommand, "No help available"; \
+		} \
+	} \
+	{ lastLine = $$0 }' $(MAKEFILE_LIST) | sort -u
+	@printf "\n"
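
Review note: the `help/generate` rule above pairs each Makefile target with the `## ` comment on the line directly before it. The Python sketch below mirrors that logic to make the matching rule easier to follow. It is an illustration only and not part of this change; the function name `extract_help` and the sample Makefile text are assumptions.

```python
import re

# Same idea as the awk pattern: a run of target characters followed by ':'.
TARGET_RE = re.compile(r"^[-a-zA-Z_0-9%:\\./]+:")
# A help comment is a line that starts with '## '.
HELP_RE = re.compile(r"^## (.*)")


def extract_help(makefile_text):
    """Return sorted, de-duplicated (target, help) pairs from Makefile text."""
    entries = set()
    last_line = ""
    for line in makefile_text.splitlines():
        if TARGET_RE.match(line):
            # First whitespace-separated field, minus backslashes and trailing colons.
            target = line.split()[0].replace("\\", "").rstrip(":")
            doc = HELP_RE.match(last_line)
            entries.add((target, doc.group(1) if doc else "No help available"))
        # Remember this line so the next target can look back at it.
        last_line = line
    return sorted(entries)


if __name__ == "__main__":
    sample = "## Display help targets\nhelp:\n\t@echo hi\n"
    for target, text in extract_help(sample):
        print(f"  {target:<35} {text}")
```

Because only the immediately preceding line is consulted, a variable assignment placed between the comment and the target would hide the help text. This is presumably why the diff moves `DATA_GEN_PRESET` and `LANG` above their `##` comments.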
--- a/awx/__init__.py
+++ b/awx/__init__.py
@@ -6,9 +6,40 @@ import os
 import sys
 import warnings

-from pkg_resources import get_distribution

-__version__ = get_distribution('awx').version
+def get_version():
+    version_from_file = get_version_from_file()
+    if version_from_file:
+        return version_from_file
+    else:
+        from setuptools_scm import get_version
+
+        version = get_version(root='..', relative_to=__file__)
+        return version
+
+
+def get_version_from_file():
+    vf = version_file()
+    if vf:
+        with open(vf, 'r') as file:
+            return file.read().strip()
+
+
+def version_file():
+    current_dir = os.path.dirname(os.path.abspath(__file__))
+    version_file = os.path.join(current_dir, '..', 'VERSION')
+
+    if os.path.exists(version_file):
+        return version_file
+
+
+import pkg_resources
+
+try:
+    __version__ = pkg_resources.get_distribution('awx').version
+except pkg_resources.DistributionNotFound:
+    __version__ = get_version()
+
 __all__ = ['__version__']


@@ -21,7 +52,6 @@ try:
 except ImportError:  # pragma: no cover
    MODE = 'production'

-
 import hashlib

 try:
@@ -160,7 +190,7 @@ def manage():
        sys.stdout.write('%s\n' % __version__)
    # If running as a user without permission to read settings, display an
    # error message.  Allow --help to still work.
-    elif settings.SECRET_KEY == 'permission-denied':
+    elif not os.getenv('SKIP_SECRET_KEY_CHECK', False) and settings.SECRET_KEY == 'permission-denied':
        if len(sys.argv) == 1 or len(sys.argv) >= 2 and sys.argv[1] in ('-h', '--help', 'help'):
            execute_from_command_line(sys.argv)
            sys.stdout.write('\n')
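
Review note: the version lookup added in this file prefers a `VERSION` file and only falls back to `setuptools_scm` when no file value is available. A minimal sketch of that ordering follows, with the fallback passed in as a callable so it runs without `setuptools_scm` installed. The helper names `read_version_file` and `resolve_version` and the version strings in the demo are hypothetical, not part of the change.

```python
import os
import tempfile


def read_version_file(base_dir):
    """Return the stripped contents of base_dir/VERSION, or None if it is absent."""
    path = os.path.join(base_dir, "VERSION")
    if os.path.exists(path):
        with open(path, "r") as handle:
            return handle.read().strip()
    return None


def resolve_version(base_dir, fallback):
    """Prefer the VERSION file; call fallback() when it is missing or empty."""
    return read_version_file(base_dir) or fallback()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # No VERSION file yet, so the fallback supplies the value.
        print(resolve_version(tmp, lambda: "0.0.0.dev0"))
        with open(os.path.join(tmp, "VERSION"), "w") as handle:
            handle.write("21.7.0\n")
        # The file now wins over the fallback.
        print(resolve_version(tmp, lambda: "0.0.0.dev0"))
```

An empty `VERSION` file is treated the same as a missing one, which matches the truthiness check in `get_version()` above.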
--- a/awx/api/filters.py
+++ b/awx/api/filters.py
@@ -157,7 +157,7 @@ class FieldLookupBackend(BaseFilterBackend):

    # A list of fields that we know can be filtered on without the possibility
    # of introducing duplicates
-    NO_DUPLICATES_ALLOW_LIST = (CharField, IntegerField, BooleanField)
+    NO_DUPLICATES_ALLOW_LIST = (CharField, IntegerField, BooleanField, TextField)

    def get_fields_from_lookup(self, model, lookup):

@@ -232,6 +232,9 @@ class FieldLookupBackend(BaseFilterBackend):
                re.compile(value)
            except re.error as e:
                raise ValueError(e.args[0])
+        elif new_lookup.endswith('__iexact'):
+            if not isinstance(field, (CharField, TextField)):
+                raise ValueError(f'{field.name} is not a text field and cannot be filtered by case-insensitive search')
        elif new_lookup.endswith('__search'):
            related_model = getattr(field, 'related_model', None)
            if not related_model:
@@ -258,8 +261,8 @@ class FieldLookupBackend(BaseFilterBackend):
            search_filters = {}
            needs_distinct = False
            # Can only have two values: 'AND', 'OR'
-            # If 'AND' is used, an iterm must satisfy all condition to show up in the results.
-            # If 'OR' is used, an item just need to satisfy one condition to appear in results.
+            # If 'AND' is used, an item must satisfy all conditions to show up in the results.
+            # If 'OR' is used, an item just needs to satisfy one condition to appear in results.
            search_filter_relation = 'OR'
            for key, values in request.query_params.lists():
                if key in self.RESERVED_NAMES:
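
Review note: the new `__iexact` branch rejects case-insensitive exact matching on non-text fields. The sketch below shows the same guard in isolation. The field classes are stand-ins defined here so the example runs without Django, and `check_iexact` is a hypothetical name; the real check lives inside `FieldLookupBackend` and uses Django's field classes.

```python
class Field:
    """Stand-in for a model field; only the name is needed for the error message."""

    def __init__(self, name):
        self.name = name


class CharField(Field):
    pass


class TextField(Field):
    pass


class IntegerField(Field):
    pass


def check_iexact(field, lookup):
    """Raise ValueError when an __iexact lookup targets a non-text field."""
    if lookup.endswith("__iexact") and not isinstance(field, (CharField, TextField)):
        raise ValueError(f"{field.name} is not a text field and cannot be filtered by case-insensitive search")


if __name__ == "__main__":
    check_iexact(CharField("name"), "name__iexact")  # accepted silently
    try:
        check_iexact(IntegerField("forks"), "forks__iexact")
    except ValueError as exc:
        print(exc)
```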
--- a/awx/api/generics.py
+++ b/awx/api/generics.py
@@ -63,7 +63,6 @@ __all__ = [
    'SubDetailAPIView',
    'ResourceAccessList',
    'ParentMixin',
-    'DeleteLastUnattachLabelMixin',
    'SubListAttachDetachAPIView',
    'CopyAPIView',
    'BaseUsersList',
@@ -98,7 +97,6 @@ class LoggedLoginView(auth_views.LoginView):
            current_user = UserSerializer(self.request.user)
            current_user = smart_str(JSONRenderer().render(current_user.data))
            current_user = urllib.parse.quote('%s' % current_user, '')
-            ret.set_cookie('current_user', current_user, secure=settings.SESSION_COOKIE_SECURE or None)
            ret.setdefault('X-API-Session-Cookie-Name', getattr(settings, 'SESSION_COOKIE_NAME', 'awx_sessionid'))

            return ret
@@ -775,28 +773,6 @@ class SubListAttachDetachAPIView(SubListCreateAttachDetachAPIView):
        return {'id': None}


-class DeleteLastUnattachLabelMixin(object):
-    """
-    Models for which you want the last instance to be deleted from the database
-    when the last disassociate is called should inherit from this class. Further,
-    the model should implement is_detached()
-    """
-
-    def unattach(self, request, *args, **kwargs):
-        (sub_id, res) = super(DeleteLastUnattachLabelMixin, self).unattach_validate(request)
-        if res:
-            return res
-
-        res = super(DeleteLastUnattachLabelMixin, self).unattach_by_id(request, sub_id)
-
-        obj = self.model.objects.get(id=sub_id)
-
-        if obj.is_detached():
-            obj.delete()
-
-        return res
-
-
 class SubDetailAPIView(ParentMixin, generics.RetrieveAPIView, GenericAPIView):
    pass

--- a/awx/api/serializers.py
+++ b/awx/api/serializers.py
@@ -29,7 +29,6 @@ from django.utils.translation import gettext_lazy as _
 from django.utils.encoding import force_str
 from django.utils.text import capfirst
 from django.utils.timezone import now
-from django.utils.functional import cached_property

 # Django REST Framework
 from rest_framework.exceptions import ValidationError, PermissionDenied
@@ -155,6 +154,7 @@ SUMMARIZABLE_FK_FIELDS = {
    'source_project': DEFAULT_SUMMARY_FIELDS + ('status', 'scm_type'),
    'project_update': DEFAULT_SUMMARY_FIELDS + ('status', 'failed'),
    'credential': DEFAULT_SUMMARY_FIELDS + ('kind', 'cloud', 'kubernetes', 'credential_type_id'),
+    'signature_validation_credential': DEFAULT_SUMMARY_FIELDS + ('kind', 'credential_type_id'),
    'job': DEFAULT_SUMMARY_FIELDS + ('status', 'failed', 'elapsed', 'type', 'canceled_on'),
    'job_template': DEFAULT_SUMMARY_FIELDS,
    'workflow_job_template': DEFAULT_SUMMARY_FIELDS,
@@ -1471,6 +1471,7 @@ class ProjectSerializer(UnifiedJobTemplateSerializer, ProjectOptionsSerializer):
            'allow_override',
            'custom_virtualenv',
            'default_environment',
+            'signature_validation_credential',
        ) + (
            'last_update_failed',
            'last_updated',
@@ -1679,6 +1680,7 @@ class InventorySerializer(LabelsListMixin, BaseSerializerWithVariables):
            'total_inventory_sources',
            'inventory_sources_with_failures',
            'pending_deletion',
+            'prevent_instance_group_fallback',
        )

    def get_related(self, obj):
@@ -2073,7 +2075,7 @@ class InventorySourceSerializer(UnifiedJobTemplateSerializer, InventorySourceOpt

    class Meta:
        model = InventorySource
-        fields = ('*', 'name', 'inventory', 'update_on_launch', 'update_cache_timeout', 'source_project', 'update_on_project_update') + (
+        fields = ('*', 'name', 'inventory', 'update_on_launch', 'update_cache_timeout', 'source_project') + (
            'last_update_failed',
            'last_updated',
        )  # Backwards compatibility.
@@ -2136,11 +2138,6 @@ class InventorySourceSerializer(UnifiedJobTemplateSerializer, InventorySourceOpt
            raise serializers.ValidationError(_("Cannot use manual project for SCM-based inventory."))
        return value

-    def validate_update_on_project_update(self, value):
-        if value and self.instance and self.instance.schedules.exists():
-            raise serializers.ValidationError(_("Setting not compatible with existing schedules."))
-        return value
-
    def validate_inventory(self, value):
        if value and value.kind == 'smart':
            raise serializers.ValidationError({"detail": _("Cannot create Inventory Source for Smart Inventory")})
@@ -2191,7 +2188,7 @@ class InventorySourceSerializer(UnifiedJobTemplateSerializer, InventorySourceOpt
            if ('source' in attrs or 'source_project' in attrs) and get_field_from_model_or_attrs('source_project') is None:
                raise serializers.ValidationError({"source_project": _("Project required for scm type sources.")})
        else:
-            redundant_scm_fields = list(filter(lambda x: attrs.get(x, None), ['source_project', 'source_path', 'update_on_project_update']))
+            redundant_scm_fields = list(filter(lambda x: attrs.get(x, None), ['source_project', 'source_path']))
            if redundant_scm_fields:
                raise serializers.ValidationError({"detail": _("Cannot set %s if not SCM type." % ' '.join(redundant_scm_fields))})

@@ -2236,6 +2233,7 @@ class InventoryUpdateSerializer(UnifiedJobSerializer, InventorySourceOptionsSeri
            'source_project_update',
            'custom_virtualenv',
            'instance_group',
+            'scm_revision',
        )

    def get_related(self, obj):
@@ -2926,6 +2924,12 @@ class JobTemplateSerializer(JobTemplateMixin, UnifiedJobTemplateSerializer, JobO
            'ask_verbosity_on_launch',
            'ask_inventory_on_launch',
            'ask_credential_on_launch',
+            'ask_execution_environment_on_launch',
+            'ask_labels_on_launch',
+            'ask_forks_on_launch',
+            'ask_job_slice_count_on_launch',
+            'ask_timeout_on_launch',
+            'ask_instance_groups_on_launch',
            'survey_enabled',
            'become_enabled',
            'diff_mode',
@@ -2934,6 +2938,7 @@ class JobTemplateSerializer(JobTemplateMixin, UnifiedJobTemplateSerializer, JobO
            'job_slice_count',
            'webhook_service',
            'webhook_credential',
+            'prevent_instance_group_fallback',
        )
        read_only_fields = ('*', 'custom_virtualenv')

@@ -3188,7 +3193,7 @@ class JobRelaunchSerializer(BaseSerializer):
        return attrs


-class JobCreateScheduleSerializer(BaseSerializer):
+class JobCreateScheduleSerializer(LabelsListMixin, BaseSerializer):

    can_schedule = serializers.SerializerMethodField()
    prompts = serializers.SerializerMethodField()
@@ -3214,11 +3219,14 @@ class JobCreateScheduleSerializer(BaseSerializer):
        try:
            config = obj.launch_config
            ret = config.prompts_dict(display=True)
-            if 'inventory' in ret:
-                ret['inventory'] = self._summarize('inventory', ret['inventory'])
-            if 'credentials' in ret:
-                all_creds = [self._summarize('credential', cred) for cred in ret['credentials']]
-                ret['credentials'] = all_creds
+            for field_name in ('inventory', 'execution_environment'):
+                if field_name in ret:
+                    ret[field_name] = self._summarize(field_name, ret[field_name])
+            for field_name, singular in (('credentials', 'credential'), ('instance_groups', 'instance_group')):
+                if field_name in ret:
+                    ret[field_name] = [self._summarize(singular, obj) for obj in ret[field_name]]
+            if 'labels' in ret:
+                ret['labels'] = self._summary_field_labels(config)
            return ret
        except JobLaunchConfig.DoesNotExist:
            return {'all': _('Unknown, job may have been ran before launch configurations were saved.')}
@@ -3391,6 +3399,9 @@ class WorkflowJobTemplateSerializer(JobTemplateMixin, LabelsListMixin, UnifiedJo
    limit = serializers.CharField(allow_blank=True, allow_null=True, required=False, default=None)
    scm_branch = serializers.CharField(allow_blank=True, allow_null=True, required=False, default=None)

+    skip_tags = serializers.CharField(allow_blank=True, allow_null=True, required=False, default=None)
+    job_tags = serializers.CharField(allow_blank=True, allow_null=True, required=False, default=None)
+
    class Meta:
        model = WorkflowJobTemplate
        fields = (
@@ -3409,6 +3420,11 @@ class WorkflowJobTemplateSerializer(JobTemplateMixin, LabelsListMixin, UnifiedJo
            'webhook_service',
            'webhook_credential',
            '-execution_environment',
+            'ask_labels_on_launch',
+            'ask_skip_tags_on_launch',
+            'ask_tags_on_launch',
+            'skip_tags',
+            'job_tags',
        )

    def get_related(self, obj):
@@ -3452,7 +3468,7 @@ class WorkflowJobTemplateSerializer(JobTemplateMixin, LabelsListMixin, UnifiedJo

        # process char_prompts, these are not direct fields on the model
        mock_obj = self.Meta.model()
-        for field_name in ('scm_branch', 'limit'):
+        for field_name in ('scm_branch', 'limit', 'skip_tags', 'job_tags'):
            if field_name in attrs:
                setattr(mock_obj, field_name, attrs[field_name])
                attrs.pop(field_name)
@@ -3478,6 +3494,9 @@ class WorkflowJobSerializer(LabelsListMixin, UnifiedJobSerializer):
    limit = serializers.CharField(allow_blank=True, allow_null=True, required=False, default=None)
    scm_branch = serializers.CharField(allow_blank=True, allow_null=True, required=False, default=None)

+    skip_tags = serializers.CharField(allow_blank=True, allow_null=True, required=False, default=None)
+    job_tags = serializers.CharField(allow_blank=True, allow_null=True, required=False, default=None)
+
    class Meta:
        model = WorkflowJob
        fields = (
@@ -3497,6 +3516,8 @@ class WorkflowJobSerializer(LabelsListMixin, UnifiedJobSerializer):
            'webhook_service',
            'webhook_credential',
            'webhook_guid',
+            'skip_tags',
+            'job_tags',
        )

    def get_related(self, obj):
@@ -3613,6 +3634,9 @@ class LaunchConfigurationBaseSerializer(BaseSerializer):
    skip_tags = serializers.CharField(allow_blank=True, allow_null=True, required=False, default=None)
    diff_mode = serializers.BooleanField(required=False, allow_null=True, default=None)
    verbosity = serializers.ChoiceField(allow_null=True, required=False, default=None, choices=VERBOSITY_CHOICES)
+    forks = serializers.IntegerField(required=False, allow_null=True, min_value=0, default=None)
+    job_slice_count = serializers.IntegerField(required=False, allow_null=True, min_value=0, default=None)
+    timeout = serializers.IntegerField(required=False, allow_null=True, default=None)
    exclude_errors = ()

    class Meta:
@@ -3628,13 +3652,21 @@ class LaunchConfigurationBaseSerializer(BaseSerializer):
            'skip_tags',
            'diff_mode',
            'verbosity',
+            'execution_environment',
+            'forks',
+            'job_slice_count',
+            'timeout',
        )

    def get_related(self, obj):
        res = super(LaunchConfigurationBaseSerializer, self).get_related(obj)
        if obj.inventory_id:
            res['inventory'] = self.reverse('api:inventory_detail', kwargs={'pk': obj.inventory_id})
+        if obj.execution_environment_id:
+            res['execution_environment'] = self.reverse('api:execution_environment_detail', kwargs={'pk': obj.execution_environment_id})
+        res['labels'] = self.reverse('api:{}_labels_list'.format(get_type_for_model(self.Meta.model)), kwargs={'pk': obj.pk})
        res['credentials'] = self.reverse('api:{}_credentials_list'.format(get_type_for_model(self.Meta.model)), kwargs={'pk': obj.pk})
+        res['instance_groups'] = self.reverse('api:{}_instance_groups_list'.format(get_type_for_model(self.Meta.model)), kwargs={'pk': obj.pk})
        return res

    def _build_mock_obj(self, attrs):
@@ -4086,7 +4118,6 @@ class SystemJobEventSerializer(AdHocCommandEventSerializer):


 class JobLaunchSerializer(BaseSerializer):
-
    # Representational fields
    passwords_needed_to_start = serializers.ReadOnlyField()
    can_start_without_user_input = serializers.BooleanField(read_only=True)
@@ -4109,6 +4140,12 @@ class JobLaunchSerializer(BaseSerializer):
    skip_tags = serializers.CharField(required=False, write_only=True, allow_blank=True)
    limit = serializers.CharField(required=False, write_only=True, allow_blank=True)
    verbosity = serializers.ChoiceField(required=False, choices=VERBOSITY_CHOICES, write_only=True)
+    execution_environment = serializers.PrimaryKeyRelatedField(queryset=ExecutionEnvironment.objects.all(), required=False, write_only=True)
+    labels = serializers.PrimaryKeyRelatedField(many=True, queryset=Label.objects.all(), required=False, write_only=True)
+    forks = serializers.IntegerField(required=False, write_only=True, min_value=0)
+    job_slice_count = serializers.IntegerField(required=False, write_only=True, min_value=0)
+    timeout = serializers.IntegerField(required=False, write_only=True)
+    instance_groups = serializers.PrimaryKeyRelatedField(many=True, queryset=InstanceGroup.objects.all(), required=False, write_only=True)

    class Meta:
        model = JobTemplate
@@ -4136,6 +4173,12 @@ class JobLaunchSerializer(BaseSerializer):
            'ask_verbosity_on_launch',
            'ask_inventory_on_launch',
            'ask_credential_on_launch',
+            'ask_execution_environment_on_launch',
+            'ask_labels_on_launch',
+            'ask_forks_on_launch',
+            'ask_job_slice_count_on_launch',
+            'ask_timeout_on_launch',
+            'ask_instance_groups_on_launch',
            'survey_enabled',
            'variables_needed_to_start',
            'credential_needed_to_start',
@@ -4143,6 +4186,12 @@ class JobLaunchSerializer(BaseSerializer):
            'job_template_data',
            'defaults',
            'verbosity',
+            'execution_environment',
+            'labels',
+            'forks',
+            'job_slice_count',
+            'timeout',
+            'instance_groups',
        )
        read_only_fields = (
            'ask_scm_branch_on_launch',
@@ -4155,6 +4204,12 @@ class JobLaunchSerializer(BaseSerializer):
            'ask_verbosity_on_launch',
            'ask_inventory_on_launch',
            'ask_credential_on_launch',
+            'ask_execution_environment_on_launch',
+            'ask_labels_on_launch',
+            'ask_forks_on_launch',
+            'ask_job_slice_count_on_launch',
+            'ask_timeout_on_launch',
+            'ask_instance_groups_on_launch',
        )

    def get_credential_needed_to_start(self, obj):
@@ -4179,6 +4234,17 @@ class JobLaunchSerializer(BaseSerializer):
                    if cred.credential_type.managed and 'vault_id' in cred.credential_type.defined_fields:
                        cred_dict['vault_id'] = cred.get_input('vault_id', default=None)
                    defaults_dict.setdefault(field_name, []).append(cred_dict)
+            elif field_name == 'execution_environment':
+                if obj.execution_environment_id:
+                    defaults_dict[field_name] = {'id': obj.execution_environment.id, 'name': obj.execution_environment.name}
+                else:
+                    defaults_dict[field_name] = {}
+            elif field_name == 'labels':
+                for label in obj.labels.all():
+                    label_dict = {'id': label.id, 'name': label.name}
+                    defaults_dict.setdefault(field_name, []).append(label_dict)
+            elif field_name == 'instance_groups':
+                defaults_dict[field_name] = []
            else:
                defaults_dict[field_name] = getattr(obj, field_name)
        return defaults_dict
@@ -4201,6 +4267,15 @@ class JobLaunchSerializer(BaseSerializer):
        elif template.project.status in ('error', 'failed'):
            errors['playbook'] = _("Missing a revision to run due to failed project update.")

+            latest_update = template.project.project_updates.last()
+            if latest_update is not None and latest_update.failed:
+                failed_validation_tasks = latest_update.project_update_events.filter(
+                    event='runner_on_failed',
+                    play="Perform project signature/checksum verification",
+                )
+                if failed_validation_tasks:
+                    errors['playbook'] = _("Last project update failed due to signature validation failure.")
+
        # cannot run a playbook without an inventory
        if template.inventory and template.inventory.pending_deletion is True:
            errors['inventory'] = _("The inventory associated with this Job Template is being deleted.")
@@ -4277,6 +4352,10 @@ class WorkflowJobLaunchSerializer(BaseSerializer):
    scm_branch = serializers.CharField(required=False, write_only=True, allow_blank=True)
    workflow_job_template_data = serializers.SerializerMethodField()

+    labels = serializers.PrimaryKeyRelatedField(many=True, queryset=Label.objects.all(), required=False, write_only=True)
+    skip_tags = serializers.CharField(required=False, write_only=True, allow_blank=True)
+    job_tags = serializers.CharField(required=False, write_only=True, allow_blank=True)
+
    class Meta:
        model = WorkflowJobTemplate
        fields = (
@@ -4296,8 +4375,22 @@ class WorkflowJobLaunchSerializer(BaseSerializer):
            'workflow_job_template_data',
            'survey_enabled',
            'ask_variables_on_launch',
+            'ask_labels_on_launch',
+            'labels',
+            'ask_skip_tags_on_launch',
+            'ask_tags_on_launch',
+            'skip_tags',
+            'job_tags',
+        )
+        read_only_fields = (
+            'ask_inventory_on_launch',
+            'ask_variables_on_launch',
+            'ask_skip_tags_on_launch',
+            'ask_labels_on_launch',
+            'ask_limit_on_launch',
+            'ask_scm_branch_on_launch',
+            'ask_tags_on_launch',
        )
-        read_only_fields = ('ask_inventory_on_launch', 'ask_variables_on_launch')

    def get_survey_enabled(self, obj):
        if obj:
@@ -4305,10 +4398,15 @@ class WorkflowJobLaunchSerializer(BaseSerializer):
        return False

    def get_defaults(self, obj):
+
        defaults_dict = {}
        for field_name in WorkflowJobTemplate.get_ask_mapping().keys():
            if field_name == 'inventory':
                defaults_dict[field_name] = dict(name=getattrd(obj, '%s.name' % field_name, None), id=getattrd(obj, '%s.pk' % field_name, None))
+            elif field_name == 'labels':
+                for label in obj.labels.all():
+                    label_dict = {"id": label.id, "name": label.name}
+                    defaults_dict.setdefault(field_name, []).append(label_dict)
            else:
                defaults_dict[field_name] = getattr(obj, field_name)
        return defaults_dict
@@ -4317,6 +4415,7 @@ class WorkflowJobLaunchSerializer(BaseSerializer):
        return dict(name=obj.name, id=obj.id, description=obj.description)

    def validate(self, attrs):
+
        template = self.instance

        accepted, rejected, errors = template._accept_or_ignore_job_kwargs(**attrs)
@@ -4334,6 +4433,7 @@ class WorkflowJobLaunchSerializer(BaseSerializer):
        WFJT_inventory = template.inventory
        WFJT_limit = template.limit
        WFJT_scm_branch = template.scm_branch
+
        super(WorkflowJobLaunchSerializer, self).validate(attrs)
        template.extra_vars = WFJT_extra_vars
        template.inventory = WFJT_inventory
@@ -4725,6 +4825,8 @@ class ScheduleSerializer(LaunchConfigurationBaseSerializer, SchedulePreviewSeria
        if isinstance(obj.unified_job_template, SystemJobTemplate):
            summary_fields['unified_job_template']['job_type'] = obj.unified_job_template.job_type

+        # We are not showing instance groups on summary fields because JTs don't either
+
        if 'inventory' in summary_fields:
            return summary_fields

@@ -4745,13 +4847,6 @@ class ScheduleSerializer(LaunchConfigurationBaseSerializer, SchedulePreviewSeria
            raise serializers.ValidationError(_('Inventory Source must be a cloud resource.'))
        elif type(value) == Project and value.scm_type == '':
            raise serializers.ValidationError(_('Manual Project cannot have a schedule set.'))
-        elif type(value) == InventorySource and value.source == 'scm' and value.update_on_project_update:
-            raise serializers.ValidationError(
-                _(
-                    'Inventory sources with `update_on_project_update` cannot be scheduled. '
-                    'Schedule its source project `{}` instead.'.format(value.source_project.name)
-                )
-            )
        return value

    def validate(self, attrs):
@@ -4766,7 +4861,7 @@ class ScheduleSerializer(LaunchConfigurationBaseSerializer, SchedulePreviewSeria
 class InstanceLinkSerializer(BaseSerializer):
    class Meta:
        model = InstanceLink
-        fields = ('source', 'target')
+        fields = ('source', 'target', 'link_state')

    source = serializers.SlugRelatedField(slug_field="hostname", read_only=True)
    target = serializers.SlugRelatedField(slug_field="hostname", read_only=True)
@@ -4775,63 +4870,80 @@ class InstanceLinkSerializer(BaseSerializer):
 class InstanceNodeSerializer(BaseSerializer):
    class Meta:
        model = Instance
-        fields = ('id', 'hostname', 'node_type', 'node_state')
-
-    node_state = serializers.SerializerMethodField()
-
-    def get_node_state(self, obj):
-        if not obj.enabled:
-            return "disabled"
-        return "error" if obj.errors else "healthy"
+        fields = ('id', 'hostname', 'node_type', 'node_state', 'enabled')


 class InstanceSerializer(BaseSerializer):
+    show_capabilities = ['edit']

    consumed_capacity = serializers.SerializerMethodField()
    percent_capacity_remaining = serializers.SerializerMethodField()
-    jobs_running = serializers.IntegerField(help_text=_('Count of jobs in the running or waiting state that ' 'are targeted for this instance'), read_only=True)
+    jobs_running = serializers.IntegerField(help_text=_('Count of jobs in the running or waiting state that are targeted for this instance'), read_only=True)
    jobs_total = serializers.IntegerField(help_text=_('Count of all jobs that target this instance'), read_only=True)
+    health_check_pending = serializers.SerializerMethodField()

    class Meta:
        model = Instance
-        read_only_fields = ('uuid', 'hostname', 'version', 'node_type')
+        read_only_fields = ('ip_address', 'uuid', 'version')
        fields = (
-            "id",
-            "type",
-            "url",
-            "related",
-            "uuid",
-            "hostname",
-            "created",
-            "modified",
-            "last_seen",
-            "last_health_check",
-            "errors",
+            'id',
+            'hostname',
+            'type',
+            'url',
+            'related',
+            'summary_fields',
+            'uuid',
+            'created',
+            'modified',
+            'last_seen',
+            'health_check_started',
+            'health_check_pending',
+            'last_health_check',
+            'errors',
            'capacity_adjustment',
-            "version",
-            "capacity",
-            "consumed_capacity",
-            "percent_capacity_remaining",
-            "jobs_running",
-            "jobs_total",
-            "cpu",
-            "memory",
-            "cpu_capacity",
-            "mem_capacity",
-            "enabled",
-            "managed_by_policy",
-            "node_type",
+            'version',
+            'capacity',
+            'consumed_capacity',
+            'percent_capacity_remaining',
+            'jobs_running',
+            'jobs_total',
+            'cpu',
+            'memory',
+            'cpu_capacity',
+            'mem_capacity',
+            'enabled',
+            'managed_by_policy',
+            'node_type',
+            'node_state',
+            'ip_address',
+            'listener_port',
        )
+        extra_kwargs = {
+            'node_type': {'initial': Instance.Types.EXECUTION, 'default': Instance.Types.EXECUTION},
+            'node_state': {'initial': Instance.States.INSTALLED, 'default': Instance.States.INSTALLED},
+        }

    def get_related(self, obj):
        res = super(InstanceSerializer, self).get_related(obj)
        res['jobs'] = self.reverse('api:instance_unified_jobs_list', kwargs={'pk': obj.pk})
        res['instance_groups'] = self.reverse('api:instance_instance_groups_list', kwargs={'pk': obj.pk})
+        if settings.IS_K8S and obj.node_type in (Instance.Types.EXECUTION,):
+            res['install_bundle'] = self.reverse('api:instance_install_bundle', kwargs={'pk': obj.pk})
+        res['peers'] = self.reverse('api:instance_peers_list', kwargs={"pk": obj.pk})
        if self.context['request'].user.is_superuser or self.context['request'].user.is_system_auditor:
            if obj.node_type != 'hop':
                res['health_check'] = self.reverse('api:instance_health_check', kwargs={'pk': obj.pk})
        return res

+    def get_summary_fields(self, obj):
+        summary = super().get_summary_fields(obj)
+
+        # use this handle to distinguish between a listView and a detailView
+        if self.is_detail_view:
+            summary['links'] = InstanceLinkSerializer(InstanceLink.objects.select_related('target', 'source').filter(source=obj), many=True).data
+
+        return summary
+
    def get_consumed_capacity(self, obj):
        return obj.consumed_capacity

@@ -4841,10 +4953,54 @@ class InstanceSerializer(BaseSerializer):
        else:
            return float("{0:.2f}".format(((float(obj.capacity) - float(obj.consumed_capacity)) / (float(obj.capacity))) * 100))

-    def validate(self, attrs):
-        if self.instance.node_type == 'hop':
-            raise serializers.ValidationError(_('Hop node instances may not be changed.'))
-        return attrs
+    def get_health_check_pending(self, obj):
+        return obj.health_check_pending
+
+    def validate(self, data):
+        if self.instance:
+            if self.instance.node_type == Instance.Types.HOP:
+                raise serializers.ValidationError("Hop node instances may not be changed.")
+        else:
+            if not settings.IS_K8S:
+                raise serializers.ValidationError("Can only create instances on Kubernetes or OpenShift.")
+        return data
+
+    def validate_node_type(self, value):
+        if not self.instance:
+            if value not in (Instance.Types.EXECUTION,):
+                raise serializers.ValidationError("Can only create execution nodes.")
+        else:
+            if self.instance.node_type != value:
+                raise serializers.ValidationError("Cannot change node type.")
+
+        return value
+
+    def validate_node_state(self, value):
+        if self.instance:
+            if value != self.instance.node_state:
+                if not settings.IS_K8S:
+                    raise serializers.ValidationError("Can only change the state on Kubernetes or OpenShift.")
+                if value != Instance.States.DEPROVISIONING:
+                    raise serializers.ValidationError("Can only change instances to the 'deprovisioning' state.")
+                if self.instance.node_type not in (Instance.Types.EXECUTION,):
+                    raise serializers.ValidationError("Can only deprovision execution nodes.")
+        else:
+            if value and value != Instance.States.INSTALLED:
+                raise serializers.ValidationError("Can only create instances in the 'installed' state.")
+
+        return value
+
+    def validate_hostname(self, value):
+        if self.instance and self.instance.hostname != value:
+            raise serializers.ValidationError("Cannot change hostname.")
+
+        return value
+
+    def validate_listener_port(self, value):
+        if self.instance and self.instance.listener_port != value:
+            raise serializers.ValidationError("Cannot change listener port.")
+
+        return value


 class InstanceHealthCheckSerializer(BaseSerializer):
@@ -5020,8 +5176,7 @@ class ActivityStreamSerializer(BaseSerializer):
    object_association = serializers.SerializerMethodField(help_text=_("When present, shows the field name of the role or relationship that changed."))
    object_type = serializers.SerializerMethodField(help_text=_("When present, shows the model on which the role or relationship was defined."))

-    @cached_property
-    def _local_summarizable_fk_fields(self):
+    def _local_summarizable_fk_fields(self, obj):
        summary_dict = copy.copy(SUMMARIZABLE_FK_FIELDS)
        # Special requests
        summary_dict['group'] = summary_dict['group'] + ('inventory_id',)
@@ -5041,7 +5196,13 @@ class ActivityStreamSerializer(BaseSerializer):
            ('workflow_approval', ('id', 'name', 'unified_job_id')),
            ('instance', ('id', 'hostname')),
        ]
-        return field_list
+        # Optimization - do not attempt to summarize all fields, pare down to only relations that exist
+        if not obj:
+            return field_list
+        existing_association_types = [obj.object1, obj.object2]
+        if 'user' in existing_association_types:
+            existing_association_types.append('role')
+        return [entry for entry in field_list if entry[0] in existing_association_types]

    class Meta:
        model = ActivityStream
@@ -5125,7 +5286,7 @@ class ActivityStreamSerializer(BaseSerializer):
        data = {}
        if obj.actor is not None:
            data['actor'] = self.reverse('api:user_detail', kwargs={'pk': obj.actor.pk})
-        for fk, __ in self._local_summarizable_fk_fields:
+        for fk, __ in self._local_summarizable_fk_fields(obj):
            if not hasattr(obj, fk):
                continue
            m2m_list = self._get_related_objects(obj, fk)
@@ -5182,7 +5343,7 @@ class ActivityStreamSerializer(BaseSerializer):

    def get_summary_fields(self, obj):
        summary_fields = OrderedDict()
-        for fk, related_fields in self._local_summarizable_fk_fields:
+        for fk, related_fields in self._local_summarizable_fk_fields(obj):
            try:
                if not hasattr(obj, fk):
                    continue
--- a/awx/api/templates/instance_install_bundle/group_vars/all.yml
+++ b/awx/api/templates/instance_install_bundle/group_vars/all.yml
@@ -0,0 +1,21 @@
+receptor_verify: true
+receptor_tls: true
+receptor_work_commands:
+  ansible-runner:
+    command: ansible-runner
+    params: worker
+    allowruntimeparams: true
+    verifysignature: true
+custom_worksign_public_keyfile: receptor/work-public-key.pem
+custom_tls_certfile: receptor/tls/receptor.crt
+custom_tls_keyfile: receptor/tls/receptor.key
+custom_ca_certfile: receptor/tls/ca/receptor-ca.crt
+receptor_user: awx
+receptor_group: awx
+receptor_protocol: 'tcp'
+receptor_listener: true
+receptor_port: {{ instance.listener_port }}
+receptor_dependencies:
+  - podman
+  - crun
+  - python39-pip
--- a/awx/api/templates/instance_install_bundle/install_receptor.yml
+++ b/awx/api/templates/instance_install_bundle/install_receptor.yml
@@ -0,0 +1,18 @@
+{% verbatim %}
+---
+- hosts: all
+  become: yes
+  tasks:
+    - name: Create the receptor user
+      user:
+        name: "{{ receptor_user }}"
+        shell: /bin/bash
+    - name: Enable Copr repo for Receptor
+      command: dnf copr enable ansible-awx/receptor -y
+    - import_role:
+        name: ansible.receptor.setup
+    - name: Install ansible-runner
+      pip:
+        name: ansible-runner
+        executable: pip3.9
+{% endverbatim %}
--- a/awx/api/templates/instance_install_bundle/inventory.yml
+++ b/awx/api/templates/instance_install_bundle/inventory.yml
@@ -0,0 +1,7 @@
+---
+all:
+  hosts:
+    remote-execution:
+      ansible_host: {{ instance.hostname }}
+      ansible_user: <username> # user provided
+      ansible_ssh_private_key_file: ~/.ssh/id_rsa
--- a/awx/api/templates/instance_install_bundle/requirements.yml
+++ b/awx/api/templates/instance_install_bundle/requirements.yml
@@ -0,0 +1,6 @@
+---
+collections:
+  - name: ansible.receptor
+    source: https://github.com/ansible/receptor-collection/
+    type: git
+    version: 0.1.1
--- a/awx/api/urls/debug.py
+++ b/awx/api/urls/debug.py
@@ -0,0 +1,17 @@
+from django.urls import re_path
+
+from awx.api.views.debug import (
+    DebugRootView,
+    TaskManagerDebugView,
+    DependencyManagerDebugView,
+    WorkflowManagerDebugView,
+)
+
+urls = [
+    re_path(r'^$', DebugRootView.as_view(), name='debug'),
+    re_path(r'^task_manager/$', TaskManagerDebugView.as_view(), name='task_manager'),
+    re_path(r'^dependency_manager/$', DependencyManagerDebugView.as_view(), name='dependency_manager'),
+    re_path(r'^workflow_manager/$', WorkflowManagerDebugView.as_view(), name='workflow_manager'),
+]
+
+__all__ = ['urls']
--- a/awx/api/urls/instance.py
+++ b/awx/api/urls/instance.py
@@ -3,7 +3,15 @@

 from django.urls import re_path

-from awx.api.views import InstanceList, InstanceDetail, InstanceUnifiedJobsList, InstanceInstanceGroupsList, InstanceHealthCheck
+from awx.api.views import (
+    InstanceList,
+    InstanceDetail,
+    InstanceUnifiedJobsList,
+    InstanceInstanceGroupsList,
+    InstanceHealthCheck,
+    InstanceInstallBundle,
+    InstancePeersList,
+)


 urls = [
@@ -12,6 +20,8 @@ urls = [
    re_path(r'^(?P<pk>[0-9]+)/jobs/$', InstanceUnifiedJobsList.as_view(), name='instance_unified_jobs_list'),
    re_path(r'^(?P<pk>[0-9]+)/instance_groups/$', InstanceInstanceGroupsList.as_view(), name='instance_instance_groups_list'),
    re_path(r'^(?P<pk>[0-9]+)/health_check/$', InstanceHealthCheck.as_view(), name='instance_health_check'),
+    re_path(r'^(?P<pk>[0-9]+)/peers/$', InstancePeersList.as_view(), name='instance_peers_list'),
+    re_path(r'^(?P<pk>[0-9]+)/install_bundle/$', InstanceInstallBundle.as_view(), name='instance_install_bundle'),
 ]

 __all__ = ['urls']
--- a/awx/api/urls/label.py
+++ b/awx/api/urls/label.py
@@ -3,7 +3,7 @@

 from django.urls import re_path

-from awx.api.views import LabelList, LabelDetail
+from awx.api.views.labels import LabelList, LabelDetail


 urls = [re_path(r'^$', LabelList.as_view(), name='label_list'), re_path(r'^(?P<pk>[0-9]+)/$', LabelDetail.as_view(), name='label_detail')]
--- a/awx/api/urls/schedule.py
+++ b/awx/api/urls/schedule.py
@@ -3,7 +3,7 @@

 from django.urls import re_path

-from awx.api.views import ScheduleList, ScheduleDetail, ScheduleUnifiedJobsList, ScheduleCredentialsList
+from awx.api.views import ScheduleList, ScheduleDetail, ScheduleUnifiedJobsList, ScheduleCredentialsList, ScheduleLabelsList, ScheduleInstanceGroupList


 urls = [
@@ -11,6 +11,8 @@ urls = [
    re_path(r'^(?P<pk>[0-9]+)/$', ScheduleDetail.as_view(), name='schedule_detail'),
    re_path(r'^(?P<pk>[0-9]+)/jobs/$', ScheduleUnifiedJobsList.as_view(), name='schedule_unified_jobs_list'),
    re_path(r'^(?P<pk>[0-9]+)/credentials/$', ScheduleCredentialsList.as_view(), name='schedule_credentials_list'),
+    re_path(r'^(?P<pk>[0-9]+)/labels/$', ScheduleLabelsList.as_view(), name='schedule_labels_list'),
+    re_path(r'^(?P<pk>[0-9]+)/instance_groups/$', ScheduleInstanceGroupList.as_view(), name='schedule_instance_groups_list'),
 ]

 __all__ = ['urls']
--- a/awx/api/urls/urls.py
+++ b/awx/api/urls/urls.py
@@ -2,9 +2,9 @@
 # All Rights Reserved.

 from __future__ import absolute_import, unicode_literals
-from django.conf import settings
 from django.urls import include, re_path

+from awx import MODE
 from awx.api.generics import LoggedLoginView, LoggedLogoutView
 from awx.api.views import (
    ApiRootView,
@@ -145,7 +145,12 @@ urlpatterns = [
    re_path(r'^logout/$', LoggedLogoutView.as_view(next_page='/api/', redirect_field_name='next'), name='logout'),
    re_path(r'^o/', include(oauth2_root_urls)),
 ]
-if settings.SETTINGS_MODULE == 'awx.settings.development':
+if MODE == 'development':
+    # Only include these if we are in the development environment
    from awx.api.swagger import SwaggerSchemaView

    urlpatterns += [re_path(r'^swagger/$', SwaggerSchemaView.as_view(), name='swagger_view')]
+
+    from awx.api.urls.debug import urls as debug_urls
+
+    urlpatterns += [re_path(r'^debug/', include(debug_urls))]
--- a/awx/api/urls/workflow_job_node.py
+++ b/awx/api/urls/workflow_job_node.py
@@ -10,6 +10,8 @@ from awx.api.views import (
    WorkflowJobNodeFailureNodesList,
    WorkflowJobNodeAlwaysNodesList,
    WorkflowJobNodeCredentialsList,
+    WorkflowJobNodeLabelsList,
+    WorkflowJobNodeInstanceGroupsList,
 )


@@ -20,6 +22,8 @@ urls = [
    re_path(r'^(?P<pk>[0-9]+)/failure_nodes/$', WorkflowJobNodeFailureNodesList.as_view(), name='workflow_job_node_failure_nodes_list'),
    re_path(r'^(?P<pk>[0-9]+)/always_nodes/$', WorkflowJobNodeAlwaysNodesList.as_view(), name='workflow_job_node_always_nodes_list'),
    re_path(r'^(?P<pk>[0-9]+)/credentials/$', WorkflowJobNodeCredentialsList.as_view(), name='workflow_job_node_credentials_list'),
+    re_path(r'^(?P<pk>[0-9]+)/labels/$', WorkflowJobNodeLabelsList.as_view(), name='workflow_job_node_labels_list'),
+    re_path(r'^(?P<pk>[0-9]+)/instance_groups/$', WorkflowJobNodeInstanceGroupsList.as_view(), name='workflow_job_node_instance_groups_list'),
 ]

 __all__ = ['urls']
--- a/awx/api/urls/workflow_job_template_node.py
+++ b/awx/api/urls/workflow_job_template_node.py
@@ -11,6 +11,8 @@ from awx.api.views import (
    WorkflowJobTemplateNodeAlwaysNodesList,
    WorkflowJobTemplateNodeCredentialsList,
    WorkflowJobTemplateNodeCreateApproval,
+    WorkflowJobTemplateNodeLabelsList,
+    WorkflowJobTemplateNodeInstanceGroupsList,
 )


@@ -21,6 +23,8 @@ urls = [
    re_path(r'^(?P<pk>[0-9]+)/failure_nodes/$', WorkflowJobTemplateNodeFailureNodesList.as_view(), name='workflow_job_template_node_failure_nodes_list'),
    re_path(r'^(?P<pk>[0-9]+)/always_nodes/$', WorkflowJobTemplateNodeAlwaysNodesList.as_view(), name='workflow_job_template_node_always_nodes_list'),
    re_path(r'^(?P<pk>[0-9]+)/credentials/$', WorkflowJobTemplateNodeCredentialsList.as_view(), name='workflow_job_template_node_credentials_list'),
+    re_path(r'^(?P<pk>[0-9]+)/labels/$', WorkflowJobTemplateNodeLabelsList.as_view(), name='workflow_job_template_node_labels_list'),
+    re_path(r'^(?P<pk>[0-9]+)/instance_groups/$', WorkflowJobTemplateNodeInstanceGroupsList.as_view(), name='workflow_job_template_node_instance_groups_list'),
    re_path(r'^(?P<pk>[0-9]+)/create_approval_template/$', WorkflowJobTemplateNodeCreateApproval.as_view(), name='workflow_job_template_node_create_approval'),
 ]

--- a/awx/api/views/__init__.py
+++ b/awx/api/views/__init__.py
@@ -22,6 +22,7 @@ from django.conf import settings
 from django.core.exceptions import FieldError, ObjectDoesNotExist
 from django.db.models import Q, Sum
 from django.db import IntegrityError, ProgrammingError, transaction, connection
+from django.db.models.fields.related import ManyToManyField, ForeignKey
 from django.shortcuts import get_object_or_404
 from django.utils.safestring import mark_safe
 from django.utils.timezone import now
@@ -68,7 +69,6 @@ from awx.api.generics import (
    APIView,
    BaseUsersList,
    CopyAPIView,
-    DeleteLastUnattachLabelMixin,
    GenericAPIView,
    ListAPIView,
    ListCreateAPIView,
@@ -85,6 +85,7 @@ from awx.api.generics import (
    SubListCreateAttachDetachAPIView,
    SubListDestroyAPIView,
 )
+from awx.api.views.labels import LabelSubListCreateAttachDetachView
 from awx.api.versioning import reverse
 from awx.main import models
 from awx.main.utils import (
@@ -93,7 +94,7 @@ from awx.main.utils import (
    get_object_or_400,
    getattrd,
    get_pk_from_dict,
-    schedule_task_manager,
+    ScheduleWorkflowManager,
    ignore_inventory_computed_fields,
 )
 from awx.main.utils.encryption import encrypt_value
@@ -115,13 +116,28 @@ from awx.api.metadata import RoleMetadata
 from awx.main.constants import ACTIVE_STATES, SURVEY_TYPE_MAPPING
 from awx.main.scheduler.dag_workflow import WorkflowDAG
 from awx.api.views.mixin import (
-    ControlledByScmMixin,
    InstanceGroupMembershipMixin,
    OrganizationCountsMixin,
    RelatedJobsPreventDeleteMixin,
    UnifiedJobDeletionMixin,
    NoTruncateMixin,
 )
+from awx.api.views.instance_install_bundle import InstanceInstallBundle  # noqa
+from awx.api.views.inventory import (  # noqa
+    InventoryList,
+    InventoryDetail,
+    InventoryUpdateEventsList,
+    InventoryList,
+    InventoryDetail,
+    InventoryActivityStreamList,
+    InventoryInstanceGroupsList,
+    InventoryAccessList,
+    InventoryObjectRolesList,
+    InventoryJobTemplateList,
+    InventoryLabelList,
+    InventoryCopy,
+)
+from awx.api.views.mesh_visualizer import MeshVisualizer  # noqa
 from awx.api.views.organization import (  # noqa
    OrganizationList,
    OrganizationDetail,
@@ -145,21 +161,6 @@ from awx.api.views.organization import (  # noqa
    OrganizationAccessList,
    OrganizationObjectRolesList,
 )
-from awx.api.views.inventory import (  # noqa
-    InventoryList,
-    InventoryDetail,
-    InventoryUpdateEventsList,
-    InventoryList,
-    InventoryDetail,
-    InventoryActivityStreamList,
-    InventoryInstanceGroupsList,
-    InventoryAccessList,
-    InventoryObjectRolesList,
-    InventoryJobTemplateList,
-    InventoryLabelList,
-    InventoryCopy,
-)
-from awx.api.views.mesh_visualizer import MeshVisualizer  # noqa
 from awx.api.views.root import (  # noqa
    ApiRootView,
    ApiOAuthAuthorizationRootView,
@@ -174,7 +175,6 @@ from awx.api.views.webhooks import WebhookKeyView, GithubWebhookReceiver, Gitlab
 from awx.api.pagination import UnifiedJobEventPagination
 from awx.main.utils import set_environ

-
 logger = logging.getLogger('awx.api.views')


@@ -359,7 +359,7 @@ class DashboardJobsGraphView(APIView):
        return Response(dashboard_data)


-class InstanceList(ListAPIView):
+class InstanceList(ListCreateAPIView):

    name = _("Instances")
    model = models.Instance
@@ -398,6 +398,17 @@ class InstanceUnifiedJobsList(SubListAPIView):
        return qs


+class InstancePeersList(SubListAPIView):
+
+    name = _("Instance Peers")
+    parent_model = models.Instance
+    model = models.Instance
+    serializer_class = serializers.InstanceSerializer
+    parent_access = 'read'
+    search_fields = {'hostname'}
+    relationship = 'peers'
+
+
 class InstanceInstanceGroupsList(InstanceGroupMembershipMixin, SubListCreateAttachDetachAPIView):

    name = _("Instance's Instance Groups")
@@ -440,40 +451,21 @@ class InstanceHealthCheck(GenericAPIView):

    def post(self, request, *args, **kwargs):
        obj = self.get_object()
+        if obj.health_check_pending:
+            return Response({'msg': f"Health check was already in progress for {obj.hostname}."}, status=status.HTTP_200_OK)

-        if obj.node_type == 'execution':
+        # Note: hop nodes are already excluded by the get_queryset method
+        obj.health_check_started = now()
+        obj.save(update_fields=['health_check_started'])
+        if obj.node_type == models.Instance.Types.EXECUTION:
            from awx.main.tasks.system import execution_node_health_check

-            runner_data = execution_node_health_check(obj.hostname)
-            obj.refresh_from_db()
-            data = self.get_serializer(data=request.data).to_representation(obj)
-            # Add in some extra unsaved fields
-            for extra_field in ('transmit_timing', 'run_timing'):
-                if extra_field in runner_data:
-                    data[extra_field] = runner_data[extra_field]
+            execution_node_health_check.apply_async([obj.hostname])
        else:
            from awx.main.tasks.system import cluster_node_health_check

-            if settings.CLUSTER_HOST_ID == obj.hostname:
-                cluster_node_health_check(obj.hostname)
-            else:
-                cluster_node_health_check.apply_async([obj.hostname], queue=obj.hostname)
-                start_time = time.time()
-                prior_check_time = obj.last_health_check
-                while time.time() - start_time < 50.0:
-                    obj.refresh_from_db(fields=['last_health_check'])
-                    if obj.last_health_check != prior_check_time:
-                        break
-                    if time.time() - start_time < 1.0:
-                        time.sleep(0.1)
-                    else:
-                        time.sleep(1.0)
-                else:
-                    obj.mark_offline(errors=_('Health check initiated by user determined this instance to be unresponsive'))
-            obj.refresh_from_db()
-            data = self.get_serializer(data=request.data).to_representation(obj)
-
-        return Response(data, status=status.HTTP_200_OK)
+            cluster_node_health_check.apply_async([obj.hostname], queue=obj.hostname)
+        return Response({'msg': f"Health check is running for {obj.hostname}."}, status=status.HTTP_200_OK)


 class InstanceGroupList(ListCreateAPIView):
@@ -618,6 +610,19 @@ class ScheduleCredentialsList(LaunchConfigCredentialsBase):
    parent_model = models.Schedule


+class ScheduleLabelsList(LabelSubListCreateAttachDetachView):
+
+    parent_model = models.Schedule
+
+
+class ScheduleInstanceGroupList(SubListAttachDetachAPIView):
+
+    model = models.InstanceGroup
+    serializer_class = serializers.InstanceGroupSerializer
+    parent_model = models.Schedule
+    relationship = 'instance_groups'
+
+
 class ScheduleUnifiedJobsList(SubListAPIView):

    model = models.UnifiedJob
@@ -1675,7 +1680,7 @@ class HostList(HostRelatedSearchMixin, ListCreateAPIView):
            return Response(dict(error=_(str(e))), status=status.HTTP_400_BAD_REQUEST)


-class HostDetail(RelatedJobsPreventDeleteMixin, ControlledByScmMixin, RetrieveUpdateDestroyAPIView):
+class HostDetail(RelatedJobsPreventDeleteMixin, RetrieveUpdateDestroyAPIView):

    always_allow_superuser = False
    model = models.Host
@@ -1709,7 +1714,7 @@ class InventoryHostsList(HostRelatedSearchMixin, SubListCreateAttachDetachAPIVie
        return qs


-class HostGroupsList(ControlledByScmMixin, SubListCreateAttachDetachAPIView):
+class HostGroupsList(SubListCreateAttachDetachAPIView):
    '''the list of groups a host is directly a member of'''

    model = models.Group
@@ -1825,7 +1830,7 @@ class EnforceParentRelationshipMixin(object):
        return super(EnforceParentRelationshipMixin, self).create(request, *args, **kwargs)


-class GroupChildrenList(ControlledByScmMixin, EnforceParentRelationshipMixin, SubListCreateAttachDetachAPIView):
+class GroupChildrenList(EnforceParentRelationshipMixin, SubListCreateAttachDetachAPIView):

    model = models.Group
    serializer_class = serializers.GroupSerializer
@@ -1871,7 +1876,7 @@ class GroupPotentialChildrenList(SubListAPIView):
        return qs.exclude(pk__in=except_pks)


-class GroupHostsList(HostRelatedSearchMixin, ControlledByScmMixin, SubListCreateAttachDetachAPIView):
+class GroupHostsList(HostRelatedSearchMixin, SubListCreateAttachDetachAPIView):
    '''the list of hosts directly below a group'''

    model = models.Host
@@ -1935,7 +1940,7 @@ class GroupActivityStreamList(SubListAPIView):
        return qs.filter(Q(group=parent) | Q(host__in=parent.hosts.all()))


-class GroupDetail(RelatedJobsPreventDeleteMixin, ControlledByScmMixin, RetrieveUpdateDestroyAPIView):
+class GroupDetail(RelatedJobsPreventDeleteMixin, RetrieveUpdateDestroyAPIView):

    model = models.Group
    serializer_class = serializers.GroupSerializer
@@ -2382,10 +2387,13 @@ class JobTemplateLaunch(RetrieveAPIView):
            for field, ask_field_name in modified_ask_mapping.items():
                if not getattr(obj, ask_field_name):
                    data.pop(field, None)
-                elif field == 'inventory':
+                elif isinstance(getattr(obj.__class__, field).field, ForeignKey):
                    data[field] = getattrd(obj, "%s.%s" % (field, 'id'), None)
-                elif field == 'credentials':
-                    data[field] = [cred.id for cred in obj.credentials.all()]
+                elif isinstance(getattr(obj.__class__, field).field, ManyToManyField):
+                    if field == 'instance_groups':
+                        data[field] = []
+                        continue
+                    data[field] = [item.id for item in getattr(obj, field).all()]
                else:
                    data[field] = getattr(obj, field)
        return data
@@ -2720,28 +2728,9 @@ class JobTemplateCredentialsList(SubListCreateAttachDetachAPIView):
        return super(JobTemplateCredentialsList, self).is_valid_relation(parent, sub, created)


-class JobTemplateLabelList(DeleteLastUnattachLabelMixin, SubListCreateAttachDetachAPIView):
+class JobTemplateLabelList(LabelSubListCreateAttachDetachView):

-    model = models.Label
-    serializer_class = serializers.LabelSerializer
    parent_model = models.JobTemplate
-    relationship = 'labels'
-
-    def post(self, request, *args, **kwargs):
-        # If a label already exists in the database, attach it instead of erroring out
-        # that it already exists
-        if 'id' not in request.data and 'name' in request.data and 'organization' in request.data:
-            existing = models.Label.objects.filter(name=request.data['name'], organization_id=request.data['organization'])
-            if existing.exists():
-                existing = existing[0]
-                request.data['id'] = existing.id
-                del request.data['name']
-                del request.data['organization']
-        if models.Label.objects.filter(unifiedjobtemplate_labels=self.kwargs['pk']).count() > 100:
-            return Response(
-                dict(msg=_('Maximum number of labels for {} reached.'.format(self.parent_model._meta.verbose_name_raw))), status=status.HTTP_400_BAD_REQUEST
-            )
-        return super(JobTemplateLabelList, self).post(request, *args, **kwargs)


 class JobTemplateCallback(GenericAPIView):
@@ -2967,6 +2956,22 @@ class WorkflowJobNodeCredentialsList(SubListAPIView):
    relationship = 'credentials'


+class WorkflowJobNodeLabelsList(SubListAPIView):
+
+    model = models.Label
+    serializer_class = serializers.LabelSerializer
+    parent_model = models.WorkflowJobNode
+    relationship = 'labels'
+
+
+class WorkflowJobNodeInstanceGroupsList(SubListAttachDetachAPIView):
+
+    model = models.InstanceGroup
+    serializer_class = serializers.InstanceGroupSerializer
+    parent_model = models.WorkflowJobNode
+    relationship = 'instance_groups'
+
+
 class WorkflowJobTemplateNodeList(ListCreateAPIView):

    model = models.WorkflowJobTemplateNode
@@ -2985,6 +2990,19 @@ class WorkflowJobTemplateNodeCredentialsList(LaunchConfigCredentialsBase):
    parent_model = models.WorkflowJobTemplateNode


+class WorkflowJobTemplateNodeLabelsList(LabelSubListCreateAttachDetachView):
+
+    parent_model = models.WorkflowJobTemplateNode
+
+
+class WorkflowJobTemplateNodeInstanceGroupsList(SubListAttachDetachAPIView):
+
+    model = models.InstanceGroup
+    serializer_class = serializers.InstanceGroupSerializer
+    parent_model = models.WorkflowJobTemplateNode
+    relationship = 'instance_groups'
+
+
 class WorkflowJobTemplateNodeChildrenBaseList(EnforceParentRelationshipMixin, SubListCreateAttachDetachAPIView):

    model = models.WorkflowJobTemplateNode
@@ -3197,13 +3215,17 @@ class WorkflowJobTemplateLaunch(RetrieveAPIView):
                data['extra_vars'] = extra_vars
            modified_ask_mapping = models.WorkflowJobTemplate.get_ask_mapping()
            modified_ask_mapping.pop('extra_vars')
-            for field_name, ask_field_name in obj.get_ask_mapping().items():
+
+            for field, ask_field_name in modified_ask_mapping.items():
                if not getattr(obj, ask_field_name):
-                    data.pop(field_name, None)
-                elif field_name == 'inventory':
-                    data[field_name] = getattrd(obj, "%s.%s" % (field_name, 'id'), None)
+                    data.pop(field, None)
+                elif isinstance(getattr(obj.__class__, field).field, ForeignKey):
+                    data[field] = getattrd(obj, "%s.%s" % (field, 'id'), None)
+                elif isinstance(getattr(obj.__class__, field).field, ManyToManyField):
+                    data[field] = [item.id for item in getattr(obj, field).all()]
                else:
-                    data[field_name] = getattr(obj, field_name)
+                    data[field] = getattr(obj, field)
+
        return data

    def post(self, request, *args, **kwargs):
@@ -3392,7 +3414,7 @@ class WorkflowJobCancel(RetrieveAPIView):
        obj = self.get_object()
        if obj.can_cancel:
            obj.cancel()
-            schedule_task_manager()
+            ScheduleWorkflowManager().schedule()
            return Response(status=status.HTTP_202_ACCEPTED)
        else:
            return self.http_method_not_allowed(request, *args, **kwargs)
@@ -3690,15 +3712,21 @@ class JobCreateSchedule(RetrieveAPIView):
            extra_data=config.extra_data,
            survey_passwords=config.survey_passwords,
            inventory=config.inventory,
+            execution_environment=config.execution_environment,
            char_prompts=config.char_prompts,
            credentials=set(config.credentials.all()),
+            labels=set(config.labels.all()),
+            instance_groups=list(config.instance_groups.all()),
        )
        if not request.user.can_access(models.Schedule, 'add', schedule_data):
            raise PermissionDenied()

-        creds_list = schedule_data.pop('credentials')
+        related_fields = ('credentials', 'labels', 'instance_groups')
+        related = [schedule_data.pop(relationship) for relationship in related_fields]
        schedule = models.Schedule.objects.create(**schedule_data)
-        schedule.credentials.add(*creds_list)
+        for relationship, items in zip(related_fields, related):
+            for item in items:
+                getattr(schedule, relationship).add(item)

        data = serializers.ScheduleSerializer(schedule, context=self.get_serializer_context()).data
        data.serializer.instance = None  # hack to avoid permissions.py assuming this is Job model
@@ -3840,7 +3868,7 @@ class JobJobEventsList(BaseJobEventsList):
    def get_queryset(self):
        job = self.get_parent_object()
        self.check_parent_access(job)
-        return job.get_event_queryset().select_related('host').order_by('start_line')
+        return job.get_event_queryset().prefetch_related('job__job_template', 'host').order_by('start_line')


 class JobJobEventsChildrenSummary(APIView):
@@ -4429,18 +4457,6 @@ class NotificationDetail(RetrieveAPIView):
    serializer_class = serializers.NotificationSerializer


-class LabelList(ListCreateAPIView):
-
-    model = models.Label
-    serializer_class = serializers.LabelSerializer
-
-
-class LabelDetail(RetrieveUpdateAPIView):
-
-    model = models.Label
-    serializer_class = serializers.LabelSerializer
-
-
 class ActivityStreamList(SimpleListAPIView):

    model = models.ActivityStream
--- a/awx/api/views/debug.py
+++ b/awx/api/views/debug.py
@@ -0,0 +1,68 @@
+from collections import OrderedDict
+
+from django.conf import settings
+
+from rest_framework.permissions import AllowAny
+from rest_framework.response import Response
+from awx.api.generics import APIView
+
+from awx.main.scheduler import TaskManager, DependencyManager, WorkflowManager
+
+
+class TaskManagerDebugView(APIView):
+    _ignore_model_permissions = True
+    exclude_from_schema = True
+    permission_classes = [AllowAny]
+    prefix = 'Task'
+
+    def get(self, request):
+        TaskManager().schedule()
+        if not settings.AWX_DISABLE_TASK_MANAGERS:
+            msg = f"Running {self.prefix} manager. To disable other triggers to the {self.prefix} manager, set AWX_DISABLE_TASK_MANAGERS to True"
+        else:
+            msg = f"AWX_DISABLE_TASK_MANAGERS is True, this view is the only way to trigger the {self.prefix} manager"
+        return Response(msg)
+
+
+class DependencyManagerDebugView(APIView):
+    _ignore_model_permissions = True
+    exclude_from_schema = True
+    permission_classes = [AllowAny]
+    prefix = 'Dependency'
+
+    def get(self, request):
+        DependencyManager().schedule()
+        if not settings.AWX_DISABLE_TASK_MANAGERS:
+            msg = f"Running {self.prefix} manager. To disable other triggers to the {self.prefix} manager, set AWX_DISABLE_TASK_MANAGERS to True"
+        else:
+            msg = f"AWX_DISABLE_TASK_MANAGERS is True, this view is the only way to trigger the {self.prefix} manager"
+        return Response(msg)
+
+
+class WorkflowManagerDebugView(APIView):
+    _ignore_model_permissions = True
+    exclude_from_schema = True
+    permission_classes = [AllowAny]
+    prefix = 'Workflow'
+
+    def get(self, request):
+        WorkflowManager().schedule()
+        if not settings.AWX_DISABLE_TASK_MANAGERS:
+            msg = f"Running {self.prefix} manager. To disable other triggers to the {self.prefix} manager, set AWX_DISABLE_TASK_MANAGERS to True"
+        else:
+            msg = f"AWX_DISABLE_TASK_MANAGERS is True, this view is the only way to trigger the {self.prefix} manager"
+        return Response(msg)
+
+
+class DebugRootView(APIView):
+    _ignore_model_permissions = True
+    exclude_from_schema = True
+    permission_classes = [AllowAny]
+
+    def get(self, request, format=None):
+        '''List of available debug urls'''
+        data = OrderedDict()
+        data['task_manager'] = '/api/debug/task_manager/'
+        data['dependency_manager'] = '/api/debug/dependency_manager/'
+        data['workflow_manager'] = '/api/debug/workflow_manager/'
+        return Response(data)
--- a/awx/api/views/instance_install_bundle.py
+++ b/awx/api/views/instance_install_bundle.py
@@ -0,0 +1,199 @@
+# Copyright (c) 2018 Red Hat, Inc.
+# All Rights Reserved.
+
+import datetime
+import io
+import ipaddress
+import os
+import tarfile
+
+import asn1
+from awx.api import serializers
+from awx.api.generics import GenericAPIView, Response
+from awx.api.permissions import IsSystemAdminOrAuditor
+from awx.main import models
+from cryptography import x509
+from cryptography.hazmat.primitives import hashes, serialization
+from cryptography.hazmat.primitives.asymmetric import rsa
+from cryptography.x509 import DNSName, IPAddress, ObjectIdentifier, OtherName
+from cryptography.x509.oid import NameOID
+from django.http import HttpResponse
+from django.template.loader import render_to_string
+from django.utils.translation import gettext_lazy as _
+from rest_framework import status
+
+# Red Hat has an OID namespace (RHANANA). Receptor has its own designation under that.
+RECEPTOR_OID = "1.3.6.1.4.1.2312.19.1"
+
+# generate install bundle for the instance
+# install bundle directory structure
+# ├── install_receptor.yml (playbook)
+# ├── inventory.yml
+# ├── group_vars
+# │   └── all.yml
+# ├── receptor
+# │   ├── tls
+# │   │   ├── ca
+# │   │   │   └── receptor-ca.crt
+# │   │   ├── receptor.crt
+# │   │   └── receptor.key
+# │   └── work-public-key.pem
+# └── requirements.yml
+class InstanceInstallBundle(GenericAPIView):
+
+    name = _('Install Bundle')
+    model = models.Instance
+    serializer_class = serializers.InstanceSerializer
+    permission_classes = (IsSystemAdminOrAuditor,)
+
+    def get(self, request, *args, **kwargs):
+        instance_obj = self.get_object()
+
+        if instance_obj.node_type not in ('execution',):
+            return Response(
+                data=dict(msg=_('Install bundle can only be generated for execution nodes.')),
+                status=status.HTTP_400_BAD_REQUEST,
+            )
+
+        with io.BytesIO() as f:
+            with tarfile.open(fileobj=f, mode='w:gz') as tar:
+                # copy /etc/receptor/tls/ca/receptor-ca.crt to receptor/tls/ca in the tar file
+                tar.add(
+                    os.path.realpath('/etc/receptor/tls/ca/receptor-ca.crt'), arcname=f"{instance_obj.hostname}_install_bundle/receptor/tls/ca/receptor-ca.crt"
+                )
+
+                # copy /etc/receptor/signing/work-public-key.pem to receptor/work-public-key.pem
+                tar.add('/etc/receptor/signing/work-public-key.pem', arcname=f"{instance_obj.hostname}_install_bundle/receptor/work-public-key.pem")
+
+                # generate the receptor key and certificate, then write them to receptor/tls/ in the tar file
+                key, cert = generate_receptor_tls(instance_obj)
+
+                key_tarinfo = tarfile.TarInfo(f"{instance_obj.hostname}_install_bundle/receptor/tls/receptor.key")
+                key_tarinfo.size = len(key)
+                tar.addfile(key_tarinfo, io.BytesIO(key))
+
+                cert_tarinfo = tarfile.TarInfo(f"{instance_obj.hostname}_install_bundle/receptor/tls/receptor.crt")
+                cert_tarinfo.size = len(cert)
+                tar.addfile(cert_tarinfo, io.BytesIO(cert))
+
+                # generate and write install_receptor.yml to the tar file
+                playbook = generate_playbook().encode('utf-8')
+                playbook_tarinfo = tarfile.TarInfo(f"{instance_obj.hostname}_install_bundle/install_receptor.yml")
+                playbook_tarinfo.size = len(playbook)
+                tar.addfile(playbook_tarinfo, io.BytesIO(playbook))
+
+                # generate and write inventory.yml to the tar file
+                inventory_yml = generate_inventory_yml(instance_obj).encode('utf-8')
+                inventory_yml_tarinfo = tarfile.TarInfo(f"{instance_obj.hostname}_install_bundle/inventory.yml")
+                inventory_yml_tarinfo.size = len(inventory_yml)
+                tar.addfile(inventory_yml_tarinfo, io.BytesIO(inventory_yml))
+
+                # generate and write group_vars/all.yml to the tar file
+                group_vars = generate_group_vars_all_yml(instance_obj).encode('utf-8')
+                group_vars_tarinfo = tarfile.TarInfo(f"{instance_obj.hostname}_install_bundle/group_vars/all.yml")
+                group_vars_tarinfo.size = len(group_vars)
+                tar.addfile(group_vars_tarinfo, io.BytesIO(group_vars))
+
+                # generate and write requirements.yml to the tar file
+                requirements_yml = generate_requirements_yml().encode('utf-8')
+                requirements_yml_tarinfo = tarfile.TarInfo(f"{instance_obj.hostname}_install_bundle/requirements.yml")
+                requirements_yml_tarinfo.size = len(requirements_yml)
+                tar.addfile(requirements_yml_tarinfo, io.BytesIO(requirements_yml))
+
+            # respond with the tarfile
+            f.seek(0)
+            response = HttpResponse(f.read(), status=status.HTTP_200_OK)
+            response['Content-Disposition'] = f"attachment; filename={instance_obj.hostname}_install_bundle.tar.gz"
+            return response
+
+
+def generate_playbook():
+    return render_to_string("instance_install_bundle/install_receptor.yml")
+
+
+def generate_requirements_yml():
+    return render_to_string("instance_install_bundle/requirements.yml")
+
+
+def generate_inventory_yml(instance_obj):
+    return render_to_string("instance_install_bundle/inventory.yml", context=dict(instance=instance_obj))
+
+
+def generate_group_vars_all_yml(instance_obj):
+    return render_to_string("instance_install_bundle/group_vars/all.yml", context=dict(instance=instance_obj))
+
+
+def generate_receptor_tls(instance_obj):
+    # generate private key for the receptor
+    key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
+
+    # encode receptor hostname to asn1
+    hostname = instance_obj.hostname
+    encoder = asn1.Encoder()
+    encoder.start()
+    encoder.write(hostname.encode(), nr=asn1.Numbers.UTF8String)
+    hostname_asn1 = encoder.output()
+
+    san_params = [
+        DNSName(hostname),
+        OtherName(ObjectIdentifier(RECEPTOR_OID), hostname_asn1),
+    ]
+
+    try:
+        san_params.append(IPAddress(ipaddress.IPv4Address(hostname)))
+    except ipaddress.AddressValueError:
+        pass
+
+    # generate certificate for the receptor
+    csr = (
+        x509.CertificateSigningRequestBuilder()
+        .subject_name(
+            x509.Name(
+                [
+                    x509.NameAttribute(NameOID.COMMON_NAME, hostname),
+                ]
+            )
+        )
+        .add_extension(
+            x509.SubjectAlternativeName(san_params),
+            critical=False,
+        )
+        .sign(key, hashes.SHA256())
+    )
+
+    # sign csr with the receptor ca key from /etc/receptor/tls/ca/receptor-ca.key
+    with open('/etc/receptor/tls/ca/receptor-ca.key', 'rb') as f:
+        ca_key = serialization.load_pem_private_key(
+            f.read(),
+            password=None,
+        )
+
+    with open('/etc/receptor/tls/ca/receptor-ca.crt', 'rb') as f:
+        ca_cert = x509.load_pem_x509_certificate(f.read())
+
+    cert = (
+        x509.CertificateBuilder()
+        .subject_name(csr.subject)
+        .issuer_name(ca_cert.issuer)
+        .public_key(csr.public_key())
+        .serial_number(x509.random_serial_number())
+        .not_valid_before(datetime.datetime.utcnow())
+        .not_valid_after(datetime.datetime.utcnow() + datetime.timedelta(days=10))
+        .add_extension(
+            csr.extensions.get_extension_for_class(x509.SubjectAlternativeName).value,
+            critical=csr.extensions.get_extension_for_class(x509.SubjectAlternativeName).critical,
+        )
+        .sign(ca_key, hashes.SHA256())
+    )
+
+    key = key.private_bytes(
+        encoding=serialization.Encoding.PEM,
+        format=serialization.PrivateFormat.TraditionalOpenSSL,
+        encryption_algorithm=serialization.NoEncryption(),
+    )
+
+    cert = cert.public_bytes(
+        encoding=serialization.Encoding.PEM,
+    )
+
+    return key, cert
--- a/awx/api/views/inventory.py
+++ b/awx/api/views/inventory.py
@@ -18,8 +18,6 @@ from rest_framework import status
 # AWX
 from awx.main.models import ActivityStream, Inventory, JobTemplate, Role, User, InstanceGroup, InventoryUpdateEvent, InventoryUpdate

-from awx.main.models.label import Label
-
 from awx.api.generics import (
    ListCreateAPIView,
    RetrieveUpdateDestroyAPIView,
@@ -27,9 +25,8 @@ from awx.api.generics import (
    SubListAttachDetachAPIView,
    ResourceAccessList,
    CopyAPIView,
-    DeleteLastUnattachLabelMixin,
-    SubListCreateAttachDetachAPIView,
 )
+from awx.api.views.labels import LabelSubListCreateAttachDetachView


 from awx.api.serializers import (
@@ -39,9 +36,8 @@ from awx.api.serializers import (
    InstanceGroupSerializer,
    InventoryUpdateEventSerializer,
    JobTemplateSerializer,
-    LabelSerializer,
 )
-from awx.api.views.mixin import RelatedJobsPreventDeleteMixin, ControlledByScmMixin
+from awx.api.views.mixin import RelatedJobsPreventDeleteMixin

 from awx.api.pagination import UnifiedJobEventPagination

@@ -75,7 +71,7 @@ class InventoryList(ListCreateAPIView):
    serializer_class = InventorySerializer


-class InventoryDetail(RelatedJobsPreventDeleteMixin, ControlledByScmMixin, RetrieveUpdateDestroyAPIView):
+class InventoryDetail(RelatedJobsPreventDeleteMixin, RetrieveUpdateDestroyAPIView):

    model = Inventory
    serializer_class = InventorySerializer
@@ -157,28 +153,9 @@ class InventoryJobTemplateList(SubListAPIView):
        return qs.filter(inventory=parent)


-class InventoryLabelList(DeleteLastUnattachLabelMixin, SubListCreateAttachDetachAPIView, SubListAPIView):
+class InventoryLabelList(LabelSubListCreateAttachDetachView):

-    model = Label
-    serializer_class = LabelSerializer
    parent_model = Inventory
-    relationship = 'labels'
-
-    def post(self, request, *args, **kwargs):
-        # If a label already exists in the database, attach it instead of erroring out
-        # that it already exists
-        if 'id' not in request.data and 'name' in request.data and 'organization' in request.data:
-            existing = Label.objects.filter(name=request.data['name'], organization_id=request.data['organization'])
-            if existing.exists():
-                existing = existing[0]
-                request.data['id'] = existing.id
-                del request.data['name']
-                del request.data['organization']
-        if Label.objects.filter(inventory_labels=self.kwargs['pk']).count() > 100:
-            return Response(
-                dict(msg=_('Maximum number of labels for {} reached.'.format(self.parent_model._meta.verbose_name_raw))), status=status.HTTP_400_BAD_REQUEST
-            )
-        return super(InventoryLabelList, self).post(request, *args, **kwargs)


 class InventoryCopy(CopyAPIView):
--- a/awx/api/views/labels.py
+++ b/awx/api/views/labels.py
@@ -0,0 +1,71 @@
+# AWX
+from awx.api.generics import SubListCreateAttachDetachAPIView, RetrieveUpdateAPIView, ListCreateAPIView
+from awx.main.models import Label
+from awx.api.serializers import LabelSerializer
+
+# Django
+from django.utils.translation import gettext_lazy as _
+
+# Django REST Framework
+from rest_framework.response import Response
+from rest_framework.status import HTTP_400_BAD_REQUEST
+
+
+class LabelSubListCreateAttachDetachView(SubListCreateAttachDetachAPIView):
+    """
+    For related labels lists like /api/v2/inventories/N/labels/
+
+    We want the last instance to be deleted from the database
+    when the last disassociate happens.
+
+    Subclasses need to define parent_model
+    """
+
+    model = Label
+    serializer_class = LabelSerializer
+    relationship = 'labels'
+
+    def unattach(self, request, *args, **kwargs):
+        (sub_id, res) = super().unattach_validate(request)
+        if res:
+            return res
+
+        res = super().unattach_by_id(request, sub_id)
+
+        obj = self.model.objects.get(id=sub_id)
+
+        if obj.is_detached():
+            obj.delete()
+
+        return res
+
+    def post(self, request, *args, **kwargs):
+        # If a label already exists in the database, attach it instead of erroring out
+        # that it already exists
+        if 'id' not in request.data and 'name' in request.data and 'organization' in request.data:
+            existing = Label.objects.filter(name=request.data['name'], organization_id=request.data['organization'])
+            if existing.exists():
+                existing = existing[0]
+                request.data['id'] = existing.id
+                del request.data['name']
+                del request.data['organization']
+
+        # Give a 400 error if we have attached too many labels to this object
+        label_filter = self.parent_model._meta.get_field(self.relationship).remote_field.name
+        if Label.objects.filter(**{label_filter: self.kwargs['pk']}).count() > 100:
+            return Response(dict(msg=_('Maximum number of labels for {} reached.').format(self.parent_model._meta.verbose_name_raw)), status=HTTP_400_BAD_REQUEST)
+
+        return super().post(request, *args, **kwargs)
+
+
+class LabelDetail(RetrieveUpdateAPIView):
+
+    model = Label
+    serializer_class = LabelSerializer
+
+
+class LabelList(ListCreateAPIView):
+
+    name = _("Labels")
+    model = Label
+    serializer_class = LabelSerializer
--- a/awx/api/views/mixin.py
+++ b/awx/api/views/mixin.py
@@ -10,13 +10,12 @@ from django.shortcuts import get_object_or_404
 from django.utils.timezone import now
 from django.utils.translation import gettext_lazy as _

-from rest_framework.permissions import SAFE_METHODS
 from rest_framework.exceptions import PermissionDenied
 from rest_framework.response import Response
 from rest_framework import status

 from awx.main.constants import ACTIVE_STATES
-from awx.main.utils import get_object_or_400, parse_yaml_or_json
+from awx.main.utils import get_object_or_400
 from awx.main.models.ha import Instance, InstanceGroup
 from awx.main.models.organization import Team
 from awx.main.models.projects import Project
@@ -186,35 +185,6 @@ class OrganizationCountsMixin(object):
        return full_context


-class ControlledByScmMixin(object):
-    """
-    Special method to reset SCM inventory commit hash
-    if anything that it manages changes.
-    """
-
-    def _reset_inv_src_rev(self, obj):
-        if self.request.method in SAFE_METHODS or not obj:
-            return
-        project_following_sources = obj.inventory_sources.filter(update_on_project_update=True, source='scm')
-        if project_following_sources:
-            # Allow inventory changes unrelated to variables
-            if self.model == Inventory and (
-                not self.request or not self.request.data or parse_yaml_or_json(self.request.data.get('variables', '')) == parse_yaml_or_json(obj.variables)
-            ):
-                return
-            project_following_sources.update(scm_last_revision='')
-
-    def get_object(self):
-        obj = super(ControlledByScmMixin, self).get_object()
-        self._reset_inv_src_rev(obj)
-        return obj
-
-    def get_parent_object(self):
-        obj = super(ControlledByScmMixin, self).get_parent_object()
-        self._reset_inv_src_rev(obj)
-        return obj
-
-
 class NoTruncateMixin(object):
    def get_serializer_context(self):
        context = super().get_serializer_context()
--- a/awx/conf/settings.py
+++ b/awx/conf/settings.py
@@ -80,7 +80,7 @@ def _ctit_db_wrapper(trans_safe=False):
        yield
    except DBError as exc:
        if trans_safe:
-            level = logger.exception
+            level = logger.warning
            if isinstance(exc, ProgrammingError):
                if 'relation' in str(exc) and 'does not exist' in str(exc):
                    # this generally means we can't fetch Tower configuration
@@ -89,7 +89,7 @@ def _ctit_db_wrapper(trans_safe=False):
                    # has come up *before* the database has finished migrating, and
                    # especially that the conf.settings table doesn't exist yet
                    level = logger.debug
-            level('Database settings are not available, using defaults.')
+            level(f'Database settings are not available, using defaults. error: {exc}')
        else:
            logger.exception('Error modifying something related to database settings.')
    finally:
--- a/awx/locale/ja/LC_MESSAGES/django.po
+++ b/awx/locale/ja/LC_MESSAGES/django.po
@@ -1440,7 +1440,7 @@ msgstr "指定した認証情報は無効 (HTTP 401) です。"

 #: awx/api/views/root.py:193 awx/api/views/root.py:234
 msgid "Unable to connect to proxy server."
-msgstr "プロキシサーバーに接続できません。"
+msgstr "プロキシーサーバーに接続できません。"

 #: awx/api/views/root.py:195 awx/api/views/root.py:236
 msgid "Could not connect to subscription service."
@@ -1976,7 +1976,7 @@ msgstr "リモートホスト名または IP を判別するために検索す

 #: awx/main/conf.py:85
 msgid "Proxy IP Allowed List"
-msgstr "プロキシ IP 許可リスト"
+msgstr "プロキシー IP 許可リスト"

 #: awx/main/conf.py:87
 msgid ""
@@ -2198,7 +2198,7 @@ msgid ""
 "Follow symbolic links when scanning for playbooks. Be aware that setting "
 "this to True can lead to infinite recursion if a link points to a parent "
 "directory of itself."
-msgstr "Playbook をスキャンするときは、シンボリックリンクをたどってください。リンクがそれ自体の親ディレクトリーを指している場合は、これを True に設定すると、無限再帰が発生する可能性があることに注意してください。"
+msgstr "Playbook のスキャン時にシンボリックリンクをたどります。リンクが親ディレクトリーを参照している場合には、この設定を True に指定すると無限再帰が発生する可能性があります。"

 #: awx/main/conf.py:337
 msgid "Ignore Ansible Galaxy SSL Certificate Verification"
@@ -2499,7 +2499,7 @@ msgstr "Insights for Ansible Automation Platform の最終収集日。"
 msgid ""
 "Last gathered entries for expensive collectors for Insights for Ansible "
 "Automation Platform."
-msgstr "Insights for Ansible Automation Platform の高価なコレクターの最後に収集されたエントリー。"
+msgstr "Insights for Ansible Automation Platform でコストがかかっているコレクターに関して最後に収集されたエントリー"

 #: awx/main/conf.py:686
 msgid "Insights for Ansible Automation Platform Gather Interval"
@@ -3692,7 +3692,7 @@ msgstr "タスクの開始"

 #: awx/main/models/events.py:189
 msgid "Variables Prompted"
-msgstr "変数のプロモート"
+msgstr "提示される変数"

 #: awx/main/models/events.py:190
 msgid "Gathering Facts"
@@ -3741,15 +3741,15 @@ msgstr "エラー"

 #: awx/main/models/execution_environments.py:17
 msgid "Always pull container before running."
-msgstr "実行前に必ずコンテナーをプルしてください。"
+msgstr "実行前に必ずコンテナーをプルする"

 #: awx/main/models/execution_environments.py:18
 msgid "Only pull the image if not present before running."
-msgstr "実行する前に、存在しない場合にのみイメージをプルしてください。"
+msgstr "イメージが存在しない場合のみ実行前にプルする"

 #: awx/main/models/execution_environments.py:19
 msgid "Never pull container before running."
-msgstr "実行前にコンテナーをプルしないでください。"
+msgstr "実行前にコンテナーをプルしない"

 #: awx/main/models/execution_environments.py:29
 msgid ""
@@ -5228,7 +5228,7 @@ msgid ""
 "SSL) or \"ldaps://ldap.example.com:636\" (SSL). Multiple LDAP servers may be "
 "specified by separating with spaces or commas. LDAP authentication is "
 "disabled if this parameter is empty."
-msgstr "\"ldap://ldap.example.com:389\" (非 SSL) または \"ldaps://ldap.example.com:636\" (SSL) などの LDAP サーバーに接続する URI です。複数の LDAP サーバーをスペースまたはカンマで区切って指定できます。LDAP 認証は、このパラメーターが空の場合は無効になります。"
+msgstr "\"ldap://ldap.example.com:389\" (非 SSL) または \"ldaps://ldap.example.com:636\" (SSL) などの LDAP サーバーに接続する URI です。複数の LDAP サーバーをスペースまたはコンマで区切って指定できます。LDAP 認証は、このパラメーターが空の場合は無効になります。"

 #: awx/sso/conf.py:170 awx/sso/conf.py:187 awx/sso/conf.py:198
 #: awx/sso/conf.py:209 awx/sso/conf.py:226 awx/sso/conf.py:244
@@ -6236,4 +6236,5 @@ msgstr "%s が現在アップグレード中です。"

 #: awx/ui/urls.py:24
 msgid "This page will refresh when complete."
-msgstr "このページは完了すると更新されます。"
+msgstr "このページは完了すると更新されます。"
+
--- a/awx/locale/ko/LC_MESSAGES/django.po
+++ b/awx/locale/ko/LC_MESSAGES/django.po
@@ -956,7 +956,7 @@ msgstr "인스턴스 그룹의 인스턴스"

 #: awx/api/views/__init__.py:450
 msgid "Schedules"
-msgstr "일정"
+msgstr "스케줄"

 #: awx/api/views/__init__.py:464
 msgid "Schedule Recurrence Rule Preview"
@@ -3261,7 +3261,7 @@ msgstr "JSON 또는 YAML 구문을 사용하여 인젝터를 입력합니다.
 #: awx/main/models/credential/__init__.py:412
 #, python-format
 msgid "adding %s credential type"
-msgstr "인증 정보 유형 %s 추가 중"
+msgstr "인증 정보 유형  %s 추가 중"

 #: awx/main/models/credential/__init__.py:590
 #: awx/main/models/credential/__init__.py:672
@@ -6236,4 +6236,5 @@ msgstr "%s 현재 업그레이드 중입니다."

 #: awx/ui/urls.py:24
 msgid "This page will refresh when complete."
-msgstr "완료되면 이 페이지가 새로 고침됩니다."
+msgstr "완료되면 이 페이지가 새로 고침됩니다."
+
--- a/awx/locale/zh/LC_MESSAGES/django.po
+++ b/awx/locale/zh/LC_MESSAGES/django.po
@@ -348,7 +348,7 @@ msgstr "SCM track_submodules 只能用于 git 项目。"
 msgid ""
 "Only Container Registry credentials can be associated with an Execution "
 "Environment"
-msgstr "只有容器 registry 凭证可以与执行环境关联"
+msgstr "只有容器注册表凭证才可以与执行环境关联"

 #: awx/api/serializers.py:1440
 msgid "Cannot change the organization of an execution environment"
@@ -629,7 +629,7 @@ msgstr "不支持在不替换的情况下在启动时删除 {} 凭证。提供

 #: awx/api/serializers.py:4338
 msgid "The inventory associated with this Workflow is being deleted."
-msgstr "与此 Workflow 关联的清单将被删除。"
+msgstr "与此工作流关联的清单将被删除。"

 #: awx/api/serializers.py:4405
 msgid "Message type '{}' invalid, must be either 'message' or 'body'"
@@ -3229,7 +3229,7 @@ msgstr "云"
 #: awx/main/models/credential/__init__.py:336
 #: awx/main/models/credential/__init__.py:1113
 msgid "Container Registry"
-msgstr "容器 Registry"
+msgstr "容器注册表"

 #: awx/main/models/credential/__init__.py:337
 msgid "Personal Access Token"
@@ -3560,7 +3560,7 @@ msgstr "身份验证 URL"

 #: awx/main/models/credential/__init__.py:1120
 msgid "Authentication endpoint for the container registry."
-msgstr "容器 registry 的身份验证端点。"
+msgstr "容器注册表的身份验证端点。"

 #: awx/main/models/credential/__init__.py:1130
 msgid "Password or Token"
@@ -3764,7 +3764,7 @@ msgstr "镜像位置"
 msgid ""
 "The full image location, including the container registry, image name, and "
 "version tag."
-msgstr "完整镜像位置，包括容器 registry、镜像名称和版本标签。"
+msgstr "完整镜像位置，包括容器注册表、镜像名称和版本标签。"

 #: awx/main/models/execution_environments.py:51
 msgid "Pull image before running?"
@@ -6238,4 +6238,5 @@ msgstr "%s 当前正在升级。"

 #: awx/ui/urls.py:24
 msgid "This page will refresh when complete."
-msgstr "完成后，此页面会刷新。"
+msgstr "完成后，此页面会刷新。"
+
--- a/awx/main/access.py
+++ b/awx/main/access.py
@@ -12,7 +12,7 @@ from django.conf import settings
 from django.db.models import Q, Prefetch
 from django.contrib.auth.models import User
 from django.utils.translation import gettext_lazy as _
-from django.core.exceptions import ObjectDoesNotExist
+from django.core.exceptions import ObjectDoesNotExist, FieldDoesNotExist

 # Django REST Framework
 from rest_framework.exceptions import ParseError, PermissionDenied
@@ -281,13 +281,23 @@ class BaseAccess(object):
        """
        return True

+    def assure_relationship_exists(self, obj, relationship):
+        if '.' in relationship:
+            return  # not attempting validation for complex relationships now
+        try:
+            obj._meta.get_field(relationship)
+        except FieldDoesNotExist:
+            raise NotImplementedError(f'The relationship {relationship} does not exist for model {type(obj)}')
+
    def can_attach(self, obj, sub_obj, relationship, data, skip_sub_obj_read_check=False):
+        self.assure_relationship_exists(obj, relationship)
        if skip_sub_obj_read_check:
            return self.can_change(obj, None)
        else:
            return bool(self.can_change(obj, None) and self.user.can_access(type(sub_obj), 'read', sub_obj))

    def can_unattach(self, obj, sub_obj, relationship, data=None):
+        self.assure_relationship_exists(obj, relationship)
        return self.can_change(obj, data)

    def check_related(self, field, Model, data, role_field='admin_role', obj=None, mandatory=False):
@@ -328,6 +338,8 @@ class BaseAccess(object):
            role = getattr(resource, role_field, None)
            if role is None:
                # Handle special case where resource does not have direct roles
+                if role_field == 'read_role':
+                    return self.user.can_access(type(resource), 'read', resource)
                access_method_type = {'admin_role': 'change', 'execute_role': 'start'}[role_field]
                return self.user.can_access(type(resource), access_method_type, resource, None)
            return self.user in role
@@ -499,6 +511,21 @@ class BaseAccess(object):
        return False


+class UnifiedCredentialsMixin(BaseAccess):
+    """
+    The credentials many-to-many is a standard relationship for JT, jobs, and others
+    Attaching requires use permission on the credential (plus change permission on the parent object); unattaching requires admin on the parent object
+    """
+
+    @check_superuser
+    def can_attach(self, obj, sub_obj, relationship, data, skip_sub_obj_read_check=False):
+        if relationship == 'credentials':
+            if not isinstance(sub_obj, Credential):
+                raise RuntimeError(f'Can only attach credentials to credentials relationship, got {type(sub_obj)}')
+            return self.can_change(obj, None) and (self.user in sub_obj.use_role)
+        return super().can_attach(obj, sub_obj, relationship, data, skip_sub_obj_read_check=skip_sub_obj_read_check)
+
+
 class NotificationAttachMixin(BaseAccess):
    """For models that can have notifications attached

@@ -552,7 +579,8 @@ class InstanceAccess(BaseAccess):
        return super(InstanceAccess, self).can_unattach(obj, sub_obj, relationship, relationship, data=data)

    def can_add(self, data):
-        return False
+
+        return self.user.is_superuser

    def can_change(self, obj, data):
        return False
@@ -1031,7 +1059,7 @@ class GroupAccess(BaseAccess):
        return bool(obj and self.user in obj.inventory.admin_role)


-class InventorySourceAccess(NotificationAttachMixin, BaseAccess):
+class InventorySourceAccess(NotificationAttachMixin, UnifiedCredentialsMixin, BaseAccess):
    """
    I can see inventory sources whenever I can see their inventory.
    I can change inventory sources whenever I can change their inventory.
@@ -1075,18 +1103,6 @@ class InventorySourceAccess(NotificationAttachMixin, BaseAccess):
            return self.user in obj.inventory.update_role
        return False

-    @check_superuser
-    def can_attach(self, obj, sub_obj, relationship, data, skip_sub_obj_read_check=False):
-        if relationship == 'credentials' and isinstance(sub_obj, Credential):
-            return obj and obj.inventory and self.user in obj.inventory.admin_role and self.user in sub_obj.use_role
-        return super(InventorySourceAccess, self).can_attach(obj, sub_obj, relationship, data, skip_sub_obj_read_check=skip_sub_obj_read_check)
-
-    @check_superuser
-    def can_unattach(self, obj, sub_obj, relationship, *args, **kwargs):
-        if relationship == 'credentials' and isinstance(sub_obj, Credential):
-            return obj and obj.inventory and self.user in obj.inventory.admin_role
-        return super(InventorySourceAccess, self).can_attach(obj, sub_obj, relationship, *args, **kwargs)
-

 class InventoryUpdateAccess(BaseAccess):
    """
@@ -1485,7 +1501,7 @@ class ProjectUpdateAccess(BaseAccess):
        return obj and self.user in obj.project.admin_role


-class JobTemplateAccess(NotificationAttachMixin, BaseAccess):
+class JobTemplateAccess(NotificationAttachMixin, UnifiedCredentialsMixin, BaseAccess):
    """
    I can see job templates when:
     - I have read role for the job template.
@@ -1549,8 +1565,7 @@ class JobTemplateAccess(NotificationAttachMixin, BaseAccess):
            if self.user not in inventory.use_role:
                return False

-        ee = get_value(ExecutionEnvironment, 'execution_environment')
-        if ee and not self.user.can_access(ExecutionEnvironment, 'read', ee):
+        if not self.check_related('execution_environment', ExecutionEnvironment, data, role_field='read_role'):
            return False

        project = get_value(Project, 'project')
@@ -1600,10 +1615,8 @@ class JobTemplateAccess(NotificationAttachMixin, BaseAccess):
        if self.changes_are_non_sensitive(obj, data):
            return True

-        if data.get('execution_environment'):
-            ee = get_object_from_data('execution_environment', ExecutionEnvironment, data)
-            if not self.user.can_access(ExecutionEnvironment, 'read', ee):
-                return False
+        if not self.check_related('execution_environment', ExecutionEnvironment, data, obj=obj, role_field='read_role'):
+            return False

        for required_field, cls in (('inventory', Inventory), ('project', Project)):
            is_mandatory = True
@@ -1667,17 +1680,13 @@ class JobTemplateAccess(NotificationAttachMixin, BaseAccess):
            if not obj.organization:
                return False
            return self.user.can_access(type(sub_obj), "read", sub_obj) and self.user in obj.organization.admin_role
-        if relationship == 'credentials' and isinstance(sub_obj, Credential):
-            return self.user in obj.admin_role and self.user in sub_obj.use_role
        return super(JobTemplateAccess, self).can_attach(obj, sub_obj, relationship, data, skip_sub_obj_read_check=skip_sub_obj_read_check)

    @check_superuser
    def can_unattach(self, obj, sub_obj, relationship, *args, **kwargs):
        if relationship == "instance_groups":
            return self.can_attach(obj, sub_obj, relationship, *args, **kwargs)
-        if relationship == 'credentials' and isinstance(sub_obj, Credential):
-            return self.user in obj.admin_role
-        return super(JobTemplateAccess, self).can_attach(obj, sub_obj, relationship, *args, **kwargs)
+        return super(JobTemplateAccess, self).can_unattach(obj, sub_obj, relationship, *args, **kwargs)


 class JobAccess(BaseAccess):
@@ -1824,7 +1833,7 @@ class SystemJobAccess(BaseAccess):
        return False  # no relaunching of system jobs


-class JobLaunchConfigAccess(BaseAccess):
+class JobLaunchConfigAccess(UnifiedCredentialsMixin, BaseAccess):
    """
    Launch configs must have permissions checked for
     - relaunching
@@ -1832,63 +1841,69 @@ class JobLaunchConfigAccess(BaseAccess):

    In order to create a new object with a copy of this launch config, I need:
     - use access to related inventory (if present)
+     - read access to Execution Environment (if present), unless the specified ee is already in the template
     - use role to many-related credentials (if any present)
+     - read access to many-related labels (if any present), unless the specified label is already in the template
+     - read access to many-related instance groups (if any present), unless the specified instance group is already in the template
    """

    model = JobLaunchConfig
    select_related = 'job'
    prefetch_related = ('credentials', 'inventory')

-    def _unusable_creds_exist(self, qs):
-        return qs.exclude(pk__in=Credential._accessible_pk_qs(Credential, self.user, 'use_role')).exists()
+    M2M_CHECKS = {'credentials': Credential, 'labels': Label, 'instance_groups': InstanceGroup}

-    def has_credentials_access(self, obj):
-        # user has access if no related credentials exist that the user lacks use role for
-        return not self._unusable_creds_exist(obj.credentials)
+    def _related_filtered_queryset(self, cls):
+        if cls is Label:
+            return LabelAccess(self.user).filtered_queryset()
+        elif cls is InstanceGroup:
+            return InstanceGroupAccess(self.user).filtered_queryset()
+        else:
+            return cls._accessible_pk_qs(cls, self.user, 'use_role')
+
+    def has_obj_m2m_access(self, obj):
+        for relationship, cls in self.M2M_CHECKS.items():
+            if getattr(obj, relationship).exclude(pk__in=self._related_filtered_queryset(cls)).exists():
+                return False
+        return True

    @check_superuser
    def can_add(self, data, template=None):
        # This is a special case, we don't check related many-to-many elsewhere
        # launch RBAC checks use this
-        if 'credentials' in data and data['credentials'] or 'reference_obj' in data:
-            if 'reference_obj' in data:
-                prompted_cred_qs = data['reference_obj'].credentials.all()
-            else:
-                # If given model objects, only use the primary key from them
-                cred_pks = [cred.pk for cred in data['credentials']]
-                if template:
-                    for cred in template.credentials.all():
-                        if cred.pk in cred_pks:
-                            cred_pks.remove(cred.pk)
-                prompted_cred_qs = Credential.objects.filter(pk__in=cred_pks)
-            if self._unusable_creds_exist(prompted_cred_qs):
+        if 'reference_obj' in data:
+            if not self.has_obj_m2m_access(data['reference_obj']):
                return False
-        return self.check_related('inventory', Inventory, data, role_field='use_role')
+        else:
+            for relationship, cls in self.M2M_CHECKS.items():
+                if relationship in data and data[relationship]:
+                    # If given model objects, only use the primary key from them
+                    sub_obj_pks = [sub_obj.pk for sub_obj in data[relationship]]
+                    if template:
+                        for sub_obj in getattr(template, relationship).all():
+                            if sub_obj.pk in sub_obj_pks:
+                                sub_obj_pks.remove(sub_obj.pk)
+                    if cls.objects.filter(pk__in=sub_obj_pks).exclude(pk__in=self._related_filtered_queryset(cls)).exists():
+                        return False
+        return self.check_related('inventory', Inventory, data, role_field='use_role') and self.check_related(
+            'execution_environment', ExecutionEnvironment, data, role_field='read_role'
+        )

    @check_superuser
    def can_use(self, obj):
-        return self.check_related('inventory', Inventory, {}, obj=obj, role_field='use_role', mandatory=True) and self.has_credentials_access(obj)
+        return (
+            self.has_obj_m2m_access(obj)
+            and self.check_related('inventory', Inventory, {}, obj=obj, role_field='use_role', mandatory=True)
+            and self.check_related('execution_environment', ExecutionEnvironment, {}, obj=obj, role_field='read_role')
+        )

    def can_change(self, obj, data):
-        return self.check_related('inventory', Inventory, data, obj=obj, role_field='use_role')
-
-    def can_attach(self, obj, sub_obj, relationship, data, skip_sub_obj_read_check=False):
-        if isinstance(sub_obj, Credential) and relationship == 'credentials':
-            return self.user in sub_obj.use_role
-        else:
-            raise NotImplementedError('Only credentials can be attached to launch configurations.')
-
-    def can_unattach(self, obj, sub_obj, relationship, data, skip_sub_obj_read_check=False):
-        if isinstance(sub_obj, Credential) and relationship == 'credentials':
-            if skip_sub_obj_read_check:
-                return True
-            else:
-                return self.user in sub_obj.read_role
-        else:
-            raise NotImplementedError('Only credentials can be attached to launch configurations.')
+        return self.check_related('inventory', Inventory, data, obj=obj, role_field='use_role') and self.check_related(
+            'execution_environment', ExecutionEnvironment, data, obj=obj, role_field='read_role'
+        )


-class WorkflowJobTemplateNodeAccess(BaseAccess):
+class WorkflowJobTemplateNodeAccess(UnifiedCredentialsMixin, BaseAccess):
    """
    I can see/use a WorkflowJobTemplateNode if I have read permission
        to associated Workflow Job Template
@@ -1911,7 +1926,7 @@ class WorkflowJobTemplateNodeAccess(BaseAccess):
    """

    model = WorkflowJobTemplateNode
-    prefetch_related = ('success_nodes', 'failure_nodes', 'always_nodes', 'unified_job_template', 'credentials', 'workflow_job_template')
+    prefetch_related = ('success_nodes', 'failure_nodes', 'always_nodes', 'unified_job_template', 'workflow_job_template')

    def filtered_queryset(self):
        return self.model.objects.filter(workflow_job_template__in=WorkflowJobTemplate.accessible_objects(self.user, 'read_role'))
@@ -1923,7 +1938,8 @@ class WorkflowJobTemplateNodeAccess(BaseAccess):
        return (
            self.check_related('workflow_job_template', WorkflowJobTemplate, data, mandatory=True)
            and self.check_related('unified_job_template', UnifiedJobTemplate, data, role_field='execute_role')
-            and JobLaunchConfigAccess(self.user).can_add(data)
+            and self.check_related('inventory', Inventory, data, role_field='use_role')
+            and self.check_related('execution_environment', ExecutionEnvironment, data, role_field='read_role')
        )

    def wfjt_admin(self, obj):
@@ -1932,17 +1948,14 @@ class WorkflowJobTemplateNodeAccess(BaseAccess):
        else:
            return self.user in obj.workflow_job_template.admin_role

-    def ujt_execute(self, obj):
+    def ujt_execute(self, obj, data=None):
        if not obj.unified_job_template:
            return True
-        return self.check_related('unified_job_template', UnifiedJobTemplate, {}, obj=obj, role_field='execute_role', mandatory=True)
+        return self.check_related('unified_job_template', UnifiedJobTemplate, data, obj=obj, role_field='execute_role', mandatory=True)

    def can_change(self, obj, data):
-        if not data:
-            return True
-
        # should not be able to edit the prompts if lacking access to UJT or WFJT
-        return self.ujt_execute(obj) and self.wfjt_admin(obj) and JobLaunchConfigAccess(self.user).can_change(obj, data)
+        return self.ujt_execute(obj, data=data) and self.wfjt_admin(obj) and JobLaunchConfigAccess(self.user).can_change(obj, data)

    def can_delete(self, obj):
        return self.wfjt_admin(obj)
@@ -1955,29 +1968,14 @@ class WorkflowJobTemplateNodeAccess(BaseAccess):
        return True

    def can_attach(self, obj, sub_obj, relationship, data, skip_sub_obj_read_check=False):
-        if not self.wfjt_admin(obj):
-            return False
-        if relationship == 'credentials':
-            # Need permission to related template to attach a credential
-            if not self.ujt_execute(obj):
-                return False
-            return JobLaunchConfigAccess(self.user).can_attach(obj, sub_obj, relationship, data, skip_sub_obj_read_check=skip_sub_obj_read_check)
-        elif relationship in ('success_nodes', 'failure_nodes', 'always_nodes'):
-            return self.check_same_WFJT(obj, sub_obj)
-        else:
-            raise NotImplementedError('Relationship {} not understood for WFJT nodes.'.format(relationship))
+        if relationship in ('success_nodes', 'failure_nodes', 'always_nodes'):
+            return self.wfjt_admin(obj) and self.check_same_WFJT(obj, sub_obj)
+        return super().can_attach(obj, sub_obj, relationship, data, skip_sub_obj_read_check=skip_sub_obj_read_check)

-    def can_unattach(self, obj, sub_obj, relationship, data, skip_sub_obj_read_check=False):
-        if not self.wfjt_admin(obj):
-            return False
-        if relationship == 'credentials':
-            if not self.ujt_execute(obj):
-                return False
-            return JobLaunchConfigAccess(self.user).can_unattach(obj, sub_obj, relationship, data, skip_sub_obj_read_check=skip_sub_obj_read_check)
-        elif relationship in ('success_nodes', 'failure_nodes', 'always_nodes'):
-            return self.check_same_WFJT(obj, sub_obj)
-        else:
-            raise NotImplementedError('Relationship {} not understood for WFJT nodes.'.format(relationship))
+    def can_unattach(self, obj, sub_obj, relationship, data=None):
+        if relationship in ('success_nodes', 'failure_nodes', 'always_nodes'):
+            return self.wfjt_admin(obj)
+        return super().can_unattach(obj, sub_obj, relationship, data=None)


 class WorkflowJobNodeAccess(BaseAccess):
@@ -2052,13 +2050,10 @@ class WorkflowJobTemplateAccess(NotificationAttachMixin, BaseAccess):
        if not data:  # So the browseable API will work
            return Organization.accessible_objects(self.user, 'workflow_admin_role').exists()

-        if data.get('execution_environment'):
-            ee = get_object_from_data('execution_environment', ExecutionEnvironment, data)
-            if not self.user.can_access(ExecutionEnvironment, 'read', ee):
-                return False
-
-        return self.check_related('organization', Organization, data, role_field='workflow_admin_role', mandatory=True) and self.check_related(
-            'inventory', Inventory, data, role_field='use_role'
+        return bool(
+            self.check_related('organization', Organization, data, role_field='workflow_admin_role', mandatory=True)
+            and self.check_related('inventory', Inventory, data, role_field='use_role')
+            and self.check_related('execution_environment', ExecutionEnvironment, data, role_field='read_role')
        )

    def can_copy(self, obj):
@@ -2104,14 +2099,10 @@ class WorkflowJobTemplateAccess(NotificationAttachMixin, BaseAccess):
        if self.user.is_superuser:
            return True

-        if data and data.get('execution_environment'):
-            ee = get_object_from_data('execution_environment', ExecutionEnvironment, data)
-            if not self.user.can_access(ExecutionEnvironment, 'read', ee):
-                return False
-
        return (
            self.check_related('organization', Organization, data, role_field='workflow_admin_role', obj=obj)
            and self.check_related('inventory', Inventory, data, role_field='use_role', obj=obj)
+            and self.check_related('execution_environment', ExecutionEnvironment, data, obj=obj, role_field='read_role')
            and self.user in obj.admin_role
        )

@@ -2518,7 +2509,7 @@ class UnifiedJobAccess(BaseAccess):
        return super(UnifiedJobAccess, self).get_queryset().filter(workflowapproval__isnull=True)


-class ScheduleAccess(BaseAccess):
+class ScheduleAccess(UnifiedCredentialsMixin, BaseAccess):
    """
    I can see a schedule if I can see it's related unified job, I can create them or update them if I have write access
    """
@@ -2559,12 +2550,6 @@ class ScheduleAccess(BaseAccess):
    def can_delete(self, obj):
        return self.can_change(obj, {})

-    def can_attach(self, obj, sub_obj, relationship, data, skip_sub_obj_read_check=False):
-        return JobLaunchConfigAccess(self.user).can_attach(obj, sub_obj, relationship, data, skip_sub_obj_read_check=skip_sub_obj_read_check)
-
-    def can_unattach(self, obj, sub_obj, relationship, data, skip_sub_obj_read_check=False):
-        return JobLaunchConfigAccess(self.user).can_unattach(obj, sub_obj, relationship, data, skip_sub_obj_read_check=skip_sub_obj_read_check)
-

 class NotificationTemplateAccess(BaseAccess):
    """
--- a/awx/main/analytics/collectors.py
+++ b/awx/main/analytics/collectors.py
@@ -16,6 +16,7 @@ from awx.conf.license import get_license
 from awx.main.utils import get_awx_version, camelcase_to_underscore, datetime_hook
 from awx.main import models
 from awx.main.analytics import register
+from awx.main.scheduler.task_manager_models import TaskManagerInstances

 """
 This module is used to define metrics collected by awx.main.analytics.gather()
@@ -129,7 +130,7 @@ def config(since, **kwargs):
    }


-@register('counts', '1.1', description=_('Counts of objects such as organizations, inventories, and projects'))
+@register('counts', '1.2', description=_('Counts of objects such as organizations, inventories, and projects'))
 def counts(since, **kwargs):
    counts = {}
    for cls in (
@@ -172,6 +173,13 @@ def counts(since, **kwargs):
        .count()
    )
    counts['pending_jobs'] = models.UnifiedJob.objects.exclude(launch_type='sync').filter(status__in=('pending',)).count()
+    if connection.vendor == 'postgresql':
+        with connection.cursor() as cursor:
+            cursor.execute("select count(*) from pg_stat_activity where datname = %s", [connection.settings_dict['NAME']])
+            counts['database_connections'] = cursor.fetchone()[0]
+    else:
+        # We should be using postgresql, but if that ever changes we should change the below value
+        counts['database_connections'] = 1
    return counts


@@ -228,25 +236,25 @@ def projects_by_scm_type(since, **kwargs):
 @register('instance_info', '1.2', description=_('Cluster topology and capacity'))
 def instance_info(since, include_hostnames=False, **kwargs):
    info = {}
-    instances = models.Instance.objects.values_list('hostname').values(
-        'uuid', 'version', 'capacity', 'cpu', 'memory', 'managed_by_policy', 'hostname', 'enabled'
-    )
-    for instance in instances:
-        consumed_capacity = sum(x.task_impact for x in models.UnifiedJob.objects.filter(execution_node=instance['hostname'], status__in=('running', 'waiting')))
+    # Use same method that the TaskManager does to compute consumed capacity without querying all running jobs for each Instance
+    active_tasks = models.UnifiedJob.objects.filter(status__in=['running', 'waiting']).only('task_impact', 'controller_node', 'execution_node')
+    tm_instances = TaskManagerInstances(active_tasks, instance_fields=['uuid', 'version', 'capacity', 'cpu', 'memory', 'managed_by_policy', 'enabled'])
+    for tm_instance in tm_instances.instances_by_hostname.values():
+        instance = tm_instance.obj
        instance_info = {
-            'uuid': instance['uuid'],
-            'version': instance['version'],
-            'capacity': instance['capacity'],
-            'cpu': instance['cpu'],
-            'memory': instance['memory'],
-            'managed_by_policy': instance['managed_by_policy'],
-            'enabled': instance['enabled'],
-            'consumed_capacity': consumed_capacity,
-            'remaining_capacity': instance['capacity'] - consumed_capacity,
+            'uuid': instance.uuid,
+            'version': instance.version,
+            'capacity': instance.capacity,
+            'cpu': instance.cpu,
+            'memory': instance.memory,
+            'managed_by_policy': instance.managed_by_policy,
+            'enabled': instance.enabled,
+            'consumed_capacity': tm_instance.consumed_capacity,
+            'remaining_capacity': instance.capacity - tm_instance.consumed_capacity,
        }
        if include_hostnames is True:
-            instance_info['hostname'] = instance['hostname']
-        info[instance['uuid']] = instance_info
+            instance_info['hostname'] = instance.hostname
+        info[instance.uuid] = instance_info
    return info


@@ -389,7 +397,7 @@ def events_table_partitioned_modified(since, full_path, until, **kwargs):
    return _events_table(since, full_path, until, 'main_jobevent', 'modified', project_job_created=True, **kwargs)


-@register('unified_jobs_table', '1.3', format='csv', description=_('Data on jobs run'), expensive=four_hour_slicing)
+@register('unified_jobs_table', '1.4', format='csv', description=_('Data on jobs run'), expensive=four_hour_slicing)
 def unified_jobs_table(since, full_path, until, **kwargs):
    unified_job_query = '''COPY (SELECT main_unifiedjob.id,
                                 main_unifiedjob.polymorphic_ctype_id,
@@ -415,7 +423,8 @@ def unified_jobs_table(since, full_path, until, **kwargs):
                                 main_unifiedjob.job_explanation,
                                 main_unifiedjob.instance_group_id,
                                 main_unifiedjob.installed_collections,
-                                 main_unifiedjob.ansible_version
+                                 main_unifiedjob.ansible_version,
+                                 main_job.forks
                                 FROM main_unifiedjob
                                 JOIN django_content_type ON main_unifiedjob.polymorphic_ctype_id = django_content_type.id
                                 LEFT JOIN main_job ON main_unifiedjob.id = main_job.unifiedjob_ptr_id
--- a/awx/main/analytics/metrics.py
+++ b/awx/main/analytics/metrics.py
@@ -3,6 +3,7 @@ from prometheus_client import CollectorRegistry, Gauge, Info, generate_latest

 from awx.conf.license import get_license
 from awx.main.utils import get_awx_version
+from awx.main.models import UnifiedJob
 from awx.main.analytics.collectors import (
    counts,
    instance_info,
@@ -126,6 +127,8 @@ def metrics():
    LICENSE_INSTANCE_TOTAL = Gauge('awx_license_instance_total', 'Total number of managed hosts provided by your license', registry=REGISTRY)
    LICENSE_INSTANCE_FREE = Gauge('awx_license_instance_free', 'Number of remaining managed hosts provided by your license', registry=REGISTRY)

+    DATABASE_CONNECTIONS = Gauge('awx_database_connections_total', 'Number of connections to database', registry=REGISTRY)
+
    license_info = get_license()
    SYSTEM_INFO.info(
        {
@@ -163,10 +166,13 @@ def metrics():
    USER_SESSIONS.labels(type='user').set(current_counts['active_user_sessions'])
    USER_SESSIONS.labels(type='anonymous').set(current_counts['active_anonymous_sessions'])

+    DATABASE_CONNECTIONS.set(current_counts['database_connections'])
+
    all_job_data = job_counts(None)
    statuses = all_job_data.get('status', {})
-    for status, value in statuses.items():
-        STATUS.labels(status=status).set(value)
+    states = set(dict(UnifiedJob.STATUS_CHOICES).keys()) - set(['new'])
+    for state in states:
+        STATUS.labels(status=state).set(statuses.get(state, 0))

    RUNNING_JOBS.set(current_counts['running_jobs'])
    PENDING_JOBS.set(current_counts['pending_jobs'])
--- a/awx/main/analytics/subsystem_metrics.py
+++ b/awx/main/analytics/subsystem_metrics.py
@@ -166,7 +166,11 @@ class Metrics:
        elif settings.IS_TESTING():
            self.instance_name = "awx_testing"
        else:
-            self.instance_name = Instance.objects.me().hostname
+            try:
+                self.instance_name = Instance.objects.me().hostname
+            except Exception as e:
+                self.instance_name = settings.CLUSTER_HOST_ID
+                logger.info(f'Instance {self.instance_name} seems to be unregistered, error: {e}')

        # metric name, help_text
        METRICSLIST = [
@@ -184,19 +188,29 @@ class Metrics:
            FloatM('subsystem_metrics_pipe_execute_seconds', 'Time spent saving metrics to redis'),
            IntM('subsystem_metrics_pipe_execute_calls', 'Number of calls to pipe_execute'),
            FloatM('subsystem_metrics_send_metrics_seconds', 'Time spent sending metrics to other nodes'),
-            SetFloatM('task_manager_get_tasks_seconds', 'Time spent in loading all tasks from db'),
+            SetFloatM('task_manager_get_tasks_seconds', 'Time spent in loading tasks from db'),
            SetFloatM('task_manager_start_task_seconds', 'Time spent starting task'),
            SetFloatM('task_manager_process_running_tasks_seconds', 'Time spent processing running tasks'),
            SetFloatM('task_manager_process_pending_tasks_seconds', 'Time spent processing pending tasks'),
-            SetFloatM('task_manager_generate_dependencies_seconds', 'Time spent generating dependencies for pending tasks'),
-            SetFloatM('task_manager_spawn_workflow_graph_jobs_seconds', 'Time spent spawning workflow jobs'),
            SetFloatM('task_manager__schedule_seconds', 'Time spent in running the entire _schedule'),
-            IntM('task_manager_schedule_calls', 'Number of calls to task manager schedule'),
+            IntM('task_manager__schedule_calls', 'Number of calls to _schedule, after lock is acquired'),
            SetFloatM('task_manager_recorded_timestamp', 'Unix timestamp when metrics were last recorded'),
            SetIntM('task_manager_tasks_started', 'Number of tasks started'),
            SetIntM('task_manager_running_processed', 'Number of running tasks processed'),
            SetIntM('task_manager_pending_processed', 'Number of pending tasks processed'),
            SetIntM('task_manager_tasks_blocked', 'Number of tasks blocked from running'),
+            SetFloatM('task_manager_commit_seconds', 'Time spent in db transaction, including on_commit calls'),
+            SetFloatM('dependency_manager_get_tasks_seconds', 'Time spent loading pending tasks from db'),
+            SetFloatM('dependency_manager_generate_dependencies_seconds', 'Time spent generating dependencies for pending tasks'),
+            SetFloatM('dependency_manager__schedule_seconds', 'Time spent in running the entire _schedule'),
+            IntM('dependency_manager__schedule_calls', 'Number of calls to _schedule, after lock is acquired'),
+            SetFloatM('dependency_manager_recorded_timestamp', 'Unix timestamp when metrics were last recorded'),
+            SetIntM('dependency_manager_pending_processed', 'Number of pending tasks processed'),
+            SetFloatM('workflow_manager__schedule_seconds', 'Time spent in running the entire _schedule'),
+            IntM('workflow_manager__schedule_calls', 'Number of calls to _schedule, after lock is acquired'),
+            SetFloatM('workflow_manager_recorded_timestamp', 'Unix timestamp when metrics were last recorded'),
+            SetFloatM('workflow_manager_spawn_workflow_graph_jobs_seconds', 'Time spent spawning workflow tasks'),
+            SetFloatM('workflow_manager_get_tasks_seconds', 'Time spent loading workflow tasks from db'),
        ]
        # turn metric list into dictionary with the metric name as a key
        self.METRICS = {}
@@ -213,6 +227,8 @@ class Metrics:
            m.reset_value(self.conn)
        self.metrics_have_changed = True
        self.conn.delete(root_key + "_lock")
+        for m in self.conn.scan_iter(root_key + '_instance_*'):
+            self.conn.delete(m)

    def inc(self, field, value):
        if value != 0:
@@ -301,7 +317,12 @@ class Metrics:
                self.previous_send_metrics.set(current_time)
                self.previous_send_metrics.store_value(self.conn)
        finally:
-            lock.release()
+            try:
+                lock.release()
+            except Exception as exc:
+                # After system failures, we might throw redis.exceptions.LockNotOwnedError
+                # this avoids printing a traceback and, importantly, avoids raising an exception into the parent context
+                logger.warning(f'Error releasing subsystem metrics redis lock, error: {str(exc)}')

    def load_other_metrics(self, request):
        # data received from other nodes are stored in their own keys
--- a/awx/main/conf.py
+++ b/awx/main/conf.py
@@ -446,7 +446,7 @@ register(
    label=_('Default Job Idle Timeout'),
    help_text=_(
        'If no output is detected from ansible in this number of seconds the execution will be terminated. '
-        'Use value of 0 to used default idle_timeout is 600s.'
+        'Use value of 0 to indicate that no idle timeout should be imposed.'
    ),
    category=_('Jobs'),
    category_slug='jobs',
--- a/awx/main/dispatch/init.py
+++ b/awx/main/dispatch/init.py
@@ -4,6 +4,7 @@ import select
 from contextlib import contextmanager

 from django.conf import settings
+from django.db import connection as pg_connection


 NOT_READY = ([], [], [])
@@ -15,7 +16,6 @@ def get_local_queuename():

 class PubSub(object):
    def __init__(self, conn):
-        assert conn.autocommit, "Connection must be in autocommit mode."
        self.conn = conn

    def listen(self, channel):
@@ -31,6 +31,9 @@ class PubSub(object):
            cur.execute('SELECT pg_notify(%s, %s);', (channel, payload))

    def events(self, select_timeout=5, yield_timeouts=False):
+        if not self.conn.autocommit:
+            raise RuntimeError('Listening for events can only be done in autocommit mode')
+
        while True:
            if select.select([self.conn], [], [], select_timeout) == NOT_READY:
                if yield_timeouts:
@@ -45,11 +48,32 @@ class PubSub(object):


 @contextmanager
-def pg_bus_conn():
-    conf = settings.DATABASES['default']
-    conn = psycopg2.connect(dbname=conf['NAME'], host=conf['HOST'], user=conf['USER'], password=conf['PASSWORD'], port=conf['PORT'], **conf.get("OPTIONS", {}))
-    # Django connection.cursor().connection doesn't have autocommit=True on
-    conn.set_session(autocommit=True)
+def pg_bus_conn(new_connection=False):
+    '''
+    Any listeners probably want to establish a new database connection,
+    separate from the Django connection used for queries, because that will prevent
+    losing connection to the channel whenever a .close() happens.
+
+    Any publishers probably want to use the existing connection
+    so that messages follow postgres transaction rules
+    https://www.postgresql.org/docs/current/sql-notify.html
+    '''
+
+    if new_connection:
+        conf = settings.DATABASES['default']
+        conn = psycopg2.connect(
+            dbname=conf['NAME'], host=conf['HOST'], user=conf['USER'], password=conf['PASSWORD'], port=conf['PORT'], **conf.get("OPTIONS", {})
+        )
+        # Django connection.cursor().connection doesn't have autocommit=True on by default
+        conn.set_session(autocommit=True)
+    else:
+        if pg_connection.connection is None:
+            pg_connection.connect()
+        if pg_connection.connection is None:
+            raise RuntimeError('Unexpectedly could not connect to postgres for pg_notify actions')
+        conn = pg_connection.connection
+
    pubsub = PubSub(conn)
    yield pubsub
-    conn.close()
+    if new_connection:
+        conn.close()
--- a/awx/main/dispatch/control.py
+++ b/awx/main/dispatch/control.py
@@ -37,18 +37,24 @@ class Control(object):
    def running(self, *args, **kwargs):
        return self.control_with_reply('running', *args, **kwargs)

+    def cancel(self, task_ids, *args, **kwargs):
+        return self.control_with_reply('cancel', *args, extra_data={'task_ids': task_ids}, **kwargs)
+
    @classmethod
    def generate_reply_queue_name(cls):
        return f"reply_to_{str(uuid.uuid4()).replace('-','_')}"

-    def control_with_reply(self, command, timeout=5):
+    def control_with_reply(self, command, timeout=5, extra_data=None):
        logger.warning('checking {} {} for {}'.format(self.service, command, self.queuename))
        reply_queue = Control.generate_reply_queue_name()
        self.result = None

-        with pg_bus_conn() as conn:
+        with pg_bus_conn(new_connection=True) as conn:
            conn.listen(reply_queue)
-            conn.notify(self.queuename, json.dumps({'control': command, 'reply_to': reply_queue}))
+            send_data = {'control': command, 'reply_to': reply_queue}
+            if extra_data:
+                send_data.update(extra_data)
+            conn.notify(self.queuename, json.dumps(send_data))

            for reply in conn.events(select_timeout=timeout, yield_timeouts=True):
                if reply is None:
--- a/awx/main/dispatch/pool.py
+++ b/awx/main/dispatch/pool.py
@@ -16,13 +16,14 @@ from queue import Full as QueueFull, Empty as QueueEmpty
 from django.conf import settings
 from django.db import connection as django_connection, connections
 from django.core.cache import cache as django_cache
+from django.utils.timezone import now as tz_now
 from django_guid import set_guid
 from jinja2 import Template
 import psutil

 from awx.main.models import UnifiedJob
 from awx.main.dispatch import reaper
-from awx.main.utils.common import convert_mem_str_to_bytes, get_mem_effective_capacity
+from awx.main.utils.common import convert_mem_str_to_bytes, get_mem_effective_capacity, log_excess_runtime

 if 'run_callback_receiver' in sys.argv:
    logger = logging.getLogger('awx.main.commands.run_callback_receiver')
@@ -328,12 +329,16 @@ class AutoscalePool(WorkerPool):
            # Get same number as max forks based on memory, this function takes memory as bytes
            self.max_workers = get_mem_effective_capacity(total_memory_gb * 2**30)

+            # add magic prime number of extra workers to ensure
+            # we have a few extra workers to run the heartbeat
+            self.max_workers += 7
+
        # max workers can't be less than min_workers
        self.max_workers = max(self.min_workers, self.max_workers)

-    def debug(self, *args, **kwargs):
-        self.cleanup()
-        return super(AutoscalePool, self).debug(*args, **kwargs)
+        # the task manager enforces settings.TASK_MANAGER_TIMEOUT on its own
+        # but if the task takes longer than the time defined here, we will force it to stop here
+        self.task_manager_timeout = settings.TASK_MANAGER_TIMEOUT + settings.TASK_MANAGER_TIMEOUT_GRACE_PERIOD

    @property
    def should_grow(self):
@@ -351,6 +356,7 @@ class AutoscalePool(WorkerPool):
    def debug_meta(self):
        return 'min={} max={}'.format(self.min_workers, self.max_workers)

+    @log_excess_runtime(logger)
    def cleanup(self):
        """
        Perform some internal account and cleanup.  This is run on
@@ -359,8 +365,6 @@ class AutoscalePool(WorkerPool):
        1.  Discover worker processes that exited, and recover messages they
            were handling.
        2.  Clean up unnecessary, idle workers.
-        3.  Check to see if the database says this node is running any tasks
-            that aren't actually running.  If so, reap them.

        IMPORTANT: this function is one of the few places in the dispatcher
        (aside from setting lookups) where we talk to the database.  As such,
@@ -401,13 +405,15 @@ class AutoscalePool(WorkerPool):
                # the task manager to never do more work
                current_task = w.current_task
                if current_task and isinstance(current_task, dict):
-                    if current_task.get('task', '').endswith('tasks.run_task_manager'):
+                    endings = ['tasks.task_manager', 'tasks.dependency_manager', 'tasks.workflow_manager']
+                    current_task_name = current_task.get('task', '')
+                    if any(current_task_name.endswith(e) for e in endings):
                        if 'started' not in current_task:
                            w.managed_tasks[current_task['uuid']]['started'] = time.time()
                        age = time.time() - current_task['started']
                        w.managed_tasks[current_task['uuid']]['age'] = age
-                        if age > (60 * 5):
-                            logger.error(f'run_task_manager has held the advisory lock for >5m, sending SIGTERM to {w.pid}')  # noqa
+                        if age > self.task_manager_timeout:
+                            logger.error(f'{current_task_name} has held the advisory lock for {age}, sending SIGTERM to {w.pid}')
                            os.kill(w.pid, signal.SIGTERM)

        for m in orphaned:
@@ -417,13 +423,17 @@ class AutoscalePool(WorkerPool):
            idx = random.choice(range(len(self.workers)))
            self.write(idx, m)

-        # if the database says a job is running on this node, but it's *not*,
-        # then reap it
-        running_uuids = []
-        for worker in self.workers:
-            worker.calculate_managed_tasks()
-            running_uuids.extend(list(worker.managed_tasks.keys()))
-        reaper.reap(excluded_uuids=running_uuids)
+    def add_bind_kwargs(self, body):
+        bind_kwargs = body.pop('bind_kwargs', [])
+        body.setdefault('kwargs', {})
+        if 'dispatch_time' in bind_kwargs:
+            body['kwargs']['dispatch_time'] = tz_now().isoformat()
+        if 'worker_tasks' in bind_kwargs:
+            worker_tasks = {}
+            for worker in self.workers:
+                worker.calculate_managed_tasks()
+                worker_tasks[worker.pid] = list(worker.managed_tasks.keys())
+            body['kwargs']['worker_tasks'] = worker_tasks

    def up(self):
        if self.full:
@@ -438,6 +448,8 @@ class AutoscalePool(WorkerPool):
        if 'guid' in body:
            set_guid(body['guid'])
        try:
+            if isinstance(body, dict) and body.get('bind_kwargs'):
+                self.add_bind_kwargs(body)
            # when the cluster heartbeat occurs, clean up internally
            if isinstance(body, dict) and 'cluster_node_heartbeat' in body['task']:
                self.cleanup()
@@ -452,6 +464,10 @@ class AutoscalePool(WorkerPool):
                    w.put(body)
                    break
            else:
+                task_name = 'unknown'
+                if isinstance(body, dict):
+                    task_name = body.get('task')
+                logger.warning(f'Workers maxed, queuing {task_name}, load: {sum(len(w.managed_tasks) for w in self.workers)} / {len(self.workers)}')
                return super(AutoscalePool, self).write(preferred_queue, body)
        except Exception:
            for conn in connections.all():
--- a/awx/main/dispatch/publish.py
+++ b/awx/main/dispatch/publish.py
@@ -2,6 +2,7 @@ import inspect
 import logging
 import sys
 import json
+import time
 from uuid import uuid4

 from django.conf import settings
@@ -49,13 +50,21 @@ class task:
    @task(queue='tower_broadcast')
    def announce():
        print("Run this everywhere!")
+
+    # The special parameter bind_kwargs tells the main dispatcher process to add certain kwargs
+
+    @task(bind_kwargs=['dispatch_time'])
+    def print_time(dispatch_time=None):
+        print(f"Time I was dispatched: {dispatch_time}")
    """

-    def __init__(self, queue=None):
+    def __init__(self, queue=None, bind_kwargs=None):
        self.queue = queue
+        self.bind_kwargs = bind_kwargs

    def __call__(self, fn=None):
        queue = self.queue
+        bind_kwargs = self.bind_kwargs

        class PublisherMixin(object):

@@ -75,10 +84,12 @@ class task:
                    msg = f'{cls.name}: Queue value required and may not be None'
                    logger.error(msg)
                    raise ValueError(msg)
-                obj = {'uuid': task_id, 'args': args, 'kwargs': kwargs, 'task': cls.name}
+                obj = {'uuid': task_id, 'args': args, 'kwargs': kwargs, 'task': cls.name, 'time_pub': time.time()}
                guid = get_guid()
                if guid:
                    obj['guid'] = guid
+                if bind_kwargs:
+                    obj['bind_kwargs'] = bind_kwargs
                obj.update(**kw)
                if callable(queue):
                    queue = queue()
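
The `bind_kwargs` handshake between `publish.py` and `pool.py` can be sketched in isolation. This is a minimal illustration only: the standalone function, its `worker_tasks` parameter, and the sample message are assumptions made for the example, and `datetime.now(timezone.utc)` stands in for Django's `tz_now()`.

```python
from datetime import datetime, timezone


def add_bind_kwargs(body, worker_tasks=None):
    """Hypothetical standalone version of the dispatcher-side step.

    The publisher lists the names it wants in body['bind_kwargs'];
    the dispatcher removes that key and fills in the matching kwargs.
    """
    bind_kwargs = body.pop('bind_kwargs', [])
    body.setdefault('kwargs', {})
    if 'dispatch_time' in bind_kwargs:
        # Stand-in for django.utils.timezone.now().isoformat()
        body['kwargs']['dispatch_time'] = datetime.now(timezone.utc).isoformat()
    if 'worker_tasks' in bind_kwargs:
        # Assumed shape: {worker_pid: [task_uuid, ...]}
        body['kwargs']['worker_tasks'] = dict(worker_tasks or {})
    return body


# Sample message as the publisher might build it (values are made up)
message = {'uuid': 'abc', 'task': 'demo.print_time', 'args': [], 'bind_kwargs': ['dispatch_time']}
result = add_bind_kwargs(message)
```

The task function then receives `dispatch_time` as an ordinary keyword argument, so it does not need to know how the dispatcher produced it.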
--- a/awx/main/dispatch/reaper.py
+++ b/awx/main/dispatch/reaper.py
@@ -2,6 +2,7 @@ from datetime import timedelta
 import logging

 from django.db.models import Q
+from django.conf import settings
 from django.utils.timezone import now as tz_now
 from django.contrib.contenttypes.models import ContentType

@@ -10,28 +11,76 @@ from awx.main.models import Instance, UnifiedJob, WorkflowJob
 logger = logging.getLogger('awx.main.dispatch')


-def reap_job(j, status):
-    if UnifiedJob.objects.get(id=j.id).status not in ('running', 'waiting'):
+def startup_reaping():
+    """
+    If this particular instance is starting, then we know that any running jobs are invalid
+    so we will reap those jobs as a special action here
+    """
+    try:
+        me = Instance.objects.me()
+    except RuntimeError as e:
+        logger.warning(f'Local instance is not registered, not running startup reaper: {e}')
+        return
+    jobs = UnifiedJob.objects.filter(status='running', controller_node=me.hostname)
+    job_ids = []
+    for j in jobs:
+        job_ids.append(j.id)
+        reap_job(
+            j,
+            'failed',
+            job_explanation='Task was marked as running at system start up. The system must not have shut down properly, so it has been marked as failed.',
+        )
+    if job_ids:
+        logger.error(f'Unified jobs {job_ids} were reaped on dispatch startup')
+
+
+def reap_job(j, status, job_explanation=None):
+    j.refresh_from_db(fields=['status', 'job_explanation'])
+    status_before = j.status
+    if status_before not in ('running', 'waiting'):
        # just in case, don't reap jobs that aren't running
        return
    j.status = status
    j.start_args = ''  # blank field to remove encrypted passwords
-    j.job_explanation += ' '.join(
-        (
-            'Task was marked as running but was not present in',
-            'the job queue, so it has been marked as failed.',
-        )
-    )
+    if j.job_explanation:
+        j.job_explanation += ' '  # Separate messages for readability
+    if job_explanation is None:
+        j.job_explanation += 'Task was marked as running but was not present in the job queue, so it has been marked as failed.'
+    else:
+        j.job_explanation += job_explanation
    j.save(update_fields=['status', 'start_args', 'job_explanation'])
    if hasattr(j, 'send_notification_templates'):
        j.send_notification_templates('failed')
    j.websocket_emit_status(status)
-    logger.error('{} is no longer running; reaping'.format(j.log_format))
+    logger.error(f'{j.log_format} is no longer {status_before}; reaping')


-def reap(instance=None, status='failed', excluded_uuids=[]):
+def reap_waiting(instance=None, status='failed', job_explanation=None, grace_period=None, excluded_uuids=None, ref_time=None):
    """
-    Reap all jobs in waiting|running for this instance.
+    Reap all jobs in waiting for this instance.
+    """
+    if grace_period is None:
+        grace_period = settings.JOB_WAITING_GRACE_PERIOD + settings.TASK_MANAGER_TIMEOUT
+
+    me = instance
+    if me is None:
+        try:
+            me = Instance.objects.me()
+        except RuntimeError as e:
+            logger.warning(f'Local instance is not registered, not running reaper: {e}')
+            return
+    if ref_time is None:
+        ref_time = tz_now()
+    jobs = UnifiedJob.objects.filter(status='waiting', modified__lte=ref_time - timedelta(seconds=grace_period), controller_node=me.hostname)
+    if excluded_uuids:
+        jobs = jobs.exclude(celery_task_id__in=excluded_uuids)
+    for j in jobs:
+        reap_job(j, status, job_explanation=job_explanation)
+
+
+def reap(instance=None, status='failed', job_explanation=None, excluded_uuids=None):
+    """
+    Reap all jobs in running for this instance.
    """
    me = instance
    if me is None:
@@ -40,12 +89,11 @@ def reap(instance=None, status='failed', excluded_uuids=[]):
        except RuntimeError as e:
            logger.warning(f'Local instance is not registered, not running reaper: {e}')
            return
-    now = tz_now()
    workflow_ctype_id = ContentType.objects.get_for_model(WorkflowJob).id
    jobs = UnifiedJob.objects.filter(
-        (Q(status='running') | Q(status='waiting', modified__lte=now - timedelta(seconds=60)))
-        & (Q(execution_node=me.hostname) | Q(controller_node=me.hostname))
-        & ~Q(polymorphic_ctype_id=workflow_ctype_id)
-    ).exclude(celery_task_id__in=excluded_uuids)
+        Q(status='running') & (Q(execution_node=me.hostname) | Q(controller_node=me.hostname)) & ~Q(polymorphic_ctype_id=workflow_ctype_id)
+    )
+    if excluded_uuids:
+        jobs = jobs.exclude(celery_task_id__in=excluded_uuids)
    for j in jobs:
-        reap_job(j, status)
+        reap_job(j, status, job_explanation=job_explanation)
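
The cutoff used by `reap_waiting` can be sketched without the ORM. The helper below is hypothetical and works on plain dictionaries rather than `UnifiedJob` rows; it only illustrates the selection rule (status `waiting`, last modified at or before `ref_time - grace_period`, not in the excluded set).

```python
from datetime import datetime, timedelta, timezone


def select_stale_waiting(jobs, ref_time, grace_period, excluded_uuids=None):
    """Return jobs that have sat in 'waiting' longer than grace_period seconds."""
    cutoff = ref_time - timedelta(seconds=grace_period)
    excluded = set(excluded_uuids or [])
    return [
        j
        for j in jobs
        if j['status'] == 'waiting' and j['modified'] <= cutoff and j['celery_task_id'] not in excluded
    ]


# Made-up sample data for illustration
ref = datetime(2022, 10, 3, 12, 0, tzinfo=timezone.utc)
jobs = [
    {'id': 'a', 'status': 'waiting', 'modified': ref - timedelta(minutes=10), 'celery_task_id': 'u-a'},
    {'id': 'b', 'status': 'waiting', 'modified': ref - timedelta(minutes=1), 'celery_task_id': 'u-b'},
    {'id': 'c', 'status': 'running', 'modified': ref - timedelta(hours=1), 'celery_task_id': 'u-c'},
    {'id': 'd', 'status': 'waiting', 'modified': ref - timedelta(hours=1), 'celery_task_id': 'u-d'},
]
stale_ids = [j['id'] for j in select_stale_waiting(jobs, ref, grace_period=120, excluded_uuids=['u-d'])]
```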
--- a/awx/main/dispatch/worker/base.py
+++ b/awx/main/dispatch/worker/base.py
@@ -17,6 +17,7 @@ from django.conf import settings

 from awx.main.dispatch.pool import WorkerPool
 from awx.main.dispatch import pg_bus_conn
+from awx.main.utils.common import log_excess_runtime

 if 'run_callback_receiver' in sys.argv:
    logger = logging.getLogger('awx.main.commands.run_callback_receiver')
@@ -62,7 +63,7 @@ class AWXConsumerBase(object):
    def control(self, body):
        logger.warning(f'Received control signal:\n{body}')
        control = body.get('control')
-        if control in ('status', 'running'):
+        if control in ('status', 'running', 'cancel'):
            reply_queue = body['reply_to']
            if control == 'status':
                msg = '\n'.join([self.listening_on, self.pool.debug()])
@@ -71,6 +72,17 @@ class AWXConsumerBase(object):
                for worker in self.pool.workers:
                    worker.calculate_managed_tasks()
                    msg.extend(worker.managed_tasks.keys())
+            elif control == 'cancel':
+                msg = []
+                task_ids = set(body['task_ids'])
+                for worker in self.pool.workers:
+                    task = worker.current_task
+                    if task and task['uuid'] in task_ids:
+                        logger.warning(f'Sending SIGTERM to task id={task["uuid"]}, task={task.get("task")}, args={task.get("args")}')
+                        os.kill(worker.pid, signal.SIGTERM)
+                        msg.append(task['uuid'])
+                if task_ids and not msg:
+                    logger.info(f'Could not locate running tasks to cancel with ids={task_ids}')

            with pg_bus_conn() as conn:
                conn.notify(reply_queue, json.dumps(msg))
@@ -81,6 +93,9 @@ class AWXConsumerBase(object):
            logger.error('unrecognized control message: {}'.format(control))

    def process_task(self, body):
+        if isinstance(body, dict):
+            body['time_ack'] = time.time()
+
        if 'control' in body:
            try:
                return self.control(body)
@@ -101,6 +116,7 @@ class AWXConsumerBase(object):
        self.total_messages += 1
        self.record_statistics()

+    @log_excess_runtime(logger)
    def record_statistics(self):
        if time.time() - self.last_stats > 1:  # buffer stat recording to once per second
            try:
@@ -149,7 +165,7 @@ class AWXConsumerPG(AWXConsumerBase):

        while True:
            try:
-                with pg_bus_conn() as conn:
+                with pg_bus_conn(new_connection=True) as conn:
                    for queue in self.queues:
                        conn.listen(queue)
                    if init is False:
@@ -169,8 +185,9 @@ class AWXConsumerPG(AWXConsumerBase):
                    logger.exception(f"Error consuming new events from postgres, will retry for {self.pg_max_wait} s")
                    self.pg_down_time = time.time()
                    self.pg_is_down = True
-                if time.time() - self.pg_down_time > self.pg_max_wait:
-                    logger.warning(f"Postgres event consumer has not recovered in {self.pg_max_wait} s, exiting")
+                current_downtime = time.time() - self.pg_down_time
+                if current_downtime > self.pg_max_wait:
+                    logger.exception(f"Postgres event consumer has not recovered in {current_downtime} s, exiting")
                    raise
                # Wait for a second before next attempt, but still listen for any shutdown signals
                for i in range(10):
@@ -179,6 +196,10 @@ class AWXConsumerPG(AWXConsumerBase):
                    time.sleep(0.1)
                for conn in db.connections.all():
                    conn.close_if_unusable_or_obsolete()
+            except Exception:
+                # Log unanticipated exception in addition to writing to stderr to get timestamps and other metadata
+                logger.exception('Encountered unhandled error in dispatcher main loop')
+                raise


 class BaseWorker(object):
--- a/awx/main/dispatch/worker/callback.py
+++ b/awx/main/dispatch/worker/callback.py
@@ -167,17 +167,27 @@ class CallbackBrokerWorker(BaseWorker):
                try:
                    cls.objects.bulk_create(events)
                    metrics_bulk_events_saved += len(events)
-                except Exception:
+                except Exception as exc:
+                    logger.warning(f'Error in events bulk_create, will try individually up to 5 errors, error {str(exc)}')
                    # if an exception occurs, we should re-attempt to save the
                    # events one-by-one, because something in the list is
                    # broken/stale
+                    consecutive_errors = 0
+                    events_saved = 0
                    metrics_events_batch_save_errors += 1
                    for e in events:
                        try:
                            e.save()
-                            metrics_singular_events_saved += 1
-                        except Exception:
-                            logger.exception('Database Error Saving Job Event')
+                            events_saved += 1
+                            consecutive_errors = 0
+                        except Exception as exc_indv:
+                            consecutive_errors += 1
+                            logger.info(f'Database Error Saving individual Job Event, error {str(exc_indv)}')
+                        if consecutive_errors >= 5:
+                            raise
+                    metrics_singular_events_saved += events_saved
+                    if events_saved == 0:
+                        raise
                metrics_duration_to_save = time.perf_counter() - metrics_duration_to_save
                for e in events:
                    if not getattr(e, '_skip_websocket_message', False):
@@ -257,17 +267,18 @@ class CallbackBrokerWorker(BaseWorker):
                try:
                    self.flush(force=flush)
                    break
-                except (OperationalError, InterfaceError, InternalError):
+                except (OperationalError, InterfaceError, InternalError) as exc:
                    if retries >= self.MAX_RETRIES:
                        logger.exception('Worker could not re-establish database connectivity, giving up on one or more events.')
                        return
                    delay = 60 * retries
-                    logger.exception('Database Error Saving Job Event, retry #{i} in {delay} seconds:'.format(i=retries + 1, delay=delay))
+                    logger.warning(f'Database Error Flushing Job Events, retry #{retries + 1} in {delay} seconds: {str(exc)}')
                    django_connection.close()
                    time.sleep(delay)
                    retries += 1
                except DatabaseError:
-                    logger.exception('Database Error Saving Job Event')
+                    logger.exception('Database Error Flushing Job Events')
+                    django_connection.close()
                    break
        except Exception as exc:
            tb = traceback.format_exc()
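
The one-by-one fallback after a failed `bulk_create` can be sketched as a standalone function. This is a simplified illustration under stated assumptions: `save_individually` and its `save` callback are hypothetical names, and it raises a new `RuntimeError` where the change above re-raises the original bulk exception.

```python
def save_individually(events, save, max_consecutive_errors=5):
    """Save events one at a time, giving up after too many failures in a row."""
    saved = 0
    consecutive_errors = 0
    last_error = None
    for event in events:
        try:
            save(event)
        except Exception as exc:
            consecutive_errors += 1
            last_error = exc
        else:
            saved += 1
            consecutive_errors = 0  # any success resets the streak
        if consecutive_errors >= max_consecutive_errors:
            raise RuntimeError('too many consecutive save errors') from last_error
    if events and saved == 0:
        raise RuntimeError('no events could be saved') from last_error
    return saved


def flaky_save(event):
    # Made-up failure rule for the example: odd numbers cannot be saved
    if event % 2:
        raise ValueError(f'cannot save {event}')


saved_count = save_individually([0, 1, 2, 3, 4, 5], flaky_save)
```

Resetting the counter on success means isolated bad events are skipped, while a long run of failures (likely a database problem rather than bad data) aborts early.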
--- a/awx/main/dispatch/worker/task.py
+++ b/awx/main/dispatch/worker/task.py
@@ -3,6 +3,7 @@ import logging
 import importlib
 import sys
 import traceback
+import time

 from kubernetes.config import kube_config

@@ -60,8 +61,19 @@ class TaskWorker(BaseWorker):
            # the callable is a class, e.g., RunJob; instantiate and
            # return its `run()` method
            _call = _call().run
+
+        log_extra = ''
+        logger_method = logger.debug
+        if ('time_ack' in body) and ('time_pub' in body):
+            time_publish = body['time_ack'] - body['time_pub']
+            time_waiting = time.time() - body['time_ack']
+            if time_waiting > 5.0 or time_publish > 5.0:
+                # If the task took a very long time to process, add this information to the log
+                log_extra = f' took {time_publish:.4f} to ack, {time_waiting:.4f} in local dispatcher'
+                logger_method = logger.info
        # don't print kwargs, they often contain launch-time secrets
-        logger.debug('task {} starting {}(*{})'.format(uuid, task, args))
+        logger_method(f'task {uuid} starting {task}(*{args}){log_extra}')
+
        return _call(*args, **kwargs)

    def perform_work(self, body):
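
The latency bookkeeping built from `time_pub` and `time_ack` can be sketched on its own. The helper name, the explicit `now` argument, and the `threshold` parameter are assumptions for the example; the change above uses `time.time()` and a fixed 5.0 second limit.

```python
def describe_latency(body, now, threshold=5.0):
    """Return extra log text when publishing or local queuing was slow."""
    if 'time_ack' not in body or 'time_pub' not in body:
        return ''
    time_publish = body['time_ack'] - body['time_pub']  # publisher -> dispatcher
    time_waiting = now - body['time_ack']  # dispatcher -> worker start
    if time_waiting > threshold or time_publish > threshold:
        return f' took {time_publish:.4f} to ack, {time_waiting:.4f} in local dispatcher'
    return ''


fast = describe_latency({'time_pub': 100.0, 'time_ack': 100.5}, now=101.0)
slow = describe_latency({'time_pub': 100.0, 'time_ack': 107.0}, now=107.25)
```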
--- a/awx/main/management/commands/inventory_import.py
+++ b/awx/main/management/commands/inventory_import.py
@@ -862,7 +862,7 @@ class Command(BaseCommand):
                    overwrite_vars=bool(options.get('overwrite_vars', False)),
                )
                inventory_update = inventory_source.create_inventory_update(
-                    _eager_fields=dict(job_args=json.dumps(sys.argv), job_env=dict(os.environ.items()), job_cwd=os.getcwd())
+                    _eager_fields=dict(status='running', job_args=json.dumps(sys.argv), job_env=dict(os.environ.items()), job_cwd=os.getcwd())
                )

            data = AnsibleInventoryLoader(source=source, verbosity=verbosity).load()
--- a/awx/main/management/commands/list_instances.py
+++ b/awx/main/management/commands/list_instances.py
@@ -54,7 +54,7 @@ class Command(BaseCommand):

                capacity = f' capacity={x.capacity}' if x.node_type != 'hop' else ''
                version = f" version={x.version or '?'}" if x.node_type != 'hop' else ''
-                heartbeat = f' heartbeat="{x.modified:%Y-%m-%d %H:%M:%S}"' if x.capacity or x.node_type == 'hop' else ''
+                heartbeat = f' heartbeat="{x.last_seen:%Y-%m-%d %H:%M:%S}"' if x.capacity or x.node_type == 'hop' else ''
                print(f'\t{color}{x.hostname}{capacity} node_type={x.node_type}{version}{heartbeat}\033[0m')

            print()
--- a/awx/main/management/commands/register_peers.py
+++ b/awx/main/management/commands/register_peers.py
@@ -27,7 +27,9 @@ class Command(BaseCommand):
        )

    def handle(self, **options):
+        # provides a mapping of hostname to Instance objects
        nodes = Instance.objects.in_bulk(field_name='hostname')
+
        if options['source'] not in nodes:
            raise CommandError(f"Host {options['source']} is not a registered instance.")
        if not (options['peers'] or options['disconnect'] or options['exact'] is not None):
@@ -57,7 +59,9 @@ class Command(BaseCommand):

            results = 0
            for target in options['peers']:
-                _, created = InstanceLink.objects.get_or_create(source=nodes[options['source']], target=nodes[target])
+                _, created = InstanceLink.objects.update_or_create(
+                    source=nodes[options['source']], target=nodes[target], defaults={'link_state': InstanceLink.States.ESTABLISHED}
+                )
                if created:
                    results += 1

@@ -80,7 +84,9 @@ class Command(BaseCommand):
                links = set(InstanceLink.objects.filter(source=nodes[options['source']]).values_list('target__hostname', flat=True))
                removals, _ = InstanceLink.objects.filter(source=nodes[options['source']], target__hostname__in=links - peers).delete()
                for target in peers - links:
-                    _, created = InstanceLink.objects.get_or_create(source=nodes[options['source']], target=nodes[target])
+                    _, created = InstanceLink.objects.update_or_create(
+                        source=nodes[options['source']], target=nodes[target], defaults={'link_state': InstanceLink.States.ESTABLISHED}
+                    )
                    if created:
                        additions += 1

--- a/awx/main/management/commands/run_dispatcher.py
+++ b/awx/main/management/commands/run_dispatcher.py
@@ -1,13 +1,14 @@
 # Copyright (c) 2015 Ansible, Inc.
 # All Rights Reserved.
 import logging
+import yaml

 from django.conf import settings
 from django.core.cache import cache as django_cache
 from django.core.management.base import BaseCommand
 from django.db import connection as django_connection

-from awx.main.dispatch import get_local_queuename, reaper
+from awx.main.dispatch import get_local_queuename
 from awx.main.dispatch.control import Control
 from awx.main.dispatch.pool import AutoscalePool
 from awx.main.dispatch.worker import AWXConsumerPG, TaskWorker
@@ -30,7 +31,16 @@ class Command(BaseCommand):
            '--reload',
            dest='reload',
            action='store_true',
-            help=('cause the dispatcher to recycle all of its worker processes;' 'running jobs will run to completion first'),
+            help=('cause the dispatcher to recycle all of its worker processes; running jobs will run to completion first'),
+        )
+        parser.add_argument(
+            '--cancel',
+            dest='cancel',
+            help=(
+                'Cancel a particular task id. Takes either a single id string, or a JSON list of multiple ids. '
+                'Can take in output from the --running argument as input to cancel all tasks. '
+                'Only running tasks can be canceled; queued tasks must be started before they can be canceled.'
+            ),
        )

    def handle(self, *arg, **options):
@@ -42,6 +52,16 @@ class Command(BaseCommand):
            return
        if options.get('reload'):
            return Control('dispatcher').control({'control': 'reload'})
+        if options.get('cancel'):
+            cancel_str = options.get('cancel')
+            try:
+                cancel_data = yaml.safe_load(cancel_str)
+            except Exception:
+                cancel_data = [cancel_str]
+            if not isinstance(cancel_data, list):
+                cancel_data = [cancel_str]
+            print(Control('dispatcher').cancel(cancel_data))
+            return

        # It's important to close these because we're _about_ to fork, and we
        # don't want the forked processes to inherit the open sockets
@@ -53,7 +73,6 @@ class Command(BaseCommand):
        # (like the node heartbeat)
        periodic.run_continuously()

-        reaper.reap()
        consumer = None

        try:
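
The `--cancel` argument handling can be sketched as a small parser. Note the swap: this example uses the standard library `json.loads` instead of `yaml.safe_load` so that it runs without third-party packages, which means inputs that are valid YAML but not valid JSON fall back to the single-id case here. The function name is hypothetical.

```python
import json


def parse_cancel_arg(cancel_str):
    """Turn a --cancel value into a list of task ids."""
    try:
        cancel_data = json.loads(cancel_str)  # the change above uses yaml.safe_load
    except Exception:
        cancel_data = [cancel_str]
    if not isinstance(cancel_data, list):
        # A bare id (or any non-list value) is treated as a single task id
        cancel_data = [cancel_str]
    return cancel_data


single = parse_cancel_arg('0a1b2c')
several = parse_cancel_arg('["0a1b2c", "3d4e5f"]')
numeric = parse_cancel_arg('123')
```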
--- a/awx/main/management/commands/run_wsbroadcast.py
+++ b/awx/main/management/commands/run_wsbroadcast.py
@@ -95,8 +95,13 @@ class Command(BaseCommand):
        # database migrations are still running
        from awx.main.models.ha import Instance

-        executor = MigrationExecutor(connection)
-        migrating = bool(executor.migration_plan(executor.loader.graph.leaf_nodes()))
+        try:
+            executor = MigrationExecutor(connection)
+            migrating = bool(executor.migration_plan(executor.loader.graph.leaf_nodes()))
+        except Exception as exc:
+            logger.info(f'Error on startup of run_wsbroadcast (error: {exc}), retry in 10s...')
+            time.sleep(10)
+            return

        # In containerized deployments, migrations happen in the task container,
        # and the services running there don't start until migrations are
--- a/awx/main/managers.py
+++ b/awx/main/managers.py
@@ -129,10 +129,13 @@ class InstanceManager(models.Manager):
                # if instance was not retrieved by uuid and hostname was, use the hostname
                instance = self.filter(hostname=hostname)

+            from awx.main.models import Instance
+
            # Return existing instance
            if instance.exists():
                instance = instance.first()  # in the unusual occasion that there is more than one, only get one
-                update_fields = []
+                instance.node_state = Instance.States.INSTALLED  # Wait for it to show up on the mesh
+                update_fields = ['node_state']
                # if instance was retrieved by uuid and hostname has changed, update hostname
                if instance.hostname != hostname:
                    logger.warning("passed in hostname {0} is different from the original hostname {1}, updating to {0}".format(hostname, instance.hostname))
@@ -141,6 +144,7 @@ class InstanceManager(models.Manager):
                # if any other fields are to be updated
                if instance.ip_address != ip_address:
                    instance.ip_address = ip_address
+                    update_fields.append('ip_address')
                if instance.node_type != node_type:
                    instance.node_type = node_type
                    update_fields.append('node_type')
@@ -151,12 +155,12 @@ class InstanceManager(models.Manager):
                    return (False, instance)

            # Create new instance, and fill in default values
-            create_defaults = dict(capacity=0)
+            create_defaults = {'node_state': Instance.States.INSTALLED, 'capacity': 0}
            if defaults is not None:
                create_defaults.update(defaults)
            uuid_option = {}
            if uuid is not None:
-                uuid_option = dict(uuid=uuid)
+                uuid_option = {'uuid': uuid}
            if node_type == 'execution' and 'version' not in create_defaults:
                create_defaults['version'] = RECEPTOR_PENDING
            instance = self.create(hostname=hostname, ip_address=ip_address, node_type=node_type, **create_defaults, **uuid_option)
--- a/awx/main/migrations/0164_remove_inventorysource_update_on_project_update.py
+++ b/awx/main/migrations/0164_remove_inventorysource_update_on_project_update.py
@@ -0,0 +1,40 @@
+# Generated by Django 3.2.13 on 2022-06-21 21:29
+
+from django.db import migrations
+import logging
+
+logger = logging.getLogger("awx")
+
+
+def forwards(apps, schema_editor):
+    InventorySource = apps.get_model('main', 'InventorySource')
+    sources = InventorySource.objects.filter(update_on_project_update=True)
+    for src in sources:
+        if src.update_on_launch is False:
+            src.update_on_launch = True
+            src.save(update_fields=['update_on_launch'])
+            logger.info(f"Setting update_on_launch to True for {src}")
+        proj = src.source_project
+        if proj and proj.scm_update_on_launch is False:
+            proj.scm_update_on_launch = True
+            proj.save(update_fields=['scm_update_on_launch'])
+            logger.warning(f"Setting scm_update_on_launch to True for {proj}")
+
+
+class Migration(migrations.Migration):
+
+    dependencies = [
+        ('main', '0163_convert_job_tags_to_textfield'),
+    ]
+
+    operations = [
+        migrations.RunPython(forwards, migrations.RunPython.noop),
+        migrations.RemoveField(
+            model_name='inventorysource',
+            name='scm_last_revision',
+        ),
+        migrations.RemoveField(
+            model_name='inventorysource',
+            name='update_on_project_update',
+        ),
+    ]
--- a/awx/main/migrations/0165_task_manager_refactor.py
+++ b/awx/main/migrations/0165_task_manager_refactor.py
@@ -0,0 +1,35 @@
+# Generated by Django 3.2.13 on 2022-08-10 14:03
+
+from django.db import migrations, models
+
+
+class Migration(migrations.Migration):
+
+    dependencies = [
+        ('main', '0164_remove_inventorysource_update_on_project_update'),
+    ]
+
+    operations = [
+        migrations.AddField(
+            model_name='unifiedjob',
+            name='preferred_instance_groups_cache',
+            field=models.JSONField(
+                blank=True, default=None, editable=False, help_text='A cached list with pk values from preferred instance groups.', null=True
+            ),
+        ),
+        migrations.AddField(
+            model_name='unifiedjob',
+            name='task_impact',
+            field=models.PositiveIntegerField(default=0, editable=False, help_text='Number of forks an instance consumes when running this job.'),
+        ),
+        migrations.AddField(
+            model_name='workflowapproval',
+            name='expires',
+            field=models.DateTimeField(
+                default=None,
+                editable=False,
+                help_text='The time this approval will expire. This is the created time plus timeout, used for filtering.',
+                null=True,
+            ),
+        ),
+    ]
--- a/awx/main/migrations/0166_alter_jobevent_host.py
+++ b/awx/main/migrations/0166_alter_jobevent_host.py
@@ -0,0 +1,40 @@
+# Generated by Django 3.2.13 on 2022-07-06 13:19
+
+from django.db import migrations, models
+import django.db.models.deletion
+
+
+class Migration(migrations.Migration):
+
+    dependencies = [
+        ('main', '0165_task_manager_refactor'),
+    ]
+
+    operations = [
+        migrations.AlterField(
+            model_name='adhoccommandevent',
+            name='host',
+            field=models.ForeignKey(
+                db_constraint=False,
+                default=None,
+                editable=False,
+                null=True,
+                on_delete=django.db.models.deletion.SET_NULL,
+                related_name='ad_hoc_command_events',
+                to='main.host',
+            ),
+        ),
+        migrations.AlterField(
+            model_name='jobevent',
+            name='host',
+            field=models.ForeignKey(
+                db_constraint=False,
+                default=None,
+                editable=False,
+                null=True,
+                on_delete=django.db.models.deletion.DO_NOTHING,
+                related_name='job_events_as_primary_host',
+                to='main.host',
+            ),
+        ),
+    ]
--- a/awx/main/migrations/0167_project_signature_validation_credential.py
+++ b/awx/main/migrations/0167_project_signature_validation_credential.py
@@ -0,0 +1,57 @@
+# Generated by Django 3.2.13 on 2022-08-24 14:02
+
+from django.db import migrations, models
+import django.db.models.deletion
+
+from awx.main.models import CredentialType
+from awx.main.utils.common import set_current_apps
+
+
+def setup_tower_managed_defaults(apps, schema_editor):
+    set_current_apps(apps)
+    CredentialType.setup_tower_managed_defaults(apps)
+
+
+class Migration(migrations.Migration):
+
+    dependencies = [
+        ('main', '0166_alter_jobevent_host'),
+    ]
+
+    operations = [
+        migrations.AddField(
+            model_name='project',
+            name='signature_validation_credential',
+            field=models.ForeignKey(
+                blank=True,
+                default=None,
+                null=True,
+                on_delete=django.db.models.deletion.SET_NULL,
+                related_name='projects_signature_validation',
+                to='main.credential',
+                help_text='An optional credential used for validating files in the project against unexpected changes.',
+            ),
+        ),
+        migrations.AlterField(
+            model_name='credentialtype',
+            name='kind',
+            field=models.CharField(
+                choices=[
+                    ('ssh', 'Machine'),
+                    ('vault', 'Vault'),
+                    ('net', 'Network'),
+                    ('scm', 'Source Control'),
+                    ('cloud', 'Cloud'),
+                    ('registry', 'Container Registry'),
+                    ('token', 'Personal Access Token'),
+                    ('insights', 'Insights'),
+                    ('external', 'External'),
+                    ('kubernetes', 'Kubernetes'),
+                    ('galaxy', 'Galaxy/Automation Hub'),
+                    ('cryptography', 'Cryptography'),
+                ],
+                max_length=32,
+            ),
+        ),
+        migrations.RunPython(setup_tower_managed_defaults),
+    ]
--- a/awx/main/migrations/0168_inventoryupdate_scm_revision.py
+++ b/awx/main/migrations/0168_inventoryupdate_scm_revision.py
@@ -0,0 +1,25 @@
+# Generated by Django 3.2.13 on 2022-09-08 16:03
+
+from django.db import migrations, models
+
+
+class Migration(migrations.Migration):
+
+    dependencies = [
+        ('main', '0167_project_signature_validation_credential'),
+    ]
+
+    operations = [
+        migrations.AddField(
+            model_name='inventoryupdate',
+            name='scm_revision',
+            field=models.CharField(
+                blank=True,
+                default='',
+                editable=False,
+                help_text='The SCM Revision from the Project used for this inventory update. Only applicable to inventories sourced from SCM.',
+                max_length=1024,
+                verbose_name='SCM Revision',
+            ),
+        ),
+    ]
--- a/awx/main/migrations/0169_jt_prompt_everything_on_launch.py
+++ b/awx/main/migrations/0169_jt_prompt_everything_on_launch.py
@@ -0,0 +1,225 @@
+# Generated by Django 3.2.13 on 2022-09-15 14:07
+
+import awx.main.fields
+import awx.main.utils.polymorphic
+from django.db import migrations, models
+import django.db.models.deletion
+
+
+class Migration(migrations.Migration):
+
+    dependencies = [
+        ('main', '0168_inventoryupdate_scm_revision'),
+    ]
+
+    operations = [
+        migrations.AddField(
+            model_name='joblaunchconfig',
+            name='execution_environment',
+            field=models.ForeignKey(
+                blank=True,
+                default=None,
+                help_text='The container image to be used for execution.',
+                null=True,
+                on_delete=awx.main.utils.polymorphic.SET_NULL,
+                related_name='joblaunchconfig_as_prompt',
+                to='main.executionenvironment',
+            ),
+        ),
+        migrations.AddField(
+            model_name='joblaunchconfig',
+            name='labels',
+            field=models.ManyToManyField(related_name='joblaunchconfig_labels', to='main.Label'),
+        ),
+        migrations.AddField(
+            model_name='jobtemplate',
+            name='ask_execution_environment_on_launch',
+            field=awx.main.fields.AskForField(blank=True, default=False),
+        ),
+        migrations.AddField(
+            model_name='jobtemplate',
+            name='ask_forks_on_launch',
+            field=awx.main.fields.AskForField(blank=True, default=False),
+        ),
+        migrations.AddField(
+            model_name='jobtemplate',
+            name='ask_instance_groups_on_launch',
+            field=awx.main.fields.AskForField(blank=True, default=False),
+        ),
+        migrations.AddField(
+            model_name='jobtemplate',
+            name='ask_job_slice_count_on_launch',
+            field=awx.main.fields.AskForField(blank=True, default=False),
+        ),
+        migrations.AddField(
+            model_name='jobtemplate',
+            name='ask_labels_on_launch',
+            field=awx.main.fields.AskForField(blank=True, default=False),
+        ),
+        migrations.AddField(
+            model_name='jobtemplate',
+            name='ask_timeout_on_launch',
+            field=awx.main.fields.AskForField(blank=True, default=False),
+        ),
+        migrations.AddField(
+            model_name='schedule',
+            name='execution_environment',
+            field=models.ForeignKey(
+                blank=True,
+                default=None,
+                help_text='The container image to be used for execution.',
+                null=True,
+                on_delete=awx.main.utils.polymorphic.SET_NULL,
+                related_name='schedule_as_prompt',
+                to='main.executionenvironment',
+            ),
+        ),
+        migrations.AddField(
+            model_name='schedule',
+            name='labels',
+            field=models.ManyToManyField(related_name='schedule_labels', to='main.Label'),
+        ),
+        migrations.AddField(
+            model_name='workflowjobnode',
+            name='execution_environment',
+            field=models.ForeignKey(
+                blank=True,
+                default=None,
+                help_text='The container image to be used for execution.',
+                null=True,
+                on_delete=awx.main.utils.polymorphic.SET_NULL,
+                related_name='workflowjobnode_as_prompt',
+                to='main.executionenvironment',
+            ),
+        ),
+        migrations.AddField(
+            model_name='workflowjobnode',
+            name='labels',
+            field=models.ManyToManyField(related_name='workflowjobnode_labels', to='main.Label'),
+        ),
+        migrations.AddField(
+            model_name='workflowjobtemplate',
+            name='ask_labels_on_launch',
+            field=awx.main.fields.AskForField(blank=True, default=False),
+        ),
+        migrations.AddField(
+            model_name='workflowjobtemplate',
+            name='ask_skip_tags_on_launch',
+            field=awx.main.fields.AskForField(blank=True, default=False),
+        ),
+        migrations.AddField(
+            model_name='workflowjobtemplate',
+            name='ask_tags_on_launch',
+            field=awx.main.fields.AskForField(blank=True, default=False),
+        ),
+        migrations.AddField(
+            model_name='workflowjobtemplatenode',
+            name='execution_environment',
+            field=models.ForeignKey(
+                blank=True,
+                default=None,
+                help_text='The container image to be used for execution.',
+                null=True,
+                on_delete=awx.main.utils.polymorphic.SET_NULL,
+                related_name='workflowjobtemplatenode_as_prompt',
+                to='main.executionenvironment',
+            ),
+        ),
+        migrations.AddField(
+            model_name='workflowjobtemplatenode',
+            name='labels',
+            field=models.ManyToManyField(related_name='workflowjobtemplatenode_labels', to='main.Label'),
+        ),
+        migrations.CreateModel(
+            name='WorkflowJobTemplateNodeBaseInstanceGroupMembership',
+            fields=[
+                ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
+                ('position', models.PositiveIntegerField(db_index=True, default=None, null=True)),
+                ('instancegroup', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to='main.instancegroup')),
+                ('workflowjobtemplatenode', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to='main.workflowjobtemplatenode')),
+            ],
+        ),
+        migrations.CreateModel(
+            name='WorkflowJobNodeBaseInstanceGroupMembership',
+            fields=[
+                ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
+                ('position', models.PositiveIntegerField(db_index=True, default=None, null=True)),
+                ('instancegroup', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to='main.instancegroup')),
+                ('workflowjobnode', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to='main.workflowjobnode')),
+            ],
+        ),
+        migrations.CreateModel(
+            name='WorkflowJobInstanceGroupMembership',
+            fields=[
+                ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
+                ('position', models.PositiveIntegerField(db_index=True, default=None, null=True)),
+                ('instancegroup', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to='main.instancegroup')),
+                ('workflowjobnode', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to='main.workflowjob')),
+            ],
+        ),
+        migrations.CreateModel(
+            name='ScheduleInstanceGroupMembership',
+            fields=[
+                ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
+                ('position', models.PositiveIntegerField(db_index=True, default=None, null=True)),
+                ('instancegroup', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to='main.instancegroup')),
+                ('schedule', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to='main.schedule')),
+            ],
+        ),
+        migrations.CreateModel(
+            name='JobLaunchConfigInstanceGroupMembership',
+            fields=[
+                ('id', models.AutoField(auto_created=True, primary_key=True, serialize=False, verbose_name='ID')),
+                ('position', models.PositiveIntegerField(db_index=True, default=None, null=True)),
+                ('instancegroup', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to='main.instancegroup')),
+                ('joblaunchconfig', models.ForeignKey(on_delete=django.db.models.deletion.CASCADE, to='main.joblaunchconfig')),
+            ],
+        ),
+        migrations.AddField(
+            model_name='joblaunchconfig',
+            name='instance_groups',
+            field=awx.main.fields.OrderedManyToManyField(
+                blank=True, editable=False, related_name='joblaunchconfigs', through='main.JobLaunchConfigInstanceGroupMembership', to='main.InstanceGroup'
+            ),
+        ),
+        migrations.AddField(
+            model_name='schedule',
+            name='instance_groups',
+            field=awx.main.fields.OrderedManyToManyField(
+                blank=True, editable=False, related_name='schedule_instance_groups', through='main.ScheduleInstanceGroupMembership', to='main.InstanceGroup'
+            ),
+        ),
+        migrations.AddField(
+            model_name='workflowjob',
+            name='instance_groups',
+            field=awx.main.fields.OrderedManyToManyField(
+                blank=True,
+                editable=False,
+                related_name='workflow_job_instance_groups',
+                through='main.WorkflowJobInstanceGroupMembership',
+                to='main.InstanceGroup',
+            ),
+        ),
+        migrations.AddField(
+            model_name='workflowjobnode',
+            name='instance_groups',
+            field=awx.main.fields.OrderedManyToManyField(
+                blank=True,
+                editable=False,
+                related_name='workflow_job_node_instance_groups',
+                through='main.WorkflowJobNodeBaseInstanceGroupMembership',
+                to='main.InstanceGroup',
+            ),
+        ),
+        migrations.AddField(
+            model_name='workflowjobtemplatenode',
+            name='instance_groups',
+            field=awx.main.fields.OrderedManyToManyField(
+                blank=True,
+                editable=False,
+                related_name='workflow_job_template_node_instance_groups',
+                through='main.WorkflowJobTemplateNodeBaseInstanceGroupMembership',
+                to='main.InstanceGroup',
+            ),
+        ),
+    ]
--- a/awx/main/migrations/0170_node_and_link_state.py
+++ b/awx/main/migrations/0170_node_and_link_state.py
@@ -0,0 +1,79 @@
+# Generated by Django 3.2.13 on 2022-08-02 17:53
+
+import django.core.validators
+from django.db import migrations, models
+
+
+def forwards(apps, schema_editor):
+    # All existing InstanceLink objects need to be in the state
+    # 'Established', which is the default, so nothing needs to be done
+    # for that.
+
+    Instance = apps.get_model('main', 'Instance')
+    for instance in Instance.objects.all():
+        instance.node_state = 'ready' if not instance.errors else 'unavailable'
+        instance.save(update_fields=['node_state'])
+
+
+class Migration(migrations.Migration):
+
+    dependencies = [
+        ('main', '0169_jt_prompt_everything_on_launch'),
+    ]
+
+    operations = [
+        migrations.AddField(
+            model_name='instance',
+            name='listener_port',
+            field=models.PositiveIntegerField(
+                blank=True,
+                default=27199,
+                help_text='Port that Receptor will listen for incoming connections on.',
+                validators=[django.core.validators.MinValueValidator(1), django.core.validators.MaxValueValidator(65535)],
+            ),
+        ),
+        migrations.AddField(
+            model_name='instance',
+            name='node_state',
+            field=models.CharField(
+                choices=[
+                    ('provisioning', 'Provisioning'),
+                    ('provision-fail', 'Provisioning Failure'),
+                    ('installed', 'Installed'),
+                    ('ready', 'Ready'),
+                    ('unavailable', 'Unavailable'),
+                    ('deprovisioning', 'De-provisioning'),
+                    ('deprovision-fail', 'De-provisioning Failure'),
+                ],
+                default='ready',
+                help_text='Indicates the current life cycle stage of this instance.',
+                max_length=16,
+            ),
+        ),
+        migrations.AddField(
+            model_name='instancelink',
+            name='link_state',
+            field=models.CharField(
+                choices=[('adding', 'Adding'), ('established', 'Established'), ('removing', 'Removing')],
+                default='established',
+                help_text='Indicates the current life cycle stage of this peer link.',
+                max_length=16,
+            ),
+        ),
+        migrations.AlterField(
+            model_name='instance',
+            name='node_type',
+            field=models.CharField(
+                choices=[
+                    ('control', 'Control plane node'),
+                    ('execution', 'Execution plane node'),
+                    ('hybrid', 'Controller and execution'),
+                    ('hop', 'Message-passing node, no execution capability'),
+                ],
+                default='hybrid',
+                help_text='Role that this node plays in the mesh.',
+                max_length=16,
+            ),
+        ),
+        migrations.RunPython(forwards, reverse_code=migrations.RunPython.noop),
+    ]
--- a/awx/main/migrations/0171_add_health_check_started.py
+++ b/awx/main/migrations/0171_add_health_check_started.py
@@ -0,0 +1,18 @@
+# Generated by Django 3.2.13 on 2022-09-26 20:54
+
+from django.db import migrations, models
+
+
+class Migration(migrations.Migration):
+
+    dependencies = [
+        ('main', '0170_node_and_link_state'),
+    ]
+
+    operations = [
+        migrations.AddField(
+            model_name='instance',
+            name='health_check_started',
+            field=models.DateTimeField(editable=False, help_text='The last time a health check was initiated on this instance.', null=True),
+        ),
+    ]
--- a/awx/main/migrations/0172_prevent_instance_fallback.py
+++ b/awx/main/migrations/0172_prevent_instance_fallback.py
@@ -0,0 +1,29 @@
+# Generated by Django 3.2.13 on 2022-09-29 18:10
+
+from django.db import migrations, models
+
+
+class Migration(migrations.Migration):
+
+    dependencies = [
+        ('main', '0171_add_health_check_started'),
+    ]
+
+    operations = [
+        migrations.AddField(
+            model_name='inventory',
+            name='prevent_instance_group_fallback',
+            field=models.BooleanField(
+                default=False,
+                help_text='If enabled, the inventory will prevent adding any organization instance groups to the list of preferred instance groups to run associated job templates on. If this setting is enabled and you provide an empty list, the global instance groups will be applied.',
+            ),
+        ),
+        migrations.AddField(
+            model_name='jobtemplate',
+            name='prevent_instance_group_fallback',
+            field=models.BooleanField(
+                default=False,
+                help_text='If enabled, the job template will prevent adding any inventory or organization instance groups to the list of preferred instance groups to run on. If this setting is enabled and you provide an empty list, the global instance groups will be applied.',
+            ),
+        ),
+    ]
--- a/awx/main/migrations/_create_system_jobs.py
+++ b/awx/main/migrations/_create_system_jobs.py
@@ -36,7 +36,7 @@ def create_clearsessions_jt(apps, schema_editor):
    if created:
        sched = Schedule(
            name='Cleanup Expired Sessions',
-            rrule='DTSTART:%s RRULE:FREQ=WEEKLY;INTERVAL=1;COUNT=1' % schedule_time,
+            rrule='DTSTART:%s RRULE:FREQ=WEEKLY;INTERVAL=1' % schedule_time,
            description='Cleans out expired browser sessions',
            enabled=True,
            created=now_dt,
@@ -69,7 +69,7 @@ def create_cleartokens_jt(apps, schema_editor):
    if created:
        sched = Schedule(
            name='Cleanup Expired OAuth 2 Tokens',
-            rrule='DTSTART:%s RRULE:FREQ=WEEKLY;INTERVAL=1;COUNT=1' % schedule_time,
+            rrule='DTSTART:%s RRULE:FREQ=WEEKLY;INTERVAL=1' % schedule_time,
            description='Removes expired OAuth 2 access and refresh tokens',
            enabled=True,
            created=now_dt,
--- a/awx/main/models/ad_hoc_commands.py
+++ b/awx/main/models/ad_hoc_commands.py
@@ -90,6 +90,9 @@ class AdHocCommand(UnifiedJob, JobNotificationMixin):

    extra_vars_dict = VarsDictProperty('extra_vars', True)

+    def _set_default_dependencies_processed(self):
+        self.dependencies_processed = True
+
    def clean_inventory(self):
        inv = self.inventory
        if not inv:
@@ -178,12 +181,12 @@ class AdHocCommand(UnifiedJob, JobNotificationMixin):
    def get_passwords_needed_to_start(self):
        return self.passwords_needed_to_start

-    @property
-    def task_impact(self):
+    def _get_task_impact(self):
        # NOTE: We sorta have to assume the host count matches and that forks default to 5
-        from awx.main.models.inventory import Host
-
-        count_hosts = Host.objects.filter(enabled=True, inventory__ad_hoc_commands__pk=self.pk).count()
+        if self.inventory:
+            count_hosts = self.inventory.total_hosts
+        else:
+            count_hosts = 5
        return min(count_hosts, 5 if self.forks == 0 else self.forks) + 1

    def copy(self):
@@ -207,23 +210,32 @@ class AdHocCommand(UnifiedJob, JobNotificationMixin):

    def save(self, *args, **kwargs):
        update_fields = kwargs.get('update_fields', [])
+
+        def add_to_update_fields(name):
+            if name not in update_fields:
+                update_fields.append(name)
+
+        if not self.preferred_instance_groups_cache:
+            self.preferred_instance_groups_cache = self._get_preferred_instance_group_cache()
+            add_to_update_fields("preferred_instance_groups_cache")
        if not self.name:
            self.name = Truncator(u': '.join(filter(None, (self.module_name, self.module_args)))).chars(512)
-            if 'name' not in update_fields:
-                update_fields.append('name')
+            add_to_update_fields("name")
+        if self.task_impact == 0:
+            self.task_impact = self._get_task_impact()
+            add_to_update_fields("task_impact")
        super(AdHocCommand, self).save(*args, **kwargs)

    @property
    def preferred_instance_groups(self):
-        if self.inventory is not None and self.inventory.organization is not None:
-            organization_groups = [x for x in self.inventory.organization.instance_groups.all()]
-        else:
-            organization_groups = []
+        selected_groups = []
        if self.inventory is not None:
-            inventory_groups = [x for x in self.inventory.instance_groups.all()]
-        else:
-            inventory_groups = []
-        selected_groups = inventory_groups + organization_groups
+            for instance_group in self.inventory.instance_groups.all():
+                selected_groups.append(instance_group)
+            if not self.inventory.prevent_instance_group_fallback and self.inventory.organization is not None:
+                for instance_group in self.inventory.organization.instance_groups.all():
+                    selected_groups.append(instance_group)
+
        if not selected_groups:
            return self.global_instance_groups
        return selected_groups
--- a/awx/main/models/base.py
+++ b/awx/main/models/base.py
@@ -316,16 +316,17 @@ class PrimordialModel(HasEditsMixin, CreatedModifiedModel):
        user = get_current_user()
        if user and not user.id:
            user = None
-        if not self.pk and not self.created_by:
+        if (not self.pk) and (user is not None) and (not self.created_by):
            self.created_by = user
            if 'created_by' not in update_fields:
                update_fields.append('created_by')
        # Update modified_by if any editable fields have changed
        new_values = self._get_fields_snapshot()
        if (not self.pk and not self.modified_by) or self._values_have_edits(new_values):
-            self.modified_by = user
-            if 'modified_by' not in update_fields:
-                update_fields.append('modified_by')
+            if self.modified_by != user:
+                self.modified_by = user
+                if 'modified_by' not in update_fields:
+                    update_fields.append('modified_by')
        super(PrimordialModel, self).save(*args, **kwargs)
        self._prior_values_store = new_values

--- a/awx/main/models/credential/__init__.py
+++ b/awx/main/models/credential/__init__.py
@@ -336,6 +336,7 @@ class CredentialType(CommonModelNameNotUnique):
        ('external', _('External')),
        ('kubernetes', _('Kubernetes')),
        ('galaxy', _('Galaxy/Automation Hub')),
+        ('cryptography', _('Cryptography')),
    )

    kind = models.CharField(max_length=32, choices=KIND_CHOICES)
@@ -1171,6 +1172,25 @@ ManagedCredentialType(
    },
 )

+ManagedCredentialType(
+    namespace='gpg_public_key',
+    kind='cryptography',
+    name=gettext_noop('GPG Public Key'),
+    inputs={
+        'fields': [
+            {
+                'id': 'gpg_public_key',
+                'label': gettext_noop('GPG Public Key'),
+                'type': 'string',
+                'secret': True,
+                'multiline': True,
+                'help_text': gettext_noop('GPG Public Key used to validate content signatures.'),
+            },
+        ],
+        'required': ['gpg_public_key'],
+    },
+)
+

 class CredentialInputSource(PrimordialModel):
    class Meta:
--- a/awx/main/models/credential/injectors.py
+++ b/awx/main/models/credential/injectors.py
@@ -35,6 +35,7 @@ def gce(cred, env, private_data_dir):
    container_path = to_container_path(path, private_data_dir)
    env['GCE_CREDENTIALS_FILE_PATH'] = container_path
    env['GCP_SERVICE_ACCOUNT_FILE'] = container_path
+    env['GOOGLE_APPLICATION_CREDENTIALS'] = container_path

    # Handle env variables for new module types.
    # This includes gcp_compute inventory plugin and
--- a/awx/main/models/events.py
+++ b/awx/main/models/events.py
@@ -25,7 +25,6 @@ analytics_logger = logging.getLogger('awx.analytics.job_events')

 logger = logging.getLogger('awx.main.models.events')

-
 __all__ = ['JobEvent', 'ProjectUpdateEvent', 'AdHocCommandEvent', 'InventoryUpdateEvent', 'SystemJobEvent']


@@ -486,13 +485,18 @@ class JobEvent(BasePlaybookEvent):
        editable=False,
        db_index=False,
    )
+    # When we partitioned the table we accidentally "lost" the foreign key constraint.
+    # However, this is good because the cascade on delete at the Django layer was causing DB issues.
+    # We are going to leave this as a foreign key but mark it as not having a DB relation and
+    # prevent cascading on delete.
    host = models.ForeignKey(
        'Host',
        related_name='job_events_as_primary_host',
        null=True,
        default=None,
-        on_delete=models.SET_NULL,
+        on_delete=models.DO_NOTHING,
        editable=False,
+        db_constraint=False,
    )
    host_name = models.CharField(
        max_length=1024,
@@ -794,6 +798,10 @@ class AdHocCommandEvent(BaseCommandEvent):
        editable=False,
        db_index=False,
    )
+    # We need to keep this as a FK in the model because AdHocCommand uses a ManyToMany field
+    #   to hosts through adhoc_events. But in https://github.com/ansible/awx/pull/8236/ we
+    #   removed the nulling of the field in case of a host going away before an event is saved
+    #   so this needs to stay SET_NULL on the ORM level
    host = models.ForeignKey(
        'Host',
        related_name='ad_hoc_command_events',
@@ -801,6 +809,7 @@ class AdHocCommandEvent(BaseCommandEvent):
        default=None,
        on_delete=models.SET_NULL,
        editable=False,
+        db_constraint=False,
    )
    host_name = models.CharField(
        max_length=1024,
--- a/awx/main/models/ha.py
+++ b/awx/main/models/ha.py
@@ -5,13 +5,14 @@ from decimal import Decimal
 import logging
 import os

-from django.core.validators import MinValueValidator
+from django.core.validators import MinValueValidator, MaxValueValidator
 from django.db import models, connection
 from django.db.models.signals import post_save, post_delete
 from django.dispatch import receiver
 from django.utils.translation import gettext_lazy as _
 from django.conf import settings
 from django.utils.timezone import now, timedelta
+from django.db.models import Sum

 import redis
 from solo.models import SingletonModel
@@ -58,6 +59,15 @@ class InstanceLink(BaseModel):
    source = models.ForeignKey('Instance', on_delete=models.CASCADE, related_name='+')
    target = models.ForeignKey('Instance', on_delete=models.CASCADE, related_name='reverse_peers')

+    class States(models.TextChoices):
+        ADDING = 'adding', _('Adding')
+        ESTABLISHED = 'established', _('Established')
+        REMOVING = 'removing', _('Removing')
+
+    link_state = models.CharField(
+        choices=States.choices, default=States.ESTABLISHED, max_length=16, help_text=_("Indicates the current life cycle stage of this peer link.")
+    )
+
    class Meta:
        unique_together = ('source', 'target')

@@ -104,6 +114,11 @@ class Instance(HasPolicyEditsMixin, BaseModel):
        editable=False,
        help_text=_('Last time instance ran its heartbeat task for main cluster nodes. Last known connection to receptor mesh for execution nodes.'),
    )
+    health_check_started = models.DateTimeField(
+        null=True,
+        editable=False,
+        help_text=_("The last time a health check was initiated on this instance."),
+    )
    last_health_check = models.DateTimeField(
        null=True,
        editable=False,
@@ -126,13 +141,33 @@ class Instance(HasPolicyEditsMixin, BaseModel):
        default=0,
        editable=False,
    )
-    NODE_TYPE_CHOICES = [
-        ("control", "Control plane node"),
-        ("execution", "Execution plane node"),
-        ("hybrid", "Controller and execution"),
-        ("hop", "Message-passing node, no execution capability"),
-    ]
-    node_type = models.CharField(default='hybrid', choices=NODE_TYPE_CHOICES, max_length=16)
+
+    class Types(models.TextChoices):
+        CONTROL = 'control', _("Control plane node")
+        EXECUTION = 'execution', _("Execution plane node")
+        HYBRID = 'hybrid', _("Controller and execution")
+        HOP = 'hop', _("Message-passing node, no execution capability")
+
+    node_type = models.CharField(default=Types.HYBRID, choices=Types.choices, max_length=16, help_text=_("Role that this node plays in the mesh."))
+
+    class States(models.TextChoices):
+        PROVISIONING = 'provisioning', _('Provisioning')
+        PROVISION_FAIL = 'provision-fail', _('Provisioning Failure')
+        INSTALLED = 'installed', _('Installed')
+        READY = 'ready', _('Ready')
+        UNAVAILABLE = 'unavailable', _('Unavailable')
+        DEPROVISIONING = 'deprovisioning', _('De-provisioning')
+        DEPROVISION_FAIL = 'deprovision-fail', _('De-provisioning Failure')
+
+    node_state = models.CharField(
+        choices=States.choices, default=States.READY, max_length=16, help_text=_("Indicates the current life cycle stage of this instance.")
+    )
+    listener_port = models.PositiveIntegerField(
+        blank=True,
+        default=27199,
+        validators=[MinValueValidator(1), MaxValueValidator(65535)],
+        help_text=_("Port that Receptor will listen for incoming connections on."),
+    )

    peers = models.ManyToManyField('self', symmetrical=False, through=InstanceLink, through_fields=('source', 'target'))

@@ -149,10 +184,13 @@ class Instance(HasPolicyEditsMixin, BaseModel):
    def consumed_capacity(self):
        capacity_consumed = 0
        if self.node_type in ('hybrid', 'execution'):
-            capacity_consumed += sum(x.task_impact for x in UnifiedJob.objects.filter(execution_node=self.hostname, status__in=('running', 'waiting')))
+            capacity_consumed += (
+                UnifiedJob.objects.filter(execution_node=self.hostname, status__in=('running', 'waiting')).aggregate(Sum("task_impact"))["task_impact__sum"]
+                or 0
+            )
        if self.node_type in ('hybrid', 'control'):
-            capacity_consumed += sum(
-                settings.AWX_CONTROL_NODE_TASK_IMPACT for x in UnifiedJob.objects.filter(controller_node=self.hostname, status__in=('running', 'waiting'))
+            capacity_consumed += (
+                settings.AWX_CONTROL_NODE_TASK_IMPACT * UnifiedJob.objects.filter(controller_node=self.hostname, status__in=('running', 'waiting')).count()
            )
        return capacity_consumed

@@ -174,6 +212,14 @@ class Instance(HasPolicyEditsMixin, BaseModel):
    def jobs_total(self):
        return UnifiedJob.objects.filter(execution_node=self.hostname).count()

+    @property
+    def health_check_pending(self):
+        if self.health_check_started is None:
+            return False
+        if self.last_health_check is None:
+            return True
+        return self.health_check_started > self.last_health_check
+
    def get_cleanup_task_kwargs(self, **kwargs):
        """
        Produce options to use for the command: ansible-runner worker cleanup
@@ -203,24 +249,28 @@ class Instance(HasPolicyEditsMixin, BaseModel):
            return True
        if ref_time is None:
            ref_time = now()
-        grace_period = settings.CLUSTER_NODE_HEARTBEAT_PERIOD * 2
+        grace_period = settings.CLUSTER_NODE_HEARTBEAT_PERIOD * settings.CLUSTER_NODE_MISSED_HEARTBEAT_TOLERANCE
        if self.node_type in ('execution', 'hop'):
            grace_period += settings.RECEPTOR_SERVICE_ADVERTISEMENT_PERIOD
        return self.last_seen < ref_time - timedelta(seconds=grace_period)

    def mark_offline(self, update_last_seen=False, perform_save=True, errors=''):
-        if self.cpu_capacity == 0 and self.mem_capacity == 0 and self.capacity == 0 and self.errors == errors and (not update_last_seen):
-            return
+        if self.node_state not in (Instance.States.READY, Instance.States.UNAVAILABLE, Instance.States.INSTALLED):
+            return []
+        if self.node_state == Instance.States.UNAVAILABLE and self.errors == errors and (not update_last_seen):
+            return []
+        self.node_state = Instance.States.UNAVAILABLE
        self.cpu_capacity = self.mem_capacity = self.capacity = 0
        self.errors = errors
        if update_last_seen:
            self.last_seen = now()

+        update_fields = ['node_state', 'capacity', 'cpu_capacity', 'mem_capacity', 'errors']
+        if update_last_seen:
+            update_fields += ['last_seen']
        if perform_save:
-            update_fields = ['capacity', 'cpu_capacity', 'mem_capacity', 'errors']
-            if update_last_seen:
-                update_fields += ['last_seen']
            self.save(update_fields=update_fields)
+        return update_fields

    def set_capacity_value(self):
        """Sets capacity according to capacity adjustment rule (no save)"""
@@ -274,8 +324,12 @@ class Instance(HasPolicyEditsMixin, BaseModel):
        if not errors:
            self.refresh_capacity_fields()
            self.errors = ''
+            if self.node_state in (Instance.States.UNAVAILABLE, Instance.States.INSTALLED):
+                self.node_state = Instance.States.READY
+                update_fields.append('node_state')
        else:
-            self.mark_offline(perform_save=False, errors=errors)
+            fields_to_update = self.mark_offline(perform_save=False, errors=errors)
+            update_fields.extend(fields_to_update)
        update_fields.extend(['cpu_capacity', 'mem_capacity', 'capacity'])

        # disabling activity stream will avoid extra queries, which is important for heatbeat actions
@@ -292,7 +346,7 @@ class Instance(HasPolicyEditsMixin, BaseModel):
            # playbook event data; we should consider this a zero capacity event
            redis.Redis.from_url(settings.BROKER_URL).ping()
        except redis.ConnectionError:
-            errors = _('Failed to connect ot Redis')
+            errors = _('Failed to connect to Redis')

        self.save_health_data(awx_application_version, get_cpu_count(), get_mem_in_bytes(), update_last_seen=True, errors=errors)

@@ -384,6 +438,20 @@ def on_instance_group_saved(sender, instance, created=False, raw=False, **kwargs

@receiver(post_save, sender=Instance)
 def on_instance_saved(sender, instance, created=False, raw=False, **kwargs):
+    if settings.IS_K8S and instance.node_type in (Instance.Types.EXECUTION,):
+        if instance.node_state == Instance.States.DEPROVISIONING:
+            from awx.main.tasks.receptor import remove_deprovisioned_node  # prevents circular import
+
+            # wait for jobs on the node to complete, then delete the
+            # node and kick off write_receptor_config
+            connection.on_commit(lambda: remove_deprovisioned_node.apply_async([instance.hostname]))
+
+        if instance.node_state == Instance.States.INSTALLED:
+            from awx.main.tasks.receptor import write_receptor_config  # prevents circular import
+
+            # broadcast to all control instances to update their receptor configs
+            connection.on_commit(lambda: write_receptor_config.apply_async(queue='tower_broadcast_all'))
+
    if created or instance.has_policy_changes():
        schedule_policy_task()

@@ -430,3 +498,58 @@ class InventoryInstanceGroupMembership(models.Model):
        default=None,
        db_index=True,
    )
+
+
+class JobLaunchConfigInstanceGroupMembership(models.Model):
+
+    joblaunchconfig = models.ForeignKey('JobLaunchConfig', on_delete=models.CASCADE)
+    instancegroup = models.ForeignKey('InstanceGroup', on_delete=models.CASCADE)
+    position = models.PositiveIntegerField(
+        null=True,
+        default=None,
+        db_index=True,
+    )
+
+
+class ScheduleInstanceGroupMembership(models.Model):
+
+    schedule = models.ForeignKey('Schedule', on_delete=models.CASCADE)
+    instancegroup = models.ForeignKey('InstanceGroup', on_delete=models.CASCADE)
+    position = models.PositiveIntegerField(
+        null=True,
+        default=None,
+        db_index=True,
+    )
+
+
+class WorkflowJobTemplateNodeBaseInstanceGroupMembership(models.Model):
+
+    workflowjobtemplatenode = models.ForeignKey('WorkflowJobTemplateNode', on_delete=models.CASCADE)
+    instancegroup = models.ForeignKey('InstanceGroup', on_delete=models.CASCADE)
+    position = models.PositiveIntegerField(
+        null=True,
+        default=None,
+        db_index=True,
+    )
+
+
+class WorkflowJobNodeBaseInstanceGroupMembership(models.Model):
+
+    workflowjobnode = models.ForeignKey('WorkflowJobNode', on_delete=models.CASCADE)
+    instancegroup = models.ForeignKey('InstanceGroup', on_delete=models.CASCADE)
+    position = models.PositiveIntegerField(
+        null=True,
+        default=None,
+        db_index=True,
+    )
+
+
+class WorkflowJobInstanceGroupMembership(models.Model):
+
+    workflowjobnode = models.ForeignKey('WorkflowJob', on_delete=models.CASCADE)
+    instancegroup = models.ForeignKey('InstanceGroup', on_delete=models.CASCADE)
+    position = models.PositiveIntegerField(
+        null=True,
+        default=None,
+        db_index=True,
+    )
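
Review note on the `mark_offline` change in this file: the method now returns the list of fields it modified, and returns an empty list when the node is in a provisioning or deprovisioning state or is already unavailable with the same error. The sketch below illustrates that contract only. It is a simplified, framework-free stand-in, not the AWX implementation: the `Node` dataclass, the module-level state constants, and the single `capacity` field are hypothetical, and the `update_last_seen` handling is omitted.

```python
from dataclasses import dataclass

# Hypothetical stand-ins for Instance.States values used in the diff
READY, UNAVAILABLE, INSTALLED, PROVISIONING = "ready", "unavailable", "installed", "provisioning"


@dataclass
class Node:
    """Hypothetical, minimal substitute for the Instance model."""

    node_state: str = READY
    capacity: int = 100
    errors: str = ""


def mark_offline(node, errors=""):
    """Move a node to UNAVAILABLE and report which fields changed."""
    # Nodes that are still being provisioned or deprovisioned are left alone
    if node.node_state not in (READY, UNAVAILABLE, INSTALLED):
        return []
    # Already offline with the same error: nothing to update
    if node.node_state == UNAVAILABLE and node.errors == errors:
        return []
    node.node_state = UNAVAILABLE
    node.capacity = 0
    node.errors = errors
    return ["node_state", "capacity", "errors"]
```

Returning the field list lets a caller such as `save_health_data` extend its own `update_fields` instead of guessing what changed.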
--- a/awx/main/models/inventory.py
+++ b/awx/main/models/inventory.py
@@ -63,7 +63,7 @@ class Inventory(CommonModelNameNotUnique, ResourceMixin, RelatedJobsMixin):
    an inventory source contains lists and hosts.
    """

-    FIELDS_TO_PRESERVE_AT_COPY = ['hosts', 'groups', 'instance_groups']
+    FIELDS_TO_PRESERVE_AT_COPY = ['hosts', 'groups', 'instance_groups', 'prevent_instance_group_fallback']
    KIND_CHOICES = [
        ('', _('Hosts have a direct link to this inventory.')),
        ('smart', _('Hosts for inventory generated using the host_filter property.')),
@@ -175,6 +175,16 @@ class Inventory(CommonModelNameNotUnique, ResourceMixin, RelatedJobsMixin):
        related_name='inventory_labels',
        help_text=_('Labels associated with this inventory.'),
    )
+    prevent_instance_group_fallback = models.BooleanField(
+        default=False,
+        help_text=(
+            "If enabled, the inventory will prevent adding any organization "
+            "instance groups to the list of preferred instance groups to run "
+            "associated job templates on. "
+            "If this setting is enabled and you provided an empty list, the global instance "
+            "groups will be applied."
+        ),
+    )

    def get_absolute_url(self, request=None):
        return reverse('api:inventory_detail', kwargs={'pk': self.pk}, request=request)
@@ -236,6 +246,12 @@ class Inventory(CommonModelNameNotUnique, ResourceMixin, RelatedJobsMixin):
            raise ParseError(_('Slice number must be 1 or higher.'))
        return (number, step)

+    def get_sliced_hosts(self, host_queryset, slice_number, slice_count):
+        if slice_count > 1 and slice_number > 0:
+            offset = slice_number - 1
+            host_queryset = host_queryset[offset::slice_count]
+        return host_queryset
+
    def get_script_data(self, hostvars=False, towervars=False, show_all=False, slice_number=1, slice_count=1):
        hosts_kw = dict()
        if not show_all:
@@ -243,10 +259,8 @@ class Inventory(CommonModelNameNotUnique, ResourceMixin, RelatedJobsMixin):
        fetch_fields = ['name', 'id', 'variables', 'inventory_id']
        if towervars:
            fetch_fields.append('enabled')
-        hosts = self.hosts.filter(**hosts_kw).order_by('name').only(*fetch_fields)
-        if slice_count > 1 and slice_number > 0:
-            offset = slice_number - 1
-            hosts = hosts[offset::slice_count]
+        host_queryset = self.hosts.filter(**hosts_kw).order_by('name').only(*fetch_fields)
+        hosts = self.get_sliced_hosts(host_queryset, slice_number, slice_count)

        data = dict()
        all_group = data.setdefault('all', dict())
@@ -337,9 +351,12 @@ class Inventory(CommonModelNameNotUnique, ResourceMixin, RelatedJobsMixin):
        else:
            active_inventory_sources = self.inventory_sources.filter(source__in=CLOUD_INVENTORY_SOURCES)
        failed_inventory_sources = active_inventory_sources.filter(last_job_failed=True)
+        total_hosts = active_hosts.count()
+        # if total_hosts has changed, set update_task_impact to True
+        update_task_impact = total_hosts != self.total_hosts
        computed_fields = {
            'has_active_failures': bool(failed_hosts.count()),
-            'total_hosts': active_hosts.count(),
+            'total_hosts': total_hosts,
            'hosts_with_active_failures': failed_hosts.count(),
            'total_groups': active_groups.count(),
            'has_inventory_sources': bool(active_inventory_sources.count()),
@@ -357,6 +374,14 @@ class Inventory(CommonModelNameNotUnique, ResourceMixin, RelatedJobsMixin):
                computed_fields.pop(field)
        if computed_fields:
            iobj.save(update_fields=computed_fields.keys())
+        if update_task_impact:
+            # if total hosts count has changed, re-calculate task_impact for any
+            # job that is still in pending for this inventory, since task_impact
+            # is cached on task creation and used in task management system
+            tasks = self.jobs.filter(status="pending")
+            for t in tasks:
+                t.task_impact = t._get_task_impact()
+            UnifiedJob.objects.bulk_update(tasks, ['task_impact'])
        logger.debug("Finished updating inventory computed fields, pk={0}, in " "{1:.3f} seconds".format(self.pk, time.time() - start_time))

    def websocket_emit_status(self, status):
@@ -985,22 +1010,11 @@ class InventorySource(UnifiedJobTemplate, InventorySourceOptions, CustomVirtualE
        default=None,
        null=True,
    )
-    scm_last_revision = models.CharField(
-        max_length=1024,
-        blank=True,
-        default='',
-        editable=False,
-    )
-    update_on_project_update = models.BooleanField(
-        default=False,
-        help_text=_(
-            'This field is deprecated and will be removed in a future release. '
-            'In future release, functionality will be migrated to source project update_on_launch.'
-        ),
-    )
+
    update_on_launch = models.BooleanField(
        default=False,
    )
+
    update_cache_timeout = models.PositiveIntegerField(
        default=0,
    )
@@ -1038,14 +1052,6 @@ class InventorySource(UnifiedJobTemplate, InventorySourceOptions, CustomVirtualE
                self.name = 'inventory source (%s)' % replace_text
            if 'name' not in update_fields:
                update_fields.append('name')
-        # Reset revision if SCM source has changed parameters
-        if self.source == 'scm' and not is_new_instance:
-            before_is = self.__class__.objects.get(pk=self.pk)
-            if before_is.source_path != self.source_path or before_is.source_project_id != self.source_project_id:
-                # Reset the scm_revision if file changed to force update
-                self.scm_last_revision = ''
-                if 'scm_last_revision' not in update_fields:
-                    update_fields.append('scm_last_revision')

        # Do the actual save.
        super(InventorySource, self).save(*args, **kwargs)
@@ -1054,10 +1060,6 @@ class InventorySource(UnifiedJobTemplate, InventorySourceOptions, CustomVirtualE
        if replace_text in self.name:
            self.name = self.name.replace(replace_text, str(self.pk))
            super(InventorySource, self).save(update_fields=['name'])
-        if self.source == 'scm' and is_new_instance and self.update_on_project_update:
-            # Schedule a new Project update if one is not already queued
-            if self.source_project and not self.source_project.project_updates.filter(status__in=['new', 'pending', 'waiting']).exists():
-                self.update()
        if not getattr(_inventory_updates, 'is_updating', False):
            if self.inventory is not None:
                self.inventory.update_computed_fields()
@@ -1147,25 +1149,6 @@ class InventorySource(UnifiedJobTemplate, InventorySourceOptions, CustomVirtualE
            )
        return dict(error=list(error_notification_templates), started=list(started_notification_templates), success=list(success_notification_templates))

-    def clean_update_on_project_update(self):
-        if (
-            self.update_on_project_update is True
-            and self.source == 'scm'
-            and InventorySource.objects.filter(Q(inventory=self.inventory, update_on_project_update=True, source='scm') & ~Q(id=self.id)).exists()
-        ):
-            raise ValidationError(_("More than one SCM-based inventory source with update on project update per-inventory not allowed."))
-        return self.update_on_project_update
-
-    def clean_update_on_launch(self):
-        if self.update_on_project_update is True and self.source == 'scm' and self.update_on_launch is True:
-            raise ValidationError(
-                _(
-                    "Cannot update SCM-based inventory source on launch if set to update on project update. "
-                    "Instead, configure the corresponding source project to update on launch."
-                )
-            )
-        return self.update_on_launch
-
    def clean_source_path(self):
        if self.source != 'scm' and self.source_path:
            raise ValidationError(_("Cannot set source_path if not SCM type."))
@@ -1218,6 +1201,14 @@ class InventoryUpdate(UnifiedJob, InventorySourceOptions, JobNotificationMixin,
        default=None,
        null=True,
    )
+    scm_revision = models.CharField(
+        max_length=1024,
+        blank=True,
+        default='',
+        editable=False,
+        verbose_name=_('SCM Revision'),
+        help_text=_('The SCM Revision from the Project used for this inventory update. Only applicable to inventories sourced from SCM.'),
+    )

    @property
    def is_container_group_task(self):
@@ -1262,8 +1253,7 @@ class InventoryUpdate(UnifiedJob, InventorySourceOptions, JobNotificationMixin,
            return UnpartitionedInventoryUpdateEvent
        return InventoryUpdateEvent

-    @property
-    def task_impact(self):
+    def _get_task_impact(self):
        return 1

    # InventoryUpdate credential required
@@ -1288,26 +1278,23 @@ class InventoryUpdate(UnifiedJob, InventorySourceOptions, JobNotificationMixin,

    @property
    def preferred_instance_groups(self):
-        if self.inventory_source.inventory is not None and self.inventory_source.inventory.organization is not None:
-            organization_groups = [x for x in self.inventory_source.inventory.organization.instance_groups.all()]
-        else:
-            organization_groups = []
+        selected_groups = []
        if self.inventory_source.inventory is not None:
-            inventory_groups = [x for x in self.inventory_source.inventory.instance_groups.all()]
-        else:
-            inventory_groups = []
-        selected_groups = inventory_groups + organization_groups
+            # Add the inventory's IGs to the selected IGs first
+            for instance_group in self.inventory_source.inventory.instance_groups.all():
+                selected_groups.append(instance_group)
+            # If the inventory allows fallback and we have an organization, also append the org's IGs to the end of the list
+            if (
+                not getattr(self.inventory_source.inventory, 'prevent_instance_group_fallback', False)
+                and self.inventory_source.inventory.organization is not None
+            ):
+                for instance_group in self.inventory_source.inventory.organization.instance_groups.all():
+                    selected_groups.append(instance_group)
+
        if not selected_groups:
            return self.global_instance_groups
        return selected_groups

-    def cancel(self, job_explanation=None, is_chain=False):
-        res = super(InventoryUpdate, self).cancel(job_explanation=job_explanation, is_chain=is_chain)
-        if res:
-            if self.launch_type != 'scm' and self.source_project_update:
-                self.source_project_update.cancel(job_explanation=job_explanation)
-        return res
-

 class CustomInventoryScript(CommonModelNameNotUnique, ResourceMixin):
    class Meta:
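
Review note on `prevent_instance_group_fallback`: the flag stops the walk down the chain of instance group sources (job template, then inventory, then organization), and the global instance groups are used only when nothing was collected. The sketch below shows that ordering under simplified assumptions. It is not the AWX code: `preferred_groups` is a hypothetical helper, the sources are plain `SimpleNamespace` objects, and `instance_groups` is a plain list rather than a Django related manager.

```python
from types import SimpleNamespace


def preferred_groups(sources, global_groups):
    """Collect instance groups from ordered sources, honoring the fallback flag."""
    selected = []
    for source in sources:
        if source is None:
            continue  # e.g. a job with no inventory or organization
        selected.extend(source.instance_groups)
        # A source with the flag set blocks every source after it
        if getattr(source, "prevent_instance_group_fallback", False):
            break
    # An empty result falls back to the global instance groups
    return selected or list(global_groups)


# Hypothetical example objects
template = SimpleNamespace(instance_groups=["jt-ig"], prevent_instance_group_fallback=True)
inventory = SimpleNamespace(instance_groups=["inv-ig"], prevent_instance_group_fallback=False)
organization = SimpleNamespace(instance_groups=["org-ig"])

blocked = preferred_groups([template, inventory, organization], ["default"])
```

With the flag set on the template, `blocked` contains only the template's group. If the template's list were empty, the result would be the global groups, which matches the help text on the new field.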
--- a/awx/main/models/jobs.py
+++ b/awx/main/models/jobs.py
@@ -43,8 +43,8 @@ from awx.main.models.notifications import (
    NotificationTemplate,
    JobNotificationMixin,
 )
-from awx.main.utils import parse_yaml_or_json, getattr_dne, NullablePromptPseudoField
-from awx.main.fields import ImplicitRoleField, AskForField, JSONBlob
+from awx.main.utils import parse_yaml_or_json, getattr_dne, NullablePromptPseudoField, polymorphic
+from awx.main.fields import ImplicitRoleField, AskForField, JSONBlob, OrderedManyToManyField
 from awx.main.models.mixins import (
    ResourceMixin,
    SurveyJobTemplateMixin,
@@ -203,7 +203,7 @@ class JobTemplate(UnifiedJobTemplate, JobOptions, SurveyJobTemplateMixin, Resour
    playbook) to an inventory source with a given credential.
    """

-    FIELDS_TO_PRESERVE_AT_COPY = ['labels', 'instance_groups', 'credentials', 'survey_spec']
+    FIELDS_TO_PRESERVE_AT_COPY = ['labels', 'instance_groups', 'credentials', 'survey_spec', 'prevent_instance_group_fallback']
    FIELDS_TO_DISCARD_AT_COPY = ['vault_credential', 'credential']
    SOFT_UNIQUE_TOGETHER = [('polymorphic_ctype', 'name', 'organization')]

@@ -227,15 +227,6 @@ class JobTemplate(UnifiedJobTemplate, JobOptions, SurveyJobTemplateMixin, Resour
        blank=True,
        default=False,
    )
-    ask_limit_on_launch = AskForField(
-        blank=True,
-        default=False,
-    )
-    ask_tags_on_launch = AskForField(blank=True, default=False, allows_field='job_tags')
-    ask_skip_tags_on_launch = AskForField(
-        blank=True,
-        default=False,
-    )
    ask_job_type_on_launch = AskForField(
        blank=True,
        default=False,
@@ -244,12 +235,27 @@ class JobTemplate(UnifiedJobTemplate, JobOptions, SurveyJobTemplateMixin, Resour
        blank=True,
        default=False,
    )
-    ask_inventory_on_launch = AskForField(
+    ask_credential_on_launch = AskForField(blank=True, default=False, allows_field='credentials')
+    ask_execution_environment_on_launch = AskForField(
+        blank=True,
+        default=False,
+    )
+    ask_forks_on_launch = AskForField(
+        blank=True,
+        default=False,
+    )
+    ask_job_slice_count_on_launch = AskForField(
+        blank=True,
+        default=False,
+    )
+    ask_timeout_on_launch = AskForField(
+        blank=True,
+        default=False,
+    )
+    ask_instance_groups_on_launch = AskForField(
        blank=True,
        default=False,
    )
-    ask_credential_on_launch = AskForField(blank=True, default=False, allows_field='credentials')
-    ask_scm_branch_on_launch = AskForField(blank=True, default=False, allows_field='scm_branch')
    job_slice_count = models.PositiveIntegerField(
        blank=True,
        default=1,
@@ -268,6 +274,15 @@ class JobTemplate(UnifiedJobTemplate, JobOptions, SurveyJobTemplateMixin, Resour
            'admin_role',
        ],
    )
+    prevent_instance_group_fallback = models.BooleanField(
+        default=False,
+        help_text=(
+            "If enabled, the job template will prevent adding any inventory or organization "
+            "instance groups to the list of preferred instance groups to run on. "
+            "If this setting is enabled and you provided an empty list, the global instance "
+            "groups will be applied."
+        ),
+    )

    @classmethod
    def _get_unified_job_class(cls):
@@ -276,7 +291,17 @@ class JobTemplate(UnifiedJobTemplate, JobOptions, SurveyJobTemplateMixin, Resour
    @classmethod
    def _get_unified_job_field_names(cls):
        return set(f.name for f in JobOptions._meta.fields) | set(
-            ['name', 'description', 'organization', 'survey_passwords', 'labels', 'credentials', 'job_slice_number', 'job_slice_count', 'execution_environment']
+            [
+                'name',
+                'description',
+                'organization',
+                'survey_passwords',
+                'labels',
+                'credentials',
+                'job_slice_number',
+                'job_slice_count',
+                'execution_environment',
+            ]
        )

    @property
@@ -314,10 +339,13 @@ class JobTemplate(UnifiedJobTemplate, JobOptions, SurveyJobTemplateMixin, Resour
        actual_inventory = self.inventory
        if self.ask_inventory_on_launch and 'inventory' in kwargs:
            actual_inventory = kwargs['inventory']
+        actual_slice_count = self.job_slice_count
+        if self.ask_job_slice_count_on_launch and 'job_slice_count' in kwargs:
+            actual_slice_count = kwargs['job_slice_count']
        if actual_inventory:
-            return min(self.job_slice_count, actual_inventory.hosts.count())
+            return min(actual_slice_count, actual_inventory.hosts.count())
        else:
-            return self.job_slice_count
+            return actual_slice_count

    def save(self, *args, **kwargs):
        update_fields = kwargs.get('update_fields', [])
@@ -425,10 +453,15 @@ class JobTemplate(UnifiedJobTemplate, JobOptions, SurveyJobTemplateMixin, Resour

            field = self._meta.get_field(field_name)
            if isinstance(field, models.ManyToManyField):
-                old_value = set(old_value.all())
-                new_value = set(kwargs[field_name]) - old_value
-                if not new_value:
-                    continue
+                if field_name == 'instance_groups':
+                    # Instance groups are ordered so we can't make a set out of them
+                    old_value = old_value.all()
+                elif field_name == 'credentials':
+                    # Credentials have a weird pattern because of how they are layered
+                    old_value = set(old_value.all())
+                    new_value = set(kwargs[field_name]) - old_value
+                    if not new_value:
+                        continue

            if new_value == old_value:
                # no-op case: Fields the same as template's value
@@ -449,6 +482,10 @@ class JobTemplate(UnifiedJobTemplate, JobOptions, SurveyJobTemplateMixin, Resour
                        rejected_data[field_name] = new_value
                        errors_dict[field_name] = _('Project does not allow override of branch.')
                        continue
+                elif field_name == 'job_slice_count' and (new_value > 1) and (self.get_effective_slice_ct(kwargs) <= 1):
+                    rejected_data[field_name] = new_value
+                    errors_dict[field_name] = _('Job inventory does not have enough hosts for slicing')
+                    continue
                # accepted prompt
                prompted_data[field_name] = new_value
            else:
@@ -600,6 +637,19 @@ class Job(UnifiedJob, JobOptions, SurveyJobMixin, JobNotificationMixin, TaskMana
    def get_ui_url(self):
        return urljoin(settings.TOWER_URL_BASE, "/#/jobs/playbook/{}".format(self.pk))

+    def _set_default_dependencies_processed(self):
+        """
+        This sets the initial value of dependencies_processed
+        and here we use this as a shortcut to avoid the DependencyManager for jobs that do not need it
+        """
+        if (not self.project) or self.project.scm_update_on_launch:
+            self.dependencies_processed = False
+        elif (not self.inventory) or self.inventory.inventory_sources.filter(update_on_launch=True).exists():
+            self.dependencies_processed = False
+        else:
+            # No dependencies to process
+            self.dependencies_processed = True
+
    @property
    def event_class(self):
        if self.has_unpartitioned_events:
@@ -644,8 +694,7 @@ class Job(UnifiedJob, JobOptions, SurveyJobMixin, JobNotificationMixin, TaskMana
            raise ParseError(_('{status_value} is not a valid status option.').format(status_value=status))
        return self._get_hosts(**kwargs)

-    @property
-    def task_impact(self):
+    def _get_task_impact(self):
        if self.launch_type == 'callback':
            count_hosts = 2
        else:
@@ -743,25 +792,27 @@ class Job(UnifiedJob, JobOptions, SurveyJobMixin, JobNotificationMixin, TaskMana
            return "$hidden due to Ansible no_log flag$"
        return artifacts

+    def get_effective_artifacts(self, **kwargs):
+        """Return unified job artifacts (from set_stats) to pass downstream in workflows"""
+        if isinstance(self.artifacts, dict):
+            return self.artifacts
+        return {}
+
    @property
    def is_container_group_task(self):
        return bool(self.instance_group and self.instance_group.is_container_group)

    @property
    def preferred_instance_groups(self):
-        if self.organization is not None:
-            organization_groups = [x for x in self.organization.instance_groups.all()]
-        else:
-            organization_groups = []
-        if self.inventory is not None:
-            inventory_groups = [x for x in self.inventory.instance_groups.all()]
-        else:
-            inventory_groups = []
-        if self.job_template is not None:
-            template_groups = [x for x in self.job_template.instance_groups.all()]
-        else:
-            template_groups = []
-        selected_groups = template_groups + inventory_groups + organization_groups
+        # If the user specified instance groups those will be handled by the unified_job.create_unified_job
+        # This function handles only the defaults for a template w/o user specification
+        selected_groups = []
+        for obj_type in ['job_template', 'inventory', 'organization']:
+            if getattr(self, obj_type) is not None:
+                for instance_group in getattr(self, obj_type).instance_groups.all():
+                    selected_groups.append(instance_group)
+                if getattr(getattr(self, obj_type), 'prevent_instance_group_fallback', False):
+                    break
        if not selected_groups:
            return self.global_instance_groups
        return selected_groups
@@ -796,7 +847,8 @@ class Job(UnifiedJob, JobOptions, SurveyJobMixin, JobNotificationMixin, TaskMana
    def _get_inventory_hosts(self, only=['name', 'ansible_facts', 'ansible_facts_modified', 'modified', 'inventory_id']):
        if not self.inventory:
            return []
-        return self.inventory.hosts.only(*only)
+        host_queryset = self.inventory.hosts.only(*only)
+        return self.inventory.get_sliced_hosts(host_queryset, self.job_slice_number, self.job_slice_count)

    def start_job_fact_cache(self, destination, modification_times, timeout=None):
        self.log_lifecycle("start_job_fact_cache")
@@ -841,7 +893,7 @@ class Job(UnifiedJob, JobOptions, SurveyJobMixin, JobNotificationMixin, TaskMana
                            continue
                        host.ansible_facts = ansible_facts
                        host.ansible_facts_modified = now()
-                        host.save()
+                        host.save(update_fields=['ansible_facts', 'ansible_facts_modified'])
                        system_tracking_logger.info(
                            'New fact for inventory {} host {}'.format(smart_str(host.inventory.name), smart_str(host.name)),
                            extra=dict(
@@ -887,10 +939,36 @@ class LaunchTimeConfigBase(BaseModel):
    # This is a solution to the nullable CharField problem, specific to prompting
    char_prompts = JSONBlob(default=dict, blank=True)

-    def prompts_dict(self, display=False):
+    # Define fields that are not really fields, but alias to char_prompts lookups
+    limit = NullablePromptPseudoField('limit')
+    scm_branch = NullablePromptPseudoField('scm_branch')
+    job_tags = NullablePromptPseudoField('job_tags')
+    skip_tags = NullablePromptPseudoField('skip_tags')
+    diff_mode = NullablePromptPseudoField('diff_mode')
+    job_type = NullablePromptPseudoField('job_type')
+    verbosity = NullablePromptPseudoField('verbosity')
+    forks = NullablePromptPseudoField('forks')
+    job_slice_count = NullablePromptPseudoField('job_slice_count')
+    timeout = NullablePromptPseudoField('timeout')
+
+    # NOTE: additional fields are assumed to exist but must be defined in subclasses
+    # due to technical limitations
+    SUBCLASS_FIELDS = (
+        'instance_groups',  # needs a through model defined
+        'extra_vars',  # alternates between extra_vars and extra_data
+        'credentials',  # already a unified job and unified JT field
+        'labels',  # already a unified job and unified JT field
+        'execution_environment',  # already a unified job and unified JT field
+    )
+
+    def prompts_dict(self, display=False, for_cls=None):
        data = {}
+        if for_cls:
+            cls = for_cls
+        else:
+            cls = JobTemplate
        # Some types may have different prompts, but always subset of JT prompts
-        for prompt_name in JobTemplate.get_ask_mapping().keys():
+        for prompt_name in cls.get_ask_mapping().keys():
            try:
                field = self._meta.get_field(prompt_name)
            except FieldDoesNotExist:
@@ -898,18 +976,23 @@ class LaunchTimeConfigBase(BaseModel):
            if isinstance(field, models.ManyToManyField):
                if not self.pk:
                    continue  # unsaved object can't have related many-to-many
-                prompt_val = set(getattr(self, prompt_name).all())
-                if len(prompt_val) > 0:
-                    data[prompt_name] = prompt_val
+                prompt_values = list(getattr(self, prompt_name).all())
+                # Many to manys can't distinguish between None and []
+                # Because of this, from a config perspective, we assume [] is none and we don't save [] into the config
+                if len(prompt_values) > 0:
+                    data[prompt_name] = prompt_values
            elif prompt_name == 'extra_vars':
                if self.extra_vars:
+                    extra_vars = {}
                    if display:
-                        data[prompt_name] = self.display_extra_vars()
+                        extra_vars = self.display_extra_vars()
                    else:
-                        data[prompt_name] = self.extra_vars
+                        extra_vars = self.extra_vars
                    # Depending on model, field type may save and return as string
-                    if isinstance(data[prompt_name], str):
-                        data[prompt_name] = parse_yaml_or_json(data[prompt_name])
+                    if isinstance(extra_vars, str):
+                        extra_vars = parse_yaml_or_json(extra_vars)
+                    if extra_vars:
+                        data['extra_vars'] = extra_vars
                if self.survey_passwords and not display:
                    data['survey_passwords'] = self.survey_passwords
            else:
@@ -919,15 +1002,6 @@ class LaunchTimeConfigBase(BaseModel):
        return data


-for field_name in JobTemplate.get_ask_mapping().keys():
-    if field_name == 'extra_vars':
-        continue
-    try:
-        LaunchTimeConfigBase._meta.get_field(field_name)
-    except FieldDoesNotExist:
-        setattr(LaunchTimeConfigBase, field_name, NullablePromptPseudoField(field_name))
-
-
 class LaunchTimeConfig(LaunchTimeConfigBase):
    """
    Common model for all objects that save details of a saved launch config
@@ -946,8 +1020,18 @@ class LaunchTimeConfig(LaunchTimeConfigBase):
            blank=True,
        )
    )
-    # Credentials needed for non-unified job / unified JT models
+    # Fields needed for non-unified job / unified JT models, because they are defined on unified models
    credentials = models.ManyToManyField('Credential', related_name='%(class)ss')
+    labels = models.ManyToManyField('Label', related_name='%(class)s_labels')
+    execution_environment = models.ForeignKey(
+        'ExecutionEnvironment',
+        null=True,
+        blank=True,
+        default=None,
+        on_delete=polymorphic.SET_NULL,
+        related_name='%(class)s_as_prompt',
+        help_text="The container image to be used for execution.",
+    )

    @property
    def extra_vars(self):
@@ -991,6 +1075,11 @@ class JobLaunchConfig(LaunchTimeConfig):
        editable=False,
    )

+    # Instance Groups needed for non-unified job / unified JT models
+    instance_groups = OrderedManyToManyField(
+        'InstanceGroup', related_name='%(class)ss', blank=True, editable=False, through='JobLaunchConfigInstanceGroupMembership'
+    )
+
    def has_user_prompts(self, template):
        """
        Returns True if any fields exist in the launch config that are
@@ -1207,6 +1296,9 @@ class SystemJob(UnifiedJob, SystemJobOptions, JobNotificationMixin):

    extra_vars_dict = VarsDictProperty('extra_vars', True)

+    def _set_default_dependencies_processed(self):
+        self.dependencies_processed = True
+
    @classmethod
    def _get_parent_field_name(cls):
        return 'system_job_template'
@@ -1232,8 +1324,7 @@ class SystemJob(UnifiedJob, SystemJobOptions, JobNotificationMixin):
            return UnpartitionedSystemJobEvent
        return SystemJobEvent

-    @property
-    def task_impact(self):
+    def _get_task_impact(self):
        return 5

    @property
--- a/awx/main/models/label.py
+++ b/awx/main/models/label.py
@@ -10,6 +10,8 @@ from awx.api.versioning import reverse
 from awx.main.models.base import CommonModelNameNotUnique
 from awx.main.models.unified_jobs import UnifiedJobTemplate, UnifiedJob
 from awx.main.models.inventory import Inventory
+from awx.main.models.schedules import Schedule
+from awx.main.models.workflow import WorkflowJobTemplateNode, WorkflowJobNode

 __all__ = ('Label',)

@@ -34,16 +36,22 @@ class Label(CommonModelNameNotUnique):
    def get_absolute_url(self, request=None):
        return reverse('api:label_detail', kwargs={'pk': self.pk}, request=request)

-    @staticmethod
-    def get_orphaned_labels():
-        return Label.objects.filter(organization=None, unifiedjobtemplate_labels__isnull=True, inventory_labels__isnull=True)
-
    def is_detached(self):
-        return Label.objects.filter(id=self.id, unifiedjob_labels__isnull=True, unifiedjobtemplate_labels__isnull=True, inventory_labels__isnull=True).exists()
+        return Label.objects.filter(
+            id=self.id,
+            unifiedjob_labels__isnull=True,
+            unifiedjobtemplate_labels__isnull=True,
+            inventory_labels__isnull=True,
+            schedule_labels__isnull=True,
+            workflowjobtemplatenode_labels__isnull=True,
+            workflowjobnode_labels__isnull=True,
+        ).exists()

    def is_candidate_for_detach(self):
-
-        c1 = UnifiedJob.objects.filter(labels__in=[self.id]).count()
-        c2 = UnifiedJobTemplate.objects.filter(labels__in=[self.id]).count()
-        c3 = Inventory.objects.filter(labels__in=[self.id]).count()
-        return (c1 + c2 + c3 - 1) == 0
+        count = UnifiedJob.objects.filter(labels__in=[self.id]).count()  # Both Jobs and WFJobs
+        count += UnifiedJobTemplate.objects.filter(labels__in=[self.id]).count()  # Both JTs and WFJTs
+        count += Inventory.objects.filter(labels__in=[self.id]).count()
+        count += Schedule.objects.filter(labels__in=[self.id]).count()
+        count += WorkflowJobTemplateNode.objects.filter(labels__in=[self.id]).count()
+        count += WorkflowJobNode.objects.filter(labels__in=[self.id]).count()
+        return (count - 1) == 0
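
Note on the label.py change above: the detach check reduces to "exactly one object, across all tracked relations, still references the label". A minimal standalone sketch of that arithmetic follows. The function name mirrors the method, but the `reference_counts` dictionary and its keys are hypothetical stand-ins for the ORM `count()` queries, not AWX code.

```python
def is_candidate_for_detach(reference_counts):
    """Return True when exactly one object in total references the label.

    reference_counts is a hypothetical mapping of relation name to the
    number of objects of that kind that carry the label.
    """
    total = sum(reference_counts.values())
    return (total - 1) == 0


# Only a single job template uses the label, so removing it there
# would leave the label unreferenced.
only_one_user = {
    "unified_jobs": 0,
    "unified_job_templates": 1,
    "inventories": 0,
    "schedules": 0,
    "workflow_job_template_nodes": 0,
    "workflow_job_nodes": 0,
}

# A schedule also uses the label, so it is still referenced elsewhere.
two_users = dict(only_one_user, schedules=1)

print(is_candidate_for_detach(only_one_user))
print(is_candidate_for_detach(two_users))
```

The point of the added relations in the diff is that a label referenced only by a schedule or a workflow node is now counted, so it is no longer treated as detachable too early.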
--- a/awx/main/models/mixins.py
+++ b/awx/main/models/mixins.py
@@ -104,6 +104,33 @@ class SurveyJobTemplateMixin(models.Model):
        default=False,
    )
    survey_spec = prevent_search(JSONBlob(default=dict, blank=True))
+
+    ask_inventory_on_launch = AskForField(
+        blank=True,
+        default=False,
+    )
+    ask_limit_on_launch = AskForField(
+        blank=True,
+        default=False,
+    )
+    ask_scm_branch_on_launch = AskForField(
+        blank=True,
+        default=False,
+        allows_field='scm_branch',
+    )
+    ask_labels_on_launch = AskForField(
+        blank=True,
+        default=False,
+    )
+    ask_tags_on_launch = AskForField(
+        blank=True,
+        default=False,
+        allows_field='job_tags',
+    )
+    ask_skip_tags_on_launch = AskForField(
+        blank=True,
+        default=False,
+    )
    ask_variables_on_launch = AskForField(blank=True, default=False, allows_field='extra_vars')

    def survey_password_variables(self):
@@ -412,6 +439,11 @@ class TaskManagerJobMixin(TaskManagerUnifiedJobMixin):
    class Meta:
        abstract = True

+    def get_jobs_fail_chain(self):
+        if self.project_update_id:
+            return [self.project_update]
+        return []
+

 class TaskManagerUpdateOnLaunchMixin(TaskManagerUnifiedJobMixin):
    class Meta:
--- a/awx/main/models/notifications.py
+++ b/awx/main/models/notifications.py
@@ -408,6 +408,7 @@ class JobNotificationMixin(object):
 'inventory': 'Stub Inventory',
 'id': 42,
 'hosts': {},
+ 'extra_vars': {},
 'friendly_name': 'Job',
 'finished': False,
 'credential': 'Stub credential',
--- a/awx/main/models/organization.py
+++ b/awx/main/models/organization.py
@@ -114,13 +114,6 @@ class Organization(CommonModel, NotificationFieldsModel, ResourceMixin, CustomVi
    def _get_related_jobs(self):
        return UnifiedJob.objects.non_polymorphic().filter(organization=self)

-    def create_default_galaxy_credential(self):
-        from awx.main.models import Credential
-
-        public_galaxy_credential = Credential.objects.filter(managed=True, name='Ansible Galaxy').first()
-        if public_galaxy_credential is not None and public_galaxy_credential not in self.galaxy_credentials.all():
-            self.galaxy_credentials.add(public_galaxy_credential)
-

 class OrganizationGalaxyCredentialMembership(models.Model):

--- a/awx/main/models/projects.py
+++ b/awx/main/models/projects.py
@@ -284,6 +284,17 @@ class Project(UnifiedJobTemplate, ProjectOptions, ResourceMixin, CustomVirtualEn
        help_text=_('Allow changing the SCM branch or revision in a job template ' 'that uses this project.'),
    )

+    # credential (keys) used to validate content signature
+    signature_validation_credential = models.ForeignKey(
+        'Credential',
+        related_name='%(class)ss_signature_validation',
+        blank=True,
+        null=True,
+        default=None,
+        on_delete=models.SET_NULL,
+        help_text=_('An optional credential used for validating files in the project against unexpected changes.'),
+    )
+
    scm_revision = models.CharField(
        max_length=1024,
        blank=True,
@@ -513,6 +524,9 @@ class ProjectUpdate(UnifiedJob, ProjectOptions, JobNotificationMixin, TaskManage
        help_text=_('The SCM Revision discovered by this update for the given project and branch.'),
    )

+    def _set_default_dependencies_processed(self):
+        self.dependencies_processed = True
+
    def _get_parent_field_name(self):
        return 'project'

@@ -560,8 +574,7 @@ class ProjectUpdate(UnifiedJob, ProjectOptions, JobNotificationMixin, TaskManage
            return UnpartitionedProjectUpdateEvent
        return ProjectUpdateEvent

-    @property
-    def task_impact(self):
+    def _get_task_impact(self):
        return 0 if self.job_type == 'run' else 1

    @property
@@ -618,6 +631,10 @@ class ProjectUpdate(UnifiedJob, ProjectOptions, JobNotificationMixin, TaskManage
        added_update_fields = []
        if not self.job_tags:
            job_tags = ['update_{}'.format(self.scm_type), 'install_roles', 'install_collections']
+            if self.project.signature_validation_credential is not None:
+                credential_type = self.project.signature_validation_credential.credential_type.namespace
+                job_tags.append(f'validation_{credential_type}')
+                job_tags.append('validation_checksum_manifest')
            self.job_tags = ','.join(job_tags)
            added_update_fields.append('job_tags')
        if self.scm_delete_on_update and 'delete' not in self.job_tags and self.job_type == 'check':
--- a/awx/main/models/schedules.py
+++ b/awx/main/models/schedules.py
@@ -18,6 +18,7 @@ from django.utils.translation import gettext_lazy as _

 # AWX
 from awx.api.versioning import reverse
+from awx.main.fields import OrderedManyToManyField
 from awx.main.models.base import PrimordialModel
 from awx.main.models.jobs import LaunchTimeConfig
 from awx.main.utils import ignore_inventory_computed_fields
@@ -83,6 +84,13 @@ class Schedule(PrimordialModel, LaunchTimeConfig):
    )
    rrule = models.TextField(help_text=_("A value representing the schedules iCal recurrence rule."))
    next_run = models.DateTimeField(null=True, default=None, editable=False, help_text=_("The next time that the scheduled action will run."))
+    instance_groups = OrderedManyToManyField(
+        'InstanceGroup',
+        related_name='schedule_instance_groups',
+        blank=True,
+        editable=False,
+        through='ScheduleInstanceGroupMembership',
+    )

    @classmethod
    def get_zoneinfo(cls):
--- a/awx/main/models/unified_jobs.py
+++ b/awx/main/models/unified_jobs.py
@@ -45,7 +45,8 @@ from awx.main.utils.common import (
    get_type_for_model,
    parse_yaml_or_json,
    getattr_dne,
-    schedule_task_manager,
+    ScheduleDependencyManager,
+    ScheduleTaskManager,
    get_event_partition_epoch,
    get_capacity_type,
 )
@@ -331,10 +332,11 @@ class UnifiedJobTemplate(PolymorphicModel, CommonModelNameNotUnique, ExecutionEn

        return NotificationTemplate.objects.none()

-    def create_unified_job(self, **kwargs):
+    def create_unified_job(self, instance_groups=None, **kwargs):
        """
        Create a new unified job based on this unified job template.
        """
+        # TODO: rename kwargs to prompts, to set expectation that these are runtime values
        new_job_passwords = kwargs.pop('survey_passwords', {})
        eager_fields = kwargs.pop('_eager_fields', None)

@@ -381,6 +383,14 @@ class UnifiedJobTemplate(PolymorphicModel, CommonModelNameNotUnique, ExecutionEn
            unified_job.survey_passwords = new_job_passwords
            kwargs['survey_passwords'] = new_job_passwords  # saved in config object for relaunch

+        if instance_groups:
+            unified_job.preferred_instance_groups_cache = [ig.id for ig in instance_groups]
+        else:
+            unified_job.preferred_instance_groups_cache = unified_job._get_preferred_instance_group_cache()
+
+        unified_job._set_default_dependencies_processed()
+        unified_job.task_impact = unified_job._get_task_impact()
+
        from awx.main.signals import disable_activity_stream, activity_stream_create

        with disable_activity_stream():
@@ -406,13 +416,17 @@ class UnifiedJobTemplate(PolymorphicModel, CommonModelNameNotUnique, ExecutionEn
            unified_job.handle_extra_data(validated_kwargs['extra_vars'])

        # Create record of provided prompts for relaunch and rescheduling
-        unified_job.create_config_from_prompts(kwargs, parent=self)
+        config = unified_job.create_config_from_prompts(kwargs, parent=self)
+        if instance_groups:
+            for ig in instance_groups:
+                config.instance_groups.add(ig)

        # manually issue the create activity stream entry _after_ M2M relations
        # have been associated to the UJ
        if unified_job.__class__ in activity_stream_registrar.models:
            activity_stream_create(None, unified_job, True)
        unified_job.log_lifecycle("created")
+
        return unified_job

    @classmethod
@@ -533,7 +547,7 @@ class UnifiedJob(
        ('workflow', _('Workflow')),  # Job was started from a workflow job.
        ('webhook', _('Webhook')),  # Job was started from a webhook event.
        ('sync', _('Sync')),  # Job was started from a project sync.
-        ('scm', _('SCM Update')),  # Job was created as an Inventory SCM sync.
+        ('scm', _('SCM Update')),  # (deprecated) Job was created as an Inventory SCM sync.
    ]

    PASSWORD_FIELDS = ('start_args',)
@@ -693,6 +707,14 @@ class UnifiedJob(
        on_delete=polymorphic.SET_NULL,
        help_text=_('The Instance group the job was run under'),
    )
+    preferred_instance_groups_cache = models.JSONField(
+        blank=True,
+        null=True,
+        default=None,
+        editable=False,
+        help_text=_("A cached list with pk values from preferred instance groups."),
+    )
+    task_impact = models.PositiveIntegerField(default=0, editable=False, help_text=_("Number of forks an instance consumes when running this job."))
    organization = models.ForeignKey(
        'Organization',
        blank=True,
@@ -754,6 +776,9 @@ class UnifiedJob(
    def _get_parent_field_name(self):
        return 'unified_job_template'  # Override in subclasses.

+    def _get_preferred_instance_group_cache(self):
+        return [ig.pk for ig in self.preferred_instance_groups]
+
    @classmethod
    def _get_unified_job_template_class(cls):
        """
@@ -808,6 +833,9 @@ class UnifiedJob(
            update_fields = self._update_parent_instance_no_save(parent_instance)
            parent_instance.save(update_fields=update_fields)

+    def _set_default_dependencies_processed(self):
+        pass
+
    def save(self, *args, **kwargs):
        """Save the job, with current status, to the database.
        Ensure that all data is consistent before doing so.
@@ -821,7 +849,8 @@ class UnifiedJob(

        # If this job already exists in the database, retrieve a copy of
        # the job in its prior state.
-        if self.pk:
+        # If update_fields are given without status, then that indicates no change
+        if self.pk and ((not update_fields) or ('status' in update_fields)):
            self_before = self.__class__.objects.get(pk=self.pk)
            if self_before.status != self.status:
                status_before = self_before.status
@@ -952,22 +981,38 @@ class UnifiedJob(
            valid_fields.extend(['survey_passwords', 'extra_vars'])
        else:
            kwargs.pop('survey_passwords', None)
+        many_to_many_fields = []
        for field_name, value in kwargs.items():
            if field_name not in valid_fields:
                raise Exception('Unrecognized launch config field {}.'.format(field_name))
-            if field_name == 'credentials':
+            field = None
+            # may use extra_data as a proxy for extra_vars
+            if field_name in config.SUBCLASS_FIELDS and field_name != 'extra_vars':
+                field = config._meta.get_field(field_name)
+            if isinstance(field, models.ManyToManyField):
+                many_to_many_fields.append(field_name)
                continue
-            key = field_name
-            if key == 'extra_vars':
-                key = 'extra_data'
-            setattr(config, key, value)
+            if isinstance(field, (models.ForeignKey)) and (value is None):
+                continue  # the null value indicates not-provided for ForeignKey case
+            setattr(config, field_name, value)
        config.save()

-        job_creds = set(kwargs.get('credentials', []))
-        if 'credentials' in [field.name for field in parent._meta.get_fields()]:
-            job_creds = job_creds - set(parent.credentials.all())
-        if job_creds:
-            config.credentials.add(*job_creds)
+        for field_name in many_to_many_fields:
+            prompted_items = kwargs.get(field_name, [])
+            if not prompted_items:
+                continue
+            if field_name == 'instance_groups':
+                # Add items one at a time to preserve order for this ordered field;
+                # IGs are not merged with the parent, so this saves the literal list
+                for item in prompted_items:
+                    getattr(config, field_name).add(item)
+            else:
+                # Assuming this field merges prompts with parent, save just the diff
+                if field_name in [field.name for field in parent._meta.get_fields()]:
+                    prompted_items = set(prompted_items) - set(getattr(parent, field_name).all())
+                if prompted_items:
+                    getattr(config, field_name).add(*prompted_items)
+
        return config

    @property
@@ -1026,7 +1071,6 @@ class UnifiedJob(
            event_qs = self.get_event_queryset()
        except NotImplementedError:
            return True  # Model without events, such as WFJT
-        self.log_lifecycle("event_processing_finished")
        return self.emitted_events == event_qs.count()

    def result_stdout_raw_handle(self, enforce_max_bytes=True):
@@ -1204,6 +1248,10 @@ class UnifiedJob(
                pass
        return None

+    def get_effective_artifacts(self, **kwargs):
+        """Return unified job artifacts (from set_stats) to pass downstream in workflows"""
+        return {}
+
    def get_passwords_needed_to_start(self):
        return []

@@ -1237,9 +1285,8 @@ class UnifiedJob(
        except JobLaunchConfig.DoesNotExist:
            return False

-    @property
-    def task_impact(self):
-        raise NotImplementedError  # Implement in subclass.
+    def _get_task_impact(self):
+        return self.task_impact  # return default, should implement in subclass.

    def websocket_emit_data(self):
        '''Return extra data that should be included when submitting data to the browser over the websocket connection'''
@@ -1251,7 +1298,7 @@ class UnifiedJob(
    def _websocket_emit_status(self, status):
        try:
            status_data = dict(unified_job_id=self.id, status=status)
-            if status == 'waiting':
+            if status == 'running':
                if self.instance_group:
                    status_data['instance_group_name'] = self.instance_group.name
                else:
@@ -1354,7 +1401,10 @@ class UnifiedJob(
        self.update_fields(start_args=json.dumps(kwargs), status='pending')
        self.websocket_emit_status("pending")

-        schedule_task_manager()
+        if self.dependencies_processed:
+            ScheduleTaskManager().schedule()
+        else:
+            ScheduleDependencyManager().schedule()

        # Each type of unified job has a different Task class; get the
        # appropirate one.
@@ -1369,22 +1419,6 @@ class UnifiedJob(
        # Done!
        return True

-    @property
-    def actually_running(self):
-        # returns True if the job is running in the appropriate dispatcher process
-        running = False
-        if all([self.status == 'running', self.celery_task_id, self.execution_node]):
-            # If the job is marked as running, but the dispatcher
-            # doesn't know about it (or the dispatcher doesn't reply),
-            # then cancel the job
-            timeout = 5
-            try:
-                running = self.celery_task_id in ControlDispatcher('dispatcher', self.controller_node or self.execution_node).running(timeout=timeout)
-            except (socket.timeout, RuntimeError):
-                logger.error('could not reach dispatcher on {} within {}s'.format(self.execution_node, timeout))
-                running = False
-        return running
-
    @property
    def can_cancel(self):
        return bool(self.status in CAN_CANCEL)
@@ -1394,27 +1428,61 @@ class UnifiedJob(
            return 'Previous Task Canceled: {"job_type": "%s", "job_name": "%s", "job_id": "%s"}' % (self.model_to_str(), self.name, self.id)
        return None

+    def fallback_cancel(self):
+        if not self.celery_task_id:
+            self.refresh_from_db(fields=['celery_task_id'])
+        self.cancel_dispatcher_process()
+
+    def cancel_dispatcher_process(self):
+        """Returns True if dispatcher running this job acknowledged request and sent SIGTERM"""
+        if not self.celery_task_id:
+            return False
+        canceled = []
+        try:
+            # Use control and reply mechanism to cancel and obtain confirmation
+            timeout = 5
+            canceled = ControlDispatcher('dispatcher', self.controller_node).cancel([self.celery_task_id])
+        except socket.timeout:
+            logger.error(f'could not reach dispatcher on {self.controller_node} within {timeout}s')
+        except Exception:
+            logger.exception("error encountered when checking task status")
+        return bool(self.celery_task_id in canceled)  # True or False, whether confirmation was obtained
+
    def cancel(self, job_explanation=None, is_chain=False):
        if self.can_cancel:
            if not is_chain:
                for x in self.get_jobs_fail_chain():
                    x.cancel(job_explanation=self._build_job_explanation(), is_chain=True)

+            cancel_fields = []
            if not self.cancel_flag:
                self.cancel_flag = True
                self.start_args = ''  # blank field to remove encrypted passwords
-                cancel_fields = ['cancel_flag', 'start_args']
-                if self.status in ('pending', 'waiting', 'new'):
-                    self.status = 'canceled'
-                    cancel_fields.append('status')
-                if self.status == 'running' and not self.actually_running:
-                    self.status = 'canceled'
-                    cancel_fields.append('status')
+                cancel_fields.extend(['cancel_flag', 'start_args'])
+                connection.on_commit(lambda: self.websocket_emit_status("canceled"))
+
                if job_explanation is not None:
                    self.job_explanation = job_explanation
                    cancel_fields.append('job_explanation')
-                self.save(update_fields=cancel_fields)
-                self.websocket_emit_status("canceled")
+
+            controller_notified = False
+            if self.celery_task_id:
+                controller_notified = self.cancel_dispatcher_process()
+
+            else:
+                # Avoid race condition where we have stale model from pending state but job has already started,
+                # it's checking signal but not cancel_flag, so re-send signal after this database commit
+                connection.on_commit(self.fallback_cancel)
+
+            # If a SIGTERM signal was sent to the control process, and acked by the dispatcher
+            # then we want to let its own cleanup change status, otherwise change status now
+            if not controller_notified:
+                if self.status != 'canceled':
+                    self.status = 'canceled'
+                    cancel_fields.append('status')
+
+            self.save(update_fields=cancel_fields)
+
        return self.cancel_flag

    @property
@@ -1511,8 +1579,8 @@ class UnifiedJob(
            'state': state,
            'work_unit_id': self.work_unit_id,
        }
-        if self.unified_job_template:
-            extra["template_name"] = self.unified_job_template.name
+        if self.name:
+            extra["task_name"] = self.name
        if state == "blocked" and blocked_by:
            blocked_by_msg = f"{blocked_by._meta.model_name}-{blocked_by.id}"
            msg = f"{self._meta.model_name}-{self.id} blocked by {blocked_by_msg}"
@@ -1524,7 +1592,7 @@ class UnifiedJob(
            extra["controller_node"] = self.controller_node or "NOT_SET"
        elif state == "execution_node_chosen":
            extra["execution_node"] = self.execution_node or "NOT_SET"
-        logger_job_lifecycle.debug(msg, extra=extra)
+        logger_job_lifecycle.info(msg, extra=extra)

    @property
    def launched_by(self):
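
Note on the `create_config_from_prompts` change above: many-to-many prompts are saved under two different rules. Instance groups are stored as the literal ordered list, while other relations (credentials, labels) store only the items not already on the parent template. The sketch below illustrates that selection in plain Python under stated assumptions: `select_items_to_save` is a hypothetical helper, items are plain integers rather than model instances, and unlike the set difference in the diff it preserves input order so the output is easy to read.

```python
def select_items_to_save(field_name, prompted_items, parent_items=None):
    """Pick which prompted items would be recorded on the launch config.

    Hypothetical helper: parent_items is None when the parent template
    has no field with this name.
    """
    if not prompted_items:
        return []  # empty prompts are treated as "not provided"
    if field_name == "instance_groups":
        # Ordered field, never merged with the parent: keep the literal list.
        return list(prompted_items)
    if parent_items is None:
        return list(prompted_items)
    # Merging field: save only the difference from the parent.
    parent_set = set(parent_items)
    return [item for item in prompted_items if item not in parent_set]


print(select_items_to_save("instance_groups", [3, 1, 2], [1]))
print(select_items_to_save("credentials", [1, 2, 3], [2]))
```

Saving only the difference lets a relaunch pick up later changes to the parent's credentials or labels, while the literal instance group list keeps the user's chosen priority order.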
--- a/awx/main/models/workflow.py
+++ b/awx/main/models/workflow.py
@@ -13,6 +13,7 @@ from django.db import connection, models
 from django.conf import settings
 from django.utils.translation import gettext_lazy as _
 from django.core.exceptions import ObjectDoesNotExist
+from django.utils.timezone import now, timedelta

 # from django import settings as tower_settings

@@ -28,7 +29,7 @@ from awx.main.models import prevent_search, accepts_json, UnifiedJobTemplate, Un
 from awx.main.models.notifications import NotificationTemplate, JobNotificationMixin
 from awx.main.models.base import CreatedModifiedModel, VarsDictProperty
 from awx.main.models.rbac import ROLE_SINGLETON_SYSTEM_ADMINISTRATOR, ROLE_SINGLETON_SYSTEM_AUDITOR
-from awx.main.fields import ImplicitRoleField, AskForField, JSONBlob
+from awx.main.fields import ImplicitRoleField, JSONBlob, OrderedManyToManyField
 from awx.main.models.mixins import (
    ResourceMixin,
    SurveyJobTemplateMixin,
@@ -40,7 +41,7 @@ from awx.main.models.mixins import (
 from awx.main.models.jobs import LaunchTimeConfigBase, LaunchTimeConfig, JobTemplate
 from awx.main.models.credential import Credential
 from awx.main.redact import REPLACE_STR
-from awx.main.utils import schedule_task_manager
+from awx.main.utils import ScheduleWorkflowManager


 __all__ = [
@@ -113,6 +114,9 @@ class WorkflowNodeBase(CreatedModifiedModel, LaunchTimeConfig):
            'credentials',
            'char_prompts',
            'all_parents_must_converge',
+            'labels',
+            'instance_groups',
+            'execution_environment',
        ]

    def create_workflow_job_node(self, **kwargs):
@@ -121,7 +125,7 @@ class WorkflowNodeBase(CreatedModifiedModel, LaunchTimeConfig):
        """
        create_kwargs = {}
        for field_name in self._get_workflow_job_field_names():
-            if field_name == 'credentials':
+            if field_name in ['credentials', 'labels', 'instance_groups']:
                continue
            if field_name in kwargs:
                create_kwargs[field_name] = kwargs[field_name]
@@ -131,10 +135,20 @@ class WorkflowNodeBase(CreatedModifiedModel, LaunchTimeConfig):
        new_node = WorkflowJobNode.objects.create(**create_kwargs)
        if self.pk:
            allowed_creds = self.credentials.all()
+            allowed_labels = self.labels.all()
+            allowed_instance_groups = self.instance_groups.all()
        else:
            allowed_creds = []
+            allowed_labels = []
+            allowed_instance_groups = []
        for cred in allowed_creds:
            new_node.credentials.add(cred)
+
+        for label in allowed_labels:
+            new_node.labels.add(label)
+        for instance_group in allowed_instance_groups:
+            new_node.instance_groups.add(instance_group)
+
        return new_node


@@ -152,6 +166,9 @@ class WorkflowJobTemplateNode(WorkflowNodeBase):
        'char_prompts',
        'all_parents_must_converge',
        'identifier',
+        'labels',
+        'execution_environment',
+        'instance_groups',
    ]
    REENCRYPTION_BLOCKLIST_AT_COPY = ['extra_data', 'survey_passwords']

@@ -166,6 +183,13 @@ class WorkflowJobTemplateNode(WorkflowNodeBase):
        blank=False,
        help_text=_('An identifier for this node that is unique within its workflow. ' 'It is copied to workflow job nodes corresponding to this node.'),
    )
+    instance_groups = OrderedManyToManyField(
+        'InstanceGroup',
+        related_name='workflow_job_template_node_instance_groups',
+        blank=True,
+        editable=False,
+        through='WorkflowJobTemplateNodeBaseInstanceGroupMembership',
+    )

    class Meta:
        app_label = 'main'
@@ -210,7 +234,7 @@ class WorkflowJobTemplateNode(WorkflowNodeBase):
        approval_template = WorkflowApprovalTemplate(**kwargs)
        approval_template.save()
        self.unified_job_template = approval_template
-        self.save()
+        self.save(update_fields=['unified_job_template'])
        return approval_template


@@ -249,6 +273,9 @@ class WorkflowJobNode(WorkflowNodeBase):
        blank=True,  # blank denotes pre-migration job nodes
        help_text=_('An identifier coresponding to the workflow job template node that this node was created from.'),
    )
+    instance_groups = OrderedManyToManyField(
+        'InstanceGroup', related_name='workflow_job_node_instance_groups', blank=True, editable=False, through='WorkflowJobNodeBaseInstanceGroupMembership'
+    )

    class Meta:
        app_label = 'main'
@@ -264,19 +291,6 @@ class WorkflowJobNode(WorkflowNodeBase):
    def get_absolute_url(self, request=None):
        return reverse('api:workflow_job_node_detail', kwargs={'pk': self.pk}, request=request)

-    def prompts_dict(self, *args, **kwargs):
-        r = super(WorkflowJobNode, self).prompts_dict(*args, **kwargs)
-        # Explanation - WFJT extra_vars still break pattern, so they are not
-        # put through prompts processing, but inventory and others are only accepted
-        # if JT prompts for it, so it goes through this mechanism
-        if self.workflow_job:
-            if self.workflow_job.inventory_id:
-                # workflow job inventory takes precedence
-                r['inventory'] = self.workflow_job.inventory
-            if self.workflow_job.char_prompts:
-                r.update(self.workflow_job.char_prompts)
-        return r
-
    def get_job_kwargs(self):
        """
        In advance of creating a new unified job as part of a workflow,
@@ -286,16 +300,38 @@ class WorkflowJobNode(WorkflowNodeBase):
        """
        # reject/accept prompted fields
        data = {}
+        wj_special_vars = {}
+        wj_special_passwords = {}
        ujt_obj = self.unified_job_template
        if ujt_obj is not None:
-            # MERGE note: move this to prompts_dict method on node when merging
-            # with the workflow inventory branch
-            prompts_data = self.prompts_dict()
-            if isinstance(ujt_obj, WorkflowJobTemplate):
-                if self.workflow_job.extra_vars:
-                    prompts_data.setdefault('extra_vars', {})
-                    prompts_data['extra_vars'].update(self.workflow_job.extra_vars_dict)
-            accepted_fields, ignored_fields, errors = ujt_obj._accept_or_ignore_job_kwargs(**prompts_data)
+            node_prompts_data = self.prompts_dict(for_cls=ujt_obj.__class__)
+            wj_prompts_data = self.workflow_job.prompts_dict(for_cls=ujt_obj.__class__)
+            # Explanation - special historical case
+            # WFJT extra_vars ignored JobTemplate.ask_variables_on_launch, bypassing _accept_or_ignore_job_kwargs
+            # inventory and others are only accepted if JT prompts for it with related ask_ field
+            # this is inconsistent, but maintained
+            if not isinstance(ujt_obj, WorkflowJobTemplate):
+                wj_special_vars = wj_prompts_data.pop('extra_vars', {})
+                wj_special_passwords = wj_prompts_data.pop('survey_passwords', {})
+            elif 'extra_vars' in node_prompts_data:
+                # Follow the vars combination rules
+                node_prompts_data['extra_vars'].update(wj_prompts_data.pop('extra_vars', {}))
+            elif 'survey_passwords' in node_prompts_data:
+                node_prompts_data['survey_passwords'].update(wj_prompts_data.pop('survey_passwords', {}))
+
+            # Follow the credential combination rules
+            if ('credentials' in wj_prompts_data) and ('credentials' in node_prompts_data):
+                wj_pivoted_creds = Credential.unique_dict(wj_prompts_data['credentials'])
+                node_pivoted_creds = Credential.unique_dict(node_prompts_data['credentials'])
+                node_pivoted_creds.update(wj_pivoted_creds)
+                wj_prompts_data['credentials'] = [cred for cred in node_pivoted_creds.values()]
+
+            # NOTE: no special rules for instance_groups, because they do not merge
+            # or labels, because they do not propagate WFJT-->node at all
+
+            # Combine WFJT prompts with node here, WFJT at higher level
+            node_prompts_data.update(wj_prompts_data)
+            accepted_fields, ignored_fields, errors = ujt_obj._accept_or_ignore_job_kwargs(**node_prompts_data)
            if errors:
                logger.info(
                    _('Bad launch configuration starting template {template_pk} as part of ' 'workflow {workflow_pk}. Errors:\n{error_text}').format(
@@ -303,36 +339,24 @@ class WorkflowJobNode(WorkflowNodeBase):
                    )
                )
            data.update(accepted_fields)  # missing fields are handled in the scheduler
-            try:
-                # config saved on the workflow job itself
-                wj_config = self.workflow_job.launch_config
-            except ObjectDoesNotExist:
-                wj_config = None
-            if wj_config:
-                accepted_fields, ignored_fields, errors = ujt_obj._accept_or_ignore_job_kwargs(**wj_config.prompts_dict())
-                accepted_fields.pop('extra_vars', None)  # merge handled with other extra_vars later
-                data.update(accepted_fields)
        # build ancestor artifacts, save them to node model for later
        aa_dict = {}
        is_root_node = True
        for parent_node in self.get_parent_nodes():
            is_root_node = False
            aa_dict.update(parent_node.ancestor_artifacts)
-            if parent_node.job and hasattr(parent_node.job, 'artifacts'):
-                aa_dict.update(parent_node.job.artifacts)
+            if parent_node.job:
+                aa_dict.update(parent_node.job.get_effective_artifacts(parents_set=set([self.workflow_job_id])))
        if aa_dict and not is_root_node:
            self.ancestor_artifacts = aa_dict
            self.save(update_fields=['ancestor_artifacts'])
        # process password list
-        password_dict = {}
+        password_dict = data.get('survey_passwords', {})
        if '_ansible_no_log' in aa_dict:
            for key in aa_dict:
                if key != '_ansible_no_log':
                    password_dict[key] = REPLACE_STR
-        if self.workflow_job.survey_passwords:
-            password_dict.update(self.workflow_job.survey_passwords)
-        if self.survey_passwords:
-            password_dict.update(self.survey_passwords)
+        password_dict.update(wj_special_passwords)
        if password_dict:
            data['survey_passwords'] = password_dict
        # process extra_vars
@@ -342,12 +366,12 @@ class WorkflowJobNode(WorkflowNodeBase):
                functional_aa_dict = copy(aa_dict)
                functional_aa_dict.pop('_ansible_no_log', None)
                extra_vars.update(functional_aa_dict)
-        if ujt_obj and isinstance(ujt_obj, JobTemplate):
-            # Workflow Job extra_vars higher precedence than ancestor artifacts
-            if self.workflow_job and self.workflow_job.extra_vars:
-                extra_vars.update(self.workflow_job.extra_vars_dict)
+
+        # Workflow Job extra_vars higher precedence than ancestor artifacts
+        extra_vars.update(wj_special_vars)
        if extra_vars:
            data['extra_vars'] = extra_vars
+
        # ensure that unified jobs created by WorkflowJobs are marked
        data['_eager_fields'] = {'launch_type': 'workflow'}
        if self.workflow_job and self.workflow_job.created_by:
@@ -373,6 +397,10 @@ class WorkflowJobOptions(LaunchTimeConfigBase):
            )
        )
    )
+    # Workflow jobs are used for sliced jobs, and thus, must be a conduit for any JT prompts
+    instance_groups = OrderedManyToManyField(
+        'InstanceGroup', related_name='workflow_job_instance_groups', blank=True, editable=False, through='WorkflowJobInstanceGroupMembership'
+    )
    allow_simultaneous = models.BooleanField(default=False)

    extra_vars_dict = VarsDictProperty('extra_vars', True)
@@ -384,7 +412,7 @@ class WorkflowJobOptions(LaunchTimeConfigBase):
    @classmethod
    def _get_unified_job_field_names(cls):
        r = set(f.name for f in WorkflowJobOptions._meta.fields) | set(
-            ['name', 'description', 'organization', 'survey_passwords', 'labels', 'limit', 'scm_branch']
+            ['name', 'description', 'organization', 'survey_passwords', 'labels', 'limit', 'scm_branch', 'job_tags', 'skip_tags']
        )
        r.remove('char_prompts')  # needed due to copying launch config to launch config
        return r
@@ -424,26 +452,29 @@ class WorkflowJobOptions(LaunchTimeConfigBase):
 class WorkflowJobTemplate(UnifiedJobTemplate, WorkflowJobOptions, SurveyJobTemplateMixin, ResourceMixin, RelatedJobsMixin, WebhookTemplateMixin):

    SOFT_UNIQUE_TOGETHER = [('polymorphic_ctype', 'name', 'organization')]
-    FIELDS_TO_PRESERVE_AT_COPY = ['labels', 'organization', 'instance_groups', 'workflow_job_template_nodes', 'credentials', 'survey_spec']
+    FIELDS_TO_PRESERVE_AT_COPY = [
+        'labels',
+        'organization',
+        'instance_groups',
+        'workflow_job_template_nodes',
+        'credentials',
+        'survey_spec',
+        'skip_tags',
+        'job_tags',
+        'execution_environment',
+    ]

    class Meta:
        app_label = 'main'

-    ask_inventory_on_launch = AskForField(
+    notification_templates_approvals = models.ManyToManyField(
+        "NotificationTemplate",
        blank=True,
-        default=False,
+        related_name='%(class)s_notification_templates_for_approvals',
    )
-    ask_limit_on_launch = AskForField(
-        blank=True,
-        default=False,
+    admin_role = ImplicitRoleField(
+        parent_role=['singleton:' + ROLE_SINGLETON_SYSTEM_ADMINISTRATOR, 'organization.workflow_admin_role'],
    )
-    ask_scm_branch_on_launch = AskForField(
-        blank=True,
-        default=False,
-    )
-    notification_templates_approvals = models.ManyToManyField("NotificationTemplate", blank=True, related_name='%(class)s_notification_templates_for_approvals')
-
-    admin_role = ImplicitRoleField(parent_role=['singleton:' + ROLE_SINGLETON_SYSTEM_ADMINISTRATOR, 'organization.workflow_admin_role'])
    execute_role = ImplicitRoleField(
        parent_role=[
            'admin_role',
@@ -622,6 +653,9 @@ class WorkflowJob(UnifiedJob, WorkflowJobOptions, SurveyJobMixin, JobNotificatio
    )
    is_sliced_job = models.BooleanField(default=False)

+    def _set_default_dependencies_processed(self):
+        self.dependencies_processed = True
+
    @property
    def workflow_nodes(self):
        return self.workflow_job_nodes
@@ -659,10 +693,16 @@ class WorkflowJob(UnifiedJob, WorkflowJobOptions, SurveyJobMixin, JobNotificatio
                node_job_description = 'job #{0}, "{1}", which finished with status {2}.'.format(node.job.id, node.job.name, node.job.status)
            str_arr.append("- node #{0} spawns {1}".format(node.id, node_job_description))
        result['body'] = '\n'.join(str_arr)
+        result.update(
+            dict(
+                inventory=self.inventory.name if self.inventory else None,
+                limit=self.limit,
+                extra_vars=self.display_extra_vars(),
+            )
+        )
        return result

-    @property
-    def task_impact(self):
+    def _get_task_impact(self):
        return 0

    def get_ancestor_workflows(self):
@@ -682,6 +722,46 @@ class WorkflowJob(UnifiedJob, WorkflowJobOptions, SurveyJobMixin, JobNotificatio
            wj = wj.get_workflow_job()
        return ancestors

+    def get_effective_artifacts(self, **kwargs):
+        """
+        For downstream jobs of a workflow nested inside of a workflow,
+        we send aggregated artifacts from the nodes inside of the nested workflow
+        """
+        artifacts = {}
+        job_queryset = (
+            UnifiedJob.objects.filter(unified_job_node__workflow_job=self)
+            .defer('job_args', 'job_cwd', 'start_args', 'result_traceback')
+            .order_by('finished', 'id')
+            .filter(status__in=['successful', 'failed'])
+            .iterator()
+        )
+        parents_set = kwargs.get('parents_set', set())
+        new_parents_set = parents_set | {self.id}
+        for job in job_queryset:
+            if job.id in parents_set:
+                continue
+            artifacts.update(job.get_effective_artifacts(parents_set=new_parents_set))
+        return artifacts
+
+    def prompts_dict(self, *args, **kwargs):
+        if self.job_template_id:
+            # HACK: Exception for sliced jobs here, this is bad
+            # when sliced jobs were introduced, workflows did not have all the prompted JT fields
+            # so to support prompting with slicing, we abused the workflow job launch config
+            # these would be more properly saved on the workflow job, but it gets the wrong fields now
+            try:
+                wj_config = self.launch_config
+                r = wj_config.prompts_dict(*args, **kwargs)
+            except ObjectDoesNotExist:
+                r = {}
+        else:
+            r = super().prompts_dict(*args, **kwargs)
+            # Workflow labels and job labels are treated separately
+            # that means that they do not propagate from WFJT / workflow job to jobs in workflow
+            r.pop('labels', None)
+
+        return r
+
    def get_notification_templates(self):
        return self.workflow_job_template.notification_templates

@@ -692,11 +772,10 @@ class WorkflowJob(UnifiedJob, WorkflowJobOptions, SurveyJobMixin, JobNotificatio
    def preferred_instance_groups(self):
        return []

-    @property
-    def actually_running(self):
+    def cancel_dispatcher_process(self):
        # WorkflowJobs don't _actually_ run anything in the dispatcher, so
        # there's no point in asking the dispatcher if it knows about this task
-        return self.status == 'running'
+        return True


 class WorkflowApprovalTemplate(UnifiedJobTemplate, RelatedJobsMixin):
@@ -755,6 +834,12 @@ class WorkflowApproval(UnifiedJob, JobNotificationMixin):
        default=0,
        help_text=_("The amount of time (in seconds) before the approval node expires and fails."),
    )
+    expires = models.DateTimeField(
+        default=None,
+        null=True,
+        editable=False,
+        help_text=_("The time this approval will expire. This is the created time plus timeout, used for filtering."),
+    )
    timed_out = models.BooleanField(default=False, help_text=_("Shows when an approval node (with a timeout assigned to it) has timed out."))
    approved_or_denied_by = models.ForeignKey(
        'auth.User',
@@ -765,6 +850,9 @@ class WorkflowApproval(UnifiedJob, JobNotificationMixin):
        on_delete=models.SET_NULL,
    )

+    def _set_default_dependencies_processed(self):
+        self.dependencies_processed = True
+
    @classmethod
    def _get_unified_job_template_class(cls):
        return WorkflowApprovalTemplate
@@ -782,13 +870,32 @@ class WorkflowApproval(UnifiedJob, JobNotificationMixin):
    def _get_parent_field_name(self):
        return 'workflow_approval_template'

+    def save(self, *args, **kwargs):
+        update_fields = list(kwargs.get('update_fields', []))
+        if self.timeout != 0 and ((not self.pk) or (not update_fields) or ('timeout' in update_fields)):
+            if not self.created:  # on creation, created will be set by parent class, so we fudge it here
+                created = now()
+            else:
+                created = self.created
+            new_expires = created + timedelta(seconds=self.timeout)
+            if new_expires != self.expires:
+                self.expires = new_expires
+                if update_fields and 'expires' not in update_fields:
+                    update_fields.append('expires')
+        elif self.timeout == 0 and ((not update_fields) or ('timeout' in update_fields)):
+            if self.expires:
+                self.expires = None
+                if update_fields and 'expires' not in update_fields:
+                    update_fields.append('expires')
+        super(WorkflowApproval, self).save(*args, **kwargs)
+
    def approve(self, request=None):
        self.status = 'successful'
        self.approved_or_denied_by = get_current_user()
        self.save()
        self.send_approval_notification('approved')
        self.websocket_emit_status(self.status)
-        schedule_task_manager()
+        ScheduleWorkflowManager().schedule()
        return reverse('api:workflow_approval_approve', kwargs={'pk': self.pk}, request=request)

    def deny(self, request=None):
@@ -797,7 +904,7 @@ class WorkflowApproval(UnifiedJob, JobNotificationMixin):
        self.save()
        self.send_approval_notification('denied')
        self.websocket_emit_status(self.status)
-        schedule_task_manager()
+        ScheduleWorkflowManager().schedule()
        return reverse('api:workflow_approval_deny', kwargs={'pk': self.pk}, request=request)

    def signal_start(self, **kwargs):
@@ -885,3 +992,12 @@ class WorkflowApproval(UnifiedJob, JobNotificationMixin):
    @property
    def workflow_job(self):
        return self.unified_job_node.workflow_job
+
+    def notification_data(self):
+        result = super(WorkflowApproval, self).notification_data()
+        result.update(
+            dict(
+                extra_vars=self.workflow_job.display_extra_vars(),
+            )
+        )
+        return result
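Reviewer note on the credential combination rule in `WorkflowJobNode.get_job_kwargs` above: both credential lists are pivoted into dictionaries, and the workflow job's entries then overwrite the node's entries that share a key. The sketch below illustrates that precedence in isolation. It is a minimal stand-in, not the AWX implementation: the `Cred` class, its `kind` field used as the uniqueness key, and the `merge_credentials` helper are all hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cred:
    # Hypothetical stand-in for a credential object (not the AWX model).
    name: str
    kind: str  # assumed uniqueness key; the real model computes its own key


def unique_dict(creds):
    """Pivot a list of credentials into {uniqueness key: credential}."""
    return {c.kind: c for c in creds}


def merge_credentials(node_creds, wj_creds):
    """Combine node and workflow-job credentials.

    On a key clash the workflow-job credential replaces the node credential,
    mirroring node_pivoted_creds.update(wj_pivoted_creds) in the diff.
    """
    merged = unique_dict(node_creds)
    merged.update(unique_dict(wj_creds))
    return list(merged.values())


node = [Cred("node-ssh", "ssh"), Cred("node-vault", "vault")]
wj = [Cred("wf-ssh", "ssh")]
result = merge_credentials(node, wj)
# The "ssh" slot is taken over by the workflow job; "vault" is untouched.
```

Because `dict.update` keeps the insertion position of an existing key, the merged list preserves the node's ordering while swapping in the workflow-level credential.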
--- a/awx/main/scheduler/__init__.py
+++ b/awx/main/scheduler/__init__.py
@@ -1,6 +1,6 @@
 # Copyright (c) 2017 Ansible, Inc.
 #

-from .task_manager import TaskManager
+from .task_manager import TaskManager, DependencyManager, WorkflowManager

-__all__ = ['TaskManager']
+__all__ = ['TaskManager', 'DependencyManager', 'WorkflowManager']
--- a/awx/main/scheduler/dependency_graph.py
+++ b/awx/main/scheduler/dependency_graph.py
@@ -7,6 +7,11 @@ from awx.main.models import (
    WorkflowJob,
 )

+import logging
+
+
+logger = logging.getLogger('awx.main.scheduler.dependency_graph')
+

 class DependencyGraph(object):
    PROJECT_UPDATES = 'project_updates'
@@ -36,6 +41,9 @@ class DependencyGraph(object):
        self.data[self.WORKFLOW_JOB_TEMPLATES_JOBS] = {}

    def mark_if_no_key(self, job_type, id, job):
+        if id is None:
+            logger.warning(f'Null dependency graph key from {job}, could be integrity error or bug, ignoring')
+            return
        # only mark first occurrence of a task. If 10 of JobA are launched
        # (concurrent disabled), the dependency graph should return that jobs
        # 2 through 10 are blocked by job1
@@ -66,7 +74,10 @@ class DependencyGraph(object):
        self.mark_if_no_key(self.JOB_TEMPLATE_JOBS, job.job_template_id, job)

    def mark_workflow_job(self, job):
-        self.mark_if_no_key(self.WORKFLOW_JOB_TEMPLATES_JOBS, job.workflow_job_template_id, job)
+        if job.workflow_job_template_id:
+            self.mark_if_no_key(self.WORKFLOW_JOB_TEMPLATES_JOBS, job.workflow_job_template_id, job)
+        elif job.unified_job_template_id:  # for sliced jobs
+            self.mark_if_no_key(self.WORKFLOW_JOB_TEMPLATES_JOBS, job.unified_job_template_id, job)

    def project_update_blocked_by(self, job):
        return self.get_item(self.PROJECT_UPDATES, job.project_id)
@@ -85,7 +96,13 @@ class DependencyGraph(object):

    def workflow_job_blocked_by(self, job):
        if job.allow_simultaneous is False:
-            return self.get_item(self.WORKFLOW_JOB_TEMPLATES_JOBS, job.workflow_job_template_id)
+            if job.workflow_job_template_id:
+                return self.get_item(self.WORKFLOW_JOB_TEMPLATES_JOBS, job.workflow_job_template_id)
+            elif job.unified_job_template_id:
+                # Sliced jobs can be either Job or WorkflowJob type, and either should block a sliced WorkflowJob
+                return self.get_item(self.WORKFLOW_JOB_TEMPLATES_JOBS, job.unified_job_template_id) or self.get_item(
+                    self.JOB_TEMPLATE_JOBS, job.unified_job_template_id
+                )
        return None

    def system_job_blocked_by(self, job):
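Reviewer note on the `DependencyGraph` changes above: a null key is now ignored when marking, only the first job seen for a key is recorded, and a sliced workflow job is blocked by either a workflow-type or a job-type entry for the same template id. The following is a simplified sketch of those three behaviours, assuming a plain nested dictionary as storage. `MiniGraph` and `sliced_workflow_blocked_by` are hypothetical names, and the strings stand in for job objects.

```python
class MiniGraph:
    """Hypothetical, simplified stand-in for DependencyGraph."""

    JOB_TEMPLATE_JOBS = "job_template_jobs"
    WORKFLOW_JOB_TEMPLATES_JOBS = "workflow_job_template_jobs"

    def __init__(self):
        self.data = {self.JOB_TEMPLATE_JOBS: {}, self.WORKFLOW_JOB_TEMPLATES_JOBS: {}}

    def mark_if_no_key(self, job_type, key, job):
        if key is None:
            # Null keys are skipped instead of being stored.
            return
        # Only the first job marked for a key is kept.
        self.data[job_type].setdefault(key, job)

    def get_item(self, job_type, key):
        return self.data[job_type].get(key)

    def sliced_workflow_blocked_by(self, template_id):
        # A sliced job may exist as a workflow job or a plain job,
        # so both maps are consulted, workflow entries first.
        return self.get_item(self.WORKFLOW_JOB_TEMPLATES_JOBS, template_id) or self.get_item(
            self.JOB_TEMPLATE_JOBS, template_id
        )


graph = MiniGraph()
graph.mark_if_no_key(MiniGraph.JOB_TEMPLATE_JOBS, 7, "job-1")
graph.mark_if_no_key(MiniGraph.JOB_TEMPLATE_JOBS, 7, "job-2")  # ignored, key already marked
graph.mark_if_no_key(MiniGraph.JOB_TEMPLATE_JOBS, None, "orphan")  # ignored, null key
blocker = graph.sliced_workflow_blocked_by(7)
```

Here `blocker` is the first job marked for template 7, which is the job that later jobs from the same template would wait on when simultaneous runs are disabled.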
--- a/awx/main/scheduler/task_manager.py
+++ b/awx/main/scheduler/task_manager.py
@@ -11,31 +11,35 @@ import sys
 import signal

 # Django
-from django.db import transaction, connection
+from django.db import transaction
 from django.utils.translation import gettext_lazy as _, gettext_noop
 from django.utils.timezone import now as tz_now
 from django.conf import settings
+from django.contrib.contenttypes.models import ContentType

 # AWX
 from awx.main.dispatch.reaper import reap_job
 from awx.main.models import (
-    AdHocCommand,
    Instance,
    InventorySource,
    InventoryUpdate,
    Job,
    Project,
    ProjectUpdate,
-    SystemJob,
    UnifiedJob,
    WorkflowApproval,
    WorkflowJob,
+    WorkflowJobNode,
    WorkflowJobTemplate,
 )
 from awx.main.scheduler.dag_workflow import WorkflowDAG
 from awx.main.utils.pglock import advisory_lock
-from awx.main.utils import get_type_for_model, task_manager_bulk_reschedule, schedule_task_manager
-from awx.main.utils.common import create_partition
+from awx.main.utils import (
+    get_type_for_model,
+    ScheduleTaskManager,
+    ScheduleWorkflowManager,
+)
+from awx.main.utils.common import task_manager_bulk_reschedule
 from awx.main.signals import disable_activity_stream
 from awx.main.constants import ACTIVE_STATES
 from awx.main.scheduler.dependency_graph import DependencyGraph
@@ -53,167 +57,101 @@ def timeit(func):
        t_now = time.perf_counter()
        result = func(*args, **kwargs)
        dur = time.perf_counter() - t_now
-        args[0].subsystem_metrics.inc("task_manager_" + func.__name__ + "_seconds", dur)
+        args[0].subsystem_metrics.inc(f"{args[0].prefix}_{func.__name__}_seconds", dur)
        return result

    return inner


-class TaskManager:
-    def __init__(self):
-        """
-        Do NOT put database queries or other potentially expensive operations
-        in the task manager init. The task manager object is created every time a
-        job is created, transitions state, and every 30 seconds on each tower node.
-        More often then not, the object is destroyed quickly because the NOOP case is hit.
-
-        The NOOP case is short-circuit logic. If the task manager realizes that another instance
-        of the task manager is already running, then it short-circuits and decides not to run.
-        """
-        # start task limit indicates how many pending jobs can be started on this
-        # .schedule() run. Starting jobs is expensive, and there is code in place to reap
-        # the task manager after 5 minutes. At scale, the task manager can easily take more than
-        # 5 minutes to start pending jobs. If this limit is reached, pending jobs
-        # will no longer be started and will be started on the next task manager cycle.
-        self.start_task_limit = settings.START_TASK_LIMIT
-        self.time_delta_job_explanation = timedelta(seconds=30)
-        self.subsystem_metrics = s_metrics.Metrics(auto_pipe_execute=False)
+class TaskBase:
+    def __init__(self, prefix=""):
+        self.prefix = prefix
        # initialize each metric to 0 and force metric_has_changed to true. This
        # ensures each task manager metric will be overridden when pipe_execute
        # is called later.
+        self.subsystem_metrics = s_metrics.Metrics(auto_pipe_execute=False)
+        self.start_time = time.time()
+        self.start_task_limit = settings.START_TASK_LIMIT
        for m in self.subsystem_metrics.METRICS:
-            if m.startswith("task_manager"):
+            if m.startswith(self.prefix):
                self.subsystem_metrics.set(m, 0)

-    def after_lock_init(self, all_sorted_tasks):
-        """
-        Init AFTER we know this instance of the task manager will run because the lock is acquired.
-        """
-        self.dependency_graph = DependencyGraph()
-        self.instances = TaskManagerInstances(all_sorted_tasks)
-        self.instance_groups = TaskManagerInstanceGroups(instances_by_hostname=self.instances)
-        self.controlplane_ig = self.instance_groups.controlplane_ig
-
-    def job_blocked_by(self, task):
-        # TODO: I'm not happy with this, I think blocking behavior should be decided outside of the dependency graph
-        #       in the old task manager this was handled as a method on each task object outside of the graph and
-        #       probably has the side effect of cutting down *a lot* of the logic from this task manager class
-        blocked_by = self.dependency_graph.task_blocked_by(task)
-        if blocked_by:
-            return blocked_by
-
-        for dep in task.dependent_jobs.all():
-            if dep.status in ACTIVE_STATES:
-                return dep
-            # if we detect a failed or error dependency, go ahead and fail this
-            # task. The errback on the dependency takes some time to trigger,
-            # and we don't want the task to enter running state if its
-            # dependency has failed or errored.
-            elif dep.status in ("error", "failed"):
-                task.status = 'failed'
-                task.job_explanation = 'Previous Task Failed: {"job_type": "%s", "job_name": "%s", "job_id": "%s"}' % (
-                    get_type_for_model(type(dep)),
-                    dep.name,
-                    dep.id,
-                )
-                task.save(update_fields=['status', 'job_explanation'])
-                task.websocket_emit_status('failed')
-                return dep
-
-        return None
+    def timed_out(self):
+        """Return True/False if we have met or exceeded the timeout for the task manager."""
+        elapsed = time.time() - self.start_time
+        if elapsed >= settings.TASK_MANAGER_TIMEOUT:
+            logger.warning(f"{self.prefix} manager has run for {elapsed} which is greater than TASK_MANAGER_TIMEOUT of {settings.TASK_MANAGER_TIMEOUT}.")
+            return True
+        return False

    @timeit
-    def get_tasks(self, status_list=('pending', 'waiting', 'running')):
-        jobs = [j for j in Job.objects.filter(status__in=status_list).prefetch_related('instance_group')]
-        inventory_updates_qs = (
-            InventoryUpdate.objects.filter(status__in=status_list).exclude(source='file').prefetch_related('inventory_source', 'instance_group')
+    def get_tasks(self, filter_args):
+        wf_approval_ctype_id = ContentType.objects.get_for_model(WorkflowApproval).id
+        qs = (
+            UnifiedJob.objects.filter(**filter_args)
+            .exclude(launch_type='sync')
+            .exclude(polymorphic_ctype_id=wf_approval_ctype_id)
+            .order_by('created')
+            .prefetch_related('dependent_jobs')
        )
-        inventory_updates = [i for i in inventory_updates_qs]
-        # Notice the job_type='check': we want to prevent implicit project updates from blocking our jobs.
-        project_updates = [p for p in ProjectUpdate.objects.filter(status__in=status_list, job_type='check').prefetch_related('instance_group')]
-        system_jobs = [s for s in SystemJob.objects.filter(status__in=status_list).prefetch_related('instance_group')]
-        ad_hoc_commands = [a for a in AdHocCommand.objects.filter(status__in=status_list).prefetch_related('instance_group')]
-        workflow_jobs = [w for w in WorkflowJob.objects.filter(status__in=status_list)]
-        all_tasks = sorted(jobs + project_updates + inventory_updates + system_jobs + ad_hoc_commands + workflow_jobs, key=lambda task: task.created)
-        return all_tasks
+        self.all_tasks = [t for t in qs]

-    def get_running_workflow_jobs(self):
-        graph_workflow_jobs = [wf for wf in WorkflowJob.objects.filter(status='running')]
-        return graph_workflow_jobs
+    def record_aggregate_metrics(self, *args):
+        if not settings.IS_TESTING():
+            # increment the <prefix>__schedule_calls counter regardless of
+            # whether the other metrics are recorded
+            s_metrics.Metrics(auto_pipe_execute=True).inc(f"{self.prefix}__schedule_calls", 1)
+            # Only record metrics if the last recording was more
+            # than SUBSYSTEM_METRICS_TASK_MANAGER_RECORD_INTERVAL ago.
+            # This prevents a short-duration task manager that runs directly after a
+            # long task manager from overriding useful metrics.
+            current_time = time.time()
+            time_last_recorded = current_time - self.subsystem_metrics.decode(f"{self.prefix}_recorded_timestamp")
+            if time_last_recorded > settings.SUBSYSTEM_METRICS_TASK_MANAGER_RECORD_INTERVAL:
+                logger.debug(f"recording {self.prefix} metrics, last recorded {time_last_recorded} seconds ago")
+                self.subsystem_metrics.set(f"{self.prefix}_recorded_timestamp", current_time)
+                self.subsystem_metrics.pipe_execute()
+            else:
+                logger.debug(f"skipping recording {self.prefix} metrics, last recorded {time_last_recorded} seconds ago")

-    def get_inventory_source_tasks(self, all_sorted_tasks):
-        inventory_ids = set()
-        for task in all_sorted_tasks:
-            if isinstance(task, Job):
-                inventory_ids.add(task.inventory_id)
-        return [invsrc for invsrc in InventorySource.objects.filter(inventory_id__in=inventory_ids, update_on_launch=True)]
+    def record_aggregate_metrics_and_exit(self, *args):
+        self.record_aggregate_metrics()
+        sys.exit(1)
+
+    def schedule(self):
+        # Lock
+        with task_manager_bulk_reschedule():
+            with advisory_lock(f"{self.prefix}_lock", wait=False) as acquired:
+                with transaction.atomic():
+                    if acquired is False:
+                        logger.debug(f"Not running {self.prefix} scheduler, another task holds lock")
+                        return
+                    logger.debug(f"Starting {self.prefix} Scheduler")
+                    # if sigterm due to timeout, still record metrics
+                    signal.signal(signal.SIGTERM, self.record_aggregate_metrics_and_exit)
+                    self._schedule()
+                    commit_start = time.time()
+
+                if self.prefix == "task_manager":
+                    self.subsystem_metrics.set(f"{self.prefix}_commit_seconds", time.time() - commit_start)
+                self.record_aggregate_metrics()
+                logger.debug(f"Finishing {self.prefix} Scheduler")
+
+
+class WorkflowManager(TaskBase):
+    def __init__(self):
+        super().__init__(prefix="workflow_manager")

    @timeit
-    def spawn_workflow_graph_jobs(self, workflow_jobs):
-        for workflow_job in workflow_jobs:
-            if workflow_job.cancel_flag:
-                logger.debug('Not spawning jobs for %s because it is pending cancelation.', workflow_job.log_format)
-                continue
-            dag = WorkflowDAG(workflow_job)
-            spawn_nodes = dag.bfs_nodes_to_run()
-            if spawn_nodes:
-                logger.debug('Spawning jobs for %s', workflow_job.log_format)
-            else:
-                logger.debug('No nodes to spawn for %s', workflow_job.log_format)
-            for spawn_node in spawn_nodes:
-                if spawn_node.unified_job_template is None:
-                    continue
-                kv = spawn_node.get_job_kwargs()
-                job = spawn_node.unified_job_template.create_unified_job(**kv)
-                spawn_node.job = job
-                spawn_node.save()
-                logger.debug('Spawned %s in %s for node %s', job.log_format, workflow_job.log_format, spawn_node.pk)
-                can_start = True
-                if isinstance(spawn_node.unified_job_template, WorkflowJobTemplate):
-                    workflow_ancestors = job.get_ancestor_workflows()
-                    if spawn_node.unified_job_template in set(workflow_ancestors):
-                        can_start = False
-                        logger.info(
-                            'Refusing to start recursive workflow-in-workflow id={}, wfjt={}, ancestors={}'.format(
-                                job.id, spawn_node.unified_job_template.pk, [wa.pk for wa in workflow_ancestors]
-                            )
-                        )
-                        display_list = [spawn_node.unified_job_template] + workflow_ancestors
-                        job.job_explanation = gettext_noop(
-                            "Workflow Job spawned from workflow could not start because it " "would result in recursion (spawn order, most recent first: {})"
-                        ).format(', '.join(['<{}>'.format(tmp) for tmp in display_list]))
-                    else:
-                        logger.debug(
-                            'Starting workflow-in-workflow id={}, wfjt={}, ancestors={}'.format(
-                                job.id, spawn_node.unified_job_template.pk, [wa.pk for wa in workflow_ancestors]
-                            )
-                        )
-                if not job._resources_sufficient_for_launch():
-                    can_start = False
-                    job.job_explanation = gettext_noop(
-                        "Job spawned from workflow could not start because it " "was missing a related resource such as project or inventory"
-                    )
-                if can_start:
-                    if workflow_job.start_args:
-                        start_args = json.loads(decrypt_field(workflow_job, 'start_args'))
-                    else:
-                        start_args = {}
-                    can_start = job.signal_start(**start_args)
-                    if not can_start:
-                        job.job_explanation = gettext_noop(
-                            "Job spawned from workflow could not start because it " "was not in the right state or required manual credentials"
-                        )
-                if not can_start:
-                    job.status = 'failed'
-                    job.save(update_fields=['status', 'job_explanation'])
-                    job.websocket_emit_status('failed')
-
-                # TODO: should we emit a status on the socket here similar to tasks.py awx_periodic_scheduler() ?
-                # emit_websocket_notification('/socket.io/jobs', '', dict(id=))
-
-    def process_finished_workflow_jobs(self, workflow_jobs):
+    def spawn_workflow_graph_jobs(self):
        result = []
-        for workflow_job in workflow_jobs:
+        for workflow_job in self.all_tasks:
+            if self.timed_out():
+                logger.warning("Workflow manager has reached time out while processing running workflows, exiting loop early")
+                ScheduleWorkflowManager().schedule()
+                # Do not process any more workflow jobs. Stop here.
+                # Maybe we should schedule another WorkflowManager run
+                break
            dag = WorkflowDAG(workflow_job)
            status_changed = False
            if workflow_job.cancel_flag:
@@ -228,99 +166,106 @@ class TaskManager:
                    status_changed = True
            else:
                workflow_nodes = dag.mark_dnr_nodes()
-                for n in workflow_nodes:
-                    n.save(update_fields=['do_not_run'])
+                WorkflowJobNode.objects.bulk_update(workflow_nodes, ['do_not_run'])
+                # If workflow is now done, we do special things to mark it as done.
                is_done = dag.is_workflow_done()
-                if not is_done:
-                    continue
-                has_failed, reason = dag.has_workflow_failed()
-                logger.debug('Marking %s as %s.', workflow_job.log_format, 'failed' if has_failed else 'successful')
-                result.append(workflow_job.id)
-                new_status = 'failed' if has_failed else 'successful'
-                logger.debug("Transitioning {} to {} status.".format(workflow_job.log_format, new_status))
-                update_fields = ['status', 'start_args']
-                workflow_job.status = new_status
-                if reason:
-                    logger.info(f'Workflow job {workflow_job.id} failed due to reason: {reason}')
-                    workflow_job.job_explanation = gettext_noop("No error handling paths found, marking workflow as failed")
-                    update_fields.append('job_explanation')
-                workflow_job.start_args = ''  # blank field to remove encrypted passwords
-                workflow_job.save(update_fields=update_fields)
-                status_changed = True
+                if is_done:
+                    has_failed, reason = dag.has_workflow_failed()
+                    logger.debug('Marking %s as %s.', workflow_job.log_format, 'failed' if has_failed else 'successful')
+                    result.append(workflow_job.id)
+                    new_status = 'failed' if has_failed else 'successful'
+                    logger.debug("Transitioning {} to {} status.".format(workflow_job.log_format, new_status))
+                    update_fields = ['status', 'start_args']
+                    workflow_job.status = new_status
+                    if reason:
+                        logger.info(f'Workflow job {workflow_job.id} failed due to reason: {reason}')
+                        workflow_job.job_explanation = gettext_noop("No error handling paths found, marking workflow as failed")
+                        update_fields.append('job_explanation')
+                    workflow_job.start_args = ''  # blank field to remove encrypted passwords
+                    workflow_job.save(update_fields=update_fields)
+                    status_changed = True
+
            if status_changed:
+                if workflow_job.spawned_by_workflow:
+                    ScheduleWorkflowManager().schedule()
                workflow_job.websocket_emit_status(workflow_job.status)
                # Operations whose queries rely on modifications made during the atomic scheduling session
                workflow_job.send_notification_templates('succeeded' if workflow_job.status == 'successful' else 'failed')
-                if workflow_job.spawned_by_workflow:
-                    schedule_task_manager()
+
+            if workflow_job.status == 'running':
+                spawn_nodes = dag.bfs_nodes_to_run()
+                if spawn_nodes:
+                    logger.debug('Spawning jobs for %s', workflow_job.log_format)
+                else:
+                    logger.debug('No nodes to spawn for %s', workflow_job.log_format)
+                for spawn_node in spawn_nodes:
+                    if spawn_node.unified_job_template is None:
+                        continue
+                    kv = spawn_node.get_job_kwargs()
+                    job = spawn_node.unified_job_template.create_unified_job(**kv)
+                    spawn_node.job = job
+                    spawn_node.save()
+                    logger.debug('Spawned %s in %s for node %s', job.log_format, workflow_job.log_format, spawn_node.pk)
+                    can_start = True
+                    if isinstance(spawn_node.unified_job_template, WorkflowJobTemplate):
+                        workflow_ancestors = job.get_ancestor_workflows()
+                        if spawn_node.unified_job_template in set(workflow_ancestors):
+                            can_start = False
+                            logger.info(
+                                'Refusing to start recursive workflow-in-workflow id={}, wfjt={}, ancestors={}'.format(
+                                    job.id, spawn_node.unified_job_template.pk, [wa.pk for wa in workflow_ancestors]
+                                )
+                            )
+                            display_list = [spawn_node.unified_job_template] + workflow_ancestors
+                            job.job_explanation = gettext_noop(
+                                "Workflow Job spawned from workflow could not start because it "
+                                "would result in recursion (spawn order, most recent first: {})"
+                            ).format(', '.join('<{}>'.format(tmp) for tmp in display_list))
+                        else:
+                            logger.debug(
+                                'Starting workflow-in-workflow id={}, wfjt={}, ancestors={}'.format(
+                                    job.id, spawn_node.unified_job_template.pk, [wa.pk for wa in workflow_ancestors]
+                                )
+                            )
+                    if not job._resources_sufficient_for_launch():
+                        can_start = False
+                        job.job_explanation = gettext_noop(
+                            "Job spawned from workflow could not start because it was missing a related resource such as project or inventory"
+                        )
+                    if can_start:
+                        if workflow_job.start_args:
+                            start_args = json.loads(decrypt_field(workflow_job, 'start_args'))
+                        else:
+                            start_args = {}
+                        can_start = job.signal_start(**start_args)
+                        if not can_start:
+                            job.job_explanation = gettext_noop(
+                                "Job spawned from workflow could not start because it was not in the right state or required manual credentials"
+                            )
+                    if not can_start:
+                        job.status = 'failed'
+                        job.save(update_fields=['status', 'job_explanation'])
+                        job.websocket_emit_status('failed')
+
+                    # TODO: should we emit a status on the socket here similar to tasks.py awx_periodic_scheduler() ?
+                    # emit_websocket_notification('/socket.io/jobs', '', dict(id=))
+
        return result

    @timeit
-    def start_task(self, task, instance_group, dependent_tasks=None, instance=None):
-        self.subsystem_metrics.inc("task_manager_tasks_started", 1)
-        self.start_task_limit -= 1
-        if self.start_task_limit == 0:
-            # schedule another run immediately after this task manager
-            schedule_task_manager()
-        from awx.main.tasks.system import handle_work_error, handle_work_success
-
-        dependent_tasks = dependent_tasks or []
-
-        task_actual = {
-            'type': get_type_for_model(type(task)),
-            'id': task.id,
-        }
-        dependencies = [{'type': get_type_for_model(type(t)), 'id': t.id} for t in dependent_tasks]
-
-        task.status = 'waiting'
-
-        (start_status, opts) = task.pre_start()
-        if not start_status:
-            task.status = 'failed'
-            if task.job_explanation:
-                task.job_explanation += ' '
-            task.job_explanation += 'Task failed pre-start check.'
-            task.save()
-            # TODO: run error handler to fail sub-tasks and send notifications
-        else:
-            if type(task) is WorkflowJob:
-                task.status = 'running'
-                task.send_notification_templates('running')
-                logger.debug('Transitioning %s to running status.', task.log_format)
-                schedule_task_manager()
-            # at this point we already have control/execution nodes selected for the following cases
-            else:
-                task.instance_group = instance_group
-                execution_node_msg = f' and execution node {task.execution_node}' if task.execution_node else ''
-                logger.debug(
-                    f'Submitting job {task.log_format} controlled by {task.controller_node} to instance group {instance_group.name}{execution_node_msg}.'
-                )
-            with disable_activity_stream():
-                task.celery_task_id = str(uuid.uuid4())
-                task.save()
-                task.log_lifecycle("waiting")
-
-        def post_commit():
-            if task.status != 'failed' and type(task) is not WorkflowJob:
-                # Before task is dispatched, ensure that job_event partitions exist
-                create_partition(task.event_class._meta.db_table, start=task.created)
-                task_cls = task._get_task_class()
-                task_cls.apply_async(
-                    [task.pk],
-                    opts,
-                    queue=task.get_queue_name(),
-                    uuid=task.celery_task_id,
-                    callbacks=[{'task': handle_work_success.name, 'kwargs': {'task_actual': task_actual}}],
-                    errbacks=[{'task': handle_work_error.name, 'args': [task.celery_task_id], 'kwargs': {'subtasks': [task_actual] + dependencies}}],
-                )
-
-        task.websocket_emit_status(task.status)  # adds to on_commit
-        connection.on_commit(post_commit)
+    def get_tasks(self, filter_args):
+        self.all_tasks = [wf for wf in WorkflowJob.objects.filter(**filter_args)]

    @timeit
-    def process_running_tasks(self, running_tasks):
-        for task in running_tasks:
-            self.dependency_graph.add_job(task)
+    def _schedule(self):
+        self.get_tasks(dict(status__in=["running"], dependencies_processed=True))
+        if len(self.all_tasks) > 0:
+            self.spawn_workflow_graph_jobs()
+
+
+class DependencyManager(TaskBase):
+    def __init__(self):
+        super().__init__(prefix="dependency_manager")

    def create_project_update(self, task, project_id=None):
        if project_id is None:
@@ -341,14 +286,20 @@ class TaskManager:
        inventory_task.status = 'pending'
        inventory_task.save()
        logger.debug('Spawned {} as dependency of {}'.format(inventory_task.log_format, task.log_format))
-        # inventory_sources = self.get_inventory_source_tasks([task])
-        # self.process_inventory_sources(inventory_sources)
+
        return inventory_task

    def add_dependencies(self, task, dependencies):
        with disable_activity_stream():
            task.dependent_jobs.add(*dependencies)

+    def get_inventory_source_tasks(self):
+        inventory_ids = set()
+        for task in self.all_tasks:
+            if isinstance(task, Job):
+                inventory_ids.add(task.inventory_id)
+        self.all_inventory_sources = [invsrc for invsrc in InventorySource.objects.filter(inventory_id__in=inventory_ids, update_on_launch=True)]
+
    def get_latest_inventory_update(self, inventory_source):
        latest_inventory_update = InventoryUpdate.objects.filter(inventory_source=inventory_source).order_by("-created")
        if not latest_inventory_update.exists():
@@ -481,16 +432,167 @@ class TaskManager:

        return created_dependencies

+    def process_tasks(self):
+        deps = self.generate_dependencies(self.all_tasks)
+        self.generate_dependencies(deps)
+        self.subsystem_metrics.inc(f"{self.prefix}_pending_processed", len(self.all_tasks) + len(deps))
+
+    @timeit
+    def _schedule(self):
+        self.get_tasks(dict(status__in=["pending"], dependencies_processed=False))
+
+        if len(self.all_tasks) > 0:
+            self.get_inventory_source_tasks()
+            self.process_tasks()
+            ScheduleTaskManager().schedule()
+
+
+class TaskManager(TaskBase):
+    def __init__(self):
+        """
+        Do NOT put database queries or other potentially expensive operations
+        in the task manager init. The task manager object is created every time a
+        job is created, transitions state, and every 30 seconds on each tower node.
+        More often than not, the object is destroyed quickly because the NOOP case is hit.
+
+        The NOOP case is short-circuit logic. If the task manager realizes that another instance
+        of the task manager is already running, then it short-circuits and decides not to run.
+        """
+        # start task limit indicates how many pending jobs can be started on this
+        # .schedule() run. Starting jobs is expensive, and there is code in place to reap
+        # the task manager after 5 minutes. At scale, the task manager can easily take more than
+        # 5 minutes to start pending jobs. If this limit is reached, the remaining pending jobs
+        # are not started on this run; they are started on the next task manager cycle.
+        self.time_delta_job_explanation = timedelta(seconds=30)
+        super().__init__(prefix="task_manager")
+
+    def after_lock_init(self):
+        """
+        Init AFTER we know this instance of the task manager will run because the lock is acquired.
+        """
+        self.dependency_graph = DependencyGraph()
+        self.instances = TaskManagerInstances(self.all_tasks)
+        self.instance_groups = TaskManagerInstanceGroups(instances_by_hostname=self.instances)
+        self.controlplane_ig = self.instance_groups.controlplane_ig
+
+    def job_blocked_by(self, task):
+        # TODO: I'm not happy with this, I think blocking behavior should be decided outside of the dependency graph
+        #       in the old task manager this was handled as a method on each task object outside of the graph and
+        #       probably has the side effect of cutting down *a lot* of the logic from this task manager class
+        blocked_by = self.dependency_graph.task_blocked_by(task)
+        if blocked_by:
+            return blocked_by
+
+        for dep in task.dependent_jobs.all():
+            if dep.status in ACTIVE_STATES:
+                return dep
+            # if we detect a failed or error dependency, go ahead and fail this
+            # task. The errback on the dependency takes some time to trigger,
+            # and we don't want the task to enter running state if its
+            # dependency has failed or errored.
+            elif dep.status in ("error", "failed"):
+                task.status = 'failed'
+                task.job_explanation = 'Previous Task Failed: {"job_type": "%s", "job_name": "%s", "job_id": "%s"}' % (
+                    get_type_for_model(type(dep)),
+                    dep.name,
+                    dep.id,
+                )
+                task.save(update_fields=['status', 'job_explanation'])
+                task.websocket_emit_status('failed')
+                return dep
+
+        return None
+
+    @timeit
+    def start_task(self, task, instance_group, dependent_tasks=None, instance=None):
+        self.dependency_graph.add_job(task)
+        self.subsystem_metrics.inc(f"{self.prefix}_tasks_started", 1)
+        self.start_task_limit -= 1
+        if self.start_task_limit == 0:
+            # schedule another run immediately after this task manager
+            ScheduleTaskManager().schedule()
+        from awx.main.tasks.system import handle_work_error, handle_work_success
+
+        # update capacity for control node and execution node
+        if task.controller_node:
+            self.instances[task.controller_node].consume_capacity(settings.AWX_CONTROL_NODE_TASK_IMPACT)
+        if task.execution_node:
+            self.instances[task.execution_node].consume_capacity(task.task_impact)
+
+        dependent_tasks = dependent_tasks or []
+
+        task_actual = {
+            'type': get_type_for_model(type(task)),
+            'id': task.id,
+        }
+        dependencies = [{'type': get_type_for_model(type(t)), 'id': t.id} for t in dependent_tasks]
+
+        task.status = 'waiting'
+
+        (start_status, opts) = task.pre_start()
+        if not start_status:
+            task.status = 'failed'
+            if task.job_explanation:
+                task.job_explanation += ' '
+            task.job_explanation += 'Task failed pre-start check.'
+            task.save()
+            # TODO: run error handler to fail sub-tasks and send notifications
+        else:
+            if type(task) is WorkflowJob:
+                task.status = 'running'
+                task.send_notification_templates('running')
+                logger.debug('Transitioning %s to running status.', task.log_format)
+                # Call this to ensure Workflow nodes get spawned in a timely manner
+                ScheduleWorkflowManager().schedule()
+            # at this point we already have control/execution nodes selected for the following cases
+            else:
+                task.instance_group = instance_group
+                execution_node_msg = f' and execution node {task.execution_node}' if task.execution_node else ''
+                logger.debug(
+                    f'Submitting job {task.log_format} controlled by {task.controller_node} to instance group {instance_group.name}{execution_node_msg}.'
+                )
+            with disable_activity_stream():
+                task.celery_task_id = str(uuid.uuid4())
+                task.save()
+                task.log_lifecycle("waiting")
+
+        # apply_async does a NOTIFY to the channel the dispatcher is listening to
+        # postgres will treat this as part of the transaction, which is what we want
+        if task.status != 'failed' and type(task) is not WorkflowJob:
+            task_cls = task._get_task_class()
+            task_cls.apply_async(
+                [task.pk],
+                opts,
+                queue=task.get_queue_name(),
+                uuid=task.celery_task_id,
+                callbacks=[{'task': handle_work_success.name, 'kwargs': {'task_actual': task_actual}}],
+                errbacks=[{'task': handle_work_error.name, 'args': [task.celery_task_id], 'kwargs': {'subtasks': [task_actual] + dependencies}}],
+            )
+
+        # In exceptional cases, such as a job failing pre-start checks, we send the websocket status message.
+        # For jobs going into waiting, we omit it for performance reasons, since they should move to running quickly.
+        if task.status != 'waiting':
+            task.websocket_emit_status(task.status)  # adds to on_commit
+
+    @timeit
+    def process_running_tasks(self, running_tasks):
+        for task in running_tasks:
+            if type(task) is WorkflowJob:
+                ScheduleWorkflowManager().schedule()
+            self.dependency_graph.add_job(task)
+
    @timeit
    def process_pending_tasks(self, pending_tasks):
-        running_workflow_templates = {wf.unified_job_template_id for wf in self.get_running_workflow_jobs()}
        tasks_to_update_job_explanation = []
        for task in pending_tasks:
            if self.start_task_limit <= 0:
                break
+            if self.timed_out():
+                logger.warning("Task manager has reached time out while processing pending jobs, exiting loop early")
+                break
            blocked_by = self.job_blocked_by(task)
            if blocked_by:
-                self.subsystem_metrics.inc("task_manager_tasks_blocked", 1)
+                self.subsystem_metrics.inc(f"{self.prefix}_tasks_blocked", 1)
                task.log_lifecycle("blocked", blocked_by=blocked_by)
                job_explanation = gettext_noop(f"waiting for {blocked_by._meta.model_name}-{blocked_by.id} to finish")
                if task.job_explanation != job_explanation:
@@ -499,19 +601,14 @@ class TaskManager:
                        tasks_to_update_job_explanation.append(task)
                continue

-            found_acceptable_queue = False
-            preferred_instance_groups = task.preferred_instance_groups
-
            if isinstance(task, WorkflowJob):
-                if task.unified_job_template_id in running_workflow_templates:
-                    if not task.allow_simultaneous:
-                        logger.debug("{} is blocked from running, workflow already running".format(task.log_format))
-                        continue
-                else:
-                    running_workflow_templates.add(task.unified_job_template_id)
+                # Previously we were tracking allow_simultaneous blocking both here and in DependencyGraph.
+                # Double check that using just the DependencyGraph works for Workflows and Sliced Jobs.
                self.start_task(task, None, task.get_jobs_fail_chain(), None)
                continue

+            found_acceptable_queue = False
+
            # Determine if there is control capacity for the task
            if task.capacity_type == 'control':
                control_impact = task.task_impact + settings.AWX_CONTROL_NODE_TASK_IMPACT
@@ -530,8 +627,6 @@ class TaskManager:
            # All task.capacity_type == 'control' jobs should run on control plane, no need to loop over instance groups
            if task.capacity_type == 'control':
                task.execution_node = control_instance.hostname
-                control_instance.consume_capacity(control_impact)
-                self.dependency_graph.add_job(task)
                execution_instance = self.instances[control_instance.hostname].obj
                task.log_lifecycle("controller_node_chosen")
                task.log_lifecycle("execution_node_chosen")
@@ -539,17 +634,12 @@ class TaskManager:
                found_acceptable_queue = True
                continue

-            for instance_group in preferred_instance_groups:
+            for instance_group in self.instance_groups.get_instance_groups_from_task_cache(task):
                if instance_group.is_container_group:
-                    self.dependency_graph.add_job(task)
                    self.start_task(task, instance_group, task.get_jobs_fail_chain(), None)
                    found_acceptable_queue = True
                    break

-                # TODO: remove this after we have confidence that OCP control nodes are reporting node_type=control
-                if settings.IS_K8S and task.capacity_type == 'execution':
-                    logger.debug("Skipping group {}, task cannot run on control plane".format(instance_group.name))
-                    continue
                # at this point we know the instance group is NOT a container group
                # because if it was, it would have started the task and broke out of the loop.
                execution_instance = self.instance_groups.fit_task_to_most_remaining_capacity_instance(
@@ -563,9 +653,7 @@ class TaskManager:
                        control_instance = execution_instance
                        task.controller_node = execution_instance.hostname

-                    control_instance.consume_capacity(settings.AWX_CONTROL_NODE_TASK_IMPACT)
                    task.log_lifecycle("controller_node_chosen")
-                    execution_instance.consume_capacity(task.task_impact)
                    task.log_lifecycle("execution_node_chosen")
                    logger.debug(
                        "Starting {} in group {} instance {} (remaining_capacity={})".format(
@@ -573,7 +661,6 @@ class TaskManager:
                        )
                    )
                    execution_instance = self.instances[execution_instance.hostname].obj
-                    self.dependency_graph.add_job(task)
                    self.start_task(task, instance_group, task.get_jobs_fail_chain(), execution_instance)
                    found_acceptable_queue = True
                    break
@@ -599,25 +686,6 @@ class TaskManager:
                tasks_to_update_job_explanation.append(task)
        logger.debug("{} couldn't be scheduled on graph, waiting for next cycle".format(task.log_format))

-    def timeout_approval_node(self):
-        workflow_approvals = WorkflowApproval.objects.filter(status='pending')
-        now = tz_now()
-        for task in workflow_approvals:
-            approval_timeout_seconds = timedelta(seconds=task.timeout)
-            if task.timeout == 0:
-                continue
-            if (now - task.created) >= approval_timeout_seconds:
-                timeout_message = _("The approval node {name} ({pk}) has expired after {timeout} seconds.").format(
-                    name=task.name, pk=task.pk, timeout=task.timeout
-                )
-                logger.warning(timeout_message)
-                task.timed_out = True
-                task.status = 'failed'
-                task.send_approval_notification('timed_out')
-                task.websocket_emit_status(task.status)
-                task.job_explanation = timeout_message
-                task.save(update_fields=['status', 'job_explanation', 'timed_out'])
-
    def reap_jobs_from_orphaned_instances(self):
        # discover jobs that are in running state but aren't on an execution node
        # that we know about; this is a fairly rare event, but it can occur if you,
@@ -630,92 +698,45 @@ class TaskManager:
                logger.error(f'{j.execution_node} is not a registered instance; reaping {j.log_format}')
                reap_job(j, 'failed')

-    def process_tasks(self, all_sorted_tasks):
-        running_tasks = [t for t in all_sorted_tasks if t.status in ['waiting', 'running']]
+    def process_tasks(self):
+        running_tasks = [t for t in self.all_tasks if t.status in ['waiting', 'running']]
        self.process_running_tasks(running_tasks)
-        self.subsystem_metrics.inc("task_manager_running_processed", len(running_tasks))
+        self.subsystem_metrics.inc(f"{self.prefix}_running_processed", len(running_tasks))

-        pending_tasks = [t for t in all_sorted_tasks if t.status == 'pending']
-
-        undeped_tasks = [t for t in pending_tasks if not t.dependencies_processed]
-        dependencies = self.generate_dependencies(undeped_tasks)
-        deps_of_deps = self.generate_dependencies(dependencies)
-        dependencies += deps_of_deps
-        self.process_pending_tasks(dependencies)
-        self.subsystem_metrics.inc("task_manager_pending_processed", len(dependencies))
+        pending_tasks = [t for t in self.all_tasks if t.status == 'pending']

        self.process_pending_tasks(pending_tasks)
-        self.subsystem_metrics.inc("task_manager_pending_processed", len(pending_tasks))
+        self.subsystem_metrics.inc(f"{self.prefix}_pending_processed", len(pending_tasks))
+
+    def timeout_approval_node(self, task):
+        if self.timed_out():
+            logger.warning("Task manager has reached time out while processing approval nodes, exiting loop early")
+            # Do not process any more workflow approval nodes. Stop here.
+            # Maybe we should schedule another TaskManager run
+            return
+        timeout_message = _("The approval node {name} ({pk}) has expired after {timeout} seconds.").format(name=task.name, pk=task.pk, timeout=task.timeout)
+        logger.warning(timeout_message)
+        task.timed_out = True
+        task.status = 'failed'
+        task.send_approval_notification('timed_out')
+        task.websocket_emit_status(task.status)
+        task.job_explanation = timeout_message
+        task.save(update_fields=['status', 'job_explanation', 'timed_out'])
+
+    def get_expired_workflow_approvals(self):
+        # timeout of 0 indicates that it never expires
+        qs = WorkflowApproval.objects.filter(status='pending').exclude(timeout=0).filter(expires__lt=tz_now())
+        return qs

    @timeit
    def _schedule(self):
-        finished_wfjs = []
-        all_sorted_tasks = self.get_tasks()
+        self.get_tasks(dict(status__in=["pending", "waiting", "running"], dependencies_processed=True))

-        self.after_lock_init(all_sorted_tasks)
+        self.after_lock_init()
+        self.reap_jobs_from_orphaned_instances()

-        if len(all_sorted_tasks) > 0:
-            # TODO: Deal with
-            # latest_project_updates = self.get_latest_project_update_tasks(all_sorted_tasks)
-            # self.process_latest_project_updates(latest_project_updates)
+        if len(self.all_tasks) > 0:
+            self.process_tasks()

-            # latest_inventory_updates = self.get_latest_inventory_update_tasks(all_sorted_tasks)
-            # self.process_latest_inventory_updates(latest_inventory_updates)
-
-            self.all_inventory_sources = self.get_inventory_source_tasks(all_sorted_tasks)
-
-            running_workflow_tasks = self.get_running_workflow_jobs()
-            finished_wfjs = self.process_finished_workflow_jobs(running_workflow_tasks)
-
-            previously_running_workflow_tasks = running_workflow_tasks
-            running_workflow_tasks = []
-            for workflow_job in previously_running_workflow_tasks:
-                if workflow_job.status == 'running':
-                    running_workflow_tasks.append(workflow_job)
-                else:
-                    logger.debug('Removed %s from job spawning consideration.', workflow_job.log_format)
-
-            self.spawn_workflow_graph_jobs(running_workflow_tasks)
-
-            self.timeout_approval_node()
-            self.reap_jobs_from_orphaned_instances()
-
-            self.process_tasks(all_sorted_tasks)
-        return finished_wfjs
-
-    def record_aggregate_metrics(self, *args):
-        if not settings.IS_TESTING():
-            # increment task_manager_schedule_calls regardless if the other
-            # metrics are recorded
-            s_metrics.Metrics(auto_pipe_execute=True).inc("task_manager_schedule_calls", 1)
-            # Only record metrics if the last time recording was more
-            # than SUBSYSTEM_METRICS_TASK_MANAGER_RECORD_INTERVAL ago.
-            # Prevents a short-duration task manager that runs directly after a
-            # long task manager to override useful metrics.
-            current_time = time.time()
-            time_last_recorded = current_time - self.subsystem_metrics.decode("task_manager_recorded_timestamp")
-            if time_last_recorded > settings.SUBSYSTEM_METRICS_TASK_MANAGER_RECORD_INTERVAL:
-                logger.debug(f"recording metrics, last recorded {time_last_recorded} seconds ago")
-                self.subsystem_metrics.set("task_manager_recorded_timestamp", current_time)
-                self.subsystem_metrics.pipe_execute()
-            else:
-                logger.debug(f"skipping recording metrics, last recorded {time_last_recorded} seconds ago")
-
-    def record_aggregate_metrics_and_exit(self, *args):
-        self.record_aggregate_metrics()
-        sys.exit(1)
-
-    def schedule(self):
-        # Lock
-        with advisory_lock('task_manager_lock', wait=False) as acquired:
-            with transaction.atomic():
-                if acquired is False:
-                    logger.debug("Not running scheduler, another task holds lock")
-                    return
-                logger.debug("Starting Scheduler")
-                with task_manager_bulk_reschedule():
-                    # if sigterm due to timeout, still record metrics
-                    signal.signal(signal.SIGTERM, self.record_aggregate_metrics_and_exit)
-                    self._schedule()
-                    self.record_aggregate_metrics()
-                logger.debug("Finishing Scheduler")
+        for workflow_approval in self.get_expired_workflow_approvals():
+            self.timeout_approval_node(workflow_approval)
--- a/awx/main/scheduler/task_manager_models.py
+++ b/awx/main/scheduler/task_manager_models.py
@@ -34,11 +34,13 @@ class TaskManagerInstance:


 class TaskManagerInstances:
-    def __init__(self, active_tasks, instances=None):
+    def __init__(self, active_tasks, instances=None, instance_fields=('node_type', 'capacity', 'hostname', 'enabled')):
        self.instances_by_hostname = dict()
        if instances is None:
            instances = (
-                Instance.objects.filter(hostname__isnull=False, enabled=True).exclude(node_type='hop').only('node_type', 'capacity', 'hostname', 'enabled')
+                Instance.objects.filter(hostname__isnull=False, node_state=Instance.States.READY, enabled=True)
+                .exclude(node_type='hop')
+                .only('node_type', 'node_state', 'capacity', 'hostname', 'enabled')
            )
        for instance in instances:
            self.instances_by_hostname[instance.hostname] = TaskManagerInstance(instance)
@@ -67,6 +69,7 @@ class TaskManagerInstanceGroups:
    def __init__(self, instances_by_hostname=None, instance_groups=None, instance_groups_queryset=None):
        self.instance_groups = dict()
        self.controlplane_ig = None
+        self.pk_ig_map = dict()

        if instance_groups is not None:  # for testing
            self.instance_groups = instance_groups
@@ -81,6 +84,7 @@ class TaskManagerInstanceGroups:
                        instances_by_hostname[instance.hostname] for instance in instance_group.instances.all() if instance.hostname in instances_by_hostname
                    ],
                )
+                self.pk_ig_map[instance_group.pk] = instance_group

    def get_remaining_capacity(self, group_name):
        instances = self.instance_groups[group_name]['instances']
@@ -121,3 +125,17 @@ class TaskManagerInstanceGroups:
                elif i.capacity > largest_instance.capacity:
                    largest_instance = i
        return largest_instance
+
+    def get_instance_groups_from_task_cache(self, task):
+        igs = []
+        if task.preferred_instance_groups_cache:
+            for pk in task.preferred_instance_groups_cache:
+                ig = self.pk_ig_map.get(pk, None)
+                if ig:
+                    igs.append(ig)
+                else:
+                    logger.warning(f"Unknown instance group with pk {pk} for task {task}")
+        if len(igs) == 0:
+            logger.warning(f"No instance groups in cache exist, defaulting to global instance groups for task {task}")
+            return task.global_instance_groups
+        return igs
--- a/awx/main/scheduler/tasks.py
+++ b/awx/main/scheduler/tasks.py
@@ -1,15 +1,35 @@
 # Python
 import logging

+# Django
+from django.conf import settings
+
 # AWX
-from awx.main.scheduler import TaskManager
+from awx import MODE
+from awx.main.scheduler import TaskManager, DependencyManager, WorkflowManager
 from awx.main.dispatch.publish import task
 from awx.main.dispatch import get_local_queuename

 logger = logging.getLogger('awx.main.scheduler')


+def run_manager(manager, prefix):
+    if MODE == 'development' and settings.AWX_DISABLE_TASK_MANAGERS:
+        logger.debug(f"Not running {prefix} manager, AWX_DISABLE_TASK_MANAGERS is True. Trigger with GET to /api/debug/{prefix}_manager/")
+        return
+    manager().schedule()
+
+
@task(queue=get_local_queuename)
-def run_task_manager():
-    logger.debug("Running task manager.")
-    TaskManager().schedule()
+def task_manager():
+    run_manager(TaskManager, "task")
+
+
+@task(queue=get_local_queuename)
+def dependency_manager():
+    run_manager(DependencyManager, "dependency")
+
+
+@task(queue=get_local_queuename)
+def workflow_manager():
+    run_manager(WorkflowManager, "workflow")
--- a/awx/main/signals.py
+++ b/awx/main/signals.py
@@ -409,7 +409,7 @@ def emit_activity_stream_change(instance):
    from awx.api.serializers import ActivityStreamSerializer

    actor = None
-    if instance.actor:
+    if instance.actor_id:
        actor = instance.actor.username
    summary_fields = ActivityStreamSerializer(instance).get_summary_fields(instance)
    analytics_logger.info(
--- a/awx/main/tasks/callback.py
+++ b/awx/main/tasks/callback.py
@@ -6,10 +6,10 @@ import os
 import stat

 # Django
-from django.utils.timezone import now
 from django.conf import settings
 from django_guid import get_guid
 from django.utils.functional import cached_property
+from django.db import connections

 # AWX
 from awx.main.redact import UriCleaner
@@ -174,22 +174,6 @@ class RunnerCallback:

        return False

-    def cancel_callback(self):
-        """
-        Ansible runner callback to tell the job when/if it is canceled
-        """
-        unified_job_id = self.instance.pk
-        self.instance = self.update_model(unified_job_id)
-        if not self.instance:
-            logger.error('unified job {} was deleted while running, canceling'.format(unified_job_id))
-            return True
-        if self.instance.cancel_flag or self.instance.status == 'canceled':
-            cancel_wait = (now() - self.instance.modified).seconds if self.instance.modified else 0
-            if cancel_wait > 5:
-                logger.warning('Request to cancel {} took {} seconds to complete.'.format(self.instance.log_format, cancel_wait))
-            return True
-        return False
-
    def finished_callback(self, runner_obj):
        """
        Ansible runner callback triggered on finished run
@@ -220,6 +204,8 @@ class RunnerCallback:

            with disable_activity_stream():
                self.instance = self.update_model(self.instance.pk, job_args=json.dumps(runner_config.command), job_cwd=runner_config.cwd, job_env=job_env)
+            # We opened a connection just for that save, close it here now
+            connections.close_all()
        elif status_data['status'] == 'failed':
            # For encrypted ssh_key_data, ansible-runner worker will open and write the
            # ssh_key_data to a named pipe. Then, once the podman container starts, ssh-agent will
--- a/awx/main/tasks/jobs.py
+++ b/awx/main/tasks/jobs.py
@@ -1,6 +1,5 @@
 # Python
 from collections import OrderedDict
-from distutils.dir_util import copy_tree
 import errno
 import functools
 import fcntl
@@ -15,11 +14,9 @@ import tempfile
 import traceback
 import time
 import urllib.parse as urlparse
-from uuid import uuid4

 # Django
 from django.conf import settings
-from django.db import transaction


 # Runner
@@ -34,12 +31,12 @@ from gitdb.exc import BadName as BadGitName
 from awx.main.dispatch.publish import task
 from awx.main.dispatch import get_local_queuename
 from awx.main.constants import (
-    ACTIVE_STATES,
    PRIVILEGE_ESCALATION_METHODS,
    STANDARD_INVENTORY_UPDATE_ENV,
    JOB_FOLDER_PREFIX,
    MAX_ISOLATED_PATH_COLON_DELIMITER,
    CONTAINER_VOLUMES_MOUNT_TYPES,
+    ACTIVE_STATES,
 )
 from awx.main.models import (
    Instance,
@@ -64,6 +61,7 @@ from awx.main.tasks.callback import (
    RunnerCallbackForProjectUpdate,
    RunnerCallbackForSystemJob,
 )
+from awx.main.tasks.signals import with_signal_handling, signal_callback
 from awx.main.tasks.receptor import AWXReceptorJob
 from awx.main.exceptions import AwxTaskError, PostRunError, ReceptorNodeNotFound
 from awx.main.utils.ansible import read_ansible_config
@@ -147,7 +145,7 @@ class BaseTask(object):
        """
        Return params structure to be executed by the container runtime
        """
-        if settings.IS_K8S:
+        if settings.IS_K8S and instance.instance_group.is_container_group:
            return {}

        image = instance.execution_environment.image
@@ -212,14 +210,22 @@ class BaseTask(object):
        os.chmod(path, stat.S_IRUSR | stat.S_IWUSR | stat.S_IXUSR)
        if settings.AWX_CLEANUP_PATHS:
            self.cleanup_paths.append(path)
-        # Ansible runner requires that project exists,
-        # and we will write files in the other folders without pre-creating the folder
-        for subfolder in ('project', 'inventory', 'env'):
+        # We will write files in these folders later
+        for subfolder in ('inventory', 'env'):
            runner_subfolder = os.path.join(path, subfolder)
            if not os.path.exists(runner_subfolder):
                os.mkdir(runner_subfolder)
        return path

+    def build_project_dir(self, instance, private_data_dir):
+        """
+        Create the ansible-runner project subdirectory. In many cases this is the source checkout.
+        In cases that do not even need the source checkout, we create an empty dir to be the workdir.
+        """
+        project_dir = os.path.join(private_data_dir, 'project')
+        if not os.path.exists(project_dir):
+            os.mkdir(project_dir)
+
    def build_private_data_files(self, instance, private_data_dir):
        """
        Creates temporary files containing the private data.
@@ -355,12 +361,65 @@ class BaseTask(object):
            expect_passwords[k] = passwords.get(v, '') or ''
        return expect_passwords

+    def release_lock(self, project):
+        try:
+            fcntl.lockf(self.lock_fd, fcntl.LOCK_UN)
+        except IOError as e:
+            logger.error("I/O error({0}) while trying to release lock file [{1}]: {2}".format(e.errno, project.get_lock_file(), e.strerror))
+            os.close(self.lock_fd)
+            raise
+
+        os.close(self.lock_fd)
+        self.lock_fd = None
+
+    def acquire_lock(self, project, unified_job_id=None):
+        if not os.path.exists(settings.PROJECTS_ROOT):
+            os.mkdir(settings.PROJECTS_ROOT)
+
+        lock_path = project.get_lock_file()
+        if lock_path is None:
+            # If from migration or someone blanked local_path for any other reason, recoverable by save
+            project.save()
+            lock_path = project.get_lock_file()
+            if lock_path is None:
+                raise RuntimeError(u'Invalid lock file path')
+
+        try:
+            self.lock_fd = os.open(lock_path, os.O_RDWR | os.O_CREAT)
+        except OSError as e:
+            logger.error("I/O error({0}) while trying to open lock file [{1}]: {2}".format(e.errno, lock_path, e.strerror))
+            raise
+
+        start_time = time.time()
+        while True:
+            try:
+                fcntl.lockf(self.lock_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
+                break
+            except IOError as e:
+                if e.errno not in (errno.EAGAIN, errno.EACCES):
+                    os.close(self.lock_fd)
+                    logger.error("I/O error({0}) while trying to acquire lock on file [{1}]: {2}".format(e.errno, lock_path, e.strerror))
+                    raise
+                else:
+                    time.sleep(1.0)
+            self.instance.refresh_from_db(fields=['cancel_flag'])
+            if self.instance.cancel_flag or signal_callback():
+                logger.debug(f"Unified job {self.instance.id} was canceled while waiting for project file lock")
+                return
+        waiting_time = time.time() - start_time
+
+        if waiting_time > 1.0:
+            logger.info(f'Job {unified_job_id} waited {waiting_time} seconds to acquire lock for local source tree for path {lock_path}.')
+
    def pre_run_hook(self, instance, private_data_dir):
        """
        Hook for any steps to run before the job/task starts
        """
        instance.log_lifecycle("pre_run")

+        # Before task is started, ensure that job_event partitions exist
+        create_partition(instance.event_class._meta.db_table, start=instance.created)
+
    def post_run_hook(self, instance, status):
        """
        Hook for any steps to run before job/task is marked as complete.
@@ -373,15 +432,9 @@ class BaseTask(object):
        """
        instance.log_lifecycle("finalize_run")
        artifact_dir = os.path.join(private_data_dir, 'artifacts', str(self.instance.id))
-        job_profiling_dir = os.path.join(artifact_dir, 'playbook_profiling')
-        awx_profiling_dir = '/var/log/tower/playbook_profiling/'
        collections_info = os.path.join(artifact_dir, 'collections.json')
        ansible_version_file = os.path.join(artifact_dir, 'ansible_version.txt')

-        if not os.path.exists(awx_profiling_dir):
-            os.mkdir(awx_profiling_dir)
-        if os.path.isdir(job_profiling_dir):
-            shutil.copytree(job_profiling_dir, os.path.join(awx_profiling_dir, str(instance.pk)))
        if os.path.exists(collections_info):
            with open(collections_info) as ee_json_info:
                ee_collections_info = json.loads(ee_json_info.read())
@@ -394,11 +447,17 @@ class BaseTask(object):
                instance.save(update_fields=['ansible_version'])

    @with_path_cleanup
+    @with_signal_handling
    def run(self, pk, **kwargs):
        """
        Run the job/task and capture its output.
        """
        self.instance = self.model.objects.get(pk=pk)
+        if self.instance.status != 'canceled' and self.instance.cancel_flag:
+            self.instance = self.update_model(self.instance.pk, start_args='', status='canceled')
+        if self.instance.status not in ACTIVE_STATES:
+            # Prevent starting the job if it has been reaped or handled by another process.
+            raise RuntimeError(f'Not starting {self.instance.status} task pk={pk} because {self.instance.status} is not a valid active state')

        if self.instance.execution_environment_id is None:
            from awx.main.signals import disable_activity_stream
@@ -424,9 +483,11 @@ class BaseTask(object):
            self.instance.send_notification_templates("running")
            private_data_dir = self.build_private_data_dir(self.instance)
            self.pre_run_hook(self.instance, private_data_dir)
+            self.build_project_dir(self.instance, private_data_dir)
            self.instance.log_lifecycle("preparing_playbook")
-            if self.instance.cancel_flag:
+            if self.instance.cancel_flag or signal_callback():
                self.instance = self.update_model(self.instance.pk, status='canceled')
+
            if self.instance.status != 'running':
                # Stop the task chain and prevent starting the job if it has
                # already been canceled.
@@ -529,7 +590,7 @@ class BaseTask(object):
                    event_handler=self.runner_callback.event_handler,
                    finished_callback=self.runner_callback.finished_callback,
                    status_handler=self.runner_callback.status_handler,
-                    cancel_callback=self.runner_callback.cancel_callback,
+                    cancel_callback=signal_callback,
                    **params,
                )
            else:
@@ -547,6 +608,15 @@ class BaseTask(object):
                self.runner_callback.delay_update(skip_if_already_set=True, job_explanation=f"Job terminated due to {status}")
                if status == 'timeout':
                    status = 'failed'
+            elif status == 'canceled':
+                self.instance = self.update_model(pk)
+                cancel_flag_value = getattr(self.instance, 'cancel_flag', False)
+                if (cancel_flag_value is False) and signal_callback():
+                    self.runner_callback.delay_update(skip_if_already_set=True, job_explanation="Task was canceled due to receiving a shutdown signal.")
+                    status = 'failed'
+                elif cancel_flag_value is False:
+                    self.runner_callback.delay_update(skip_if_already_set=True, job_explanation="The running ansible process received a shutdown signal.")
+                    status = 'failed'
        except ReceptorNodeNotFound as exc:
            self.runner_callback.delay_update(job_explanation=str(exc))
        except Exception:
@@ -588,8 +658,141 @@ class BaseTask(object):
                raise AwxTaskError.TaskError(self.instance, rc)


+class SourceControlMixin(BaseTask):
+    """Utility methods for tasks that use content from source control"""
+
+    def get_sync_needs(self, project, scm_branch=None):
+        project_path = project.get_project_path(check_if_exists=False)
+        job_revision = project.scm_revision
+        sync_needs = []
+        source_update_tag = 'update_{}'.format(project.scm_type)
+        branch_override = bool(scm_branch and scm_branch != project.scm_branch)
+        # TODO: skip syncs for inventory updates. Now, UI needs a link added so clients can link to project
+        # source_project is only a field on inventory sources.
+        if isinstance(self.instance, InventoryUpdate):
+            sync_needs.append(source_update_tag)
+        elif not project.scm_type:
+            pass  # manual projects are not synced, user has responsibility for that
+        elif not os.path.exists(project_path):
+            logger.debug(f'Performing fresh clone of {project.id} for unified job {self.instance.id} on this instance.')
+            sync_needs.append(source_update_tag)
+        elif project.scm_type == 'git' and project.scm_revision and (not branch_override):
+            try:
+                git_repo = git.Repo(project_path)
+
+                if job_revision == git_repo.head.commit.hexsha:
+                    logger.debug(f'Skipping project sync for {self.instance.id} because commit is locally available')
+                else:
+                    sync_needs.append(source_update_tag)
+            except (ValueError, BadGitName, git.exc.InvalidGitRepositoryError):
+                logger.debug(f'Needed commit for {self.instance.id} not in local source tree, will sync with remote')
+                sync_needs.append(source_update_tag)
+        else:
+            logger.debug(f'Project not available locally, {self.instance.id} will sync with remote')
+            sync_needs.append(source_update_tag)
+
+        has_cache = os.path.exists(os.path.join(project.get_cache_path(), project.cache_id))
+        # Galaxy requirements are not supported for manual projects
+        if project.scm_type and ((not has_cache) or branch_override):
+            sync_needs.extend(['install_roles', 'install_collections'])
+
+        return sync_needs
+
+    def spawn_project_sync(self, project, sync_needs, scm_branch=None):
+        pu_ig = self.instance.instance_group
+        pu_en = Instance.objects.me().hostname
+
+        sync_metafields = dict(
+            launch_type="sync",
+            job_type='run',
+            job_tags=','.join(sync_needs),
+            status='running',
+            instance_group=pu_ig,
+            execution_node=pu_en,
+            controller_node=pu_en,
+            celery_task_id=self.instance.celery_task_id,
+        )
+        if scm_branch and scm_branch != project.scm_branch:
+            sync_metafields['scm_branch'] = scm_branch
+            sync_metafields['scm_clean'] = True  # to accommodate force pushes
+        if 'update_' not in sync_metafields['job_tags']:
+            sync_metafields['scm_revision'] = project.scm_revision
+        local_project_sync = project.create_project_update(_eager_fields=sync_metafields)
+        local_project_sync.log_lifecycle("controller_node_chosen")
+        local_project_sync.log_lifecycle("execution_node_chosen")
+        return local_project_sync
+
+    def sync_and_copy_without_lock(self, project, private_data_dir, scm_branch=None):
+        sync_needs = self.get_sync_needs(project, scm_branch=scm_branch)
+
+        if sync_needs:
+            local_project_sync = self.spawn_project_sync(project, sync_needs, scm_branch=scm_branch)
+            # save the associated job before calling run() so that a
+            # cancel() call on the job can cancel the project update
+            if isinstance(self.instance, Job):
+                self.instance = self.update_model(self.instance.pk, project_update=local_project_sync)
+            else:
+                self.instance = self.update_model(self.instance.pk, source_project_update=local_project_sync)
+
+            try:
+                # the job private_data_dir is passed so sync can download roles and collections there
+                sync_task = RunProjectUpdate(job_private_data_dir=private_data_dir)
+                sync_task.run(local_project_sync.id)
+                local_project_sync.refresh_from_db()
+                self.instance = self.update_model(self.instance.pk, scm_revision=local_project_sync.scm_revision)
+            except Exception:
+                local_project_sync.refresh_from_db()
+                if local_project_sync.status != 'canceled':
+                    self.instance = self.update_model(
+                        self.instance.pk,
+                        status='failed',
+                        job_explanation=(
+                            'Previous Task Failed: {"job_type": "project_update", '
+                            f'"job_name": "{local_project_sync.name}", "job_id": "{local_project_sync.id}"}}'
+                        ),
+                    )
+                    raise
+                self.instance.refresh_from_db()
+                if self.instance.cancel_flag:
+                    return
+        else:
+            # Case where a local sync is not needed, meaning that local tree is
+            # up-to-date with project, job is running project current version
+            self.instance = self.update_model(self.instance.pk, scm_revision=project.scm_revision)
+            # Project update does not copy the folder, so copy here
+            RunProjectUpdate.make_local_copy(project, private_data_dir)
+
+    def sync_and_copy(self, project, private_data_dir, scm_branch=None):
+        self.acquire_lock(project, self.instance.id)
+
+        try:
+            original_branch = None
+            project_path = project.get_project_path(check_if_exists=False)
+            if project.scm_type == 'git' and (scm_branch and scm_branch != project.scm_branch):
+                if os.path.exists(project_path):
+                    git_repo = git.Repo(project_path)
+                    if git_repo.head.is_detached:
+                        original_branch = git_repo.head.commit
+                    else:
+                        original_branch = git_repo.active_branch
+
+            return self.sync_and_copy_without_lock(project, private_data_dir, scm_branch=scm_branch)
+        finally:
+            # We have made the copy so we can set the tree back to its normal state
+            if original_branch:
+                # for git project syncs, non-default branches can cause problems;
+                # restore the branch the repo was on before this run
+                try:
+                    original_branch.checkout()
+                except Exception:
+                    # this could have failed due to dirty tree, but difficult to predict all cases
+                    logger.exception(f'Failed to restore project repo to prior state after {self.instance.id}')
+
+            self.release_lock(project)
+
+
@task(queue=get_local_queuename)
-class RunJob(BaseTask):
+class RunJob(SourceControlMixin, BaseTask):
    """
    Run a job using ansible-playbook.
    """
@@ -858,98 +1061,14 @@ class RunJob(BaseTask):
            job = self.update_model(job.pk, status='failed', job_explanation=msg)
            raise RuntimeError(msg)

-        project_path = job.project.get_project_path(check_if_exists=False)
-        job_revision = job.project.scm_revision
-        sync_needs = []
-        source_update_tag = 'update_{}'.format(job.project.scm_type)
-        branch_override = bool(job.scm_branch and job.scm_branch != job.project.scm_branch)
-        if not job.project.scm_type:
-            pass  # manual projects are not synced, user has responsibility for that
-        elif not os.path.exists(project_path):
-            logger.debug('Performing fresh clone of {} on this instance.'.format(job.project))
-            sync_needs.append(source_update_tag)
-        elif job.project.scm_type == 'git' and job.project.scm_revision and (not branch_override):
-            try:
-                git_repo = git.Repo(project_path)
-
-                if job_revision == git_repo.head.commit.hexsha:
-                    logger.debug('Skipping project sync for {} because commit is locally available'.format(job.log_format))
-                else:
-                    sync_needs.append(source_update_tag)
-            except (ValueError, BadGitName, git.exc.InvalidGitRepositoryError):
-                logger.debug('Needed commit for {} not in local source tree, will sync with remote'.format(job.log_format))
-                sync_needs.append(source_update_tag)
-        else:
-            logger.debug('Project not available locally, {} will sync with remote'.format(job.log_format))
-            sync_needs.append(source_update_tag)
-
-        has_cache = os.path.exists(os.path.join(job.project.get_cache_path(), job.project.cache_id))
-        # Galaxy requirements are not supported for manual projects
-        if job.project.scm_type and ((not has_cache) or branch_override):
-            sync_needs.extend(['install_roles', 'install_collections'])
-
-        if sync_needs:
-            pu_ig = job.instance_group
-            pu_en = Instance.objects.me().hostname
-
-            sync_metafields = dict(
-                launch_type="sync",
-                job_type='run',
-                job_tags=','.join(sync_needs),
-                status='running',
-                instance_group=pu_ig,
-                execution_node=pu_en,
-                controller_node=pu_en,
-                celery_task_id=job.celery_task_id,
-            )
-            if branch_override:
-                sync_metafields['scm_branch'] = job.scm_branch
-                sync_metafields['scm_clean'] = True  # to accomidate force pushes
-            if 'update_' not in sync_metafields['job_tags']:
-                sync_metafields['scm_revision'] = job_revision
-            local_project_sync = job.project.create_project_update(_eager_fields=sync_metafields)
-            local_project_sync.log_lifecycle("controller_node_chosen")
-            local_project_sync.log_lifecycle("execution_node_chosen")
-            create_partition(local_project_sync.event_class._meta.db_table, start=local_project_sync.created)
-            # save the associated job before calling run() so that a
-            # cancel() call on the job can cancel the project update
-            job = self.update_model(job.pk, project_update=local_project_sync)
-
-            project_update_task = local_project_sync._get_task_class()
-            try:
-                # the job private_data_dir is passed so sync can download roles and collections there
-                sync_task = project_update_task(job_private_data_dir=private_data_dir)
-                sync_task.run(local_project_sync.id)
-                local_project_sync.refresh_from_db()
-                job = self.update_model(job.pk, scm_revision=local_project_sync.scm_revision)
-            except Exception:
-                local_project_sync.refresh_from_db()
-                if local_project_sync.status != 'canceled':
-                    job = self.update_model(
-                        job.pk,
-                        status='failed',
-                        job_explanation=(
-                            'Previous Task Failed: {"job_type": "%s", "job_name": "%s", "job_id": "%s"}'
-                            % ('project_update', local_project_sync.name, local_project_sync.id)
-                        ),
-                    )
-                    raise
-                job.refresh_from_db()
-                if job.cancel_flag:
-                    return
-        else:
-            # Case where a local sync is not needed, meaning that local tree is
-            # up-to-date with project, job is running project current version
-            if job_revision:
-                job = self.update_model(job.pk, scm_revision=job_revision)
-            # Project update does not copy the folder, so copy here
-            RunProjectUpdate.make_local_copy(job.project, private_data_dir, scm_revision=job_revision)
-
        if job.inventory.kind == 'smart':
            # cache smart inventory memberships so that the host_filter query is not
            # ran inside of the event saving code
            update_smart_memberships_for_inventory(job.inventory)

+    def build_project_dir(self, job, private_data_dir):
+        self.sync_and_copy(job.project, private_data_dir, scm_branch=job.scm_branch)
+
    def final_run_hook(self, job, status, private_data_dir, fact_modification_times):
        super(RunJob, self).final_run_hook(job, status, private_data_dir, fact_modification_times)
        if not private_data_dir:
@@ -981,7 +1100,6 @@ class RunProjectUpdate(BaseTask):

    def __init__(self, *args, job_private_data_dir=None, **kwargs):
        super(RunProjectUpdate, self).__init__(*args, **kwargs)
-        self.original_branch = None
        self.job_private_data_dir = job_private_data_dir

    def build_private_data(self, project_update, private_data_dir):
@@ -1151,6 +1269,10 @@ class RunProjectUpdate(BaseTask):
            # for raw archive, prevent error moving files between volumes
            extra_vars['ansible_remote_tmp'] = os.path.join(project_update.get_project_path(check_if_exists=False), '.ansible_awx', 'tmp')

+        if project_update.project.signature_validation_credential is not None:
+            pubkey = project_update.project.signature_validation_credential.get_input('gpg_public_key')
+            extra_vars['gpg_pubkey'] = pubkey
+
        self._write_extra_vars_file(private_data_dir, extra_vars)

    def build_playbook_path_relative_to_cwd(self, project_update, private_data_dir):
@@ -1168,132 +1290,13 @@ class RunProjectUpdate(BaseTask):
        d[r'^Are you sure you want to continue connecting \(yes/no\)\?\s*?$'] = 'yes'
        return d

-    def _update_dependent_inventories(self, project_update, dependent_inventory_sources):
-        scm_revision = project_update.project.scm_revision
-        inv_update_class = InventoryUpdate._get_task_class()
-        for inv_src in dependent_inventory_sources:
-            if not inv_src.update_on_project_update:
-                continue
-            if inv_src.scm_last_revision == scm_revision:
-                logger.debug('Skipping SCM inventory update for `{}` because ' 'project has not changed.'.format(inv_src.name))
-                continue
-            logger.debug('Local dependent inventory update for `{}`.'.format(inv_src.name))
-            with transaction.atomic():
-                if InventoryUpdate.objects.filter(inventory_source=inv_src, status__in=ACTIVE_STATES).exists():
-                    logger.debug('Skipping SCM inventory update for `{}` because ' 'another update is already active.'.format(inv_src.name))
-                    continue
-
-                if settings.IS_K8S:
-                    instance_group = InventoryUpdate(inventory_source=inv_src).preferred_instance_groups[0]
-                else:
-                    instance_group = project_update.instance_group
-
-                local_inv_update = inv_src.create_inventory_update(
-                    _eager_fields=dict(
-                        launch_type='scm',
-                        status='running',
-                        instance_group=instance_group,
-                        execution_node=project_update.execution_node,
-                        controller_node=project_update.execution_node,
-                        source_project_update=project_update,
-                        celery_task_id=project_update.celery_task_id,
-                    )
-                )
-                local_inv_update.log_lifecycle("controller_node_chosen")
-                local_inv_update.log_lifecycle("execution_node_chosen")
-            try:
-                create_partition(local_inv_update.event_class._meta.db_table, start=local_inv_update.created)
-                inv_update_class().run(local_inv_update.id)
-            except Exception:
-                logger.exception('{} Unhandled exception updating dependent SCM inventory sources.'.format(project_update.log_format))
-
-            try:
-                project_update.refresh_from_db()
-            except ProjectUpdate.DoesNotExist:
-                logger.warning('Project update deleted during updates of dependent SCM inventory sources.')
-                break
-            try:
-                local_inv_update.refresh_from_db()
-            except InventoryUpdate.DoesNotExist:
-                logger.warning('%s Dependent inventory update deleted during execution.', project_update.log_format)
-                continue
-            if project_update.cancel_flag:
-                logger.info('Project update {} was canceled while updating dependent inventories.'.format(project_update.log_format))
-                break
-            if local_inv_update.cancel_flag:
-                logger.info('Continuing to process project dependencies after {} was canceled'.format(local_inv_update.log_format))
-            if local_inv_update.status == 'successful':
-                inv_src.scm_last_revision = scm_revision
-                inv_src.save(update_fields=['scm_last_revision'])
-
-    def release_lock(self, instance):
-        try:
-            fcntl.lockf(self.lock_fd, fcntl.LOCK_UN)
-        except IOError as e:
-            logger.error("I/O error({0}) while trying to release lock file [{1}]: {2}".format(e.errno, instance.get_lock_file(), e.strerror))
-            os.close(self.lock_fd)
-            raise
-
-        os.close(self.lock_fd)
-        self.lock_fd = None
-
-    '''
-    Note: We don't support blocking=False
-    '''
-
-    def acquire_lock(self, instance, blocking=True):
-        lock_path = instance.get_lock_file()
-        if lock_path is None:
-            # If from migration or someone blanked local_path for any other reason, recoverable by save
-            instance.save()
-            lock_path = instance.get_lock_file()
-            if lock_path is None:
-                raise RuntimeError(u'Invalid lock file path')
-
-        try:
-            self.lock_fd = os.open(lock_path, os.O_RDWR | os.O_CREAT)
-        except OSError as e:
-            logger.error("I/O error({0}) while trying to open lock file [{1}]: {2}".format(e.errno, lock_path, e.strerror))
-            raise
-
-        start_time = time.time()
-        while True:
-            try:
-                instance.refresh_from_db(fields=['cancel_flag'])
-                if instance.cancel_flag:
-                    logger.debug("ProjectUpdate({0}) was canceled".format(instance.pk))
-                    return
-                fcntl.lockf(self.lock_fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
-                break
-            except IOError as e:
-                if e.errno not in (errno.EAGAIN, errno.EACCES):
-                    os.close(self.lock_fd)
-                    logger.error("I/O error({0}) while trying to aquire lock on file [{1}]: {2}".format(e.errno, lock_path, e.strerror))
-                    raise
-                else:
-                    time.sleep(1.0)
-        waiting_time = time.time() - start_time
-
-        if waiting_time > 1.0:
-            logger.info('{} spent {} waiting to acquire lock for local source tree ' 'for path {}.'.format(instance.log_format, waiting_time, lock_path))
-
    def pre_run_hook(self, instance, private_data_dir):
        super(RunProjectUpdate, self).pre_run_hook(instance, private_data_dir)
        # re-create root project folder if a natural disaster has destroyed it
-        if not os.path.exists(settings.PROJECTS_ROOT):
-            os.mkdir(settings.PROJECTS_ROOT)
        project_path = instance.project.get_project_path(check_if_exists=False)

-        self.acquire_lock(instance)
-
-        self.original_branch = None
-        if instance.scm_type == 'git' and instance.branch_override:
-            if os.path.exists(project_path):
-                git_repo = git.Repo(project_path)
-                if git_repo.head.is_detached:
-                    self.original_branch = git_repo.head.commit
-                else:
-                    self.original_branch = git_repo.active_branch
+        if instance.launch_type != 'sync':
+            self.acquire_lock(instance.project, instance.id)

        if not os.path.exists(project_path):
            os.makedirs(project_path)  # used as container mount
@@ -1304,11 +1307,12 @@ class RunProjectUpdate(BaseTask):
            shutil.rmtree(stage_path)
        os.makedirs(stage_path)  # presence of empty cache indicates lack of roles or collections

+    def build_project_dir(self, instance, private_data_dir):
        # the project update playbook is not in a git repo, but uses a vendoring directory
        # to be consistent with the ansible-runner model,
        # that is moved into the runner project folder here
        awx_playbooks = self.get_path_to('../../', 'playbooks')
-        copy_tree(awx_playbooks, os.path.join(private_data_dir, 'project'))
+        shutil.copytree(awx_playbooks, os.path.join(private_data_dir, 'project'))

    @staticmethod
    def clear_project_cache(cache_dir, keep_value):
@@ -1325,50 +1329,18 @@ class RunProjectUpdate(BaseTask):
                        logger.warning(f"Could not remove cache directory {old_path}")

    @staticmethod
-    def make_local_copy(p, job_private_data_dir, scm_revision=None):
+    def make_local_copy(project, job_private_data_dir):
        """Copy project content (roles and collections) to a job private_data_dir

-        :param object p: Either a project or a project update
+        :param object project: Either a project or a project update
        :param str job_private_data_dir: The root of the target ansible-runner folder
-        :param str scm_revision: For branch_override cases, the git revision to copy
        """
-        project_path = p.get_project_path(check_if_exists=False)
+        project_path = project.get_project_path(check_if_exists=False)
        destination_folder = os.path.join(job_private_data_dir, 'project')
-        if not scm_revision:
-            scm_revision = p.scm_revision
-
-        if p.scm_type == 'git':
-            git_repo = git.Repo(project_path)
-            if not os.path.exists(destination_folder):
-                os.mkdir(destination_folder, stat.S_IREAD | stat.S_IWRITE | stat.S_IEXEC)
-            tmp_branch_name = 'awx_internal/{}'.format(uuid4())
-            # always clone based on specific job revision
-            if not p.scm_revision:
-                raise RuntimeError('Unexpectedly could not determine a revision to run from project.')
-            source_branch = git_repo.create_head(tmp_branch_name, p.scm_revision)
-            # git clone must take file:// syntax for source repo or else options like depth will be ignored
-            source_as_uri = Path(project_path).as_uri()
-            git.Repo.clone_from(
-                source_as_uri,
-                destination_folder,
-                branch=source_branch,
-                depth=1,
-                single_branch=True,  # shallow, do not copy full history
-            )
-            # submodules copied in loop because shallow copies from local HEADs are ideal
-            # and no git clone submodule options are compatible with minimum requirements
-            for submodule in git_repo.submodules:
-                subrepo_path = os.path.abspath(os.path.join(project_path, submodule.path))
-                subrepo_destination_folder = os.path.abspath(os.path.join(destination_folder, submodule.path))
-                subrepo_uri = Path(subrepo_path).as_uri()
-                git.Repo.clone_from(subrepo_uri, subrepo_destination_folder, depth=1, single_branch=True)
-            # force option is necessary because remote refs are not counted, although no information is lost
-            git_repo.delete_head(tmp_branch_name, force=True)
-        else:
-            copy_tree(project_path, destination_folder, preserve_symlinks=1)
+        shutil.copytree(project_path, destination_folder, ignore=shutil.ignore_patterns('.git'), symlinks=True)

        # copy over the roles and collection cache to job folder
-        cache_path = os.path.join(p.get_cache_path(), p.cache_id)
+        cache_path = os.path.join(project.get_cache_path(), project.cache_id)
        subfolders = []
        if settings.AWX_COLLECTIONS_ENABLED:
            subfolders.append('requirements_collections')
@@ -1378,8 +1350,8 @@ class RunProjectUpdate(BaseTask):
            cache_subpath = os.path.join(cache_path, subfolder)
            if os.path.exists(cache_subpath):
                dest_subpath = os.path.join(job_private_data_dir, subfolder)
-                copy_tree(cache_subpath, dest_subpath, preserve_symlinks=1)
-                logger.debug('{0} {1} prepared {2} from cache'.format(type(p).__name__, p.pk, dest_subpath))
+                shutil.copytree(cache_subpath, dest_subpath, symlinks=True)
+                logger.debug('{0} {1} prepared {2} from cache'.format(type(project).__name__, project.pk, dest_subpath))

    def post_run_hook(self, instance, status):
        super(RunProjectUpdate, self).post_run_hook(instance, status)
@@ -1409,23 +1381,13 @@ class RunProjectUpdate(BaseTask):
            if self.job_private_data_dir:
                if status == 'successful':
                    # copy project folder before resetting to default branch
-                    # because some git-tree-specific resources (like submodules) might matter
                    self.make_local_copy(instance, self.job_private_data_dir)
-                if self.original_branch:
-                    # for git project syncs, non-default branches can be problems
-                    # restore to branch the repo was on before this run
-                    try:
-                        self.original_branch.checkout()
-                    except Exception:
-                        # this could have failed due to dirty tree, but difficult to predict all cases
-                        logger.exception('Failed to restore project repo to prior state after {}'.format(instance.log_format))
        finally:
-            self.release_lock(instance)
+            if instance.launch_type != 'sync':
+                self.release_lock(instance.project)
+
        p = instance.project
-        if instance.job_type == 'check' and status not in (
-            'failed',
-            'canceled',
-        ):
+        if instance.job_type == 'check' and status not in ('failed', 'canceled'):
            if self.runner_callback.playbook_new_revision:
                p.scm_revision = self.runner_callback.playbook_new_revision
            else:
@@ -1435,12 +1397,6 @@ class RunProjectUpdate(BaseTask):
            p.inventory_files = p.inventories
            p.save(update_fields=['scm_revision', 'playbook_files', 'inventory_files'])

-        # Update any inventories that depend on this project
-        dependent_inventory_sources = p.scm_inventory_sources.filter(update_on_project_update=True)
-        if len(dependent_inventory_sources) > 0:
-            if status == 'successful' and instance.launch_type != 'sync':
-                self._update_dependent_inventories(instance, dependent_inventory_sources)
-
    def build_execution_environment_params(self, instance, private_data_dir):
        if settings.IS_K8S:
            return {}
@@ -1459,7 +1415,7 @@ class RunProjectUpdate(BaseTask):


@task(queue=get_local_queuename)
-class RunInventoryUpdate(BaseTask):
+class RunInventoryUpdate(SourceControlMixin, BaseTask):

    model = InventoryUpdate
    event_model = InventoryUpdateEvent
@@ -1615,61 +1571,18 @@ class RunInventoryUpdate(BaseTask):
        # All credentials not used by inventory source injector
        return inventory_update.get_extra_credentials()

-    def pre_run_hook(self, inventory_update, private_data_dir):
-        super(RunInventoryUpdate, self).pre_run_hook(inventory_update, private_data_dir)
+    def build_project_dir(self, inventory_update, private_data_dir):
        source_project = None
        if inventory_update.inventory_source:
            source_project = inventory_update.inventory_source.source_project
-        if (
-            inventory_update.source == 'scm' and inventory_update.launch_type != 'scm' and source_project and source_project.scm_type
-        ):  # never ever update manual projects

-            # Check if the content cache exists, so that we do not unnecessarily re-download roles
-            sync_needs = ['update_{}'.format(source_project.scm_type)]
-            has_cache = os.path.exists(os.path.join(source_project.get_cache_path(), source_project.cache_id))
-            # Galaxy requirements are not supported for manual projects
-            if not has_cache:
-                sync_needs.extend(['install_roles', 'install_collections'])
-
-            local_project_sync = source_project.create_project_update(
-                _eager_fields=dict(
-                    launch_type="sync",
-                    job_type='run',
-                    job_tags=','.join(sync_needs),
-                    status='running',
-                    execution_node=Instance.objects.me().hostname,
-                    controller_node=Instance.objects.me().hostname,
-                    instance_group=inventory_update.instance_group,
-                    celery_task_id=inventory_update.celery_task_id,
-                )
-            )
-            local_project_sync.log_lifecycle("controller_node_chosen")
-            local_project_sync.log_lifecycle("execution_node_chosen")
-            create_partition(local_project_sync.event_class._meta.db_table, start=local_project_sync.created)
-            # associate the inventory update before calling run() so that a
-            # cancel() call on the inventory update can cancel the project update
-            local_project_sync.scm_inventory_updates.add(inventory_update)
-
-            project_update_task = local_project_sync._get_task_class()
-            try:
-                sync_task = project_update_task(job_private_data_dir=private_data_dir)
-                sync_task.run(local_project_sync.id)
-                local_project_sync.refresh_from_db()
-                inventory_update.inventory_source.scm_last_revision = local_project_sync.scm_revision
-                inventory_update.inventory_source.save(update_fields=['scm_last_revision'])
-            except Exception:
-                inventory_update = self.update_model(
-                    inventory_update.pk,
-                    status='failed',
-                    job_explanation=(
-                        'Previous Task Failed: {"job_type": "%s", "job_name": "%s", "job_id": "%s"}'
-                        % ('project_update', local_project_sync.name, local_project_sync.id)
-                    ),
-                )
-                raise
-        elif inventory_update.source == 'scm' and inventory_update.launch_type == 'scm' and source_project:
-            # This follows update, not sync, so make copy here
-            RunProjectUpdate.make_local_copy(source_project, private_data_dir)
+        if inventory_update.source == 'scm':
+            if not source_project:
+                raise RuntimeError('Could not find project to run SCM inventory update from.')
+            self.sync_and_copy(source_project, private_data_dir)
+        else:
+            # If source is not SCM, make an empty project directory; content is built inside the inventory folder
+            super(RunInventoryUpdate, self).build_project_dir(inventory_update, private_data_dir)

    def post_run_hook(self, inventory_update, status):
        super(RunInventoryUpdate, self).post_run_hook(inventory_update, status)
@@ -1712,7 +1625,7 @@ class RunInventoryUpdate(BaseTask):

        handler = SpecialInventoryHandler(
            self.runner_callback.event_handler,
-            self.runner_callback.cancel_callback,
+            signal_callback,
            verbosity=inventory_update.verbosity,
            job_timeout=self.get_instance_timeout(self.instance),
            start_time=inventory_update.started,
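Note on the copy_tree to shutil.copytree change in make_local_copy: the following is a minimal sketch of what the new call does, run against a throwaway directory. The file names (playbook.yml, link.yml) and the helper copy_without_git are made up for illustration, and the sketch assumes a POSIX filesystem where symlinks can be created. One behavioral difference worth keeping in mind: unlike distutils' copy_tree, shutil.copytree raises FileExistsError when the destination already exists, unless dirs_exist_ok=True is passed (Python 3.8+).

```python
import os
import shutil
import tempfile


def copy_without_git(src, dest):
    """Copy src to dest, skipping any .git entry and keeping symlinks as symlinks."""
    shutil.copytree(src, dest, ignore=shutil.ignore_patterns('.git'), symlinks=True)


def demo():
    root = tempfile.mkdtemp()
    try:
        # Build a fake project checkout: a .git folder, one file, one symlink.
        src = os.path.join(root, 'src')
        os.makedirs(os.path.join(src, '.git'))
        with open(os.path.join(src, '.git', 'config'), 'w') as f:
            f.write('[core]\n')
        with open(os.path.join(src, 'playbook.yml'), 'w') as f:
            f.write('- hosts: all\n')
        os.symlink('playbook.yml', os.path.join(src, 'link.yml'))

        dest = os.path.join(root, 'project')  # must not exist before the copy
        copy_without_git(src, dest)
        copied = sorted(os.listdir(dest))
        link_kept = os.path.islink(os.path.join(dest, 'link.yml'))
        return copied, link_kept
    finally:
        shutil.rmtree(root)
```

The .git folder is left behind and the symlink is copied as a symlink rather than being followed.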
--- a/awx/main/tasks/receptor.py
+++ b/awx/main/tasks/receptor.py
@@ -12,6 +12,7 @@ import yaml

 # Django
 from django.conf import settings
+from django.db import connections

 # Runner
 import ansible_runner
@@ -25,12 +26,19 @@ from awx.main.utils.common import (
    cleanup_new_process,
 )
 from awx.main.constants import MAX_ISOLATED_PATH_COLON_DELIMITER
+from awx.main.tasks.signals import signal_state, signal_callback, SignalExit
+from awx.main.models import Instance, InstanceLink, UnifiedJob
+from awx.main.dispatch import get_local_queuename
+from awx.main.dispatch.publish import task

 # Receptorctl
 from receptorctl.socket_interface import ReceptorControl

+from filelock import FileLock
+
 logger = logging.getLogger('awx.main.tasks.receptor')
 __RECEPTOR_CONF = '/etc/receptor/receptor.conf'
+__RECEPTOR_CONF_LOCKFILE = f'{__RECEPTOR_CONF}.lock'
 RECEPTOR_ACTIVE_STATES = ('Pending', 'Running')


@@ -40,9 +48,22 @@ class ReceptorConnectionType(Enum):
    STREAMTLS = 2


+def read_receptor_config():
+    # for K8S deployments, getting a lock is necessary as another process
+    # may be re-writing the config at this time
+    if settings.IS_K8S:
+        lock = FileLock(__RECEPTOR_CONF_LOCKFILE)
+        with lock:
+            with open(__RECEPTOR_CONF, 'r') as f:
+                return yaml.safe_load(f)
+    else:
+        with open(__RECEPTOR_CONF, 'r') as f:
+            return yaml.safe_load(f)
+
+
 def get_receptor_sockfile():
-    with open(__RECEPTOR_CONF, 'r') as f:
-        data = yaml.safe_load(f)
+    data = read_receptor_config()
+
    for section in data:
        for entry_name, entry_data in section.items():
            if entry_name == 'control-service':
@@ -58,8 +79,7 @@ def get_tls_client(use_stream_tls=None):
    if not use_stream_tls:
        return None

-    with open(__RECEPTOR_CONF, 'r') as f:
-        data = yaml.safe_load(f)
+    data = read_receptor_config()
    for section in data:
        for entry_name, entry_data in section.items():
            if entry_name == 'tls-client':
@@ -76,12 +96,25 @@ def get_receptor_ctl():
        return ReceptorControl(receptor_sockfile)


+def find_node_in_mesh(node_name, receptor_ctl):
+    attempts = 10
+    backoff = 1
+    for attempt in range(attempts):
+        all_nodes = receptor_ctl.simple_command("status").get('Advertisements') or []
+        for node in all_nodes:
+            if node.get('NodeID') == node_name:
+                return node
+        else:
+            logger.warning(f"Instance {node_name} is not in the receptor mesh. {attempts-attempt} attempts left.")
+            time.sleep(backoff)
+            backoff += 1
+    else:
+        raise ReceptorNodeNotFound(f'Instance {node_name} is not in the receptor mesh')
+
+
 def get_conn_type(node_name, receptor_ctl):
-    all_nodes = receptor_ctl.simple_command("status").get('Advertisements', None)
-    for node in all_nodes:
-        if node.get('NodeID') == node_name:
-            return ReceptorConnectionType(node.get('ConnType'))
-    raise ReceptorNodeNotFound(f'Instance {node_name} is not in the receptor mesh')
+    node = find_node_in_mesh(node_name, receptor_ctl)
+    return ReceptorConnectionType(node.get('ConnType'))


 def administrative_workunit_reaper(work_list=None):
@@ -99,16 +132,22 @@ def administrative_workunit_reaper(work_list=None):

    for unit_id, work_data in work_list.items():
        extra_data = work_data.get('ExtraData')
-        if (extra_data is None) or (extra_data.get('RemoteWorkType') != 'ansible-runner'):
+        if extra_data is None:
            continue  # if this is not ansible-runner work, we do not want to touch it
-        params = extra_data.get('RemoteParams', {}).get('params')
-        if not params:
-            continue
-        if not (params == '--worker-info' or params.startswith('cleanup')):
-            continue  # if this is not a cleanup or health check, we do not want to touch it
-        if work_data.get('StateName') in RECEPTOR_ACTIVE_STATES:
-            continue  # do not want to touch active work units
-        logger.info(f'Reaping orphaned work unit {unit_id} with params {params}')
+        if isinstance(extra_data, str):
+            if not work_data.get('StateName', None) or work_data.get('StateName') in RECEPTOR_ACTIVE_STATES:
+                continue
+        else:
+            if extra_data.get('RemoteWorkType') != 'ansible-runner':
+                continue
+            params = extra_data.get('RemoteParams', {}).get('params')
+            if not params:
+                continue
+            if not (params == '--worker-info' or params.startswith('cleanup')):
+                continue  # if this is not a cleanup or health check, we do not want to touch it
+            if work_data.get('StateName') in RECEPTOR_ACTIVE_STATES:
+                continue  # do not want to touch active work units
+            logger.info(f'Reaping orphaned work unit {unit_id} with params {params}')
        receptor_ctl.simple_command(f"work release {unit_id}")


@@ -128,8 +167,7 @@ def run_until_complete(node, timing_data=None, **kwargs):
    kwargs.setdefault('payload', '')

    transmit_start = time.time()
-    sign_work = False if settings.IS_K8S else True
-    result = receptor_ctl.submit_work(worktype='ansible-runner', node=node, signwork=sign_work, **kwargs)
+    result = receptor_ctl.submit_work(worktype='ansible-runner', node=node, signwork=True, **kwargs)

    unit_id = result['unitid']
    run_start = time.time()
@@ -204,7 +242,7 @@ def worker_info(node_name, work_type='ansible-runner'):
        else:
            error_list.append(details)

-    except (ReceptorNodeNotFound, RuntimeError) as exc:
+    except Exception as exc:
        error_list.append(str(exc))

    # If we have a connection error, missing keys would be trivial consequence of that
@@ -275,10 +313,6 @@ class AWXReceptorJob:
                except Exception:
                    logger.exception(f"Error releasing work unit {self.unit_id}.")

-    @property
-    def sign_work(self):
-        return False if settings.IS_K8S else True
-
    def _run_internal(self, receptor_ctl):
        # Create a socketpair. Where the left side will be used for writing our payload
        # (private data dir, kwargs). The right side will be passed to Receptor for
@@ -329,24 +363,32 @@ class AWXReceptorJob:
            shutil.rmtree(artifact_dir)

        resultsock, resultfile = receptor_ctl.get_work_results(self.unit_id, return_socket=True, return_sockfile=True)
-        # Both "processor" and "cancel_watcher" are spawned in separate threads.
-        # We wait for the first one to return. If cancel_watcher returns first,
-        # we yank the socket out from underneath the processor, which will cause it
-        # to exit. A reference to the processor_future is passed into the cancel_watcher_future,
-        # Which exits if the job has finished normally. The context manager ensures we do not
-        # leave any threads laying around.
-        with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
-            processor_future = executor.submit(self.processor, resultfile)
-            cancel_watcher_future = executor.submit(self.cancel_watcher, processor_future)
-            futures = [processor_future, cancel_watcher_future]
-            first_future = concurrent.futures.wait(futures, return_when=concurrent.futures.FIRST_COMPLETED)

-            res = list(first_future.done)[0].result()
-            if res.status == 'canceled':
+        connections.close_all()
+
+        # "processor" runs in a worker thread while the main thread waits on its result.
+        # If a cancel happens, the signal handler raises SignalExit in the main thread, in which case
+        # we yank the socket out from underneath the processor, which will cause it to exit.
+        # The ThreadPoolExecutor context manager ensures we do not leave any threads lying around.
+        with concurrent.futures.ThreadPoolExecutor(max_workers=1) as executor:
+            processor_future = executor.submit(self.processor, resultfile)
+
+            try:
+                signal_state.raise_exception = True
+                # address race condition where SIGTERM was issued after this dispatcher task started
+                if signal_callback():
+                    raise SignalExit()
+                res = processor_future.result()
+            except SignalExit:
                receptor_ctl.simple_command(f"work cancel {self.unit_id}")
                resultsock.shutdown(socket.SHUT_RDWR)
                resultfile.close()
-            elif res.status == 'error':
+                result = namedtuple('result', ['status', 'rc'])
+                res = result('canceled', 1)
+            finally:
+                signal_state.raise_exception = False
+
+            if res.status == 'error':
                # If ansible-runner ran, but an error occured at runtime, the traceback information
                # is saved via the status_handler passed in to the processor.
                if 'result_traceback' in self.task.runner_callback.extra_update_fields:
@@ -430,6 +472,10 @@ class AWXReceptorJob:

        return receptor_params

+    @property
+    def sign_work(self):
+        return True if self.work_type in ('ansible-runner', 'local') else False
+
    @property
    def work_type(self):
        if self.task.instance.is_container_group_task:
@@ -440,18 +486,6 @@ class AWXReceptorJob:
            return 'local'
        return 'ansible-runner'

-    @cleanup_new_process
-    def cancel_watcher(self, processor_future):
-        while True:
-            if processor_future.done():
-                return processor_future.result()
-
-            if self.task.runner_callback.cancel_callback():
-                result = namedtuple('result', ['status', 'rc'])
-                return result('canceled', 1)
-
-            time.sleep(1)
-
    @property
    def pod_definition(self):
        ee = self.task.instance.execution_environment
@@ -570,3 +604,105 @@ class AWXReceptorJob:
        else:
            config["clusters"][0]["cluster"]["insecure-skip-tls-verify"] = True
        return config
+
+
+# TODO: receptor reload expects the ordering within config items to be preserved.
+# If the Python dictionary is not preserving order properly, we may need to find a
+# solution. yaml.dump does not seem to work well with OrderedDict; the line below may help:
+# yaml.add_representer(OrderedDict, lambda dumper, data: dumper.represent_mapping('tag:yaml.org,2002:map', data.items()))
+#
+RECEPTOR_CONFIG_STARTER = (
+    {'local-only': None},
+    {'log-level': 'debug'},
+    {'node': {'firewallrules': [{'action': 'reject', 'tonode': settings.CLUSTER_HOST_ID, 'toservice': 'control'}]}},
+    {'control-service': {'service': 'control', 'filename': '/var/run/receptor/receptor.sock', 'permissions': '0660'}},
+    {'work-command': {'worktype': 'local', 'command': 'ansible-runner', 'params': 'worker', 'allowruntimeparams': True}},
+    {'work-signing': {'privatekey': '/etc/receptor/signing/work-private-key.pem', 'tokenexpiration': '1m'}},
+    {
+        'work-kubernetes': {
+            'worktype': 'kubernetes-runtime-auth',
+            'authmethod': 'runtime',
+            'allowruntimeauth': True,
+            'allowruntimepod': True,
+            'allowruntimeparams': True,
+        }
+    },
+    {
+        'work-kubernetes': {
+            'worktype': 'kubernetes-incluster-auth',
+            'authmethod': 'incluster',
+            'allowruntimeauth': True,
+            'allowruntimepod': True,
+            'allowruntimeparams': True,
+        }
+    },
+    {
+        'tls-client': {
+            'name': 'tlsclient',
+            'rootcas': '/etc/receptor/tls/ca/receptor-ca.crt',
+            'cert': '/etc/receptor/tls/receptor.crt',
+            'key': '/etc/receptor/tls/receptor.key',
+        }
+    },
+)
+
+
+@task()
+def write_receptor_config():
+    lock = FileLock(__RECEPTOR_CONF_LOCKFILE)
+    with lock:
+        receptor_config = list(RECEPTOR_CONFIG_STARTER)
+
+        this_inst = Instance.objects.me()
+        instances = Instance.objects.filter(node_type=Instance.Types.EXECUTION)
+        existing_peers = {link.target_id for link in InstanceLink.objects.filter(source=this_inst)}
+        new_links = []
+        for instance in instances:
+            peer = {'tcp-peer': {'address': f'{instance.hostname}:{instance.listener_port}', 'tls': 'tlsclient'}}
+            receptor_config.append(peer)
+            if instance.id not in existing_peers:
+                new_links.append(InstanceLink(source=this_inst, target=instance, link_state=InstanceLink.States.ADDING))
+
+        InstanceLink.objects.bulk_create(new_links)
+
+        with open(__RECEPTOR_CONF, 'w') as file:
+            yaml.dump(receptor_config, file, default_flow_style=False)
+
+    # This needs to be outside of the lock because get_receptor_ctl() reads the config and acquires the lock itself.
+    receptor_ctl = get_receptor_ctl()
+
+    attempts = 10
+    for backoff in range(1, attempts + 1):
+        try:
+            receptor_ctl.simple_command("reload")
+            break
+        except ValueError:
+            logger.warning(f"Unable to reload Receptor configuration. {attempts-backoff} attempts left.")
+            time.sleep(backoff)
+    else:
+        raise RuntimeError("Receptor reload failed")
+
+    links = InstanceLink.objects.filter(source=this_inst, target__in=instances, link_state=InstanceLink.States.ADDING)
+    links.update(link_state=InstanceLink.States.ESTABLISHED)
+
+
+@task(queue=get_local_queuename)
+def remove_deprovisioned_node(hostname):
+    InstanceLink.objects.filter(source__hostname=hostname).update(link_state=InstanceLink.States.REMOVING)
+    InstanceLink.objects.filter(target__hostname=hostname).update(link_state=InstanceLink.States.REMOVING)
+
+    node_jobs = UnifiedJob.objects.filter(
+        execution_node=hostname,
+        status__in=(
+            'running',
+            'waiting',
+        ),
+    )
+    while node_jobs.exists():
+        time.sleep(60)
+
+    # As a side effect, this also deletes the InstanceLinks that are tied to the instance.
+    Instance.objects.filter(hostname=hostname).delete()
+
+    # Update the receptor configs for all of the control-plane.
+    write_receptor_config.apply_async(queue='tower_broadcast_all')
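Note on the reload loop in write_receptor_config: the for/else retry with a growing sleep can be sketched in isolation as below. This is an illustration under assumptions, not the AWX code: retry_with_linear_backoff, its parameters, and the injectable sleep argument are hypothetical names introduced here so the loop can be exercised without waiting.

```python
import time


def retry_with_linear_backoff(action, attempts=10, sleep=time.sleep, retry_on=(ValueError,)):
    """Call action() up to `attempts` times, sleeping 1s, 2s, 3s, ... between failures.

    The else branch of the for loop runs only when the loop finishes
    without hitting `break`, i.e. when every attempt failed.
    """
    for backoff in range(1, attempts + 1):
        try:
            result = action()
            break
        except retry_on:
            sleep(backoff)
    else:
        raise RuntimeError("action failed after all attempts")
    return result


def make_flaky(failures):
    """Hypothetical helper: returns a callable that raises ValueError `failures` times, then succeeds."""
    state = {'calls': 0}

    def action():
        state['calls'] += 1
        if state['calls'] <= failures:
            raise ValueError("not ready")
        return state['calls']

    return action
```

As in the diff, the sleep also runs after the final failed attempt, just before the RuntimeError is raised.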
--- a/awx/main/tasks/signals.py
+++ b/awx/main/tasks/signals.py
@@ -0,0 +1,71 @@
+import signal
+import functools
+import logging
+
+
+logger = logging.getLogger('awx.main.tasks.signals')
+
+
+__all__ = ['with_signal_handling', 'signal_callback']
+
+
+class SignalExit(Exception):
+    pass
+
+
+class SignalState:
+    def reset(self):
+        self.sigterm_flag = False
+        self.is_active = False
+        self.original_sigterm = None
+        self.original_sigint = None
+        self.raise_exception = False
+
+    def __init__(self):
+        self.reset()
+
+    def set_flag(self, *args):
+        """Method to pass into the python signal.signal method to receive signals"""
+        self.sigterm_flag = True
+        if self.raise_exception:
+            self.raise_exception = False  # so it is not raised a second time in error handling
+            raise SignalExit()
+
+    def connect_signals(self):
+        self.original_sigterm = signal.getsignal(signal.SIGTERM)
+        self.original_sigint = signal.getsignal(signal.SIGINT)
+        signal.signal(signal.SIGTERM, self.set_flag)
+        signal.signal(signal.SIGINT, self.set_flag)
+        self.is_active = True
+
+    def restore_signals(self):
+        signal.signal(signal.SIGTERM, self.original_sigterm)
+        signal.signal(signal.SIGINT, self.original_sigint)
+        self.reset()
+
+
+signal_state = SignalState()
+
+
+def signal_callback():
+    return signal_state.sigterm_flag
+
+
+def with_signal_handling(f):
+    """
+    Change signal handling to make signal_callback return True in event of SIGTERM or SIGINT.
+    """
+
+    @functools.wraps(f)
+    def _wrapped(*args, **kwargs):
+        try:
+            this_is_outermost_caller = False
+            if not signal_state.is_active:
+                signal_state.connect_signals()
+                this_is_outermost_caller = True
+            return f(*args, **kwargs)
+        finally:
+            if this_is_outermost_caller:
+                signal_state.restore_signals()
+
+    return _wrapped
--- a/awx/main/tasks/system.py
+++ b/awx/main/tasks/system.py
@@ -10,12 +10,13 @@ from contextlib import redirect_stdout
 import shutil
 import time
 from distutils.version import LooseVersion as Version
+from datetime import datetime

 # Django
 from django.conf import settings
 from django.db import transaction, DatabaseError, IntegrityError
 from django.db.models.fields.related import ForeignKey
-from django.utils.timezone import now
+from django.utils.timezone import now, timedelta
 from django.utils.encoding import smart_str
 from django.contrib.auth.models import User
 from django.utils.translation import gettext_lazy as _
@@ -53,13 +54,14 @@ from awx.main.dispatch import get_local_queuename, reaper
 from awx.main.utils.common import (
    ignore_inventory_computed_fields,
    ignore_inventory_group_removal,
-    schedule_task_manager,
+    ScheduleWorkflowManager,
+    ScheduleTaskManager,
 )

 from awx.main.utils.external_logging import reconfigure_rsyslog
 from awx.main.utils.reload import stop_local_services
 from awx.main.utils.pglock import advisory_lock
-from awx.main.tasks.receptor import get_receptor_ctl, worker_info, worker_cleanup, administrative_workunit_reaper
+from awx.main.tasks.receptor import get_receptor_ctl, worker_info, worker_cleanup, administrative_workunit_reaper, write_receptor_config
 from awx.main.consumers import emit_channel_notification
 from awx.main import analytics
 from awx.conf import settings_registry
@@ -79,6 +81,10 @@ Try upgrading OpenSSH or providing your private key in an different format. \
 def dispatch_startup():
    startup_logger = logging.getLogger('awx.main.tasks')

+    # TODO: Enable this on VM installs
+    if settings.IS_K8S:
+        write_receptor_config()
+
    startup_logger.debug("Syncing Schedules")
    for sch in Schedule.objects.all():
        try:
@@ -103,6 +109,8 @@ def dispatch_startup():
    #
    apply_cluster_membership_policies()
    cluster_node_heartbeat()
+    reaper.startup_reaping()
+    reaper.reap_waiting(grace_period=0)
    m = Metrics()
    m.reset_values()

@@ -115,10 +123,10 @@ def inform_cluster_of_shutdown():
        this_inst = Instance.objects.get(hostname=settings.CLUSTER_HOST_ID)
        this_inst.mark_offline(update_last_seen=True, errors=_('Instance received normal shutdown signal'))
        try:
-            reaper.reap(this_inst)
+            reaper.reap_waiting(this_inst, grace_period=0)
        except Exception:
-            logger.exception('failed to reap jobs for {}'.format(this_inst.hostname))
-        logger.warning('Normal shutdown signal for instance {}, ' 'removed self from capacity pool.'.format(this_inst.hostname))
+            logger.exception('failed to reap waiting jobs for {}'.format(this_inst.hostname))
+        logger.warning('Normal shutdown signal for instance {}, removed self from capacity pool.'.format(this_inst.hostname))
    except Exception:
        logger.exception('Encountered problem with normal shutdown signal.')

@@ -345,9 +353,13 @@ def _cleanup_images_and_files(**kwargs):
            logger.info(f'Performed local cleanup with kwargs {kwargs}, output:\n{stdout}')

    # if we are the first instance alphabetically, then run cleanup on execution nodes
-    checker_instance = Instance.objects.filter(node_type__in=['hybrid', 'control'], enabled=True, capacity__gt=0).order_by('-hostname').first()
+    checker_instance = (
+        Instance.objects.filter(node_type__in=['hybrid', 'control'], node_state=Instance.States.READY, enabled=True, capacity__gt=0)
+        .order_by('-hostname')
+        .first()
+    )
    if checker_instance and this_inst.hostname == checker_instance.hostname:
-        for inst in Instance.objects.filter(node_type='execution', enabled=True, capacity__gt=0):
+        for inst in Instance.objects.filter(node_type='execution', node_state=Instance.States.READY, enabled=True, capacity__gt=0):
            runner_cleanup_kwargs = inst.get_cleanup_task_kwargs(**kwargs)
            if not runner_cleanup_kwargs:
                continue
@@ -401,7 +413,12 @@ def execution_node_health_check(node):
        return

    if instance.node_type != 'execution':
-        raise RuntimeError(f'Execution node health check ran against {instance.node_type} node {instance.hostname}')
+        logger.warning(f'Execution node health check ran against {instance.node_type} node {instance.hostname}')
+        return
+
+    if instance.node_state not in (Instance.States.READY, Instance.States.UNAVAILABLE, Instance.States.INSTALLED):
+        logger.warning(f"Execution node health check ran against node {instance.hostname} in state {instance.node_state}")
+        return

    data = worker_info(node)

@@ -436,6 +453,7 @@ def inspect_execution_nodes(instance_list):

        nowtime = now()
        workers = mesh_status['Advertisements']
+
        for ad in workers:
            hostname = ad['NodeID']

@@ -446,25 +464,23 @@ def inspect_execution_nodes(instance_list):
                continue

            # Control-plane nodes are dealt with via local_health_check instead.
-            if instance.node_type in ('control', 'hybrid'):
+            if instance.node_type in (Instance.Types.CONTROL, Instance.Types.HYBRID):
                continue

-            was_lost = instance.is_lost(ref_time=nowtime)
            last_seen = parse_date(ad['Time'])
-
            if instance.last_seen and instance.last_seen >= last_seen:
                continue
            instance.last_seen = last_seen
            instance.save(update_fields=['last_seen'])

            # Only execution nodes should be dealt with by execution_node_health_check
-            if instance.node_type == 'hop':
-                if was_lost and (not instance.is_lost(ref_time=nowtime)):
+            if instance.node_type == Instance.Types.HOP:
+                if instance.node_state in (Instance.States.UNAVAILABLE, Instance.States.INSTALLED):
                    logger.warning(f'Hop node {hostname}, has rejoined the receptor mesh')
                    instance.save_health_data(errors='')
                continue

-            if was_lost:
+            if instance.node_state in (Instance.States.UNAVAILABLE, Instance.States.INSTALLED):
                # if the instance *was* lost, but has appeared again,
                # attempt to re-establish the initial capacity and version
                # check
@@ -479,11 +495,11 @@ def inspect_execution_nodes(instance_list):
                    execution_node_health_check.apply_async([hostname])


-@task(queue=get_local_queuename)
-def cluster_node_heartbeat():
+@task(queue=get_local_queuename, bind_kwargs=['dispatch_time', 'worker_tasks'])
+def cluster_node_heartbeat(dispatch_time=None, worker_tasks=None):
    logger.debug("Cluster node heartbeat task.")
    nowtime = now()
-    instance_list = list(Instance.objects.all())
+    instance_list = list(Instance.objects.filter(node_state__in=(Instance.States.READY, Instance.States.UNAVAILABLE, Instance.States.INSTALLED)))
    this_inst = None
    lost_instances = []

@@ -503,12 +519,23 @@ def cluster_node_heartbeat():

    if this_inst:
        startup_event = this_inst.is_lost(ref_time=nowtime)
+        last_last_seen = this_inst.last_seen
        this_inst.local_health_check()
        if startup_event and this_inst.capacity != 0:
-            logger.warning('Rejoining the cluster as instance {}.'.format(this_inst.hostname))
+            logger.warning(f'Rejoining the cluster as instance {this_inst.hostname}. Prior last_seen {last_last_seen}')
            return
+        elif not last_last_seen:
+            logger.warning(f'Instance does not have recorded last_seen, updating to {nowtime}')
+        elif (nowtime - last_last_seen) > timedelta(seconds=settings.CLUSTER_NODE_HEARTBEAT_PERIOD + 2):
+            logger.warning(f'Heartbeat skew - interval={(nowtime - last_last_seen).total_seconds():.4f}, expected={settings.CLUSTER_NODE_HEARTBEAT_PERIOD}')
    else:
-        raise RuntimeError("Cluster Host Not Found: {}".format(settings.CLUSTER_HOST_ID))
+        if settings.AWX_AUTO_DEPROVISION_INSTANCES:
+            (changed, this_inst) = Instance.objects.register(ip_address=os.environ.get('MY_POD_IP'), node_type='control', uuid=settings.SYSTEM_UUID)
+            if changed:
+                logger.warning(f'Recreated instance record {this_inst.hostname} after unexpected removal')
+            this_inst.local_health_check()
+        else:
+            raise RuntimeError("Cluster Host Not Found: {}".format(settings.CLUSTER_HOST_ID))
    # IFF any node has a greater version than we do, then we'll shutdown services
    for other_inst in instance_list:
        if other_inst.node_type in ('execution', 'hop'):
@@ -528,15 +555,17 @@ def cluster_node_heartbeat():

    for other_inst in lost_instances:
        try:
-            reaper.reap(other_inst)
+            explanation = "Job reaped due to instance shutdown"
+            reaper.reap(other_inst, job_explanation=explanation)
+            reaper.reap_waiting(other_inst, grace_period=0, job_explanation=explanation)
        except Exception:
            logger.exception('failed to reap jobs for {}'.format(other_inst.hostname))
        try:
-            if settings.AWX_AUTO_DEPROVISION_INSTANCES:
+            if settings.AWX_AUTO_DEPROVISION_INSTANCES and other_inst.node_type == "control":
                deprovision_hostname = other_inst.hostname
-                other_inst.delete()
+                other_inst.delete()  # FIXME: what about associated inbound links?
                logger.info("Host {} Automatically Deprovisioned.".format(deprovision_hostname))
-            elif other_inst.capacity != 0 or (not other_inst.errors):
+            elif other_inst.node_state == Instance.States.READY:
                other_inst.mark_offline(errors=_('Another cluster node has determined this instance to be unresponsive'))
                logger.error("Host {} last checked in at {}, marked as lost.".format(other_inst.hostname, other_inst.last_seen))

@@ -546,6 +575,15 @@ def cluster_node_heartbeat():
            else:
                logger.exception('Error marking {} as lost'.format(other_inst.hostname))

+    # Run local reaper
+    if worker_tasks is not None:
+        active_task_ids = []
+        for task_list in worker_tasks.values():
+            active_task_ids.extend(task_list)
+        reaper.reap(instance=this_inst, excluded_uuids=active_task_ids)
+        if max(len(task_list) for task_list in worker_tasks.values()) <= 1:
+            reaper.reap_waiting(instance=this_inst, excluded_uuids=active_task_ids, ref_time=datetime.fromisoformat(dispatch_time))
+

 @task(queue=get_local_queuename)
 def awx_receptor_workunit_reaper():
@@ -593,7 +631,8 @@ def awx_k8s_reaper():
    for group in InstanceGroup.objects.filter(is_container_group=True).iterator():
        logger.debug("Checking for orphaned k8s pods for {}.".format(group))
        pods = PodManager.list_active_jobs(group)
-        for job in UnifiedJob.objects.filter(pk__in=pods.keys()).exclude(status__in=ACTIVE_STATES):
+        time_cutoff = now() - timedelta(seconds=settings.K8S_POD_REAPER_GRACE_PERIOD)
+        for job in UnifiedJob.objects.filter(pk__in=pods.keys(), finished__lte=time_cutoff).exclude(status__in=ACTIVE_STATES):
            logger.debug('{} is no longer active, reaping orphaned k8s pod'.format(job.log_format))
            try:
                pm = PodManager(job)
@@ -661,6 +700,13 @@ def awx_periodic_scheduler():
        state.save()


+def schedule_manager_success_or_error(instance):
+    if instance.unifiedjob_blocked_jobs.exists():
+        ScheduleTaskManager().schedule()
+    if instance.spawned_by_workflow:
+        ScheduleWorkflowManager().schedule()
+
+
 @task(queue=get_local_queuename)
 def handle_work_success(task_actual):
    try:
@@ -670,8 +716,7 @@ def handle_work_success(task_actual):
        return
    if not instance:
        return
-
-    schedule_task_manager()
+    schedule_manager_success_or_error(instance)


 @task(queue=get_local_queuename)
@@ -713,8 +758,7 @@ def handle_work_error(task_id, *args, **kwargs):
    # what the job complete message handler does then we may want to send a
    # completion event for each job here.
    if first_instance:
-        schedule_task_manager()
-        pass
+        schedule_manager_success_or_error(first_instance)


 @task(queue=get_local_queuename)
--- a/awx/main/tests/data/inventory/plugins/gce/env.json
+++ b/awx/main/tests/data/inventory/plugins/gce/env.json
@@ -2,8 +2,9 @@
    "ANSIBLE_JINJA2_NATIVE": "True",
    "ANSIBLE_TRANSFORM_INVALID_GROUP_CHARS": "never",
    "GCE_CREDENTIALS_FILE_PATH": "{{ file_reference }}",
+    "GOOGLE_APPLICATION_CREDENTIALS": "{{ file_reference }}",
    "GCP_AUTH_KIND": "serviceaccount",
    "GCP_ENV_TYPE": "tower",
    "GCP_PROJECT": "fooo",
    "GCP_SERVICE_ACCOUNT_FILE": "{{ file_reference }}"
-}
+}
--- a/awx/main/tests/factories/fixtures.py
+++ b/awx/main/tests/factories/fixtures.py
@@ -210,7 +210,7 @@ def mk_workflow_job_template(name, extra_vars='', spec=None, organization=None,
    if extra_vars:
        extra_vars = json.dumps(extra_vars)

-    wfjt = WorkflowJobTemplate(name=name, extra_vars=extra_vars, organization=organization, webhook_service=webhook_service)
+    wfjt = WorkflowJobTemplate.objects.create(name=name, extra_vars=extra_vars, organization=organization, webhook_service=webhook_service)

    if spec:
        wfjt.survey_spec = spec