* Change the location of the receptor socket to /var/run/awx-receptor, to match what the installer is currently doing.
* Sync awx and receptor nodes for control socket
Co-authored-by: Jeff Bradberry <jeff.bradberry@gmail.com>
* Model changes for instance last_seen field to replace modified
* Break up refresh_capacity into smaller units
* Rename execution node methods, fix last_seen clustering
* Use update_fields to make it clear save only affects capacity
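A minimal sketch of that pattern, assuming a pared-down Django model (the fields and the helper method are illustrative, not the actual AWX Instance definition):
```python
from django.db import models


class Instance(models.Model):
    # Illustrative fields only; the real model carries much more state.
    hostname = models.CharField(max_length=250)
    capacity = models.IntegerField(default=0)
    last_seen = models.DateTimeField(null=True)

    def set_capacity(self, value):
        self.capacity = value
        # update_fields limits the UPDATE to this one column, making it
        # explicit that this save does not touch last_seen or anything else.
        self.save(update_fields=['capacity'])
```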
* Restructuring to pass unit tests
* Fix bug where a PATCH did not update capacity value
* Update docker-compose
- Deploys 1 control and 1 execution node
* Add a new Receptor cluster configuration file
* Update receptor peer to awx_1 to match how the hop node is configured in the cluster (Jim Ladd's commit)
* Move receptor_1 instantiation in the docker-compose setup
* Hard code receptor_1 name
* Update execution node name, move standalone conf file to docker-compose directory
* Reformat docker-compose file, mount another volume, change privileges
This requires swapping out the container images for the execution nodes from awx-ee to the awx image. For completeness, the hop node image is switched to the raw receptor image.
A few outright bugs are fixed here: the memory calculation was simply not right, and the execution_capacity calculation was the reverse of what was intended. Drop in a few TODOs about error handling from debugging.
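For orientation, the shape of the intended calculation is roughly the following; the names, defaults, and formula are an illustrative sketch rather than the exact AWX implementation:
```python
def compute_capacity(mem_mb, cpu_count, mem_per_fork_mb=100, forks_per_cpu=4, reserve_mb=2048):
    # Reserve some memory for the system itself, then derive how many forks
    # the remaining memory can hold; CPU capacity scales with core count.
    mem_capacity = max(1, int((mem_mb - reserve_mb) / mem_per_fork_mb))
    cpu_capacity = max(1, cpu_count * forks_per_cpu)
    return mem_capacity, cpu_capacity
```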
- the task container needs to wait longer for migrations to complete for fresh installs before starting services
- otherwise, services start prematurely and clutter the logs with errors because migrations are mid-flight
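One way to implement that wait, sketched with Django's migration executor; the function name, polling interval, and call site are assumptions rather than the actual entrypoint code:
```python
import time

from django.db import OperationalError, connection
from django.db.migrations.executor import MigrationExecutor


def wait_for_migrations(poll_seconds=5):
    """Block until the database is reachable and has no unapplied migrations,
    so services are not started while a fresh install is still migrating."""
    while True:
        try:
            executor = MigrationExecutor(connection)
            if not executor.migration_plan(executor.loader.graph.leaf_nodes()):
                return
        except OperationalError:
            pass  # database not ready yet
        time.sleep(poll_seconds)
```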
* Our tests could consistently get awx jobs into a deadlocked state whenever the parallelism was high. Even podman ps would hang when the system was in this state. We don't know exactly where in runc the bug is, but the deadlocks stopped happening when we changed the OCI runtime to crun.
Update Dockerfile.j2
SUMMARY
Jobs are unable to start because podman is trying to use the systemd cgroup manager. See the error below:
WARN[0000] Failed to add conmon to systemd sandbox cgroup: dial unix /run/systemd/private: connect: no such file or directory
Error: OCI runtime error: systemd cgroup flag passed, but systemd support for managing cgroups is not available
related #10099
ISSUE TYPE
Bugfix Pull Request
COMPONENT NAME
API
AWX VERSION
awx: 19.0.0
ADDITIONAL INFORMATION
According to PR containers/podman#7009, podman switched its references from libpod.conf to containers.conf.
According to the containers.conf man page (https://github.com/containers/common/blob/main/docs/containers.conf.5.md), the configuration file is a TOML file, but the engine section declaration is missing.
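To make that concrete, here is a sketch that inspects the file the way podman would (assuming Python 3.11+ for tomllib and the system-wide default path, both assumptions); keys such as cgroup_manager and runtime only take effect when they sit under the [engine] table:
```python
# Sketch: parse containers.conf and report the engine settings podman reads.
# If the [engine] header is missing, these keys end up at the top level of
# the TOML document and podman falls back to its defaults.
import tomllib  # Python 3.11+
from pathlib import Path

with Path("/etc/containers/containers.conf").open("rb") as f:
    conf = tomllib.load(f)

engine = conf.get("engine", {})
print("cgroup_manager:", engine.get("cgroup_manager", "<unset>"))
print("runtime:", engine.get("runtime", "<unset>"))
```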
thanks to @Siorde too 👍
Reviewed-by: Chris Meyers <None>
Reviewed-by: Shane McDonald <me@shanemcd.com>
Force fully qualified image names
If we try to pull an unqualified image name, jobs hang on a podman prompt.
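To illustrate what "fully qualified" means here, a sketch of a normalizer; the helper name, default registry, and registry-detection heuristic are assumptions for illustration:
```python
def ensure_fully_qualified(image, default_registry="quay.io"):
    """Return an image reference that names its registry explicitly.

    Bare names like "awx-ee" make podman ask interactively which registry to
    pull from, and with no TTY attached the job just hangs on that prompt.
    """
    first, _, rest = image.partition("/")
    # A leading component with a dot, a port, or "localhost" is a registry host.
    if rest and ("." in first or ":" in first or first == "localhost"):
        return image
    return f"{default_registry}/{image}"


# e.g. ensure_fully_qualified("ansible/awx-ee:latest") -> "quay.io/ansible/awx-ee:latest"
```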
I set the permissions to 644 because that's what worked for me; rootless podman needs to be able to read the file, but maybe there is another way to achieve that.
Reviewed-by: Christian Adams <rooftopcellist@gmail.com>