From bdabe3602931acfcd61f441da823205377c23868 Mon Sep 17 00:00:00 2001
From: Chris Meyers
Date: Thu, 15 Oct 2020 15:25:47 -0400
Subject: [PATCH] reduce parent->child lock contention

* We update the parent unified job template to point at each new job
  created. We also update a similar foreign key when the job finishes
  running. This causes lock contention when the job template is
  allow_simultaneous and a lot of jobs from that job template are
  running in parallel. I've seen waits as long as 5 minutes for the
  lock when a job finishes.
* This change moves the parent->child update to OUTSIDE of the
  transaction if the job is allow_simultaneous (inherited from the
  parent unified job). We sacrifice a bit of correctness for
  performance. The logic is: if you are launching 1,000 parallel jobs,
  do you really care that the job template contains a pointer to the
  last one you launched? Probably not. If you do, you can always query
  the jobs related to the job template, sorted by created time.
---
 awx/main/models/unified_jobs.py | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/awx/main/models/unified_jobs.py b/awx/main/models/unified_jobs.py
index 1abbb29fcb..c50c8668d5 100644
--- a/awx/main/models/unified_jobs.py
+++ b/awx/main/models/unified_jobs.py
@@ -873,7 +873,13 @@ class UnifiedJob(PolymorphicModel, PasswordFieldsModel, CommonModelNameNotUnique
 
         # If status changed, update the parent instance.
         if self.status != status_before:
-            self._update_parent_instance()
+            # Update parent outside of the transaction for Job w/ allow_simultaneous=True
+            # This dodges lock contention at the expense of the foreign key not being
+            # completely correct.
+            if getattr(self, 'allow_simultaneous', False):
+                connection.on_commit(self._update_parent_instance)
+            else:
+                self._update_parent_instance()
 
         # Done.
         return result