Add retry_stagger var for failed download/pushes.

* Add the retry_stagger var to tweak push and retry time strategies.
* Add large deployments related docs.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
2026-06-19 05:37:44 -02:30 · 2016-09-15 11:23:27 +02:00
parent 9926395e5b
commit 390764c2b4
8 changed files with 30 additions and 9 deletions
--- a/docs/large-deploymets.md
+++ b/docs/large-deploymets.md
@@ -0,0 +1,19 @@
+Large deployments of K8s
+========================
+
+For large-scale deployments, consider the following configuration changes:
+
+* Tune [ansible settings](http://docs.ansible.com/ansible/intro_configuration.html)
+  for `forks` and `timeout` vars to fit large numbers of nodes being deployed.
+
+* Override containers' `foo_image_repo` vars to point to an intranet registry.
+
+* Set ``download_run_once: true`` to download binaries and container
+  images only once, then push them to nodes in batches.
+
+* Adjust the `retry_stagger` global var as appropriate. It should keep the
+  load on the delegate (the first K8s master node) reasonable when retrying
+  failed push or download operations.
+
+For example, when deploying 200 nodes, you may want to run ansible with
+``--forks=50`` and ``--timeout=600``, and set ``retry_stagger: 60``.
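
As an illustration of the overrides listed in the new doc, here is a minimal sketch of a vars file for the 200-node example. The file location (`inventory/group_vars/all.yml`) is an assumption and is not taken from this commit; only `download_run_once` and `retry_stagger`, with these values, come from the text above. `foo_image_repo` is the doc's own placeholder for per-container repo variables, so it is left commented out rather than given an invented name or registry URL.

```yaml
# Hypothetical inventory/group_vars/all.yml (the path is an assumption)

# Download binaries and container images once, then push to nodes in batches.
download_run_once: true

# Value taken from the 200-node example in the doc.
retry_stagger: 60

# Placeholder from the doc: replace "foo" with the actual container name and
# point it at your intranet registry.
# foo_image_repo: "<intranet-registry>/<image>"
```

The `--forks=50` and `--timeout=600` values can be passed on the `ansible-playbook` command line, or set as `forks` and `timeout` under `[defaults]` in `ansible.cfg`.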