Add retry_stagger var for failed download/pushes.

* Add the retry_stagger var to tweak push and retry time strategies.
* Add large deployments related docs.

Signed-off-by: Bogdan Dobrelya <bdobrelia@mirantis.com>
This commit is contained in:
Bogdan Dobrelya
2016-09-15 11:23:27 +02:00
parent 9926395e5b
commit 390764c2b4
8 changed files with 30 additions and 9 deletions

19
docs/large-deploymets.md Normal file
View File

@@ -0,0 +1,19 @@
Large deployments of K8s
========================
For a large scaled deployments, consider the following configuration changes:
* Tune [ansible settings](http://docs.ansible.com/ansible/intro_configuration.html)
for `forks` and `timeout` vars to fit large numbers of nodes being deployed.
* Override containers' `foo_image_repo` vars to point to intranet registry.
* Override the ``download_run_once: true`` to download binaries and container
images only once then push to nodes in batches.
* Adjust the `retry_stagger` global var as appropriate. It should provide sane
load on a delegate (the first K8s master node) then retrying failed
push or download operations.
For example, when deploying 200 nodes, you may want to run ansible with
``--forks=50``, ``--timeout=600`` and define the ``retry_stagger: 60``.