Fixed issue #7112. Created new API Server vars that replace defunct Controller Manager one (#7114)
Signed-off-by: Brendan Holmes <5072156+holmesb@users.noreply.github.com>
@@ -43,8 +43,10 @@ attempts to set a status of node.
 At the same time Kubernetes controller manager will try to check
 `nodeStatusUpdateRetry` times every `--node-monitor-period` of time. After
-`--node-monitor-grace-period` it will consider node unhealthy. It will remove
-its pods based on `--pod-eviction-timeout`
+`--node-monitor-grace-period` it will consider node unhealthy. Pods will then be rescheduled based on the
+[Taint Based Eviction](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/#taint-based-evictions)
+timers that you set on them individually, or the API Server's global timers: `--default-not-ready-toleration-seconds` &
+`--default-unreachable-toleration-seconds`.

 Kube proxy has a watcher over API. Once pods are evicted, Kube proxy will
 notice and will update iptables of the node. It will remove endpoints from
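The tolerations mentioned in the added lines can be set per pod when the API Server's global timers don't fit a workload. Below is a minimal sketch of such an override; the pod name and image are placeholders, while the taint keys, `effect`, and `tolerationSeconds` are the upstream Kubernetes API:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: eviction-demo              # hypothetical name for illustration
spec:
  containers:
    - name: app
      image: nginx                 # placeholder image
  tolerations:
    # Override the API Server's global timers for this pod only:
    # evict 60s after the node is tainted not-ready or unreachable.
    - key: node.kubernetes.io/not-ready
      operator: Exists
      effect: NoExecute
      tolerationSeconds: 60
    - key: node.kubernetes.io/unreachable
      operator: Exists
      effect: NoExecute
      tolerationSeconds: 60
```

Pods that set no such tolerations get the two global values injected by the DefaultTolerationSeconds admission plugin instead.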
@@ -57,12 +59,14 @@ services so pods from failed node won't be accessible anymore.
 If `--node-status-update-frequency` is set to **4s** (10s is default).
 `--node-monitor-period` to **2s** (5s is default).
 `--node-monitor-grace-period` to **20s** (40s is default).
-`--pod-eviction-timeout` is set to **30s** (5m is default)
+`--default-not-ready-toleration-seconds` and `--default-unreachable-toleration-seconds` are set to **30**
+(300 seconds is default). Note these two values should be integers representing the number of seconds ("s" or "m" for
+seconds/minutes are not specified).

 In such scenario, pods will be evicted in **50s** because the node will be
-considered as down after **20s**, and `--pod-eviction-timeout` occurs after
-**30s** more. However, this scenario creates an overhead on etcd as every node
-will try to update its status every 2 seconds.
+considered as down after **20s**, and `--default-not-ready-toleration-seconds` or
+`--default-unreachable-toleration-seconds` occur after **30s** more. However, this scenario creates an overhead on
+etcd as every node will try to update its status every 2 seconds.

 If the environment has 1000 nodes, there will be 15000 node updates per
 minute which may require large etcd containers or even dedicated nodes for etcd.
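For reference, here is where each of these flags lives, as a minimal sketch of the fast profile for a kubeadm-managed cluster (the kubeadm `v1beta3` API and a drop-in `KubeletConfiguration` are assumptions here; kubespray exposes its own variables for the same settings):

```yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
apiServer:
  extraArgs:
    default-not-ready-toleration-seconds: "30"    # default is 300
    default-unreachable-toleration-seconds: "30"  # default is 300
controllerManager:
  extraArgs:
    node-monitor-period: "2s"                     # default is 5s
    node-monitor-grace-period: "20s"              # default is 40s
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
nodeStatusUpdateFrequency: "4s"                   # default is 10s
```

The scenarios below change only the values, never the fields.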
@@ -75,7 +79,8 @@ minute which may require large etcd containers or even dedicated nodes for etcd.
 ## Medium Update and Average Reaction

 Let's set `--node-status-update-frequency` to **20s**
-`--node-monitor-grace-period` to **2m** and `--pod-eviction-timeout` to **1m**.
+`--node-monitor-grace-period` to **2m** and `--default-not-ready-toleration-seconds` and
+`--default-unreachable-toleration-seconds` to **60**.
 In that case, Kubelet will try to update status every 20s. So, it will be 6 * 5
 = 30 attempts before Kubernetes controller manager will consider unhealthy
 status of node. After 1m it will evict all pods. The total time will be 3m
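Under the same assumptions as the kubeadm sketch above, the medium profile differs only in values:

```yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
apiServer:
  extraArgs:
    default-not-ready-toleration-seconds: "60"    # pods tolerate the taint for 1m
    default-unreachable-toleration-seconds: "60"
controllerManager:
  extraArgs:
    node-monitor-grace-period: "2m"               # 6 updates of 20s before NotReady
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
nodeStatusUpdateFrequency: "20s"
```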
@@ -90,9 +95,9 @@ etcd updates per minute.
 ## Low Update and Slow reaction

 Let's set `--node-status-update-frequency` to **1m**.
-`--node-monitor-grace-period` will be set to **5m** and `--pod-eviction-timeout`
-to **1m**. In this scenario, every kubelet will try to update the status every
-minute. There will be 5 * 5 = 25 attempts before unhealthy status. After 5m,
+`--node-monitor-grace-period` will be set to **5m** and `--default-not-ready-toleration-seconds` and
+`--default-unreachable-toleration-seconds` to **60**. In this scenario, every kubelet will try to update the status
+every minute. There will be 5 * 5 = 25 attempts before unhealthy status. After 5m,
 Kubernetes controller manager will set unhealthy status. This means that pods
 will be evicted after 1m after being marked unhealthy. (6m in total).
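Whichever profile is used, the global timers materialize as per-pod tolerations injected by the DefaultTolerationSeconds admission plugin. With the **60**-second values above, a pod that sets no tolerations of its own would carry roughly this excerpt in its spec (a sketch of what `kubectl get pod <name> -o yaml` would show; field names are the upstream API):

```yaml
tolerations:
  - key: node.kubernetes.io/not-ready
    operator: Exists
    effect: NoExecute
    tolerationSeconds: 60
  - key: node.kubernetes.io/unreachable
    operator: Exists
    effect: NoExecute
    tolerationSeconds: 60
```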