8578 Commits

Author SHA1 Message Date
github-actions[bot]
fd8dc350dc Patch versions updates 2026-01-30 03:14:34 +00:00
k8s-infra-cherrypick-robot
683ee4233f wait for control plane node to become ready after joining (#12924)
When joining a control plane node and "upgrading" the cluster setup (for
example, to update etcd addresses after adding a new etcd) in the same
playbook run, the node can take a bit of time to become ready after
joining.
This triggers a kubeadm preflight check (ControlPlaneNodesReady) in
kubeadm upgrade, which is run directly after the join tasks.

Add a configurable wait for the control plane node to become Ready to
fix this race condition.

Co-authored-by: Max Gautier <mg@max.gautier.name>
2026-01-29 14:47:50 +05:30
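The configurable wait described in 683ee4233f amounts to a bounded retry loop. Below is a minimal Python sketch of that idea, not the playbook's actual task: `wait_until_ready`, its `retries`/`delay` parameters, and the injected `is_ready` callable are all hypothetical names chosen for illustration.

```python
import time
from typing import Callable


def wait_until_ready(
    is_ready: Callable[[], bool],
    retries: int = 10,
    delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll is_ready() up to `retries` times, pausing `delay` seconds between polls.

    Returns True as soon as the check passes, False if every attempt fails.
    `is_ready` stands in for whatever reports the node's Ready condition.
    """
    for attempt in range(retries):
        if is_ready():
            return True
        # Do not sleep after the final failed attempt.
        if attempt < retries - 1:
            sleep(delay)
    return False


# Simulated node that only reports Ready on the third poll.
_states = iter([False, False, True])
became_ready = wait_until_ready(lambda: next(_states), retries=5, delay=0)
print(became_ready)
```

Making both the retry count and the delay parameters mirrors the "configurable" part of the commit message: slow nodes get a longer budget without changing the logic.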
k8s-infra-cherrypick-robot
4ff716dddd etcd-certs: only change necessary permissions (#12914)
We currently **recursively** set the permissions of /etc/ssl/etcd/ssl
(default path) to 700. But this removes group permissions from the files
under it, and certain components (like calico with the etcd datastore) rely
on them; thus, the upgrade of a cluster can fail because the
calico-kube-controller can't access the certs, and thus etcd.

This works in other cases because, as far as I can tell, the apiserver,
which does access etcd, runs as root (the owner of the files, not just
the "group owner").

We also, for some reason, do this twice.

Only create the etcd cert directory with the correct permissions once,
not recursively.

Co-authored-by: Max Gautier <mg@max.gautier.name>
2026-01-27 20:29:51 +05:30
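The difference between a recursive and a non-recursive permission change in 4ff716dddd can be shown with a small Python sketch. It is an illustration only: `ensure_cert_dir` is a hypothetical helper, the temporary directory stands in for the real cert path, and the mode values are placeholders rather than the values the role uses.

```python
import os
import stat
import tempfile


def ensure_cert_dir(path: str, mode: int = 0o700) -> None:
    """Create `path` if needed and set the mode on the directory only.

    Files already inside the directory are deliberately left untouched,
    so any group permissions they carry survive.
    """
    os.makedirs(path, exist_ok=True)
    os.chmod(path, mode)  # no os.walk: nothing below `path` is modified


with tempfile.TemporaryDirectory() as tmp:
    cert_dir = os.path.join(tmp, "ssl")
    os.makedirs(cert_dir)
    cert_file = os.path.join(cert_dir, "member.pem")
    with open(cert_file, "w") as handle:
        handle.write("dummy")
    os.chmod(cert_file, 0o640)  # group-readable, as a group-based consumer would need

    ensure_cert_dir(cert_dir)

    dir_mode = stat.S_IMODE(os.stat(cert_dir).st_mode)
    file_mode = stat.S_IMODE(os.stat(cert_file).st_mode)
    print(oct(dir_mode), oct(file_mode))
```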
k8s-infra-cherrypick-robot
73fcc6075d Docs: cilium_kube_proxy_replacement change boolean (#12911)
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
Co-authored-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2026-01-27 17:03:49 +05:30
Max Gautier
a4e1a2aaaf Patch versions updates (#12895)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-01-24 09:41:27 +05:30
Kubernetes Prow Robot
2228e15860 Merge pull request #12882 from VannTen/fix/defaut_lb_address_backport
[release-2.29] Use loadbalancer IP as default apiserver endpoint if no LB hostname is used
2026-01-20 20:42:51 +05:30
k8s-infra-cherrypick-robot
f6d6351fdd cri-o: fix duplicate top-level "auths" keys in registry config template (#12886)
The config.json.j2 template was generating invalid JSON when multiple
crio_registry_auth entries were defined, resulting in multiple top-level
"auths" objects being rendered, e.g.:

{
  "auths": { "registry1": { "auth": "xxxx" } },
  "auths": { "registry2": { "auth": "yyyy" } }
}

This change moves the loop inside the "auths" object so that all registries
are rendered as siblings under a single "auths" key, producing valid JSON:

{
  "auths": {
    "registry1": { "auth": "xxxx" },
    "registry2": { "auth": "yyyy" }
  }
}

Co-authored-by: Martin Cahill <martin.cahill@gmail.com>
2026-01-20 20:16:49 +05:30
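The fix in f6d6351fdd, looping inside a single "auths" object, can be sketched in Python as follows. This is not the Jinja2 template itself; `render_auths` and the `registry`/`auth` field names of its input are assumptions made for the example.

```python
import json
from typing import Dict, List


def render_auths(entries: List[Dict[str, str]]) -> str:
    """Render every registry as a sibling under one top-level "auths" key.

    `entries` is a hypothetical list of {"registry": ..., "auth": ...} dicts.
    """
    auths = {entry["registry"]: {"auth": entry["auth"]} for entry in entries}
    return json.dumps({"auths": auths}, indent=2)


entries = [
    {"registry": "registry1", "auth": "xxxx"},
    {"registry": "registry2", "auth": "yyyy"},
]
rendered = render_auths(entries)
print(rendered)

# For comparison: the old output shape with two top-level "auths" keys.
# Python's json module silently keeps only the last duplicate key, so
# registry1 would be lost when such a document is parsed.
broken = '{"auths": {"registry1": {"auth": "xxxx"}}, "auths": {"registry2": {"auth": "yyyy"}}}'
print(sorted(json.loads(broken)["auths"]))
```

How other parsers treat duplicate keys varies, which is one more reason to emit a single "auths" object.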
Max Gautier
051d03ead7 Fix defaults for apiserver_loadbalancer_domain_name
Since we're no longer injecting pseudo DNS into /etc/hosts,
'lb-apiserver.kubernetes.local' (the previous default) won't resolve to
anything.

Instead, default to the loadbalancer IP if defined, or to the node local
loadbalancer if it's in use.

Make the necessary adjustments at the use sites to deal with IP addresses as
well as hostnames.
2026-01-20 14:27:16 +01:00
Max Gautier
afe7d927c9 Do not use apiserver LB in etcd certificates
etcd does not use the apiserver load balancer, so there is no reason to
include its DNS name in etcd certificates.
2026-01-20 14:23:07 +01:00
k8s-infra-cherrypick-robot
0b199325c8 k8s-certs-renew: fix broken script (#12881)
Improper quoting of a variable assignment makes the shell interpret it as
a command; since the variable is unused anyway, just delete it.

Co-authored-by: Max Gautier <mg@max.gautier.name>
2026-01-20 08:48:48 +05:30
Max Gautier
7303abacb3 Patch versions updates (#12855)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2026-01-13 21:13:41 +05:30
k8s-infra-cherrypick-robot
485031dfe4 Fix ansible-lint config error (#12865)
Co-authored-by: Max Gautier <mg@max.gautier.name>
2026-01-13 20:33:40 +05:30
k8s-infra-cherrypick-robot
5fb85dc8a5 Add rbac for calico kube-controllers to access services (#12831)
Co-authored-by: Lawik974 <loic97429@gmail.com>
2026-01-02 21:00:35 +05:30
Max Gautier
84d8746b41 Patch versions updates (#12800)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-12-20 00:26:31 -08:00
k8s-infra-cherrypick-robot
8181d8c688 Upgrade cilium from 1.18.4 to 1.18.5 (#12804)
Signed-off-by: Ali Afsharzadeh <afsharzadeh8@gmail.com>
Co-authored-by: Ali Afsharzadeh <afsharzadeh8@gmail.com>
2025-12-19 07:44:34 -08:00
ChengHao Yang
c4c3205a71 Releng: galaxy version to 2.29.2 (#12786)
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-12-11 19:47:30 -08:00
ChengHao Yang
0c6a29553f Patch versions updates (#12782)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
v2.29.1
2025-12-11 00:55:31 -08:00
Max Gautier
2375fae1c2 Patch versions updates (#12763)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-12-04 06:39:01 -08:00
Max Gautier
55f7b7f54c Patch versions updates (#12744)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-11-25 00:52:36 -08:00
k8s-infra-cherrypick-robot
dbca6a7757 [release-2.29] CI: enable unsafe_show_logs == true by default (#12728)
* CI: enable unsafe_show_logs == true by default

* Deduplicate defaults vars (unsafe_show_logs)

---------

Co-authored-by: Max Gautier <mg@max.gautier.name>
2025-11-19 23:32:02 -08:00
k8s-infra-cherrypick-robot
c5c43619a7 Upgrade cilium from 1.18.3 to 1.18.4 (#12725)
Signed-off-by: Ali Afsharzadeh <afsharzadeh8@gmail.com>
Co-authored-by: Ali Afsharzadeh <afsharzadeh8@gmail.com>
2025-11-18 20:09:59 -08:00
Max Gautier
584b0a4036 Patch versions updates (#12719)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-11-18 05:15:39 -08:00
k8s-infra-cherrypick-robot
084c2be8b9 CI: use a dedicated disk for releases (#12721)
This should make 'no space left on device' problems easier to handle

Use /tmp/releases as local_release_dir on CI-created machines, while keeping
the same folder on the runner (needed for gitlab-ci runner pods)

Co-authored-by: Max Gautier <mg@max.gautier.name>
2025-11-18 03:29:39 -08:00
k8s-infra-cherrypick-robot
932025fbd6 Let containerd create storage / state dir (#12722)
Containerd manages these directories by itself, so there is no need to
override them and change permissions.

Co-authored-by: Max Gautier <mg@max.gautier.name>
2025-11-18 03:09:38 -08:00
k8s-infra-cherrypick-robot
a04592de18 Adjust hubble export values for cilium 1.18 schema change (#12718)
Signed-off-by: Ali Afsharzadeh <afsharzadeh8@gmail.com>
Co-authored-by: Ali Afsharzadeh <afsharzadeh8@gmail.com>
2025-11-18 00:41:42 -08:00
k8s-infra-cherrypick-robot
d8b9288b27 [release-2.29] CI: Try a full ssh connection on hosts instead of only checking the port (#12711)
* CI: Try a full ssh connection on hosts instead of only checking the port

If we only try the port, we can try to connect in the playbook which is
executed next even though the managed node has not yet completed its
boot-up sequence ("System is booting up. Unprivileged users are not
permitted to log in yet. Please come back later. For technical details,
see pam_nologin(8).")

This does not account for python-less hosts, but we don't use those in
CI anyway (for now, at least).

* CI: Remove connection method override when creating VMs

This prevented wait_for_connection from working correctly by hijacking the
connection to localhost, thus bypassing the connection check.

---------

Co-authored-by: Max Gautier <mg@max.gautier.name>
2025-11-15 12:37:38 -08:00
Max Gautier
cbdd7cf3a7 update pre-commit hooks (#12706) 2025-11-14 22:41:40 -08:00
k8s-infra-cherrypick-robot
3c0cff983d fix(cilium):correct loadBalancer.mode rendering in values.yaml (#12705)
Co-authored-by: Anurag Ojha <aojharaj2004@gmail.com>
2025-11-14 07:01:40 -08:00
k8s-infra-cherrypick-robot
e5a1f68a2c Update Calico apiserver RBAC for Kubernetes 1.33+ (#12695)
Add missing RBAC permissions for Calico apiserver to function correctly
with Kubernetes 1.33+

Changes:

1. Add K8s 1.33 ValidatingAdmissionPolicy resources to calico-webhook-reader
   - validatingadmissionpolicies
   - validatingadmissionpolicybindings

Kubernetes 1.33 introduced ValidatingAdmissionPolicy resources (KEP-3488)
that require explicit RBAC permissions. Without these changes, the Calico
apiserver on k8s 1.33+ will not work and needless errors are logged.

Co-authored-by: rickerc <chris.ricker@gmail.com>
2025-11-14 04:49:38 -08:00
k8s-infra-cherrypick-robot
fe566df651 Fix the (upgrade/remove_node) + collection test cases (#12687)
The 'old' playbook and the collection use '-' and '_' as separator,
which breaks the logic in scripts/testcases_run.sh.

Add aliases using the old schemes to make the test work and avoid
breaking anything.

Both '-' and '_' variants will be deleted once we switch to supporting
collection only.

Co-authored-by: Max Gautier <mg@max.gautier.name>
2025-11-10 06:46:57 -08:00
k8s-infra-cherrypick-robot
59b3c686a8 [release-2.29] Remove etcd member by peerURLs (#12685)
* Remove etcd member by peerURLs

The way to obtain the IP of a particular member is convoluted and depends
on multiple variables. The match is also textual, and it's not clear
what we're matching against.

It's also broken for etcd members which are not also Kubernetes nodes,
because the "Lookup node IP in kubernetes" task will fail and abort the
play.

Instead, match against 'peerURLs', which does not need a new variable, and
use JSON output.

* Add testcase for etcd removal on external etcd

* do not merge

* fixup! Remove etcd member by peerURLs

* fixup! Remove etcd member by peerURLs

---------

Co-authored-by: Max Gautier <mg@max.gautier.name>
2025-11-10 05:48:56 -08:00
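Matching a member on 'peerURLs' in JSON output, as 59b3c686a8 describes, can be sketched in Python like this. The sample document is made up, and its shape (a top-level "members" list whose items carry "ID", "name" and "peerURLs") is an assumption about what a JSON member listing looks like; `find_member_id` is a hypothetical helper, not code from the role. The conversion to hexadecimal assumes the removal command expects hex member IDs while the JSON carries them as numbers.

```python
import json
from typing import Optional

# Made-up sample of a JSON member listing (shape is an assumption).
SAMPLE_MEMBER_LIST = """
{
  "members": [
    {"ID": 4096, "name": "etcd1", "peerURLs": ["https://10.0.0.1:2380"]},
    {"ID": 255,  "name": "etcd2", "peerURLs": ["https://10.0.0.2:2380"]}
  ]
}
"""


def find_member_id(member_list_json: str, peer_url: str) -> Optional[str]:
    """Return the hex ID of the member advertising `peer_url`, or None.

    Matching is an exact comparison against each entry of 'peerURLs',
    rather than a textual search through tabular output.
    """
    data = json.loads(member_list_json)
    for member in data.get("members", []):
        if peer_url in member.get("peerURLs", []):
            return format(member["ID"], "x")
    return None


print(find_member_id(SAMPLE_MEMBER_LIST, "https://10.0.0.2:2380"))  # -> ff
print(find_member_id(SAMPLE_MEMBER_LIST, "https://10.0.0.9:2380"))  # -> None
```

Because the lookup only needs the peer URL, it works the same whether or not the etcd member is also a Kubernetes node.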
Ali Afsharzadeh
4b970baa5a [release-2.29] Upgrade cilium from 1.18.2 to 1.18.3 (#12679) 2025-11-09 06:00:52 -08:00
ChengHao Yang
a15fcb729b Patch versions updates (#12646)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-11-03 02:19:36 -08:00
k8s-infra-cherrypick-robot
9a9e33dc9f fix(calico): Add missed rbac verb for hostendpoints (#12644)
Signed-off-by: Meza <meza-xyz@proton.me>
Co-authored-by: Meza <meza-xyz@proton.me>
2025-10-24 01:05:34 -07:00
ChengHao Yang
d9f188c39c [release-2.29] Releng: galaxy version to 2.29.1 (#12645)
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-10-24 00:41:36 -07:00
ChengHao Yang
9991412b45 Docs: bump version to 2.29.0 (#12621)
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
v2.29.0
2025-10-14 01:29:36 -07:00
Mahendra Reddy
ee6a792ec0 feat: add support crio additional mounts (#12561)
removed default since it's already set in variables

fix pre commit issue in the pipeline
2025-10-13 18:15:32 -07:00
Max Gautier
fbf957ab5d Fix breakage when ignoring all kubeadm preflight errors (#12606)
kubeadm errors out if 'all' is specified with specific checks, so check
that case when we add hardcoded checks.

Add a test to catch regression.
2025-10-13 05:54:58 -07:00
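The case handled in fbf957ab5d can be sketched as a small merge function in Python. This illustrates the rule from the commit message ('all' must not be combined with specific checks); it is not the role's actual template logic. `merge_ignored_checks` is a hypothetical name and the check names in the usage lines are only examples.

```python
from typing import List


def merge_ignored_checks(user_checks: List[str], hardcoded_checks: List[str]) -> List[str]:
    """Combine user-supplied and hardcoded preflight checks to ignore.

    If the user asked to ignore 'all', return only 'all': adding specific
    checks next to it is what makes kubeadm error out.
    """
    if "all" in user_checks:
        return ["all"]
    merged = list(user_checks)
    for check in hardcoded_checks:
        if check not in merged:  # keep order, avoid duplicates
            merged.append(check)
    return merged


hardcoded = ["DirAvailable--etc-kubernetes-manifests"]  # example name only
print(merge_ignored_checks(["all"], hardcoded))   # -> ['all']
print(merge_ignored_checks(["Swap"], hardcoded))  # -> ['Swap', 'DirAvailable--etc-kubernetes-manifests']
```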
dependabot[bot]
202a0f3461 build(deps): bump redhat-plumbers-in-action/advanced-issue-labeler (#12600)
Bumps [redhat-plumbers-in-action/advanced-issue-labeler](https://github.com/redhat-plumbers-in-action/advanced-issue-labeler) from 3.2.2 to 3.2.3.
- [Release notes](https://github.com/redhat-plumbers-in-action/advanced-issue-labeler/releases)
- [Commits](0db433d412...e38e6809c5)

---
updated-dependencies:
- dependency-name: redhat-plumbers-in-action/advanced-issue-labeler
  dependency-version: 3.2.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-10-09 11:53:00 -07:00
Arthur Outhenin-Chalandre
8c16c0f2b9 owner: remove myself from reviewers (#12594)
Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>
2025-10-09 02:47:03 -07:00
Jan Breitkopf
deaabb694d fix missing directory when run with download_run_once (#12275) 2025-10-09 02:01:02 -07:00
Mahendra Reddy
e39e005306 bugfix: skip etcd cert extraction if cilium identity uses crd (#12565)
* bugfix: skip etcd cert extraction if cilium identity uses crd

* remove new line end of the file
2025-10-09 00:31:00 -07:00
Matthias Lohr
6d6633a905 show node name to be more clear which node is going to be upgraded (#12399)
* show node name to be more clear which node is going to be upgraded

* also show nodename when uncordoning
2025-10-09 00:19:07 -07:00
Mohamed Omar Zaian
fd7f39043b [ingress-nginx] upgrade to 1.13.3 (#12604) 2025-10-08 19:04:59 -07:00
Ali Afsharzadeh
f8e74aafb9 Fix cilium_policy_audit_mode variable (#12569)
Signed-off-by: Ali Afsharzadeh <afsharzadeh8@gmail.com>
2025-10-07 09:15:02 -07:00
ChengHao Yang
aa255f8831 Patch versions updates (#12602)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-10-07 07:25:02 -07:00
Bas
9ded45f703 Documentation - hardening.md - etcd_deployment_type: host (#12520)
* Fix for #12447

Signed-off-by: Bas Meijer <bas.meijer@enexis.nl>

* Update hardening.md

Co-authored-by: spatterlight <81454789+spatterIight@users.noreply.github.com>

---------

Signed-off-by: Bas Meijer <bas.meijer@enexis.nl>
Co-authored-by: spatterlight <81454789+spatterIight@users.noreply.github.com>
2025-10-06 02:07:00 -07:00
Mahendra Reddy
270ff65992 fix crio restart while switching runtime (#12008)
fixed kubelet condition

CRI-O: fix for handling of container runtime switching

refactored kubelet start condition

stop/start kubelet and crio only when default runtime is changed

fixed condition for runtime_matches fact variable

fixed set facts for existing container runtime

added crio runtime switch variable

changed condition to use runtime switch variable

added comment for not-found for readers
2025-10-06 01:58:59 -07:00
dependabot[bot]
324e7f50c9 build(deps): bump cryptography from 46.0.1 to 46.0.2 (#12599)
Bumps [cryptography](https://github.com/pyca/cryptography) from 46.0.1 to 46.0.2.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/46.0.1...46.0.2)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-version: 46.0.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-10-06 01:47:00 -07:00
R. P. Taylor
055274937b Fix variable typos (#12595) 2025-10-06 01:28:58 -07:00