Renewing Kubernetes Cluster Certificates

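Normally, a Kubernetes cluster deployed with KubeKey does not run into expired certificates. During deployment, KubeKey sets up a scheduled task that periodically checks the validity of every certificate in the cluster. When any certificate has fewer than 30 days of validity left, the automatic renewal process is triggered.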

To implement this automatic renewal, KubeKey installs the following three components on the system:

File | Path | Description
k8s-certs-renew.service | /etc/systemd/system/k8s-certs-renew.service | systemd service unit that runs the certificate renewal script
k8s-certs-renew.timer | /etc/systemd/system/k8s-certs-renew.timer | Timer unit, set to run the renewal every Monday at 3:00 AM
k8s-certs-renew.sh | /usr/local/bin/kube-scripts/k8s-certs-renew.sh | Main renewal script; renews all certificates with kubeadm certs renew all

For the record, here is the original content of the k8s-certs-renew.sh script:

#!/bin/bash
kubeadmCerts='/usr/local/bin/kubeadm certs'
getCertValidDays() {
  local earliestExpireDate; earliestExpireDate=$(${kubeadmCerts} check-expiration | grep -o "[A-Za-z]\{3,4\}\s\w\w,\s[0-9]\{4,\}\s\w*:\w*\s\w*\s*" | xargs -I {} date -d {} +%s | sort | head -n 1)
  local today; today="$(date +%s)"
  echo -n $(( ($earliestExpireDate - $today) / (24 * 60 * 60) ))
}
echo "## Expiration before renewal ##"
${kubeadmCerts} check-expiration
if [ $(getCertValidDays) -lt 30 ]; then
  echo "## Renewing certificates managed by kubeadm ##"
  ${kubeadmCerts} renew all
  echo "## Restarting control plane pods managed by kubeadm ##"
  $(which crictl | grep crictl) pods --namespace kube-system --name 'kube-scheduler-*|kube-controller-manager-*|kube-apiserver-*|etcd-*' -q | /usr/bin/xargs $(which crictl | grep crictl) rmp -f
  echo "## Updating /root/.kube/config ##"
  cp /etc/kubernetes/admin.conf /root/.kube/config
fi
echo "## Waiting for apiserver to be up again ##"
until printf "" 2>>/dev/null >>/dev/tcp/127.0.0.1/6443; do sleep 1; done
echo "## Expiration after renewal ##"
${kubeadmCerts} check-expiration
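
The threshold check in getCertValidDays comes down to integer day arithmetic on two Unix timestamps. Here is a minimal sketch of that arithmetic in isolation. The helper names days_between and needs_renewal are hypothetical and do not appear in KubeKey's script; only the 30-day threshold is taken from it.

```shell
#!/bin/bash
# Hypothetical helper: whole days between two Unix timestamps.
# Uses integer division, like the original script, so partial days are dropped.
days_between() {
    local from="$1" to="$2"
    echo $(( (to - from) / 86400 ))
}

# Hypothetical helper: succeeds (exit 0) when fewer than $3 days remain
# between "now" ($1) and the expiry time ($2). The default of 30 mirrors
# the threshold used in k8s-certs-renew.sh.
needs_renewal() {
    local now="$1" expires="$2" threshold="${3:-30}"
    [ "$(days_between "$now" "$expires")" -lt "$threshold" ]
}
```

For example, needs_renewal "$(date +%s)" "$expiry_epoch" && echo "renew" would print "renew" only when the earliest expiry is less than 30 days away, where expiry_epoch is assumed to hold the expiry time in seconds since the epoch.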

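My Kubernetes cluster was not deployed with KubeKey, so here is the procedure I used to renew its certificates manually.

I received an alert that the certificates were about to expire.

Check the certificate expiration dates

kubeadm certs check-expiration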
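
To check a single certificate file rather than the whole set, openssl x509 with the -checkend option can be used. This is a minimal sketch, assuming openssl is installed on the node. cert_expires_within_days is a hypothetical helper name, not part of kubeadm.

```shell
#!/bin/bash
# Hypothetical helper: exit 0 when the certificate in $1 expires within
# $2 days, exit 1 when it stays valid longer, exit 2 when the file cannot
# be read or parsed as a certificate.
cert_expires_within_days() {
    local cert="$1" days="$2"
    [ -r "$cert" ] || { echo "cannot read $cert" >&2; return 2; }
    openssl x509 -in "$cert" -noout >/dev/null 2>&1 || return 2
    # -checkend exits 0 when the certificate is still valid after N seconds,
    # so the result is inverted here.
    if openssl x509 -in "$cert" -noout -checkend "$(( days * 86400 ))" >/dev/null; then
        return 1
    fi
    return 0
}
```

A call such as cert_expires_within_days /etc/kubernetes/pki/apiserver.crt 30 && echo "renew soon" assumes kubeadm's default PKI directory; the path may differ on your cluster.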

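Back up the cluster's certificates and key data

This operation is risky, so back up first! Before changing anything, make a complete backup of the existing configuration files and certificates.

Run the following on every control plane node:

Create the backup directory

mkdir -p /root/ksp-backup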

Back up the existing Kubernetes configuration

cp -a /etc/kubernetes /root/ksp-backup/

Back up the etcd SSL certificates

cp -a /etc/ssl/etcd/ /root/ksp-backup/etcd-ssl-bak-$(date +%Y%m%d-%H%M)

Back up the etcd data

cp -a /var/lib/etcd /root/ksp-backup/etcd-bak-$(date +%Y%m%d-%H%M)
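
The backup steps above can be wrapped in one small function so that every copy gets a consistent timestamped name. This is only a sketch: backup_dir is a hypothetical helper, and the timestamp format is my own choice rather than anything kubeadm requires.

```shell
#!/bin/bash
# Hypothetical helper: copy directory $1 into backup root $2 under a
# timestamped name, then print the destination path.
backup_dir() {
    local src="${1%/}" dest_root="$2" stamp dest
    [ -d "$src" ] || { echo "missing source: $src" >&2; return 1; }
    stamp="$(date +%Y%m%d-%H%M%S)"
    dest="${dest_root}/$(basename "$src")-bak-${stamp}"
    # cp -a preserves ownership, permissions and timestamps.
    mkdir -p "$dest_root" && cp -a "$src" "$dest" && echo "$dest"
}
```

On a control plane node it could be called once per directory, for example: for d in /etc/kubernetes /etc/ssl/etcd /var/lib/etcd; do backup_dir "$d" /root/ksp-backup; done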

Renew the certificates

kubeadm certs renew all

When the renewal finishes, kubeadm prompts you to restart the components so that the new certificates take effect:

Done renewing certificates. You must restart the kube-apiserver, kube-controller-manager, kube-scheduler and etcd, so that they can use the new certificates.

Restart the components

List the control plane component manifests:

ls -l /etc/kubernetes/manifests/

Restart all control plane components in the recommended order. Start with etcd (on a multi-node cluster, restart it one node at a time):

sudo mv /etc/kubernetes/manifests/etcd.yaml /tmp/ && \
  sleep 20 && \
  sudo mv /tmp/etcd.yaml /etc/kubernetes/manifests/

Then restart the API server

sudo mv /etc/kubernetes/manifests/kube-apiserver.yaml /tmp/ && \
  sleep 20 && \
  sudo mv /tmp/kube-apiserver.yaml /etc/kubernetes/manifests/
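
Before moving on to the remaining components, it may help to wait until the API server accepts connections again, as the KubeKey script does with its /dev/tcp loop. The sketch below adds a timeout to that idea. wait_for_tcp is a hypothetical helper name, the timeout parameter is my own addition, and /dev/tcp redirection requires bash.

```shell
#!/bin/bash
# Hypothetical helper: poll host $1, port $2 once per second until a TCP
# connection succeeds. Returns 1 after roughly $3 seconds (default 60)
# without a successful connection.
wait_for_tcp() {
    local host="$1" port="$2" timeout="${3:-60}" waited=0
    # The subshell opens the connection on fd 3 and closes it on exit.
    until (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        waited=$(( waited + 1 ))
    done
}
```

For the API server this would be wait_for_tcp 127.0.0.1 6443 120, using the address and port from the KubeKey script; adjust them if your API server listens elsewhere.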

Finally, restart the remaining components

for comp in controller-manager scheduler; do
  sudo mv /etc/kubernetes/manifests/kube-${comp}.yaml /tmp/ && \
    sleep 10 && \
    sudo mv /tmp/kube-${comp}.yaml /etc/kubernetes/manifests/
done
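
The three restart steps share one pattern: move the manifest out of the kubelet's manifest directory, wait, then move it back. A minimal sketch of that pattern as a reusable function follows. restart_static_pod is a hypothetical helper, and holding the manifest in a mktemp directory instead of /tmp is my own choice to avoid overwriting an unrelated file.

```shell
#!/bin/bash
# Hypothetical helper: restart one static pod by moving its manifest out of
# the manifest directory and back again.
#   $1 manifest file name, e.g. kube-apiserver.yaml
#   $2 seconds to wait while the kubelet stops the pod (default 20)
#   $3 manifest directory (default /etc/kubernetes/manifests)
restart_static_pod() {
    local name="$1" wait_s="${2:-20}" dir="${3:-/etc/kubernetes/manifests}" hold
    [ -f "${dir}/${name}" ] || { echo "no manifest: ${dir}/${name}" >&2; return 1; }
    hold="$(mktemp -d)" || return 1
    mv "${dir}/${name}" "${hold}/" || return 1
    sleep "$wait_s"
    mv "${hold}/${name}" "${dir}/" && rmdir "$hold"
}
```

Run as root, the whole sequence could then be written as: for m in etcd.yaml kube-apiserver.yaml kube-controller-manager.yaml kube-scheduler.yaml; do restart_static_pod "$m" 20; done. Note that the KubeKey script also copies /etc/kubernetes/admin.conf to /root/.kube/config after renewing; a manual renewal likely needs the same step if kubectl on the node reads that file.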