[Filler post] calico-node fails to start: Init:CrashLoopBackOff


The logs show the following error:

Defaulted container "calico-node" out of: calico-node, upgrade-ipam (init), install-cni (init), mount-bpffs (init)
Error from server (BadRequest): container "calico-node" in pod "calico-node-4j7td" is waiting to start: PodInitializing

Result: kube-proxy was not running on the node. Every environment is different, so check the logs in your own case. The analysis follows.

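For a long time I did not understand what the first line above meant. I had run into all sorts of Calico problems before, so I subconsciously assumed Calico was simply hard to deal with and searched blindly, thinking this was some obscure issue. Only after finding several other reports with the same first line did I realize that it carries no diagnostic value, and that I had wasted a lot of time on it.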
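
For context: the first line is only kubectl saying which container it picked by default, and it also lists the pod's init containers. The second line says the main container is still waiting for those init containers to finish. A minimal sketch of how to go from that message to the useful logs follows; `init_containers_from_msg` is a hypothetical helper name, not part of kubectl.

```shell
# Hypothetical helper: list init container names from kubectl's
# "Defaulted container ..." message.
init_containers_from_msg() {
  # Keep the part after "out of:", split on commas,
  # and print only the entries marked "(init)".
  printf '%s\n' "$1" \
    | sed 's/^.*out of: //' \
    | tr ',' '\n' \
    | sed -n 's/^ *\([^ ]*\) (init) *$/\1/p'
}

msg='Defaulted container "calico-node" out of: calico-node, upgrade-ipam (init), install-cni (init), mount-bpffs (init)'
init_containers_from_msg "$msg"

# Then request an init container's log explicitly with -c, for example:
#   kubectl logs -n kube-system calico-node-4j7td -c install-cni
# and, for the previous (crashed) attempt of that container:
#   kubectl logs -n kube-system calico-node-4j7td -c install-cni --previous
```

With the message from this post, the helper prints `upgrade-ipam`, `install-cni` and `mount-bpffs`, one per line. Which of them is actually failing still has to be read from the pod status or the container logs.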

The details are as follows:

[root@k8s-master01 ~]# kubectl logs -n kube-system calico-node-4j7td
Defaulted container "calico-node" out of: calico-node, upgrade-ipam (init), install-cni (init), mount-bpffs (init)
Error from server (BadRequest): container "calico-node" in pod "calico-node-4j7td" is waiting to start: PodInitializing


[root@k8s-master01 ~]# kubectl get po -A -owide
NAMESPACE     NAME                                       READY   STATUS                  RESTARTS         AGE   IP                NODE           NOMINATED NODE   READINESS GATES
kube-system   calico-kube-controllers-6747f75cdc-pzhhj   1/1     Running                 0                52m   172.27.14.194     k8s-node02     <none>           <none>
kube-system   calico-node-4j7td                          0/1     Init:CrashLoopBackOff   16 (3m58s ago)   43m   192.168.145.161   k8s-master01   <none>           <none>
kube-system   calico-node-hwttj                          1/1     Running                 0                52m   192.168.145.162   k8s-master02   <none>           <none>

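[Key point] On the node where the pod fails to start, look at the container logs. If no running container shows up, look at the exited ones, as follows: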

[root@k8s-master01 ~]# crictl ps
CONTAINER           IMAGE               CREATED             STATE               NAME                ATTEMPT             POD ID              POD
[root@k8s-master01 ~]# crictl ps -a
CONTAINER           IMAGE               CREATED             STATE               NAME                ATTEMPT             POD ID              POD
6e2c272bcbd41       8479c67f450d3       4 minutes ago       Exited              install-cni         17                  0b0d02f6ac576       calico-node-4j7td
5c5f54bbda5db       8479c67f450d3       50 minutes ago      Exited              upgrade-ipam        1                   0b0d02f6ac576       calico-node-4j7td
[root@k8s-master01 ~]# crictl logs 6e2c272bcbd41
time="2022-12-01T15:36:27Z" level=info msg="Running as a Kubernetes pod" source="install.go:145"
2022-12-01 15:36:28.196 [INFO][1] cni-installer/<nil> <nil>: File is already up to date, skipping file="/host/opt/cni/bin/bandwidth"
2022-12-01 15:36:28.197 [INFO][1] cni-installer/<nil> <nil>: Installed /host/opt/cni/bin/bandwidth
2022-12-01 15:36:28.297 [INFO][1] cni-installer/<nil> <nil>: File is already up to date, skipping file="/host/opt/cni/bin/calico"
2022-12-01 15:36:28.297 [INFO][1] cni-installer/<nil> <nil>: Installed /host/opt/cni/bin/calico
2022-12-01 15:36:28.373 [INFO][1] cni-installer/<nil> <nil>: File is already up to date, skipping file="/host/opt/cni/bin/calico-ipam"
2022-12-01 15:36:28.373 [INFO][1] cni-installer/<nil> <nil>: Installed /host/opt/cni/bin/calico-ipam
2022-12-01 15:36:28.376 [INFO][1] cni-installer/<nil> <nil>: File is already up to date, skipping file="/host/opt/cni/bin/flannel"
2022-12-01 15:36:28.376 [INFO][1] cni-installer/<nil> <nil>: Installed /host/opt/cni/bin/flannel
2022-12-01 15:36:28.381 [INFO][1] cni-installer/<nil> <nil>: File is already up to date, skipping file="/host/opt/cni/bin/host-local"
2022-12-01 15:36:28.381 [INFO][1] cni-installer/<nil> <nil>: Installed /host/opt/cni/bin/host-local
2022-12-01 15:36:28.447 [INFO][1] cni-installer/<nil> <nil>: File is already up to date, skipping file="/host/opt/cni/bin/install"
2022-12-01 15:36:28.447 [INFO][1] cni-installer/<nil> <nil>: Installed /host/opt/cni/bin/install
2022-12-01 15:36:28.451 [INFO][1] cni-installer/<nil> <nil>: File is already up to date, skipping file="/host/opt/cni/bin/loopback"
2022-12-01 15:36:28.451 [INFO][1] cni-installer/<nil> <nil>: Installed /host/opt/cni/bin/loopback
2022-12-01 15:36:28.455 [INFO][1] cni-installer/<nil> <nil>: File is already up to date, skipping file="/host/opt/cni/bin/portmap"
2022-12-01 15:36:28.455 [INFO][1] cni-installer/<nil> <nil>: Installed /host/opt/cni/bin/portmap
2022-12-01 15:36:28.459 [INFO][1] cni-installer/<nil> <nil>: File is already up to date, skipping file="/host/opt/cni/bin/tuning"
2022-12-01 15:36:28.459 [INFO][1] cni-installer/<nil> <nil>: Installed /host/opt/cni/bin/tuning
2022-12-01 15:36:28.459 [INFO][1] cni-installer/<nil> <nil>: Wrote Calico CNI binaries to /host/opt/cni/bin

2022-12-01 15:36:28.494 [INFO][1] cni-installer/<nil> <nil>: CNI plugin version: v3.25.0-0.dev-519-g2fee4ee0153d

2022-12-01 15:36:28.494 [INFO][1] cni-installer/<nil> <nil>: /host/secondary-bin-dir is not writeable, skipping
W1201 15:36:28.494754       1 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
2022-12-01 15:36:49.499 [ERROR][1] cni-installer/<nil> <nil>: Unable to create token for CNI kubeconfig error=Post "https://10.96.0.1:443/api/v1/namespaces/kube-system/serviceaccounts/calico-node/token": dial tcp 10.96.0.1:443: connect: connection refused
2022-12-01 15:36:49.499 [FATAL][1] cni-installer/<nil> <nil>: Unable to create token for CNI kubeconfig error=Post "https://10.96.0.1:443/api/v1/namespaces/kube-system/serviceaccounts/calico-node/token": dial tcp 10.96.0.1:443: connect: connection refused
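
The manual steps above (list all containers, pick the exited ones for the pod, read their logs) can be sketched as two small filters. Both function names are hypothetical, and the parsing assumes the `crictl ps -a` column layout shown in this post.

```shell
# Hypothetical helper: read `crictl ps -a` output on stdin and print
# "<container-id> <container-name>" for every Exited container of pod $1.
exited_containers() {
  awk -v pod="$1" '
    $NF == pod {
      # CREATED spans several words ("4 minutes ago"), so locate the
      # STATE column by value; the container name is the next field.
      for (i = 2; i < NF; i++)
        if ($i == "Exited") { print $1, $(i + 1); break }
    }'
}

# Hypothetical helper: keep only ERROR and FATAL lines from a Calico log.
fatal_lines() {
  grep -E '\[(ERROR|FATAL)\]'
}

# Usage on the affected node (not run here):
#   crictl ps -a | exited_containers calico-node-4j7td
#   crictl logs 6e2c272bcbd41 2>&1 | fatal_lines
```

In this case the filtered log comes down to the two "Unable to create token for CNI kubeconfig" lines, which point at the API server address 10.96.0.1:443 rather than at Calico itself.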
[root@k8s-master01 ~]# kubectl get svc
NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   2d16h
[root@k8s-master01 ~]# kubectl get svc  -A
NAMESPACE     NAME           TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
default       kubernetes     ClusterIP   10.96.0.1       <none>        443/TCP    2d16h
kube-system   calico-typha   ClusterIP   10.107.84.250   <none>        5473/TCP   64m
[root@k8s-master01 ~]# telnet 10.96.0.1 443
Trying 10.96.0.1...
^C
[root@k8s-master01 ~]# ping 10.96.0.1
PING 10.96.0.1 (10.96.0.1) 56(84) bytes of data.
^C
--- 10.96.0.1 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

The above shows that 10.96.0.1 is unreachable from this node. From another node it is reachable:

[root@k8s-node02 ~]#
[root@k8s-node02 ~]# telnet 10.96.0.1 443
Trying 10.96.0.1...
Connected to 10.96.0.1.
Escape character is '^]'.
^CConnection closed by foreign host.
[root@k8s-node02 ~]# ping 10.96.0.1
PING 10.96.0.1 (10.96.0.1) 56(84) bytes of data.
64 bytes from 10.96.0.1: icmp_seq=1 ttl=64 time=0.061 ms
64 bytes from 10.96.0.1: icmp_seq=2 ttl=64 time=0.073 ms
^C
--- 10.96.0.1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1057ms
rtt min/avg/max/mdev = 0.061/0.067/0.073/0.006 ms
[root@k8s-node02 ~]# ping 10.96.0.1
PING 10.96.0.1 (10.96.0.1) 56(84) bytes of data.
64 bytes from 10.96.0.1: icmp_seq=1 ttl=64 time=0.056 ms
64 bytes from 10.96.0.1: icmp_seq=2 ttl=64 time=0.067 ms
^C
--- 10.96.0.1 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1003ms
rtt min/avg/max/mdev = 0.056/0.061/0.067/0.009 ms
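
The same comparison between nodes can be scripted. The sketch below pulls the failing address out of the "dial tcp" error and probes it over TCP with a timeout instead of an interactive telnet. Both function names are hypothetical, and the probe assumes `bash` and coreutils `timeout` are present on the node. As far as I know, whether a ClusterIP answers ping depends on the kube-proxy mode, so the TCP check is the more telling of the two.

```shell
# Hypothetical helper: extract "host:port" from a Go "dial tcp" error line.
dial_target() {
  printf '%s\n' "$1" | sed -n 's/.*dial tcp \([0-9.]*:[0-9]*\).*/\1/p'
}

# Hypothetical helper: succeed if a TCP connection to $1:$2 opens within
# 3 seconds (uses bash's /dev/tcp pseudo-device).
tcp_reachable() {
  timeout 3 bash -c ": </dev/tcp/$1/$2" 2>/dev/null
}

err='Unable to create token for CNI kubeconfig error=Post "https://10.96.0.1:443/api/v1/namespaces/kube-system/serviceaccounts/calico-node/token": dial tcp 10.96.0.1:443: connect: connection refused'
target="$(dial_target "$err")"
echo "$target"

# Usage on each node (not run here):
#   host="${target%:*}"; port="${target##*:}"
#   if tcp_reachable "$host" "$port"; then echo reachable; else echo unreachable; fi
```

If the probe fails on one node only, as it did here, the problem is more likely local to that node (service proxying, firewall, routing) than in the API server.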

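Next I checked the kube-proxy service on the problem node. Sure enough, it was disabled and not running. Annoying. After starting it and deleting the failed pod, everything recovered immediately.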

[root@k8s-master01 ~]# systemctl status kube-proxy
● kube-proxy.service - Kubernetes Kube Proxy
   Loaded: loaded (/usr/lib/systemd/system/kube-proxy.service; disabled; vendor preset: disabled)
   Active: inactive (dead)
     Docs: https://github.com/kubernetes/kubernetes
[root@k8s-master01 ~]# systemctl enable --now kube-proxy
Created symlink from /etc/systemd/system/multi-user.target.wants/kube-proxy.service to /usr/lib/systemd/system/kube-proxy.service.
[root@k8s-master01 ~]# systemctl status kube-proxy
● kube-proxy.service - Kubernetes Kube Proxy
   Loaded: loaded (/usr/lib/systemd/system/kube-proxy.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2022-12-01 23:43:25 CST; 2s ago
     Docs: https://github.com/kubernetes/kubernetes
 Main PID: 20930 (kube-proxy)
    Tasks: 7
   Memory: 52.7M
   CGroup: /system.slice/kube-proxy.service
           └─20930 /usr/local/bin/kube-proxy --config=/etc/kubernetes/kube-proxy.yaml --v=2



[root@k8s-master01 ~]# kubectl get po -A -owide -w
NAMESPACE     NAME                                       READY   STATUS                  RESTARTS         AGE   IP                NODE           NOMINATED NODE   READINESS GATES
kube-system   calico-kube-controllers-6747f75cdc-pzhhj   1/1     Running                 0                68m   172.27.14.194     k8s-node02     <none>           <none>
kube-system   calico-node-4j7td                          0/1     Init:CrashLoopBackOff   19 (3m27s ago)   58m   192.168.145.161   k8s-master01   <none>           <none>
kube-system   calico-node-hwttj                          1/1     Running                 0                68m   192.168.145.162   k8s-master02   <none>           <none>
kube-system   calico-node-rjbz8                          1/1     Running                 0                68m   192.168.145.163   k8s-master03   <none>           <none>
kube-system   calico-node-rmjqj                          1/1     Running                 1 (55m ago)      68m   192.168.145.165   k8s-node02     <none>           <none>
kube-system   calico-node-vd7w2                          1/1     Running                 0                68m   192.168.145.164   k8s-node01     <none>           <none>
kube-system   calico-typha-6cdc4b4fbc-sb85z              1/1     Running                 0                68m   192.168.145.164   k8s-node01     <none>           <none>


^C[root@k8s-master01 ~]# kubectl delete po -n kube-system calico-node-4j7td
pod "calico-node-4j7td" deleted
[root@k8s-master01 ~]# kubectl get po -A -owide -w
NAMESPACE     NAME                                       READY   STATUS     RESTARTS      AGE   IP                NODE           NOMINATED NODE   READINESS GATES
kube-system   calico-kube-controllers-6747f75cdc-pzhhj   1/1     Running    0             68m   172.27.14.194     k8s-node02     <none>           <none>
kube-system   calico-node-hwttj                          1/1     Running    0             68m   192.168.145.162   k8s-master02   <none>           <none>
kube-system   calico-node-jbvlc                          0/1     Init:1/3   0             2s    192.168.145.161   k8s-master01   <none>           <none>
kube-system   calico-node-rjbz8                          1/1     Running    0             68m   192.168.145.163   k8s-master03   <none>           <none>
kube-system   calico-node-rmjqj                          1/1     Running    1 (55m ago)   68m   192.168.145.165   k8s-node02     <none>           <none>
kube-system   calico-node-vd7w2                          1/1     Running    0             68m   192.168.145.164   k8s-node01     <none>           <none>
kube-system   calico-typha-6cdc4b4fbc-sb85z              1/1     Running    0             68m   192.168.145.164   k8s-node01     <none>           <none>
kube-system   calico-node-jbvlc                          0/1     Init:1/3   0             2s    192.168.145.161   k8s-master01   <none>           <none>
kube-system   calico-node-jbvlc                          0/1     Init:2/3   0             3s    192.168.145.161   k8s-master01   <none>           <none>
kube-system   calico-node-jbvlc                          0/1     PodInitializing   0             4s    192.168.145.161   k8s-master01   <none>           <none>
kube-system   calico-node-jbvlc                          0/1     Running           0             5s    192.168.145.161   k8s-master01   <none>           <none>
^C[root@k8s-master01 ~]#
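
To avoid the same surprise on other nodes, the kube-proxy check can be run everywhere in one pass. This is a sketch that only applies when kube-proxy runs as a systemd unit, as in this cluster; `unit_verdict` is a hypothetical helper, and the loop in the comments assumes passwordless ssh to the node names seen in this post.

```shell
# Hypothetical helper: turn the outputs of `systemctl is-active` and
# `systemctl is-enabled` into a short verdict.
unit_verdict() {
  active="$1"
  enabled="$2"
  if [ "$active" = "active" ] && [ "$enabled" = "enabled" ]; then
    echo "ok"
  elif [ "$active" = "active" ]; then
    echo "running but will not start on boot"
  else
    echo "not running"
  fi
}

# The states seen on k8s-master01 before the fix.
unit_verdict inactive disabled

# Usage across nodes (not run here):
#   for n in k8s-master01 k8s-master02 k8s-master03 k8s-node01 k8s-node02; do
#     printf '%s: ' "$n"
#     unit_verdict "$(ssh "$n" systemctl is-active kube-proxy)" \
#                  "$(ssh "$n" systemctl is-enabled kube-proxy)"
#   done
```

Any node that does not report `ok` can be fixed the same way as above, with `systemctl enable --now kube-proxy`.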

As the saying goes, nothing is difficult for those who set their mind to it. But determination alone is not enough: use your head, and do not waste time that does not need to be wasted.
