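A Deployment passes through several states during its lifecycle, which fall into three broad categories:
- Progressing: a rolling update is in progress
- Complete
- Failed to progress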
Progressing state
Kubernetes marks a Deployment as progressing while any of the following tasks is being performed:
- The Deployment creates a new ReplicaSet
- The Deployment is scaling up its newest ReplicaSet
- The Deployment is scaling down its older ReplicaSets
- New Pods become ready or available
You can use the command kubectl rollout status to monitor the progress of a Deployment's rolling update.
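Complete state
Kubernetes marks a Deployment as complete when it meets all of the following conditions:
- All Pod replicas of the Deployment have been updated to the latest version you specified
- All Pod replicas of the Deployment are available
- No old ReplicaSets of the Deployment are still running
You can run kubectl rollout status to check whether a Deployment has reached the complete state. If it has, the command exits with code 0.
kubectl rollout status deployment.v1.apps/nginx-deployment
Output: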
Waiting for rollout to finish: 2 of 3 updated replicas are available...
deployment.apps/nginx-deployment successfully rolled out
$ echo $?
0
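The three completion conditions listed above can be expressed as a check on the replica counters in the Deployment's status. The following is a minimal sketch rather than the logic kubectl itself uses: the function name is_rollout_complete and the sample dictionaries are hypothetical, and the sketch deliberately ignores observedGeneration. The counter names (replicas, updatedReplicas, availableReplicas) are the fields that appear under status in the API object.

```python
def is_rollout_complete(desired_replicas: int, status: dict) -> bool:
    """Check the three 'complete' conditions against a Deployment status dict.

    Counters missing from the status are treated as 0.
    """
    total = status.get("replicas", 0)
    updated = status.get("updatedReplicas", 0)
    available = status.get("availableReplicas", 0)

    # 1. Every desired replica runs the latest Pod template.
    all_updated = updated == desired_replicas
    # 2. Every updated replica is available.
    all_available = available == updated
    # 3. No Pods from old ReplicaSets remain.
    no_old_replicas = total == updated

    return all_updated and all_available and no_old_replicas


# Hypothetical sample data for illustration only.
finished = {"replicas": 3, "updatedReplicas": 3, "availableReplicas": 3}
in_progress = {"replicas": 4, "updatedReplicas": 1, "availableReplicas": 3}

print(is_rollout_complete(3, finished))
print(is_rollout_complete(3, in_progress))
```

In the second sample, one new Pod exists alongside three old ones, so the check reports that the rollout is not yet complete.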
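Failed state
While rolling out its newest ReplicaSet, a Deployment may get stuck and never reach the complete state. Any of the following can cause this:
- Insufficient cluster resources
- Readiness probe failures
- Image pull errors
- Insufficient permissions
- Resource limits (for example, limit ranges)
- Application misconfiguration that prevents the container from starting
If you set the .spec.progressDeadlineSeconds field in the Deployment definition, the Deployment Controller waits for the specified number of seconds and then marks the Deployment as failed to progress. For example, the following command makes the Deployment Controller wait 10 minutes for the Deployment to progress:
kubectl patch deployment.v1.apps/nginx-deployment1 -p '{"spec":{"progressDeadlineSeconds":600}}'
Output: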
deployment.apps/nginx-deployment1 patched (no change)
Once the deadline has passed, the Deployment Controller adds a DeploymentCondition with the following attributes to the Deployment's .status.conditions field:
- Type=Progressing
- Status=False
- Reason=ProgressDeadlineExceeded
Apart from adding a DeploymentCondition with Reason=ProgressDeadlineExceeded to the .status.conditions field, Kubernetes takes no action on a stalled Deployment. You can run kubectl rollout undo to roll the Deployment back to its previous revision.
If you pause a Deployment, Kubernetes does not check progress against .spec.progressDeadlineSeconds.
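If the .spec.progressDeadlineSeconds value you set is too short, or for other reasons, you may see transient errors in the Deployment's status. For example, suppose your cluster lacks sufficient resources.
Run the command:
kubectl describe deployment nginx-deployment
Output: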
<...>
Conditions:
  Type            Status  Reason
  ----            ------  ------
  Available       True    MinimumReplicasAvailable
  Progressing     True    ReplicaSetUpdated
  ReplicaFailure  True    FailedCreate
<...>
Run the following command to view the Deployment's status:
kubectl get deployment nginx-deployment1 -o yaml
Output:
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "3"
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"apps/v1","kind":"Deployment","metadata":{"annotations":{},"labels":{"app":"nginx"},"name":"nginx-deployment1","namespace":"default"},"spec":{"replicas":3,"selector":{"matchLabels":{"app":"nginx"}},"template":{"metadata":{"labels":{"app":"nginx"}},"spec":{"containers":[{"image":"nginx:1.7.9","name":"nginx","ports":[{"containerPort":80}],"resources":{"limits":{"cpu":"100000m"}}}]}}}}
  creationTimestamp: "2023-05-29T07:12:58Z"
  generation: 6
  labels:
    app: nginx
  name: nginx-deployment1
  namespace: default
  resourceVersion: "19511985"
  uid: 78301c5d-9cf6-476b-9ee3-0d483f318df4
spec:
  progressDeadlineSeconds: 600
  replicas: 3
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: nginx
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: nginx
    spec:
      containers:
      - image: nginx:1.7.9
        imagePullPolicy: IfNotPresent
        name: nginx
        ports:
        - containerPort: 80
          protocol: TCP
        resources:
          limits:
            cpu: "100"
            memory: 512Mi
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  availableReplicas: 3
  conditions:
  - lastTransitionTime: "2023-05-29T07:13:37Z"
    lastUpdateTime: "2023-05-29T07:13:37Z"
    message: Deployment has minimum availability.
    reason: MinimumReplicasAvailable
    status: "True"
    type: Available
  - lastTransitionTime: "2023-05-29T07:59:16Z"
    lastUpdateTime: "2023-05-29T09:48:53Z"
    message: ReplicaSet "nginx-deployment1-76b7bc4447" is progressing.
    reason: ReplicaSetUpdated
    status: "True"
    type: Progressing
  observedGeneration: 6
  readyReplicas: 3
  replicas: 4
  unavailableReplicas: 1
  updatedReplicas: 1
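The Conditions table printed by kubectl describe corresponds to the entries under status.conditions in the object above. As a rough illustration, the sketch below extracts the same three columns from the JSON form of a Deployment, such as the output of kubectl get deployment with -o json. The function name condition_rows is hypothetical, and the sample document is a trimmed copy of the two conditions shown above rather than real command output.

```python
import json


def condition_rows(deployment_json: str) -> list[tuple[str, str, str]]:
    """Return (type, status, reason) for each entry in .status.conditions."""
    deployment = json.loads(deployment_json)
    conditions = deployment.get("status", {}).get("conditions", [])
    return [(c["type"], c["status"], c.get("reason", "")) for c in conditions]


# Trimmed sample built from the status section shown above.
sample = json.dumps({
    "status": {
        "conditions": [
            {"type": "Available", "status": "True",
             "reason": "MinimumReplicasAvailable"},
            {"type": "Progressing", "status": "True",
             "reason": "ReplicaSetUpdated"},
        ]
    }
})

# Print the rows in aligned columns, similar to the describe output.
for cond_type, status, reason in condition_rows(sample):
    print(f"{cond_type:<16}{status:<8}{reason}")
```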
Eventually, once the Deployment exceeds .spec.progressDeadlineSeconds, Kubernetes updates the Deployment's Progressing condition:
Conditions:
  Type            Status  Reason
  ----            ------  ------
  Available       True    MinimumReplicasAvailable
  Progressing     False   ProgressDeadlineExceeded
  ReplicaFailure  True    FailedCreate
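How can you resolve a resource shortage? You can try the following measures:
- Scale down your Deployment
- Scale down other Deployments
- Add compute nodes to the cluster
Once resources are sufficient and the Deployment has completed its rolling update, the Deployment's Progressing condition reports success (Status=True and Reason=NewReplicaSetAvailable).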
Conditions:
  Type          Status  Reason
  ----          ------  ------
  Available     True    MinimumReplicasAvailable
  Progressing   True    NewReplicaSetAvailable
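- Type=Available with Status=True means that your Deployment has the minimum number of available Pods (minimum availability). Minimum availability is determined by the parameters of the Deployment's strategy.
- Type=Progressing with Status=True means that your Deployment is either in the middle of a rolling update, or has successfully completed the update and the number of Pods has reached the minimum available.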
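The condition types and reasons described above are enough to tell the three states apart in a script. The sketch below is one possible mapping, assuming that only the Progressing condition is consulted; the function names find_condition and rollout_state and the "unknown" fallback are hypothetical, while the reason strings are the ones shown in the Conditions tables.

```python
from typing import Optional


def find_condition(conditions: list[dict], cond_type: str) -> Optional[dict]:
    """Return the first condition whose type matches, or None."""
    for condition in conditions:
        if condition.get("type") == cond_type:
            return condition
    return None


def rollout_state(conditions: list[dict]) -> str:
    """Map .status.conditions to 'progressing', 'complete', 'failed' or 'unknown'."""
    progressing = find_condition(conditions, "Progressing")
    if progressing is None:
        return "unknown"
    status = progressing.get("status")
    reason = progressing.get("reason")
    if status == "False" and reason == "ProgressDeadlineExceeded":
        return "failed"
    if status == "True" and reason == "NewReplicaSetAvailable":
        return "complete"
    if status == "True":
        # Any other reason with Status=True, e.g. ReplicaSetUpdated.
        return "progressing"
    return "unknown"


# Conditions copied from the table shown after the deadline is exceeded.
timed_out = [
    {"type": "Available", "status": "True", "reason": "MinimumReplicasAvailable"},
    {"type": "Progressing", "status": "False", "reason": "ProgressDeadlineExceeded"},
    {"type": "ReplicaFailure", "status": "True", "reason": "FailedCreate"},
]
print(rollout_state(timed_out))
```

Note that Status=True alone does not distinguish a running rollout from a finished one; the reason field is what separates the two cases.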
You can use kubectl rollout status to check whether a Deployment has failed to progress. The command returns a non-zero exit code if the Deployment has exceeded the deadline specified by .spec.progressDeadlineSeconds.
kubectl rollout status deployment.v1.apps/nginx-deployment
Output:
Waiting for rollout to finish: 2 out of 3 new replicas have been updated...
error: deployment "nginx" exceeded its progress deadline
$ echo $?
1
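The exit code shown above is what makes this command usable in automation. The following sketch wraps an arbitrary command and reports whether it exited with code 0; the function name rollout_succeeded is hypothetical, and the command list simply repeats the kubectl invocation from this section. It assumes kubectl is installed and configured wherever the script runs.

```python
import subprocess
import sys


def rollout_succeeded(command: list[str]) -> bool:
    """Run the command and return True only if it exits with code 0."""
    result = subprocess.run(command, check=False)
    if result.returncode != 0:
        print(f"command exited with code {result.returncode}", file=sys.stderr)
    return result.returncode == 0


# The same command as above, split into arguments.
ROLLOUT_STATUS = [
    "kubectl", "rollout", "status", "deployment.v1.apps/nginx-deployment",
]

# Typical use in a deployment script:
#     if not rollout_succeeded(ROLLOUT_STATUS):
#         sys.exit(1)
```

A non-zero exit code can also come from unrelated errors, such as a missing Deployment or an unreachable cluster, so a script may want to inspect the command's output before deciding to roll back.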
Operating on a failed Deployment
Any operation that applies to a Deployment can also be performed on a Deployment in the Failed state:
- Scale up or scale down
- Roll back to the previous revision
- Pause the Deployment in order to apply multiple updates to its Pod template