Kubernetes in Production: A Complete Guide to Microservice Deployment and Autoscaling
Hi everyone, I'm 迪哥. We've covered plenty of architecture design topics before; today it's time for something more hands-on: how to deploy microservices on Kubernetes in production, and how to make autoscaling actually work. Everything here is distilled from the many pitfalls my team has hit along the way.
Why Kubernetes?
In the cloud-native era, Kubernetes has become the de facto standard for container orchestration. Yet many teams find that operational complexity goes up sharply after adopting K8s, usually because a few core questions were never answered:
- Graceful startup and shutdown: a Pod must not receive traffic before it is ready, and must finish in-flight requests before it exits
- Resource sizing: set requests too low and you get OOM kills; set them too high and you burn money
- Making autoscaling actually work: an HPA is not something you configure once and forget
I'll walk through solutions to each of these below.
Core Configuration: Running Services Gracefully
1. Graceful Startup and Shutdown
Many teams see brief 503 errors whenever a service restarts; that is exactly what happens when graceful startup and shutdown are not configured.
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
  namespace: production
spec:
  replicas: 3
  selector:
    matchLabels:
      app: order-service
  template:
    metadata:
      labels:
        app: order-service
    spec:
      terminationGracePeriodSeconds: 60  # allow up to 60s for in-flight requests to finish
      containers:
        - name: order-service
          image: registry.example.com/order-service:v1.2.3
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
            failureThreshold: 3
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
            failureThreshold: 3
          lifecycle:
            preStop:
              exec:
                command: ["/bin/sh", "-c", "sleep 10"]  # give kube-proxy time to remove this Pod from the endpoints
```
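The manifest only handles the platform side. Since the probe paths above are Spring Boot Actuator endpoints, the application itself should also be told to shut down gracefully; here is a minimal sketch of the matching application.yaml, assuming Spring Boot 2.3+ with Actuator on the classpath:

```yaml
# application.yaml (assumes Spring Boot 2.3+ with Actuator)
server:
  shutdown: graceful                 # stop accepting new requests, let in-flight ones finish
spring:
  lifecycle:
    timeout-per-shutdown-phase: 30s  # must fit inside terminationGracePeriodSeconds (60s above)
management:
  endpoint:
    health:
      probes:
        enabled: true                # exposes /actuator/health/readiness and /liveness
```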
2. Resource Quotas and LimitRange

```yaml
# Namespace-level resource limits
apiVersion: v1
kind: LimitRange
metadata:
  name: production-limits
  namespace: production
spec:
  limits:
    - type: Container
      default:           # applied as the limit when a container declares none
        cpu: 500m
        memory: 512Mi
      defaultRequest:    # applied as the request when a container declares none
        cpu: 200m
        memory: 256Mi
      max:
        cpu: 4
        memory: 4Gi
      min:
        cpu: 100m
        memory: 128Mi
```
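A LimitRange constrains each individual container but puts no cap on the namespace as a whole; for that, pair it with a ResourceQuota. A minimal sketch (the numbers are placeholders, size them for your own cluster):

```yaml
# Namespace-wide aggregate cap, complementing the per-container LimitRange above
apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota
  namespace: production
spec:
  hard:
    requests.cpu: "20"      # total CPU requests across the namespace
    requests.memory: 40Gi
    limits.cpu: "40"
    limits.memory: 80Gi
    pods: "100"
```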
3. Anti-Affinity for High Availability

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: order-service
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    spec:
      affinity:
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchLabels:
                    app: order-service
                topologyKey: kubernetes.io/hostname  # prefer spreading replicas across nodes
```
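Anti-affinity spreads replicas across nodes, and maxUnavailable: 0 protects you during rollouts, but neither guards against voluntary disruptions such as node drains during a cluster upgrade. A PodDisruptionBudget closes that gap; a minimal sketch for the same Deployment:

```yaml
# Keep at least 2 of the 3 replicas alive through node drains and upgrades
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: order-service-pdb
  namespace: production
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: order-service
```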
Autoscaling: HPA + VPA + CronHPA
HorizontalPodAutoscaler (HPA)
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: order-service-hpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: order-service
  minReplicas: 3
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "1000"
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 300  # 5-minute cooldown to prevent flapping
      policies:
        - type: Percent
          value: 10
          periodSeconds: 60
    scaleUp:
      stabilizationWindowSeconds: 0
      policies:
        - type: Percent
          value: 100
          periodSeconds: 15
        - type: Pods
          value: 4
          periodSeconds: 15
      selectPolicy: Max
```
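Note that the Pods metric above, http_requests_per_second, is not built into Kubernetes: the HPA can only see it through the custom metrics API, typically served by prometheus-adapter. A minimal sketch of an adapter rule, assuming the application exports a Prometheus counter named http_requests_total with namespace and pod labels (adjust the label names to your scrape config):

```yaml
# prometheus-adapter rule: turns the http_requests_total counter
# into the http_requests_per_second metric the HPA asks for
rules:
  - seriesQuery: 'http_requests_total{namespace!="",pod!=""}'
    resources:
      overrides:
        namespace: {resource: "namespace"}
        pod: {resource: "pod"}
    name:
      matches: "^(.*)_total$"
      as: "${1}_per_second"
    metricsQuery: 'sum(rate(<<.Series>>{<<.LabelMatchers>>}[2m])) by (<<.GroupBy>>)'
```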
VerticalPodAutoscaler (VPA)
HPA scales the replica count; VPA scales the resource size. For stateful services, or services that are hard to scale horizontally, VPA can adjust CPU and memory automatically. One caveat: a VPA in Auto mode and an HPA should not both act on CPU/memory for the same workload, or the two controllers will fight each other; combining them is safe when the HPA scales on custom metrics, as above:
```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: order-service-vpa
  namespace: production
spec:
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: order-service
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
      - containerName: order-service
        minAllowed:
          cpu: 100m
          memory: 128Mi
        maxAllowed:
          cpu: 2
          memory: 2Gi
```
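If you only want recommendations without automatic pod eviction, which is the observe-first approach suggested in the summary below, switch the update mode to "Off"; a sketch (the -dryrun name is just for illustration):

```yaml
# Recommendation-only VPA: computes target requests but never evicts Pods.
# Read the result with: kubectl describe vpa order-service-vpa-dryrun -n production
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: order-service-vpa-dryrun
  namespace: production
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: order-service
  updatePolicy:
    updateMode: "Off"
```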
CronHPA: Scheduled Scaling
Scale out ahead of a big promotion and scale back automatically once the event ends. The exact schema depends on which CronHPA controller you run; the snippet below assumes a controller that reads its schedules from a ConfigMap:
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cronhpa-config
  namespace: production
data:
  config.yaml: |
    - name: order-service
      crons:
        - schedule: "0 2 * * 6"   # scale out at 2:00 AM every Saturday
          minReplicas: 10
          maxReplicas: 30
        - schedule: "0 22 * * 7"  # scale back at 10:00 PM every Sunday
          minReplicas: 3
          maxReplicas: 20
```
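If you run Alibaba's open-source kubernetes-cronhpa-controller instead, the same intent is expressed as a CRD rather than a ConfigMap. A rough sketch, to be verified against the schema of the controller version you deploy (its cron expressions carry a leading seconds field):

```yaml
apiVersion: autoscaling.alibabacloud.com/v1beta1
kind: CronHorizontalPodAutoscaler
metadata:
  name: order-service-cronhpa
  namespace: production
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: order-service
  jobs:
    - name: scale-up-before-weekend
      schedule: "0 0 2 * * 6"   # seconds minutes hours day month weekday
      targetSize: 10
    - name: scale-down-after-weekend
      schedule: "0 0 22 * * 7"
      targetSize: 3
```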
Canary Releases: Shipping New Versions Smoothly
Weight-Based Canary Release
```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: order-service
spec:
  hosts:
    - order-service
  http:
    - route:
        - destination:
            host: order-service  # subsets are resolved by the DestinationRule below
            subset: v1
          weight: 90
        - destination:
            host: order-service
            subset: v2
          weight: 10
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: order-service
spec:
  host: order-service
  subsets:
    - name: v1
      labels:
        version: v1
    - name: v2
      labels:
        version: v2
```
Content-Based Canary Routing

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: order-service
spec:
  hosts:
    - order-service
  http:
    - match:
        - headers:
            x-user-id:
              regex: ".*test.*"  # route test users to the new version
      route:
        - destination:
            host: order-service
            subset: v2
    - route:                     # everyone else stays on v1
        - destination:
            host: order-service
            subset: v1
          weight: 100
```
Monitoring and Alerting: Nip Problems in the Bud
Prometheus + Grafana Core Metrics
```yaml
# Key alerting rules
# 1. Pod restart rate
- alert: PodRestartingTooMuch
  expr: rate(kube_pod_container_status_restarts_total[5m]) > 0.1
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Pod is restarting too frequently"
# 2. Memory usage vs. limit (build the analogous CPU rule from container_cpu_usage_seconds_total)
- alert: HighResourceUsage
  expr: |
    sum(container_memory_working_set_bytes{container!=""}) by (namespace, pod)
      / sum(kube_pod_container_resource_limits{resource="memory"}) by (namespace, pod)
      > 0.8
  for: 10m
  labels:
    severity: warning
  annotations:
    summary: "Resource usage above 80% of the limit"
# 3. HPA unable to scale (named kube_hpa_status_condition on kube-state-metrics v1.x)
- alert: HPACannotScale
  expr: kube_horizontalpodautoscaler_status_condition{condition="AbleToScale", status="false"} == 1
  for: 5m
  labels:
    severity: critical
  annotations:
    summary: "HPA is unable to scale the workload"
```
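The rules above watch the Kubernetes layer only. To cover the application layer called out in the summary, add latency alerts as well; a sketch that assumes the service exports a standard http_request_duration_seconds histogram:

```yaml
# Application-layer alert, appended to the same rule file
- alert: HighP99Latency
  expr: |
    histogram_quantile(0.99,
      sum(rate(http_request_duration_seconds_bucket{app="order-service"}[5m])) by (le)) > 1
  for: 10m
  labels:
    severity: warning
  annotations:
    summary: "order-service P99 latency is above 1s"
```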
Lessons Learned
- Graceful startup and shutdown come first: terminationGracePeriodSeconds + preStop + readinessProbe, all three, on every service
- Size resources from data: let VPA observe actual usage first, then set realistic requests and limits
- Never run an HPA bare: always set behavior to cap the scaling rate, or a metrics blip can snowball into an avalanche
- Canary releases keep you safe: give a new version 5% of the traffic first, watch it, then ramp up gradually
- Monitoring must cover the whole stack: the K8s layer, the application layer, and the business layer; none is optional
Our Docker at home keeps trying to run outside lately. Apparently it needs elastic scaling too: the 'container' at home is running out of capacity 😂
I'm 迪哥. See you next time!
Previous posts:
- "From Monolith to Microservices: An Architecture Decomposition Walkthrough"
- "Redis High-Availability Architecture in Practice"