Kubernetes Learning Notes from Scratch (5)
By 后知后觉

Orchestrating containers with Kubernetes has become a technical bellwether of the microservices era.

Update History

2022-08

2020-11


Deploying Core Components

metric-server

metric-server is an essential component of a Kubernetes cluster: it is what collects container resource usage, and features such as autoscaling depend on it.

Installation takes a single command:

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml

Installing this way runs into a problem: the metrics-server image is pulled successfully, but the service never becomes ready, with the status stuck at (0/1). Inspect the pod with:

kubectl describe pods -n kube-system metrics-server-xxxxxxxxx-xxxxx 

The Events section at the end of the output shows:

Events:
  Type     Reason     Age               From               Message
  ----     ------     ----              ----               -------
  Normal   Scheduled  61s               default-scheduler  Successfully assigned kube-system/metrics-server-8ff8f88c6-mxxm5 to k8s2
  Normal   Pulling    60s               kubelet            Pulling image "k8s.gcr.io/metrics-server/metrics-server:v0.6.2"
  Normal   Pulled     50s               kubelet            Successfully pulled image "k8s.gcr.io/metrics-server/metrics-server:v0.6.2" in 10.756653001s (10.756655835s including waiting)
  Normal   Created    49s               kubelet            Created container metrics-server
  Normal   Started    49s               kubelet            Started container metrics-server
  Warning  Unhealthy  0s (x3 over 20s)  kubelet            Readiness probe failed: HTTP probe failed with statuscode: 500

The readiness probe is failing with HTTP status 500, and the metrics-server logs contain a large number of x509 errors like these:

I0308 13:32:32.182930       1 server.go:187] "Failed probe" probe="metric-storage-ready" err="no metrics to serve"
I0308 13:32:42.182434       1 server.go:187] "Failed probe" probe="metric-storage-ready" err="no metrics to serve"
E0308 13:32:43.469954       1 scraper.go:140] "Failed to scrape node" err="Get \"https://172.16.16.5:10250/metrics/resource\": x509: cannot validate certificate for 172.16.16.5 because it doesn't contain any IP SANs" node="k8s2"
E0308 13:32:43.476014       1 scraper.go:140] "Failed to scrape node" err="Get \"https://172.16.16.6:10250/metrics/resource\": x509: cannot validate certificate for 172.16.16.6 because it doesn't contain any IP SANs" node="k8s1"
E0308 13:32:43.476154       1 scraper.go:140] "Failed to scrape node" err="Get \"https://172.16.16.7:10250/metrics/resource\": x509: cannot validate certificate for 172.16.16.7 because it doesn't contain any IP SANs" node="k8s3"

This is caused by the cluster's certificates: the kubelet serving certificates do not contain IP SANs, so metrics-server refuses to scrape the nodes. Fix it as follows:

  1. Download the deployment manifest

    wget https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml -O metric-server.yaml
  2. Edit the manifest and locate spec.template.spec.containers.args

    spec:
      selector:
        matchLabels:
          k8s-app: metrics-server
      strategy:
        rollingUpdate:
          maxUnavailable: 0
      template:
        metadata:
          labels:
            k8s-app: metrics-server
        spec:
          containers:
          - args:
            - --cert-dir=/tmp
            - --secure-port=4443
            - --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
            - --kubelet-use-node-status-port
            - --metric-resolution=15s
            - --kubelet-insecure-tls      ## add this line to skip certificate verification
  3. Apply the modified manifest

    kubectl apply -f metric-server.yaml
  4. Wait a moment, then check the service status

    kubectl get pods -n kube-system | grep metrics-server
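As an alternative to editing the manifest, the same flag can be appended to the running deployment with `kubectl patch` (the deployment name `metrics-server` matches the default manifest). Once the pod is ready, `kubectl top` should start returning data. A sketch, assuming a working cluster context:

```shell
# Append --kubelet-insecure-tls to the first container's args via a JSON patch
kubectl patch deployment metrics-server -n kube-system --type='json' \
  -p='[{"op":"add","path":"/spec/template/spec/containers/0/args/-","value":"--kubelet-insecure-tls"}]'

# Verify that metrics are being served
kubectl top nodes
kubectl top pods -n kube-system
```

Note that patching in place is convenient for experiments, but the change is lost if you later re-apply the upstream manifest; keeping your own edited copy is more reproducible.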

ingress-nginx

ingress-nginx is an essential component of a Kubernetes cluster: it provides load balancing and reverse proxying for services, supporting external access and dynamic load distribution. Project page: ingress-nginx

kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/main/deploy/static/provider/baremetal/deploy.yaml

Wait a moment, then check the status:

kubectl get pods -n ingress-nginx
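Once the controller is running, a minimal Ingress resource can route external traffic to a backend Service. A sketch — the Service name `my-app` and host `example.local` below are placeholders, not anything defined in this cluster:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app
spec:
  ingressClassName: nginx       ## must match the class installed by the controller
  rules:
  - host: example.local
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: my-app        ## placeholder backend Service, port 80 assumed
            port:
              number: 80
```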

MetalLB

MetalLB is a load balancer implemented purely in software, using standard routing protocols to provide load-balancing services for a Kubernetes cluster.

If kube-proxy runs in IPVS mode and the Kubernetes version is later than v1.14.2, the strict ARP mode mentioned earlier must be enabled. If you forgot to set it during cluster initialization, you can change it manually:

kubectl edit configmap -n kube-system kube-proxy

Find ipvs.strictARP and set it to true:

ipvs:
  excludeCIDRs: null
  minSyncPeriod: 0s
  scheduler: ""
  strictARP: true
  syncPeriod: 0s
  tcpFinTimeout: 0s
  tcpTimeout: 0s
  udpTimeout: 0s
kind: KubeProxyConfiguration
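The same change can also be made non-interactively; a sketch that pipes the ConfigMap through sed, assuming strictARP is currently false:

```shell
# Flip strictARP to true in kube-proxy's ConfigMap and re-apply it
kubectl get configmap kube-proxy -n kube-system -o yaml | \
  sed -e "s/strictARP: false/strictARP: true/" | \
  kubectl apply -f - -n kube-system
```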

Then deploy MetalLB, choosing an installation method that fits your needs:
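One common option is the manifest-based install; the version pinned below is only an example, so check the MetalLB releases page for the current one:

```shell
# Install MetalLB from the official manifest (pin a released version in practice)
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/v0.13.12/config/manifests/metallb-native.yaml
```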

Then check the service status; you will find the ingress service stuck in a pending state:

$ kubectl get service -n ingress-nginx
ingress-nginx      ingress-nginx-controller             LoadBalancer   10.111.31.111    <pending>     80:32133/TCP,443:30989/TCP   2d16h
ingress-nginx      ingress-nginx-controller-admission   ClusterIP      10.102.52.164    <none>        443/TCP                      2d16h

This is because no usable IP address pool has been configured for the services yet. Create the configuration file:

tee metallb-config.yaml <<EOF
apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: first-pool              ## pool name, pick whatever fits
  namespace: metallb-system
spec:
  addresses:
  - 192.168.10.0/24             ## address pool, assign according to your network
  - 192.168.9.1-192.168.9.5     ## supports /x CIDR or A-B range formats; IPv6 works too
  - fc00:f853:0ccd:e799::/124
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: example
  namespace: metallb-system
EOF

Then apply the configuration:

kubectl apply -f metallb-config.yaml

After a short wait, the ingress service is assigned an address and its status is no longer pending.
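To confirm the pool works, a throwaway LoadBalancer Service can be created; the name `nginx-test` and its selector are placeholders, and the selector may match no pods, but MetalLB should still assign an EXTERNAL-IP from the pool:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nginx-test              ## placeholder name for a quick test
spec:
  type: LoadBalancer
  selector:
    app: nginx-test             ## placeholder selector
  ports:
  - port: 80
    targetPort: 80
```

After applying it, `kubectl get service nginx-test` should show an address from the configured pool in the EXTERNAL-IP column; delete the Service when done.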


Appendix

Related Links

References
