Introduction
A StatefulSet is a controller that gives each of its Pods a unique, stable identity and guarantees the order of deployment and scaling.
- Pod consistency: covers ordering (start/stop order) and network identity. This consistency is a property of the Pod itself and does not depend on which node the Pod is scheduled to.
- Stable ordering: for a StatefulSet with N replicas, each Pod is assigned a unique ordinal in the range [0, N).
- Stable network identity: a Pod's hostname follows the pattern $(statefulset name)-$(ordinal), and the headless Service gives each Pod a matching DNS record (see the sketch after this list).
- Stable storage: a PVC is created for each Pod from the volumeClaimTemplates, and a PV is provisioned for it. Deleting the StatefulSet or scaling it in does not delete the associated volumes.
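A minimal sketch of what this identity looks like from inside the cluster, using the web StatefulSet and headless nginx Service deployed below; per-Pod DNS records follow the pattern $(pod name).$(service name).$(namespace).svc.cluster.local:

# The hostname equals the Pod name: $(statefulset name)-$(ordinal)
$ kubectl exec web-0 -- hostname
web-0

# The headless Service gives each Pod a stable DNS record (the Pod IP it
# resolves to may change across restarts; the name does not):
$ kubectl run -it --rm dns-test --image=busybox:1.28 -- nslookup web-0.nginx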
Deploy the StatefulSet service
volumeClaimTemplates: a PVC template. The system creates one PVC per replica configured on the StatefulSet; the PVCs differ only in name, and all other settings are identical. The names follow the pattern $(volumeClaimTemplate name)-$(statefulset name)-$(ordinal), e.g. disk-ssd-web-0.
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: nginx
  serviceName: "nginx"
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: disk-ssd
          mountPath: /data
  volumeClaimTemplates:
  - metadata:
      name: disk-ssd
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "alicloud-disk-ssd"
      resources:
        requests:
          storage: 20Gi
Verify scaling
Create the StatefulSet:
$ kubectl create -f statefulset.yaml

$ kubectl get pod
NAME      READY     STATUS    RESTARTS   AGE
web-0     1/1       Running   0          21m
web-1     1/1       Running   0          20m

$ kubectl get pvc
NAME             STATUS    VOLUME                   CAPACITY   ACCESS MODES   STORAGECLASS        AGE
disk-ssd-web-0   Bound     d-2ze9k2rrtcy92e97d3ie   20Gi       RWO            alicloud-disk-ssd   21m
disk-ssd-web-1   Bound     d-2ze5dwq6gyjnvdcrmtwg   20Gi       RWO            alicloud-disk-ssd   21m
Scale the service out to 3 Pods; a new cloud disk volume is created for the new Pod:
$ kubectl scale sts web --replicas=3
statefulset.apps "web" scaled

$ kubectl get pod
NAME      READY     STATUS    RESTARTS   AGE
web-0     1/1       Running   0          24m
web-1     1/1       Running   0          23m
web-2     1/1       Running   0          2m

$ kubectl get pvc
NAME             STATUS    VOLUME                   CAPACITY   ACCESS MODES   STORAGECLASS        AGE
disk-ssd-web-0   Bound     d-2ze9k2rrtcy92e97d3ie   20Gi       RWO            alicloud-disk-ssd   24m
disk-ssd-web-1   Bound     d-2ze5dwq6gyjnvdcrmtwg   20Gi       RWO            alicloud-disk-ssd   24m
disk-ssd-web-2   Bound     d-2zea5iul9f4vgt82hxjj   20Gi       RWO            alicloud-disk-ssd   2m
Scale the service in to 2 Pods; the highest-ordinal Pod (web-2) is removed, but its PVC/PV is not deleted along with it:
$ kubectl scale sts web --replicas=2
statefulset.apps "web" scaled

$ kubectl get pod
NAME      READY     STATUS    RESTARTS   AGE
web-0     1/1       Running   0          25m
web-1     1/1       Running   0          25m

$ kubectl get pvc
NAME             STATUS    VOLUME                   CAPACITY   ACCESS MODES   STORAGECLASS        AGE
disk-ssd-web-0   Bound     d-2ze9k2rrtcy92e97d3ie   20Gi       RWO            alicloud-disk-ssd   25m
disk-ssd-web-1   Bound     d-2ze5dwq6gyjnvdcrmtwg   20Gi       RWO            alicloud-disk-ssd   25m
disk-ssd-web-2   Bound     d-2zea5iul9f4vgt82hxjj   20Gi       RWO            alicloud-disk-ssd   3m
Scale out to 3 Pods again; the new Pod reuses the original PVC/PV:
$ kubectl scale sts web --replicas=3
statefulset.apps "web" scaled

$ kubectl get pod
NAME      READY     STATUS    RESTARTS   AGE
web-0     1/1       Running   0          27m
web-1     1/1       Running   0          27m
web-2     1/1       Running   0          2m

$ kubectl get pvc
NAME             STATUS    VOLUME                   CAPACITY   ACCESS MODES   STORAGECLASS        AGE
disk-ssd-web-0   Bound     d-2ze9k2rrtcy92e97d3ie   20Gi       RWO            alicloud-disk-ssd   27m
disk-ssd-web-1   Bound     d-2ze5dwq6gyjnvdcrmtwg   20Gi       RWO            alicloud-disk-ssd   27m
disk-ssd-web-2   Bound     d-2zea5iul9f4vgt82hxjj   20Gi       RWO            alicloud-disk-ssd   5m
Deleting the StatefulSet itself does not delete the PVCs or PVs either; they have to be removed explicitly, as sketched below.
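A minimal sketch of a full cleanup, assuming the data is no longer needed (whether deleting a PVC also deletes the backing cloud disk depends on the StorageClass's reclaim policy):

$ kubectl delete sts web

# The claims survive the StatefulSet and must be removed explicitly:
$ kubectl delete pvc disk-ssd-web-0 disk-ssd-web-1 disk-ssd-web-2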
Verify stability
Before deleting it, Pod web-1 references the PVC disk-ssd-web-1:
$ kubectl describe pod web-1 | grep ClaimName
    ClaimName:  disk-ssd-web-1

$ kubectl delete pod web-1
pod "web-1" deleted
After the deletion, the recreated Pod has the same name as the deleted one and uses the same PVC:
$ kubectl get pod
NAME      READY     STATUS    RESTARTS   AGE
web-0     1/1       Running   0          29m
web-1     1/1       Running   0          6s
web-2     1/1       Running   0          4m

$ kubectl describe pod web-1 | grep ClaimName
    ClaimName:  disk-ssd-web-1
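To watch the replacement happen, one can leave a watch running in a second terminal while the Pod is deleted (a sketch; the streamed output is omitted):

# Watch Pods carrying the manifest's app=nginx label: web-1 goes Terminating,
# then a new web-1 appears and re-attaches the existing PVC disk-ssd-web-1.
$ kubectl get pod -l app=nginx -w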
Verify high availability
Create a test file on the cloud disk:
$ kubectl exec web-1 ls /data
lost+found

$ kubectl exec web-1 touch /data/statefulset
$ kubectl exec web-1 ls /data
statefulset
Delete the Pod to verify that the data persists:
$ kubectl delete pod web-1
pod "web-1" deleted

$ kubectl exec web-1 ls /data
statefulset
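Note that newer kubectl versions require the container command to be separated from the Pod name with --; the equivalent of the transcripts above would be:

$ kubectl exec web-1 -- ls /data
$ kubectl exec web-1 -- touch /data/statefulset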
Common issues
- All of the services start, but after running for a while the kafka Pods all end up in CrashLoopBackOff.
Resolution: the kafka logs showed that connections to zookeeper were timing out. The zookeeper logs showed:
[2020-05-18 02:39:07,095] WARN Error accepting new connection: Too many connections from /172.18.80.0 - max is 2
That pinpoints the problem: the per-client connection limit is far too low. Raise it in the zookeeper configuration with maxClientCnxns=200, as sketched below.
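A minimal sketch of applying the fix in-cluster, assuming zookeeper reads zoo.cfg from a ConfigMap (the ConfigMap and StatefulSet names below are hypothetical):

# Edit the zookeeper config and set: maxClientCnxns=200
$ kubectl edit configmap zookeeper-config

# Restart the zookeeper Pods so the new limit takes effect:
$ kubectl rollout restart sts zookeeper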