K8S Stateful Services - Using StatefulSet

Introduction

A StatefulSet is a controller that gives each of its Pods a unique, stable identity and guarantees the order in which they are deployed and scaled.

  • Pod consistency: covers ordering (start/stop order) and network identity. This identity belongs to the Pod itself and does not depend on which node the Pod is scheduled to.

  • Stable ordering: for a StatefulSet with N replicas, each Pod is assigned a unique ordinal in the range [0, N).

  • Stable network identity: a Pod's hostname follows the pattern $(statefulset name)-$(ordinal); see the example after this list.

  • Stable storage: a PV is created for each Pod through volumeClaimTemplates. Deleting Pods or scaling down replicas does not delete the associated volumes.
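
For example, with the StatefulSet named web and the headless Service named nginx from the manifest below, each replica gets a stable DNS record of the form $(pod name).$(service name).$(namespace).svc.$(cluster domain). Assuming the default namespace and the common cluster.local domain, that yields web-0.nginx.default.svc.cluster.local and web-1.nginx.default.svc.cluster.local. A quick, throwaway way to check a record (the busybox image is an assumption):

$ kubectl run -it --rm dns-test --image=busybox --restart=Never -- nslookup web-0.nginx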

Deploying a StatefulSet Service

volumeClaimTemplates: a template for a class of PVCs. The system creates as many PVCs as the StatefulSet's replicas field specifies; apart from their names, these PVCs all share the same configuration.

apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
  - port: 80
    name: web
  clusterIP: None
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: nginx
  serviceName: "nginx"
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
          name: web
        volumeMounts:
        - name: disk-ssd
          mountPath: /data
  volumeClaimTemplates:
  - metadata:
      name: disk-ssd
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "alicloud-disk-ssd"
      resources:
        requests:
          storage: 20Gi
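
Two details in this manifest are easy to miss: clusterIP: None makes nginx a headless Service, which is what backs the stable per-Pod DNS records, and serviceName in the StatefulSet spec must reference exactly that Service. Once the manifest has been applied (next section), the controller creates one PVC per replica, named $(volumeClaimTemplate name)-$(pod name), i.e. disk-ssd-web-0 and disk-ssd-web-1. Two quick sanity checks (commands only, output omitted):

$ kubectl get svc nginx    # the CLUSTER-IP column should show None
$ kubectl get sts web -o jsonpath='{.spec.serviceName}{"\n"}'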

Verifying Scalability

Create the StatefulSet service:

$ kubectl create -f statefulset.yaml

$ kubectl get pod
NAME READY STATUS RESTARTS AGE
web-0 1/1 Running 0 21m
web-1 1/1 Running 0 20m

$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
disk-ssd-web-0 Bound d-2ze9k2rrtcy92e97d3ie 20Gi RWO alicloud-disk-ssd 21m
disk-ssd-web-1 Bound d-2ze5dwq6gyjnvdcrmtwg 20Gi RWO alicloud-disk-ssd 21m
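
With the default OrderedReady pod management policy, StatefulSet Pods are created one at a time in ordinal order: web-1 only starts after web-0 is Running and Ready. To watch the ordering while the set comes up (a sketch; -w streams updates until interrupted):

$ kubectl get pods -l app=nginx -w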

Scale the service out to 3 Pods; a new cloud disk volume is created:

$ kubectl scale sts web --replicas=3
statefulset.apps "web" scaled

$ kubectl get pod
NAME READY STATUS RESTARTS AGE
web-0 1/1 Running 0 24m
web-1 1/1 Running 0 23m
web-2 1/1 Running 0 2m

$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
disk-ssd-web-0 Bound d-2ze9k2rrtcy92e97d3ie 20Gi RWO alicloud-disk-ssd 24m
disk-ssd-web-1 Bound d-2ze5dwq6gyjnvdcrmtwg 20Gi RWO alicloud-disk-ssd 24m
disk-ssd-web-2 Bound d-2zea5iul9f4vgt82hxjj 20Gi RWO alicloud-disk-ssd 2m
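
The new replica web-2 gets its own claim, disk-ssd-web-2, bound to a freshly provisioned cloud disk. To inspect the PV behind it (commands only; the volume name is taken from the output above):

$ kubectl describe pvc disk-ssd-web-2 | grep Volume
$ kubectl get pv d-2zea5iul9f4vgt82hxjj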

Scale the service in to 2 Pods; the PVC/PV are not deleted along with the Pod:

$ kubectl scale sts web --replicas=2
statefulset.apps "web" scaled

$ kubectl get pod
NAME READY STATUS RESTARTS AGE
web-0 1/1 Running 0 25m
web-1 1/1 Running 0 25m

$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
disk-ssd-web-0 Bound d-2ze9k2rrtcy92e97d3ie 20Gi RWO alicloud-disk-ssd 25m
disk-ssd-web-1 Bound d-2ze5dwq6gyjnvdcrmtwg 20Gi RWO alicloud-disk-ssd 25m
disk-ssd-web-2 Bound d-2zea5iul9f4vgt82hxjj 20Gi RWO alicloud-disk-ssd 3m

Scale back out to 3 Pods; the new Pod reuses the original PVC/PV:

$ kubectl scale sts web --replicas=3
statefulset.apps "web" scaled

$ kubectl get pod
NAME READY STATUS RESTARTS AGE
web-0 1/1 Running 0 27m
web-1 1/1 Running 0 27m
web-2 1/1 Running 0 2m

$ kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
disk-ssd-web-0 Bound d-2ze9k2rrtcy92e97d3ie 20Gi RWO alicloud-disk-ssd 27m
disk-ssd-web-1 Bound d-2ze5dwq6gyjnvdcrmtwg 20Gi RWO alicloud-disk-ssd 27m
disk-ssd-web-2 Bound d-2zea5iul9f4vgt82hxjj 20Gi RWO alicloud-disk-ssd 5m
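
To double-check that the re-created web-2 is bound to the original claim rather than a new one (a quick sketch):

$ kubectl describe pod web-2 | grep ClaimName
# expected: ClaimName: disk-ssd-web-2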

Deleting the StatefulSet does not delete the PVCs or PVs along with it; they remain and must be cleaned up separately, as sketched below.
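
If everything should be removed, a minimal teardown sketch (the last command destroys data, and whether the underlying cloud disks are released depends on the PVs' reclaim policy):

$ kubectl delete sts web
$ kubectl delete svc nginx
$ kubectl delete pvc disk-ssd-web-0 disk-ssd-web-1 disk-ssd-web-2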

Verifying Stability

Before deleting a Pod, note that it references the PVC disk-ssd-web-1:

$ kubectl describe pod web-1 | grep ClaimName
ClaimName: disk-ssd-web-1

$ kubectl delete pod web-1
pod "web-1" deleted

After the Pod is deleted, the re-created Pod has the same name as the deleted one and uses the same PVC:

$ kubectl get pod
NAME READY STATUS RESTARTS AGE
web-0 1/1 Running 0 29m
web-1 1/1 Running 0 6s
web-2 1/1 Running 0 4m

$ kubectl describe pod web-1 | grep ClaimName
ClaimName: disk-ssd-web-1
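
The stable identity also covers the hostname inside the container: the re-created Pod still reports web-1, so peers that address it by its DNS name keep working. A quick check (using -- to separate the command from kubectl's own flags):

$ kubectl exec web-1 -- hostname
# expected: web-1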

Verifying High Availability

Create a temporary file on the cloud disk:

$ kubectl exec web-1 ls /data
lost+found

$ kubectl exec web-1 touch /data/statefulset
$ kubectl exec web-1 ls /data
statefulset

Delete the Pod and verify that the data persists:

$ kubectl delete pod web-1
pod "web-1" deleted

$ kubectl exec web-1 ls /data
statefulset

Common Issues

  1. All services start successfully, but after running for a while every kafka Pod ends up in CrashLoopBackOff.

Resolution: the kafka logs show that connections to zookeeper are timing out. Checking the zookeeper logs shows:

[2020-05-18 02:39:07,095]WARN Error accepting new connection: Too many connections from /172.18.80.0 - max is 2

That pinpoints the problem: raise the zookeeper setting maxClientCnxns (to 200 here), as sketched below.
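
Where exactly maxClientCnxns is set depends on how zookeeper is deployed. In a plain zoo.cfg it is a single line; with the official zookeeper image it can usually be supplied through the ZOO_MAX_CLIENT_CNXNS environment variable in the StatefulSet's pod template (the container name and image tag below are assumptions). The zookeeper Pods need to be restarted for the change to take effect.

# zoo.cfg
maxClientCnxns=200

# or, with the official image, in the zookeeper StatefulSet's container spec:
      containers:
      - name: zookeeper
        image: zookeeper:3.6
        env:
        - name: ZOO_MAX_CLIENT_CNXNS
          value: "200"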