I have a two-node cluster in AWS running k3s v1.21.5+k3s1. I did a default installation, so Traefik is my ingress controller and should be bound to ports 80 and 443 on every node. However, checking netstat shows nothing listening on those ports on either node, and any attempt to connect to a running pod from a browser fails.
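For reference, the check I ran on each node was along these lines (a sketch; ss(8) is used here in place of netstat, and the awk filter on the local-address column is my own):

```shell
# Run on each node: list listening TCP sockets and keep the header line
# plus any socket whose local address ends in :80 or :443.
ss -tln | awk 'NR == 1 || $4 ~ /:(80|443)$/'
```

On a healthy k3s node this should show the klipper-lb bindings on both ports; here it printed only the header.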
Interestingly, when I rebooted the nodes to see whether that would fix the problem (it didn't), I noticed that in addition to the normal svclb-traefik DaemonSet, a second DaemonSet named svclb-traefik-###### had been created:
$ kubectl -n kube-system get ds
NAME                     DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
svclb-traefik            2         2         2       2            2           <none>          21d
svclb-traefik-23fcfc42   2         2         0       2            0           <none>          8m45s
Inspecting the second DaemonSet, I found that the pods it tries to launch fail to schedule because the nodes assigned to them have no free ports. That is expected, since the "real" svclb-traefik has already bound them (at least as far as Kubernetes knows). Deleting the second svclb-traefik DaemonSet does not fix the problem; it comes back on the next reboot.
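The inspection boiled down to something like this (a sketch assuming kubectl access; the label value app=svclb-traefik-23fcfc42 is taken from the bad pod shown further down):

```shell
# Show the duplicate DaemonSet's stuck pods, then pull the scheduling
# failures out of the namespace events.
kubectl -n kube-system get pods -l app=svclb-traefik-23fcfc42 -o wide
kubectl -n kube-system get events | grep FailedScheduling
```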
I have examined the traefik DaemonSet, the traefik LoadBalancer Service, and all the traefik pods for anything that might be wrong (kubectl describe output below), but everything looks normal.
The DaemonSet:
$ kubectl -n kube-system describe ds svclb-traefik
Name: svclb-traefik
Selector: app=svclb-traefik
Node-Selector: <none>
Labels: objectset.rio.cattle.io/hash=f31475152fbf70655d3c016d368e90118938f6ea
svccontroller.k3s.cattle.io/nodeselector=false
Annotations: deprecated.daemonset.template.generation: 1
objectset.rio.cattle.io/applied:
H4sIAAAAAAAA/8xUTW/jNhD9K8WcKUWy7EQW0MMiySFoNzFsby+BEVDkKGZNkQI5UmMY+u8FZWftbL6KbQ89ejjz/Oa90dvBRhkJBVxxrK1ZIAED3qg/0HllDRTAm8afdSkwqJG45M...
objectset.rio.cattle.io/id: svccontroller
objectset.rio.cattle.io/owner-gvk: /v1, Kind=Service
objectset.rio.cattle.io/owner-name: traefik
objectset.rio.cattle.io/owner-namespace: kube-system
Desired Number of Nodes Scheduled: 2
Current Number of Nodes Scheduled: 2
Number of Nodes Scheduled with Up-to-date Pods: 2
Number of Nodes Scheduled with Available Pods: 2
Number of Nodes Misscheduled: 0
Pods Status: 2 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
Labels: app=svclb-traefik
svccontroller.k3s.cattle.io/svcname=traefik
Containers:
lb-port-80:
Image: rancher/klipper-lb:v0.2.0
Port: 80/TCP
Host Port: 80/TCP
Environment:
SRC_PORT: 80
DEST_PROTO: TCP
DEST_PORT: 80
DEST_IP: 10.43.82.221
Mounts: <none>
lb-port-443:
Image: rancher/klipper-lb:v0.2.0
Port: 443/TCP
Host Port: 443/TCP
Environment:
SRC_PORT: 443
DEST_PROTO: TCP
DEST_PORT: 443
DEST_IP: 10.43.82.221
Mounts: <none>
Volumes: <none>
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulCreate 36m daemonset-controller Created pod: svclb-traefik-pjffb
The Service:
$ kubectl -n kube-system describe svc traefik
Name: traefik
Namespace: kube-system
Labels: app.kubernetes.io/instance=traefik
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=traefik
helm.sh/chart=traefik-9.18.2
Annotations: meta.helm.sh/release-name: traefik
meta.helm.sh/release-namespace: kube-system
Selector: app.kubernetes.io/instance=traefik,app.kubernetes.io/name=traefik
Type: LoadBalancer
IP Family Policy: SingleStack
IP Families: IPv4
IP: 10.43.82.221
IPs: 10.43.82.221
LoadBalancer Ingress: <node 1 IP>, <node 2 IP>
Port: web 80/TCP
TargetPort: web/TCP
NodePort: web 30234/TCP
Endpoints: 10.42.1.116:8000
Port: websecure 443/TCP
TargetPort: websecure/TCP
NodePort: websecure 32623/TCP
Endpoints: 10.42.1.116:8443
Session Affinity: None
External Traffic Policy: Cluster
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal EnsuringLoadBalancer 16m service-controller Ensuring load balancer
Normal AppliedDaemonSet 16m service-controller Applied LoadBalancer DaemonSet kube-system/svclb-traefik-23fcfc42
One of the "good" pods (from svclb-traefik):
Name: svclb-traefik-pjffb
Namespace: kube-system
Priority: 0
Node: <node 1>
Start Time: Fri, 03 Feb 2023 08:25:09 -0500
Labels: app=svclb-traefik
controller-revision-hash=56b6bf6489
pod-template-generation=1
svccontroller.k3s.cattle.io/svcname=traefik
Annotations: <none>
Status: Running
IP: 10.42.1.96
IPs:
IP: 10.42.1.96
Controlled By: DaemonSet/svclb-traefik
Containers:
lb-port-80:
Container ID: containerd://6ae25fd4dea39238f3d222dce1a25e3b01a7fb159cecd3e2684257e91dbfd4d7
Image: rancher/klipper-lb:v0.2.0
Image ID: docker.io/rancher/klipper-lb@sha256:5ea5f7904c404085ff24541a0e7a2267637af4bcf30fae9b747d871bfcd8a6f7
Port: 80/TCP
Host Port: 80/TCP
State: Running
Started: Fri, 03 Feb 2023 08:46:33 -0500
Last State: Terminated
Reason: Unknown
Exit Code: 255
Started: Fri, 03 Feb 2023 08:25:10 -0500
Finished: Fri, 03 Feb 2023 08:46:05 -0500
Ready: True
Restart Count: 1
Environment:
SRC_PORT: 80
DEST_PROTO: TCP
DEST_PORT: 80
DEST_IP: 10.43.82.221
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-72smb (ro)
lb-port-443:
Container ID: containerd://b9ba3ec21cbd249f1e03d0f3230af9774ff7208ca56a5623a2b45b595a76889e
Image: rancher/klipper-lb:v0.2.0
Image ID: docker.io/rancher/klipper-lb@sha256:5ea5f7904c404085ff24541a0e7a2267637af4bcf30fae9b747d871bfcd8a6f7
Port: 443/TCP
Host Port: 443/TCP
State: Running
Started: Fri, 03 Feb 2023 08:46:33 -0500
Last State: Terminated
Reason: Unknown
Exit Code: 255
Started: Fri, 03 Feb 2023 08:25:10 -0500
Finished: Fri, 03 Feb 2023 08:46:06 -0500
Ready: True
Restart Count: 1
Environment:
SRC_PORT: 443
DEST_PROTO: TCP
DEST_PORT: 443
DEST_IP: 10.43.82.221
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-72smb (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
kube-api-access-72smb:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: CriticalAddonsOnly op=Exists
node-role.kubernetes.io/control-plane:NoSchedule op=Exists
node-role.kubernetes.io/master:NoSchedule op=Exists
node.kubernetes.io/disk-pressure:NoSchedule op=Exists
node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists
node.kubernetes.io/pid-pressure:NoSchedule op=Exists
node.kubernetes.io/unreachable:NoExecute op=Exists
node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Created 17m kubelet Created container lb-port-80
Normal Started 17m kubelet Started container lb-port-80
Normal Created 17m kubelet Created container lb-port-443
Normal Started 17m kubelet Started container lb-port-443
One of the "bad" pods (from the svclb-traefik-XXXXX DaemonSet):
Name: svclb-traefik-23fcfc42-t6jx7
Namespace: kube-system
Priority: 0
Node: <none>
Labels: app=svclb-traefik-23fcfc42
controller-revision-hash=74f5f855c9
pod-template-generation=1
svccontroller.k3s.cattle.io/svcname=traefik
svccontroller.k3s.cattle.io/svcnamespace=kube-system
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Controlled By: DaemonSet/svclb-traefik-23fcfc42
Containers:
lb-tcp-80:
Image: rancher/klipper-lb:v0.4.0
Port: 80/TCP
Host Port: 80/TCP
Environment:
SRC_PORT: 80
SRC_RANGES: 0.0.0.0/0
DEST_PROTO: TCP
DEST_PORT: 80
DEST_IPS: 10.43.82.221
Mounts: <none>
lb-tcp-443:
Image: rancher/klipper-lb:v0.4.0
Port: 443/TCP
Host Port: 443/TCP
Environment:
SRC_PORT: 443
SRC_RANGES: 0.0.0.0/0
DEST_PROTO: TCP
DEST_PORT: 443
DEST_IPS: 10.43.82.221
Mounts: <none>
Conditions:
Type Status
PodScheduled False
Volumes: <none>
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: CriticalAddonsOnly op=Exists
node-role.kubernetes.io/control-plane:NoSchedule op=Exists
node-role.kubernetes.io/master:NoSchedule op=Exists
node.kubernetes.io/disk-pressure:NoSchedule op=Exists
node.kubernetes.io/memory-pressure:NoSchedule op=Exists
node.kubernetes.io/not-ready:NoExecute op=Exists
node.kubernetes.io/pid-pressure:NoSchedule op=Exists
node.kubernetes.io/unreachable:NoExecute op=Exists
node.kubernetes.io/unschedulable:NoSchedule op=Exists
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 19m default-scheduler 0/2 nodes are available: 1 node(s) didn't have free ports for the requested pod ports. preemption: 0/2 nodes are available: 2 No preemption victims found for incoming pod.
Warning FailedScheduling 19m default-scheduler 0/2 nodes are available: 1 node(s) didn't have free ports for the requested pod ports. preemption: 0/2 nodes are available: 2 No preemption victims found for incoming pod.
1 Answer
This answer is deeply unsatisfying, but in the end I deleted both the "real" svclb-traefik DaemonSet and the svclb-traefik-XXXXX DaemonSet and rebooted the nodes. On startup both DaemonSets were recreated (the -XXXXX one still fails to start its pods), but I can at least reach my pods from a browser again. I will update this answer if the problem comes back.
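For anyone hitting the same thing, the workaround can be sketched as follows (hypothetical: it assumes kubectl access and that the duplicate's suffix is a hex hash, as in svclb-traefik-23fcfc42 above; your suffix will differ):

```shell
# Delete both klipper-lb DaemonSets; the pattern matches "svclb-traefik"
# with or without a trailing hex hash. k3s recreates them at startup.
kubectl -n kube-system get ds -o name \
  | grep -E '/svclb-traefik(-[0-9a-f]+)?$' \
  | xargs -r kubectl -n kube-system delete
# Afterwards, reboot each node:
#   sudo reboot
```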