Kubernetes (K8s) namespace toleration whitelist conflict

Posted by gr8qqesn on 2023-05-28 in Kubernetes

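1. I have been trying to use Azure Spot instances on Azure Kubernetes Service (AKS) version 1.19.11. To enable scheduling of pods onto those nodes, I am trying to use the PodTolerationRestriction admission controller.
2. I can confirm that the PodTolerationRestriction controller is enabled, because I had no issues deploying a ReplicaSet to the default namespace. The failing workload is in another namespace, but we did not specifically add any tolerations when creating it.
3. I gathered from elsewhere that, along with whitelisting the toleration for the specific taint (spot in my case), certain default tolerations need to be whitelisted as well. Hence I added certain annotations to the namespace.
4. I do not have any additional tolerations pre-defined for this StatefulSet.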

  • Node taints (the first two are handled via Helm chart values):
  • RabbitMQ=true:NoSchedule
  • Allow=true:NoExecute
  • kubernetes.azure.com/scalesetpriority=spot:NoSchedule

I would like to know which additional tolerations need to be whitelisted.
The annotations I added:

scheduler.alpha.kubernetes.io/defaultTolerations: '[{"operator": "Equal", "value": "spot", "key": "kubernetes.azure.com/scalesetpriority"}]'
scheduler.alpha.kubernetes.io/tolerationsWhitelist: '[{"operator": "Equal", "value": "spot", "key": "kubernetes.azure.com/scalesetpriority"}, {"operator": "Exists", "effect": "NoSchedule", "key": "node.kubernetes.io/memory-pressure"}, {"operator": "Exists", "tolerationSeconds": 300, "effect": "NoExecute", "key": "node.kubernetes.io/unreachable"}, {"operator": "Exists", "tolerationSeconds": 300, "effect": "NoExecute", "key": "node.kubernetes.io/not-ready"}]'

StatefulSet description:

Name:               <release name>
Namespace:          <namespace>
CreationTimestamp:  Tue, 18 Jan 2022 19:37:38 +0530
Selector:           app.kubernetes.io/instance=<name>,app.kubernetes.io/name=rabbitmq
Labels:             app.kubernetes.io/instance=rabbit
                    app.kubernetes.io/managed-by=Helm
                    app.kubernetes.io/name=rabbitmq
                    helm.sh/chart=rabbitmq-8.6.1
Annotations:        meta.helm.sh/release-name: <release name>
                    meta.helm.sh/release-namespace: <namespace>
Replicas:           3 desired | 0 total
Update Strategy:    RollingUpdate
Pods Status:        0 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
  Labels:           app.kubernetes.io/instance=rabbit
                    app.kubernetes.io/managed-by=Helm
                    app.kubernetes.io/name=rabbitmq
                    helm.sh/chart=rabbitmq-8.6.1
  Annotations:      checksum/config: 1a138ded5a3ade049cbee9f4f8e2d0fd7253c126d49b790495a492601fd9f280
                    checksum/secret: 05af38634eb4b46c2f8db5770013e1368e78b0d5af057aed5fa4fe7eec4c92de
                    prometheus.io/port: 9419
                    prometheus.io/scrape: true
  Service Account:  sa-rabbitmq
  Containers:
   rabbitmq:
    Image:       docker.io/bitnami/rabbitmq:3.8.9-debian-10-r64
    Ports:       5672/TCP, 25672/TCP, 15672/TCP, 4369/TCP, 9419/TCP
    Host Ports:  0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP
    Liveness:    exec [/bin/bash -ec rabbitmq-diagnostics -q ping] delay=120s timeout=200s period=30s #success=1 #failure=6
    Readiness:   exec [/bin/bash -ec rabbitmq-diagnostics -q check_running && rabbitmq-diagnostics -q check_local_alarms] delay=10s timeout=200s period=30s #success=1 #failure=3
    Environment:
      <multiple environment variables>
    Mounts:
      /bitnami/rabbitmq/conf from configuration (rw)
      /bitnami/rabbitmq/mnesia from data (rw)
  Volumes:
   configuration:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      rabbit-rabbitmq-config
    Optional:  false
   data:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
Volume Claims:  <none>
Events:
  Type     Reason        Age                 From                    Message
  ----     ------        ----                ----                    -------
  Warning  FailedCreate  31s (x14 over 72s)  statefulset-controller  create Pod <pod-name> in StatefulSet <release name> failed error: pod tolerations (possibly merged with namespace default tolerations) conflict with its namespace whitelist
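
The error says that at least one toleration on the pod, after the namespace defaults are merged in, is not covered by any whitelist entry. The Python sketch below approximates that check so the offending tolerations can be listed. It is a simplified reading of the matching rules, not the admission controller's actual code. The whitelist is copied from the annotation above. The pod tolerations are an assumption: the four whitelisted entries plus two hypothetical tolerations for the RabbitMQ and Allow taints, as a Helm chart might add them. The function names `covers` and `unlisted` are made up for this example.

```python
import json


def covers(allowed, tol):
    """Return True if whitelist entry `allowed` covers pod toleration `tol` (simplified)."""
    if allowed == tol:
        return True
    # Keys must match, unless the whitelist entry is a catch-all (empty key with Exists).
    allowed_key = allowed.get("key", "")
    if allowed_key != tol.get("key", "") and not (
        allowed_key == "" and allowed.get("operator") == "Exists"
    ):
        return False
    # An empty effect on the whitelist entry is treated as matching every effect.
    if allowed.get("effect", "") not in ("", tol.get("effect", "")):
        return False
    # For NoExecute, the pod may not tolerate the taint longer than the whitelist allows.
    limit = allowed.get("tolerationSeconds")
    if allowed.get("effect") == "NoExecute" and limit is not None:
        seconds = tol.get("tolerationSeconds")
        if seconds is None or seconds > limit:
            return False
    if allowed.get("operator") == "Exists":
        return True
    # A missing or empty operator is treated as Equal.
    return (tol.get("operator") or "Equal") == "Equal" and allowed.get(
        "value", ""
    ) == tol.get("value", "")


def unlisted(tolerations, whitelist):
    """Return the tolerations that no whitelist entry covers."""
    return [t for t in tolerations if not any(covers(w, t) for w in whitelist)]


# Whitelist copied from the tolerationsWhitelist annotation in the question.
whitelist = json.loads(
    '[{"operator": "Equal", "value": "spot", "key": "kubernetes.azure.com/scalesetpriority"},'
    ' {"operator": "Exists", "effect": "NoSchedule", "key": "node.kubernetes.io/memory-pressure"},'
    ' {"operator": "Exists", "tolerationSeconds": 300, "effect": "NoExecute", "key": "node.kubernetes.io/unreachable"},'
    ' {"operator": "Exists", "tolerationSeconds": 300, "effect": "NoExecute", "key": "node.kubernetes.io/not-ready"}]'
)

# Assumed pod tolerations: the whitelisted ones plus two hypothetical chart-supplied entries.
pod_tolerations = whitelist + [
    {"key": "RabbitMQ", "operator": "Equal", "value": "true", "effect": "NoSchedule"},
    {"key": "Allow", "operator": "Equal", "value": "true", "effect": "NoExecute"},
]

print([t["key"] for t in unlisted(pod_tolerations, whitelist)])
```

If the real pod carries tolerations like the two assumed ones, they would also have to appear in the namespace whitelist for the pod to be admitted.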

x759pob2 #1

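I had the same issue. I fixed it as follows:

  1. Removed the whitelist annotations from the namespace.
  2. Deployed the pod.
  3. Ran kubectl get pod <pod name> -o yaml
    In my case, a couple of extra tolerations had been injected without my realising it.
  4. Reverted step 1 (restored the annotations on the namespace), adding the extra tolerations found in step 3 to the whitelist.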
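
The comparison in the steps above can be scripted. The following Python sketch takes a pod object, as a dict, and the current whitelist, and produces a new value for the scheduler.alpha.kubernetes.io/tolerationsWhitelist annotation that also contains every toleration found on the pod. The sample pod, the example.com/extra key, and the function name `whitelist_annotation` are hypothetical; a real pod dict would come from parsing the output of kubectl get pod <pod name> -o json.

```python
import json


def whitelist_annotation(pod, existing_whitelist):
    """Return a JSON annotation value: the existing whitelist plus the pod's tolerations."""
    merged = list(existing_whitelist)
    for tol in pod.get("spec", {}).get("tolerations", []):
        # Only exact duplicates are skipped; broader entries are not collapsed.
        if tol not in merged:
            merged.append(tol)
    return json.dumps(merged)


# The whitelist entry for the spot taint, as in the question.
current_whitelist = [
    {"operator": "Equal", "value": "spot", "key": "kubernetes.azure.com/scalesetpriority"},
]

# Hypothetical pod: one toleration already whitelisted, one injected extra.
sample_pod = {
    "spec": {
        "tolerations": [
            {"operator": "Equal", "value": "spot", "key": "kubernetes.azure.com/scalesetpriority"},
            {"key": "example.com/extra", "operator": "Exists", "effect": "NoSchedule"},
        ]
    }
}

print(whitelist_annotation(sample_pod, current_whitelist))
```

The printed string can then be set on the namespace, for example with kubectl annotate namespace <namespace> --overwrite. Review the merged list first, since whitelisting everything a pod happens to carry may be broader than intended.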
