Kubernetes: inconsistent behavior when different API servers use different ServiceCIDRs

umuewwlo  posted 6 months ago in  Kubernetes

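Kubernetes apiservers are responsible for several aspects of Services:

  • apiservers run a reconcile loop to maintain the special kubernetes.default Service, which uses the first address of the configured primary Service CIDR
  • apiservers allocate ClusterIPs for Services from a bitmap that is snapshotted in etcd
  • apiservers run a repair loop to keep the Services' ClusterIPs and the bitmap in sync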
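The bitmap allocation described in the list above can be sketched roughly as follows. This is a minimal illustration, not the apiserver's actual implementation: `bitmapAllocator`, `allocateNext`, and `snapshot` are hypothetical names, the sketch handles IPv4 only, and it scans for the lowest free offset, which may differ from the strategy the real allocator uses.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"math/big"
	"net/netip"
)

// bitmapAllocator is a hypothetical, simplified ClusterIP allocator (IPv4 only).
// Bit i of used records whether the address at offset i+1 from the network
// address has been handed out.
type bitmapAllocator struct {
	prefix netip.Prefix
	size   int
	used   *big.Int
}

func newBitmapAllocator(cidr string) (*bitmapAllocator, error) {
	p, err := netip.ParsePrefix(cidr)
	if err != nil {
		return nil, err
	}
	p = p.Masked()
	hostBits := 32 - p.Bits()
	if !p.Addr().Is4() || hostBits < 2 {
		return nil, errors.New("sketch supports IPv4 prefixes of /30 or larger only")
	}
	// Skip the network and broadcast addresses.
	return &bitmapAllocator{prefix: p, size: 1<<hostBits - 2, used: new(big.Int)}, nil
}

// addrAt converts a bitmap offset into an address inside the prefix.
func (a *bitmapAllocator) addrAt(offset int) netip.Addr {
	base := a.prefix.Addr().As4()
	n := binary.BigEndian.Uint32(base[:]) + uint32(offset) + 1
	var out [4]byte
	binary.BigEndian.PutUint32(out[:], n)
	return netip.AddrFrom4(out)
}

// allocateNext claims the lowest free offset and returns its address.
func (a *bitmapAllocator) allocateNext() (netip.Addr, error) {
	for i := 0; i < a.size; i++ {
		if a.used.Bit(i) == 0 {
			a.used.SetBit(a.used, i, 1)
			return a.addrAt(i), nil
		}
	}
	return netip.Addr{}, errors.New("range is full")
}

// snapshot returns the range and the raw bitmap, the two pieces of state
// that would be persisted together.
func (a *bitmapAllocator) snapshot() (string, []byte) {
	return a.prefix.String(), a.used.Bytes()
}

func main() {
	a, err := newBitmapAllocator("192.168.0.0/24")
	if err != nil {
		panic(err)
	}
	for i := 0; i < 3; i++ {
		ip, _ := a.allocateNext()
		fmt.Println(ip)
	}
	rng, bits := a.snapshot()
	fmt.Printf("%s %08b\n", rng, bits)
}
```

Because the persisted bitmap only has meaning relative to one range, two allocators configured with different CIDRs cannot safely share the same snapshot.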

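The Service CIDR is configured with the flag --service-cluster-ip-range string, and multiple apiservers may be started with different values. When that happens, the cluster ends up in a completely inconsistent state, because every loop in every apiserver tries to reconcile against its own parameters: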

kube_apiserver_test.go:684: svcs from client 2: svc-apiserver2-9 192.168.0.22
    kube_apiserver_test.go:687: ------------
    kube_apiserver_test.go:658: Unexpected error: Internal error occurred: failed to allocate a serviceIP: the provided range does not match the current range
I1229 20:22:50.079653   56647 alloc.go:327] "allocated clusterIPs" service="default/svc-apiserver2-17" clusterIPs=map[IPv4:192.168.0.78]
    kube_apiserver_test.go:673: ------------
    kube_apiserver_test.go:676: svcs from client 1: kubernetes 10.0.0.1
    kube_apiserver_test.go:676: svcs from client 1: svc-apiserver2-0 192.168.0.114
    kube_apiserver_test.go:676: svcs from client 1: svc-apiserver2-1 192.168.0.248

The inconsistency can be easily reproduced with the following integration test:

diff --git a/test/integration/controlplane/kube_apiserver_test.go b/test/integration/controlplane/kube_apiserver_test.go
index 0658516aa70..f7b302636aa 100644
--- a/test/integration/controlplane/kube_apiserver_test.go
+++ b/test/integration/controlplane/kube_apiserver_test.go
@@ -42,6 +42,7 @@ import (
        "k8s.io/client-go/kubernetes"
        "k8s.io/kube-aggregator/pkg/apis/apiregistration"
        "k8s.io/kube-openapi/pkg/validation/spec"
+       "k8s.io/kubernetes/cmd/kube-apiserver/app/options"
        kubeapiservertesting "k8s.io/kubernetes/cmd/kube-apiserver/app/testing"
        "k8s.io/kubernetes/test/integration/etcd"
        "k8s.io/kubernetes/test/integration/framework"
@@ -606,3 +607,84 @@ func TestMultiAPIServerNodePortAllocation(t *testing.T) {
        }

 }
+
+func TestMultiAPIServerDifferentServiceCIDRs(t *testing.T) {
+       serviceObject := &corev1.Service{
+               ObjectMeta: metav1.ObjectMeta{
+                       Labels: map[string]string{"foo": "bar"},
+                       Name:   "test-svc",
+               },
+               Spec: corev1.ServiceSpec{
+                       Ports: []corev1.ServicePort{
+                               {
+                                       Name:       "test",
+                                       Port:       443,
+                                       TargetPort: intstr.IntOrString{IntVal: 443},
+                                       Protocol:   "TCP",
+                               },
+                       },
+                       Type:     corev1.ServiceTypeNodePort,
+                       Selector: map[string]string{"foo": "bar"},
+               },
+       }
+       etcd := framework.SharedEtcd()
+
+       serviceCIDR1 := "10.0.0.0/16"
+
+       client1, _, tearDownFn1 := framework.StartTestServer(t, framework.TestServerSetup{
+               ModifyServerRunOptions: func(opts *options.ServerRunOptions) {
+                       opts.ServiceClusterIPRanges = serviceCIDR1
+                       opts.Etcd.StorageConfig = *etcd
+               },
+       })
+       defer tearDownFn1()
+
+       serviceCIDR2 := "192.168.0.0/24"
+
+       client2, _, tearDownFn2 := framework.StartTestServer(t, framework.TestServerSetup{
+               ModifyServerRunOptions: func(opts *options.ServerRunOptions) {
+                       opts.ServiceClusterIPRanges = serviceCIDR2
+                       opts.Etcd.StorageConfig = *etcd
+               },
+       })
+       defer tearDownFn2()
+
+       for i := 0; i < 2500; i++ {
+               // create a Service with first API server
+               svc1 := serviceObject.DeepCopy()
+               svc1.Name = fmt.Sprintf("svc-apiserver1-%d", i)
+               _, err := client1.CoreV1().Services(metav1.NamespaceDefault).Create(context.Background(), svc1, metav1.CreateOptions{})
+               if err != nil {
+                       t.Logf("Unexpected error: %v", err)
+               }
+
+               // create a Service with second API server
+               svc2 := serviceObject.DeepCopy()
+               svc2.Name = fmt.Sprintf("svc-apiserver2-%d", i)
+               _, err = client2.CoreV1().Services(metav1.NamespaceDefault).Create(context.Background(), svc2, metav1.CreateOptions{})
+               if err != nil {
+                       t.Logf("Unexpected error: %v", err)
+               }
+
+               svcs, err := client1.CoreV1().Services(metav1.NamespaceAll).List(context.TODO(), metav1.ListOptions{})
+               if err != nil {
+                       t.Logf("Unexpected error: %v", err)
+               }
+               t.Logf("------------")
+
+               for _, s := range svcs.Items {
+                       t.Logf("svcs from client 1: %s %s\n", s.Name, s.Spec.ClusterIP)
+               }
+
+               svcs, err = client2.CoreV1().Services(metav1.NamespaceAll).List(context.TODO(), metav1.ListOptions{})
+               if err != nil {
+                       t.Logf("Unexpected error: %v", err)
+               }
+               for _, s := range svcs.Items {
+                       t.Logf("svcs from client 2: %s %s\n", s.Name, s.Spec.ClusterIP)
+               }
+               time.Sleep(2 * time.Second)
+               t.Logf("------------")
+       }
+
+}

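It is important to mention that since 1.17 the Service definition is not reconciled. This means that the first apiserver to create the kubernetes.default Service wins. Just for the record, this was originally described and fixed by https://github.com/kubernetes/enhancements/tree/master/keps/sig-network/1880-multiple-service-cidrs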
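The "first apiserver wins" behavior amounts to create-if-absent semantics. A minimal sketch, assuming a hypothetical in-memory `store` in place of the shared etcd-backed storage and a hypothetical `ensureDefaultService` helper:

```go
package main

import "fmt"

// store is a hypothetical in-memory stand-in for the shared Service storage,
// mapping a namespaced Service name to its ClusterIP.
type store map[string]string

// ensureDefaultService creates kubernetes.default only when it is absent.
// An existing definition is returned untouched, i.e. it is not reconciled.
func ensureDefaultService(s store, wantIP string) string {
	const name = "default/kubernetes"
	if ip, ok := s[name]; ok {
		return ip
	}
	s[name] = wantIP
	return wantIP
}

func main() {
	shared := store{}
	// The first apiserver (10.0.0.0/16) creates the Service.
	fmt.Println(ensureDefaultService(shared, "10.0.0.1"))
	// The second apiserver (192.168.0.0/24) finds it and leaves it as-is.
	fmt.Println(ensureDefaultService(shared, "192.168.0.1"))
}
```

Both calls print 10.0.0.1, matching the `kubernetes 10.0.0.1` entry in the test log.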
/sig api-machinery
/sig network
/assign

disho6za  #3

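This issue has not been updated in over a year and should be re-triaged.
You can:

  • Confirm that this issue is still relevant with /triage accepted (org members only)
  • Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/
/remove-triage accepted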
