kubernetes: inconsistent behavior when apiservers are configured with different ServiceCIDRs

umuewwlo posted 9 months ago in Kubernetes

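The Kubernetes apiservers are responsible for several aspects of Services:

  • apiservers run a reconcile loop to maintain the special kubernetes.default Service, which uses the first address of the configured primary Service CIDR
  • apiservers allocate ClusterIPs to Services using a bitmap that is snapshotted in etcd
  • apiservers run a repair loop to keep the Services' ClusterIPs and the bitmap in sync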
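
To make the bitmap-based allocation above more concrete, here is a minimal sketch in Go. It is not the apiserver's implementation: `rangeAllocator`, `allocateNext` and `restore` are hypothetical names, the sketch handles IPv4 only, and it always picks the lowest free offset to keep the output predictable. It only illustrates why a bitmap snapshot is meaningful solely for the CIDR it was created for.

```go
package main

import (
	"errors"
	"fmt"
	"math/big"
	"net/netip"
)

// rangeAllocator is a hypothetical, simplified stand-in for a ClusterIP
// allocator: one bit per address offset inside a single IPv4 CIDR.
type rangeAllocator struct {
	prefix netip.Prefix
	size   int      // number of allocatable offsets
	used   *big.Int // bit i set => address (network + 1 + i) is allocated
}

func newRangeAllocator(cidr string) (*rangeAllocator, error) {
	p, err := netip.ParsePrefix(cidr)
	if err != nil {
		return nil, err
	}
	p = p.Masked()
	if !p.Addr().Is4() {
		return nil, errors.New("this sketch handles IPv4 only")
	}
	hostBits := 32 - p.Bits()
	if hostBits < 2 {
		return nil, errors.New("range too small")
	}
	// Skip the network and broadcast addresses.
	return &rangeAllocator{prefix: p, size: (1 << hostBits) - 2, used: new(big.Int)}, nil
}

// addrAt converts a bitmap offset into an address of this range.
func (a *rangeAllocator) addrAt(offset int) netip.Addr {
	b := a.prefix.Addr().As4()
	n := uint32(b[0])<<24 | uint32(b[1])<<16 | uint32(b[2])<<8 | uint32(b[3])
	n += uint32(offset) + 1
	return netip.AddrFrom4([4]byte{byte(n >> 24), byte(n >> 16), byte(n >> 8), byte(n)})
}

// allocateNext marks the lowest free bit and returns the matching address.
func (a *rangeAllocator) allocateNext() (netip.Addr, error) {
	for i := 0; i < a.size; i++ {
		if a.used.Bit(i) == 0 {
			a.used.SetBit(a.used, i, 1)
			return a.addrAt(i), nil
		}
	}
	return netip.Addr{}, errors.New("range is full")
}

// restore loads a snapshot and rejects one taken for a different range,
// because the same bit would map to a different address.
func (a *rangeAllocator) restore(cidr string, bits *big.Int) error {
	if cidr != a.prefix.String() {
		return errors.New("the provided range does not match the current range")
	}
	a.used = new(big.Int).Set(bits)
	return nil
}

func main() {
	a1, _ := newRangeAllocator("10.0.0.0/16")
	a2, _ := newRangeAllocator("192.168.0.0/24")

	ip, _ := a1.allocateNext()
	fmt.Println("allocated by allocator 1:", ip)

	// Allocator 2 tries to load the snapshot written by allocator 1.
	err := a2.restore(a1.prefix.String(), a1.used)
	fmt.Println("allocator 2 restore error:", err)
}
```

In this sketch the second allocator refuses the snapshot outright. When two apiservers share one etcd, each one keeps trying to apply its own range to the same stored snapshot.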

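The Service CIDR is configured with the flag --service-cluster-ip-range string, and multiple apiservers may be started with different values. If that happens, the cluster ends up in a completely inconsistent state, because every loop in every apiserver tries to reconcile against its own parameters.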
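
As a small illustration of the disagreement, the sketch below computes the first address of each of the two ranges used in the reproduction (10.0.0.0/16 and 192.168.0.0/24). `firstServiceIP` is a hypothetical helper written for this example, not an apiserver function. It assumes "first address" means the network address plus one, which matches the 10.0.0.1 reported for the kubernetes Service in the log output.

```go
package main

import (
	"fmt"
	"net/netip"
)

// firstServiceIP returns the network address of cidr plus one, i.e. the
// address each apiserver would expect kubernetes.default to have.
func firstServiceIP(cidr string) (netip.Addr, error) {
	p, err := netip.ParsePrefix(cidr)
	if err != nil {
		return netip.Addr{}, err
	}
	return p.Masked().Addr().Next(), nil
}

func main() {
	// The two values of --service-cluster-ip-range used in the reproduction.
	for _, cidr := range []string{"10.0.0.0/16", "192.168.0.0/24"} {
		ip, err := firstServiceIP(cidr)
		if err != nil {
			fmt.Println("invalid CIDR:", err)
			continue
		}
		fmt.Printf("range %s expects kubernetes.default at %s\n", cidr, ip)
	}
}
```

Each apiserver derives a different expected address from its own flag, so the two reconcile loops cannot both be satisfied by a single kubernetes.default Service.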

kube_apiserver_test.go:684: svcs from client 2: svc-apiserver2-9 192.168.0.22
kube_apiserver_test.go:687: ------------
kube_apiserver_test.go:658: Unexpected error: Internal error occurred: failed to allocate a serviceIP: the provided range does not match the current range
I1229 20:22:50.079653 56647 alloc.go:327] "allocated clusterIPs" service="default/svc-apiserver2-17" clusterIPs=map[IPv4:192.168.0.78]
kube_apiserver_test.go:673: ------------
kube_apiserver_test.go:676: svcs from client 1: kubernetes 10.0.0.1
kube_apiserver_test.go:676: svcs from client 1: svc-apiserver2-0 192.168.0.114
kube_apiserver_test.go:676: svcs from client 1: svc-apiserver2-1 192.168.0.248
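
The log lines above show client 1, whose apiserver is configured with 10.0.0.0/16, listing Services whose ClusterIPs come from 192.168.0.0/24. The following sketch checks that observation mechanically. `svc` and `outsideRange` are hypothetical names introduced for this example, and the input is copied from the log output.

```go
package main

import (
	"fmt"
	"net/netip"
)

// svc is a minimal name/ClusterIP pair taken from the log output.
type svc struct {
	name      string
	clusterIP string
}

// outsideRange returns the names of the Services whose ClusterIP does not
// belong to cidr, in input order.
func outsideRange(cidr string, svcs []svc) ([]string, error) {
	p, err := netip.ParsePrefix(cidr)
	if err != nil {
		return nil, err
	}
	var out []string
	for _, s := range svcs {
		ip, err := netip.ParseAddr(s.clusterIP)
		if err != nil {
			return nil, err
		}
		if !p.Contains(ip) {
			out = append(out, s.name)
		}
	}
	return out, nil
}

func main() {
	observed := []svc{
		{"kubernetes", "10.0.0.1"},
		{"svc-apiserver2-0", "192.168.0.114"},
		{"svc-apiserver2-1", "192.168.0.248"},
		{"svc-apiserver2-9", "192.168.0.22"},
		{"svc-apiserver2-17", "192.168.0.78"},
	}
	names, err := outsideRange("10.0.0.0/16", observed)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("outside 10.0.0.0/16:", names)
}
```

Every Service created through the second apiserver falls outside the range the first apiserver was configured with. Only kubernetes.default matches it.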

This can be easily reproduced with the following integration test:

diff --git a/test/integration/controlplane/kube_apiserver_test.go b/test/integration/controlplane/kube_apiserver_test.go
index 0658516aa70..f7b302636aa 100644
--- a/test/integration/controlplane/kube_apiserver_test.go
+++ b/test/integration/controlplane/kube_apiserver_test.go
@@ -42,6 +42,7 @@ import (
 	"k8s.io/client-go/kubernetes"
 	"k8s.io/kube-aggregator/pkg/apis/apiregistration"
 	"k8s.io/kube-openapi/pkg/validation/spec"
+	"k8s.io/kubernetes/cmd/kube-apiserver/app/options"
 	kubeapiservertesting "k8s.io/kubernetes/cmd/kube-apiserver/app/testing"
 	"k8s.io/kubernetes/test/integration/etcd"
 	"k8s.io/kubernetes/test/integration/framework"
@@ -606,3 +607,84 @@ func TestMultiAPIServerNodePortAllocation(t *testing.T) {
 	}
 }
+
+func TestMultiAPIServerDifferentServiceCIDRs(t *testing.T) {
+	serviceObject := &corev1.Service{
+		ObjectMeta: metav1.ObjectMeta{
+			Labels: map[string]string{"foo": "bar"},
+			Name:   "test-svc",
+		},
+		Spec: corev1.ServiceSpec{
+			Ports: []corev1.ServicePort{
+				{
+					Name:       "test",
+					Port:       443,
+					TargetPort: intstr.IntOrString{IntVal: 443},
+					Protocol:   "TCP",
+				},
+			},
+			Type:     corev1.ServiceTypeNodePort,
+			Selector: map[string]string{"foo": "bar"},
+		},
+	}
+	etcd := framework.SharedEtcd()
+
+	serviceCIDR1 := "10.0.0.0/16"
+
+	client1, _, tearDownFn1 := framework.StartTestServer(t, framework.TestServerSetup{
+		ModifyServerRunOptions: func(opts *options.ServerRunOptions) {
+			opts.ServiceClusterIPRanges = serviceCIDR1
+			opts.Etcd.StorageConfig = *etcd
+		},
+	})
+	defer tearDownFn1()
+
+	serviceCIDR2 := "192.168.0.0/24"
+
+	client2, _, tearDownFn2 := framework.StartTestServer(t, framework.TestServerSetup{
+		ModifyServerRunOptions: func(opts *options.ServerRunOptions) {
+			opts.ServiceClusterIPRanges = serviceCIDR2
+			opts.Etcd.StorageConfig = *etcd
+		},
+	})
+	defer tearDownFn2()
+
+	for i := 0; i < 2500; i++ {
+		// create a Service with first API server
+		svc1 := serviceObject.DeepCopy()
+		svc1.Name = fmt.Sprintf("svc-apiserver1-%d", i)
+		_, err := client1.CoreV1().Services(metav1.NamespaceDefault).Create(context.Background(), svc1, metav1.CreateOptions{})
+		if err != nil {
+			t.Logf("Unexpected error: %v", err)
+		}
+
+		// create a Service with second API server
+		svc2 := serviceObject.DeepCopy()
+		svc2.Name = fmt.Sprintf("svc-apiserver2-%d", i)
+		_, err = client2.CoreV1().Services(metav1.NamespaceDefault).Create(context.Background(), svc2, metav1.CreateOptions{})
+		if err != nil {
+			t.Logf("Unexpected error: %v", err)
+		}
+
+		svcs, err := client1.CoreV1().Services(metav1.NamespaceAll).List(context.TODO(), metav1.ListOptions{})
+		if err != nil {
+			t.Logf("Unexpected error: %v", err)
+		}
+		t.Logf("------------")
+
+		for _, s := range svcs.Items {
+			t.Logf("svcs from client 1: %s %s\n", s.Name, s.Spec.ClusterIP)
+		}
+
+		svcs, err = client2.CoreV1().Services(metav1.NamespaceAll).List(context.TODO(), metav1.ListOptions{})
+		if err != nil {
+			t.Logf("Unexpected error: %v", err)
+		}
+		for _, s := range svcs.Items {
+			t.Logf("svcs from client 2: %s %s\n", s.Name, s.Spec.ClusterIP)
+		}
+		time.Sleep(2 * time.Second)
+		t.Logf("------------")
+	}
+
+}

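It is important to mention that, since 1.17, the Service definition is not reconciled. This means that the first apiserver to create the kubernetes.default Service wins. Just for the record, this was originally described and fixed by https://github.com/kubernetes/enhancements/tree/master/keps/sig-network/1880-multiple-service-cidrs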
/sig api-machinery
/sig network
/assign

csga3l58


/kind bug

rqcrx0a6


/triage accepted

disho6za


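This issue has not been updated in over 1 year, and should be re-triaged.
You can:

  • Confirm that this issue is still relevant with /triage accepted (org members only)
  • Close this issue with /close

For more details on the triage process, see https://www.kubernetes.dev/docs/guide/issue-triage/
/remove-triage accepted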

atmip9wb


/triage accepted
