Using telegraf as a DaemonSet to send metrics for Kubernetes pods/containers

ifmq2ha2, posted on 2021-06-06 in Kafka

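First, I would like to get one thing clear: if I run telegraf as a DaemonSet in a Kubernetes cluster, will it collect metrics for the pods, or will it collect metrics for the physical nodes?
I created a telegraf DaemonSet in my test Kubernetes cluster, which runs under Hyper-V on my laptop, based on this Kubernetes cluster installation:
I want to collect pod metrics, but nothing arrives at the Kafka machines. I see this error in the log: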

    2019-05-08T02:36:35Z I! Starting Telegraf 1.9.2
    2019-05-08T02:36:35Z I! Using config file: /etc/telegraf/telegraf.conf
    2019-05-08T02:46:36Z E! [agent] Failed to connect to output kafka, retrying in 15s, error was 'kafka: client has run out of available brokers to talk to (Is your cluster reachable?)'

This is the DaemonSet definition file:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: telegraf
      namespace: monitoring
      labels:
        k8s-app: telegraf
    data:
      telegraf.conf: |+
        [global_tags]
          env = "$ENV"
        [agent]
          hostname = "$HOSTNAME"
          interval = "60s"
          round_interval = true
          metric_batch_size = 1000
          metric_buffer_limit = 10000
          collection_jitter = "0s"
          flush_interval = "10s"
          flush_jitter = "2s"
          precision = ""
          debug = false
          quiet = true
          logfile = ""
        [[outputs.kafka]]
          brokers = ["10.121.63.5:9092", "10.121.63.18:9092", "10.121.62.64:9092", "10.121.62.80:9092", "10.121.63.22:9092"]
          topic = "telegraf-measurements-json"
          client_id = "golangsarama__1.18.0__serverinfra__telegraf"
          routing_tag = "host"
          version = "0.11.0.2"
          compression_codec = 2
          required_acks = 1
          data_format = "json"
        [[inputs.cpu]]
          percpu = true
          totalcpu = true
          collect_cpu_time = false
          report_active = false
        [[inputs.disk]]
          ignore_fs = ["tmpfs", "devtmpfs", "devfs"]
        [[inputs.diskio]]
        [[inputs.kernel]]
        [[inputs.mem]]
        [[inputs.processes]]
        [[inputs.swap]]
        [[inputs.system]]
        [[inputs.docker]]
          endpoint = "unix:///var/run/docker.sock"
        [[inputs.kubernetes]]
          url = "https://192.168.213.18:6443"
          insecure_skip_verify = true
    ---
    # Section: Daemonset
    apiVersion: apps/v1
    kind: DaemonSet
    metadata:
      name: telegraf
      namespace: monitoring
      labels:
        k8s-app: telegraf
    spec:
      selector:
        matchLabels:
          name: telegraf
      template:
        metadata:
          labels:
            name: telegraf
        spec:
          containers:
          - name: telegraf
            image: docker.io/telegraf:1.9.2
            resources:
              limits:
                memory: 500Mi
              requests:
                cpu: 500m
                memory: 500Mi
            env:
            - name: HOSTNAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: "HOST_PROC"
              value: "/rootfs/proc"
            - name: "HOST_SYS"
              value: "/rootfs/sys"
            - name: ENV
              valueFrom:
                secretKeyRef:
                  name: telegraf
                  key: env
            volumeMounts:
            - name: sys
              mountPath: /rootfs/sys
              readOnly: true
            - name: proc
              mountPath: /rootfs/proc
              readOnly: true
            - name: docker-socket
              mountPath: /var/run/docker.sock
            - name: utmp
              mountPath: /var/run/utmp
              readOnly: true
            - name: config
              mountPath: /etc/telegraf
          terminationGracePeriodSeconds: 30
          volumes:
          - name: sys
            hostPath:
              path: /sys
          - name: docker-socket
            hostPath:
              path: /var/run/docker.sock
          - name: proc
            hostPath:
              path: /proc
          - name: utmp
            hostPath:
              path: /var/run/utmp
          - name: config
            configMap:
              name: telegraf

This is the article I followed to create the DaemonSet.
These are the pods:

    NAMESPACE     NAME                                 READY   STATUS    RESTARTS   AGE
    default       nginx-65f88748fd-jztrz               1/1     Running   0          7d18h
    kube-system   coredns-fb8b8dccf-rl48l              1/1     Running   0          7d18h
    kube-system   coredns-fb8b8dccf-x8fvx              1/1     Running   0          7d18h
    kube-system   etcd-k8s-master                      1/1     Running   2          7d18h
    kube-system   kube-apiserver-k8s-master            1/1     Running   2          7d18h
    kube-system   kube-controller-manager-k8s-master   1/1     Running   0          7d18h
    kube-system   kube-flannel-ds-amd64-96tsl          1/1     Running   0          7d18h
    kube-system   kube-flannel-ds-amd64-b884r          1/1     Running   0          7d18h
    kube-system   kube-flannel-ds-amd64-pdqmq          1/1     Running   0          7d18h
    kube-system   kube-proxy-42k2g                     1/1     Running   0          7d18h
    kube-system   kube-proxy-77pw9                     1/1     Running   0          7d18h
    kube-system   kube-proxy-n5mbs                     1/1     Running   0          7d18h
    kube-system   kube-scheduler-k8s-master            1/1     Running   2          7d18h
    monitoring    telegraf-dvtcl                       1/1     Running   5          117m
    monitoring    telegraf-n2mqz                       1/1     Running   5          117m

tcpdump shows what is being sent from the DaemonSet:

    09:52:59.002901 IP 192.168.1.10.45546 > sdsfdsf.XmlIpcRegSvc: Flags [S], seq 3040818525, win 28200, options [mss 1410,sackOK,TS val 158999344 ecr 0,nop,wscale 7], length 0
    E..<2.@.@......
    y?...#..?5]......n(._.........
    z#0........................
    09:52:59.002901 IP 192.168.1.10.45546 > sdsfdsf.XmlIpcRegSvc: Flags [S], seq 3040818525, win 28200, options [mss 1410,sackOK,TS val 158999344 ecr 0,nop,wscale 7], length 0
    E..<2.@.@......
    y?...#..?5]......n(._.........

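But I cannot see anything on our Grafana dashboard. If I install a standalone, RPM-based telegraf on the node, it does send data and I can see the metrics. What I am curious about, though, is pod metrics.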

8yparm6h, answer #1

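This error from telegraf simply means that it has no connectivity to the brokers in the 10.x IP range listed in the brokers array of your config. Depending on how you set up your networking and routing, you may just have a simple routing issue reaching the private IPs that host your Kafka cluster. The tcpdump excerpt in the question shows only outgoing SYN packets, which would be consistent with that.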
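
One way to test this is to try opening a plain TCP connection to each broker from the same network location as the telegraf pods. The following is only a minimal sketch, assuming Python 3 is available wherever you run it (the stock telegraf image may not include it, so a debug pod or the node itself may be needed). The helper names `parse_broker` and `check_brokers` are made up for this illustration; the broker addresses are the ones from the `[[outputs.kafka]]` section in the question.

```python
import socket


def parse_broker(address):
    """Split a "host:port" string into a (host, port) tuple."""
    host, _, port = address.rpartition(":")
    return host, int(port)


def check_brokers(brokers, timeout=3.0):
    """Try a TCP connect to every broker; return {address: True or False}."""
    results = {}
    for address in brokers:
        host, port = parse_broker(address)
        try:
            # create_connection raises OSError on refusal, timeout,
            # unreachable network or failed name resolution.
            with socket.create_connection((host, port), timeout=timeout):
                results[address] = True
        except OSError:
            results[address] = False
    return results


# Example usage with the broker list from the question:
#
#   brokers = ["10.121.63.5:9092", "10.121.63.18:9092", "10.121.62.64:9092",
#              "10.121.62.80:9092", "10.121.63.22:9092"]
#   for address, ok in check_brokers(brokers).items():
#       print(address, "reachable" if ok else "UNREACHABLE")
```

If every broker comes back unreachable from the pod network but reachable from the node, that would point at pod-to-Kafka routing or NAT rather than at the telegraf configuration. A successful TCP connect only shows that the port is open; it does not prove that the Kafka protocol handshake will succeed.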
