Telegraf on Ubuntu is sending incorrect data to InfluxDB 2?

34gzjxbg, posted 2023-11-17 in InfluxDB

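I'm new to Telegraf / InfluxDB 2 :)
I put together a quick Docker setup to monitor an Ubuntu cloud VM. For this I created a "default" docker-compose file with Telegraf, InfluxDB 2 and Grafana. It works fine on my Ubuntu laptop. When I run it on the Ubuntu cloud VM, my metrics show very strange numbers.
For example: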

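  • load15: top (or uptime) reports a load15 of 0.07, but InfluxDB 2 has 21.39
  • uptime: the real uptime is about 2 hours right now, while the value stored in InfluxDB is 27177833 (44 weeks...)
  • memory used: free -h reports 590 Mi, while InfluxDB has 95025573888 (87 Gb, and my VM has 3 Gb..)
  • ....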
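
To put the raw InfluxDB values above into familiar units, the conversion can be sketched as follows. This is only illustrative arithmetic on the two numbers quoted in the question; the helper names `to_weeks` and `to_gib` are made up for this example, and the results are floored by integer division.

```shell
#!/bin/bash
# Convert the raw values quoted in the question into human-readable units.
uptime_seconds=27177833      # uptime value stored in InfluxDB (from the question)
mem_used_bytes=95025573888   # memory-used value stored in InfluxDB (from the question)

# Hypothetical helpers: integer division, so results are rounded down.
to_weeks() { echo $(( $1 / 604800 )); }       # 7 * 24 * 3600 seconds per week
to_gib()   { echo $(( $1 / 1073741824 )); }   # 1024^3 bytes per GiB

echo "uptime: $(to_weeks "$uptime_seconds") weeks"
echo "memory used: $(to_gib "$mem_used_bytes") GiB"
```

Both results are far beyond what a VM with 3 Gb of RAM and two hours of uptime could report, which is what makes the stored values look like they describe a different machine.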

What I find really strange is that the Docker input works fine :/
This is driving me crazy :) Has anyone run into the same thing? Again, it works fine on my laptop :)
docker-compose.yml

    version: '3.1'
    services:
      grafana:
        image: grafana/grafana
        container_name: grafana
        restart: unless-stopped
        depends_on:
          - telegraf
        volumes:
          - ./grafana/provisioning/:/etc/grafana/provisioning/
          - ./grafana/dashboards/:/var/lib/grafana/dashboards/
          - ./grafana/grafana.ini:/etc/grafana/grafana.ini
        ports:
          - 3000:3000
      influxdb:
        image: influxdb:2.5.1
        container_name: influxdb
        restart: unless-stopped
        ports:
          - 8086:8086
        environment:
          - DOCKER_INFLUXDB_INIT_USERNAME=xxxx
          - DOCKER_INFLUXDB_INIT_PASSWORD=yyyyy
          - DOCKER_INFLUXDB_INIT_ORG=myorg
          - DOCKER_INFLUXDB_INIT_BUCKET=mybucket
          - DOCKER_INFLUXDB_INIT_RETENTION=3w
          - DOCKER_INFLUXDB_INIT_MODE=setup
          - DOCKER_INFLUXDB_INIT_ADMIN_TOKEN=my-token
        volumes:
          - data-influx:/var/lib/influxdb2
      telegraf:
        image: telegraf:1.24.3-alpine
        container_name: telegraf
        restart: unless-stopped
        depends_on:
          - influxdb
        volumes:
          - ./telegraf/etc/telegraf.conf:/etc/telegraf/telegraf.conf:ro
          - /var/run/docker.sock:/var/run/docker.sock
          - /sys:/rootfs/sys:ro
          - /proc:/rootfs/proc:ro
          - /etc:/rootfs/etc:ro
        user: telegraf:999
    volumes:
      data-influx:

telegraf.conf

    [global_tags]

    [agent]
      interval = "10s"
      round_interval = true
      metric_batch_size = 1000
      metric_buffer_limit = 10000
      flush_buffer_when_full = true
      collection_jitter = "0s"
      flush_interval = "10s"
      flush_jitter = "0s"
      debug = false
      quiet = false
      hostname = "LoubVM"

    [[outputs.influxdb_v2]]
      urls = ["http://influxdb:8086"]
      token = "my-token"
      organization = "myorg"
      bucket = "mybucket"

    [[inputs.statsd]]
      protocol = "udp"
      max_tcp_connections = 250
      tcp_keep_alive = false
      service_address = ":8125"
      delete_gauges = true
      delete_counters = true
      delete_sets = true
      delete_timings = true
      percentiles = [90]
      metric_separator = "_"
      parse_data_dog_tags = false
      allowed_pending_messages = 10000
      percentile_limit = 1000

    [[inputs.cpu]]
      percpu = true
      totalcpu = true

    [[inputs.disk]]
      mount_points = ["/"]

    [[inputs.diskio]]
    [[inputs.kernel]]
    [[inputs.mem]]
    [[inputs.processes]]
    [[inputs.swap]]
    [[inputs.system]]
    [[inputs.net]]
    [[inputs.netstat]]
    [[inputs.interrupts]]
    [[inputs.linux_sysctl_fs]]

    [[inputs.docker]]
      endpoint = "unix:///var/run/docker.sock"
      gather_services = false
      source_tag = false
      container_name_include = []
      container_name_exclude = []
      timeout = "5s"
      total = false
      docker_label_include = []
      docker_label_exclude = []
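
The compose file above mounts the host's /proc at /rootfs/proc inside the Telegraf container. One way to narrow down where the odd numbers come from is to read load15 and uptime directly from the proc files, both on the host and inside the container, and compare them with what InfluxDB stores. The following is a minimal sketch in POSIX sh; the script name `proc_check.sh` and the function names are hypothetical, and it makes no claim about which files Telegraf itself reads.

```shell
#!/bin/sh
# Print load15 and uptime straight from proc files, for comparison with InfluxDB.
# Usage: proc_check.sh [proc_root]   (proc_root defaults to /proc)

# The third field of loadavg is the 15-minute load average.
load15() { awk '{print $3}' "$1/loadavg"; }

# The first field of uptime is seconds since boot; drop the fractional part.
uptime_seconds() { awk '{print int($1)}' "$1/uptime"; }

proc_root=${1:-/proc}
if [ -r "$proc_root/loadavg" ] && [ -r "$proc_root/uptime" ]; then
  echo "load15: $(load15 "$proc_root")"
  echo "uptime: $(uptime_seconds "$proc_root") s"
fi
```

On the host it can be run without arguments. Inside the container it could be pointed at /rootfs/proc, for example `sh proc_check.sh /rootfs/proc`, once the script has been copied or mounted there.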

ux6nzvsh (answer #1)

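According to my cloud provider, this happens because my Ubuntu VM is a VPS (virtual private server). Telegraf picks up some of the hypervisor's metrics, which end up as wrong data in InfluxDB/Grafana.
My workaround is a custom script, scheduled by cron, that builds some new metrics and sends them to statsd (a Telegraf component that you can use to send data to Telegraf).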

    #!/bin/bash
    # Send one metric to the Telegraf statsd listener over UDP (port 8125).
    function SendToStatsd {
      local measurement=$1
      local value=$2
      # statsd line format: <name>:<value>|<type>; "c" marks a counter
      echo "${measurement}:${value}|c" | nc -w10 -u 127.0.0.1 8125
    }

    ## Uptime in seconds. Threshold in Grafana: 2419200 (4 weeks)
    myUptime=$(awk '{print $1}' /proc/uptime)
    SendToStatsd myUptime "$myUptime"

    ## Swap
    mySwap=$(free | grep Swap)
    mySwapTotal=$(echo "$mySwap" | awk '{print $2}')
    SendToStatsd mySwapTotal "$mySwapTotal"
    # etc etc ...
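
Before wiring the script into cron, it can help to look at the exact statsd line that would be sent. The sketch below only formats and prints the payload without sending anything. The function name `format_statsd`, the sample values, and the script path in the crontab comment are assumptions for illustration, not part of the original answer. statsd lines have the form `name:value|type`, where the type is, for example, `c` for a counter or `g` for a gauge; a gauge may be the better fit for point-in-time readings such as uptime, though that is a judgment call rather than something the answer states.

```shell
#!/bin/bash
# Build a statsd line and print it instead of sending it, so it can be inspected.
# Arguments: measurement name, value, statsd type (e.g. c = counter, g = gauge).
format_statsd() {
  local measurement=$1 value=$2 type=$3
  printf '%s:%s|%s\n' "$measurement" "$value" "$type"
}

format_statsd myUptime 7200.53 c      # counter, as in the script above
format_statsd mySwapTotal 2097148 g   # gauge, an alternative for point-in-time values

# Hypothetical crontab entry running the metrics script once a minute:
# * * * * * /usr/local/bin/send_metrics.sh
```

Piping the output of `format_statsd` into `nc -w10 -u 127.0.0.1 8125` would send the same line that `SendToStatsd` builds.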
