Telegraf on Ubuntu is sending incorrect data to InfluxDB 2?

34gzjxbg  posted on 2023-11-17  in  InfluxDB

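I'm new to Telegraf / InfluxDB 2 :)
I put together a quick Docker setup to monitor an Ubuntu cloud VM. For it, I created a "default" docker-compose file with Telegraf, InfluxDB 2 and Grafana. It works fine on my Ubuntu laptop. When I run it on the Ubuntu cloud VM, my metrics show very strange numbers.
For example: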

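  • load15: top (or uptime) reports a load15 of 0.07, but InfluxDB 2 has 21.39
  • uptime: the real uptime is about 2 hours right now, while the value stored in InfluxDB is 27177833 (about 44 weeks...)
  • used memory: free -h reports 590 Mi, while InfluxDB has 95025573888 (about 88 GiB, and my VM has 3 GB...)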
  • ....
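
To put the raw values above into human units, here is a small sketch. It assumes the uptime field is in seconds and the memory field is in bytes, which is how Telegraf's system and mem inputs normally report them; the function names are made up for illustration.

```shell
# Hypothetical helpers: convert the raw values stored in InfluxDB
# into units that can be compared with uptime and free -h.

# Seconds to weeks (1 week = 604800 s).
seconds_to_weeks() {
  awk -v s="$1" 'BEGIN { printf "%.1f\n", s / 604800 }'
}

# Bytes to GiB (1 GiB = 1073741824 bytes).
bytes_to_gib() {
  awk -v b="$1" 'BEGIN { printf "%.1f\n", b / 1073741824 }'
}
```

Calling `seconds_to_weeks 27177833` and `bytes_to_gib 95025573888` gives the figures to compare against the 2 hours of real uptime and the 3 GB of RAM on the VM.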

What I find really strange is that the Docker input works fine :/
This is driving me crazy :) Has anyone run into the same thing? Again, it works fine on my laptop :)
docker-compose.yml

version: '3.1'
services:
  grafana:
    image: grafana/grafana
    container_name: grafana
    restart: unless-stopped
    depends_on:
      - telegraf
    volumes:
      - ./grafana/provisioning/:/etc/grafana/provisioning/
      - ./grafana/dashboards/:/var/lib/grafana/dashboards/
      - ./grafana/grafana.ini:/etc/grafana/grafana.ini
    ports:
      - 3000:3000
  influxdb:
    image: influxdb:2.5.1
    container_name: influxdb
    restart: unless-stopped
    ports:
      - 8086:8086
    environment:
      - DOCKER_INFLUXDB_INIT_USERNAME=xxxx
      - DOCKER_INFLUXDB_INIT_PASSWORD=yyyyy
      - DOCKER_INFLUXDB_INIT_ORG=myorg
      - DOCKER_INFLUXDB_INIT_BUCKET=mybucket
      - DOCKER_INFLUXDB_INIT_RETENTION=3w
      - DOCKER_INFLUXDB_INIT_MODE=setup
      - DOCKER_INFLUXDB_INIT_ADMIN_TOKEN=my-token
    volumes:
      - data-influx:/var/lib/influxdb2 
  telegraf:
    image: telegraf:1.24.3-alpine
    container_name: telegraf
    restart: unless-stopped
    depends_on:
      - influxdb
    volumes:
      - ./telegraf/etc/telegraf.conf:/etc/telegraf/telegraf.conf:ro
      - /var/run/docker.sock:/var/run/docker.sock
      - /sys:/rootfs/sys:ro
      - /proc:/rootfs/proc:ro
      - /etc:/rootfs/etc:ro
    user: telegraf:999
      
volumes:
  data-influx:

telegraf.conf

[global_tags]

[agent]
  interval = "10s"
  round_interval = true
  metric_batch_size = 1000
  metric_buffer_limit = 10000
  flush_buffer_when_full = true
  collection_jitter = "0s"
  flush_interval = "10s"
  flush_jitter = "0s"
  debug = false
  quiet = false
  hostname = "LoubVM"

[[outputs.influxdb_v2]]
  urls = ["http://influxdb:8086"]
  token = "my-token"
  organization = "myorg"
  bucket = "mybucket"
[[inputs.statsd]]
  protocol = "udp"
  max_tcp_connections = 250
  tcp_keep_alive = false
  service_address = ":8125"
  delete_gauges = true
  delete_counters = true
  delete_sets = true
  delete_timings = true
  percentiles = [90]
  metric_separator = "_"
  parse_data_dog_tags = false
  allowed_pending_messages = 10000
  percentile_limit = 1000

[[inputs.cpu]]
  percpu = true
  totalcpu = true

[[inputs.disk]]
  mount_points = ["/"]

[[inputs.diskio]]

[[inputs.kernel]]

[[inputs.mem]]

[[inputs.processes]]

[[inputs.swap]]

[[inputs.system]]

[[inputs.net]]

[[inputs.netstat]]

[[inputs.interrupts]]

[[inputs.linux_sysctl_fs]]

[[inputs.docker]]
  endpoint = "unix:///var/run/docker.sock"
  gather_services = false
  source_tag = false
  container_name_include = []
  container_name_exclude = []
  timeout = "5s"
  total = false
  docker_label_include = []
  docker_label_exclude = []

ux6nzvsh1#

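According to my cloud provider, this happens because my Ubuntu VM is a VPS (virtual private server). Telegraf picks up some of the hypervisor's metrics, which end up as wrong data in InfluxDB/Grafana.
My workaround was to build some new metrics with a custom script, scheduled by cron, that sends them to statsd (a Telegraf component you can use to push data into Telegraf).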

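#!/usr/bin/env bash

# Send one metric to Telegraf's statsd listener over UDP (port 8125).
function SendToStatsd {
  local measurement=$1
  local value=$2
  # "|c" marks the value as a statsd counter
  echo "${measurement}:${value}|c" | nc -w10 -u 127.0.0.1 8125
}

## Uptime in seconds, read from /proc/uptime # Threshold in grafana 2419200 (4 weeks)
myUptime=$(awk '{print $1}' /proc/uptime)
SendToStatsd myUptime "$myUptime"

## Swap: total swap in KiB (second column of the Swap row from free)
mySwapTotal=$(free | awk '/^Swap/ {print $2}')
SendToStatsd mySwapTotal "$mySwapTotal"

# etc etc ...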
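
The script above sends every value as a statsd counter (`|c`). Uptime and swap totals are point-in-time readings, so the statsd gauge type (`|g`) may be a closer fit. This is only a sketch of that variant, not what the answer actually ran; `statsd_line` and `send_gauge` are hypothetical names, and the listener address is assumed to be the same 127.0.0.1:8125 as above.

```shell
# Build one statsd datagram of the form <name>:<value>|<type>.
# The type defaults to "g" (gauge) when the third argument is omitted.
statsd_line() {
  printf '%s:%s|%s\n' "$1" "$2" "${3:-g}"
}

# Send a gauge to the statsd listener (assumed to be 127.0.0.1:8125).
send_gauge() {
  statsd_line "$1" "$2" g | nc -w1 -u 127.0.0.1 8125
}
```

For example, `send_gauge myUptime "$(awk '{print $1}' /proc/uptime)"` would send a line such as `myUptime:7200.15|g`. With `delete_gauges = true` in telegraf.conf, the gauge is cleared after each collection interval, so the cron job needs to keep sending it.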
