How do I export Flink task or backpressure related metrics to Prometheus?

y53ybaqx  posted on 2021-06-21 in Flink

I followed the reporter instructions to export Flink metrics to Prometheus, but by default it seems to export only the JobManager-related metrics, as shown below:

When I open http://localhost:9249/, I get only the following output; there are no task or TaskManager metrics in it.


# HELP flink_jobmanager_Status_JVM_Memory_Mapped_MemoryUsed MemoryUsed (scope: jobmanager_Status_JVM_Memory_Mapped)

# TYPE flink_jobmanager_Status_JVM_Memory_Mapped_MemoryUsed gauge

flink_jobmanager_Status_JVM_Memory_Mapped_MemoryUsed{host="localhost",} 0.0

# HELP flink_jobmanager_Status_JVM_ClassLoader_ClassesUnloaded ClassesUnloaded (scope: jobmanager_Status_JVM_ClassLoader)

# TYPE flink_jobmanager_Status_JVM_ClassLoader_ClassesUnloaded gauge

flink_jobmanager_Status_JVM_ClassLoader_ClassesUnloaded{host="localhost",} 0.0

# HELP flink_jobmanager_Status_JVM_GarbageCollector_PS_Scavenge_Time Time (scope: jobmanager_Status_JVM_GarbageCollector_PS_Scavenge)

# TYPE flink_jobmanager_Status_JVM_GarbageCollector_PS_Scavenge_Time gauge

flink_jobmanager_Status_JVM_GarbageCollector_PS_Scavenge_Time{host="localhost",} 273.0

# HELP flink_jobmanager_job_lastCheckpointRestoreTimestamp lastCheckpointRestoreTimestamp (scope: jobmanager_job)

# TYPE flink_jobmanager_job_lastCheckpointRestoreTimestamp gauge

flink_jobmanager_job_lastCheckpointRestoreTimestamp{job_id="dfac65e575f318970e0225eab9688a2e",host="localhost",job_name="Popular_Places_to_Elasticsearch",} -1.0

# HELP flink_jobmanager_job_lastCheckpointAlignmentBuffered lastCheckpointAlignmentBuffered (scope: jobmanager_job)

# TYPE flink_jobmanager_job_lastCheckpointAlignmentBuffered gauge

flink_jobmanager_job_lastCheckpointAlignmentBuffered{job_id="dfac65e575f318970e0225eab9688a2e",host="localhost",job_name="Popular_Places_to_Elasticsearch",} 0.0

# HELP flink_jobmanager_job_lastCheckpointExternalPath lastCheckpointExternalPath (scope: jobmanager_job)

# TYPE flink_jobmanager_job_lastCheckpointExternalPath gauge

flink_jobmanager_job_lastCheckpointExternalPath{job_id="dfac65e575f318970e0225eab9688a2e",host="localhost",job_name="Popular_Places_to_Elasticsearch",} 0.0

# HELP flink_jobmanager_Status_JVM_Memory_Direct_TotalCapacity TotalCapacity (scope: jobmanager_Status_JVM_Memory_Direct)

# TYPE flink_jobmanager_Status_JVM_Memory_Direct_TotalCapacity gauge

flink_jobmanager_Status_JVM_Memory_Direct_TotalCapacity{host="localhost",} 2374599.0

# HELP flink_jobmanager_Status_JVM_Threads_Count Count (scope: jobmanager_Status_JVM_Threads)

# TYPE flink_jobmanager_Status_JVM_Threads_Count gauge

flink_jobmanager_Status_JVM_Threads_Count{host="localhost",} 47.0

# HELP flink_jobmanager_Status_JVM_Memory_Heap_Committed Committed (scope: jobmanager_Status_JVM_Memory_Heap)

# TYPE flink_jobmanager_Status_JVM_Memory_Heap_Committed gauge

flink_jobmanager_Status_JVM_Memory_Heap_Committed{host="localhost",} 1.058013184E9

# HELP flink_jobmanager_Status_JVM_Memory_NonHeap_Used Used (scope: jobmanager_Status_JVM_Memory_NonHeap)

# TYPE flink_jobmanager_Status_JVM_Memory_NonHeap_Used gauge

flink_jobmanager_Status_JVM_Memory_NonHeap_Used{host="localhost",} 7.02964E7

# HELP flink_jobmanager_job_restartingTime restartingTime (scope: jobmanager_job)

# TYPE flink_jobmanager_job_restartingTime gauge

flink_jobmanager_job_restartingTime{job_id="dfac65e575f318970e0225eab9688a2e",host="localhost",job_name="Popular_Places_to_Elasticsearch",} 0.0

# HELP flink_jobmanager_Status_JVM_GarbageCollector_PS_Scavenge_Count Count (scope: jobmanager_Status_JVM_GarbageCollector_PS_Scavenge)

# TYPE flink_jobmanager_Status_JVM_GarbageCollector_PS_Scavenge_Count gauge

flink_jobmanager_Status_JVM_GarbageCollector_PS_Scavenge_Count{host="localhost",} 24.0

# HELP flink_jobmanager_Status_JVM_Memory_NonHeap_Committed Committed (scope: jobmanager_Status_JVM_Memory_NonHeap)

# TYPE flink_jobmanager_Status_JVM_Memory_NonHeap_Committed gauge

flink_jobmanager_Status_JVM_Memory_NonHeap_Committed{host="localhost",} 7.2876032E7

# HELP flink_jobmanager_Status_JVM_GarbageCollector_PS_MarkSweep_Count Count (scope: jobmanager_Status_JVM_GarbageCollector_PS_MarkSweep)

# TYPE flink_jobmanager_Status_JVM_GarbageCollector_PS_MarkSweep_Count gauge

flink_jobmanager_Status_JVM_GarbageCollector_PS_MarkSweep_Count{host="localhost",} 2.0

# HELP flink_jobmanager_job_downtime downtime (scope: jobmanager_job)

# TYPE flink_jobmanager_job_downtime gauge

flink_jobmanager_job_downtime{job_id="dfac65e575f318970e0225eab9688a2e",host="localhost",job_name="Popular_Places_to_Elasticsearch",} 0.0

# HELP flink_jobmanager_job_numberOfFailedCheckpoints numberOfFailedCheckpoints (scope: jobmanager_job)

# TYPE flink_jobmanager_job_numberOfFailedCheckpoints gauge

flink_jobmanager_job_numberOfFailedCheckpoints{job_id="dfac65e575f318970e0225eab9688a2e",host="localhost",job_name="Popular_Places_to_Elasticsearch",} 0.0

# HELP flink_jobmanager_job_numberOfInProgressCheckpoints numberOfInProgressCheckpoints (scope: jobmanager_job)

# TYPE flink_jobmanager_job_numberOfInProgressCheckpoints gauge

flink_jobmanager_job_numberOfInProgressCheckpoints{job_id="dfac65e575f318970e0225eab9688a2e",host="localhost",job_name="Popular_Places_to_Elasticsearch",} 0.0

# HELP flink_jobmanager_job_numberOfCompletedCheckpoints numberOfCompletedCheckpoints (scope: jobmanager_job)

# TYPE flink_jobmanager_job_numberOfCompletedCheckpoints gauge

flink_jobmanager_job_numberOfCompletedCheckpoints{job_id="dfac65e575f318970e0225eab9688a2e",host="localhost",job_name="Popular_Places_to_Elasticsearch",} 961.0

# HELP flink_jobmanager_Status_JVM_GarbageCollector_PS_MarkSweep_Time Time (scope: jobmanager_Status_JVM_GarbageCollector_PS_MarkSweep)

# TYPE flink_jobmanager_Status_JVM_GarbageCollector_PS_MarkSweep_Time gauge

flink_jobmanager_Status_JVM_GarbageCollector_PS_MarkSweep_Time{host="localhost",} 110.0

# HELP flink_jobmanager_Status_JVM_Memory_Mapped_Count Count (scope: jobmanager_Status_JVM_Memory_Mapped)

# TYPE flink_jobmanager_Status_JVM_Memory_Mapped_Count gauge

flink_jobmanager_Status_JVM_Memory_Mapped_Count{host="localhost",} 0.0

# HELP flink_jobmanager_Status_JVM_CPU_Load Load (scope: jobmanager_Status_JVM_CPU)

# TYPE flink_jobmanager_Status_JVM_CPU_Load gauge

flink_jobmanager_Status_JVM_CPU_Load{host="localhost",} 0.0025814303680169446

# HELP flink_jobmanager_Status_JVM_ClassLoader_ClassesLoaded ClassesLoaded (scope: jobmanager_Status_JVM_ClassLoader)

# TYPE flink_jobmanager_Status_JVM_ClassLoader_ClassesLoaded gauge

flink_jobmanager_Status_JVM_ClassLoader_ClassesLoaded{host="localhost",} 7693.0

# HELP flink_jobmanager_Status_JVM_Memory_Heap_Max Max (scope: jobmanager_Status_JVM_Memory_Heap)

# TYPE flink_jobmanager_Status_JVM_Memory_Heap_Max gauge

flink_jobmanager_Status_JVM_Memory_Heap_Max{host="localhost",} 1.058013184E9

# HELP flink_jobmanager_job_uptime uptime (scope: jobmanager_job)

# TYPE flink_jobmanager_job_uptime gauge

flink_jobmanager_job_uptime{job_id="dfac65e575f318970e0225eab9688a2e",host="localhost",job_name="Popular_Places_to_Elasticsearch",} 4811388.0

# HELP flink_jobmanager_Status_JVM_CPU_Time Time (scope: jobmanager_Status_JVM_CPU)

# TYPE flink_jobmanager_Status_JVM_CPU_Time gauge

flink_jobmanager_Status_JVM_CPU_Time{host="localhost",} 1.044894698E11

# HELP flink_jobmanager_Status_JVM_Memory_Direct_Count Count (scope: jobmanager_Status_JVM_Memory_Direct)

# TYPE flink_jobmanager_Status_JVM_Memory_Direct_Count gauge

flink_jobmanager_Status_JVM_Memory_Direct_Count{host="localhost",} 60.0

# HELP flink_jobmanager_Status_JVM_Memory_Heap_Used Used (scope: jobmanager_Status_JVM_Memory_Heap)

# TYPE flink_jobmanager_Status_JVM_Memory_Heap_Used gauge

flink_jobmanager_Status_JVM_Memory_Heap_Used{host="localhost",} 2.15962464E8

# HELP flink_jobmanager_job_lastCheckpointDuration lastCheckpointDuration (scope: jobmanager_job)

# TYPE flink_jobmanager_job_lastCheckpointDuration gauge

flink_jobmanager_job_lastCheckpointDuration{job_id="dfac65e575f318970e0225eab9688a2e",host="localhost",job_name="Popular_Places_to_Elasticsearch",} 16.0

# HELP flink_jobmanager_Status_JVM_Memory_NonHeap_Max Max (scope: jobmanager_Status_JVM_Memory_NonHeap)

# TYPE flink_jobmanager_Status_JVM_Memory_NonHeap_Max gauge

flink_jobmanager_Status_JVM_Memory_NonHeap_Max{host="localhost",} -1.0

# HELP flink_jobmanager_job_lastCheckpointSize lastCheckpointSize (scope: jobmanager_job)

# TYPE flink_jobmanager_job_lastCheckpointSize gauge

flink_jobmanager_job_lastCheckpointSize{job_id="dfac65e575f318970e0225eab9688a2e",host="localhost",job_name="Popular_Places_to_Elasticsearch",} 271280.0

# HELP flink_jobmanager_job_fullRestarts fullRestarts (scope: jobmanager_job)

# TYPE flink_jobmanager_job_fullRestarts gauge

flink_jobmanager_job_fullRestarts{job_id="dfac65e575f318970e0225eab9688a2e",host="localhost",job_name="Popular_Places_to_Elasticsearch",} 0.0

# HELP flink_jobmanager_Status_JVM_Memory_Direct_MemoryUsed MemoryUsed (scope: jobmanager_Status_JVM_Memory_Direct)

# TYPE flink_jobmanager_Status_JVM_Memory_Direct_MemoryUsed gauge

flink_jobmanager_Status_JVM_Memory_Direct_MemoryUsed{host="localhost",} 2374600.0

# HELP flink_jobmanager_job_totalNumberOfCheckpoints totalNumberOfCheckpoints (scope: jobmanager_job)

# TYPE flink_jobmanager_job_totalNumberOfCheckpoints gauge

flink_jobmanager_job_totalNumberOfCheckpoints{job_id="dfac65e575f318970e0225eab9688a2e",host="localhost",job_name="Popular_Places_to_Elasticsearch",} 961.0

# HELP flink_jobmanager_Status_JVM_Memory_Mapped_TotalCapacity TotalCapacity (scope: jobmanager_Status_JVM_Memory_Mapped)

# TYPE flink_jobmanager_Status_JVM_Memory_Mapped_TotalCapacity gauge

flink_jobmanager_Status_JVM_Memory_Mapped_TotalCapacity{host="localhost",} 0.0

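My question is: how do I export task or backpressure related metrics, such as numRecordsIn, numRecordsInPerSecond, and numRecordsOut, to Prometheus? Is there anything else I need to configure?
By the way, my test environment is Flink 1.5.2, with the JobManager and TaskManager on the same Windows machine, and I see the same problem with Flink 1.6.0.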

n53p2ov0  #1

When the JobManager and TaskManager run on the same host, each needs its own port. In flink-conf.yaml you can configure a range of ports, for example:

metrics.reporters: prom
metrics.reporter.prom.class: org.apache.flink.metrics.prometheus.PrometheusReporter
metrics.reporter.prom.port: 9250-9260

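If you do that, each process takes a free port from the range, so you will find the JobManager's metrics at http://localhost:9250 and the TaskManager's metrics at http://localhost:9251 (assuming the JobManager starts first). With a single fixed port such as 9249, only one of the two processes can bind it, which is why only JobManager metrics showed up.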
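
If it is not obvious which process ended up on which port, a small script can probe the range and report what each port serves. The following is only a minimal sketch: the helper names `components_in` and `probe` are made up for illustration, and it assumes the reporter answers plain HTTP GET requests on `/` the way the question's http://localhost:9249/ output suggests.

```python
import http.client
import re
import urllib.request

# Matches the component right after the "flink_" prefix at the start of a
# sample line, e.g. "jobmanager" in "flink_jobmanager_job_uptime{...} 1.0".
# Comment lines ("# HELP ...", "# TYPE ...") start with "#" and never match.
_COMPONENT = re.compile(r"^flink_(jobmanager|taskmanager)_", re.MULTILINE)


def components_in(exposition_text):
    """Return the set of Flink components that have samples in the text."""
    return set(_COMPONENT.findall(exposition_text))


def probe(host="localhost", ports=range(9250, 9261), timeout=2.0):
    """Map every responding port to the components it reports.

    The default range mirrors metrics.reporter.prom.port: 9250-9260.
    Ports with nothing listening are skipped.
    """
    found = {}
    for port in ports:
        url = "http://{}:{}/".format(host, port)
        try:
            with urllib.request.urlopen(url, timeout=timeout) as resp:
                body = resp.read().decode("utf-8", errors="replace")
        except (OSError, http.client.HTTPException):
            continue  # connection refused, timeout, or not an HTTP server
        found[port] = components_in(body)
    return found
```

Calling `probe()` on the machine that runs both processes should return a dictionary keyed by port; a port whose set contains `"taskmanager"` is the one to point the taskmanager scrape job at.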
You also need to adjust prometheus.yml to match:

scrape_configs:
  - job_name: 'jobmanager'
    static_configs:
    - targets: ['localhost:9250']
  - job_name: 'taskmanager'
    static_configs:
    - targets: ['localhost:9251']
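
Once the TaskManager endpoint is being scraped, the task-level metrics from the question (numRecordsIn, numRecordsOut, numRecordsInPerSecond) should appear in its output. As a rough sketch, the function below filters them out of a scraped text page. The name `task_samples` is hypothetical, and the full metric names (for example `flink_taskmanager_job_task_numRecordsIn`) are an assumption based on the underscore-joined scope naming visible in the question's output, so check them against what your endpoint actually returns.

```python
import re

# One sample line: metric name, optional {labels}, value, optional timestamp.
_SAMPLE = re.compile(
    r"^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{.*\})?\s+(\S+)(?:\s+-?\d+)?$"
)

DEFAULT_SUFFIXES = ("numRecordsIn", "numRecordsOut", "numRecordsInPerSecond")


def task_samples(exposition_text, wanted=DEFAULT_SUFFIXES):
    """Return (metric_name, value) pairs for TaskManager samples whose
    metric name ends with one of the wanted suffixes."""
    endings = tuple("_" + suffix for suffix in wanted)
    samples = []
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE.match(line)
        if match is None:
            continue  # not a sample line this sketch understands
        name, value = match.group(1), match.group(2)
        if name.startswith("flink_taskmanager_") and name.endswith(endings):
            samples.append((name, float(value)))
    return samples
```

Passing it the body fetched from the TaskManager port (http://localhost:9251 in the configuration above) gives a quick check that the task metrics are exported before looking for them in Prometheus itself. An empty result would suggest either that the port belongs to the JobManager or that no job is currently running on that TaskManager.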
