我在kubernetes集群中部署了一个efk堆栈。
我对它进行了配置,fluentd将获取nginx日志和php日志(都是json格式的,每行都有一个json日志)。
这是我的配置:
fluent.conf: |-
@include custom.conf
@include conf.d/*.conf
<match**>
@type elasticsearch
@id out_es
@log_level info
include_tag_key true
host "#{ENV['FLUENT_ELASTICSEARCH_HOST']}"
port "#{ENV['FLUENT_ELASTICSEARCH_PORT']}"
path "#{ENV['FLUENT_ELASTICSEARCH_PATH']}"
scheme "#{ENV['FLUENT_ELASTICSEARCH_SCHEME'] || 'http'}"
ssl_verify "#{ENV['FLUENT_ELASTICSEARCH_SSL_VERIFY'] || 'true'}"
ssl_version "#{ENV['FLUENT_ELASTICSEARCH_SSL_VERSION'] || 'TLSv1'}"
reload_connections "#{ENV['FLUENT_ELASTICSEARCH_RELOAD_CONNECTIONS'] || 'false'}"
reconnect_on_error "#{ENV['FLUENT_ELASTICSEARCH_RECONNECT_ON_ERROR'] || 'true'}"
reload_on_failure "#{ENV['FLUENT_ELASTICSEARCH_RELOAD_ON_FAILURE'] || 'true'}"
log_es_400_reason "#{ENV['FLUENT_ELASTICSEARCH_LOG_ES_400_REASON'] || 'false'}"
logstash_prefix "#{ENV['FLUENT_ELASTICSEARCH_LOGSTASH_PREFIX'] || 'logstash'}"
logstash_format "#{ENV['FLUENT_ELASTICSEARCH_LOGSTASH_FORMAT'] || 'true'}"
index_name "#{ENV['FLUENT_ELASTICSEARCH_LOGSTASH_INDEX_NAME'] || 'logstash'}"
type_name "#{ENV['FLUENT_ELASTICSEARCH_LOGSTASH_TYPE_NAME'] || 'fluentd'}"
<buffer>
flush_thread_count "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_FLUSH_THREAD_COUNT'] || '8'}"
flush_interval "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_FLUSH_INTERVAL'] || '5s'}"
chunk_limit_size "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_CHUNK_LIMIT_SIZE'] || '2M'}"
queue_limit_length "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_QUEUE_LIMIT_LENGTH'] || '32'}"
retry_max_interval "#{ENV['FLUENT_ELASTICSEARCH_BUFFER_RETRY_MAX_INTERVAL'] || '30'}"
retry_forever true
</buffer>
</match>
custom.conf: |
<match fluent.**>
@type null
</match>
<source>
@type tail
read_from_head true
tag kubernetes.*
path /var/log/k8s/*/*/*.log
pos_file /var/log/k8s/customcontainerlogs.log.pos
format json
<parse>
@type json
json_parser oj
time_type string
time_format %d/%b/%Y:%H:%M:%S %z
</parse>
</source>
基本上,我正在尝试获取所有日志并流式处理它们。
使用上述配置,出于某种原因,我只能获取nginx日志,每个日志看起来像:
{
"_index": "logstash-2021.04.24",
"_type": "_doc",
"_id": "kJMpBXkBcnb7LiWny-tT",
"_version": 1,
"_score": null,
"_source": {
"request": "GET / HTTP/1.1",
"http_referer": "",
"http_user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36",
"time_iso8601": "2021-04-24T18:34:50+00:00",
"status": "404",
"request_time": "0.004",
"header_access_id": "",
"ip": "10.200.11.106",
"@timestamp": "2021-04-24T18:34:50.708281784+00:00",
"tag": "kubernetes.var.log.k8s.api.nginx.api-access.log"
},
"fields": {
"time_iso8601": [
"2021-04-24T18:34:50.000Z"
],
"@timestamp": [
"2021-04-24T18:34:50.708Z"
]
},
"sort": [
1619289290708
]
}
如果我加上 @type none
在parse部分:
<parse>
@type json
@type none
time_type string
time_format %d/%b/%Y:%H:%M:%S %z
</parse>
我可以看到2个日志每次我点击刷新(404从nginx和php错误日志没有找到路由),但他们没有格式化。。。所有这些都以字符串的形式出现在消息中:
php日志示例:
{
"_index": "logstash-2021.04.24",
"_type": "_doc",
"_id": "13QsBXkBuBa2uOG0PQSr",
"_version": 1,
"_score": null,
"_source": {
"message": "{\"email\":\"\",\"channel\":\"api_error_channel\",\"level\":\"WARNING\",\"message\":\"No route found\",\"backtrace\":[\"[Library\\\\HttpKernel\\\\Exception\\\\RouterListenerException] \\/app\\/Api\\/Data\\/bootstrap.php.cache:21384\",\"Library\\\\HttpKernel\\\\EventListener\\\\RouterListener->onKernelRoute\",\"->call_user_func\",\"Library\\\\EventDispatcher\\\\EventDispatcher->doDispatch\",\"Library\\\\EventDispatcher\\\\EventDispatcher->dispatch\",\"Library\\\\HttpKernel\\\\HttpKernel->handleRaw\",\"Library\\\\HttpKernel\\\\HttpKernel->handle\"],\"request\":{\"id\":\"210850ea-a52c-11eb-820b-f66ccf92cc93\",\"date\":\"2021-04-24 18:37:31\",\"path\":\"\\/\"},\"response\":{\"status_code\":404,\"body\":{\"Error\":{\"Code\":100004,\"Message\":\"Invalid route\"}}}}",
"@timestamp": "2021-04-24T18:37:31.688259336+00:00",
"tag": "kubernetes.var.log.k8s.api.php.api_error.log"
},
"fields": {
"@timestamp": [
"2021-04-24T18:37:31.688Z"
]
},
"sort": [
1619289451688
]
}
2021-04-26 14:30:36 +0000 [info]: parsing config file is succeeded path="/fluentd/etc/fluent.conf"
2021-04-26 14:30:36 +0000 [info]: using configuration file: <ROOT>
<match fluent.**>
@type null
</match>
<source>
@type tail
read_from_head true
tag "kubernetes.*"
path "/var/log/k8s/*/*/*.log"
pos_file "/var/log/k8s/innercontainerlogs.log.pos"
<parse>
@type "json"
</parse>
</source>
<match**>
@type elasticsearch
@id out_es
@log_level "debug"
include_tag_key true
host "elasticsearch.monitoring.svc.cluster.local"
port 9200
path ""
scheme http
ssl_verify true
ssl_version TLSv1
reload_connections false
reconnect_on_error true
reload_on_failure true
log_es_400_reason false
logstash_prefix "logstash"
logstash_format true
index_name "logstash"
type_name "fluentd"
<buffer>
flush_thread_count 8
flush_interval 5s
chunk_limit_size 2M
queue_limit_length 32
retry_max_interval 30
retry_forever true
</buffer>
</match>
</ROOT>
2021-04-26 14:30:36 +0000 [info]: starting fluentd-1.4.2 pid=7 ruby="2.6.3"
2021-04-26 14:30:36 +0000 [info]: spawn command to main: cmdline=["/usr/local/bin/ruby", "-Eascii-8bit:ascii-8bit", "/fluentd/vendor/bundle/ruby/2.6.0/bin/fluentd", "-c", "/fluentd/etc/fluent.conf", "-p", "/fluentd/plugins", "--gemfile", "/fluentd/Gemfile", "--under-supervisor"]
2021-04-26 14:30:38 +0000 [info]: gem 'fluent-plugin-concat' version '2.3.0'
2021-04-26 14:30:38 +0000 [info]: gem 'fluent-plugin-elasticsearch' version '3.4.3'
2021-04-26 14:30:38 +0000 [info]: gem 'fluent-plugin-grok-parser' version '2.5.1'
2021-04-26 14:30:38 +0000 [info]: gem 'fluent-plugin-json-in-json-2' version '1.0.2'
2021-04-26 14:30:38 +0000 [info]: gem 'fluent-plugin-kubernetes_metadata_filter' version '2.1.6'
2021-04-26 14:30:38 +0000 [info]: gem 'fluent-plugin-multi-format-parser' version '1.0.0'
2021-04-26 14:30:38 +0000 [info]: gem 'fluent-plugin-prometheus' version '1.3.0'
2021-04-26 14:30:38 +0000 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '2.1.1'
2021-04-26 14:30:38 +0000 [info]: gem 'fluent-plugin-systemd' version '1.0.2'
2021-04-26 14:30:38 +0000 [info]: gem 'fluentd' version '1.4.2'
2021-04-26 14:30:38 +0000 [info]: adding match pattern="fluent.**" type="null"
2021-04-26 14:30:38 +0000 [info]: adding match pattern="**" type="elasticsearch"
2021-04-26 14:30:42 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. Connection refused - connect(2) for 10.109.189.187:9200 (Errno::ECONNREFUSED)
2021-04-26 14:30:46 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. Connection refused - connect(2) for 10.109.189.187:9200 (Errno::ECONNREFUSED)
2021-04-26 14:30:54 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. Connection refused - connect(2) for 10.109.189.187:9200 (Errno::ECONNREFUSED)
2021-04-26 14:31:10 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. Connection refused - connect(2) for 10.109.189.187:9200 (Errno::ECONNREFUSED)
2021-04-26 14:31:42 +0000 [warn]: #0 [out_es] Could not communicate to Elasticsearch, resetting connection and trying again. Connection refused - connect(2) for 10.109.189.187:9200 (Errno::ECONNREFUSED)
2021-04-26 14:31:42 +0000 [warn]: #0 [out_es] Detected ES 7.x or above: `_doc` will be used as the document `_type`.
2021-04-26 14:31:42 +0000 [info]: adding source type="tail"
2021-04-26 14:31:42 +0000 [info]: #0 starting fluentd worker pid=10 ppid=7 worker=0
2021-04-26 14:31:42 +0000 [debug]: #0 [out_es] buffer started instance=70277432939320 stage_size=0 queue_size=0
2021-04-26 14:31:42 +0000 [debug]: #0 [out_es] flush_thread actually running
2021-04-26 14:31:42 +0000 [debug]: #0 [out_es] flush_thread actually running
2021-04-26 14:31:42 +0000 [debug]: #0 [out_es] flush_thread actually running
2021-04-26 14:31:42 +0000 [debug]: #0 [out_es] flush_thread actually running
2021-04-26 14:31:42 +0000 [debug]: #0 [out_es] flush_thread actually running
2021-04-26 14:31:42 +0000 [debug]: #0 [out_es] flush_thread actually running
2021-04-26 14:31:42 +0000 [debug]: #0 [out_es] flush_thread actually running
2021-04-26 14:31:42 +0000 [debug]: #0 [out_es] flush_thread actually running
2021-04-26 14:31:42 +0000 [debug]: #0 [out_es] enqueue_thread actually running
2021-04-26 14:31:42 +0000 [info]: #0 following tail of /var/log/k8s/api/nginx/api-access.log
2021-04-26 14:31:42 +0000 [info]: #0 following tail of /var/log/k8s/api/nginx/api-error.log
2021-04-26 14:31:42 +0000 [info]: #0 following tail of /var/log/k8s/api/nginx/error.log
2021-04-26 14:31:42 +0000 [info]: #0 following tail of /var/log/k8s/api/nginx/access.log
2021-04-26 14:31:42 +0000 [info]: #0 following tail of /var/log/k8s/api/php/api_error.log
2021-04-26 14:31:42 +0000 [info]: #0 fluentd worker is now running worker=0
我能做些什么来解决这个问题?
编辑#1:
我检查了错误日志,得到了以下信息:
2021-04-25 16:44:35 +0000 [warn]: #0 dump an error event: error_class=Fluent::Plugin::Parser::ParserError error="pattern not match with data 'No route found'" location=nil tag="kubernetes.var.log.k8s.api.php.api_error.log" time=2021-04-25 16:44:35.639766416 +0000 record={"email"=>"", "channel"=>"api_error_channel", "level"=>"WARNING", "message"=>"No route found", "backtrace"=>["[Library\\HttpKernel\\Exception\\RouterListenerException] /app/Api/Data/bootstrap.php.cache:21384", "Library\\HttpKernel\\EventListener\\RouterListener->onKernelRoute", "->call_user_func", "Library\\EventDispatcher\\EventDispatcher->doDispatch", "Library\\EventDispatcher\\EventDispatcher->dispatch", "Library\\HttpKernel\\HttpKernel->handleRaw", "Library\\HttpKernel\\HttpKernel->handle"], "request"=>{"id"=>"84a434e8-a5e5-11eb-b41b-c629ad91f8c7", "date"=>"2021-04-25 16:44:35", "path"=>"/"}, "response"=>{"status_code"=>404, "body"=>{"Error"=>{"Code"=>100004, "Message"=>"Invalid route"}}}}
编辑#2:
我通过验证器运行了原始json日志,它是一个有效的json。
编辑#3:
添加了启动日志
编辑#4:这是调试标准输出日志的示例:
2021-04-27 09:22:42.429274874 +0000 kubernetes.var.log.k8s.api.php.api_error.log: {"email":"","channel":"api_error_channel","level":"WARNING","message":"No route found","backtrace":["[Library\\HttpKernel\\Exception\\RouterListenerException] /app/Api/Data/bootstrap.php.cache:21384","Library\\HttpKernel\\EventListener\\RouterListener->onKernelRoute","->call_user_func","Library\\EventDispatcher\\EventDispatcher->doDispatch","Library\\EventDispatcher\\EventDispatcher->dispatch","Library\\HttpKernel\\HttpKernel->handleRaw","Library\\HttpKernel\\HttpKernel->handle"],"request":{"id":"1e5ce058-a73a-11eb-9e48-1e539c74b43b","date":"2021-04-27 09:22:42","path":"/"},"response":{"status_code":404,"body":{"Error":{"Code":100004,"Message":"Invalid route"}}},"tag":"kubernetes.var.log.k8s.api.php.api_error.log"}
2021-04-27 09:22:37.854071485 +0000 kubernetes.var.log.k8s.api.nginx.api-access.log: {"request":"GET / HTTP/1.1","http_referer":"","http_user_agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/90.0.4430.85 Safari/537.36","time_iso8601":"2021-04-27T09:22:37+00:00","status":"404","request_time":"0.004","header_access_id":"","ip":"10.200.11.106","tag":"kubernetes.var.log.k8s.api.nginx.api-access.log"}
暂无答案!
目前还没有任何答案,快来回答吧!