fluentd is not filtering as intended before writing to Elasticsearch

Posted by qco9c6ql on 2021-06-10 in ElasticSearch

Using:
Elasticsearch 7.5.1
fluentd 1.11.2
fluent-plugin-elasticsearch 4.1.3
Spring Boot 2.3.3
I have a Spring Boot artifact whose logback configuration includes an appender that, in addition to the app's stdout, sends logs to fluentd:

<appender name="FLUENT_TEXT"
          class="ch.qos.logback.more.appenders.DataFluentAppender">
    <filter class="ch.qos.logback.classic.filter.ThresholdFilter">
        <level>INFO</level>
    </filter>

    <tag>myapp</tag>
    <label>myservicename</label>
    <remoteHost>fluentdservicename</remoteHost>
    <port>24224</port>
    <useEventTime>false</useEventTime>
</appender>

The fluentd configuration file looks like this:

<ROOT>
  <source>
    @type forward
    port 24224
    bind "0.0.0.0"
  </source>

  <filter myapp.**>
    @type parser
    key_name "message"
    reserve_data true
    remove_key_name_field false
    <parse>
      @type "json"
    </parse>
  </filter>

  <match myapp.**>
    @type copy
    <store>
      @type "elasticsearch"
      host "elasticdb"
      port 9200
      logstash_format true
      logstash_prefix "applogs"
      logstash_dateformat "%Y%m%d"
      include_tag_key true
      type_name "app_log"
      tag_key "@log_name"
      flush_interval 1s
      user "elastic"
      password xxxxxx
      <buffer>
        flush_interval 1s
      </buffer>
    </store>
    <store>
      @type "stdout"
    </store>
  </match>
</ROOT>

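So it just adds a filter that parses the message field (a JSON string) into structured fields and then writes the record to Elasticsearch (as well as to fluentd's stdout). Note that I used the myapp.** tag pattern so that it matches in both the filter and the match blocks.
Everything is up and running properly in OpenShift. Spring Boot sends the logs to fluentd correctly, and fluentd writes them to Elasticsearch.
The problem is that every log the app generates gets written as well. That means every INFO log, for example the initial Spring configuration or anything else the app sends through logback, ends up in Elasticsearch.
Example of a "wanted" log: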

2020-11-04 06:33:42.312840352 +0000 myapp.myservice: {"traceId":"bf8195d9-16dd-4e58-a0aa-413d89a1eca9","spanId":"f597f7ffbe722fa7","spanExportable":"false","X-Span-Export":"false","level":"INFO","X-B3-SpanId":"f597f7ffbe722fa7","idOrq":"bf8195d9-16dd-4e58-a0aa-413d89a1eca9","logger":"es.organization.project.myapp.commons.services.impl.LoggerServiceImpl","X-B3-TraceId":"f597f7ffbe722fa7","thread":"http-nio-8085-exec-1","message":"{\"traceId\":\"bf8195d9-16dd-4e58-a0aa-413d89a1eca9\",\"inout\":\"IN\",\"startTime\":1604471622281,\"finishTime\":null,\"executionTime\":null,\"entrySize\":5494.0,\"exitSize\":null,\"differenceSize\":null,\"user\":\"pmmartin\",\"methodPath\":\"Method Path\",\"errorMessage\":null,\"className\":\"CamelOrchestrator\",\"methodName\":\"preauthorization_validate\"}","idOp":"","inout":"IN","startTime":1604471622281,"finishTime":null,"executionTime":null,"entrySize":5494.0,"exitSize":null,"differenceSize":null,"user":"pmmartin","methodPath":"Method Path","errorMessage":null,"className":"CamelOrchestrator","methodName":"preauthorization_validate"}

Example of "unwanted" logs (note how each unexpected log message comes with a fluentd warning):

2020-11-04 06:55:09.000000000 +0000 myapp.myservice: {"level":"INFO","logger":"org.apache.camel.impl.engine.InternalRouteStartupManager","thread":"restartedMain","message":"Route: route6 started and consuming from: servlet:/preAuth"}
2020-11-04 06:55:09 +0000 [warn]: #0 dump an error event: error_class=Fluent::Plugin::Parser::ParserError error="pattern not matched with data 'Total 20 routes, of which 20 are started'" location=nil tag="myapp.myservice" time=1604472909 record={"level"=>"INFO", "logger"=>"org.apache.camel.impl.engine.AbstractCamelContext", "thread"=>"restartedMain", "message"=>"Total 20 routes, of which 20 are started"}
2020-11-04 06:55:09.000000000 +0000 myapp.myservice: {"level":"INFO","logger":"org.apache.camel.impl.engine.AbstractCamelContext","thread":"restartedMain","message":"Total 20 routes, of which 20 are started"}
2020-11-04 06:55:09 +0000 [warn]: #0 dump an error event: error_class=Fluent::Plugin::Parser::ParserError error="pattern not matched with data 'Apache Camel 3.5.0 (MyService DEMO Mode) started in 0.036 seconds'" location=nil tag="myapp.myservice" time=1604472909 record={"level"=>"INFO", "logger"=>"org.apache.camel.impl.engine.AbstractCamelContext", "thread"=>"restartedMain", "message"=>"Apache Camel 3.5.0 (MyService DEMO Mode) started in 0.036 seconds"}
2020-11-04 06:55:09.000000000 +0000 myapp.myservice: {"level":"INFO","logger":"org.apache.camel.impl.engine.AbstractCamelContext","thread":"restartedMain","message":"Apache Camel 3.5.0 (MyService DEMO Mode) started in 0.036 seconds"}
2020-11-04 06:55:09 +0000 [warn]: #0 dump an error event: error_class=Fluent::Plugin::Parser::ParserError error="pattern not matched with data 'Started MyServiceApplication in 15.446 seconds (JVM running for 346.061)'" location=nil tag="myapp.myservice" time=1604472909 record={"level"=>"INFO", "logger"=>"es.organization.project.myapp.MyService", "thread"=>"restartedMain", "message"=>"Started MyService in 15.446 seconds (JVM running for 346.061)"}
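
A rough Python model of the parser filter configured above may help explain these warnings. It is only a sketch of the behaviour visible in the logs, not fluentd's actual implementation; the function name parse_filter and its return shape are made up for illustration. With reserve_data true, a message that is not JSON produces a warning but the original record still continues to the match block:

```python
import json


def parse_filter(record, key_name="message", reserve_data=True):
    """Hypothetical model of the parser filter: returns (record_or_None, warning_or_None)."""
    raw = record.get(key_name)
    try:
        parsed = json.loads(raw)
        if not isinstance(parsed, dict):
            raise ValueError("not a JSON object")
    except (TypeError, ValueError):
        # Parsing failed: a warning is logged, and with reserve_data
        # the untouched record is still passed downstream.
        warning = f"pattern not matched with data '{raw}'"
        return (dict(record) if reserve_data else None), warning
    # Parsing succeeded: the parsed keys are merged into the original record.
    merged = dict(record) if reserve_data else {}
    merged.update(parsed)
    return merged, None


wanted = {"level": "INFO", "message": '{"inout": "IN", "user": "pmmartin"}'}
unwanted = {"level": "INFO", "message": "Total 20 routes, of which 20 are started"}

print(parse_filter(wanted))
print(parse_filter(unwanted))
```

Under this model the parser filter never drops anything by itself, which matches the logs above where each warning is followed by the same record on stdout.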

The question is: what do I have to tell fluentd, and how, so that it really filters the incoming records and the unwanted ones are discarded?

4xy9mtcn1#

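Thanks to @azeem, and following the documentation for the grep filter and its regexp section, I got it working :).
I just added this to my fluentd configuration file: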

<filter myapp.**>
  @type grep
  <regexp>
    key message
    pattern /^.*inout.*$/
  </regexp>
</filter>

Any record whose message does not contain the word "inout" is now excluded.
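
To illustrate what that pattern keeps and drops, here is a minimal Python sketch of the same check. It is an approximation under stated assumptions: grep_filter is a hypothetical helper, not part of fluentd, and Python's re.MULTILINE is used to mimic Ruby's line-based ^ and $ anchors.

```python
import re

# Same expression as in the grep filter; for this purpose it behaves like a
# plain substring test for "inout".
PATTERN = re.compile(r"^.*inout.*$", re.MULTILINE)


def grep_filter(records, key="message"):
    """Keep only records whose `key` field matches PATTERN (hypothetical helper)."""
    return [r for r in records if PATTERN.search(str(r.get(key, "")))]


records = [
    {"message": '{"traceId":"bf8195d9","inout":"IN","user":"pmmartin"}'},
    {"message": "Total 20 routes, of which 20 are started"},
    {"level": "INFO"},  # no message field at all
]

for kept in grep_filter(records):
    print(kept["message"])
```

Note that any free-text message that happens to contain "inout" would also pass. fluentd applies filters in the order they appear in the configuration, so placing the grep filter before the parser filter should drop the non-JSON messages before the parser sees them, which would presumably also silence the "pattern not matched" warnings.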
