I have a log file which comes from spring log file. The log file has three formats. Each of the first two formats is a single line, between them if there is keyword app-info, it is the message printed by own developer. If no, it is printed by spring framework. We may treat developers message different from spring framework ones. The third format is a multiline stack trace.
We have an example for our own format, for example
2018-04-27 10:42:49 [http-nio-8088-exec-1] - INFO - app-info - injectip ip 192.168.16.89
The above line has app-info
key works, so it is our own developers'.
2018-04-27 10:42:23 [RMI TCP Connection(10)-127.0.0.1] - INFO - org.apache.catalina.core.ContainerBase.[Tomcat].[localhost].[/] - Initializing Spring FrameworkServlet 'dispatcherServlet'
The above line has not app-info
keyword, so it is printed by spring framework.
In my Grok filter, The first pattern is for messages printed from spring framework, the second is for developers' message, the third format is for multiline stacktrace. I want to first regex clearly mention that spring framework pattern does not have key word app-info so that it could get paserexception and follow the second pattern which is developers own format. So I have following formats in regex tool , but I got compile error. My regex is as follows:
(?<timestamp>[\d\-\s\:]+)\s\[(?<threadname>[\d\.\w\s\(\)\-]+)\]\s-\s(?<loglevel>[\w]+)\s+-\s+(?<systemmsg>[^((?app-info).)*\s\.\w\-\'\:\d\[\]\/]+)
since in Grok filter, I use instruction from this link
filter {
grok {
match => [ "message", "PATTERN1", "PATTERN2" , "PATTERN3" ]
}
}
My current configure in logstash is as follows which does not mention app-info clearly in the pattern:
filter {
grok {
match => [
"message",
'(?<timestamp>[\d\-\s\:]+)\s\[(?<threadname>[\d\.\w\s\(\)\-]+)\]\s-\s(?<loglevel>[\w]+)\s+-\s+(?<systemmsg>[\s\.\w\-\'\:\d\[\]\/^[app-info]]+)',
'(?<timestamp>[\d\-\s\:]+)\s\[(?<threadname>[\d\.\w\s\(\)\-]+)\]\s-\s(?<loglevel>[\w]+)\s+-\s(?<appinfo>app-info)\s-\s(?<systemmsg>[\w\d\:\{\}\,\-\(\)\s\"]+)',
'(?<timestamp>[\d\-\s\:]+)\s\[(?<threadname>[\w\-\d]+)\]\s-\s(?<loglevel>[\w]+)\s\-\s(?<appinfo>app-info)\s-\s(?<params>params):(?<jsonstr>[\"\w\d\,\:\.\{\}]+)\s(?<exceptionname>[\w\d\.]+Exception):\s(?<exceptiondetail>[\w\d\.]+)\n\t(?<extralines>at[\s\w\.\d\~\?\n\t\(\)\_\[\]\/\:\-]+)\n\d'
]
}
}
With the format in above logstash configuration, when handling with
2018-04-27 10:42:49 [http-nio-8088-exec-1] - INFO - app-info - injectip ip 192.168.16.89
The first pattern(spring framework pattern) already works, so it does not fall into second pattern which is our own developers format. The parser has parsered successfully as follows:
{
"timestamp": [
[
"2018-04-27 10:42:49"
]
],
"threadname": [
[
"http-nio-8088-exec-1"
]
],
"loglevel": [
[
"INFO"
]
],
"systemmsg": [
[
"app-info - injectip ip 192.168.16.89\n\n"
]
]
}
Any hints I could let first pattern clearly mention that systemmsg shall not contain key word "app-info"?
EDIT:
My goal is that if there is no key word app-info, I let pattern 1 to handle the log. If there is key word app-info, I let pattern 2 to handle the log.
With following log which does not contains key word app-info (pattern 1 shall works),
2018-04-27 10:42:23 [RMI TCP Connection(10)-127.0.0.1] - INFO - org.apache.catalina.core.ContainerBase.[Tomcat].[localhost].[/] - Initializing Spring FrameworkServlet 'dispatcherServlet'
I got following result no match with first pattern modified following your suggestion, which is not my goal.
(?<timestamp>[\d\-\s\:]+)\s\[(?<threadname>[\d\.\w\s\(\)\-]+)\]\s-\s(?<loglevel>[\w]+)\s+-\s+(?<systemmsg>[^(?:(?!app\-info).)*\s\.\w\-\'\:\d\[\]\/]+)
see demo . My goal is to extract timestamp, thread name, log level and system msg. But first pattern does not give me the expected result. The tool say there is no match.
if I remove ^(?:(?!app-info).)*, then above log(without key word app-info) parser works. See demo But now, It also works for log which contains key word app-info which is not expected, since now I want to extract timestamp, threadname, loglevel,app-info(exist or not)(the field shall be extracted or grouped), then systemmsg. The expectation is that the first parser returns error, let second parser to handle the log. demo could see the parser also works for log with key word app-info. Systemmsg put field app-info into its value which is not expected.
So I want pattern 1, handles log without keyword app-info, pattern 2 handles log with keyword app-info. So I clearly let pattern 1 throw parse error or exception when it contains key word app-info.
2条答案
按热度按时间nbysray51#
我的目标是让模式1处理没有关键字app-info的日志。如果有app-info,第一个模式将抛出解析错误,以便第二个解析器可以处理日志。
可以使用以下模式作为第一个模式,
如果日志中的任何位置存在
app-info
,它将忽略该日志,并移动到2nd PATTERN
。示例
日志中没有
app-info
、您可以根据自己的要求进行过滤。
现在使用
app-info
进行日志记录,请
如果让
PATTERN1
等于(?<data>^(?!.*app-info).*)
你会得到,
然后可以为
data
字段添加第二个grok筛选器,3pvhb19x2#
我使用了GREEDYDATA,假设您有以下日志行
重定向控制器:点击数据重定向成功:{a:123,b:345}
并且您希望捕获到"data",则按如下所示使用GREEDYDATA
% {GREEDYDATA}数据:% {SPACE} % {模式的其余部分}