Regex for merging repeated lines in Apache logs

Posted by xu3bshqb on 2022-12-23 in Apache

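I am analysing my Apache logs by hand; never mind why, it isn't important ;)
Anyway, I am annoyed by how many entries streaming video produces. An example is below. I would like a regex that matches the repeated lines, ignoring the timestamp and the small variations in bytes transferred, but that still respects a change of IP address and removes nothing beyond the runs of constantly repeated lines.
Sample input: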

172.59.152.20 - - [19/Dec/2022:04:52:54 +0000] "GET /video.mp4 HTTP/1.1" 206 504267 "Mozilla/5.0 (iPhone; CPU iPhone OS 15_6_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Mobile/15E148 Safari/604.1"
172.59.152.20 - - [19/Dec/2022:04:52:55 +0000] "GET /video.mp4 HTTP/1.1" 206 180747 "Mozilla/5.0 (iPhone; CPU iPhone OS 15_6_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Mobile/15E148 Safari/604.1"
172.59.152.20 - - [19/Dec/2022:04:52:56 +0000] "GET /video.mp4 HTTP/1.1" 206 40261 "Mozilla/5.0 (iPhone; CPU iPhone OS 15_6_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Mobile/15E148 Safari/604.1"
172.59.152.20 - - [19/Dec/2022:04:52:56 +0000] "GET /video.mp4 HTTP/1.1" 206 1427820 "Mozilla/5.0 (iPhone; CPU iPhone OS 15_6_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Mobile/15E148 Safari/604.1"
172.59.152.20 - - [19/Dec/2022:04:52:57 +0000] "GET /video.mp4 HTTP/1.1" 206 47938302 "Mozilla/5.0 (iPhone; CPU iPhone OS 15_6_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Mobile/15E148 Safari/604.1"
172.59.152.20 - - [19/Dec/2022:04:53:10 +0000] "GET /video.mp4 HTTP/1.1" 206 9304011 "Mozilla/5.0 (iPhone; CPU iPhone OS 15_6_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Mobile/15E148 Safari/604.1"
172.59.152.20 - - [19/Dec/2022:04:53:17 +0000] "GET /video.mp4 HTTP/1.1" 206 11115723 "Mozilla/5.0 (iPhone; CPU iPhone OS 15_6_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Mobile/15E148 Safari/604.1"
172.59.152.20 - - [19/Dec/2022:04:53:23 +0000] "GET /video.mp4 HTTP/1.1" 206 10468683 "Mozilla/5.0 (iPhone; CPU iPhone OS 15_6_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Mobile/15E148 Safari/604.1"
172.59.152.20 - - [19/Dec/2022:04:53:29 +0000] "GET /video.mp4 HTTP/1.1" 206 4386507 "Mozilla/5.0 (iPhone; CPU iPhone OS 15_6_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Mobile/15E148 Safari/604.1"
172.59.152.20 - - [19/Dec/2022:04:53:36 +0000] "GET /video.mp4 HTTP/1.1" 206 5292363 "Mozilla/5.0 (iPhone; CPU iPhone OS 15_6_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Mobile/15E148 Safari/604.1"
172.59.152.20 - - [19/Dec/2022:04:53:42 +0000] "GET /video.mp4 HTTP/1.1" 206 6780555 "Mozilla/5.0 (iPhone; CPU iPhone OS 15_6_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Mobile/15E148 Safari/604.1"
172.59.152.20 - - [19/Dec/2022:04:53:49 +0000] "GET /video2.mp4 HTTP/1.1" 206 3739467 "Mozilla/5.0 (iPhone; CPU iPhone OS 15_6_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Mobile/15E148 Safari/604.1"
172.59.152.20 - - [19/Dec/2022:04:53:51 +0000] "GET /video2.mp4 HTTP/1.1" 206 202874 "Mozilla/5.0 (iPhone; CPU iPhone OS 15_6_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Mobile/15E148 Safari/604.1"
172.59.152.20 - - [19/Dec/2022:04:53:52 +0000] "GET /video2.mp4 HTTP/1.1" 206 9592368 "Mozilla/5.0 (iPhone; CPU iPhone OS 15_6_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Mobile/15E148 Safari/604.1"
172.59.152.20 - - [19/Dec/2022:04:54:02 +0000] "GET /video2.mp4 HTTP/1.1" 206 7233483 "Mozilla/5.0 (iPhone; CPU iPhone OS 15_6_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Mobile/15E148 Safari/604.1"
172.59.152.20 - - [19/Dec/2022:04:54:08 +0000] "GET /video2.mp4 HTTP/1.1" 206 7427595 "Mozilla/5.0 (iPhone; CPU iPhone OS 15_6_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Mobile/15E148 Safari/604.1"
172.59.152.20 - - [19/Dec/2022:04:54:15 +0000] "GET /video.mp4 HTTP/1.1" 206 10867691 "Mozilla/5.0 (iPhone; CPU iPhone OS 15_6_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Mobile/15E148 Safari/604.1"
172.59.152.20 - - [19/Dec/2022:04:54:21 +0000] "GET /video.mp4 HTTP/1.1" 206 6845259 "Mozilla/5.0 (iPhone; CPU iPhone OS 15_6_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Mobile/15E148 Safari/604.1"
172.59.152.20 - - [19/Dec/2022:04:54:28 +0000] "GET /video.mp4 HTTP/1.1" 206 11568651 "Mozilla/5.0 (iPhone; CPU iPhone OS 15_6_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Mobile/15E148 Safari/604.1"
172.59.152.20 - - [19/Dec/2022:04:54:34 +0000] "GET /video.mp4 HTTP/1.1" 206 10856907 "Mozilla/5.0 (iPhone; CPU iPhone OS 15_6_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Mobile/15E148 Safari/604.1"
172.59.152.20 - - [19/Dec/2022:04:54:41 +0000] "GET /video.mp4 HTTP/1.1" 206 8139339 "Mozilla/5.0 (iPhone; CPU iPhone OS 15_6_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Mobile/15E148 Safari/604.1"
172.59.152.20 - - [19/Dec/2022:04:54:49 +0000] "GET /video.mp4 HTTP/1.1" 206 10792203 "Mozilla/5.0 (iPhone; CPU iPhone OS 15_6_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Mobile/15E148 Safari/604.1"
172.59.152.20 - - [19/Dec/2022:04:54:56 +0000] "GET /video.mp4 HTTP/1.1" 206 10220651 "Mozilla/5.0 (iPhone; CPU iPhone OS 15_6_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Mobile/15E148 Safari/604.1"
172.59.152.20 - - [19/Dec/2022:04:55:02 +0000] "GET /video.mp4 HTTP/1.1" 206 10468683 "Mozilla/5.0 (iPhone; CPU iPhone OS 15_6_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Mobile/15E148 Safari/604.1"
172.59.152.20 - - [19/Dec/2022:04:55:09 +0000] "GET /video.mp4 HTTP/1.1" 206 9109899 "Mozilla/5.0 (iPhone; CPU iPhone OS 15_6_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Mobile/15E148 Safari/604.1"

Expected output:

172.59.152.20 - - [19/Dec/2022:04:53:42 +0000] "GET /video.mp4 HTTP/1.1" 206 6780555 "Mozilla/5.0 (iPhone; CPU iPhone OS 15_6_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Mobile/15E148 Safari/604.1"
172.59.152.20 - - [19/Dec/2022:04:53:49 +0000] "GET /video2.mp4 HTTP/1.1" 206 3739467 "Mozilla/5.0 (iPhone; CPU iPhone OS 15_6_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Mobile/15E148 Safari/604.1"
172.59.152.20 - - [19/Dec/2022:04:54:15 +0000] "GET /video.mp4 HTTP/1.1" 206 10867691 "Mozilla/5.0 (iPhone; CPU iPhone OS 15_6_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/15.6.1 Mobile/15E148 Safari/604.1"

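It does not actually matter whether the first or the last line of each matching run is kept.
I did look at uniq, but it does not seem able to track the IP, skip the timestamp, match the content, and then ignore the rest of the line.
For example, here is a regex for a single line: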

Group 1 - IP, 2 - Timestamp, 3 - Content, 4 - HTTP Response Code, 5 - Bytes, 6 - Browser / OS
(\d+.\d+.\d+.\d+) - - (.*) \+0000(.*)HTTP\/1.1\" (\d{3}) (\d{1,8})(.*)

If I want to ignore groups 2 and 5 but keep the rest, how would I do that?

Answer #1 (rggaifut)

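This is a good application of backreferences. Capture the whole first line in group 1. Within it, capture the IP address in group 2 and the quoted request string (which contains the file name) in group 3. Then use those two captures to consume every following line that has the same IP and the same request. Here is a regex_replace that keeps the first line of each run: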

^(((?:\d{1,3}\.){3}\d{1,3})[^"]+("[^"]*?").*?(?:\n|$))(\2[^"]+\3.*?(\n|$))*
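
As a minimal sketch of how this pattern might be applied outside a regex tester, the Python snippet below runs it through re.sub and replaces each match with group 1. The function name collapse_runs and the shortened sample (user agent reduced to "UA", plus two lines from a made-up second IP) are assumptions for illustration, not part of the original answer.

```python
import re

# The pattern from the answer, unchanged. Group 1 is the first line of a run,
# group 2 the IP address, group 3 the quoted request string.
PATTERN = re.compile(
    r'^(((?:\d{1,3}\.){3}\d{1,3})[^"]+("[^"]*?").*?(?:\n|$))'
    r'(\2[^"]+\3.*?(\n|$))*',
    re.MULTILINE,  # lets ^ and $ match at every line boundary
)


def collapse_runs(log_text: str) -> str:
    """Replace every run of adjacent same-IP, same-request lines with its first line."""
    # re.sub replaces all non-overlapping matches, so no separate "global" flag is needed.
    return PATTERN.sub(r"\1", log_text)


# Hypothetical, shortened sample: the user agent is trimmed to "UA" and the
# 10.0.0.7 lines are invented to show that a change of IP starts a new run.
SAMPLE = (
    '172.59.152.20 - - [19/Dec/2022:04:52:54 +0000] "GET /video.mp4 HTTP/1.1" 206 504267 "UA"\n'
    '172.59.152.20 - - [19/Dec/2022:04:52:55 +0000] "GET /video.mp4 HTTP/1.1" 206 180747 "UA"\n'
    '172.59.152.20 - - [19/Dec/2022:04:53:49 +0000] "GET /video2.mp4 HTTP/1.1" 206 3739467 "UA"\n'
    '172.59.152.20 - - [19/Dec/2022:04:54:15 +0000] "GET /video.mp4 HTTP/1.1" 206 10867691 "UA"\n'
    '10.0.0.7 - - [19/Dec/2022:04:54:16 +0000] "GET /video.mp4 HTTP/1.1" 206 1000 "UA"\n'
    '10.0.0.7 - - [19/Dec/2022:04:54:17 +0000] "GET /video.mp4 HTTP/1.1" 206 2000 "UA"\n'
)

if __name__ == "__main__":
    print(collapse_runs(SAMPLE), end="")
```

On this sample the six lines should reduce to four: one per run of video.mp4, video2.mp4, video.mp4 again, and the second IP.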

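You need to set the g (global replace) and m (multiline) flags, and use group 1 ($1) as the replacement. Your example can be tried with this pattern in substitution mode on Regex 101.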
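
For comparison, here is a sketch of a different technique, not the backreference regex above: parse each line, build a key from the fields the question wants to keep (everything except the timestamp and byte count), and collapse adjacent lines with equal keys using itertools.groupby. The per-line regex is loosely based on the one in the question, with the dots escaped; the group names and the function names run_key and collapse_runs_by_key are hypothetical. Comparing parsed fields exactly also avoids one IP matching as a prefix of another (for example 172.59.152.20 versus 172.59.152.201).

```python
import re
from itertools import groupby

# Per-line pattern in the spirit of the question's regex. The named groups are
# labels chosen for this sketch: ts and size are parsed but left out of the key.
LINE_RE = re.compile(
    r'^(?P<ip>\d+\.\d+\.\d+\.\d+) \S+ \S+ \[(?P<ts>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)(?P<rest>.*)$'
)


def run_key(line):
    """Return the fields that must be equal for two adjacent lines to be merged."""
    m = LINE_RE.match(line)
    if m is None:
        # Unparseable lines use themselves as the key, so they are only
        # merged with an identical neighbouring line.
        return line
    return (m["ip"], m["request"], m["status"], m["rest"])


def collapse_runs_by_key(lines):
    """Yield the first line of every run of adjacent lines that share a key."""
    for _, run in groupby(lines, key=run_key):
        yield next(run)


# Hypothetical, shortened sample (user agent trimmed to "UA").
SAMPLE_LINES = [
    '172.59.152.20 - - [19/Dec/2022:04:52:54 +0000] "GET /video.mp4 HTTP/1.1" 206 504267 "UA"',
    '172.59.152.20 - - [19/Dec/2022:04:52:55 +0000] "GET /video.mp4 HTTP/1.1" 206 180747 "UA"',
    '172.59.152.20 - - [19/Dec/2022:04:53:49 +0000] "GET /video2.mp4 HTTP/1.1" 206 3739467 "UA"',
    '172.59.152.20 - - [19/Dec/2022:04:54:15 +0000] "GET /video.mp4 HTTP/1.1" 206 10867691 "UA"',
    '172.59.152.201 - - [19/Dec/2022:04:54:16 +0000] "GET /video.mp4 HTTP/1.1" 206 1000 "UA"',
]

if __name__ == "__main__":
    for kept in collapse_runs_by_key(SAMPLE_LINES):
        print(kept)
```

Because the key ignores only the timestamp and byte count, a change in status code or user agent also starts a new run; drop those fields from the tuple if that is too strict for your logs.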
