Go regexp: support case-insensitive prefix strings

vhipe2zx  posted 5 months ago in Go

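For a pattern such as foo.*bar, the regexp compiler extracts the literal string "foo" and searches for it with strings.Index before attempting anything else.
I propose extending this to the case-insensitive version, (?i)foo.*bar.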
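
To see which literal prefix the regexp package extracts, one option is the exported (*Regexp).LiteralPrefix method. The sketch below prints it for both patterns; the helper name literalPrefix is made up for illustration. On Go versions that lack fold-case prefix support, I would expect the (?i) pattern to report an empty prefix, but that is an assumption worth checking against your toolchain.

```go
package main

import (
	"fmt"
	"regexp"
)

// literalPrefix is a hypothetical helper that compiles pattern and
// returns the literal string every match must begin with, plus whether
// that literal makes up the entire regular expression.
func literalPrefix(pattern string) (string, bool) {
	return regexp.MustCompile(pattern).LiteralPrefix()
}

func main() {
	// Compare the case-sensitive pattern with its (?i) counterpart.
	for _, p := range []string{`foo.*bar`, `(?i)foo.*bar`} {
		prefix, complete := literalPrefix(p)
		fmt.Printf("%-14s prefix=%q complete=%v\n", p, prefix, complete)
	}
}
```
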
I tried an experimental implementation that simply calls strings.EqualFold repeatedly, and it is much faster:

name                old time/op    new time/op     delta
Match/Easy0i/16-8     2.99ns ± 1%     2.91ns ± 1%    -2.74%  (p=0.008 n=5+5)
Match/Easy0i/32-8      589ns ± 1%       56ns ± 0%   -90.51%  (p=0.008 n=5+5)
Match/Easy0i/1K-8     17.3µs ± 2%      7.5µs ± 5%   -56.52%  (p=0.008 n=5+5)
Match/Easy0i/32K-8     693µs ± 0%      259µs ± 0%   -62.61%  (p=0.008 n=5+5)
Match/Easy0i/1M-8     22.8ms ± 6%      8.4ms ± 1%   -63.28%  (p=0.008 n=5+5)
Match/Easy0i/32M-8     714ms ± 1%      269ms ± 0%   -62.32%  (p=0.008 n=5+5)

name                old speed      new speed       delta
Match/Easy0i/16-8   5.35GB/s ± 1%   5.50GB/s ± 1%    +2.82%  (p=0.008 n=5+5)
Match/Easy0i/32-8   54.3MB/s ± 1%  572.6MB/s ± 0%  +954.04%  (p=0.008 n=5+5)
Match/Easy0i/1K-8   59.1MB/s ± 2%  135.9MB/s ± 5%  +130.11%  (p=0.008 n=5+5)
Match/Easy0i/32K-8  47.3MB/s ± 0%  126.5MB/s ± 0%  +167.44%  (p=0.008 n=5+5)
Match/Easy0i/1M-8   46.0MB/s ± 6%  125.0MB/s ± 1%  +171.98%  (p=0.008 n=5+5)
Match/Easy0i/32M-8  47.0MB/s ± 1%  124.8MB/s ± 0%  +165.39%  (p=0.008 n=5+5)
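
The idea of calling strings.EqualFold repeatedly can be sketched as a naive case-insensitive substring search. This is not the experimental patch itself, nor the code in any CL; indexFold is a hypothetical helper, and it assumes an ASCII-only prefix so that the matched text has the same byte length as the prefix.

```go
package main

import (
	"fmt"
	"strings"
)

// indexFold is a hypothetical helper that returns the byte index of the
// first case-insensitive occurrence of prefix in s, or -1 if there is none.
// Assumption: prefix is ASCII-only. Characters such as the Kelvin sign
// (U+212A), which folds to "k" but is three bytes long, are not handled.
func indexFold(s, prefix string) int {
	n := len(prefix)
	for i := 0; i+n <= len(s); i++ {
		// Compare the window starting at i with the prefix under case folding.
		if strings.EqualFold(s[i:i+n], prefix) {
			return i
		}
	}
	return -1
}

func main() {
	fmt.Println(indexFold("xxFoOyy", "foo")) // 2
	fmt.Println(indexFold("abc", "foo"))     // -1
}
```

A matcher could use the returned index to skip straight to the first candidate position, then run the full regular expression from there.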

cvxl0en2  #1

Feel free to send a CL/PR with benchmarks.

gzszwxb4  #2

Change https://golang.org/cl/358756 mentions this issue: regexp: handle prefix string with fold-case
