cmd/go: 测试报告为已缓存,但实际上并未缓存,

yquaqz18  于 6个月前  发布在  Go
关注(0)|答案(5)|浏览(58)

你正在使用哪个版本的Go(go version)?

$ go version
go version go1.15.2 linux/amd64

这个问题在最新版本中是否会重现?

是的

你正在使用什么操作系统和处理器架构(go env)?

go env 输出

$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home/user/.cache/go-build"
GOENV="/home/user/.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home/user/.go_workspace/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home/user/.go_workspace:/usr/local/go_workspace"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/local/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/local/go/pkg/tool/linux_amd64"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/home/user/project/go.mod"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build013192427=/tmp/go-build -gno-record-gcc-switches"

你做了什么?

我们从CI构建中建立了一个较大的缓存(大约2GB的大小)。我们看到测试速度明显减慢,而且似乎根本原因是(可能是部分)缓存未命中,导致至少某些测试进程每次都被运行(特别是我看到链接器被调用)。

$ time GODEBUG=gocachetest=1 go test -race ./service/
testcache: github.com/foo/bar/service: test ID 5277766d4236333530556841686b797248445250 => 32e67a3c459a396178c3acafc87928bc9fc94b28d20fa9763dfc1fac9925e9a1
testcache: github.com/foo/bar/service: input list not found: cache entry not found: open /home/user/.cache/go-build/32/32e67a3c459a396178c3acafc87928bc9fc94b28d20fa9763dfc1fac9925e9a1-a: no such file or directory
testcache: github.com/foo/bar/service: test ID 484f364f314d6d6c647371774138555430696467 => ce7e1c46d67997e3d0b493d985ea92e199bad7b2fc4a97e7b405c87a92ac5d18
testcache: github.com/foo/bar/service: test ID ce7e1c46d67997e3d0b493d985ea92e199bad7b2fc4a97e7b405c87a92ac5d18 => input ID de8952dc6e704af61a533169fd1be1877a6c2778382c4f9a0828a177db9401ac => 74e40de3953e4347471748ef1e1ff4aacfa7380968c99f0baa14b0b7b1d24dae
ok      github.com/foo/bar/service  (cached)

real    0m2.185s
user    0m2.687s
sys     0m0.387s

$ time GODEBUG=gocachetest=1 go test -race ./service/
testcache: github.com/foo/bar/service: test ID 5277766d4236333530556841686b797248445250 => 32e67a3c459a396178c3acafc87928bc9fc94b28d20fa9763dfc1fac9925e9a1
testcache: github.com/foo/bar/service: input list not found: cache entry not found: open /home/user/.cache/go-build/32/32e67a3c459a396178c3acafc87928bc9fc94b28d20fa9763dfc1fac9925e9a1-a: no such file or directory
testcache: github.com/foo/bar/service: test ID 484f364f314d6d6c647371774138555430696467 => ce7e1c46d67997e3d0b493d985ea92e199bad7b2fc4a97e7b405c87a92ac5d18
testcache: github.com/foo/bar/service: test ID ce7e1c46d67997e3d0b493d985ea92e199bad7b2fc4a97e7b405c87a92ac5d18 => input ID de8952dc6e704af61a533169fd1be1877a6c2778382c4f9a0828a177db9401ac => 74e40de3953e4347471748ef1e1ff4aacfa7380968c99f0baa14b0b7b1d24dae
ok      github.com/foo/bar/service  (cached)

real    0m2.207s
user    0m2.786s
sys     0m0.397s

我不熟悉它正在缓存什么,或者缓存是如何工作的,但似乎删除正在被命中的缓存条目可以解决这个问题:

$ rm ~/.cache/go-build/74/74e40de3953e4347471748ef1e1ff4aacfa7380968c99f0baa14b0b7b1d24dae-a

$ time GODEBUG=gocachetest=1 go test -race ./service/
testcache: github.com/foo/bar/service: test ID 5277766d4236333530556841686b797248445250 => 32e67a3c459a396178c3acafc87928bc9fc94b28d20fa9763dfc1fac9925e9a1
testcache: github.com/foo/bar/service: input list not found: cache entry not found: open /home/user/.cache/go-build/32/32e67a3c459a396178c3acafc87928bc9fc94b28d20fa9763dfc1fac9925e9a1-a: no such file or directory
testcache: github.com/foo/bar/service: test ID 484f364f314d6d6c647371774138555430696467 => ce7e1c46d67997e3d0b493d985ea92e199bad7b2fc4a97e7b405c87a92ac5d18
testcache: github.com/foo/bar/service: test ID ce7e1c46d67997e3d0b493d985ea92e199bad7b2fc4a97e7b405c87a92ac5d18 => input ID de8952dc6e704af61a533169fd1be1877a6c2778382c4f9a0828a177db9401ac => 74e40de3953e4347471748ef1e1ff4aacfa7380968c99f0baa14b0b7b1d24dae
testcache: github.com/foo/bar/service: test output not found: cache entry not found: open /home/user/.cache/go-build/74/74e40de3953e4347471748ef1e1ff4aacfa7380968c99f0baa14b0b7b1d24dae-a: no such file or directory
testcache: github.com/foo/bar/service: save test ID 32e67a3c459a396178c3acafc87928bc9fc94b28d20fa9763dfc1fac9925e9a1 => input ID de8952dc6e704af61a533169fd1be1877a6c2778382c4f9a0828a177db9401ac => c5d26a24287e143e5891845de941e3d6e14ccca91d2e881d31430334a09001a8
testcache: github.com/foo/bar/service: save test ID ce7e1c46d67997e3d0b493d985ea92e199bad7b2fc4a97e7b405c87a92ac5d18 => input ID de8952dc6e704af61a533169fd1be1877a6c2778382c4f9a0828a177db9401ac => 74e40de3953e4347471748ef1e1ff4aacfa7380968c99f0baa14b0b7b1d24dae
ok      github.com/foo/bar/service  0.052s

real    0m2.236s
user    0m2.863s
sys     0m0.370s

$ time GODEBUG=gocachetest=1 go test -race ./service/
testcache: github.com/foo/bar/service: test ID 5277766d4236333530556841686b797248445250 => 32e67a3c459a396178c3acafc87928bc9fc94b28d20fa9763dfc1fac9925e9a1
testcache: github.com/foo/bar/service: test ID 32e67a3c459a396178c3acafc87928bc9fc94b28d20fa9763dfc1fac9925e9a1 => input ID de8952dc6e704af61a533169fd1be1877a6c2778382c4f9a0828a177db9401ac => c5d26a24287e143e5891845de941e3d6e14ccca91d2e881d31430334a09001a8
ok      github.com/foo/bar/service  (cached)

real    0m0.728s
user    0m0.974s
sys     0m0.276s

$ time GODEBUG=gocachetest=1 go test -race ./service/
testcache: github.com/foo/bar/service: test ID 5277766d4236333530556841686b797248445250 => 32e67a3c459a396178c3acafc87928bc9fc94b28d20fa9763dfc1fac9925e9a1
testcache: github.com/foo/bar/service: test ID 32e67a3c459a396178c3acafc87928bc9fc94b28d20fa9763dfc1fac9925e9a1 => input ID de8952dc6e704af61a533169fd1be1877a6c2778382c4f9a0828a177db9401ac => c5d26a24287e143e5891845de941e3d6e14ccca91d2e881d31430334a09001a8
ok      github.com/foo/bar/service  (cached)

real    0m0.725s
user    0m1.034s
sys     0m0.234s

请注意 save test ID 行和加速。
当然,我们可以清除缓存来帮助解决这个问题,但感觉这里可能存在问题:在这些情况下仍然发生缓存未命中,但没有创建新的缓存条目,至少在没有干预的情况下不会。我愿意接受我们的缓存可能陷入了糟糕的状态的可能性,尽管我会很难理解为什么会这样。

wwtsj6pe

wwtsj6pe1#

在第一个跟踪中,ok github.com/foo/bar/service (cached) 输出似乎表明测试结果应该从缓存中加载,但时间信息与该输出不一致。
对我来说,这表明当我们的缓存进入不一致状态时,我们可能在某些地方遗漏了一些错误处理。

5uzkadbs

5uzkadbs2#

为了补充,这个问题似乎在我们整个测试套件中相当普遍(我只粘贴了一个孤立的例子在上面)。

$ # run 1
$ time go test -race ./...
<snip>

real    5m57.490s
user    10m44.469s
sys     0m56.386s
$ # run 2
$ time go test -race ./...
<snip>

real    5m56.693s
user    10m43.209s
sys     0m56.260s
$ # run 3 -- deleting the entire cache
$ rm -r ~/.cache/go-build/
$ time go test -race ./...
<snip>

real    33m14.534s
user    30m23.976s
sys     3m12.240s
$ # run 4
$ time go test -race ./...
<snip>

real    1m12.405s
user    1m39.922s
sys     0m24.057s
$ # run 5
$ time go test -race ./...
<snip>

real    1m8.562s
user    1m41.005s
sys     0m23.210s

我没有包括完整的输出,因为它相当大,但是运行1、2、4和5次的输出包含带有 no test filescached 的行。所以清除缓存后,所有测试的时间明显减少了,从大约6分钟降到了大约1分钟。

d4so4syb

d4so4syb3#

在查看了这个问题之后,我想知道我们是否遇到了这个条件:
go/src/cmd/go/internal/test/test.go
第1143行到第1154行
| ifc.buf==nil { |
| // 我们没有使用链接步骤操作ID找到缓存结果,所以我们运行了链接步骤。现在尝试再次使用链接输出内容ID。使用操作ID的尝试确保如果链接输入没有改变,我们甚至不需要重新运行链接器就可以重用缓存的测试结果。使用链接输出(测试二进制)内容ID的尝试确保即使我们有不同的链接输入但最终的二进制相同,我们仍然可以重用缓存的测试结果。c.saveOutput将在这两个ID下存储结果。 |
| c.tryCacheWithID(b, a, a.Deps[0].BuildContentID()) |
| } |

  • 我可以看到在查看进程列表时调用了链接器
  • 我粘贴的日志表明 tryCache 被调用了两次,第一次尝试得到了缓存未命中

请注意,注解表明在这种情况下 saveOutput 将被调用,但是从我看到的来看,只有在这也导致缓存未命中的情况下才会被调用。这是可以接受的,但是这意味着每次都会运行相对昂贵的链接步骤?
编辑:我注意到在第二个缓存查找成功的情况下调用 saveOutput 不是解决这个问题的方法,因为 saveOutput 依赖于实际运行的测试(读取 testlog.txt )。

w8f9ii69

w8f9ii694#

With the previous comment in mind, steps to force this undesirable behaviour are:

  1. Create main_test.go
package main

import (
	"fmt"
	"testing"
)

func TestFoo(t *testing.T) {
	fmt.Println("test")
}
  1. Run tests as normal
$ go test ./...                                                                                          
ok      test    1.408s
  1. Confirm cached
$ go test ./...
ok      test    (cached)
  1. Add a linker flag that has no impact on the compiled test binary, but results in cache misses for the first action ID
$ time go test -ldflags "-X main.foo=1" ./...            
ok      test    (cached)
go test -ldflags "-X main.foo=1" ./...  0.38s user 0.32s system 214% cpu 0.330 total
$ time go test -ldflags "-X main.foo=1" ./...
ok      test    (cached)
go test -ldflags "-X main.foo=1" ./...  0.38s user 0.33s system 217% cpu 0.327 total
$ time go test -ldflags "-X main.foo=1" ./...
ok      test    (cached)
go test -ldflags "-X main.foo=1" ./...  0.38s user 0.32s system 215% cpu 0.327 total
  1. Compare timings when flag isn't used
$ time go test ./...
ok      test    (cached)
go test ./...  0.23s user 0.29s system 280% cpu 0.187 total
$ time go test ./...
ok      test    (cached)
go test ./...  0.23s user 0.29s system 265% cpu 0.195 total
$ time go test ./...
ok      test    (cached)
go test ./...  0.22s user 0.28s system 263% cpu 0.192 total

We can confirm that using the linker flag is causing the first cache fetch to miss:

$ time GODEBUG=gocachetest=1 go test -ldflags "-X main.foo=1" ./...
testcache: test: test ID 34667a79325f4b5a5031745352646a364f334248 => 3ef2a119bdc3644ae990a022c528aaf151cfee0cdf4c9c709d47fb60f6c9c457
testcache: test: input list not found: cache entry not found: open /Users/***/Library/Caches/go-build/3e/3ef2a119bdc3644ae990a022c528aaf151cfee0cdf4c9c709d47fb60f6c9c457-a: no such file or directory
testcache: test: test ID 655748382d6e5536494c337a442d2d5735766c33 => 0d932d0481612b8bb78edaa50a1934585417fed642903bf1189315ddf344e4d7
testcache: test: test ID 0d932d0481612b8bb78edaa50a1934585417fed642903bf1189315ddf344e4d7 => input ID c5fc1f3633fe1da223946e105e6b9d870031ba00feb596125d273dfd099ec02a => 8071c7e4622894029c7df1341a0defb22002e4268d725b9d2ef3979d86c830aa
ok      test    (cached)
GODEBUG=gocachetest=1 go test -ldflags "-X main.foo=1" ./...  0.40s user 0.32s system 211% cpu 0.343 tota

The time loss seems minor, but with a large test suite with complex and large scale dependencies this very quickly adds up (as demonstrated in #41593 (comment)). Note that we aren't using any bogus -X flags AFAIK, I'm wondering if the cache misses start to occur after upgrading go versions.

bksxznpy

bksxznpy5#

这可能是#49267的另一个症状。

相关问题