go x/build/cmd/coordinator: 计划9-arm的测试分区错误

xpcnnkqh  于 5个月前  发布在  Go
关注(0)|答案(4)|浏览(64)

目前,许多短测试正在单独地在go tool dist test的各个调用中执行,而不是分组在一起。这对于整体运行时间很重要,因为在树莓派构建器上,每个go tool dist test调用都会增加30秒以上的开销,即使对于只占用一小部分秒的测试也是如此。
例如,这是来自plan9-arm执行日志的一段摘录:

2021-10-25T17:16:23Z still_waiting_on_test go_test:internal/abi
  2021-10-25T17:16:53Z still_waiting_on_test go_test:internal/abi
  2021-10-25T17:17:18Z finish_run_tests_multi after 1m24.6s; pi4g: [go_test:internal/abi go_test:internal/buildcfg go_test:internal/cpu]
  2021-10-25T17:17:18Z run_test:go_test:internal/execabs pi4g
  2021-10-25T17:17:48Z still_waiting_on_test go_test:internal/execabs
  2021-10-25T17:17:53Z finish_run_test:go_test:internal/execabs after 35.3s; pi4g
  2021-10-25T17:17:53Z run_test:go_test:internal/fmtsort pi4g
  2021-10-25T17:18:23Z still_waiting_on_test go_test:internal/fmtsort
  2021-10-25T17:18:37Z finish_run_test:go_test:internal/fmtsort after 43.5s; pi4g
  2021-10-25T17:18:37Z run_test:go_test:internal/fuzz pi4g
  2021-10-25T17:19:07Z still_waiting_on_test go_test:internal/fuzz
  2021-10-25T17:19:13Z finish_run_test:go_test:internal/fuzz after 36.3s; pi4g
  2021-10-25T17:19:13Z run_test:go_test:internal/itoa pi4g
  2021-10-25T17:19:43Z still_waiting_on_test go_test:internal/itoa
  2021-10-25T17:19:48Z finish_run_test:go_test:internal/itoa after 34.6s; pi4g
  2021-10-25T17:19:48Z run_test:go_test:internal/poll pi4g
  2021-10-25T17:20:18Z still_waiting_on_test go_test:internal/poll
  2021-10-25T17:20:35Z finish_run_test:go_test:internal/poll after 47.1s; pi4g
  2021-10-25T17:20:35Z run_test:go_test:internal/profile pi4g
  2021-10-25T17:21:05Z still_waiting_on_test go_test:internal/profile
  2021-10-25T17:21:11Z finish_run_test:go_test:internal/profile after 36.1s; pi4g
  2021-10-25T17:21:11Z run_test:go_test:internal/reflectlite pi4g
  2021-10-25T17:21:41Z still_waiting_on_test go_test:internal/reflectlite
  2021-10-25T17:22:04Z finish_run_test:go_test:internal/reflectlite after 53s; pi4g
  2021-10-25T17:22:04Z run_test:go_test:internal/singleflight pi4g
  2021-10-25T17:22:34Z still_waiting_on_test go_test:internal/singleflight
  2021-10-25T17:22:38Z finish_run_test:go_test:internal/singleflight after 34s; pi4g
  2021-10-25T17:22:38Z run_test:go_test:internal/trace pi4g
  2021-10-25T17:23:08Z still_waiting_on_test go_test:internal/trace
  2021-10-25T17:23:14Z finish_run_test:go_test:internal/trace after 35.5s; pi4g
  2021-10-25T17:23:14Z run_test:go_test:internal/unsafeheader pi4g
  2021-10-25T17:23:44Z still_waiting_on_test go_test:internal/unsafeheader
  2021-10-25T17:23:48Z finish_run_test:go_test:internal/unsafeheader after 34.8s; pi4g
  2021-10-25T17:23:48Z run_test:go_test:internal/xcoff pi4g
  2021-10-25T17:24:18Z still_waiting_on_test go_test:internal/xcoff
  2021-10-25T17:24:24Z finish_run_test:go_test:internal/xcoff after 35.4s; pi4g
  2021-10-25T17:24:24Z run_test:go_test:io pi4g
  2021-10-25T17:24:54Z still_waiting_on_test go_test:io
  2021-10-25T17:25:11Z finish_run_test:go_test:io after 47.4s; pi4g
  2021-10-25T17:25:11Z run_test:go_test:io/fs pi4g
  2021-10-25T17:25:41Z still_waiting_on_test go_test:io/fs
  2021-10-25T17:25:46Z finish_run_test:go_test:io/fs after 34.9s; pi4g
  2021-10-25T17:25:46Z run_test:go_test:io/ioutil pi4g
  2021-10-25T17:26:16Z still_waiting_on_test go_test:io/ioutil
  2021-10-25T17:26:39Z finish_run_test:go_test:io/ioutil after 52.4s; pi4g
  2021-10-25T17:26:39Z run_test:go_test:log pi4g
  2021-10-25T17:27:09Z still_waiting_on_test go_test:log
  2021-10-25T17:27:15Z finish_run_test:go_test:log after 35.9s; pi4g
  2021-10-25T17:27:15Z run_test:go_test:math pi4g
  2021-10-25T17:27:45Z still_waiting_on_test go_test:math
  2021-10-25T17:28:04Z finish_run_test:go_test:math after 49.6s; pi4g
  2021-10-25T17:28:04Z run_test:go_test:math/big pi4g
  2021-10-25T17:28:34Z still_waiting_on_test go_test:math/big
  2021-10-25T17:28:51Z finish_run_test:go_test:math/big after 46.5s; pi4g
  2021-10-25T17:28:51Z run_test:go_test:math/bits pi4g
  2021-10-25T17:29:21Z still_waiting_on_test go_test:math/bits
  2021-10-25T17:29:42Z finish_run_test:go_test:math/bits after 50.9s; pi4g
  2021-10-25T17:29:42Z run_test:go_test:math/cmplx pi4g
  2021-10-25T17:30:12Z still_waiting_on_test go_test:math/cmplx
  2021-10-25T17:30:19Z finish_run_test:go_test:math/cmplx after 36.7s; pi4g
  2021-10-25T17:30:19Z run_test:go_test:math/rand pi4g
  2021-10-25T17:30:49Z still_waiting_on_test go_test:math/rand
  2021-10-25T17:30:58Z finish_run_test:go_test:math/rand after 39.4s; pi4g
  2021-10-25T17:30:58Z run_test:go_test:mime pi4g
  2021-10-25T17:31:28Z still_waiting_on_test go_test:mime
  2021-10-25T17:31:35Z finish_run_test:go_test:mime after 36.9s; pi4g
  2021-10-25T17:31:35Z run_test:go_test:mime/multipart pi4g
  2021-10-25T17:32:05Z still_waiting_on_test go_test:mime/multipart
  2021-10-25T17:32:13Z finish_run_test:go_test:mime/multipart after 37.8s; pi4g
  2021-10-25T17:32:13Z run_test:go_test:mime/quotedprintable pi4g
  2021-10-25T17:32:43Z still_waiting_on_test go_test:mime/quotedprintable
  2021-10-25T17:32:49Z finish_run_test:go_test:mime/quotedprintable after 36.1s; pi4g
  2021-10-25T17:32:49Z run_test:go_test:net pi4g
  2021-10-25T17:33:19Z still_waiting_on_test go_test:net
  2021-10-25T17:33:49Z still_waiting_on_test go_test:net
  2021-10-25T17:34:12Z finish_run_test:go_test:net after 1m23.2s; pi4g
  2021-10-25T17:34:12Z run_test:go_test:net/http pi4g
  2021-10-25T17:34:42Z still_waiting_on_test go_test:net/http
  2021-10-25T17:35:12Z still_waiting_on_test go_test:net/http
  2021-10-25T17:35:42Z still_waiting_on_test go_test:net/http
  2021-10-25T17:36:12Z still_waiting_on_test go_test:net/http
  2021-10-25T17:36:20Z finish_run_test:go_test:net/http after 2m8.1s; pi4g
  2021-10-25T17:36:20Z run_tests_multi pi4g: [go_test:net/http/cgi go_test:net/http/cookiejar go_test:net/http/fcgi]
  2021-10-25T17:36:50Z still_waiting_on_test go_test:net/http/cgi
  2021-10-25T17:37:02Z finish_run_tests_multi after 41s; pi4g: [go_test:net/http/cgi go_test:net/http/cookiejar go_test:net/http/fcgi]

与在另一个慢平台上运行的相同子序列的测试相比,freebsd-arm64:

2021-10-25T19:14:46Z still_waiting_on_test go_test:internal/abi
  2021-10-25T19:14:51Z finish_run_tests_multi after 34.4s; aarch64.localdomain: [go_test:internal/abi go_test:internal/buildcfg go_test:internal/cpu]
  2021-10-25T19:14:51Z run_tests_multi aarch64.localdomain: [go_test:internal/execabs go_test:internal/fmtsort go_test:internal/fuzz]
  2021-10-25T19:14:59Z finish_run_tests_multi after 8.31s; aarch64.localdomain: [go_test:internal/execabs go_test:internal/fmtsort go_test:internal/fuzz]
  2021-10-25T19:14:59Z run_tests_multi aarch64.localdomain: [go_test:internal/itoa go_test:internal/poll go_test:internal/profile]
  2021-10-25T19:15:11Z finish_run_tests_multi after 12.3s; aarch64.localdomain: [go_test:internal/itoa go_test:internal/poll go_test:internal/profile]
  2021-10-25T19:15:11Z run_tests_multi aarch64.localdomain: [go_test:internal/reflectlite go_test:internal/singleflight go_test:internal/trace]
  2021-10-25T19:15:27Z finish_run_tests_multi after 15.4s; aarch64.localdomain: [go_test:internal/reflectlite go_test:internal/singleflight go_test:internal/trace]
  2021-10-25T19:15:27Z run_tests_multi aarch64.localdomain: [go_test:internal/unsafeheader go_test:internal/xcoff go_test:io]
  2021-10-25T19:15:38Z finish_run_tests_multi after 11.7s; aarch64.localdomain: [go_test:internal/unsafeheader go_test:internal/xcoff go_test:io]
  2021-10-25T19:15:38Z run_tests_multi aarch64.localdomain: [go_test:io/fs go_test:io/ioutil go_test:log]
  2021-10-25T19:15:42Z finish_run_tests_multi after 3.83s; aarch64.localdomain: [go_test:io/fs go_test:io/ioutil go_test:log]
  2021-10-25T19:15:42Z run_tests_multi aarch64.localdomain: [go_test:log/syslog go_test:math go_test:math/big]
  2021-10-25T19:15:59Z finish_run_tests_multi after 16.6s; aarch64.localdomain: [go_test:log/syslog go_test:math go_test:math/big]
  2021-10-25T19:15:59Z run_tests_multi aarch64.localdomain: [go_test:math/bits go_test:math/cmplx go_test:math/rand]
  2021-10-25T19:16:12Z finish_run_tests_multi after 13.3s; aarch64.localdomain: [go_test:math/bits go_test:math/cmplx go_test:math/rand]
  2021-10-25T19:16:12Z run_tests_multi aarch64.localdomain: [go_test:mime go_test:mime/multipart go_test:mime/quotedprintable]
  2021-10-25T19:16:17Z finish_run_tests_multi after 4.66s; aarch64.localdomain: [go_test:mime go_test:mime/multipart go_test:mime/quotedprintable]
  2021-10-25T19:16:17Z run_tests_multi aarch64.localdomain: [go_test:net go_test:net/http go_test:net/http/cgi]
  2021-10-25T19:16:47Z still_waiting_on_test go_test:net
  2021-10-25T19:16:56Z finish_run_tests_multi after 39.2s; aarch64.localdomain: [go_test:net go_test:net/http go_test:net/http/cgi]

一个简单的问题是:在没有助手的情况下,为什么测试会被分区?在一个go tool dist test的单个调用下运行整个测试集可以大大减少开销。

mkh04yzy

mkh04yzy1#

划分测试的逻辑在这里:https://cs.opensource.google/go/x/build/+/master:cmd/coordinator/coordinator.go;l=2655-2692;drc=db77efa219bbf2dff0b44c84d5e7ea957f55a5db
它似乎使用了硬编码的10秒截止时间,我怀疑它没有考虑到每个测试的设置开销——因此freebsd-arm64构建器可能因为没有这个开销而获得更好的聚类效果。(这里有点“进退两难”的情况。😅)

myzjeezk

myzjeezk2#

关于为什么测试被分区——我不知道?我猜想协调器是按照每个测试的开销最小来编写的,如果我们有分片构建器,那么使用相同的分片逻辑对所有构建器进行协调实现肯定会更简单。

d5vmydt9

d5vmydt93#

The logic that partitions the tests is here: ...
Yes, I did look at that code before posting the issue. But that doesn't show where getTestStats is getting its historical durations from. I suspect they are not very accurate in the case of plan9-arm. Can I look at them somehow?
My feeling is that the bad partitioning is a recent problem, but I don't have an archive of test log output so I can't check. How often does the database of durations change?
I would guess that the coordinator is written assuming that per-test overhead is minimal
Clearly not the case for plan 9 since go1.10, as I reported in #24300 . The redundant (imho) staleness tests at the start of dist test repeatedly thrash the entire source tree with lookup + stat calls, not minimal on operating systems which don't have linux's aggressive directory and inode cacheing, even worse if the builder uses a remote file server.

k5ifujac

k5ifujac4#

我能否以某种方式查看它们?
我不确定!(这似乎是一个@golang/release的问题...)

相关问题