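Issue type
Bug
Have you reproduced the bug with TF nightly?
Yes
Source
source
TensorFlow version
master
Custom code
Yes
OS platform and distribution
No response
Mobile device
No response
Python version
No response
Bazel version
No response
GCC/compiler version
No response
CUDA/cuDNN version
No response
GPU model and memory
No response
Current behavior?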
A bug happened!
GPU XLA test case:
XLA_TEST_F(Resnet50FusionTest, module_0615_entry_bitcast) {
  // Entry computation whose root is a bitcast of the only parameter.
  const char* testcase = R"(
    HloModule m, is_scheduled=true
    ENTRY out_of_fusion {
      p0 = f32[1]{0} parameter(0)
      ROOT res = f32[] bitcast(f32[1]{0} p0)
    }
  )";
  auto module = ParseAndReturnVerifiedModule(testcase).value();
  EXPECT_TRUE(RunAndCompareNoHloPasses(std::move(module), ErrorSpec(1e-5)));
}
Running it produces the following failure:
2023-04-07 07:42:59.490547: I tensorflow/compiler/xla/service/platform_util.cc:72] platform Host present but no XLA compiler available: could not find registered compiler for platform Host -- was support for that platform linked in?
2023-04-07 07:42:59.659576: E tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:1162] failed to enqueue async memcpy from device to host: CUDA_ERROR_INVALID_VALUE: invalid argument; host dst: 0x7fff996f4360; GPU src: 0x7f4e8da00000; size: 4=0x4
2023-04-07 07:42:59.659624: I tensorflow/compiler/xla/stream_executor/stream.cc:2535] INTERNAL: Unknown error
2023-04-07 07:42:59.659679: I tensorflow/compiler/xla/stream_executor/stream.cc:2539] [stream=0x5562acc10570,impl=0x5562acbb8160] INTERNAL: stream did not block host until done; was already in an error state
2023-04-07 07:42:59.659694: I tensorflow/compiler/xla/stream_executor/stream.cc:2535] INTERNAL: Unknown error
2023-04-07 07:42:59.659703: I tensorflow/compiler/xla/stream_executor/stream.cc:2539] [stream=0x5562acc10570,impl=0x5562acbb8160] INTERNAL: stream did not block host until done; was already in an error state
2023-04-07 07:42:59.659711: W tensorflow/compiler/xla/stream_executor/stream.cc:277] Error blocking host until done in stream destructor: INTERNAL: stream did not block host until done; was already in an error state
2023-04-07 07:42:59.659745: E tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:755] failed to free device memory at 0x7f4e8da00000; result: CUDA_ERROR_INVALID_VALUE: invalid argument
tensorflow/compiler/xla/service/gpu/tests/gpu_kernel_tiling_test.cc:58: Failure
Value of: RunAndCompareNoHloPasses(std::move(module), ErrorSpec(1e-5))
Actual: false (INTERNAL: stream did not block host until done; was already in an error state)
Expected: true
Standalone code to reproduce the issue
See current behavior.
Relevant log output
See current behavior.
3 answers
mwngjboj1#
The test code is as follows:
tvmytwxo2#
Because the input parameter's memory is not aliased to the output, that memory is freed before its data is copied to the host.
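The ordering described above can be sketched with a toy model. This is only an illustration under stated assumptions: ToyDevice, RunBitcast, ReadResultAfterFreeingInput and ReadResultWithAliasing are hypothetical names invented for this sketch, not XLA or StreamExecutor APIs, and the model only mimics the free-before-copy sequence that the log suggests.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <map>
#include <vector>

// Hypothetical stand-in for a device allocator: it only tracks which
// handles are currently live. It is NOT the StreamExecutor API.
class ToyDevice {
 public:
  // Allocates `size` bytes and returns an opaque handle to them.
  std::uint64_t Allocate(std::size_t size) {
    std::uint64_t handle = next_handle_++;
    live_[handle] = std::vector<std::uint8_t>(size);
    return handle;
  }

  // Returns false when the handle is not live, which plays the role of
  // the "failed to free device memory" error in the log.
  bool Free(std::uint64_t handle) { return live_.erase(handle) == 1; }

  // Copies host bytes into a live allocation; false if the handle is stale.
  bool CopyToDevice(std::uint64_t handle, const void* src, std::size_t size) {
    auto it = live_.find(handle);
    if (it == live_.end() || size > it->second.size()) return false;
    std::memcpy(it->second.data(), src, size);
    return true;
  }

  // Copies device bytes to the host; false if the handle is stale, which
  // plays the role of the failed device-to-host memcpy in the log.
  bool CopyToHost(void* dst, std::uint64_t handle, std::size_t size) const {
    auto it = live_.find(handle);
    if (it == live_.end() || size > it->second.size()) return false;
    std::memcpy(dst, it->second.data(), size);
    return true;
  }

 private:
  std::map<std::uint64_t, std::vector<std::uint8_t>> live_;
  std::uint64_t next_handle_ = 1;
};

// A bitcast moves no data, so the "output" is the input handle itself.
std::uint64_t RunBitcast(std::uint64_t input) { return input; }

// Failing order: the argument is released before the result is read back.
bool ReadResultAfterFreeingInput(ToyDevice& dev, float value, float* out) {
  std::uint64_t input = dev.Allocate(sizeof(float));
  dev.CopyToDevice(input, &value, sizeof(float));
  std::uint64_t output = RunBitcast(input);
  dev.Free(input);  // the caller does not know the output shares this buffer
  return dev.CopyToHost(out, output, sizeof(float));  // stale handle
}

// Working order: the shared buffer is treated as owned by the output and
// released only after the result has been copied to the host.
bool ReadResultWithAliasing(ToyDevice& dev, float value, float* out) {
  std::uint64_t input = dev.Allocate(sizeof(float));
  dev.CopyToDevice(input, &value, sizeof(float));
  std::uint64_t output = RunBitcast(input);
  bool ok = dev.CopyToHost(out, output, sizeof(float));
  dev.Free(output);
  return ok;
}
```

In this model, ReadResultAfterFreeingInput returns false and leaves the host value untouched, while ReadResultWithAliasing returns true. Whether the real runtime follows exactly this sequence is an assumption based on the answer above and the addresses repeated in the log.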
zlhcx6iw3#
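Can the issue be reproduced with the HLO from before optimization? Note that bitcast does not appear in the input HLO; it only appears after optimization.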