TensorFlow bitcast operation test case bug

zour9fqk posted 6 months ago in Other

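Issue type

Bug

Have you reproduced the bug with TF nightly?

Yes

Source

source

TensorFlow version

master

Custom code

Yes

OS platform and distribution

No response

Mobile device

No response

Python version

No response

Bazel version

No response

GCC/compiler version

No response

CUDA/cuDNN version

No response

GPU model and memory

No response

Current behavior?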

A bug happened!
The following GPU XLA test case:
XLA_TEST_F(Resnet50FusionTest, module_0615_entry_bitcast) {
  // The root of the entry computation is a bitcast of the parameter.
  const char* testcase = R"(
    HloModule m, is_scheduled=true
    ENTRY out_of_fusion {
      p0 = f32[1]{0} parameter(0)
      ROOT res = f32[] bitcast(f32[1]{0} p0)
    }
  )";
  auto module = ParseAndReturnVerifiedModule(testcase).value();
  // Run the HLO as written, without the HLO optimization passes.
  EXPECT_TRUE(RunAndCompareNoHloPasses(std::move(module), ErrorSpec(1e-5)));
}
fails with the following errors:
2023-04-07 07:42:59.490547: I tensorflow/compiler/xla/service/platform_util.cc:72] platform Host present but no XLA compiler available: could not find registered compiler for platform Host -- was support for that platform linked in?
2023-04-07 07:42:59.659576: E tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:1162] failed to enqueue async memcpy from device to host: CUDA_ERROR_INVALID_VALUE: invalid argument; host dst: 0x7fff996f4360; GPU src: 0x7f4e8da00000; size: 4=0x4
2023-04-07 07:42:59.659624: I tensorflow/compiler/xla/stream_executor/stream.cc:2535] INTERNAL: Unknown error
2023-04-07 07:42:59.659679: I tensorflow/compiler/xla/stream_executor/stream.cc:2539] [stream=0x5562acc10570,impl=0x5562acbb8160] INTERNAL: stream did not block host until done; was already in an error state
2023-04-07 07:42:59.659694: I tensorflow/compiler/xla/stream_executor/stream.cc:2535] INTERNAL: Unknown error
2023-04-07 07:42:59.659703: I tensorflow/compiler/xla/stream_executor/stream.cc:2539] [stream=0x5562acc10570,impl=0x5562acbb8160] INTERNAL: stream did not block host until done; was already in an error state
2023-04-07 07:42:59.659711: W tensorflow/compiler/xla/stream_executor/stream.cc:277] Error blocking host until done in stream destructor: INTERNAL: stream did not block host until done; was already in an error state
2023-04-07 07:42:59.659745: E tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:755] failed to free device memory at 0x7f4e8da00000; result: CUDA_ERROR_INVALID_VALUE: invalid argument
tensorflow/compiler/xla/service/gpu/tests/gpu_kernel_tiling_test.cc:58: Failure
Value of: RunAndCompareNoHloPasses(std::move(module), ErrorSpec(1e-5))
  Actual: false (INTERNAL: stream did not block host until done; was already in an error state)
Expected: true

Standalone code to reproduce the issue

See the current behavior above.

Relevant log output

Same as the log output shown above.

mwngjboj #1

The test code is as follows:

TEST_F(GpuKernelTilingTest, module_0615_entry_bitcast_2) {
    const char* testcase = R"(
         HloModule m, is_scheduled=true
         ENTRY out_of_fusion {
             p0 = f32[1]{0} parameter(0)
             ROOT res = f32[] bitcast(f32[1]{0} p0)
         }
    )";
    auto module = ParseAndReturnVerifiedModule(testcase).value();
    EXPECT_TRUE(RunAndCompareNoHloPasses(std::move(module), ErrorSpec(1e-5)));
}

tvmytwxo #2

Because the input parameter's memory is not registered as an alias of the output, the memory is freed before its data is copied to the host.
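
To make that explanation concrete, here is a minimal toy model in C++. It is only a sketch of one reading of the failure, not XLA code: `ToyDevice`, `RunBitcastRoot`, `RunCopyRoot`, and `FreeArgThenRead` are hypothetical names invented for illustration and do not exist in XLA or StreamExecutor. The assumption is that a root `bitcast` returns the parameter's own memory, while the caller treats argument and result as separate buffers.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Toy stand-in for device memory. Hypothetical; not an XLA or StreamExecutor API.
class ToyDevice {
 public:
  // Allocates one float on the "device" and returns an address-like handle.
  std::uint64_t Allocate(float value) {
    const std::uint64_t handle = next_handle_++;
    live_[handle] = value;
    return handle;
  }

  // Frees a handle. Returns false if it is not live (for example, a double free).
  bool Free(std::uint64_t handle) { return live_.erase(handle) == 1; }

  // Copies the value to *out. Returns false if the handle was already freed.
  bool CopyToHost(std::uint64_t handle, float* out) const {
    const auto it = live_.find(handle);
    if (it == live_.end()) return false;
    *out = it->second;
    return true;
  }

 private:
  std::unordered_map<std::uint64_t, float> live_;
  std::uint64_t next_handle_ = 1;
};

// Models `ROOT res = bitcast(p0)`: the result reuses the parameter's memory.
std::uint64_t RunBitcastRoot(std::uint64_t param) { return param; }

// Models a root that writes its result into a fresh allocation.
std::uint64_t RunCopyRoot(ToyDevice& device, std::uint64_t param) {
  float value = 0.0f;
  const bool ok = device.CopyToHost(param, &value);
  assert(ok);  // The parameter must still be live here.
  (void)ok;
  return device.Allocate(value);
}

// A caller that assumes argument and result never share memory:
// it releases the argument first, then reads the result back.
bool FreeArgThenRead(ToyDevice& device, std::uint64_t arg,
                     std::uint64_t result, float* out) {
  device.Free(arg);
  return device.CopyToHost(result, out);
}
```

In this model, a result produced by `RunBitcastRoot` can no longer be read after the argument is released, and a later `Free` of the result also fails, which resembles the failed memcpy followed by the failed free in the log. A result produced by `RunCopyRoot` survives both steps.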

zlhcx6iw #3

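Can the problem be reproduced with the HLO from before optimization? Note that bitcast is not part of the input HLO; it only appears after optimization.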
