Exporting a TensorFlow model - AssertionError: No checkpoint specified (save_path=None); nothing is being restored

kiz8lqtg  posted on 2023-01-21  in  Other

I am using Google Colab with TensorFlow 2.3.0 on an Ubuntu machine, and I am working through the example here: TensorFlow 2 Training Custom Model
This is my code:

!python exporter_main_v2.py --input_type image_tensor --pipeline_config_path=models/my_ssd_resnet50_v1_fpn/pipeline.config --trained_checkpoint_dir=/models/my_ssd_resnet50_v1_fpn --output_directory=exported-models/my_model/

I get the following error:

2020-09-06 08:03:23.830447: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-09-06 08:03:25.844063: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcuda.so.1
2020-09-06 08:03:25.879149: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-06 08:03:25.879813: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: 
pciBusID: 0000:00:04.0 name: Tesla T4 computeCapability: 7.5
coreClock: 1.59GHz coreCount: 40 deviceMemorySize: 14.73GiB deviceMemoryBandwidth: 298.08GiB/s
2020-09-06 08:03:25.879853: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-09-06 08:03:25.881273: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2020-09-06 08:03:25.882999: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-09-06 08:03:25.883384: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-09-06 08:03:25.885102: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-09-06 08:03:25.886330: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2020-09-06 08:03:25.889988: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2020-09-06 08:03:25.890105: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-06 08:03:25.891047: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-06 08:03:25.891854: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-09-06 08:03:25.901457: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2200000000 Hz
2020-09-06 08:03:25.901653: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x2cdd480 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-09-06 08:03:25.901678: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2020-09-06 08:03:26.012959: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-06 08:03:26.013665: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x2cdd640 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2020-09-06 08:03:26.013697: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Tesla T4, Compute Capability 7.5
2020-09-06 08:03:26.013935: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-06 08:03:26.014510: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1716] Found device 0 with properties: 
pciBusID: 0000:00:04.0 name: Tesla T4 computeCapability: 7.5
coreClock: 1.59GHz coreCount: 40 deviceMemorySize: 14.73GiB deviceMemoryBandwidth: 298.08GiB/s
2020-09-06 08:03:26.014556: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-09-06 08:03:26.014600: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcublas.so.10
2020-09-06 08:03:26.014625: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcufft.so.10
2020-09-06 08:03:26.014647: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcurand.so.10
2020-09-06 08:03:26.014667: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusolver.so.10
2020-09-06 08:03:26.014689: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcusparse.so.10
2020-09-06 08:03:26.014712: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudnn.so.7
2020-09-06 08:03:26.014784: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-06 08:03:26.015364: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-06 08:03:26.015875: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1858] Adding visible gpu devices: 0
2020-09-06 08:03:26.015919: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.1
2020-09-06 08:03:26.651590: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1257] Device interconnect StreamExecutor with strength 1 edge matrix:
2020-09-06 08:03:26.651650: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1263]      0 
2020-09-06 08:03:26.651663: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1276] 0:   N 
2020-09-06 08:03:26.651874: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-06 08:03:26.652564: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:982] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2020-09-06 08:03:26.653153: W tensorflow/core/common_runtime/gpu/gpu_bfc_allocator.cc:39] Overriding allow_growth setting because the TF_FORCE_GPU_ALLOW_GROWTH environment variable is set. Original config value was 0.
2020-09-06 08:03:26.653195: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1402] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 13962 MB memory) -> physical GPU (device: 0, name: Tesla T4, pci bus id: 0000:00:04.0, compute capability: 7.5)
Traceback (most recent call last):
  File "exporter_main_v2.py", line 159, in <module>
    app.run(main)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 299, in run
    _run_main(main, args)
  File "/usr/local/lib/python3.6/dist-packages/absl/app.py", line 250, in _run_main
    sys.exit(main(argv))
  File "exporter_main_v2.py", line 155, in main
    FLAGS.side_input_types, FLAGS.side_input_names)
  File "/usr/local/lib/python3.6/dist-packages/object_detection-0.1-py3.6.egg/object_detection/exporter_lib_v2.py", line 260, in export_inference_graph
    status.assert_existing_objects_matched()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/training/tracking/util.py", line 885, in assert_existing_objects_matched
    "No checkpoint specified (save_path=None); nothing is being restored.")
AssertionError: No checkpoint specified (save_path=None); nothing is being restored.

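I already worked through a different example using TensorFlow 1, ran into (I think) the same problem, and asked for help here: Stack Overflow question
There are several checkpoint files in the specified directory, and training appeared to run as it should. I am really stumped. Can anyone help?
Snapshot added as requested: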

w8ntj3qf  #1

There should be only one checkpoint in trained_checkpoint_dir. Please delete the unnecessary checkpoints.
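
Before deleting anything, it may help to see which checkpoints the directory records and which one is marked as the latest. The following is a minimal stdlib-only sketch, not part of the Object Detection API: the function name `read_checkpoint_state` is hypothetical, and it assumes the directory contains the usual text file named `checkpoint` with `model_checkpoint_path` and `all_model_checkpoint_paths` entries.

```python
import re
from pathlib import Path


def read_checkpoint_state(checkpoint_dir):
    """Read the `checkpoint` state file in checkpoint_dir.

    Returns (latest, all_paths). If the state file is missing,
    returns (None, []), which is the situation in which the exporter
    has nothing to restore.
    """
    state_file = Path(checkpoint_dir) / "checkpoint"
    if not state_file.is_file():
        return None, []

    latest = None
    all_paths = []
    for line in state_file.read_text().splitlines():
        # Only quoted string entries are of interest; timestamp lines are skipped.
        match = re.match(r'\s*(\w+):\s*"(.*)"\s*$', line)
        if not match:
            continue
        key, value = match.groups()
        if key == "model_checkpoint_path":
            latest = value
        elif key == "all_model_checkpoint_paths":
            all_paths.append(value)
    return latest, all_paths
```

Calling `read_checkpoint_state("models/my_ssd_resnet50_v1_fpn")` from the working directory used for the export shows whether a latest checkpoint is recorded there at all.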

t98cgbkg  #2

Remove the leading / from the --trained_checkpoint_dir value, so that the path is relative (models/... rather than /models/...):

!python exporter_main_v2.py --input_type image_tensor --pipeline_config_path=models/my_ssd_resnet50_v1_fpn/pipeline.config --trained_checkpoint_dir=models/my_ssd_resnet50_v1_fpn --output_directory=exported-models/my_model/

That should solve your problem.
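
The difference between the two paths can be checked directly. This is a rough sketch using only the standard library; the helper name `inspect_checkpoint_dir` is hypothetical, and the `ckpt-*.index` pattern assumes the default file naming used by TensorFlow 2 training.

```python
from pathlib import Path


def inspect_checkpoint_dir(path_str):
    """Report how a --trained_checkpoint_dir value resolves on disk."""
    path = Path(path_str)
    exists = path.is_dir()
    return {
        "is_absolute": path.is_absolute(),
        "resolved": str(path.resolve()),
        "exists": exists,
        # Index files indicate that checkpoints were actually written here.
        "index_files": sorted(p.name for p in path.glob("ckpt-*.index")) if exists else [],
    }


if __name__ == "__main__":
    # The path from the question and the corrected relative path.
    for candidate in ("/models/my_ssd_resnet50_v1_fpn", "models/my_ssd_resnet50_v1_fpn"):
        print(inspect_checkpoint_dir(candidate))
```

If the absolute form reports that the directory does not exist while the relative form lists index files, the leading slash is the likely cause of the error.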

tjvv9vkg  #3

Check that you are running with fine_tune_checkpoint_type: "detection"
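
One way to confirm the setting without opening the file by hand is to search the text of pipeline.config. This is only a sketch: the function name `fine_tune_checkpoint_type` is hypothetical, and it treats the config as plain text rather than parsing it with the Object Detection API.

```python
import re


def fine_tune_checkpoint_type(config_text):
    """Return the fine_tune_checkpoint_type value from pipeline.config text, or None.

    Lines that are commented out with '#' are not matched.
    """
    match = re.search(
        r'^\s*fine_tune_checkpoint_type\s*:\s*"([^"]*)"',
        config_text,
        flags=re.MULTILINE,
    )
    return match.group(1) if match else None
```

For example, passing the contents of models/my_ssd_resnet50_v1_fpn/pipeline.config should return "detection" if the config follows this answer's suggestion.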

o3imoua4  #4

Just follow these steps:
1. Open exporter_lib_v2.py:
models/research/object_detection/exporter_lib_v2.py
2. Comment out these lines (you will find them between lines 265 and 279):
# status.assert_existing_objects_matched()
# concrete_function = detection_module.__call__.get_concrete_function()
Then replace concrete_function with None in this line:
tf.saved_model.save(detection_module, output_saved_model_directory, signatures=None)
3. Reinstall the Object Detection API.
