Does TensorFlow [XLA] support the DT_STRING input data type?

li9yvcax · posted 2022-10-29

System information

* Have I written custom code (as opposed to using a stock example script provided in TensorFlow): yes
* OS Platform and Distribution (e.g., Linux Ubuntu 16.04): CentOS 7.4.1708
* TensorFlow installed from (source or binary): source
* TensorFlow version (use command below): 1.7.0
* Python version: 2.7
* Bazel version (if compiling from source): 0.11.1
* GCC/Compiler version (if compiling from source): 4.8.5
* CUDA/cuDNN version: CUDA 9 and cuDNN 6
* GPU model and memory:
* Exact command to reproduce:

Describe the problem

What I'm trying to do here is use XLA to speed up my model's inference performance. I have a model trained with an Estimator (think of it as just a wide-and-deep model), exported to the SavedModel format using feature_columns, and then converted to a frozen graph. I then followed the AOT tutorial:

  • Prepare my frozen graph (.pb)
  • Write the graph config file (.pbtxt)
  • Edit the BUILD file to add my own cc_library
  • Build it (see the sketch after this list)
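
For the build step, the tf_library rule I added looks roughly like the one in the tutorial; the target and file names below are placeholders for my own:

load("//tensorflow/compiler/aot:tfcompile.bzl", "tf_library")

tf_library(
    # Name of the generated target; placeholder for my real one.
    name = "my_wide_deep_model",
    # C++ class that wraps the compiled graph.
    cpp_class = "mynamespace::MyComputation",
    # Frozen GraphDef and the feed/fetch config shown further below.
    graph = "mygraph.pb",
    config = "myfile.pbtxt",
)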

At the same time, to make this work I had to add three more dependencies to the tfcompile library section of the BUILD file (tensorflow/compiler/aot/BUILD), as shown below:

"//tensorflow/core/kernels:example_parsing_ops",
        "//tensorflow/core/kernels:lookup_table_op",
        "//tensorflow/core/kernels:logging_ops",

Then, after resolving all the dependency errors, the following error message showed up:

INVALID ARGUMENTS: Unsupported type in DataTypeToPrimitiveType string

So I checked the code in tensorflow/compiler/tf2xla/type_util.cc and in tensorflow/compiler/xla/xla_data.proto, and found that XLA currently does not support DT_STRING. So I'd like to know: could string support be added, and if not, why?
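
For reference, the mapping in type_util.cc looks roughly like this (a heavily abbreviated paraphrase of my 1.7.0 checkout): there is simply no case for DT_STRING, so it falls through to the default branch that produces the error above.

// tensorflow/compiler/tf2xla/type_util.cc (abbreviated paraphrase)
Status DataTypeToPrimitiveType(DataType data_type, xla::PrimitiveType* type) {
  switch (data_type) {
    case tensorflow::DT_BOOL:
      *type = xla::PRED;
      return Status::OK();
    case tensorflow::DT_INT32:
      *type = xla::S32;
      return Status::OK();
    case tensorflow::DT_FLOAT:
      *type = xla::F32;
      return Status::OK();
    // ... other numeric types ...
    default:
      // DT_STRING (and any other non-numeric dtype) lands here.
      return errors::InvalidArgument(
          "Unsupported type in DataTypeToPrimitiveType ",
          DataTypeString(data_type));
  }
}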

Source code / logs

My graph config file looks like this:

feed {
  id { node_name: "input_example_tensor" }
  shape {
    dim { size: 1 }
  }
}

fetch {
  id { node_name: "head/predictions/probabilities" }
}
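
A workaround I'm considering if DT_STRING stays unsupported: feed the numeric tensors produced by the ParseExample op instead of the serialized-Example string tensor, so that no DT_STRING value ever enters the compiled subgraph. This is only a sketch; the node name and output index are hypothetical and would have to be read off the actual graph:

feed {
  # Hypothetical: a numeric output of the ParseExample node, fed
  # directly so the string input stays outside the compiled graph.
  id { node_name: "ParseExample/ParseExample" output_index: 0 }
  shape {
    dim { size: 1 }
  }
}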

The bazel build error message:

ERROR: /data/tf/tensorflow-1.7.0/tensorflow/compiler/aot/tests/BUILD:155:1: Executing genrule //tensorflow/compiler/aot/tests:gen_feed_graph failed (Exit 1)
2018-05-08 17:49:19.657209: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
INVALID ARGUMENTS: Unsupported type in DataTypeToPrimitiveType string

tfcompile performs ahead-of-time compilation of a TensorFlow graph,
resulting in an object file compiled for your target architecture, and a
header file that gives access to the functionality in the object file.
A typical invocation looks like this:

   $ tfcompile --graph=mygraph.pb --config=myfile.pbtxt --cpp_class="mynamespace::MyComputation"

usage: bazel-out/host/bin/tensorflow/compiler/aot/tfcompile
Flags:
	--graph=""                       	string	Input GraphDef file.  If the file ends in '.pbtxt' it is expected to be in the human-readable proto text format, otherwise it is expected to be in the proto binary format.
	--config=""                      	string	Input file containing Config proto.  If the file ends in '.pbtxt' it is expected to be in the human-readable proto text format, otherwise it is expected to be in the proto binary format.
	--dump_fetch_nodes=false         	bool	If set, only flags related to fetches are processed, and the resulting fetch nodes will be dumped to stdout in a comma-separated list.  Typically used to format arguments for other tools, e.g. freeze_graph.
	--target_triple="x86_64-pc-linux"	string	Target platform, similar to the clang -target flag.  The general format is <arch><sub>-<vendor>-<sys>-<abi>.  http://clang.llvm.org/docs/CrossCompilation.html#target-triple.
	--target_cpu=""                  	string	Target cpu, similar to the clang -mcpu flag.  http://clang.llvm.org/docs/CrossCompilation.html#cpu-fpu-abi
	--target_features=""             	string	Target features, e.g. +avx2, +neon, etc.
	--entry_point="entry"            	string	Name of the generated function.  If multiple generated object files will be linked into the same binary, each will need a unique entry point.
	--cpp_class=""                   	string	Name of the generated C++ class, wrapping the generated function.  The syntax of this flag is [[<optional_namespace>::],...]<class_name>.  This mirrors the C++ syntax for referring to a class, where multiple namespaces may precede the class name, separated by double-colons.  The class will be generated in the given namespace(s), or if no namespaces are given, within the global namespace.
	--out_function_object="out_model.o"	string	Output object file containing the generated function for the TensorFlow model.
	--out_header="out.h"             	string	Output header file name.
	--out_metadata_object="out_helper.o"	string	Output object file name containing optional metadata for the generated function.
	--out_session_module=""          	string	Output session module proto.
	--gen_name_to_index=false        	bool	Generate name-to-index data for Lookup{Arg,Result}Index methods.
	--gen_program_shape=false        	bool	Generate program shape data for the ProgramShape method.
	--xla_generate_hlo_graph=""      	string	HLO modules matching this regex will be dumped to a .dot file throughout various stages in compilation.
	--xla_hlo_graph_addresses=false  	bool	With xla_generate_hlo_graph, show addresses of HLO ops in graph dump.
	--xla_hlo_graph_path=""          	string	With xla_generate_hlo_graph, dump the graphs into this path.
	--xla_hlo_dump_as_graphdef=false 	bool	Dump HLO graphs as TensorFlow GraphDefs.
	--xla_hlo_graph_sharding_color=false	bool	Assign colors based on sharding assignments when generating the HLO graphs.
	--xla_hlo_tfgraph_device_scopes=false	bool	When generating TensorFlow HLO graphs, if the HLO instructions are assigned to a specific device, prefix the name scope with "devX" with X being the device ordinal.
	--xla_log_hlo_text=""            	string	HLO modules matching this regex will be dumped to LOG(INFO).
	--xla_generate_hlo_text_to=""    	string	Dump all HLO modules as text into the provided directory path.
	--xla_enable_fast_math=true      	bool	Enable unsafe fast-math optimizations in the compiler; this may produce faster code at the expense of some accuracy.
	--xla_llvm_enable_alias_scope_metadata=true	bool	In LLVM-based backends, enable the emission of !alias.scope metadata in the generated IR.
	--xla_llvm_enable_noalias_metadata=true	bool	In LLVM-based backends, enable the emission of !noalias metadata in the generated IR.
	--xla_llvm_enable_invariant_load_metadata=true	bool	In LLVM-based backends, enable the emission of !invariant.load metadata in the generated IR.
	--xla_llvm_disable_expensive_passes=false	bool	In LLVM-based backends, disable a custom set of expensive optimization passes.
	--xla_backend_optimization_level=3	int32	Numerical optimization level for the XLA compiler backend.
	--xla_disable_hlo_passes=""      	string	Comma-separated list of hlo passes to be disabled. These names must exactly match the passes' names; no whitespace around commas.
	--xla_embed_ir_in_executable=false	bool	Embed the compiler IR as a string in the executable.
	--xla_dump_ir_to=""              	string	Dump the compiler IR into this directory as individual files.
	--xla_eliminate_hlo_implicit_broadcast=true	bool	Eliminate implicit broadcasts when lowering user computations to HLO instructions; use explicit broadcast instead.
	--xla_cpu_multi_thread_eigen=true	bool	When generating calls to Eigen in the CPU backend, use multi-threaded Eigen mode.
	--xla_gpu_cuda_data_dir="./cuda_sdk_lib"	string	If non-empty, speficies a local directory containing ptxas and nvvm libdevice files; otherwise we use those from runfile directories.
	--xla_gpu_ftz=false              	bool	If true, flush-to-zero semantics are enabled in the code generated for GPUs.
	--xla_gpu_disable_multi_streaming=false	bool	If true, multi-streaming in the GPU backend is disabled.
	--xla_dump_optimized_hlo_proto_to=""	string	Dump Hlo after all hlo passes are executed as proto binary into this directory.
	--xla_dump_unoptimized_hlo_proto_to=""	string	Dump HLO before any hlo passes are executed as proto binary into this directory.
	--xla_dump_per_pass_hlo_proto_to=""	string	Dump HLO after each pass as an HloProto in binary file format into this directory.
	--xla_test_all_output_layouts=false	bool	Let ClientLibraryTestBase::ComputeAndCompare* test all permutations of output layouts. For example, with a 3D shape, all permutations of the set {0, 1, 2} are tried.
	--xla_test_all_input_layouts=false	bool	Let ClientLibraryTestBase::ComputeAndCompare* test all permutations of *input* layouts. For example, for 2 input arguments with 2D shape and 4D shape, the computation will run 2! * 4! times for every possible layouts
	--xla_hlo_profile=false          	bool	Instrument the computation to collect per-HLO cycle counts
	--xla_dump_computations_to=""    	string	Dump computations that XLA executes into the provided directory path
	--xla_dump_executions_to=""      	string	Dump parameters and results of computations that XLA executes into the provided directory path
	--xla_backend_extra_options=""   	string	Extra options to pass to a backend; comma-separated list of 'key=val' strings (=val may be omitted); no whitespace around commas.
	--xla_reduce_precision=""        	string	Directions for adding reduce-precision operations. Format is 'LOCATION=E,M:OPS;NAMES' where LOCATION is the class of locations in which to insert the operations (e.g., 'OP_OUTPUTS'), E and M are the exponent and matissa bit counts respectively, and OPS and NAMES are comma-separated (no spaces) lists of the operation types and names to which to attach the reduce-precision operations.  The NAMES string and its preceding ';' may be omitted.  This option may be repeated to define multiple sets of added reduce-precision operations.
	--xla_gpu_use_cudnn_batchnorm=false	bool	Allows the GPU backend to implement batchnorm HLOs using cudnn, rather than expanding them to a soup of HLOs.
Target //tensorflow/compiler/aot/tests:feed_binary failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 12.475s, Critical Path: 3.04s
FAILED: Build did NOT complete successfully
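
For completeness, this is roughly how I plan to call the compiled model once the build succeeds, following the usage pattern from the AOT tutorial. The header path and the arg/result accessors below are assumptions (they are generated per feed/fetch, so mine would differ):

#include "third_party/eigen3/unsupported/Eigen/CXX11/Tensor"
#include "my_wide_deep_model.h"  // header generated by tf_library; placeholder name

int main() {
  // The generated code runs its Eigen kernels on a thread pool.
  Eigen::ThreadPool tp(2);
  Eigen::ThreadPoolDevice device(&tp, tp.NumThreads());

  mynamespace::MyComputation computation;
  computation.set_thread_pool(&device);

  // Fill the arg buffer(s), run, then read the result buffer(s).
  // computation.set_arg0_data(...);   // dtype/shape come from the feed config
  computation.Run();
  // const float* probs = computation.result0_data();  // fetch: probabilities
  return 0;
}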
