用spoop导入Parquet格式的clob数据

1mrurvl1  于 2021-06-03  发布在  Sqoop
关注(0)|答案(1)|浏览(472)

我正在尝试以Parquet格式导入clob数据,下面是我的命令行:

sshpass -p ${MDP_MAPR} ssh -n ${USR_MAPR}@${CNX_MAPR} sqoop import -Dmapred.job.queue.name=root.leasing.dev --connect ${CNX_DB} --username ${USR_DB} --password ${MDP_DB} --query "${query}" --delete-target-dir --target-dir ${DST_HDFS}/${SOURCE}_${table} --hive-overwrite --hive-import --hive-table ${SOURCE}_${table} --hive-database ${DST_HIVE} --hive-drop-import-delims  -m 1 ${DRIVER_DB} --as-parquetfile >>${ficTrace} 2>&1

但它不起作用,我也找不出原因,这是我从它的执行过程中得到的日志:

Warning: /opt/mapr/sqoop/sqoop-1.4.6/bin/../../accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
18/07/09 14:44:42 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6-mapr-1703
18/07/09 14:44:42 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
18/07/09 14:44:42 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
18/07/09 14:44:42 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/mapr/lib/slf4j-log4j12-1.7.12.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/mapr/hive/hive-2.1/lib/log4j-slf4j-impl-2.4.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
18/07/09 14:44:43 INFO oracle.OraOopManagerFactory: Data Connector for Oracle and Hadoop is disabled.
18/07/09 14:44:43 INFO manager.SqlManager: Using default fetchSize of 1000
18/07/09 14:44:43 INFO tool.CodeGenTool: Beginning code generation
18/07/09 14:44:44 INFO manager.OracleManager: Time zone has been set to GMT
18/07/09 14:44:44 INFO manager.SqlManager: Executing SQL statement: select * from doe.DE_DECISIONS where  (1 = 0)
18/07/09 14:44:44 INFO manager.SqlManager: Executing SQL statement: select * from doe.DE_DECISIONS where  (1 = 0)
18/07/09 14:44:44 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /opt/mapr/hadoop/hadoop-2.7.0
Note: /tmp/sqoop-mapr/compile/2b49a98afbeb2ac1135adc84c66cf092/QueryResult.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
18/07/09 14:44:48 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-mapr/compile/2b49a98afbeb2ac1135adc84c66cf092/QueryResult.jar
18/07/09 14:44:53 INFO tool.ImportTool: Destination directory /app/list/datum/data/calf_hors_prod-cluster/datum/dev/leasing/tmp_sqoop/DE_DECISIONS is not present, hence not deleting.
18/07/09 14:44:53 INFO mapreduce.ImportJobBase: Beginning query import.
18/07/09 14:44:53 INFO Configuration.deprecation: mapred.jar is deprecated. Instead, use mapreduce.job.jar
18/07/09 14:44:53 INFO mapreduce.JobBase: Setting default value for hadoop.job.history.user.location=none
18/07/09 14:44:53 INFO manager.OracleManager: Time zone has been set to GMT
18/07/09 14:44:53 INFO manager.SqlManager: Executing SQL statement: select * from doe.DE_DECISIONS where  (1 = 0)
18/07/09 14:44:53 INFO manager.SqlManager: Executing SQL statement: select * from doe.DE_DECISIONS where  (1 = 0)
18/07/09 14:44:54 ERROR tool.ImportTool: Imported Failed: Cannot convert SQL type 2005

谢谢你的帮助。

mbzjlibv

mbzjlibv1#

您可以尝试在sqoop命令末尾添加以下内容:

--map-column-java <ORACLE_CLOB_COLUMN_NAME>=String

例如,如果oracle表有一个名为body的clob类型的列,请在末尾添加以下内容:

--map-column-java BODY=String

这将为sqoop提供关于oracle clob类型到java类型Map的指导。
如果有多列,可以使用以下语法模式:

--map-column-java <ORACLE_CLOB_COLUMN_NAME_1>=String,<ORACLE_CLOB_COLUMN_NAME_2>=String

相关问题