sqoop导入错误:用户null不属于org.apache.hadoop.hdfs.server.namenode.fsdirattrop.setowner(fsdirattrop)上的hadoopjava:89)

dpiehjr4  于 2021-06-03  发布在  Sqoop
关注(0)|答案(4)|浏览(352)

我使用的是hdp2.6沙盒。我在hdfs组下创建了一个用户根为user的用户空间,执行以下sqoop配置单元导入时遇到以下2个错误:
失败,出现异常org.apache.hadoop.security.accesscontrolexception:位于org.apache.hadoop.hdfs.server.namenode.fsdirattrop.setowner(fsdirattrop)的用户null不属于hadoop。java:89)
失败:执行错误,从org.apache.hadoop.hive.ql.exec.movetask返回代码1,但数据已正确导入到配置单元表中。
请帮助我理解这个错误的意义,以及如何克服这个错误。

[root@sandbox-hdp ~]# sqoop import \
> --connect jdbc:mysql://sandbox.hortonworks.com:3306/retail_db \
> --username retail_dba \
> --password hadoop \
> --table departments \
> --hive-home /apps/hive/warehouse \
> --hive-import \
> --create-hive-table \
> --hive-table retail_db.departments \
> --target-dir /user/root/hive_import \
> --outdir java_files
Warning: /usr/hdp/2.6.3.0-235/accumulo does not exist! Accumulo imports will fail.
Please set $ACCUMULO_HOME to the root of your Accumulo installation.
18/01/14 09:42:38 INFO sqoop.Sqoop: Running Sqoop version: 1.4.6.2.6.3.0-235
18/01/14 09:42:38 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
18/01/14 09:42:38 INFO tool.BaseSqoopTool: Using Hive-specific delimiters for output. You can override
18/01/14 09:42:38 INFO tool.BaseSqoopTool: delimiters with --fields-terminated-by, etc.
18/01/14 09:42:38 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
18/01/14 09:42:38 INFO tool.CodeGenTool: Beginning code generation
18/01/14 09:42:38 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `departments` AS t LIMIT 1
18/01/14 09:42:38 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `departments` AS t LIMIT 1
18/01/14 09:42:39 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/hdp/2.6.3.0-235/hadoop-mapreduce
Note: /tmp/sqoop-root/compile/e1ec5b443f92219f1f061ad4b64cc824/departments.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
18/01/14 09:42:40 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-root/compile/e1ec5b443f92219f1f061ad4b64cc824/departments.jar
18/01/14 09:42:40 WARN manager.MySQLManager: It looks like you are importing from mysql.
18/01/14 09:42:40 WARN manager.MySQLManager: This transfer can be faster! Use the --direct
18/01/14 09:42:40 WARN manager.MySQLManager: option to exercise a MySQL-specific fast path.
18/01/14 09:42:40 INFO manager.MySQLManager: Setting zero DATETIME behavior to convertToNull (mysql)
18/01/14 09:42:40 INFO mapreduce.ImportJobBase: Beginning import of departments
18/01/14 09:42:41 INFO client.RMProxy: Connecting to ResourceManager at sandbox-hdp.hortonworks.com/172.17.0.2:8032
18/01/14 09:42:42 INFO client.AHSProxy: Connecting to Application History server at sandbox-hdp.hortonworks.com/172.17.0.2:10200
18/01/14 09:42:46 INFO db.DBInputFormat: Using read commited transaction isolation
18/01/14 09:42:46 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(`department_id`), MAX(`department_id`) FROM `departments`
18/01/14 09:42:46 INFO db.IntegerSplitter: Split size: 1; Num splits: 4 from: 2 to: 7
18/01/14 09:42:46 INFO mapreduce.JobSubmitter: number of splits:4
18/01/14 09:42:47 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1515818851132_0050
18/01/14 09:42:47 INFO impl.YarnClientImpl: Submitted application application_1515818851132_0050
18/01/14 09:42:47 INFO mapreduce.Job: The url to track the job: http://sandbox-hdp.hortonworks.com:8088/proxy/application_1515818851132_0050/
18/01/14 09:42:47 INFO mapreduce.Job: Running job: job_1515818851132_0050
18/01/14 09:42:55 INFO mapreduce.Job: Job job_1515818851132_0050 running in uber mode : false
18/01/14 09:42:55 INFO mapreduce.Job:  map 0% reduce 0%
18/01/14 09:43:05 INFO mapreduce.Job:  map 25% reduce 0%
18/01/14 09:43:09 INFO mapreduce.Job:  map 50% reduce 0%
18/01/14 09:43:12 INFO mapreduce.Job:  map 75% reduce 0%
18/01/14 09:43:14 INFO mapreduce.Job:  map 100% reduce 0%
18/01/14 09:43:14 INFO mapreduce.Job: Job job_1515818851132_0050 completed successfully
18/01/14 09:43:16 INFO mapreduce.Job: Counters: 30
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=682132
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=481
                HDFS: Number of bytes written=60
                HDFS: Number of read operations=16
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=8
        Job Counters
                Launched map tasks=4
                Other local map tasks=4
                Total time spent by all maps in occupied slots (ms)=44760
                Total time spent by all reduces in occupied slots (ms)=0
                Total time spent by all map tasks (ms)=44760
                Total vcore-milliseconds taken by all map tasks=44760
                Total megabyte-milliseconds taken by all map tasks=11190000
        Map-Reduce Framework
                Map input records=6
                Map output records=6
                Input split bytes=481
                Spilled Records=0
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=1284
                CPU time spent (ms)=5360
                Physical memory (bytes) snapshot=561950720
                Virtual memory (bytes) snapshot=8531210240
                Total committed heap usage (bytes)=176685056
        File Input Format Counters
                Bytes Read=0
        File Output Format Counters
                Bytes Written=60
18/01/14 09:43:16 INFO mapreduce.ImportJobBase: Transferred 60 bytes in 34.7351 seconds (1.7274 bytes/sec)
18/01/14 09:43:16 INFO mapreduce.ImportJobBase: Retrieved 6 records.
18/01/14 09:43:16 INFO mapreduce.ImportJobBase: Publishing Hive/Hcat import job data to Listeners
18/01/14 09:43:16 WARN mapreduce.PublishJobData: Unable to publish import data to publisher org.apache.atlas.sqoop.hook.SqoopHook
java.lang.ClassNotFoundException: org.apache.atlas.sqoop.hook.SqoopHook
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:335)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:264)
        at org.apache.sqoop.mapreduce.PublishJobData.publishJobData(PublishJobData.java:46)
        at org.apache.sqoop.mapreduce.ImportJobBase.runImport(ImportJobBase.java:284)
        at org.apache.sqoop.manager.SqlManager.importTable(SqlManager.java:692)
        at org.apache.sqoop.manager.MySQLManager.importTable(MySQLManager.java:127)
        at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:507)
        at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:615)
        at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:225)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
        at org.apache.sqoop.Sqoop.main(Sqoop.java:243)
18/01/14 09:43:16 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `departments` AS t LIMIT 1
18/01/14 09:43:16 INFO hive.HiveImport: Loading uploaded data into Hive

Logging initialized using configuration in jar:file:/usr/hdp/2.6.3.0-235/hive/lib/hive-common-1.2.1000.2.6.3.0-235.jar!/hive-log4j.properties
OK
Time taken: 10.427 seconds
Loading data to table retail_db.departments
Failed with exception org.apache.hadoop.security.AccessControlException: User null does not belong to Hadoop at org.apache.hadoop.hdfs.server.namenode.FSDirAttrOp.setOwner(FSDirAttrOp.java:89)         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setOwner(FSNamesystem.java:1873)         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setOwner(NameNodeRpcServer.java:828)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setOwner(ClientNamenodeProtocolServerSideTranslatorPB.java:476)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1866)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2347)

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
vsdwdz23

vsdwdz231#

两个想法

ClassNotFoundException: org.apache.atlas.sqoop.hook.SqoopHook

某个地方少了一节课。
我看到您正在尝试使用linux下的根帐户运行sqoop命令。确保根目录属于hdfs组。我不确定默认情况下是否包含root。

f0brbegy

f0brbegy2#

第一个错误

WARN mapreduce.PublishJobData: Unable to publish import data to publisher org.apache.atlas.sqoop.hook.SqoopHook java.lang.ClassNotFoundException: org.apache.atlas.sqoop.hook.SqoopHook

您需要检查sqoop二进制文件是否正常。最好再复制一遍,这样你就不需要一个文件一个文件地检查了。
第二个错误 Failed with exception org.apache.hadoop.security.AccessControlException: User null does not belong to Hadoop 是因为您正在使用“root”用户执行sqoop。将其更改为hadoop集群中存在的用户。

mzsu5hc0

mzsu5hc03#

有时 null 从rdbms将数据导入配置单元时,sqoop不会处理这些值,因此应使用以下键显式处理它们:

--null-string and --null-non-string

完整命令为

sqoop import --connect jdbc:mysql://sandbox.hortonworks.com:3306/retail_db --username retail_dba --password hadoop --table departments --hive-home /apps/hive/warehouse --null-string 'na' --null-non-string 'na' --hive-import --create-hive-table --hive-table retail_db.departments --target-dir /user/root/hive_import
hc8w905p

hc8w905p4#

它的发生是由于 /etc/hive/conf/hive-site.xml :

<name>hive.warehouse.subdir.inherit.perms</name>
<value>true</value>

将值设置为false并尝试运行相同的查询,
或者将--target dir/user/root/hive\u导入读/写访问目录或将其删除,它将获取hive主目录

相关问题