Spark DataFrame insertInto a Hive table fails because some staging part files are created with username mapr

mf98qq94 posted on 2021-05-29 in Hadoop

I am inserting into a Hive table from a Spark DataFrame. Even though the application is submitted as user "myuser", some of the Hive staging part files are created as user "mapr". As a result, the final write to the Hive table fails with an access-denied error when the staging files are renamed. Command:
resultDF.write.mode("append").insertInto(insTable)
Error:
Exception in thread "main" org.apache.hadoop.security.AccessControlException: User myuser (user id 2547) has been denied access to rename /ded/data/db/da_mydb.db/managed/da_primary/.hive-staging_hive_2017-12-27_13-25-22_586_3120774356819313410-1/-ext-10000/_temporary/0/task_201712271525_0080_m_000000/part-00000 to /ded/data/db/da_mydb.db/managed/da_primary/.hive-staging_hive_2017-12-27_13-25-22_586_3120774356819313410-1/-ext-10000/part-00000
    at com.mapr.fs.MapRFileSystem.rename(MapRFileSystem.java:1112)
    at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.mergePaths(FileOutputCommitter.java:461)
    at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.mergePaths(FileOutputCommitter.java:475)
    at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.commitJobInternal(FileOutputCommitter.java:392)
    at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.commitJob(FileOutputCommitter.java:364)
    at org.apache.hadoop.mapred.FileOutputCommitter.commitJob(FileOutputCommitter.java:136)
    at org.apache.spark.sql.hive.SparkHiveWriterContainer.commitJob(hiveWriterContainers.scala:108)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.saveAsHiveFile(InsertIntoHiveTable.scala:85)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult$lzycompute(InsertIntoHiveTable.scala:201)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult(InsertIntoHiveTable.scala:127)
    at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.doExecute(InsertIntoHiveTable.scala:276)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:55)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:55)
    at org.apache.spark.sql.DataFrameWriter.insertInto(DataFrameWriter.scala:189)
    at org.apache.spark.sql.DataFrameWriter.insertInto(DataFrameWriter.scala:166)
    at com.iri.suppchain.RunKeying$.execXform(RunKeying.scala:74)
    at com.iri.suppchain.RunKeying$$anonfun$1.apply(RunKeying.scala:36)
    at com.iri.suppchain.RunKeying$$anonfun$1.apply(RunKeying.scala:36)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at com.iri.suppchain.RunKeying$delayedInit$body.apply(RunKeying.scala:36)
    at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
    at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
Environment details:
Spark 1.6.1
Distribution: MapR
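For context, here is a minimal sketch of the failing write path on Spark 1.6. The database and table names are taken from the paths in the stack trace; the source query and object name are illustrative assumptions, not the actual job:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object InsertRepro {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("InsertRepro"))
    val sqlContext = new HiveContext(sc)

    // Hypothetical source query; the real job builds resultDF upstream.
    val resultDF = sqlContext.sql("SELECT * FROM da_mydb.some_source")

    // Fails at job commit: FileOutputCommitter renames part files out of
    // the .hive-staging_* directory, and MapR-FS denies the rename because
    // some part files were created as user "mapr" rather than "myuser".
    resultDF.write.mode("append").insertInto("da_mydb.da_primary")
  }
}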

1mrurvl1


Try the approach below and report back:

// Register the DataFrame as a temp table, then insert through HiveQL
resultDF.registerTempTable("results_tbl")
sqlContext.sql("INSERT INTO TABLE insTable SELECT * FROM results_tbl")
