I have some code that saves a SchemaRDD (`person`) into a Hive table stored as Parquet (`person_parquet`):
hiveContext.sql("insert overwrite table person_parquet select * from person")
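For context, the surrounding setup is roughly like the sketch below (the table names come from the question; the variable names and wiring are assumptions):

```scala
// Sketch of the setup implied by the question (Spark 1.2 APIs).
// `sc` (the SparkContext) and the `person` SchemaRDD are assumed to exist.
import org.apache.spark.sql.hive.HiveContext

val hiveContext = new HiveContext(sc)

// The SchemaRDD must be registered as a temp table named "person"
// for the SELECT below to resolve it.
person.registerTempTable("person")

// Write into the Parquet-backed Hive table; this is the call that
// fails with the ClassNotFoundException in the stack trace.
hiveContext.sql("insert overwrite table person_parquet select * from person")
```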
But it throws an error:
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdConfOnlyAuthorizerFactory
at org.apache.hadoop.hive.ql.session.SessionState.setupAuth(SessionState.java:399)
at org.apache.hadoop.hive.ql.session.SessionState.getAuthenticator(SessionState.java:867)
at org.apache.hadoop.hive.ql.session.SessionState.getUserFromAuthenticator(SessionState.java:589)
at org.apache.hadoop.hive.ql.metadata.Table.getEmptyTable(Table.java:174)
at org.apache.hadoop.hive.ql.metadata.Table.<init>(Table.java:116)
at org.apache.hadoop.hive.ql.metadata.Hive.newTable(Hive.java:2566)
at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:917)
at org.apache.hadoop.hive.ql.metadata.Hive.loadTable(Hive.java:1464)
at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult$lzycompute(InsertIntoHiveTable.scala:243)
at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.sideEffectResult(InsertIntoHiveTable.scala:137)
at org.apache.spark.sql.execution.Command$class.execute(commands.scala:46)
at org.apache.spark.sql.hive.execution.InsertIntoHiveTable.execute(InsertIntoHiveTable.scala:51)
at org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:425)
at org.apache.spark.sql.SQLContext$QueryExecution.toRdd(SQLContext.scala:425)
at org.apache.spark.sql.SchemaRDDLike$class.$init$(SchemaRDDLike.scala:58)
at org.apache.spark.sql.SchemaRDD.<init>(SchemaRDD.scala:108)
at org.apache.spark.sql.hive.HiveContext.sql(HiveContext.scala:94)
at com.example.KafkaConsumer$$anonfun$main$2.apply(KafkaConsumer.scala:114)
at com.example.KafkaConsumer$$anonfun$main$2.apply(KafkaConsumer.scala:83)
at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1.apply(DStream.scala:529)
at org.apache.spark.streaming.dstream.DStream$$anonfun$foreachRDD$1.apply(DStream.scala:529)
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply$mcV$sp(ForEachDStream.scala:42)
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
at org.apache.spark.streaming.dstream.ForEachDStream$$anonfun$1.apply(ForEachDStream.scala:40)
at scala.util.Try$.apply(Try.scala:161)
at org.apache.spark.streaming.scheduler.Job.run(Job.scala:32)
at org.apache.spark.streaming.scheduler.JobScheduler$JobHandler.run(JobScheduler.scala:171)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdConfOnlyAuthorizerFactory
at org.apache.hadoop.hive.ql.metadata.HiveUtils.getAuthorizeProviderManager(HiveUtils.java:376)
at org.apache.hadoop.hive.ql.session.SessionState.setupAuth(SessionState.java:381)
... 29 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdConfOnlyAuthorizerFactory
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:274)
at org.apache.hadoop.hive.ql.metadata.HiveUtils.getAuthorizeProviderManager(HiveUtils.java:366)
... 30 more
I changed my hive-site.xml to the following, but it still throws the same exception:
<property>
  <name>hive.security.authenticator.manager</name>
  <value>org.apache.hadoop.hive.ql.security.HadoopDefaultAuthenticator</value>
</property>
<property>
  <name>hive.security.authorization.enabled</name>
  <value>false</value>
</property>
<property>
  <name>hive.security.authorization.manager</name>
  <value>org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider</value>
</property>
(Same hive-site.xml as #1.) When I added hive-exec 1.0 to the dependencies, it threw a different exception (AbstractMethodError).
(Same hive-site.xml as #1.) I also tried adding hive-exec 0.13 to the dependencies. It still threw an error on the first run (insert), but the second and subsequent inserts succeeded.
I am using Sandbox HDP 2.2 (Hive 0.14.0.2.2.0.0-2041) and Spark 1.2.0.
Dependencies:
<dependency>
<groupId>org.apache.hive</groupId>
<artifactId>hive-exec</artifactId>
<version>0.13.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-core_2.10</artifactId>
<version>1.2.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming_2.10</artifactId>
<version>1.2.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-streaming-kafka_2.10</artifactId>
<version>1.2.0</version>
</dependency>
<dependency>
<groupId>org.apache.spark</groupId>
<artifactId>spark-hive_2.10</artifactId>
<version>1.2.0</version>
</dependency>
3 Answers

fgw7neuy1#
The SQLStdConfOnlyAuthorizerFactory class was added in Hive 0.14.0 (HIVE-8045), but Spark 1.2 depends on Hive 0.13. Your hive-site.xml must have "hive.security.authorization.manager" set to "org.apache.hadoop.hive.ql.security.authorization.plugin.sqlstd.SQLStdConfOnlyAuthorizerFactory", and your classpath does not contain the Hive 0.14 hive-exec jar, which is why it throws ClassNotFoundException. So either put the hive-exec 0.14.0 jar on your classpath (ahead of Spark's own Hive jars), or change that entry in your hive-site.xml to an authorization provider that exists in Hive 0.13.
The former is not recommended, since the Hive version mismatch can cause further problems of this kind.
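A sketch of what the replacement entry might look like in hive-site.xml, assuming the stock DefaultHiveAuthorizationProvider that ships with Hive 0.13:

```xml
<property>
  <name>hive.security.authorization.manager</name>
  <value>org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider</value>
</property>
```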
g0czyy6m2#
Changing the value

hive.security.authorization.manager = org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider

in hive-site.xml worked.
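If editing hive-site.xml is inconvenient, the same value can likely also be set programmatically before the insert, via setConf on the context (a sketch; `hiveContext` is the context from the question, and setConf is part of Spark SQL's SQLContext API):

```scala
// Assumes the hiveContext from the question; setConf overrides the
// corresponding value from hive-site.xml for this context.
hiveContext.setConf(
  "hive.security.authorization.manager",
  "org.apache.hadoop.hive.ql.security.authorization.DefaultHiveAuthorizationProvider")

hiveContext.sql("insert overwrite table person_parquet select * from person")
```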
oyxsuwqo3#
I think this was caused by duplicate jars on the classpath.