Hadoop version: hadoop-2.10.1
Hive version: hive-2.3.7
Spark version: spark-2.4.7
Tez engine: tez-0.9.2
I have been working with Hive using both the Spark and Tez execution engines. I created text-format tables from .dat files, then created new Parquet tables from them:
create table if not exists inventory(.......)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '|' STORED AS TEXTFILE;
LOAD DATA LOCAL INPATH '/home/hadoop/tpcds-kit/dats/inventory.dat' INTO TABLE inventory;
create table if not exists inventory_p(.....)
STORED AS PARQUET;
insert overwrite table inventory_p select * from inventory;
I can run queries on these tables with the MR and Tez engines, but the same queries fail with the Spark engine. The error is below. It occurs only on the Parquet tables, not on the tables created as text files.
Query ID = hadoop_20201231231436_849a5c3e-6987-4fb5-be62-2a74258d77d4
Total jobs = 2
Launching Job 1 out of 2
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapreduce.job.reduces=<number>
Starting Spark Job = 7ce7367f-80a0-4bd3-8949-2acc1f7d6f96
Running with YARN Application = application_1609424642389_0009
Kill Command = /home/hadoop/hadoop/bin/yarn application -kill application_1609424642389_0009
Query Hive on Spark job[0] stages: [0]
Status: Running (Hive on Spark job[0])
--------------------------------------------------------------------------------------
STAGES ATTEMPT STATUS TOTAL COMPLETED RUNNING PENDING FAILED
--------------------------------------------------------------------------------------
Stage-0 0 PENDING 4 0 0 4 0
--------------------------------------------------------------------------------------
STAGES: 00/01 [>>--------------------------] 0% ELAPSED TIME: 2.04 s
--------------------------------------------------------------------------------------
Job failed with java.lang.NoSuchMethodError: org.apache.parquet.column.values.ValuesReader.initFromPage(I[BI)V
20/12/31 23:14:52 [HiveServer2-Background-Pool: Thread-41]: ERROR SessionState: Job failed with java.lang.NoSuchMethodError: org.apache.parquet.column.values.ValuesReader.initFromPage(I[BI)V
java.util.concurrent.ExecutionException: Exception thrown by job
at org.apache.spark.JavaFutureActionWrapper.getImpl(FutureAction.scala:337)
at org.apache.spark.JavaFutureActionWrapper.get(FutureAction.scala:342)
at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:362)
at org.apache.hive.spark.client.RemoteDriver$JobWrapper.call(RemoteDriver.java:323)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
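The descriptor `(I[BI)V` in the `NoSuchMethodError` decodes to a method taking `(int, byte[], int)` and returning `void`, i.e. the caller expected `ValuesReader.initFromPage(int, byte[], int)`. This usually points to a parquet-column version mismatch: Hive 2.3.x is built against Parquet 1.8.x, where that overload exists, while Spark 2.4.x ships Parquet 1.10.x, where it was replaced by a `ByteBufferInputStream` variant. As an illustrative aid (not part of any Hive or Spark API), here is a small helper for decoding such JVM method descriptors:

```python
# Decode a JVM method descriptor such as "(I[BI)V" into readable Java types.
# Illustrative helper only -- it shows what signature the failing caller
# expected at runtime.

PRIMITIVES = {
    "B": "byte", "C": "char", "D": "double", "F": "float",
    "I": "int", "J": "long", "S": "short", "Z": "boolean", "V": "void",
}

def decode_descriptor(desc):
    """Return (param_types, return_type) for a JVM method descriptor."""
    assert desc.startswith("(")

    def read_type(i):
        dims = 0
        while desc[i] == "[":          # array dimensions
            dims += 1
            i += 1
        if desc[i] == "L":             # object type: Lpkg/Class;
            end = desc.index(";", i)
            name = desc[i + 1:end].replace("/", ".")
            i = end + 1
        else:                          # primitive type
            name = PRIMITIVES[desc[i]]
            i += 1
        return name + "[]" * dims, i

    params, i = [], 1
    while desc[i] != ")":
        t, i = read_type(i)
        params.append(t)
    ret, _ = read_type(i + 1)
    return params, ret

print(decode_descriptor("(I[BI)V"))
# -> (['int', 'byte[]', 'int'], 'void')
```

So the fix is likely to make Hive-on-Spark see a single consistent parquet-column version on the classpath, rather than anything wrong with the tables themselves.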