How to run Hive with the Spark execution engine (Apache Hive version 2.1.1 and Apache Spark version 2.2.0)

Posted by gcuhipw9 on 2021-06-26 in Hive


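We have switched the Hive execution engine from MapReduce to Spark and are trying to run queries through beeline as well as JDBC.
We are able to run simple queries (for example: select * from table), because they do not need to process any data, but when we try to run queries that contain aggregate functions (for example: select count(*) from table), we get the following error: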

Query ID = hadoop_20180105123047_5bcd0d7a-78bd-4b66-b5fb-fc430726c2a9
Total jobs = 1
Launching Job 1 out of 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Failed to execute spark task, with exception 'org.apache.hadoop.hive.ql.metadata.HiveException(Failed to create spark client.)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.spark.SparkTask

What could be wrong?
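
For context, the post does not show how the engine was switched. A minimal sketch of what such a session might look like in beeline is given below. The property names are standard Hive on Spark settings, but the values (`yarn`, `2g`) and the table name `my_table` are assumptions for illustration, not details from the original post.

```sql
-- Assumed session-level settings; the original post does not list its configuration.
set hive.execution.engine=spark;   -- switch from MapReduce (mr) to Spark
set spark.master=yarn;             -- assumption: Spark runs on a YARN cluster
set spark.executor.memory=2g;      -- illustrative value only
set spark.serializer=org.apache.spark.serializer.KryoSerializer;

-- A plain fetch like this can be served without launching a job:
select * from my_table;            -- my_table is a hypothetical table name

-- An aggregate needs a Spark job, so this is where Hive has to create
-- the Spark client and where the error above would appear:
select count(*) from my_table;
```

The same settings can also be placed in hive-site.xml so that they apply to every beeline and JDBC session rather than to a single one.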

Hive apache-spark-sql

Source: https://stackoverflow.com/questions/48108920/how-to-run-hive-with-spark-execution-engine-apache-hive-version-2-1-1-and-apach