Inserting a Spark DataFrame into a Hive managed ACID table does not work, HDP 3.0

kmbjn2e3  posted on 2021-06-25  in  Hive

I am having trouble inserting a Spark DataFrame into a Hive table. Can anyone help me? HDP version 3.1, Spark version 2.3. Thanks in advance.
// Original code

import org.apache.spark.SparkContext;
import com.hortonworks.spark.sql.hive.llap.HiveWarehouseSessionImpl;
import org.apache.spark.sql.DataFrame
import com.hortonworks.hwc.HiveWarehouseSession;
import org.apache.spark.sql.SparkSession;

val spark = SparkSession.builder.getOrCreate()
spark.sparkContext.setLogLevel("ERROR")

val hive = com.hortonworks.spark.sql.hive.llap.HiveWarehouseBuilder.session(spark).build()

/*
Some transformation operations happened, and their output is stored in the val `result`.
*/
val result = {
  num_records
  .union(df.transform(profile(heatmap_cols2type)))
}

result.createOrReplaceTempView("out_temp"); //Create tempview

scala> result.show()
+-----+--------------------+-----------+------------------+------------+-------------------+
| type|              column|      field|             value|       order|               date|
+-----+--------------------+-----------+------------------+------------+-------------------+
|TOTAL|                 all|num_records|               737|           0|2019-12-05 18:10:12|
|  NUM|available_points_...|    present|               737|           0|2019-12-05 18:10:12|

hive.setDatabase("EXAMPLE_DB")
hive.createTable("EXAMPLE_TABLE").ifNotExists().column("`type`", "String").column("`column`", "String").column("`field`", "String").column("`value`","String").column("`order`", "bigint").column("`date`", "TIMESTAMP").create()

hive.executeUpdate("INSERT INTO TABLE EXAMPLE_DB.EXAMPLE_TABLE SELECT * FROM out_temp");

-----ERROR of original code----------------
Caused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException [Error 10001]: Line 1:86 Table not found 'out_temp'

The alternative I tried (since Hive and Spark use separate catalogs, going by the documentation on HWC write operations):
spark.sql("select type,column,field,value,order,date from out_temp").write.format("HiveWarehouseSession.HIVE_WAREHOUSE_CONNECTOR").option("table","wellington_profile").save()
-------ERROR of alternative step

xmjla07d

xmjla07d1#

----java.lang.ClassNotFoundException: Failed to find data source: HiveWarehouseSession.HIVE_WAREHOUSE_CONNECTOR. Please find packages at http://spark.apache.org/third-party-projects.html
  at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:639)
  at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:241)
  ... 58 elided
Caused by: java.lang.ClassNotFoundException: HiveWarehouseSession.HIVE_WAREHOUSE_CONNECTOR.DefaultSource
My questions are:
Is there a way to create the table directly in Hive instead of saving out_temp as a temp view in Spark? Is there a way to insert into the Hive table from a Spark DataFrame?
Thank you all for your time!
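The ClassNotFoundException above suggests that the constant's name was passed as a string literal, so Spark went looking for a data source class literally called `HiveWarehouseSession.HIVE_WAREHOUSE_CONNECTOR`. Below is a minimal sketch of an HWC write for spark-shell. It assumes the HWC jar is on the classpath and the HiveServer2 JDBC URL is configured. It also assumes that the fully qualified name below is the value behind `HiveWarehouseSession.HIVE_WAREHOUSE_CONNECTOR`, and that this connector version supports append mode and a database-qualified table name. `HwcFormat` and `writeViaHwc` are hypothetical names introduced only for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SaveMode, SparkSession}

// Assumption: this fully qualified class name is the value of the constant
// HiveWarehouseSession.HIVE_WAREHOUSE_CONNECTOR in the HDP 3.x connector jar.
val HwcFormat = "com.hortonworks.spark.sql.hive.llap.HiveWarehouseConnector"

// Hypothetical helper: append a DataFrame to an existing Hive managed table
// through HWC, so no Spark temp view has to be visible to Hive.
def writeViaHwc(df: DataFrame, table: String): Unit =
  df.write
    .format(HwcFormat)          // the data source class, not the constant's name
    .mode(SaveMode.Append)      // assumption: append is supported by this HWC version
    .option("table", table)
    .save()

val spark = SparkSession.builder.getOrCreate()

// `out_temp` is the temp view registered in the original code above.
writeViaHwc(spark.table("out_temp"), "EXAMPLE_DB.EXAMPLE_TABLE")
```

Because the write starts from the DataFrame on the Spark side, Hive never has to resolve `out_temp`, which is what the first `Table not found` error was complaining about.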

sz81bmfz

sz81bmfz2#

result.write.mode(SaveMode.Overwrite).saveAsTable("EXAMPLE_TABLE")

You can read about this in detail here

rur96b6h

rur96b6h3#

result.write.save("example_table.parquet")
