I'm running into a problem inserting a Spark DataFrame into a Hive table. Can anyone help me? HDP version 3.1, Spark version 2.3. Thanks in advance.
// Original code
import org.apache.spark.SparkContext
import com.hortonworks.spark.sql.hive.llap.HiveWarehouseSessionImpl
import org.apache.spark.sql.DataFrame
import com.hortonworks.hwc.HiveWarehouseSession
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.getOrCreate()
spark.sparkContext.setLogLevel("ERROR")
val hive = com.hortonworks.spark.sql.hive.llap.HiveWarehouseBuilder.session(spark).build()
/*
 Some transformation operations happen here; the output of the transformations
 is stored in val result.
*/
val result = {
  num_records
    .union(df.transform(profile(heatmap_cols2type)))
}
result.createOrReplaceTempView("out_temp") // create temp view
scala> result.show()
+-----+--------------------+-----------+------------------+------------+-------------------+
| type| column| field| value| order| date|
+-----+--------------------+-----------+------------------+------------+-------------------+
|TOTAL| all|num_records| 737| 0|2019-12-05 18:10:12|
| NUM|available_points_...| present| 737| 0|2019-12-05 18:10:12|
hive.setDatabase("EXAMPLE_DB")
hive.createTable("EXAMPLE_TABLE").ifNotExists()
  .column("`type`", "String")
  .column("`column`", "String")
  .column("`field`", "String")
  .column("`value`", "String")
  .column("`order`", "bigint")
  .column("`date`", "TIMESTAMP")
  .create()
hive.executeUpdate("INSERT INTO TABLE EXAMPLE_DB.EXAMPLE_TABLE SELECT * FROM out_temp")
----- Error of the original code -----
Caused by: org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException [Error 10001]: Line 1:86 Table not found 'out_temp'
The alternative I tried (since Hive and Spark use independent catalogs, based on the documentation for HWC write operations):
spark.sql("select type, column, field, value, order, date from out_temp")
  .write.format("hivewarehousesession.hive_warehouse_connector")
  .option("table", "wellington_profile")
  .save()
----- Error of the alternative approach -----
java.lang.ClassNotFoundException: Failed to find data source: hivewarehousesession.hive_warehouse_connector. Please find packages at http://spark.apache.org/third-party-projects.html
  at org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:639)
  at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:241)
  ... 58 elided
Caused by: java.lang.ClassNotFoundException: hivewarehousesession.hive_warehouse_connector.DefaultSource
My question is:
Is there a way to create the table directly in Hive instead of saving out_temp as a temp view in Spark? Is there a way to insert into a Hive table from a Spark DataFrame?
Thank you all for your time!
sz81bmfz2#
You can read about this in detail here.
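For reference, a minimal sketch of the HWC write path as described in the Hortonworks documentation for HDP 3.x, assuming the spark session, the hive HWC session, and the result DataFrame from the question. The key detail is that the format must be the connector class (available as the HiveWarehouseSession.HIVE_WAREHOUSE_CONNECTOR constant), not the literal string used in the failed attempt:

import com.hortonworks.hwc.HiveWarehouseSession

// Build the HWC session (as in the question) and point it at the target database.
val hive = HiveWarehouseSession.session(spark).build()
hive.setDatabase("EXAMPLE_DB")

// Write the DataFrame through the connector's data source; the constant resolves to
// "com.hortonworks.spark.sql.hive.llap.HiveWarehouseConnector".
result.write
  .format(HiveWarehouseSession.HIVE_WAREHOUSE_CONNECTOR)
  .option("table", "EXAMPLE_TABLE")
  .save()

With the table already created via hive.createTable as in the question, this should write into EXAMPLE_DB.EXAMPLE_TABLE without going through the out_temp view at all.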
rur96b6h3#
result.write.save("example_table.parquet")
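As a usage note, this writes Parquet files under a path rather than into a Hive table. A small sketch, reusing the path from the answer, of writing the files and reading them back in Spark:

// Write the result as Parquet files under the given path (Parquet is Spark's default format).
result.write.mode("overwrite").save("example_table.parquet")

// Read the files back into a DataFrame when needed.
val reloaded = spark.read.parquet("example_table.parquet")
reloaded.show(5)

In HDP 3.x the files written this way stay outside the Hive catalog, so making them queryable from Hive would still require HWC or an external table over the path.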