Derby metastore directory is created in the Spark workspace

Posted by mrwjdhj3 on 2021-06-26 in Hive

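I have installed Spark 2.1.0 and integrated it with Eclipse and Hive 2. The metastore is configured in MySQL, and I have also placed the hive-site.xml file in the spark >> conf folder. I am trying to access tables that already exist in Hive from Eclipse. When I run the program, a metastore folder and a derby.log file are created in the Spark workspace, and the Eclipse console shows the following: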

Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
17/06/13 18:26:43 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
17/06/13 18:26:43 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
17/06/13 18:26:43 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
17/06/13 18:26:43 INFO Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
17/06/13 18:26:43 INFO Query: Reading in results for query "org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection used is closing
17/06/13 18:26:43 INFO MetaStoreDirectSql: Using direct SQL, underlying DB is MYSQL

Spark is unable to locate the configured MySQL metastore database.
It also throws this error:

Exception in thread "main" java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState':

Code:

import org.apache.spark.SparkContext, org.apache.spark.SparkConf
import com.typesafe.config._
import org.apache.spark.sql.Row
import org.apache.spark.sql.SparkSession

object hivecore {

  def main(args: Array[String]): Unit = {

    val warehouseLocation = "hdfs://HADOOPMASTER:54310/user/hive/warehouse"

    val spark = SparkSession
      .builder().master("local[*]")
      .appName("hivecore")
      .config("spark.sql.warehouse.dir", warehouseLocation)
      .enableHiveSupport()
      .getOrCreate()

    import spark.implicits._
    import spark.sql

    sql("SELECT * FROM sample.source").show()

  }
}

build.sbt

libraryDependencies += "org.apache.spark" % "spark-core_2.11" % "2.1.0"
libraryDependencies += "com.typesafe" % "config" % "1.3.0" 
libraryDependencies += "org.apache.spark" % "spark-sql_2.11" % "2.1.0"
libraryDependencies += "org.apache.spark" % "spark-hive_2.11" % "2.1.0"
libraryDependencies += "mysql" % "mysql-connector-java" % "5.1.42"

Note: I am able to access the Hive tables from spark-shell.
Thanks.

Answer 1, by 7vux5j2d

When you call context.setMaster(local), it may not look up the Spark configuration you set up on the cluster, especially when you launch the program from Eclipse.
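One way to test this from inside Eclipse is to check whether hive-site.xml is visible on the application's classpath at all. spark-shell and spark-submit put Spark's conf directory on the classpath, but a program started directly from an IDE generally sees only what the project itself provides. The sketch below uses only the JVM standard library; the names ClasspathCheck and locate are hypothetical and do not come from the original post.

```scala
// Hypothetical diagnostic helper, not part of the original post.
// It reports whether a resource such as hive-site.xml can be found
// on the classpath of the running JVM.
object ClasspathCheck {

  // Returns the resolved location of the resource, or None if it is missing.
  def locate(name: String): Option[String] = {
    val loader = Thread.currentThread().getContextClassLoader
    Option(loader.getResource(name)).map(_.toString)
  }

  def main(args: Array[String]): Unit = {
    locate("hive-site.xml") match {
      case Some(url) => println(s"hive-site.xml found at $url")
      case None      => println("hive-site.xml is NOT on the classpath")
    }
  }
}
```

If the file is reported as missing, one thing to try (an assumption, not something confirmed in this thread) is to copy hive-site.xml into a folder that the project puts on its classpath, such as src/main/resources in an sbt layout, and run the program again.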
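Build a jar from your application and run it from the command line as spark-submit --class <main class package> --master spark://207.184.161.138:7077 --deploy-mode client. The master address spark://207.184.161.138:7077 should be replaced with your cluster's IP and Spark port.
Also, remember to initialize a HiveContext to run queries against the underlying Hive.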

import org.apache.spark.sql.hive.HiveContext

val hc = new HiveContext(sc)  // sc is an existing SparkContext
hc.sql("SELECT * FROM ...")
