Apache Spark not connecting to Hive metastore (database not found)

6ss1mwsb  posted on 2021-05-27 in Hadoop

I have a Java Spark job in which I am trying to connect to a Hive database. However, it only sees the default database, and querying any other database throws a "no such database" exception. I have tried the following ways of pointing Spark at the Hive metastore:

adding the Hive metastore URI to the Spark conf in code
passing the conf through spark-submit
adding hive-site.xml to the resources folder
copying hive-site.xml into the Spark conf directory (/etc/spark2/conf/hive-site.xml)

Also, the Hive config file loaded at runtime is the same as /etc/hive/conf/hive-site.xml.
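For reference, this is roughly what my hive-site.xml contains (a minimal sketch; the host, port, and warehouse path below are placeholders, not my real values):

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Thrift endpoint of the Hive metastore service -->
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://***:1234</value>
  </property>
  <!-- Warehouse location used by the metastore -->
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/apps/hive/warehouse</value>
  </property>
</configuration>
```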

import java.util.List;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

SparkConf sparkConf = new SparkConf();
sparkConf.setAppName("example");
JavaSparkContext sc = new JavaSparkContext(sparkConf);
final SparkSession spark = SparkSession
                .builder()
                .appName("Java Spark Hive Example")
                .config("hive.metastore.uris", "thrift://***:1234")
                .config("spark.sql.uris", "thrift://***:1234")
                .config("hive.metastore.warehouse.dir", "hdfs://***:1234/user/hive/warehouse/")
                .enableHiveSupport()
                .getOrCreate();

// 'sample' is a List<sampleClass> populated earlier (omitted here)
JavaRDD<sampleClass> rdd = sc.parallelize(sample);

Dataset<Row> df2 = spark.createDataFrame(rdd, sampleClass.class);

spark.sql("show databases").show();
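As a sanity check (a sketch, not part of the original job), the active catalog implementation can be printed at runtime; if it comes back as "in-memory" rather than "hive", then enableHiveSupport() / hive-site.xml was not actually picked up:

```java
// Requires the active SparkSession 'spark' built above.
// "hive" means the Hive catalog is in use; "in-memory" means it is not.
System.out.println(spark.conf().get("spark.sql.catalogImplementation"));

// Lists databases through the catalog API, same result as `show databases`.
spark.catalog().listDatabases().show();
```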

The spark-submit command and its log output are below.

spark-submit --class sampleClass \
> --master local --deploy-mode client --executor-memory 1g \
> --name sparkTest --conf "spark.app.id=SampleLoad" \
> --files /etc/spark/conf/hive-site.xml load-1.0-SNAPSHOT-all.jar
20/03/16 12:33:19 INFO SparkContext: Running Spark version 2.3.0.2.6.5.0-292
20/03/16 12:33:19 INFO SparkContext: Submitted application: SampleLoad
20/03/16 12:33:19 INFO SecurityManager: Changing view acls to: root,User
20/03/16 12:33:19 INFO SecurityManager: Changing modify acls to: root,User
20/03/16 12:33:19 INFO SecurityManager: Changing view acls groups to:
20/03/16 12:33:19 INFO SecurityManager: Changing modify acls groups to:
20/03/16 12:33:19 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(root, User); groups with view permissions: Set(); users  with modify permissions: Set(root, User); groups with modify permissions: Set()
20/03/16 12:33:19 INFO Utils: Successfully started service 'sparkDriver' on port 35746.
20/03/16 12:33:19 INFO SparkEnv: Registering MapOutputTracker
20/03/16 12:33:19 INFO SparkEnv: Registering BlockManagerMaster
20/03/16 12:33:19 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
20/03/16 12:33:19 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
20/03/16 12:33:19 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-b946b14f-a52d-4467-8028-503ed7ae93da
20/03/16 12:33:19 INFO MemoryStore: MemoryStore started with capacity 366.3 MB
20/03/16 12:33:19 INFO SparkEnv: Registering OutputCommitCoordinator
20/03/16 12:33:19 INFO Utils: Successfully started service 'SparkUI' on port 4042.
20/03/16 12:33:19 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://sample:4042
20/03/16 12:33:19 INFO SparkContext: Added JAR file:/abc/xyz/load-1.0-SNAPSHOT-all.jar at spark://sample:35746/jars/load-1.0-SNAPSHOT-all.jar with timestamp 1584347599756
20/03/16 12:33:19 INFO SparkContext: Added file file:///etc/spark/conf/hive-site.xml at file:///etc/spark/conf/hive-site.xml with timestamp 1584347599776
20/03/16 12:33:19 INFO Utils: Copying /etc/spark/conf/hive-site.xml to /tmp/spark-914265c5-6115-4aca-8b85-2cd49a530fae/userFiles-aaca5153-ce38-489a-a020-c2477fddc66e/hive-site.xml
20/03/16 12:33:19 INFO Executor: Starting executor ID driver on host localhost
20/03/16 12:33:19 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 45179.
20/03/16 12:33:19 INFO NettyBlockTransferService: Server created on sample:45179
20/03/16 12:33:19 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
20/03/16 12:33:19 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, sample, 45179, None)
20/03/16 12:33:19 INFO BlockManagerMasterEndpoint: Registering block manager sample:45179 with 366.3 MB RAM, BlockManagerId(driver, lhdpegde2u.enbduat.com, 45179, None)
20/03/16 12:33:19 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, sample, 45179, None)
20/03/16 12:33:19 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, sample, 45179, None)
20/03/16 12:33:20 INFO EventLoggingListener: Logging events to hdfs:/spark2-history/local-1584347599812
20/03/16 12:33:20 WARN SparkContext: Using an existing SparkContext; some configuration may not take effect.
20/03/16 12:33:20 INFO SharedState: loading hive config file: file:/etc/spark2/2.6.5.0-292/0/hive-site.xml
20/03/16 12:33:21 INFO SharedState: spark.sql.warehouse.dir is not set, but hive.metastore.warehouse.dir is set. Setting spark.sql.warehouse.dir to the value of hive.metastore.warehouse.dir ('/apps/hive/warehouse').
20/03/16 12:33:21 INFO SharedState: Warehouse path is '/apps/hive/warehouse'.
20/03/16 12:33:21 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
20/03/16 12:33:22 INFO CodeGenerator: Code generated in 184.728545 ms
20/03/16 12:33:23 INFO CodeGenerator: Code generated in 10.538159 ms
20/03/16 12:33:23 INFO CodeGenerator: Code generated in 8.809847 ms
+-------+----------------+--------------------+
|   name|     description|         locationUri|
+-------+----------------+--------------------+
|default|default database|/apps/hive/warehouse|
+-------+----------------+--------------------+

20/03/16 12:33:23 INFO CodeGenerator: Code generated in 7.13541 ms
20/03/16 12:33:23 INFO CodeGenerator: Code generated in 5.771691 ms
+------------+
|databaseName|
+------------+
|     default|
+------------+

Exception in thread "main" org.apache.spark.sql.catalyst.analysis.NoSuchDatabaseException: Database 'sample' not found;
        at org.apache.spark.sql.catalyst.catalog.SessionCatalog.org$apache$spark$sql$catalyst$catalog$SessionCatalog$$requireDbExists(SessionCatalog.scala:177)
        at org.apache.spark.sql.catalyst.catalog.SessionCatalog.setCurrentDatabase(SessionCatalog.scala:259)
        at org.apache.spark.sql.execution.command.SetDatabaseCommand.run(databases.scala:59)
        at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
        at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
        at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
        at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
        at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
        at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3253)
        at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
        at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3252)
        at org.apache.spark.sql.Dataset.<init>(Dataset.scala:190)
        at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75)
        at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:638)
        at ProcessXML.main(ProcessXML.java:95)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:906)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
20/03/16 12:33:23 INFO SparkContext: Invoking stop() from shutdown hook
20/03/16 12:33:23 INFO SparkUI: Stopped Spark web UI at http://sample:4042
20/03/16 12:33:24 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
20/03/16 12:33:24 INFO MemoryStore: MemoryStore cleared
20/03/16 12:33:24 INFO BlockManager: BlockManager stopped
20/03/16 12:33:24 INFO BlockManagerMaster: BlockManagerMaster stopped
20/03/16 12:33:24 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
20/03/16 12:33:24 INFO SparkContext: Successfully stopped SparkContext
20/03/16 12:33:24 INFO ShutdownHookManager: Shutdown hook called
20/03/16 12:33:24 INFO ShutdownHookManager: Deleting directory /tmp/spark-37386c3b-855a-4e09-a372-e8d12a08eebc
20/03/16 12:33:24 INFO ShutdownHookManager: Deleting directory /tmp/spark-914265c5-6115-4aca-8b85-2cd49a530fae
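When passing the conf through spark-submit (the second approach listed above), the command looked roughly like this; spark.sql.catalogImplementation=hive and the spark.hadoop.* prefix are standard Spark mechanisms, but the host and port here are placeholders:

```shell
spark-submit --class sampleClass \
  --master local --deploy-mode client \
  --conf spark.sql.catalogImplementation=hive \
  --conf "spark.hadoop.hive.metastore.uris=thrift://***:1234" \
  --files /etc/spark/conf/hive-site.xml \
  load-1.0-SNAPSHOT-all.jar
```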

Please let me know where I am going wrong.
Thanks in advance,
Gautham

No answers yet.
