When trying to execute local Spark code with databricks-connect 13.2.0, it does not work.
I get the following error:
Error code:
- details =
"INVALID_STATE: cluster xxxxx is not Shared or Single User Cluster. (requestId=05bc3105-4828-46d4-a381-7580f3b55416)"
- debug_error_string =
"UNKNOWN:Error received from peer {grpc_message:"INVALID_STATE: cluster 0711-122239-bb999j6u is not Shared or Single User Cluster. (requestId=05bc3105-4828-46d4-a381-7580f3b55416)", grpc_status:9, created_time:"2023-07-11T15:26:08.9729+02:00"}"
The cluster is a Shared cluster, and I have tried several cluster configurations, but it does not work! The cluster runtime version is 13.2.
In addition, I am using:
- Python 3.10
- OpenJDK version "1.8.0_292"
- Azure Databricks
Has anyone run into a similar problem with the new Databricks Connect?
Thanks for any help!
I tried the following code:
from databricks.connect import DatabricksSession
from pyspark.sql.types import *
from delta.tables import DeltaTable
from datetime import date

if __name__ == "__main__":
    spark = DatabricksSession.builder.getOrCreate()

    # Create a Spark DataFrame consisting of high and low temperatures
    # by airport code and date.
    schema = StructType([
        StructField('AirportCode', StringType(), False),
        StructField('Date', DateType(), False),
        StructField('TempHighF', IntegerType(), False),
        StructField('TempLowF', IntegerType(), False)
    ])

    data = [
        ['BLI', date(2021, 4, 3), 52, 43],
        ['BLI', date(2021, 4, 2), 50, 38],
        ['BLI', date(2021, 4, 1), 52, 41],
        ['PDX', date(2021, 4, 3), 64, 45],
        ['PDX', date(2021, 4, 2), 61, 41],
        ['PDX', date(2021, 4, 1), 66, 39],
        ['SEA', date(2021, 4, 3), 57, 43],
        ['SEA', date(2021, 4, 2), 54, 39],
        ['SEA', date(2021, 4, 1), 56, 41]
    ]

    temps = spark.createDataFrame(data, schema)
    print(temps)
I expected the DataFrame from the remote Spark execution to be displayed in my local terminal.
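As a side note on the expected output: print(temps) only prints the DataFrame's schema representation, not its rows. A minimal sketch of what actually renders rows in the local terminal once the connection works, using the standard PySpark API:

# show() sends the query to the remote cluster and prints up to 20 rows
# as a text table in the local terminal.
temps.show()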
2 Answers
bbmckpt71#
Databricks Connect V2 requires a cluster that supports Unity Catalog - this is explicitly stated in the requirements. It looks like you are using the "No Isolation Shared" data access mode, or you do not have Unity Catalog at all. If you do have Unity Catalog, make sure you have selected Single User or Shared under "Access mode".
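If the cluster's access mode is already correct, it can also help to make sure the session is actually pointing at that cluster by configuring the connection explicitly instead of relying on environment defaults. A minimal sketch, assuming placeholder values for the workspace URL, access token, and cluster ID:

from databricks.connect import DatabricksSession

# All three values are placeholders - use your own workspace URL, a personal
# access token, and the ID of a Unity Catalog-enabled cluster whose access
# mode is Single User or Shared.
spark = DatabricksSession.builder.remote(
    host="https://adb-0000000000000000.0.azuredatabricks.net",
    token="dapi-REPLACE-ME",
    cluster_id="0000-000000-abcdefgh"
).getOrCreate()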
qij5mzcb2#
Make sure that when you create the cluster, you see the Unity Catalog tag in the Summary: as Alex Ott answered, select Single User or Shared in the Access mode dropdown.
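To check the access mode without going through the UI, the cluster's data_security_mode field can be inspected, for example with the Databricks SDK for Python. A sketch, assuming the databricks-sdk package is installed, workspace credentials are configured (environment variables or ~/.databrickscfg), and the cluster ID is a placeholder:

from databricks.sdk import WorkspaceClient

w = WorkspaceClient()  # picks up host and token from the environment/config file
cluster = w.clusters.get(cluster_id="0000-000000-abcdefgh")

# SINGLE_USER and USER_ISOLATION (i.e. Shared) are accepted by Databricks
# Connect V2; NONE corresponds to "No Isolation Shared" and triggers the
# INVALID_STATE error shown above.
print(cluster.data_security_mode)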