Scala: Spark backticks around a column name containing a space cause an error

Posted by xe55xuns on 2023-03-23 in Scala


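Using Scala, I read a DataFrame into memory through the JDBC driver (I followed the example here: https://techcommunity.microsoft.com/t5/azure-synapse-analytics-blog/query-serverless-sql-pool-from-an-apache-spark-scala-notebook/ba-p/2250968). The data lives in a Synapse SQL Serverless Pool, with the external data in the lake. One of the fields has a space in its name. I can use backticks in the SELECT clause and the query runs successfully, but when I do the same in the WHERE clause I get an error.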

%%spark
df1.createOrReplaceTempView("temp1")
df1 =  sqlContext.sql("""select `Customer Id` from temp1 where `Customer Id` = 100 """)

I also tried:
%%sql 
select `Customer Id` from temp1 where `Customer Id` = 100

Error: 
Syntax error at or near 'Id': extra input 'Id'(line 1, pos 6)


Source: https://stackoverflow.com/questions/75780732/spark-backtick-for-column-names-with-a-space-causes-an-error

2 Answers


7cjasjjr #1

When writing PySpark code, we can use backticks. When writing SQL code, however, we can use square brackets [<column name>] or double quotes "<column name>".
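As a rough illustration of how identifier quoting differs between dialects, here is a minimal sketch. It assumes backticks for Spark SQL and square brackets for T-SQL, which is the dialect spoken by engines such as the Synapse serverless SQL pool. The QuoteIdent object and its method names are hypothetical helpers, not part of Spark or of any JDBC driver.

```scala
// Hypothetical helpers (not part of Spark or any JDBC driver): each wraps a
// column name in the identifier quotes of one SQL dialect.
object QuoteIdent {
  // Spark SQL: backticks, with any embedded backtick doubled.
  def spark(name: String): String = "`" + name.replace("`", "``") + "`"

  // T-SQL: square brackets, with any embedded closing bracket doubled.
  def tsql(name: String): String = "[" + name.replace("]", "]]") + "]"

  def main(args: Array[String]): Unit = {
    // Build the text of a Spark SQL query against the temp view from the question.
    val col = spark("Customer Id")
    println(s"select $col from temp1 where $col = 100")
  }
}
```

The helpers only build query text. Which quoting style is accepted depends on which engine ends up parsing the query.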

Answered 2023-03-23

yqlxgs2m #2

You can try:

df2 =  sqlContext.sql("""select `Customer Id` from temp1 where "Customer Id" = 100 """)

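If backticks don't work, try double quotes.
That said, quoting always gets complicated across different SQL dialects, so I recommend renaming the Spark DataFrame column with withColumnRenamed(existingName, newName). Just use an underscore or camel case: CustomerId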

df2 = df1.withColumnRenamed("Customer Id", "CustomerId")

Then use df2 for the subsequent operations:

df2.createOrReplaceTempView("temp2")
df3 =  sqlContext.sql("""select CustomerId from temp2 where CustomerId = 100 """)
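If several columns contain spaces, the rename step could be generalized with a small helper. This is only a sketch: ColumnNames.sanitize is a hypothetical name, and replacing whitespace with underscores is one possible convention (camel case would work just as well).

```scala
object ColumnNames {
  // Hypothetical helper: trim the name, then replace each run of whitespace
  // with a single underscore.
  def sanitize(name: String): String = name.trim.replaceAll("\\s+", "_")

  def main(args: Array[String]): Unit = {
    // Example column names; only the first one comes from the question.
    val original = Seq("Customer Id", "Order  Date", "Total")
    println(original.map(sanitize).mkString(", "))
  }
}
```

In a notebook, something like df1.toDF(df1.columns.map(ColumnNames.sanitize): _*) should then rename every column in one call, so that no later query needs quoting at all.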

Hope this helps!

Answered 2023-03-23
