PySpark: how do I convert a Snowflake function so it runs in Spark SQL?

mmvthczy asked on 2023-03-01, tagged Spark

In Snowflake, the following query runs fine:

select business_month 
from table1
where business_month between TO_VARCHAR(ADD_MONTHS(TO_DATE('202308', 'YYYYMM'),6), 'YYYYMM')
                         and TO_VARCHAR(ADD_MONTHS(TO_DATE('202406', 'YYYYMM'),12), 'YYYYMM');

When I run it in Spark, I get this error:

pyspark.sql.utils.AnalysisException: u"Undefined function: 'TO_VARCHAR'. This function is neither a registered temporary function nor a permanent function registered in the database 'default'.; line 193 pos 23"

Here is my code:

DF = spark.sql ("""
select business_month 
from table1
where business_month between TO_VARCHAR(ADD_MONTHS(TO_DATE('202308', 'YYYYMM'),6), 'YYYYMM')
                         and TO_VARCHAR(ADD_MONTHS(TO_DATE('202406', 'YYYYMM'),12), 'YYYYMM'
""");

DF.createOrReplaceTempView('TEST_TABLE')

I assume Spark does not have this function built in. Is there a way to replicate this logic so that it works in Spark?

rta7y2nd answered:

You can use CAST instead of TO_VARCHAR. I also think ADD_MONTHS may need to be changed to DATEADD, and the 'YYYYMM' format is not supported when casting to a date, while a full 'YYYYMMDD' value is, so the day has to be hard-coded; I chose the first day of each month.

DF = spark.sql ("""
select business_month 
from table1
where business_month between CAST(DATEADD(month, 6, CAST('20230801' AS DATE)) AS VARCHAR)
                         and CAST(DATEADD(month, 12, CAST('20240601' AS DATE)) AS VARCHAR)
""");
