weekofyear column becomes null in Spark SQL

h22fl7wq  posted on 2021-07-12 in Spark

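I am writing a SQL statement for spark.sql, but WEEKOFYEAR does not give me the week of the year: the WeekNumber column in the output below is null. The expression I used is shown further down.
Input data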

InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
536365,85123A,WHITE HANGING HEART T-LIGHT HOLDER,6,01-12-2010 8.26,2.55,17850,United Kingdom
536365,71053,WHITE METAL LANTERN,6,01-12-2010 8.26,3.39,17850,United Kingdom
536365,84406B,CREAM CUPID HEARTS COAT HANGER,8,01-12-2010 8.26,2.75,17850,United Kingdom
536365,84029G,KNITTED UNION FLAG HOT WATER BOTTLE,6,01-12-2010 8.26,3.39,17850,United Kingdom

SQL code

val summarySQlTest = spark.sql(
  """
    |select Country,WEEKOFYEAR(InvoiceDate)as WeekNumber,
    |count(distinct(InvoiceNo)) as NumInvoices,
    |sum(Quantity) as TotalQuantity,
    |round(sum(Quantity*UnitPrice),2) as InvoiceValue
    |from sales
    |group by Country,WeekNumber
    |""".stripMargin
).show()

Expected output

+--------------+----------+-----------+-------------+------------+
|       Country|WeekNumber|NumInvoices|TotalQuantity|InvoiceValue|
+--------------+----------+-----------+-------------+------------+
|         Spain|        49|          1|           67|      174.72|
|       Germany|        48|         11|         1795|     3309.75|

Output I get

+--------------+----------+-----------+-------------+------------+
|       Country|WeekNumber|NumInvoices|TotalQuantity|InvoiceValue|
+--------------+----------+-----------+-------------+------------+
|         Spain|      null|          1|           67|      174.72|
|       Germany|      null|         11|         1795|     3309.75|

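I got the desired output with the DataFrame code below, but I would like to solve the same problem in spark.sql.
It would also be great if someone could explain what exactly is happening in to_date(col("InvoiceDate"), "dd-MM-yyyy H.mm").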

val knowFunc=  invoicesDF.withColumn("InvoiceDate",to_date(col("InvoiceDate"),"dd-MM-yyyy H.mm"))
    .where("year(InvoiceDate) == 2010")
    .withColumn("WeekNumber",weekofyear(col("InvoiceDate")))
    .groupBy("Country","WeekNumber")
    .agg(sum("Quantity").as("TotalQuantity"),
    round(sum(expr("Quantity*UnitPrice")),2).as("InvoiceValue")).show()

xzv2uavs1#

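You need to convert the InvoiceDate column to a date type first (using to_date) before you can call weekofyear. InvoiceDate is a string such as 01-12-2010 8.26, and when WEEKOFYEAR is given that string directly, Spark has to cast it to a date on its own; the value is not in the yyyy-MM-dd form that cast understands, so the result is null. I think this also answers your last question.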

val summarySQlTest = spark.sql(
  """
    |select Country,WEEKOFYEAR(to_date(InvoiceDate,'dd-MM-yyyy H.mm')) as WeekNumber,
    |count(distinct(InvoiceNo)) as NumInvoices,
    |sum(Quantity) as TotalQuantity,
    |round(sum(Quantity*UnitPrice),2) as InvoiceValue
    |from sales
    |group by Country,WeekNumber
    |""".stripMargin
).show()
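
On the last question (what the pattern in to_date means), here is a minimal sketch in plain Scala using java.time rather than Spark itself. It assumes the pattern letters behave the same way in Spark's to_date, which is the case for the letters used here: dd is day of month, MM is month, yyyy is year, H is hour of day (0-23) and mm is minute. Case matters, since MM is the month while mm is the minute. The object and function names (InvoiceDatePatternSketch, parseInvoiceDate, parseIsoDate, isoWeek) are made up for illustration.

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter
import java.time.temporal.IsoFields
import scala.util.Try

// Illustrative helper, not part of Spark: mimics what to_date + weekofyear do
// to one InvoiceDate value, using java.time only.
object InvoiceDatePatternSketch {
  // Same pattern string as in the to_date call:
  // dd = day of month, MM = month, yyyy = year, H = hour of day (0-23), mm = minute
  val pattern: DateTimeFormatter = DateTimeFormatter.ofPattern("dd-MM-yyyy H.mm")

  // Parse with the explicit pattern; only the date part is kept, the time is dropped.
  def parseInvoiceDate(raw: String): Option[LocalDate] =
    Try(LocalDate.parse(raw, pattern)).toOption

  // Parse without a pattern, i.e. expecting ISO yyyy-MM-dd.
  // This stands in for the implicit string-to-date cast (an analogy, not Spark's code).
  def parseIsoDate(raw: String): Option[LocalDate] =
    Try(LocalDate.parse(raw)).toOption

  // ISO 8601 week number (weeks start on Monday), the definition weekofyear follows.
  def isoWeek(date: LocalDate): Int =
    date.get(IsoFields.WEEK_OF_WEEK_BASED_YEAR)

  def main(args: Array[String]): Unit = {
    val raw = "01-12-2010 8.26"
    println(parseIsoDate(raw))                   // None: not an ISO date string
    println(parseInvoiceDate(raw))               // Some(2010-12-01)
    println(parseInvoiceDate(raw).map(isoWeek))  // Some(48)
  }
}
```

The None in the first line corresponds to the null WeekNumber in the question: without an explicit pattern the string cannot be read as a date, so there is no week to compute. With the pattern, 1 December 2010 falls in ISO week 48.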
