Need help with a subquery in Spark SQL on Databricks

huwehgph  posted on 2022-11-25  in  Apache

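I have the SQL below, which returns the dataset shown further down. However, I want to display only one record with Open status: the one with the minimum date.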

SELECT distinct o.svc_ord_nbr AS SVC_ORD_NBR,
  o.svc_ord_stat_nm AS SVC_ORD_STAT_NM,
  min(t.start_date_est) AS STRT_DT_EST, t.status_text
FROM A o inner join B t on t.ticket=o.notif_nbr
  and o.svc_ord_nbr in ('021519_574819','110714_246149')
Group by o.svc_ord_nbr, o.svc_ord_stat_nm, t.status_text

The result dataset looks like this:

I only want the first row, which is the one with the minimum STRT_DT_EST. Thanks in advance...

cunj1qz1  #1

Have you tried using window functions for this use case?

spark.sql(
 """
 |SELECT a.*,
 |ROW_NUMBER() OVER(PARTITION BY dept ORDER BY salary) as rn,
 |RANK() OVER(PARTITION BY dept ORDER BY salary) as rank,
 |DENSE_RANK() OVER(PARTITION BY dept ORDER BY salary) as dense_rank,
 |PERCENT_RANK() OVER(PARTITION BY dept ORDER BY salary) as percent_rank,
 |NTILE(3) OVER(PARTITION BY dept ORDER BY salary) as ntile
 |FROM employee a
 |""".stripMargin).show(false)
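
Applied to the query in the question, the window-function idea might look like the sketch below. It is not a confirmed solution, and it rests on several assumptions: that tables A and B are already registered in the catalog, that the open status is stored in t.status_text as the literal 'Open' (the real value is not shown in the question), and that "first row" means the earliest open record per svc_ord_nbr. The app name, the alias ranked, and the column rn are illustrative names only.

```scala
import org.apache.spark.sql.SparkSession

// On Databricks a session named `spark` already exists; getOrCreate() simply returns it.
val spark = SparkSession.builder().appName("min-open-ticket").getOrCreate()

// Assumption: 'Open' is the status_text value of an open record.
// Assumption: the earliest open record is wanted per service order (svc_ord_nbr).
val earliestOpen = spark.sql(
  """
    |SELECT SVC_ORD_NBR, SVC_ORD_STAT_NM, STRT_DT_EST, status_text
    |FROM (
    |  SELECT o.svc_ord_nbr     AS SVC_ORD_NBR,
    |         o.svc_ord_stat_nm AS SVC_ORD_STAT_NM,
    |         t.start_date_est  AS STRT_DT_EST,
    |         t.status_text,
    |         ROW_NUMBER() OVER (PARTITION BY o.svc_ord_nbr
    |                            ORDER BY t.start_date_est) AS rn
    |  FROM A o
    |  INNER JOIN B t ON t.ticket = o.notif_nbr
    |  WHERE o.svc_ord_nbr IN ('021519_574819', '110714_246149')
    |    AND t.status_text = 'Open'
    |) ranked
    |WHERE rn = 1
    |""".stripMargin)

earliestOpen.show(false)
```

The inner query numbers the open rows of each service order by start_date_est, and the outer filter rn = 1 keeps only the earliest one, so GROUP BY and min() are no longer needed. If two open rows share the same earliest date, ROW_NUMBER() keeps one of them arbitrarily; use RANK() instead to keep all ties. To get a single row across all service orders rather than one per order, drop the PARTITION BY clause.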
