hiveql-sql语法/函数帮助

wtzytmuj  于 2021-05-29  发布在  Hadoop
关注(0)|答案(0)|浏览(242)

我有下面的查询,它在SQLServer中运行良好,但我正尝试在hadoop/hiveql中使用它。
但是,hiveql没有top 1函数或outer apply。有人能提供一个替代方案,但结果相同吗?
下面是数据(表格),下面是我试图实现的结果。
谢谢,丹尼

  1. CREATE TABLE #temp
  2. (
  3. ID varchar(20)NOT NULL
  4. ,CreatedDate DATETIME NOT NULL
  5. ,CompletedDate DATETIME NOT NULL
  6. ,TYPES varchar(20) NOT NULL
  7. ,STATUSS varchar(20) NOT NULL
  8. );
  9. INSERT INTO #temp
  10. VALUES ('61030203647','20160427','20160427','Re-Activattion', 'COMP');
  11. INSERT INTO #temp
  12. VALUES('61030203647','20160425','20160426','Re-Activattion', 'N-CO');
  13. INSERT INTO #temp
  14. VALUES('61030203647','20160422','20160422','Re-Activattion', 'N-CO');
  15. INSERT INTO #temp
  16. VALUES('61030203647','20170311','20170613','Re-Activattion', 'COMP');
  17. INSERT INTO #temp
  18. VALUES('64074558792','20160731','20160805','Re-Activattion','N-CO');
  19. INSERT INTO #temp
  20. VALUES('64074558792','20160801','20160805','Re-Activattion','PARTIALLY');
  21. INSERT INTO #temp
  22. VALUES('64074558792','20160809','20160809','Re-Activattion','PARTIALLY');
  23. INSERT INTO #temp
  24. VALUES('64074558792','20160810','20160810','Re-Activattion','N-CO');
  25. INSERT INTO #temp
  26. VALUES('64074558792','20160810','20160810','Re-Activattion','N-CO');
  27. INSERT INTO #temp
  28. VALUES('64074558792','20160811','20160811','Re-Activattion','COMP');
  29. INSERT INTO #temp
  30. VALUES('64074558792','20160812','20160814','Re-Activattion','N-CO');
  31. ;WITH src AS (
  32. SELECT ID, CreatedDate, CompletedDate, TYPES, STATUSS,
  33. ROW_NUMBER() OVER(PARTITION BY ID ORDER BY CreatedDate , CompletedDate) AS rn
  34. FROM #temp
  35. )
  36. ,grouped as (
  37. Select s.* ,d.rnGrp from src s
  38. outer apply (select top 1 rn rnGrp from src s2
  39. where s.ID=s2.ID and s2.STATUSS='COMP' and s2.rn>=s.rn ) d(rnGrp))
  40. ,grouped1 as (
  41. Select ID, min(CreatedDate) CreatedDate, max(CompletedDate) CompletedDate
  42. ,rnGrp,
  43. Case when SUM(CASE WHEN STATUSS = 'COMP' THEN 1 ELSE 0 END) >0 then
  44. Case when TYPES='De-Activattion' then 'NOT A RE-ACT' else
  45. CAST( DATEDIFF(day,min(CreatedDate) ,max(CompletedDate) ) AS VARCHAR(25))
  46. END
  47. ELSE 'NOT COMPLETED' END AS ACT_COMPLETION_TIME
  48. ,Sum(CASE WHEN STATUSS = 'N-CO' THEN 1 ELSE 0 END) as [ACT NCO #]
  49. From grouped
  50. Group by ID, rnGrp,TYPES
  51. )
  52. ,grouped2 as (
  53. select ID, CreatedDate, CompletedDate, ACT_COMPLETION_TIME, [ACT NCO #]
  54. ,Count(*) Over(Partition by ID) cnt
  55. ,row_number()Over(Partition by ID Order by CreatedDate) rn
  56. from grouped1
  57. )
  58. Select g2.ID,
  59. Stuff(Convert(varchar(11),g2.CreatedDate,100),4,4,'-') as MIN_CREATED_MONTH_YEAR
  60. ,g2.ACT_COMPLETION_TIME, g2.[ACT NCO #]
  61. from grouped2 g2
  62. left join grouped2 g3 on g2.ID=g3.ID and g2.rn=g3.rn+1

希望

  1. CREATE TABLE #temp2
  2. (
  3. ID varchar(20) NOT NULL
  4. ,MIN_CREATED_MONTH_YEAR varchar(20)
  5. ,ACT_COMPLETION_TIME varchar(20)
  6. ,ACT_NCO varchar(20)
  7. );
  8. INSERT INTO #temp2
  9. VALUES ('61030203647','Apr-2016','5','2');
  10. INSERT INTO #temp2
  11. VALUES ('61030203647','Mar-2017','94','0');
  12. INSERT INTO #temp2
  13. VALUES ('64074558792','Jul-2016','11','3');
  14. SELECT *
  15. FROM #temp2

我正在尝试在2周内为分组id添加2周的容差。所附的查询做了它需要做的,我知道如何在最后添加2周的公差(我没有包括在这里,不复杂的事情)
我的解释如下,基本上我真正需要的是用hiveql编写的外部apply和top 1,因为这些函数不存在。
说明:2016年和2017年分别创建了61030203647个重新行动请求(过去2周,因此可以将其视为提出的两个独立问题(应为5天(最小创建日期)到27天(最大完成日期))
然而,64074558792,在11.08完成,在31.07创建,然后在12.08创建了另一个重演。这可能是一个错误,这是在2周内的公差,如果在完成日期的2周内将它们视为同一个问题(与上面的示例不同,该示例过去2周,被分隔为2个请求)

暂无答案!

目前还没有任何答案,快来回答吧!

相关问题