Java: size of an array with a conditional argument in Hive

Posted by 6pp0gazn on 2021-05-29 in Hadoop


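I have a dataset with one column (c2) holding an array of timestamps and another column (c1) holding a single timestamp. For each row, I want the size of the part of the array that is smaller than c1 and the size of the part that is larger than c1.
Table (my_table):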

c1 |           c2           |
----------------------------|
4  | [1,2,3,4,5,6,7,8,9,10] |
1  | [1,2,3,4,5,6,7,8,9,10] |
5  | [1,2,3,4,5,6,7,8,9,10] |
3  | [1,2,3,4,5,6,7,8,9,10] |

Query:

select
c1,
c2,
size(some_udf_split_on_c1(sort_array(<array>), c1)[1]) AS smaller_than_c1,
size(some_udf_split_on_c1(sort_array(<array>), c1)[2]) AS larger_than_c1
from my_table

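Here some_udf_split_on_c1 is a hypothetical UDF; it is only how I imagine an implementation might look.
Expected output: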

c1 |           c2           | smaller_than_c1 | larger_than_c1
----------------------------|-----------------|---------------
4  | [1,2,3,4,5,6,7,8,9,10] |        3        |      6
1  | [1,2,3,4,5,6,7,8,9,10] |        0        |      9
5  | [1,2,3,4,5,6,7,8,9,10] |        4        |      5
3  | [1,2,3,4,5,6,7,8,9,10] |        2        |      7
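
The counting that the hypothetical UDF would have to do can be sketched in plain Java. This is only a minimal illustration, not a Hive UDF: the class name `SplitCounts` and the method `countAround` are invented for this example, and wrapping the logic in Hive's UDF interfaces is not shown. It assumes timestamps fit in a `long`, and that elements equal to c1 belong to neither side, which matches rows such as c1 = 4 where the two counts add up to 9 rather than 10.

```java
import java.util.Arrays;

/**
 * Hypothetical helper (not part of Hive): counts how many array elements
 * are strictly smaller and strictly larger than a pivot value.
 */
final class SplitCounts {

    private SplitCounts() {}

    /**
     * Returns a two-element array {smaller, larger}.
     * Elements equal to the pivot are counted on neither side.
     * The input does not need to be sorted for counting.
     */
    static int[] countAround(long[] values, long pivot) {
        int smaller = 0;
        int larger = 0;
        for (long v : values) {
            if (v < pivot) {
                smaller++;
            } else if (v > pivot) {
                larger++;
            }
        }
        return new int[] {smaller, larger};
    }

    public static void main(String[] args) {
        long[] c2 = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        System.out.println(Arrays.toString(countAround(c2, 4))); // prints [3, 6]
        System.out.println(Arrays.toString(countAround(c2, 1))); // prints [0, 9]
    }
}
```

Because only the counts are needed, a single pass over the array is enough and the `sort_array` step from the query is not required for this part.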

Java hadoop Hive udf Arrays

Source: https://stackoverflow.com/questions/31502795/size-of-array-with-conditional-argument-in-hive