按组计算pig查询中的百分比

iszxjhcz  于 2021-06-24  发布在  Pig
关注(0)|答案(1)|浏览(447)

我有一张有三列的表(typeofcrime,district,year)
例子:

typeofcrime district year
HOMICIDE 092 2016
THEFT 053 2017
HOMICIDE 075 2016
ASSAULT 025 2014

我想计算每个地区和年份的凶杀率。
像这样:

DISTRICT YEAR PERCENTAGE OF HOMICDE
075  2016   33%
092  2016    0%
025 2014     2%

我怎么用拉丁语做这个?

35g0bw71

35g0bw711#

你有table吗( A )每一行代表一个犯罪事件,你想计算每个地区和年份的凶杀率占该地区和年份所有犯罪的百分比,对吗?
可以使用嵌套的foreach语句执行此操作:

B = GROUP A BY (district, year);
C = FOREACH B {
    homicides = FILTER A BY typeofcrime == 'HOMICIDE';
    GENERATE 
        FLATTEN(group) AS (district, year),
        (float)COUNT(homicides)/(float)COUNT(A) AS homicidepercent;
};

相关问题