Unable to create an aggregate of worker counts across sectors in Pig (ERROR 1066)

Posted by wqlqzqxt on 2021-06-01 in Hadoop

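I'm brand new to Pig and am just trying to create an aggregate by city and industry sector, but I'm having trouble doing so. I receive ERROR 1066: Unable to open iterator for alias test. My goal is to rank the industry sectors in each city by number of workers, something to the effect of New York City: Finance 20, Accounting 15, Shoemaking 30, and so on. What am I missing or doing wrong?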

bus_data = LOAD 'sectorAnalysis.json' USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad') as (row: map[]);

bus_data_rows = FOREACH bus_data GENERATE (chararray) row# 'city' AS city, row# 'state' AS state, row# 'sectors' AS sectors, row# 'workers' AS workers;

flattened_bus = FOREACH bus_data_rows GENERATE city, state, FLATTEN(sectors) as sector, workers;

distinct_flat_bus = DISTINCT flattened_bus;

group_by_sec = GROUP distinct_flat_bus BY (city, sector);

sum_sec = FOREACH group_by_sec GENERATE flatten(group) AS (city, sector), SUM(workers) AS worker_T;

DUMP sum_sec;

Data format:

(Brookville, NY, (product 1), 12)
(Tempe, AZ, (product 3), 13)
(Brookville, NY, (product 1), 9)
(Miami, FL, (Product 2), 10)
(Brookville, NY, (product 2), 15)

The expected final result is as follows:

(Brookville, NY, (product 1), 21)
(Brookville, NY, (product 2), 15)
(Tempe, AZ, (product 3), 13)
(Miami, FL, (Product 2), 10)
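
One possible direction, offered as a sketch rather than a confirmed fix. ERROR 1066 is a generic message that Pig reports when the job behind a DUMP fails, so the real cause is usually further up in the log. Two things in the script look suspicious. After GROUP, the relation has only two fields, group and a bag named after the input relation, so SUM has to project workers out of that bag rather than refer to workers directly. Values pulled from an untyped map[] are also bytearray unless cast. The version below assumes that workers holds an integer count and that sectors flattens to a single field, and it groups by state as well so that state appears in the output. It drops DISTINCT, on the assumption that two identical input rows should both count toward the total.

```pig
bus_data = LOAD 'sectorAnalysis.json'
    USING com.twitter.elephantbird.pig.load.JsonLoader('-nestedLoad') AS (row: map[]);

-- Cast map values explicitly; (long) on workers is an assumption about the data.
bus_data_rows = FOREACH bus_data GENERATE
    (chararray) row#'city'  AS city,
    (chararray) row#'state' AS state,
    row#'sectors'           AS sectors,
    (long) row#'workers'    AS workers;

-- Assumes sectors flattens to one field per row.
flattened_bus = FOREACH bus_data_rows GENERATE
    city, state, FLATTEN(sectors) AS sector, workers;

-- Group on every column that should appear in the result.
group_by_sec = GROUP flattened_bus BY (city, state, sector);

-- Project workers from the bag named after the grouped relation.
sum_sec = FOREACH group_by_sec GENERATE
    FLATTEN(group) AS (city, state, sector),
    SUM(flattened_bus.workers) AS worker_T;

DUMP sum_sec;
```

If the job still fails, running DESCRIBE on flattened_bus and group_by_sec should show whether the field names and types match these assumptions.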
