I have a very simple two-column dataset, with a chararray and a double:
user1 234.43
user1 432.23
user2 4321.213
etc.
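I want to group by user and then compute the average of each user's doubles. How do I do that? Do I need a GROUP? I tried following Example 2 at http://wiki.apache.org/pig/pigoverview , but it did not work for me.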
selfReportsAndDiscrepancies = FOREACH discrepancies1 GENERATE discrepancy,selfReportedText;
perDiscrepancy = GROUP selfReportsAndDiscrepancies BY selfReportedText;
allDiscrep = group perDiscrepancy all;
means = FOREACH allDiscrep GENERATE AVG(perDiscrepancy.discrepancy);
DUMP means;
DESCRIBE means;
This gave me:
2013-04-02 17:54:06,611 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1128: Cannot find field discrepancy in group:chararray,selfReportsAndDiscrepancies:bag{:tuple(discrepancy:double,selfReportedText:chararray)}
1 Answer
I hope I understood you correctly: you want the average of the per-group averages:
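A minimal sketch of one way to do this, assuming the relation and field names from the question (discrepancies1, discrepancy, selfReportedText). The names groupMeans, meanDiscrepancy, allMeans and meanOfMeans are made up for illustration. ERROR 1128 arises because, after GROUP, each tuple holds only group and a bag named after the grouped relation, so discrepancy has to be projected out of that bag:

```pig
-- Assumes discrepancies1 is already loaded with fields
-- discrepancy:double and selfReportedText:chararray, as in the question.
selfReportsAndDiscrepancies = FOREACH discrepancies1 GENERATE discrepancy, selfReportedText;

-- After GROUP, each tuple is (group, selfReportsAndDiscrepancies: bag{...});
-- the discrepancy field lives inside the bag, not at the top level.
perDiscrepancy = GROUP selfReportsAndDiscrepancies BY selfReportedText;

-- Per-group mean: project discrepancy out of the inner bag.
groupMeans = FOREACH perDiscrepancy GENERATE
    group AS selfReportedText,
    AVG(selfReportsAndDiscrepancies.discrepancy) AS meanDiscrepancy;

-- Mean of the per-group means: group everything into a single bag first.
allMeans = GROUP groupMeans ALL;
meanOfMeans = FOREACH allMeans GENERATE AVG(groupMeans.meanDiscrepancy);

DUMP groupMeans;
DUMP meanOfMeans;
```

If only the per-user averages are needed, groupMeans is already the answer and the GROUP ... ALL step can be dropped.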
Result: