I am using the ternary operator to apply SUM() conditionally. This is how I am doing it:
GROUPED = GROUP ALL_MERGED BY (fld1, fld2, fld3);
REPORT_DATA = FOREACH GROUPED
{ GENERATE group,
SUM(GROUPED.fld4 == 'S' ? GROUPED.fld5 : 0) AS sum1,
SUM(GROUPED.fld4 == 'S' ? GROUPED.fld5 : (GROUPED.fld5 * -1)) AS sum2;
}
The schema of ALL_MERGED is:
{ALL_MERGED: {fld1:chararray, fld2:chararray, fld3:chararray, fld4:chararray: fld5:int}}
When I execute this, I get the following error:
ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1000: Error during parsing. Invalid alias: SUM in {group: (fld1:chararray, fld2:chararray, fld3:chararray), ALL_MERGED: {fld1:chararray, fld2:chararray, fld3:chararray, fld4:chararray: fld5:int}}
What am I doing wrong?
1 Answer
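SUM is a UDF that takes a bag as input. There are a number of problems with what you are doing, and I think it would help you to review a good reference on Pig. I recommend Programming Pig, which is available for free online. To start with, GROUPED has two fields: one called group and another called ALL_MERGED, which is what the error message is trying to tell you. (I say "trying" because Pig's error messages are often quite cryptic.) Also, you cannot pass an expression to a UDF the way you are hoping to. Instead, you have to GENERATE those fields first and then pass them to the UDF afterwards. Try this: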
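A minimal sketch of that approach: compute the conditional values for each row first, then group, then sum the resulting bag columns. The alias ALL_MERGED_2 and the field names sum1_input and sum2_input are illustrative assumptions, not names from the question.

```pig
-- Evaluate the ternary (bincond) per row, before grouping.
-- ALL_MERGED_2, sum1_input and sum2_input are hypothetical names.
ALL_MERGED_2 = FOREACH ALL_MERGED GENERATE
    fld1, fld2, fld3,
    ((fld4 == 'S') ? fld5 : 0) AS sum1_input,
    ((fld4 == 'S') ? fld5 : fld5 * -1) AS sum2_input;

-- Group on the same key as before; each group now holds a bag named ALL_MERGED_2.
GROUPED = GROUP ALL_MERGED_2 BY (fld1, fld2, fld3);

-- SUM receives a bag projection (one column of the bag), not an expression.
REPORT_DATA = FOREACH GROUPED GENERATE
    group,
    SUM(ALL_MERGED_2.sum1_input) AS sum1,
    SUM(ALL_MERGED_2.sum2_input) AS sum2;
```

Inside the final FOREACH, the bag is referenced by the name of the grouped relation (here ALL_MERGED_2), not by GROUPED, which is the other reason the original statement failed to parse.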