Referring to elements in a bag in Pig on Hadoop

Posted by ryoqjall on 2021-05-29 in Hadoop


I have an alias named student whose schema looks like this (output of the describe command):

studentIDInt:int,courses:bag{(courseId:int,testID:int,score:int)}

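I then tried to filter students by score, but ran into the Pig parse error shown below. If anyone has any good ideas, that would be great. Thanks.
I am confused by the additional tuple reported in the error message.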

student = filter student by courses.score > 3;

incompatible types in GreaterThan Operator left hand side:bag :tuple(score:int)  right hand score:int

Regards,
Lin

hadoop apache-pig

Source: https://stackoverflow.com/questions/36670479/refer-elements-in-bag-in-pig-on-hadoop

1 Answer

11dmarpk1#

You can't do that directly. A possible workaround is to flatten first, filter, and then group again:

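-- one row per (student, course) pair
flat_student = foreach student generate studentIDInt, flatten(courses);
-- score is now a plain int field, so the comparison type-checks
filtered_student = filter flat_student by score > 3;
-- collect the surviving rows back into one bag per student
final_student = group filtered_student by studentIDInt;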

Another approach is to write a custom FilterFunc, so you can choose whichever suits you.
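
On the extra tuple in the error message: `courses.score` is a bag projection, so it yields a bag of single-field tuples (`bag{(score:int)}`) rather than an int, which is why comparing it with `3` is rejected. A further option, not part of the answer above, is a nested `foreach` that filters the bag in place and keeps one row per student. This is only a minimal sketch assuming the `student` schema from the question; the aliases `passed`, `filtered_courses` and `non_empty` are made up for illustration.

```pig
-- Filter the inner bag per student without flattening the relation.
filtered_courses = foreach student {
    -- keep only the course tuples whose score exceeds 3
    passed = filter courses by score > 3;
    generate studentIDInt, passed as courses;
};

-- Optional: drop students whose bag is empty after filtering.
non_empty = filter filtered_courses by not IsEmpty(courses);
```

Without the last statement, students with no score above 3 stay in the result with an empty bag, whereas the flatten/filter/group approach drops them. Which behaviour is wanted depends on the use case.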

Answered 2021-05-30
