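I'm having a problem with Pig. I'm trying to count how many times an item appears at each location by grouping the records and counting them. I then order the counts and limit the result to the top ten. When I dump the ordered set it works fine, but when I try to dump the limited set it fails every time. I've searched around for this problem and found nothing. Can I get some help? The code is below.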
lines = LOAD '/share/smallspoilers' USING PigStorage(':') AS (location:chararray,lNum:chararray,item:chararray,iNum:chararray);
newLines = FOREACH lines GENERATE item, REPLACE(location, '"', '') AS location;
newerLines = FOREACH newLines GENERATE item, REPLACE(location, ' ', '') AS location;
newestLines = FOREACH newerLines GENERATE location, REPLACE(item, '"', '') AS item;
finalLines = FOREACH newestLines GENERATE location, REPLACE(item, ' ', '') AS item;
filteredLines = FILTER finalLines BY (item matches 'Lamp');
grouped = GROUP filteredLines BY location;
counted = FOREACH grouped GENERATE group, COUNT(filteredLines) AS total;
ordered = ORDER counted BY total DESC;
prac = LIMIT ordered 10;
DUMP prac;