Pig Latin LIMIT and FLATTEN produce the wrong results

Posted by b5lpy0ml on 2021-06-24 in Pig

    B = GROUP A BY state;
    C = FOREACH B {
        DA = ORDER A BY population DESC;
        DB = LIMIT DA 5;
        GENERATE FLATTEN(group), FLATTEN(DB.name), FLATTEN(DB.population);
    }

The problem is that each city name comes back 5 times instead of once. The result I get is:

    (ALASKA,M,27257)
    (ALASKA,M,23696)
    (ALASKA,M,19949)
    (ALASKA,M,19926)
    (ALASKA,M,19833)
    (ALASKA,H,27257)
    (ALASKA,H,23696)
    (ALASKA,H,19949)
    (ALASKA,H,19926)
    (ALASKA,H,19833)

The result I need is:

    (ALASKA,M,27257)
    (ALASKA,H,23696)
Answer 1 (bqf10yzr)

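The two FLATTENs, FLATTEN(DB.name) and FLATTEN(DB.population), produce a Cartesian product of the two bags. Replace them with a single FLATTEN over one bag that holds both columns: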

    B = GROUP A BY state;
    C = FOREACH B {
        DA = ORDER A BY population DESC;
        DB = LIMIT DA 5;
        GENERATE FLATTEN(group), FLATTEN(DB.(name, population));
    }
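
To see why the original script multiplies rows, it may help to model the behaviour outside Pig. The sketch below is plain Python, not Pig Latin, and the two-row bag `db` is made-up sample data. It mimics how several FLATTENs in one GENERATE are combined as a cross product, while a single FLATTEN over one bag keeps each tuple intact.

```python
from itertools import product

# Hypothetical contents of DB for one group: (name, population) tuples.
db = [("M", 27257), ("H", 23696)]

# Two separate projections, standing in for DB.name and DB.population.
names = [name for name, _ in db]
populations = [pop for _, pop in db]

# FLATTEN(DB.name), FLATTEN(DB.population):
# every name is paired with every population.
two_flattens = [("ALASKA", n, p) for n, p in product(names, populations)]

# FLATTEN(DB.(name, population)):
# one bag, so each (name, population) pair stays together.
one_flatten = [("ALASKA", n, p) for n, p in db]

print(len(two_flattens))  # 4 rows (2 x 2)
print(one_flatten)        # [('ALASKA', 'M', 27257), ('ALASKA', 'H', 23696)]
```

With 5 tuples in the bag, the same pairing gives 5 x 5 = 25 rows, which matches the repeated names in the question.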

Alternatively, because the bag created by GROUP BY contains all the original tuples with all of their columns, you can do the following:

    B = GROUP A BY state;
    C = FOREACH B {
        DA = ORDER A BY population DESC;
        DB = LIMIT DA 5;
        GENERATE FLATTEN(DB);
    }
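
As a rough illustration of this second variant, the following Python sketch (again a model, not Pig Latin) groups rows by state, keeps the top rows by population, and emits the surviving tuples unchanged. The rows in `a`, the (state, name, population) column order, and the limit of 2 are assumptions chosen to keep the sample small.

```python
from collections import defaultdict

# Hypothetical rows of A: (state, name, population).
a = [
    ("ALASKA", "M", 27257),
    ("ALASKA", "H", 23696),
    ("ALASKA", "K", 19949),
    ("TEXAS", "X", 50000),
]
limit = 2  # the script above uses 5; 2 keeps the sample small

# GROUP A BY state: each bag keeps whole tuples, state column included.
groups = defaultdict(list)
for row in a:
    groups[row[0]].append(row)

# ORDER ... DESC, LIMIT, then FLATTEN(DB): emit the surviving tuples as they are.
# Groups are visited in sorted order here only to make the output deterministic.
c = []
for state in sorted(groups):
    top = sorted(groups[state], key=lambda r: r[2], reverse=True)[:limit]
    c.extend(top)

print(c)
# [('ALASKA', 'M', 27257), ('ALASKA', 'H', 23696), ('TEXAS', 'X', 50000)]
```

Because each tuple already carries its state, there is no need to flatten the group key separately in this form.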
