Pig: pivot and sum 3 relations

n3ipq98p  posted on 2021-05-29  in Hadoop

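I have 3 different relations, shown below. I can get the output using a UDF, but I need to implement it in Pig. Other approaches have been mentioned on the forum, but I could not find anything specific to this problem.
Process: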

    FN1,10
    FN2,20
    FN3,23
    FN4,25
    FN5,15
    FN7,40
    FN10,56

Rej:

    FN1,12
    FN2,13
    FN3,33
    FN6,60
    FN8,23
    FN9,44
    FN10,4

All FN:

    FN1
    FN2
    FN3
    FN4
    FN5
    FN6
    FN7
    FN8
    FN9
    FN10

The desired output is:

    FN1,10,12,22
    FN2,20,13,33
    FN3,23,33,56
    FN4,25,0,25
    FN5,15,0,15
    FN6,0,60,60
    FN7,40,0,40
    FN8,0,23,23
    FN9,0,44,44
    FN10,56,4,60

zwghvu4y  #1

You can use COGROUP to achieve this.
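A minimal sketch of what a COGROUP-based approach might look like, grouping all three relations in one step. The file names, the aliases (`proc_data`, `rej_data`, `all_fn`, `grouped`, `totals`, `result`) and the declared schemas are assumptions for illustration, not something given in the question.

```pig
-- Assumed file names and schemas; adjust them to the real data.
proc_data = LOAD 'test.txt'  USING PigStorage(',') AS (fn:chararray, val:int);
rej_data  = LOAD 'test2.txt' USING PigStorage(',') AS (fn:chararray, val:int);
all_fn    = LOAD 'test3.txt' USING PigStorage(',') AS (fn:chararray);

-- COGROUP produces one row per key, with one bag per input relation.
-- A key that is missing from a relation gets an empty bag for it.
grouped = COGROUP all_fn BY fn, proc_data BY fn, rej_data BY fn;

-- Guard against empty bags with IsEmpty and substitute 0.
-- SUM over an int column returns a long, hence the 0L literal.
totals = FOREACH grouped GENERATE
    group AS fn,
    (IsEmpty(proc_data) ? 0L : SUM(proc_data.val)) AS proc_val,
    (IsEmpty(rej_data)  ? 0L : SUM(rej_data.val))  AS rej_val;

-- Add the per-key total as a fourth column.
result = FOREACH totals GENERATE fn, proc_val, rej_val, proc_val + rej_val AS total;
DUMP result;
```

If keys could appear in the value relations but not in the full FN list, a `FILTER grouped BY NOT IsEmpty(all_fn);` step before the `FOREACH` would restrict the result to the listed keys.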


r9f1avp5  #2

Assuming your relations are in test.txt, test2.txt and test3.txt:

    -- Load the three relations (no schema, so fields are referenced by position)
    A = LOAD 'test.txt' USING PigStorage(',');
    B = LOAD 'test2.txt' USING PigStorage(',');
    C = LOAD 'test3.txt' USING PigStorage(',');
    -- Group the two value relations by key, then group that against the full key list
    D = COGROUP A BY $0, B BY $0;
    E = COGROUP C BY $0, D BY $0;
    -- Unwrap the nested A and B bags and keep only their value column
    F = FOREACH E GENERATE $0, FLATTEN(D.A), FLATTEN(D.B);
    G = FOREACH F GENERATE $0, $1.$1, $2.$1;
    -- Replace an empty bag with null so that FLATTEN keeps the row
    H = FOREACH G GENERATE $0, FLATTEN((IsEmpty($1)?null:$1)), FLATTEN((IsEmpty($2)?null:$2));
    -- Default missing values to 0 and add the sum as a fourth column
    I = FOREACH H GENERATE $0, ($1 is null?0:$1), ($2 is null?0:$2), ($1 is null?0:$1)+($2 is null?0:$2);
    DUMP I;

Output:

    (FN1,10,12,22)
    (FN2,20,13,33)
    (FN3,23,33,56)
    (FN4,25,0,25)
    (FN5,15,0,15)
    (FN6,0,60,60)
    (FN7,40,0,40)
    (FN8,0,23,23)
    (FN9,0,44,44)
    (FN10,56,4,60)