Pig: implementing COGROUP using Pig Latin

kq4fsx7k  posted on 2021-06-21  in Pig
Follows (0) | Answers (2) | Views (421)

I would like to know whether the result of COGROUP can be reproduced using only basic Pig Latin statements.
Thanks in advance.


6ljaweal  #1

I wonder why you would want to do this. In any case, the following code works for two aliases (relations):

>cat test1
1,aaaa
2,bbbb
3,cccc
4,dddd
5,eeee
6,ffff
7,gggg
8,hhhh
9,iiii

>cat test2
7,ggggggg
8,hhhhhhh
9,iiiiiii
10,jjjjjjj
11,kkkkkkk
7,9999
7,gggg

grunt>test1 = load 'test1' USING PigStorage(',') as (id: int, val: chararray);
grunt>test2 = load 'test2' USING PigStorage(',') as (id: int, val: chararray);
grunt>cgrp = cogroup test1 by id, test2 by id;
grunt>dump cgrp;

This gives:

(1,{(1,aaaa)},{})
(2,{(2,bbbb)},{})
(3,{(3,cccc)},{})
(4,{(4,dddd)},{})
(5,{(5,eeee)},{})
(6,{(6,ffff)},{})
(7,{(7,gggg)},{(7,ggggggg),(7,9999),(7,gggg)})
(8,{(8,hhhh)},{(8,hhhhhhh)})
(9,{(9,iiii)},{(9,iiiiiii)})
(10,{},{(10,jjjjjjj)})
(11,{},{(11,kkkkkkk)})

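The following code gives the same result, using only group, a full outer join, and a foreach that replaces null bags with empty bags: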

grunt>g1 = group test1 by id;
grunt>g2 = group test2 by id;
grunt>j = join g1 by group FULL, g2 by group;
grunt>j2 = foreach j generate (g1::group is null ? g2::group : g1::group), (test1 is null? (bag{tuple(int, chararray)}){} : test1) as test1, (test2 is null? (bag{tuple(int,chararray)}){} : test2) as test2;
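
To make the equivalence easier to check, here is a minimal sketch in plain Python (not Pig) that models both approaches on the same test1 and test2 data. It rests on some assumptions: Python lists stand in for Pig bags, None stands in for Pig's null, and the helper names group_by_id, cogroup and cogroup_via_full_join are hypothetical names chosen for illustration. It only mirrors the logic of the statements above; it says nothing about how Pig executes them.

```python
from collections import defaultdict

# Same rows as the test1 and test2 files in this answer.
test1 = [(1, "aaaa"), (2, "bbbb"), (3, "cccc"), (4, "dddd"), (5, "eeee"),
         (6, "ffff"), (7, "gggg"), (8, "hhhh"), (9, "iiii")]
test2 = [(7, "ggggggg"), (8, "hhhhhhh"), (9, "iiiiiii"), (10, "jjjjjjj"),
         (11, "kkkkkkk"), (7, "9999"), (7, "gggg")]


def group_by_id(rows):
    """Model of `group rel by id`: map each key to the bag (list) of its rows."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row)
    return dict(groups)


def cogroup(left, right):
    """Reference model of `cogroup left by id, right by id`."""
    g1, g2 = group_by_id(left), group_by_id(right)
    keys = sorted(g1.keys() | g2.keys())
    # A key missing on one side gets an empty bag for that side.
    return [(k, g1.get(k, []), g2.get(k, [])) for k in keys]


def cogroup_via_full_join(left, right):
    """Model of the group + FULL join + null-replacement steps."""
    g1, g2 = group_by_id(left), group_by_id(right)
    # Full outer join on the group key: an unmatched side is None (null).
    joined = [
        (k if k in g1 else None, g1.get(k), k if k in g2 else None, g2.get(k))
        for k in g1.keys() | g2.keys()
    ]
    # foreach ... generate: take the non-null key, turn null bags into empty bags.
    result = [
        (k1 if k1 is not None else k2,
         b1 if b1 is not None else [],
         b2 if b2 is not None else [])
        for k1, b1, k2, b2 in joined
    ]
    return sorted(result, key=lambda t: t[0])


if __name__ == "__main__":
    for row in cogroup_via_full_join(test1, test2):
        print(row)
```

Both functions return the same list for this data, including the rows for keys 10 and 11, where the test1 bag is empty. Note that real Pig bags are unordered, so the order of tuples inside a bag is not guaranteed there.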

fhity93d  #2

What do you mean by "imitate the result of cogroup"?
Pig Latin already provides COGROUP as a built-in operator.
Example:

COGROUP alias BY (col1, col2)

COGROUP is typically used when more than one relation is involved.
