Not in, matching in Pig

li9yvcax  posted on 2021-06-02  in Hadoop

I have two relations in Pig:

A and B

DUMP A;

Sandeep Rohan Mohan

DUMP B;

MOHAN
I need the output of A - B; it should give me
Sandeep, Rohan
because they do not exist in B.

u4vypkhs #1

Try this:

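-- Load each input as a single chararray field (the quoted strings are the input paths).
A1 = LOAD 'Sandeep Rohan Mohan' USING PigStorage() AS (line:chararray);
B1 = LOAD 'MOHAN' USING PigStorage() AS (line:chararray);

-- Upper-case both sides so that 'Mohan' and 'MOHAN' compare as equal.
A = FOREACH A1 GENERATE UPPER(line) AS line;
B = FOREACH B1 GENERATE UPPER(line) AS line;

-- One output tuple per distinct key: (group, bag of A tuples, bag of B tuples).
C = COGROUP A BY line, B BY line;

-- Keep only the keys whose B bag is empty, i.e. names that never appear in B.
D = FILTER C BY IsEmpty(B);

E = FOREACH D GENERATE group AS name;

DUMP E;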

(ROHAN)
(SANDEEP)
See also: set operations in Apache Pig

vsmadaxz #2

This can be done with a left outer join, keeping only the tuples that have a null in $1.
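
A minimal sketch of the left outer join approach, for illustration only. The input paths 'a.txt' and 'b.txt' are hypothetical placeholders, and each file is assumed to hold one name per line. The UPPER step is borrowed from the first answer so that 'Mohan' and 'MOHAN' match; drop it if the comparison should be case-sensitive.

```pig
-- Hypothetical input paths; one name per line is assumed.
A1 = LOAD 'a.txt' USING PigStorage() AS (line:chararray);
B1 = LOAD 'b.txt' USING PigStorage() AS (line:chararray);

-- Normalize case on both sides before joining.
A = FOREACH A1 GENERATE UPPER(line) AS line;
B = FOREACH B1 GENERATE UPPER(line) AS line;

-- Left outer join: every tuple of A is kept; the B side is null when there is no match.
J = JOIN A BY line LEFT OUTER, B BY line;

-- J has two fields, (A::line, B::line), so $1 is the B side of the join.
D = FILTER J BY $1 IS NULL;

-- Keep only the name coming from A.
E = FOREACH D GENERATE $0 AS name;

DUMP E;
```

Unlike the COGROUP version, the join keeps duplicates: a name that occurs several times in A and never in B appears once per occurrence. Add a DISTINCT on E if only unique names are wanted.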
