Pig: how to join and define a schema in one step

olhwl3o2  posted on 2021-06-03  in Hadoop

This is what I am doing at the moment:

A = LOAD 'a.txt' USING PigStorage('\\u001') AS (
    foo:int
    ,bar:chararray
);
B = LOAD 'b.txt' USING PigStorage('\\u001') AS (
    foo:int
    ,baz:long
);
C = JOIN A BY foo, B BY foo;
D = FOREACH C GENERATE
    A::foo AS foo
    ,A::bar AS bar
    ,B::baz AS baz
;

How can I join and define the schema in a single step?

zpgglvta1#

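According to the documentation, you cannot define a schema when joining relations.
Note: syntactically, you can nest the statements, which gives the impression of saving steps: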

D = foreach
    (join (LOAD 'a.txt' USING PigStorage('\\u001') AS (foo:int ,bar:chararray)) by foo,
          (LOAD 'b.txt' USING PigStorage('\\u001') AS (foo:int ,baz:long)) by foo
    ) generate $0 as foo, $1 as bar, $3 as baz;

But I would avoid doing this. It is messy, and it still produces the same explain plan as the original.
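If you want to check the plan claim yourself, one way is to run the two statements below at the end of each version of the script and compare the output. This is only a sketch: it assumes the final relation is named D, as in both snippets above, and the exact plan output depends on your Pig version and execution engine.

```pig
-- Run this at the end of both the step-by-step script and the nested one.
DESCRIBE D;  -- prints the schema of D (foo, bar, baz and their types)
EXPLAIN D;   -- prints the execution plans Pig would use to compute D
```

If the plans match, the nested form gains nothing at run time and only costs readability.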