How can I optimize a FLATTEN operation?

Posted by mv1qrgav on 2021-06-02 in Hadoop

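I have a Pig script that computes 17 different outputs and merges them at the end. To merge the data, I use a COGROUP operation.
Because the COGROUP output contains the join key for every input, I have to drop some unnecessary columns, which is why there is a FLATTEN operator.
However, the script hangs at 95% and never finishes. When I leave out the last part (CG_FLAT), it works perfectly well.
So I need to optimize the flatten part. Any ideas?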

REST OF THE SCRIPT
    ...
    ...
    CG = cogroup countSmart by ($0),
            countModem by ($0),
            countTablet by ($0),
            countOther by ($0),
            count2G by ($0),
            count3G by ($0),
            countUMTS900 by ($0),
            countGPRS by ($0),
            countEDGE by ($0),
            countR99 by ($0),
            countHSDPA_432 by ($0),
            countHSDPA_288 by ($0),
            countHSDPA_216 by ($0),
            countHSDPA_144 by ($0),
            countHSDPA_72 by ($0),
            countHSDPA_36 by ($0),
            countHSDPA_Unknown by ($0);

    CG_FLAT = foreach CG generate
            flatten($0),
            FLATTEN((IsEmpty($1.$2) ? null : $1.$2)), FLATTEN((IsEmpty($1.$3) ? null : $1.$3)),
            FLATTEN((IsEmpty($2.$2) ? null : $2.$2)), FLATTEN((IsEmpty($2.$3) ? null : $2.$3)),
            FLATTEN((IsEmpty($3.$2) ? null : $3.$2)), FLATTEN((IsEmpty($3.$3) ? null : $3.$3)),
            FLATTEN((IsEmpty($4.$2) ? null : $4.$2)), FLATTEN((IsEmpty($4.$3) ? null : $4.$3)),
            FLATTEN((IsEmpty($1.$2) ? null : $1.$2)), FLATTEN((IsEmpty($1.$3) ? null : $1.$3)), FLATTEN((IsEmpty($1.$4) ? null : $1.$4)),
            FLATTEN((IsEmpty($2.$2) ? null : $2.$2)), FLATTEN((IsEmpty($2.$3) ? null : $2.$3)), FLATTEN((IsEmpty($2.$4) ? null : $2.$4)),
            FLATTEN((IsEmpty($3.$2) ? null : $3.$2)), FLATTEN((IsEmpty($3.$3) ? null : $3.$3)), FLATTEN((IsEmpty($3.$4) ? null : $3.$4)),
            FLATTEN((IsEmpty($8.$2) ? null : $8.$2)), FLATTEN((IsEmpty($8.$3) ? null : $8.$3)),
            FLATTEN((IsEmpty($9.$2) ? null : $9.$2)), FLATTEN((IsEmpty($9.$3) ? null : $9.$3)),
            FLATTEN((IsEmpty($10.$2) ? null : $10.$2)), FLATTEN((IsEmpty($10.$3) ? null : $10.$3)),
            FLATTEN((IsEmpty($11.$2) ? null : $11.$2)), FLATTEN((IsEmpty($11.$3) ? null : $11.$3)),
            FLATTEN((IsEmpty($12.$2) ? null : $12.$2)), FLATTEN((IsEmpty($12.$3) ? null : $12.$3)),
            FLATTEN((IsEmpty($13.$2) ? null : $13.$2)), FLATTEN((IsEmpty($13.$3) ? null : $13.$3)),
            FLATTEN((IsEmpty($14.$2) ? null : $14.$2)), FLATTEN((IsEmpty($14.$3) ? null : $14.$3)),
            FLATTEN((IsEmpty($15.$2) ? null : $15.$2)), FLATTEN((IsEmpty($15.$3) ? null : $15.$3)),
            FLATTEN((IsEmpty($16.$2) ? null : $16.$2)), FLATTEN((IsEmpty($16.$3) ? null : $16.$3)),
            FLATTEN((IsEmpty($17.$2) ? null : $17.$2)), FLATTEN((IsEmpty($17.$3) ? null : $17.$3));
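
One possible cause, offered as an assumption rather than a diagnosis: when several bags are flattened in the same FOREACH, Pig emits their cross product, so if any cogrouped bag holds more than one tuple for a key, the output row count multiplies across every flattened bag. A minimal diagnostic sketch along those lines is shown here, using the built-in COUNT_STAR. The aliases CG_SIZES, CG_MULTI, CG_SAMPLE and the field names k, nSmart, nModem, nTablet are hypothetical, and only three of the 17 bags are counted.

```pig
-- Diagnostic sketch (aliases and field names are hypothetical, not from the original script).
-- Count the tuples in each cogrouped bag, per key.
CG_SIZES = foreach CG generate
        $0 as k,
        COUNT_STAR($1) as nSmart,
        COUNT_STAR($2) as nModem,
        COUNT_STAR($3) as nTablet;   -- add $4 through $17 the same way

-- Keys where a bag holds more than one tuple get multiplied by the FLATTEN calls.
CG_MULTI = filter CG_SIZES by nSmart > 1 or nModem > 1 or nTablet > 1;
CG_SAMPLE = limit CG_MULTI 20;
dump CG_SAMPLE;
```

If CG_SAMPLE comes back empty for all 17 bags, the cross product is probably not the problem and the cause lies elsewhere. Separately, the CG_FLAT statement flattens $1 through $3 twice and never references $5 through $7, which may be worth double-checking.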
