Apache Pig program

nqwrtyyt  posted on 2021-06-02  in  Hadoop

I need help writing a Pig script that counts the words in a file
containing the following text:

What|is|Hadoop
History|of|Hadoop
How|Hadoop|name|was|given
Problems|with|Traditional|Large-Scale|Systems|and|Need|for|Hadoop
Understanding|Hadoop|Architecture
Fundamental|of|HDFS|(Blocks,|Name|Node,|Data|Node,|Secondary|Name|Node)
Rack|Awareness
Read/Write|from|HDFS
HDFS|Federation|and|High|Availability

gdrx4gfi  #1

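Load each line of the data as a chararray. Replace every "|" with a space (" "), tokenize the line to get the individual words, then group the words and count each group.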

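-- Load each row as a single chararray (the rows contain no tabs, so the default loader keeps them whole).
A = LOAD '/user/hadoop/data.txt' AS (line:chararray);
-- REPLACE takes a regex, so the pipe is escaped; TOKENIZE then splits into words and FLATTEN emits one word per row.
B = FOREACH A GENERATE FLATTEN(TOKENIZE(REPLACE(line,'\\|',' ')));
-- Group identical words together (the comparison is case-sensitive).
C = GROUP B BY $0;
-- Emit each word with the number of times it occurs.
D = FOREACH C GENERATE group, COUNT(B);
DUMP D;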
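
To sanity-check the counts without a Hadoop cluster, the same steps can be sketched in plain Python. This is not part of the original answer. It is a rough emulation that assumes Pig's TOKENIZE splits on its default delimiters (space, double quote, comma, parentheses, and star) and drops empty tokens. The names DATA, DELIMS, and word_count are hypothetical and exist only for this illustration.

```python
import re
from collections import Counter

# Sample rows copied from the question.
DATA = """What|is|Hadoop
History|of|Hadoop
How|Hadoop|name|was|given
Problems|with|Traditional|Large-Scale|Systems|and|Need|for|Hadoop
Understanding|Hadoop|Architecture
Fundamental|of|HDFS|(Blocks,|Name|Node,|Data|Node,|Secondary|Name|Node)
Rack|Awareness
Read/Write|from|HDFS
HDFS|Federation|and|High|Availability"""

# Assumption: mirrors TOKENIZE's default delimiter set
# (space, double quote, comma, parentheses, star).
DELIMS = re.compile(r'[ ",()*]+')


def word_count(text):
    """Replace '|' with a space, tokenize each line, and count the words."""
    counts = Counter()
    for line in text.splitlines():
        replaced = line.replace("|", " ")
        # Skip the empty strings that split() yields at line edges.
        counts.update(tok for tok in DELIMS.split(replaced) if tok)
    return counts


if __name__ == "__main__":
    # Print one "word count" pair per line, sorted by word.
    for word, n in sorted(word_count(DATA).items()):
        print(word, n)
```

Under these assumptions the sketch counts "Hadoop" 5 times and "HDFS" 3 times, and it treats "Name" and "name" as different words, as a case-sensitive GROUP would. The order in which DUMP prints rows in Pig may differ from the sorted order used here.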

Output
