Counting the vowels in a file

a0x5cqrl  posted on 2021-06-21  in Pig

Can anyone help me? Thank you very much. Here is my code:

G = LOAD 'input.txt' AS (line:chararray);
B = foreach G generate FLATTEN(STRSPLIT(LOWER(line), '(?<=.)(?=.)')) as s:chararray;
C = foreach B generate FLATTEN(TOBAG(*)) as letter;
result = filter C by ( letter == 'a' or  letter == 'e' or letter == 'i' or letter == 'o' or letter == 'u' );
E = GROUP result BY letter;
F = foreach E generate group, COUNT(result) ;
DUMP F;
8yoxcaq7 #1

Please use the following code to get the result.

A = LOAD 'input.txt' AS (line:chararray);
B = FOREACH A GENERATE FLATTEN(TOKENIZE(REPLACE(LOWER((chararray)$0),'','|'),'|')) as letter:chararray;
result = FILTER B by (letter == 'a' or letter == 'e' or letter == 'i' or letter == 'o' or letter == 'u');
E = GROUP result BY letter;
F = FOREACH E GENERATE group, COUNT(result);
DUMP F;

bxjv4tth #2

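First tokenize each line into words, then get the characters from each word. Use REPLACE to insert a delimiter between the characters of a word. Instead of TOBAG(*), use TOKENIZE to split the characters along that inserted delimiter. Filter for the vowels a, e, i, o and u, then group by character and get the count.
Pig script: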

A = LOAD 'test4.txt' as (line:chararray);
B = FOREACH A GENERATE  FLATTEN(TOKENIZE(line)) as words;
C = FOREACH B GENERATE  FLATTEN(TOKENIZE(REPLACE(LOWER(words),'','|'),'|')) as letter;
D = FILTER C BY (letter == 'a' or  letter == 'e' or letter == 'i' or letter == 'o' or letter == 'u' );
E = group D by letter;
F = FOREACH E GENERATE group as letter,COUNT(D.letter) as total;
DUMP F;
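
To sanity-check what the Pig script prints on a small input, the same steps (split into words, split each word into characters around an inserted `|`, keep the vowels, count per vowel) can be mirrored outside Pig. The following is only a rough Python sketch, not part of either answer; the function name `count_vowels` is made up for illustration, and it splits words on whitespace only, whereas Pig's TOKENIZE has a slightly larger default delimiter set. That difference should not change the vowel counts, since no delimiter is a vowel.

```python
from collections import Counter

VOWELS = set("aeiou")


def count_vowels(lines):
    """Hypothetical helper mirroring the Pig pipeline above.

    lines: any iterable of strings, e.g. an open text file.
    Returns a dict mapping each vowel that occurs to its count.
    """
    counts = Counter()
    for line in lines:
        # Roughly TOKENIZE(line); assumption: whitespace-only splitting.
        for word in line.split():
            # Roughly REPLACE(LOWER(word), '', '|'): put '|' between characters.
            delimited = "|".join(word.lower())
            # Roughly TOKENIZE(..., '|'): split back into single characters.
            for letter in delimited.split("|"):
                # Same role as the FILTER on a/e/i/o/u.
                if letter in VOWELS:
                    counts[letter] += 1
    # Same role as GROUP ... BY letter followed by COUNT.
    return dict(counts)
```

Passing an open file, for example `count_vowels(open('test4.txt'))`, gives per-vowel totals to compare against the tuples from `DUMP F`. Vowels that never occur are absent from the result, just as they produce no group in Pig.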
