grep: Trying to find all the lines with a matching string in a file, then save the result to a file in a separate directory in Hadoop HDFS

Posted by gudnpqoy on 2021-05-27 in Hadoop

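I have a file SimpleInput.txt in the SimpleDir directory in HDFS. I want to output all the lines in this file that contain the word "Texas". After that, I need to save the result in the SimpleOutput directory, which should be located inside SimpleDir.
I have already created the SimpleOutput directory inside SimpleDir.
I have tried many commands, such as: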

hdfs dfs -cat /SimpleDir/SimpleInput.txt | grep -i "texas"

With this I can print all the lines containing the word "Texas", but I cannot save the result in the SimpleOutput directory.
I also tried this command:

hdfs dfs -cat /SimpleDir/SimpleInput.txt | grep -i "texas" /SimpleDir/SimpleOutput

It prints:

grep: /SimpleDir/SimpleOutput: No such file or directory
cat: Unable to write to output str

hadoop hdfs grep

Source: https://stackoverflow.com/questions/59851403/trying-to-find-all-the-lines-having-a-matching-string-in-the-file-then-save-the

2 Answers

r1zhe5dt1#

The way to solve this is:

hadoop org.apache.hadoop.examples.Grep /SimpleDir/SimpleInput.txt /SimpleDir/SimpleOutput '.*texas.*'
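
One thing to watch with this approach, sketched below under a few assumptions: a MapReduce job refuses to write into an output directory that already exists, and the question says SimpleOutput was created beforehand, so the sketch removes it first. It also assumes the examples class is on the Hadoop classpath and that the job leaves its results in part files inside the output directory. The function name run_hadoop_grep is hypothetical. Note that the example job emits each distinct match together with a count, not the raw lines.

```shell
# Sketch only: run_hadoop_grep is a hypothetical helper, not part of Hadoop.
# $1 = HDFS input path, $2 = HDFS output directory, $3 = Java regex
run_hadoop_grep() {
    # The job fails if the output directory exists, so remove it first.
    hdfs dfs -rm -r -f "$2"
    # Run the example Grep job; stop here if it fails.
    hadoop org.apache.hadoop.examples.Grep "$1" "$2" "$3" || return 1
    # Quote the glob so HDFS, not the local shell, expands it.
    hdfs dfs -cat "$2/part-*"
}

# Example call (requires a configured Hadoop client):
# run_hadoop_grep /SimpleDir/SimpleInput.txt /SimpleDir/SimpleOutput '.*texas.*'
```

The pattern is a Java regular expression and is case-sensitive as written, unlike grep -i in the question.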

2021-05-27

iklwldmw2#

You need to redirect the output of grep to a file:

hdfs dfs -cat /SimpleDir/SimpleInput.txt | grep -i "texas" > /SimpleDir/SimpleOutput

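The redirect writes to a file on the local filesystem, not to HDFS, so you then have to upload the local file with hdfs dfs -put.
Alternatively, you can do the same thing in Spark using the filter function.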
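
A minimal sketch of the redirect-then-upload steps, assuming the caller supplies a local scratch file and an HDFS target file inside SimpleOutput (the name texas_lines.txt in the example call is illustrative and not from the question). The second function is an alternative that skips the local file, relying on hdfs dfs -put accepting - as the source to read from standard input. Both function names are hypothetical.

```shell
# Sketch only: function names and file names are illustrative assumptions.

# Two-step variant: filter into a local file, then upload that file to HDFS.
# $1 = HDFS input, $2 = pattern, $3 = local scratch file, $4 = HDFS target file
save_matches_via_local_file() {
    hdfs dfs -cat "$1" | grep -i -- "$2" > "$3"
    # -f overwrites the target if it already exists.
    hdfs dfs -put -f "$3" "$4"
}

# One-step variant: "-" tells put to read its data from standard input.
# $1 = HDFS input, $2 = pattern, $3 = HDFS target file
save_matches_via_stdin() {
    hdfs dfs -cat "$1" | grep -i -- "$2" | hdfs dfs -put - "$3"
}

# Example call (requires a configured Hadoop client):
# save_matches_via_stdin /SimpleDir/SimpleInput.txt texas /SimpleDir/SimpleOutput/texas_lines.txt
```

This also suggests why the second command in the question failed: when grep is given a file operand it reads that file instead of standard input, and /SimpleDir/SimpleOutput does not exist as a local path.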

2021-05-27
