Find path of jar file in GCP

Posted by a0zr77ik on 2021-05-27 in Hadoop


Find the path of the hadoop-streaming-1.2.1.jar file on Google Cloud Platform.
https://github.com/devangpatel01/tf-idf-implementation-using-map-reduce-hadoop-python-
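I am trying to run this MapReduce program on GCP using Hadoop, but I cannot find the path of hadoop-streaming-1.2.1.jar. I tried downloading the jar file manually, uploading it to Hadoop, and then running mapper1.py, but I get an error saying the path is wrong. The program above was run on a local machine. How should I edit the command to run it on GCP?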
hadoop jar /home/kirthyodackal/hadoop-streaming-1.2.1.jar -input hdfs://cluster-29-m/input_prgs/input_prgs/input1/000000_0 -output hdfs://cluster-29-m/input_prgs/input_prgs/output1 -mapper hdfs://cluster-29-m/input_prgs/input_prgs/mapper1.py -reducer hdfs://cluster-29-m/input_prgs/input_prgs/reducer1.py

hadoop google-cloud-platform Jar reducers Mapper

Source: https://stackoverflow.com/questions/58683808/find-path-of-jar-file-in-gcp

1 Answer


rsl1atfo1#

I used a different mapper/reducer program and was able to run MapReduce. I used https://github.com/satishuc15/tfidf-hadoopmapreduce#tfidf-hadoop and ran the following commands on my GCP cluster.

> hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar -file /home/kirthyodackal/MapperPhaseOne.py /home/kirthyodackal/ReducerPhaseOne.py -mapper "python MapperPhaseOne.py" -reducer "python ReducerPhaseOne.py" -input hdfs://cluster-3299-m/mapinput/inputfile -output hdfs://cluster-3299-m/mappred1
> hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar -file /home/kirthyodackal/MapperPhaseTwo.py /home/kirthyodackal/ReducerPhaseTwo.py -mapper "python MapperPhaseTwo.py" -reducer "python ReducerPhaseTwo.py" -input hdfs://cluster-3299-m/mappred1/part-00000 hdfs://cluster-3299-m/mappred1/part-00001 hdfs://cluster-3299-m/mappred1/part-00002 hdfs://cluster-3299-m/mappred1/part-00003 hdfs://cluster-3299-m/mappred1/part-00004  -output hdfs://cluster-3299-m/mappred2
> hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar -file /home/kirthyodackal/MapperPhaseThree.py /home/kirthyodackal/ReducerPhaseThree.py -mapper "python MapperPhaseThree.py" -reducer "python ReducerPhaseThree.py" -input hdfs://cluster-3299-m/mappred2/part-00000 hdfs://cluster-3299-m/mappred2/part-00001 hdfs://cluster-3299-m/mappred2/part-00002 hdfs://cluster-3299-m/mappred2/part-00003 hdfs://cluster-3299-m/mappred2/part-00004  -output hdfs://cluster-3299-m/mappredf
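If the streaming jar is not at that path on a given cluster, one way to look for it is to search the filesystem on the master node. This is only a minimal sketch: the function name `find_streaming_jar` and the default search root `/usr/lib` are assumptions made for illustration and do not come from the answer.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first hadoop-streaming jar found under a directory.
# The default root (/usr/lib) is an assumption; pass another directory if needed.
find_streaming_jar() {
  local root="${1:-/usr/lib}"
  # The pattern matches both versioned (hadoop-streaming-1.2.1.jar) and
  # unversioned (hadoop-streaming.jar) names; errors such as unreadable
  # directories are discarded.
  find "$root" -name 'hadoop-streaming*.jar' 2>/dev/null | sort | head -n 1
}
```

The result could then be stored once, for example `STREAMING_JAR="$(find_streaming_jar)"`, and passed to `hadoop jar "$STREAMING_JAR" ...` in place of a hard-coded path.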

The link below outlines how I used MapReduce on GCP. https://github.com/kirthy21/data-analysis-stack-exchange-hadoop-pig-hive-mapreduce-tfidf
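The three jobs in this answer feed each other: phase two reads the output of phase one, and phase three reads the output of phase two. The sketch below shows how the command for one phase could be assembled, passing the previous output directory as `-input` instead of listing every `part-0000N` file (Hadoop's file input formats accept a directory and read the files in it). The function name `streaming_cmd` and its arguments are hypothetical; the jar path, script directory, and script naming pattern are taken from the commands in this answer. It repeats `-file` once per script, and it only prints the command, so nothing runs until the output has been reviewed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the streaming command for one phase (dry run only).
# Arguments: phase name (e.g. One), input path, output path.
streaming_cmd() {
  local phase="$1" input="$2" output="$3"
  local jar=/usr/lib/hadoop-mapreduce/hadoop-streaming.jar  # jar path from the answer
  local dir=/home/kirthyodackal                             # script directory from the answer
  # -file is given once per script so that each one is shipped to the workers.
  printf '%s ' hadoop jar "$jar" \
    -file "$dir/MapperPhase${phase}.py" \
    -file "$dir/ReducerPhase${phase}.py" \
    -mapper "\"python MapperPhase${phase}.py\"" \
    -reducer "\"python ReducerPhase${phase}.py\"" \
    -input "$input" -output "$output"
  printf '\n'
}
```

For example, `streaming_cmd Two hdfs://cluster-3299-m/mappred1 hdfs://cluster-3299-m/mappred2` prints the phase-two command with the whole `mappred1` directory as its input.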

Posted 2021-05-27
