Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1 on AWS EMR, but works on local machine

Posted by g6ll5ycj on 2021-05-27 in Hadoop


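I am trying to run a simple MapReduce job that reads the input with mapper.py, then takes the output of mapper.py and reads it with reducer.py. The code works on my local machine, but when I try it on AWS EMR it fails with the following error: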

Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1

Below are input.txt, mapper.py, and reducer.py.

input.txt

scott,haris
jenifer,smith
ted,brandy
amanda,woods
bob,wilton
damn,halloween

mapper.py


#!/usr/bin/env python

import sys

for line in sys.stdin:
    x = line.strip()
    first,last = x.split(",")
    print '%s\t%s' % (first, last)

reducer.py


#!/usr/bin/env python

import sys

for line in sys.stdin:
    x = line.strip()
    key, value = x.split('\t')
    print '%s\t%s' % (key, value)

I am using the following command:

hadoop jar /usr/lib/hadoop/hadoop-streaming.jar -files s3://test/mapper.py,s3://test/reducer.py -mapper "python mapper.py" -reducer "python reducer.py" -input s3://test/input.txt -output s3://test/output

hadoop mapreduce amazon-emr amazon-web-services

Source: https://stackoverflow.com/questions/63872693/error-java-lang-runtimeexception-pipemapred-waitoutputthreads-subprocess-fa

1 Answer


bihw5rsg #1

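It looks like the problem is in your Python mapper/reducer scripts. Can you check the following two things?
1. Are your mapper and reducer scripts executable, with the right permissions? Make sure the shebang line points to the right interpreter (for example, try #!/usr/bin/python).
2. Is your Python program itself correct for the interpreter on the cluster? For example, if the server runs Python 3, print needs parentheses, as in print(...). There may also be other problems in the scripts.
Try running the Python scripts directly from bash on the EMR cluster and see whether they work.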
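If the interpreter version turns out to be the issue, one option is to write the record handling so that it runs unchanged under both Python 2.7 and Python 3. The following is only a minimal sketch under that assumption: the helper names map_line, reduce_line, and run are hypothetical, and silently skipping malformed lines is a design choice that is not in the original scripts.

```python
from __future__ import print_function  # makes print() a function on Python 2 as well

import sys


def map_line(line):
    """Convert 'first,last' into 'first<TAB>last'; return None if malformed."""
    parts = line.strip().split(",")
    if len(parts) != 2:
        return None  # assumption: skip bad records rather than crash the task
    return "%s\t%s" % (parts[0], parts[1])


def reduce_line(line):
    """Pass a 'key<TAB>value' record through; return None if malformed."""
    parts = line.strip().split("\t")
    if len(parts) != 2:
        return None
    return "%s\t%s" % (parts[0], parts[1])


def run(transform, stream, out=sys.stdout):
    """Apply transform to every line of stream and print the usable results."""
    for line in stream:
        result = transform(line)
        if result is not None:
            print(result, file=out)


# Small in-memory demo: two records from input.txt plus one blank line.
sample = ["scott,haris\n", "\n", "ted,brandy\n"]
mapped = [m for m in (map_line(l) for l in sample) if m is not None]
reduced = [reduce_line(m) for m in sorted(mapped)]
print(reduced)
```

In an actual streaming script, the demo at the bottom would be replaced by a single call such as run(map_line, sys.stdin) in mapper.py and run(reduce_line, sys.stdin) in reducer.py.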
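To make that manual check repeatable, the script can be run under a chosen interpreter with a sample record on stdin, capturing the exit code and stderr. This is a rough sketch, not a diagnosis of this particular job: run_script is a hypothetical helper, the inline legacy_mapper string is a stand-in for mapper.py, and the demo assumes it is started with a Python 3 interpreter. Under Python 3 the print statement is a syntax error, so the child process exits non-zero before reading any input.

```python
import os
import subprocess
import sys
import tempfile


def run_script(interpreter, script_path, input_text):
    """Feed input_text to the script on stdin and capture exit code, stdout, stderr."""
    proc = subprocess.Popen(
        [interpreter, script_path],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    out, err = proc.communicate(input_text.encode("utf-8"))
    return proc.returncode, out.decode("utf-8"), err.decode("utf-8")


# Stand-in for mapper.py, written with a Python 2 style print statement.
legacy_mapper = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    first, last = line.strip().split(',')\n"
    "    print '%s\\t%s' % (first, last)\n"
)

# Write the stand-in script to a temporary file so it can be executed.
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as handle:
    handle.write(legacy_mapper)
    script = handle.name

try:
    # sys.executable is the interpreter running this check (assumed Python 3).
    code, stdout_text, stderr_text = run_script(sys.executable, script, "scott,haris\n")
finally:
    os.remove(script)

print("exit code:", code)
lines = stderr_text.strip().splitlines()
print(lines[-1] if lines else "(no stderr)")
```

On the cluster, the same helper could be pointed at the real mapper.py and reducer.py, passing whichever interpreter the python command resolves to there. A non-zero exit code together with the last line of stderr usually says more than the generic PipeMapRed message.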

Answered on 2021-05-27
