File stuck in the copying state in HDFS

kd3sttzy · published 2021-05-29 in Hadoop

I am trying to parse a file with Python and store the result in HDFS. This code ran fine a few days ago, but since this morning the code executes without any error, yet in the HDFS directory I only see DATA._COPYING_. It never comes out of the copying state. What am I doing wrong?
Code:

from subprocess import Popen, PIPE

# Stream every file under /user/cloudera/pin out of HDFS...
cat = Popen(["hadoop", "fs", "-cat", "/user/cloudera/pin/*"], stdout=PIPE)
# ...and pipe the transformed lines back into HDFS via `-put -`
dumpoff = Popen(["hadoop", "fs", "-put", "-", "/user/cloudera/DATA"], stdin=PIPE)

obrInd = "0"
line1 = ""
for line in cat.stdout:
    code = line.split('|')[1]
    idval = line.split('|')[2]
    if code == "OB":
        obrInd = runnno
    line1 = line.strip() + "|" + "OB_" + obrInd
    dumpoff.stdin.write(line1)
    print(line1)
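For context on the copying state: `hadoop fs -put` writes the destination as a temporary `<name>._COPYING_` file and only renames it to its final name when the copy completes, which for `-put -` means EOF on stdin. The loop above never closes `dumpoff.stdin` or waits on the process. Below is a minimal sketch of the same pipe pattern with an explicit EOF, using a plain `cat` process as a hypothetical stand-in for the hadoop commands:

```python
from subprocess import Popen, PIPE

# `cat` stands in here for `hadoop fs -put -`: it keeps reading stdin
# until EOF, just as -put keeps the destination as <dest>._COPYING_
# until the pipe is closed.
sink = Popen(["cat"], stdin=PIPE, stdout=PIPE)

for line in [b"a|OB|1\n", b"b|XX|2\n"]:
    sink.stdin.write(line)

# Closing stdin delivers EOF; without this the child never finishes
# (the HDFS analogue of a file stuck in the copying state).
sink.stdin.close()
out = sink.stdout.read()  # drain whatever the child echoed back
sink.wait()
print(out.decode())
```

On the real cluster the same pattern would mean calling `dumpoff.stdin.close()` and `dumpoff.wait()` after the loop so `-put` can finalize the file.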

Update:

Configured Capacity: 424169496576 (395.04 GB)
Present Capacity: 348671389696 (324.73 GB)
DFS Remaining: 153408200704 (142.87 GB)
DFS Used: 195263188992 (181.85 GB)
DFS Used%: 56.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

-------------------------------------------------
report: Access denied for user cloudera. Superuser privilege is required

No answers yet.
