hadoop: loading data from the host into HDFS with Python fails with a "No such file or directory" error

Posted by wtlkbnrh on 2021-06-02 in Hadoop

I want to use a Python script to load the file ad.py from my host (Ubuntu 16.04 LTS) into the input folder in HDFS. This is what I have found so far:

import subprocess

def run_cmd(args_list):
    """
    Run a Linux command and return (return code, stdout, stderr).
    """
    print('Running system command: {0}'.format(' '.join(args_list)))
    proc = subprocess.Popen(args_list, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    s_output, s_err = proc.communicate()
    s_return = proc.returncode
    return s_return, s_output, s_err

run_cmd(['hdfs', 'dfs', '-put', '/home/mernst/Desktop/ad.py', 'hdfs://localhost:55760/input'])
(ret, out, err)= run_cmd(['hdfs', 'dfs', '-ls', 'hdfs://localhost:55760/input'])

print(out)

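If I save the code above in a file (for example myfile.py) and run it from bash with python myfile.py, ad.py is not loaded into the input folder, but at least the list command works and I can see which files are stored in HDFS: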

python python-hadoop.py
Running system command: hdfs dfs -put /home/mernst/Desktop/ad.py /usr/local/bin/hdfs/input
Running system command: hdfs dfs -ls hdfs://localhost:55760/input
Found 4 items
drwxr-xr-x   - hduser supergroup          0 2017-09-21 15:59 hdfs://localhost:55760/input/1st_test
-rw-r--r--   1 hduser supergroup        393 2017-09-20 10:28 hdfs://localhost:55760/input/PySpark.txt
-rw-r--r--   1 hduser supergroup         14 2017-09-19 14:50 hdfs://localhost:55760/input/file.txt
-rw-r--r--   1 hduser supergroup         46 2017-09-28 09:57 hdfs://localhost:55760/input/streaming_kmeans_data_test.txt
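
A note on the silent -put: the script discards the return value of the first run_cmd call, so whatever `hdfs dfs -put` wrote to stderr is never shown. The following is a minimal sketch, assuming Python 3, of a variant that surfaces a failing command's exit code and stderr. The name run_cmd_checked is hypothetical and not part of the original script.

```python
import subprocess
import sys


def run_cmd_checked(args_list):
    """Run a command and return (returncode, stdout, stderr) as text.

    Hypothetical variant of run_cmd that reports failures on stderr.
    """
    print('Running system command: {0}'.format(' '.join(args_list)))
    proc = subprocess.Popen(args_list, stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    s_output, s_err = proc.communicate()
    # communicate() returns bytes on Python 3; decode for readable messages.
    out = s_output.decode('utf-8', errors='replace')
    err = s_err.decode('utf-8', errors='replace')
    if proc.returncode != 0:
        # A failing command usually explains itself on stderr.
        sys.stderr.write('Command failed with exit code {0}:\n{1}\n'.format(
            proc.returncode, err))
    return proc.returncode, out, err
```

Calling it as run_cmd_checked(['hdfs', 'dfs', '-put', '/home/mernst/Desktop/ad.py', 'hdfs://localhost:55760/input']) should then print the reason the upload was rejected, if there is one.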

However, if I run the script with sudo myfile.py, I get the following error:

Traceback (most recent call last):
  File "python-hadoop.py", line 18, in <module>
    (ret, out, err)= run_cmd(['hdfs', 'dfs', '-ls', 'hdfs://localhost:55760/input'])
  File "python-hadoop.py", line 11, in run_cmd
    proc = subprocess.Popen(args_list, stdout=subprocess.PIPE, stderr=subprocess.PIPE)#cwd = "/usr/lib/jvm/java-8-oracle/jre/")
  File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1343, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

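I googled this error, and it seems to occur when the args_list argument to subprocess.Popen is a string instead of a list, but that is not the case in myfile.py, so I really don't know what is going wrong here. Any help would be appreciated.
P.S.: I also tried shell=True in subprocess.Popen, but without success.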
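
One possible explanation, offered as an assumption rather than a confirmed diagnosis: when Popen receives a list, OSError: [Errno 2] is also raised if the program named in the first element cannot be found on PATH. sudo commonly replaces the caller's PATH (Ubuntu's default sudoers sets secure_path), so hdfs may resolve for the normal user but not under sudo. The sketch below, assuming Python 3, resolves the executable explicitly and reports the PATH in effect when the lookup fails. The names resolve_executable, run_cmd_resolved and extra_dirs are hypothetical.

```python
import os
import shutil
import subprocess


def resolve_executable(name, extra_dirs=()):
    """Return a usable path for `name`, or None if it cannot be found.

    Searches the current PATH first, then any extra directories given.
    """
    found = shutil.which(name)
    if found:
        return found
    if extra_dirs:
        # shutil.which accepts an os.pathsep-separated search path.
        return shutil.which(name, path=os.pathsep.join(extra_dirs))
    return None


def run_cmd_resolved(args_list, extra_dirs=()):
    """Like run_cmd, but fail with a clear message if the program is missing."""
    exe = resolve_executable(args_list[0], extra_dirs)
    if exe is None:
        raise RuntimeError('{0!r} not found; PATH is {1!r}'.format(
            args_list[0], os.environ.get('PATH', '')))
    proc = subprocess.Popen([exe] + list(args_list[1:]),
                            stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    s_output, s_err = proc.communicate()
    return proc.returncode, s_output, s_err
```

If the message shows a PATH without the Hadoop bin directory when the script runs under sudo, passing that directory through extra_dirs, or calling hdfs by its absolute path, would be one thing to try.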
