mrjob virtualenv error: Permission denied

58wvjzkj posted on 2021-05-30 in Hadoop

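I work at a large company that has a Hadoop cluster. I had the administrators install virtualenv on all Hadoop worker nodes so that I can submit mrjob jobs with dependencies that may not exist in the standard Python installation on the worker nodes. Following the mrjob documentation, my mrjob.conf file looks like this: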

    runners:
      hadoop:
        setup:
        - virtualenv venv
        - . venv/bin/activate
        - pip install nltk

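I have a simple job that just imports the nltk package. I can verify that this setup script runs on the worker nodes (if I add simple commands, such as writing data to /tmp, they work). However, I get the following error: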

    New python executable in venv/bin/python
    Installing setuptools............done.
    Installing pip...
      Error [Errno 13] Permission denied while executing command /storage5/hadoop/map...env/bin/easy_install /usr/share/python-virtualenv/pip-1.1.tar.gz
    ...Installing pip...done.
    Traceback (most recent call last):
      File "/usr/bin/virtualenv", line 3, in <module>
        virtualenv.main()
      File "/usr/lib/python2.7/dist-packages/virtualenv.py", line 938, in main
        never_download=options.never_download)
      File "/usr/lib/python2.7/dist-packages/virtualenv.py", line 1054, in create_environment
        install_pip(py_executable, search_dirs=search_dirs, never_download=never_download)
      File "/usr/lib/python2.7/dist-packages/virtualenv.py", line 643, in install_pip
        filter_stdout=_filter_setup)
      File "/usr/lib/python2.7/dist-packages/virtualenv.py", line 976, in call_subprocess
        cwd=cwd, env=env)
      File "/usr/lib/python2.7/subprocess.py", line 679, in __init__
        errread, errwrite)
      File "/usr/lib/python2.7/subprocess.py", line 1249, in _execute_child
        raise child_exception
    OSError: [Errno 13] Permission denied
    java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:362)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:572)
        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:136)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1438)
        at org.apache.hadoop.mapred.Child.main(Child.java:262)

What is causing this error?

eqfvzcg8 #1

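Thanks for the idea of deploying packages to the cluster this way.
As for your problem, it looks like the task does not have permission to write to the directory where the virtualenv is being created.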
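
One way to test that guess is to probe the directory directly. The sketch below is only an illustration: `check_dir` is a hypothetical helper, not part of mrjob or virtualenv. It reports whether the current user can write to a directory, and also whether a freshly created script there can be executed, since the log shows the failure happening while executing `easy_install` inside the new environment. Whether the cause on this cluster is a write restriction or an execute restriction is an assumption that the probe is meant to check, not something the traceback confirms.

```python
import os
import stat
import subprocess
import tempfile


def check_dir(path):
    """Hypothetical helper: probe write and execute permissions in `path`."""
    result = {
        "exists": os.path.isdir(path),
        "writable": os.access(path, os.W_OK),
        "can_exec_new_file": False,
    }
    if not (result["exists"] and result["writable"]):
        return result

    # Create a tiny shell script inside the directory and mark it executable.
    fd, script = tempfile.mkstemp(dir=path, suffix=".sh")
    try:
        with os.fdopen(fd, "w") as f:
            f.write("#!/bin/sh\nexit 0\n")
        os.chmod(script, stat.S_IRWXU)
        try:
            # If execution is refused, this raises OSError (Errno 13),
            # the same exception type seen in the traceback.
            result["can_exec_new_file"] = subprocess.call([script]) == 0
        except OSError:
            result["can_exec_new_file"] = False
    finally:
        os.remove(script)
    return result


if __name__ == "__main__":
    # Probe the current working directory, e.g. the task's working directory.
    print(check_dir(os.getcwd()))
```

Running this from the directory in which `virtualenv venv` is invoked on a worker node should show which of the two permissions, if either, is missing.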
