ModuleNotFoundError: No module named 'py4j'

Posted by ldxq2e6h on 2021-05-29 in Hadoop

I installed Spark and ran into a problem when loading the pyspark module in IPython. I get the following error:

ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-2-49d7c4e178f8> in <module>
----> 1 import pyspark

/opt/spark/python/pyspark/__init__.py in <module>
     44 
     45 from pyspark.conf import SparkConf
---> 46 from pyspark.context import SparkContext
     47 from pyspark.rdd import RDD
     48 from pyspark.files import SparkFiles

/opt/spark/python/pyspark/context.py in <module>
     27 from tempfile import NamedTemporaryFile
     28 
---> 29 from py4j.protocol import Py4JError
     30 
     31 from pyspark import accumulators

ModuleNotFoundError: No module named 'py4j'

Answer #1, by oewdyzsn

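If Spark itself runs fine, you probably need to fix the PYTHONPATH environment variable. Check the file names in the $SPARK_HOME/python/lib/ directory to find the py4j archive that ships with your Spark version. For Spark 2.4.3, the file is py4j-0.10.7-src.zip: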

export PYTHONPATH=$SPARK_HOME/python:$SPARK_HOME/python/lib/py4j-0.10.7-src.zip:$PYTHONPATH
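
If you would rather not hard-code the py4j version, the same fix can be applied from inside the IPython session. The following is a minimal sketch, not part of the original answer: the helper names spark_python_paths and add_spark_to_sys_path are hypothetical, and the /opt/spark fallback is only an assumption taken from the paths in the traceback above.

```python
import glob
import os
import sys


def spark_python_paths(spark_home):
    """Return the two sys.path entries PySpark needs under a Spark install."""
    python_dir = os.path.join(spark_home, "python")
    lib_dir = os.path.join(python_dir, "lib")
    # The py4j version in the file name differs between Spark releases,
    # so match it by pattern instead of hard-coding it.
    py4j_zips = sorted(glob.glob(os.path.join(lib_dir, "py4j-*-src.zip")))
    if not py4j_zips:
        raise FileNotFoundError("no py4j-*-src.zip found in " + lib_dir)
    return [python_dir, py4j_zips[-1]]


def add_spark_to_sys_path(spark_home=None):
    """Prepend the PySpark and py4j locations to sys.path for this session."""
    # Assumption: fall back to /opt/spark (the path seen in the traceback)
    # when SPARK_HOME is not set.
    spark_home = spark_home or os.environ.get("SPARK_HOME", "/opt/spark")
    for entry in reversed(spark_python_paths(spark_home)):
        if entry not in sys.path:
            sys.path.insert(0, entry)
```

Call add_spark_to_sys_path() before import pyspark. Unlike the export line, this only affects the current Python process, so it has to be repeated in every new session.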
