pyspark: Why is a call to SparkSession.builder.getOrCreate() in the Python console treated like a command-line spark-submit?

dxxyhpgq posted on 2023-08-02 in Spark

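Inside a Python console, I am trying to create a Spark session (I am not using the pyspark shell, in order to isolate dependencies). Why does this produce the spark-submit command-line usage message and an error?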

NOTE: SPARK_PREPEND_CLASSES is set, placing locally compiled Spark classes ahead of assembly.
Error: Missing application resource.

NOTE: SPARK_PREPEND_CLASSES is set, placing locally compiled Spark classes ahead of assembly.
Usage: spark-submit [options] <app jar | python file | R file> [app arguments]
Usage: spark-submit --kill [submission ID] --master [spark://...]
..

Options:
  --master MASTER_URL         spark://host:port, mesos://host:port, yarn,
                              k8s://https://host:port, or local (Default: local[*]).
  --deploy-mode DEPLOY_MODE   Whether to launch the driver program locally ("client") or
                              on one of the worker machines inside the cluster ("cluster")
                              (Default: client).
  --class CLASS_NAME          Your application's main class (for Java / Scala apps).
  --name NAME                 A name of your application.
  --jars JARS                 Comma-separated list of jars to include on the driver
   ..
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 7, in getSpark
  File "/shared/spark/python/pyspark/sql/session.py", line 173, in getOrCreate
    sc = SparkContext.getOrCreate(sparkConf)
  File "/shared/spark/python/pyspark/context.py", line 367, in getOrCreate
    SparkContext(conf=conf or SparkConf())
  File "/shared/spark/python/pyspark/context.py", line 133, in __init__
    SparkContext._ensure_initialized(self, gateway=gateway, conf=conf)
  File "/shared/spark/python/pyspark/context.py", line 316, in _ensure_initialized
    SparkContext._gateway = gateway or launch_gateway(conf)
  File "/shared/spark/python/pyspark/java_gateway.py", line 46, in launch_gateway
    return _launch_gateway(conf)
  File "/shared/spark/python/pyspark/java_gateway.py", line 108, in _launch_gateway
    raise Exception("Java gateway process exited before sending its port number")
Exception: Java gateway process exited before sending its port number


w8f9ii69  #1

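After trying more than 15 resources, and reading through about twice as many, the only thing that worked was this previously un-upvoted answer: https://stackoverflow.com/a/55326797/1056563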

export PYSPARK_SUBMIT_ARGS="--master local[2] pyspark-shell"

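It does not matter whether you use local[2], local, or local[*]: what matters is the format, including the critical pyspark-shell part at the end.
Another way to handle this (more robust to environment changes) is to set the variable from within the Python code, before the Spark session is created: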

import os
os.environ["PYSPARK_SUBMIT_ARGS"] = "pyspark-shell"
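If other submit arguments are already set in the environment, overwriting the variable would discard them. Below is a minimal sketch of a variant that keeps existing arguments and only appends pyspark-shell when it is missing. The helper names `ensure_pyspark_shell` and `get_spark` are hypothetical, not part of PySpark, and the sketch assumes the variable only has to be in place before the first session is created.

```python
import os
import shlex


def ensure_pyspark_shell():
    """Make sure PYSPARK_SUBMIT_ARGS ends with the 'pyspark-shell' token.

    Hypothetical helper for this sketch: existing arguments are kept and
    the token is appended only if it is not already present.
    """
    current = os.environ.get("PYSPARK_SUBMIT_ARGS", "")
    if "pyspark-shell" not in shlex.split(current):
        os.environ["PYSPARK_SUBMIT_ARGS"] = (current + " pyspark-shell").strip()
    return os.environ["PYSPARK_SUBMIT_ARGS"]


def get_spark():
    """Fix up the environment, then create (or reuse) the Spark session."""
    ensure_pyspark_shell()
    # Imported here only so the helper above can be used and tested
    # on a machine where pyspark is not installed.
    from pyspark.sql import SparkSession
    return SparkSession.builder.getOrCreate()
```

Calling `get_spark()` from the Python console then replaces a bare `SparkSession.builder.getOrCreate()`. The environment change only affects the current Python process, so it has to run in every new console session.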
