How do I set the driver Python path in cluster mode (PySpark)?

mspsb9vt · posted 2022-11-25 in Apache

My program runs fine in client mode, but it fails when I run it in cluster mode because the Python version differs across the cluster nodes.
I tried to set the driver's Python path for when the application runs in cluster mode.
Below is my spark-submit command for cluster mode:

spark-submit --master yarn --deploy-mode cluster --num-executors 10 --executor-cores 3 --driver-memory 50G --executor-memory 20G \
--conf spark.dynamicAllocation.enabled=false  \
--conf spark.kryoserializer.buffer.max=1024 --conf spark.yarn.keytab=keytab_path --conf spark.yarn.principal=${10} \
--conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=/bin/python3 \
--jars path_to_jars \
--py-files Pipeline.egg-info,<path>/app.py <application_path>/app.py arguments

Below is the error:

22/08/04 06:09:34 INFO yarn.ApplicationMaster: Starting the user application in a separate Thread
22/08/04 06:09:34 INFO yarn.ApplicationMaster: Waiting for spark context initialization...
22/08/04 06:09:34 ERROR yarn.ApplicationMaster: User application exited with status 1
22/08/04 06:09:34 INFO yarn.ApplicationMaster: Final app status: FAILED, exitCode: 13, (reason: User application exited with status 1)
22/08/04 06:09:34 ERROR yarn.ApplicationMaster: Uncaught exception:
org.apache.spark.SparkException: Exception thrown in awaitResult:
        at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:226)
        at org.apache.spark.deploy.yarn.ApplicationMaster.runDriver(ApplicationMaster.scala:447)
        at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:275)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:805)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$3.run(ApplicationMaster.scala:804)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
        at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:804)
        at org.apache.spark.deploy.yarn.ApplicationMaster.main(ApplicationMaster.scala)
Caused by: org.apache.spark.SparkUserAppException: User application exited with 1
        at org.apache.spark.deploy.PythonRunner$.main(PythonRunner.scala:106)
        at org.apache.spark.deploy.PythonRunner.main(PythonRunner.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:673)
22/08/04 06:09:34 INFO yarn.ApplicationMaster: Deleting staging directory hdfs://test-scc/user/tst_rdip_cross/.sparkStaging/application_1643123069214_48871
22/08/04 06:09:35 INFO util.ShutdownHookManager: Shutdown hook called

Examining the console logs and application logs did not reveal the cause of the error; only after going through the YARN logs did we find that the Python version is incompatible on the cluster nodes.
Can anyone help me?
Thanks in advance.


eivgtgni · 1#

If you are using Spark version >= 2.1.0, you can set the Python path with spark.pyspark.python and spark.pyspark.driver.python, passing them as --conf options when you submit the Spark job.
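A minimal sketch of such a submit command, assuming the interpreter lives at /usr/bin/python3 on every cluster node (substitute the real path for your environment, and add back your own --jars, --py-files and memory options):

spark-submit --master yarn --deploy-mode cluster \
  --conf spark.pyspark.python=/usr/bin/python3 \
  --conf spark.pyspark.driver.python=/usr/bin/python3 \
  <application_path>/app.py arguments

spark.pyspark.python selects the interpreter for both the driver and the executors, while spark.pyspark.driver.python overrides it for the driver only. In YARN cluster mode the driver itself runs on a cluster node, so whichever path you pass must exist on every node that may host the driver or an executor.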
