pyspark - Spark on Kubernetes: executor pods fail to start when creating the SparkContext

noj0wjuj · asked on 2022-11-01 in Spark

I am trying to run Spark on Kubernetes, issuing interactive commands through the Spark shell or a Jupyter interface. I have built custom images for the driver and executor pods and use the following code to start the Spark context:

import pyspark

conf = pyspark.SparkConf()
conf.setMaster("k8s://https://kubernetes.default.svc.cluster.local:443")

# Custom image for the driver and executor pods:
conf.set(
    "spark.kubernetes.container.image",
    "<Repo>/<IMAGENAME>:latest")
conf.set("spark.kubernetes.namespace", "default")

# Authentication certificate and token (required to create worker pods):
conf.set(
    "spark.kubernetes.authenticate.caCertFile",
    "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt")
conf.set(
    "spark.kubernetes.authenticate.oauthTokenFile",
    "/var/run/secrets/kubernetes.io/serviceaccount/token")
conf.set(
    "spark.kubernetes.authenticate.driver.serviceAccountName",
    "spark-master")

# Executor sizing:
conf.set("spark.executor.instances", "2")
conf.set("spark.executor.memory", "1g")
conf.set("spark.executor.cores", "1")

# Networking back to the driver:
conf.set("spark.driver.host", "spark-test-jupyter")
conf.set("spark.driver.bindAddress", "0.0.0.0")
conf.set("spark.driver.port", "29416")
conf.set("spark.driver.blockManager.port", "7777")

sc = pyspark.SparkContext(conf=conf)

The driver launches executor pods, but the launch never settles: two executor pods start, error out after a few seconds, and a fresh pair is created that does the same thing. The pod watch output looks like this:

pyspark-shell-1620894878554-exec-8   0/1     Pending             0          0s
pyspark-shell-1620894878554-exec-8   0/1     ContainerCreating   0          0s
pyspark-shell-1620894878528-exec-7   1/1     Running             0          1s
pyspark-shell-1620894878554-exec-8   1/1     Running             0          2s
pyspark-shell-1620894878528-exec-7   0/1     Error               0          4s
pyspark-shell-1620894878554-exec-8   0/1     Error               0          4s
pyspark-shell-1620894878528-exec-7   0/1     Terminating         0          5s
pyspark-shell-1620894878528-exec-7   0/1     Terminating         0          5s
pyspark-shell-1620894878554-exec-8   0/1     Terminating         0          5s
pyspark-shell-1620894878554-exec-8   0/1     Terminating         0          5s
pyspark-shell-1620894883595-exec-9   0/1     Pending             0          0s
pyspark-shell-1620894883595-exec-9   0/1     Pending             0          0s
pyspark-shell-1620894883595-exec-9   0/1     ContainerCreating   0          0s
pyspark-shell-1620894883623-exec-10   0/1     Pending             0          0s
pyspark-shell-1620894883623-exec-10   0/1     Pending             0          0s
pyspark-shell-1620894883623-exec-10   0/1     ContainerCreating   0          0s
pyspark-shell-1620894883595-exec-9    1/1     Running             0          1s
pyspark-shell-1620894883623-exec-10   1/1     Running             0          3s

This loops endlessly until I stop it.
What exactly is going wrong here?

vnjpjtjt answered:

Your spark.driver.host should be the DNS name of a Service, so something like spark-test-jupyter.default.svc.cluster.local.
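
A minimal sketch of the corrected driver-side settings, assuming the driver pod is exposed through a headless Service named spark-test-jupyter in the default namespace (the Service name, namespace, and ports below are taken from the question and may differ in your cluster):

# Assumption: a Service "spark-test-jupyter" in namespace "default" selects the
# driver pod and publishes the driver port (29416) and the block-manager
# port (7777) so executors can reach the driver from other pods.
conf.set("spark.driver.host", "spark-test-jupyter.default.svc.cluster.local")
conf.set("spark.driver.port", "29416")
conf.set("spark.driver.blockManager.port", "7777")

With an unresolvable driver host, each executor starts, fails to register with the driver, and exits; Spark's Kubernetes scheduler backend then requests replacement pods, which produces the endless loop shown above.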
