I'm trying to run Spark on Kubernetes, with interactive commands via the Spark shell or a Jupyter interface. I've built custom images for the driver and executor pods, and I use the following code to start the Spark context:
import pyspark
conf = pyspark.SparkConf()
conf.setMaster("k8s://https://kubernetes.default.svc.cluster.local:443")
conf.set(
    "spark.kubernetes.container.image",
    "<Repo>/<IMAGENAME>:latest")
conf.set("spark.kubernetes.namespace", "default")
# Authentication certificate and token (required to create worker pods):
conf.set(
    "spark.kubernetes.authenticate.caCertFile",
    "/var/run/secrets/kubernetes.io/serviceaccount/ca.crt")
conf.set(
    "spark.kubernetes.authenticate.oauthTokenFile",
    "/var/run/secrets/kubernetes.io/serviceaccount/token")
conf.set(
    "spark.kubernetes.authenticate.driver.serviceAccountName",
    "spark-master")
conf.set("spark.executor.instances", "2")
conf.set("spark.driver.host", "spark-test-jupyter")
conf.set("spark.executor.memory", "1g")
conf.set("spark.executor.cores", "1")
conf.set("spark.driver.blockManager.port", "7777")
conf.set("spark.driver.bindAddress", "0.0.0.0")
conf.set("spark.driver.port", "29416")
sc = pyspark.SparkContext(conf=conf)
The driver tries to launch executor pods: two executor pods start, error out within seconds, and then a new pair is created that does exactly the same thing. The logs look like this:
pyspark-shell-1620894878554-exec-8 0/1 Pending 0 0s
pyspark-shell-1620894878554-exec-8 0/1 ContainerCreating 0 0s
pyspark-shell-1620894878528-exec-7 1/1 Running 0 1s
pyspark-shell-1620894878554-exec-8 1/1 Running 0 2s
pyspark-shell-1620894878528-exec-7 0/1 Error 0 4s
pyspark-shell-1620894878554-exec-8 0/1 Error 0 4s
pyspark-shell-1620894878528-exec-7 0/1 Terminating 0 5s
pyspark-shell-1620894878528-exec-7 0/1 Terminating 0 5s
pyspark-shell-1620894878554-exec-8 0/1 Terminating 0 5s
pyspark-shell-1620894878554-exec-8 0/1 Terminating 0 5s
pyspark-shell-1620894883595-exec-9 0/1 Pending 0 0s
pyspark-shell-1620894883595-exec-9 0/1 Pending 0 0s
pyspark-shell-1620894883595-exec-9 0/1 ContainerCreating 0 0s
pyspark-shell-1620894883623-exec-10 0/1 Pending 0 0s
pyspark-shell-1620894883623-exec-10 0/1 Pending 0 0s
pyspark-shell-1620894883623-exec-10 0/1 ContainerCreating 0 0s
pyspark-shell-1620894883595-exec-9 1/1 Running 0 1s
pyspark-shell-1620894883623-exec-10 1/1 Running 0 3s
This continues endlessly until I stop it.
What exactly is going wrong here?
1 Answer
Your spark.driver.host should be the DNS name of a Service, so something like spark-test-jupyter.default.svc.cluster.local
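The executors connect back to the driver over the ports configured above (spark.driver.port and spark.driver.blockManager.port), so that DNS name has to resolve inside the cluster. One way to get it is a headless Service pointing at the driver/Jupyter pod. A minimal sketch, assuming the driver pod carries the label app: spark-test-jupyter (that label is an assumption, not something from the question):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: spark-test-jupyter
  namespace: default
spec:
  clusterIP: None            # headless: DNS resolves directly to the pod IP
  selector:
    app: spark-test-jupyter  # assumed label on the driver/Jupyter pod
  ports:
    - name: driver
      port: 29416            # matches spark.driver.port
    - name: blockmanager
      port: 7777             # matches spark.driver.blockManager.port
```

With a Service like this in place, the conf would use the full Service DNS name, e.g. conf.set("spark.driver.host", "spark-test-jupyter.default.svc.cluster.local").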