I want to connect from Python on my local machine to a remote Spark master running in Docker:
```python
from pyspark.sql import SparkSession

spark = SparkSession \
    .builder \
    .master('spark://spark-spark-master:7077') \
    .appName('spark-yarn') \
    .getOrCreate()
```
I get a Connection Refused error when running the code.
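For reference, the underlying Java exception usually names the exact failure mode; a minimal sketch that surfaces it, assuming the same master URL as above:

```python
from pyspark.sql import SparkSession

try:
    spark = (
        SparkSession.builder
        .master('spark://spark-spark-master:7077')
        .appName('spark-yarn')
        .getOrCreate()
    )
except Exception as exc:
    # The nested Java trace distinguishes a DNS failure
    # (UnknownHostException: the name does not resolve locally)
    # from a TCP-level "Connection refused" (the name resolved,
    # but nothing answered on that address/port).
    print(type(exc).__name__, exc)
    raise
```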
Running `telnet ip 7077` in my terminal gives the error:

```
telnet: Unable to connect to remote host: Connection refused
```
This is confusing, because on the server itself the port is open and the server accepts connections on port 7077.
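To rule out telnet itself, the same check can be reproduced from Python with a raw TCP connection; a minimal sketch, where `ip` is the same placeholder used in the telnet command above:

```python
import socket

HOST = 'ip'   # placeholder for the server address, as in the question
PORT = 7077

try:
    # Equivalent to `telnet ip 7077`: open a plain TCP connection
    # to the published Spark master port.
    with socket.create_connection((HOST, PORT), timeout=5):
        print(f'TCP connection to {HOST}:{PORT} succeeded')
except OSError as exc:
    # ConnectionRefusedError here mirrors the telnet failure.
    print(f'TCP connection to {HOST}:{PORT} failed: {exc}')
```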
Running `docker container ls` on the server shows:
```
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
9086cf2f26dc bde2020/spark-master:3.0.1-hadoop3.2 "/bin/bash /master.sh" 2 weeks ago Up 2 weeks 6066/tcp, 8080/tcp, 0.0.0.0:7077->7077/tcp spark_spark-master.1.qyie2bq52hbrfg2ttioz6ljwq
5133adc223ef bde2020/spark-worker:3.0.1-hadoop3.2 "/bin/bash /worker.sh" 2 weeks ago Up 2 weeks 8081/tcp spark_spark-worker.ylnj52bj78as9hxr6zdo1lgo3.kwzys14lm3uid0qclyv0jn95o
da2841b1d757 bde2020/hadoop-nodemanager:2.0.0-hadoop3.2.1-java8 "/entrypoint.sh /run…" 2 months ago Up 2 months (healthy) 8042/tcp hadoop_nodemanager.ylnj52bj78as9hxr6zdo1lgo3.o9gznaa9u57wuyf21fl9ya4hi
49a3cbb8073a bde2020/hadoop-resourcemanager:2.0.0-hadoop3.2.1-java8 "/entrypoint.sh /run…" 2 months ago Up 2 months 8088/tcp hadoop_resourcemanager.1.7kwgmhxz74brj6xs218k81ptk
10b22205a879 bde2020/hadoop-historyserver:2.0.0-hadoop3.2.1-java8 "/entrypoint.sh /run…" 2 months ago Up 2 months (healthy) 8188/tcp hadoop_historyserver.1.p3c3ouxmayxt4rvhrjlq7ti4t
775209433ea8 bde2020/hadoop-namenode:2.0.0-hadoop3.2.1-java8 "/entrypoint.sh /run…" 2 months ago Up 2 months (healthy) 0.0.0.0:9000->9000/tcp, 0.0.0.0:9870->9870/tcp hadoop_namenode.1.bbt0n4ne76ddwqtmejlsf590m
5d14d16020e5 bde2020/hadoop-datanode:2.0.0-hadoop3.2.1-java8 "/entrypoint.sh /run…" 2 months ago Up 2 months (healthy) 9864/tcp hadoop_datanode.ylnj52bj78as9hxr6zdo1lgo3.e9drmfbdqicux6ltkk9gv2uh5
83f7b3290995 traefik:v2.2 "/entrypoint.sh --ap…" 2 months ago Up 2 months 80/tcp traefik_traefik.1.ha8o6dc3ewtmppkn4pauugkj
```
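Since the PORTS column shows `0.0.0.0:7077->7077/tcp` only on the master container, the published binding can be double-checked from the server; a sketch using `docker port` via subprocess, with the container name taken from the listing above:

```python
import subprocess

# Ask Docker which host address/port, if any, is bound to the
# container's private port 7077.
result = subprocess.run(
    ['docker', 'port', 'spark_spark-master.1.qyie2bq52hbrfg2ttioz6ljwq', '7077'],
    capture_output=True, text=True, check=True,
)
print(result.stdout)  # expected: 0.0.0.0:7077
```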
The docker-compose.yml file for Spark is:
```yaml
version: '3.6'
services:
  spark-master:
    image: bde2020/spark-master:3.0.1-hadoop3.2
    networks:
      - workbench
    ports:
      - target: 7077
        published: 7077
        mode: host
    deploy:
      restart_policy:
        condition: on-failure
      placement:
        constraints:
          - node.hostname == johnsnow
      labels:
        - "traefik.enable=true"
        - "traefik.docker.network=workbench"
        - "traefik.http.services.spark-master.loadbalancer.server.port=8080"
    env_file:
      - ./hadoop.env
    environment:
      - INIT_DAEMON_STEP=setup_spark
      - "constraint:node==spark-master"
  spark-worker:
    image: bde2020/spark-worker:3.0.1-hadoop3.2
    networks:
      - workbench
    environment:
      - SPARK_MASTER_URL=spark://spark_spark-master:7077
    deploy:
      mode: global
      restart_policy:
        condition: on-failure
      labels:
        - "traefik.enable=true"
        - "traefik.docker.network=workbench"
        - "traefik.http.services.spark-worker.loadbalancer.server.port=8081"
    env_file:
      - ./hadoop.env
    environment:
      - INIT_DAEMON_STEP=setup_spark
      - "constraint:node==spark-worker"
networks:
  workbench:
    external: true
```
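Note that the compose file publishes 7077 in host mode and pins the master to the node `johnsnow`, so from outside the overlay network the master is presumably reachable only through that node's own address, not through the swarm service name. A sketch of a connection attempt that goes through the published host port instead; the address and the driver settings below are assumptions, not part of the original setup:

```python
from pyspark.sql import SparkSession

MASTER_HOST = 'ip'  # placeholder: the address of the node "johnsnow"

spark = (
    SparkSession.builder
    .master(f'spark://{MASTER_HOST}:7077')
    .appName('spark-yarn')
    # Executors must be able to connect back to the local driver;
    # these two settings are a common requirement when the driver
    # runs outside the cluster's network (hypothetical values).
    .config('spark.driver.host', 'LOCAL_ROUTABLE_IP')
    .config('spark.driver.bindAddress', '0.0.0.0')
    .getOrCreate()
)
```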
Why does this error occur?
EDIT: The spark-master-related entries in the output of `docker network inspect -v workbench`:
```
},
"0bce286b736b1368738aa7504e1219dd12d855f7d79fc8f17d6a04b98ebe0ec1": {
"Name": "spark_spark-master.1.2tdsfzhbyl1wr8omjh518lhwg",
"EndpointID": "ae7e503e911039c3f02469a968536035b88c0213bcf09a1def49eaf7853b9085",
"MacAddress": "02:42:0a:00:01:07",
"IPv4Address": "10.0.1.7/24",
"IPv6Address": ""
},
...
"spark_spark-master": {
"VIP": "10.0.1.70",
"Ports": [],
"LocalLBIndex": 257,
"Tasks": [
{
"Name": "spark_spark-master.1.2tdsfzhbyl1wr8omjh518lhwg",
"EndpointID": "ae7e503e911039c3f02469a968536035b88c0213bcf09a1def49eaf7853b9085",
"EndpointIP": "10.0.1.7",
"Info": {
"Host IP": "ip"
}
}
]
},
```
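The inspect output shows the service VIP (10.0.1.70) and the task IP (10.0.1.7) living on the overlay network; whether either name is visible from the local machine can be checked directly. A minimal sketch covering both spellings used above ('spark-spark-master' in the Python code, 'spark_spark-master' in SPARK_MASTER_URL):

```python
import socket

# Swarm service names normally resolve only inside the overlay
# network; this shows what the local machine actually sees.
for name in ('spark-spark-master', 'spark_spark-master'):
    try:
        print(name, '->', socket.gethostbyname(name))
    except socket.gaierror as exc:
        print(name, 'does not resolve locally:', exc)
```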