Error: "Tried using port 54321 for Flow proxy, but port was already occupied"

Posted by w51jfk4q on 2021-05-26 in Spark

I have a Spark application that depends on the H2O Sparkling Water jar:

  // H2O Sparkling Water version (Spark 2.4)
  val h2oVersion = "3.32.0.1-2-2.4"
  lazy val h2oLibs = Seq(
    "org.apache.spark" %% "spark-repl" % sparkVersion,
    "ai.h2o" % "sparkling-water-package_2.11" % h2oVersion // exclude("ai.h2o", "sparkling-water-api-generation")
  )

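When I try to launch it on k8s, the Spark executor starts correctly (see the log from the executor container below), and I can open a port-forward connection to it and access the Flow UI.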

  12-03 17:04:55.129 10.1.0.22:54321 33 ent-loop-0 INFO water.default: ----- H2O started -----
  12-03 17:04:55.129 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Build git branch: rel-zermelo
  12-03 17:04:55.129 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Build git hash: 8a289cc0c15718842afc54dca7c8ed104aec9bd4
  12-03 17:04:55.129 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Build git describe: jenkins-master-5215-21-g8a289cc
  12-03 17:04:55.130 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Build project version: 3.32.0.1
  12-03 17:04:55.130 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Build age: 1 month and 24 days
  12-03 17:04:55.130 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Built by: 'jenkins'
  12-03 17:04:55.130 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Built on: '2020-10-08 18:16:09'
  12-03 17:04:55.130 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Found H2O Core extensions: [HiveTableImporter, StackTraceCollector, HiveFrameSaver, XGBoost]
  12-03 17:04:55.131 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Processed H2O arguments: [-internal_security_conf_rel_paths, -name, sparkling-water-4444_spark-application-1607015075714, -port_offset, 1, -nthreads, 1, -hdfs_config, /opt/c3/./hdfs_conf2654621877693077305.xml, -log_level, INFO, -embedded, -baseport, 54321, -log_dir, h2ologs, -ip, 10.1.0.22, -flatfile, /var/data/spark-40ac5b73-d4ec-40ff-9785-1827934ec0ee/spark-b85b822f-e7b3-4ba4-aeb8-b7ddfaa99589/sparkling-water-019f1019-ebca-4ba0-bcf2-92078b2dcbec/flatfile.txt]
  12-03 17:04:55.131 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Java availableProcessors: 1
  12-03 17:04:55.131 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Java heap totalMemory: 2.00 GB
  12-03 17:04:55.131 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Java heap maxMemory: 2.00 GB
  12-03 17:04:55.131 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Java version: Java 1.8.0_265 (from Oracle Corporation)
  12-03 17:04:55.132 10.1.0.22:54321 33 ent-loop-0 INFO water.default: JVM launch parameters: [-XX:+UseG1GC, -XX:+PrintFlagsFinal, -XX:+UseContainerSupport, -XX:+UnlockExperimentalVMOptions, -XX:+UseCGroupMemoryLimitForHeap, -Xms2G, -Xmx2G]
  12-03 17:04:55.132 10.1.0.22:54321 33 ent-loop-0 INFO water.default: JVM process id: 33@vad-1607015075951-exec-1
  12-03 17:04:55.132 10.1.0.22:54321 33 ent-loop-0 INFO water.default: OS version: Linux 4.19.76-linuxkit (amd64)
  12-03 17:04:55.132 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Machine physical memory: 7.78 GB
  12-03 17:04:55.132 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Machine locale: en
  12-03 17:04:55.132 10.1.0.22:54321 33 ent-loop-0 INFO water.default: X-h2o-cluster-id: 1607015093048
  12-03 17:04:55.132 10.1.0.22:54321 33 ent-loop-0 INFO water.default: User name: '4444'
  12-03 17:04:55.132 10.1.0.22:54321 33 ent-loop-0 INFO water.default: IPv6 stack selected: false
  12-03 17:04:55.133 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Possible IP Address: eth0 (eth0), 10.1.0.22
  12-03 17:04:55.133 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Possible IP Address: lo (lo), 127.0.0.1
  12-03 17:04:55.133 10.1.0.22:54321 33 ent-loop-0 INFO water.default: H2O node running in unencrypted mode.
  12-03 17:04:55.137 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Internal communication uses port: 54322
  12-03 17:04:55.138 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Listening for HTTP and REST traffic on http://10.1.0.22:54321/
  12-03 17:04:55.146 10.1.0.22:54321 33 ent-loop-0 WARN water.default: Flatfile configuration does not include self: /10.1.0.22:54321, but contains []
  12-03 17:04:55.147 10.1.0.22:54321 33 ent-loop-0 INFO water.default: H2O cloud name: 'sparkling-water-4444_spark-application-1607015075714' on /10.1.0.22:54321, discovery address /235.92.149.120:60252
  12-03 17:04:55.147 10.1.0.22:54321 33 ent-loop-0 INFO water.default: If you have trouble connecting, try SSH tunneling from your local machine (e.g., via port 55555):
  12-03 17:04:55.147 10.1.0.22:54321 33 ent-loop-0 INFO water.default: 1. Open a terminal and run 'ssh -L 55555:localhost:54321 4444@10.1.0.22'
  12-03 17:04:55.147 10.1.0.22:54321 33 ent-loop-0 INFO water.default: 2. Point your browser to http://localhost:55555
  12-03 17:04:57.038 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Log dir: 'h2ologs'
  12-03 17:04:57.039 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Cur dir: '/opt/c3'
  12-03 17:04:57.051 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Distributed HTTP import not available (import from HTTP/HTTPS will be eager)
  12-03 17:04:57.055 10.1.0.22:54321 33 ent-loop-0 INFO water.default: HDFS subsystem successfully initialized
  12-03 17:04:57.061 10.1.0.22:54321 33 ent-loop-0 INFO water.default: S3 subsystem successfully initialized
  12-03 17:04:57.115 10.1.0.22:54321 33 ent-loop-0 INFO water.default: GCS subsystem successfully initialized
  12-03 17:04:57.115 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Flow dir: '/opt/spark/h2oflows'
  12-03 17:04:57.131 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Cloud of size 1 formed [vad-1607015075951-exec-1/10.1.0.22:54321]
  12-03 17:04:57.214 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Registered parsers: [GUESS, ARFF, XLS, SVMLight, AVRO, PARQUET, ORC, CSV]
  12-03 17:04:57.215 10.1.0.22:54321 33 ent-loop-0 INFO water.default: HiveTableImporter extension initialized
  12-03 17:04:57.215 10.1.0.22:54321 33 ent-loop-0 INFO water.default: StackTraceCollector extension initialized
  12-03 17:04:57.216 10.1.0.22:54321 33 ent-loop-0 INFO water.default: HiveFrameSaver extension initialized
  12-03 17:04:57.216 10.1.0.22:54321 33 ent-loop-0 INFO water.default: XGBoost extension initialized
  12-03 17:04:57.216 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Registered 4 core extensions in: 1447ms
  12-03 17:04:57.217 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Registered H2O core extensions: [HiveTableImporter, StackTraceCollector, HiveFrameSaver, XGBoost]
  12-03 17:04:58.139 10.1.0.22:54321 33 ent-loop-0 INFO hex.tree.xgboost.XGBoostExtension: Found XGBoost backend with library: xgboost4j_minimal
  12-03 17:04:58.140 10.1.0.22:54321 33 ent-loop-0 WARN hex.tree.xgboost.XGBoostExtension: Your system supports only minimal version of XGBoost (no GPUs, no multithreading)!
  12-03 17:04:58.438 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Registered: 225 REST APIs in: 1221ms
  12-03 17:04:58.439 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Registered REST API extensions: [XGBoost, Algos, Amazon S3, Sparkling Water REST API Extensions, AutoML, Core V3, TargetEncoder, Core V4]
  12-03 17:04:58.751 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Registered: 301 schemas in 311ms
  12-03 17:04:58.751 10.1.0.22:54321 33 ent-loop-0 INFO water.default: H2O started in 5736ms
  12-03 17:04:58.751 10.1.0.22:54321 33 ent-loop-0 INFO water.default:
  12-03 17:04:58.813 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Open H2O Flow in your web browser: http://10.1.0.22:54321
  12-03 17:04:58.814 10.1.0.22:54321 33 ent-loop-0 INFO water.default:
  12-03 17:04:58.834 10.1.0.22:54321 33 ent-loop-1 INFO water.default: Adding vad-1607015075951-exec-1/10.1.0.22:54321 to vad-1607015075951-exec-1/10.1.0.22:54321's flatfile
  12-03 17:04:58.843 10.1.0.22:54321 33 ent-loop-1 INFO water.default: Full flatfile: vad-1607015075951-exec-1/10.1.0.22:54321
  12-03 17:04:58.849 10.1.0.22:54321 33 ent-loop-0 INFO water.default: Locking cloud to new members, because Locking the cloud from Sparkling Water as we have reached the expected cluster size.
  12-03 17:04:59.114 10.1.0.22:54321 33 8925448-40 INFO water.default: POST /3/CloudLock, parms: {reason=Locked from Sparkling Water.}
  12-03 17:04:59.434 10.1.0.22:54321 33 8925448-36 INFO water.default: GET /3/verifyWebOpen, parms: {}
  12-03 17:04:59.535 10.1.0.22:54321 33 8925448-39 INFO water.default: GET /3/verifyVersion, parms: {referenced_version=3.32.0.1}
  12-03 17:04:59.645 10.1.0.22:54321 33 8925448-42 INFO water.default: GET /3/LogLevel, parms: {}
  12-03 17:04:59.719 10.1.0.22:54321 33 8925448-38 INFO water.default: POST /99/Rapids, parms: {ast=(setTimeZone "UTC")}

But the driver node gets stuck in an infinite loop, and I see the following in its log:

  2020-12-03T17:04:59.532+0000 level=INFO thread=main logger=ai.h2o.sparkling.backend.utils.RestApiUtils
  H2O node http://10.1.0.22:54321/3/verifyWebOpen successfully responded for the GET.
  2020-12-03T17:04:59.553+0000 level=INFO thread=main logger=ai.h2o.sparkling.backend.utils.RestApiUtils
  H2O node http://10.1.0.22:54321/3/verifyVersion?referenced_version=3.32.0.1 successfully responded for the GET.
  2020-12-03T17:04:59.636+0000 level=INFO thread=main logger=ai.h2o.sparkling.backend.utils.RestApiUtils
  H2O node http://10.1.0.22:54321/3/Cloud successfully responded for the GET.
  2020-12-03T17:04:59.713+0000 level=INFO thread=main logger=ai.h2o.sparkling.backend.utils.RestApiUtils
  H2O node http://10.1.0.22:54321/3/LogLevel successfully responded for the GET.
  2020-12-03T17:04:59.951+0000 level=INFO thread=main logger=ai.h2o.sparkling.backend.utils.RestApiUtils
  H2O node http://10.1.0.22:54321/99/Rapids successfully responded for the POST.
  2020-12-03T17:05:00.011+0000 level=WARN thread=main logger=ai.h2o.sparkling.backend.utils.ProxyStarter
  Tried using port 54321 for Flow proxy, but port was already occupied!
  2020-12-03T17:05:00.011+0000 level=WARN thread=main logger=ai.h2o.sparkling.backend.utils.ProxyStarter
  Tried using port 54322 for Flow proxy, but port was already occupied!
  2020-12-03T17:05:00.011+0000 level=WARN thread=main logger=ai.h2o.sparkling.backend.utils.ProxyStarter
  Tried using port 54323 for Flow proxy, but port was already occupied!
  2020-12-03T17:05:00.012+0000 level=WARN thread=main logger=ai.h2o.sparkling.backend.utils.ProxyStarter
  Tried using port 54324 for Flow proxy, but port was already occupied!
  ...

It keeps trying the next port indefinitely! These are the ports exposed by the driver container (including 54321 and 54322):

  Containers:
    exmac-container:
      Container ID: docker://5889a4bba9576e4c4b6a10d4f5e570bf15ed9afdc7b4202c8423a1243d09177b
      Image: myimage
      Image ID: docker-pullable://amyimage@sha256:xyz
      Ports: 10011/TCP, 10013/TCP, 9091/TCP, 54321/TCP, 54322/TCP
      Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/TCP, 0/TCP
      Args:
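
To check whether these ports are really taken inside the driver container, one option is to try binding them directly and look at the exception. The following is only a diagnostic sketch: it is not how Sparkling Water's `ProxyStarter` works internally, and `PortProbe`, `tryBind`, and the default host `0.0.0.0` are hypothetical names and values chosen for illustration. Printing the exception for each port may help tell a genuinely occupied port apart from some other bind failure, such as an address the container cannot bind to.

```scala
import java.net.{InetSocketAddress, ServerSocket}
import scala.util.{Failure, Success, Try}

// Hypothetical diagnostic helper; not part of Sparkling Water.
object PortProbe {

  /** Tries to bind a server socket to host:port and closes it again right away. */
  def tryBind(host: String, port: Int): Try[Unit] = Try {
    val socket = new ServerSocket() // created unbound
    try socket.bind(new InetSocketAddress(host, port))
    finally socket.close()
  }

  def main(args: Array[String]): Unit = {
    // Assumption: pass the address the driver is expected to bind to as the
    // first argument; otherwise the wildcard address is probed.
    val host = args.headOption.getOrElse("0.0.0.0")

    // The same ports that appear in the WARN lines of the driver log.
    for (port <- 54321 to 54324) {
      tryBind(host, port) match {
        case Success(_) =>
          println(s"$host:$port can be bound")
        case Failure(e) =>
          println(s"$host:$port failed: ${e.getClass.getSimpleName}: ${e.getMessage}")
      }
    }
  }
}
```

Running this inside the driver pod (for example via `kubectl exec`) with the driver's bind address as the argument shows, for each port, whether the bind succeeds and, if not, which exception is raised.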

What should I do in this case?
