java.io.EOFException when running spark-submit on a cluster with YARN as the master

Posted by 0kjbasz6 on 2021-06-01 in Hadoop

I am trying to run a jar file with the following spark-submit command:

    spark-submit --master yarn --deploy-mode cluster --executor-memory 3g --class my.package.Main my-jar-file.jar

Main is the main class of the jar. Here is its content (everything is written in Scala):

    import java.net.InetSocketAddress
    import com.sun.net.httpserver.HttpServer

    object Main {
      def main(args: Array[String]): Unit = {
        // Bind the HTTP server to host "master" on port 8000
        val server = HttpServer.create(new InetSocketAddress("master", 8000), 0)
        val backend = new MainProcess()
        val handlerRoot = new RootHandler()
        handlerRoot.initProcess(backend)
        server.createContext("/", handlerRoot)
        server.setExecutor(null) // null selects the default executor
        server.start()
        println("Server is started at " + server.getAddress().getHostString() + ":" + server.getAddress().getPort())
      }
    }
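
One possible explanation, offered as an assumption rather than a confirmed diagnosis: `server.start()` returns immediately, so `main` returns right after the `println`. With `--deploy-mode cluster` the driver runs inside the YARN ApplicationMaster, and as far as I understand Spark reports the final status and exits once the user `main` returns. That would match a successful final status followed by the server disappearing. The sketch below keeps `main` blocked until the JVM shuts down. `KeepAliveSketch`, `startServer`, and the `alive` response body are hypothetical names, and port 0 (any free port) stands in for `master:8000`.

```scala
import java.net.InetSocketAddress
import java.util.concurrent.CountDownLatch
import com.sun.net.httpserver.{HttpExchange, HttpHandler, HttpServer}

object KeepAliveSketch {
  // Released only when the JVM is asked to shut down.
  private val shutdownLatch = new CountDownLatch(1)

  // Hypothetical helper: starts a tiny HTTP server that answers "alive" on "/".
  def startServer(port: Int): HttpServer = {
    val server = HttpServer.create(new InetSocketAddress(port), 0)
    server.createContext("/", new HttpHandler {
      override def handle(exchange: HttpExchange): Unit = {
        val body = "alive".getBytes("UTF-8")
        exchange.sendResponseHeaders(200, body.length.toLong)
        val out = exchange.getResponseBody
        try out.write(body) finally out.close()
      }
    })
    server.setExecutor(null) // default executor, as in the original Main
    server.start()           // returns immediately; it does not block
    server
  }

  def main(args: Array[String]): Unit = {
    val server = startServer(0) // assumption: any free port, for illustration
    sys.addShutdownHook {
      server.stop(0)
      shutdownLatch.countDown()
    }
    // Block here so that main does not return while the server is running.
    shutdownLatch.await()
  }
}
```

Whether blocking `main` is the right fix for this application is untested here; the sketch only shows the mechanism.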

The MainProcess class is where I use Spark and the Spark GraphX library to process files read from HDFS. Here is how I configure the SparkContext in MainProcess:

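    import org.apache.spark.{SparkConf, SparkContext}

    class MainProcess {
      // No master is set here; it comes from spark-submit (--master yarn)
      val config = new SparkConf()
      config.setAppName("Final GraphX App - Main")
      val sc = new SparkContext(config)
      ...
    }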

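The application appears to run fine and the final status is reported as successful, but the application simply shuts down instead of running continuously, as a server should. I can open the link master:8000 only once; when I try to refresh the page, it goes back to being unable to connect. Here is the log from running the application: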

    18/04/06 15:45:59 ERROR yarn.YarnAllocator: Failed to launch executor 2 on container container_1522920902032_0027_01_000003
    org.apache.spark.SparkException: Exception while starting container container_1522920902032_0027_01_000003 on host slave2
        at org.apache.spark.deploy.yarn.ExecutorRunnable.startContainer(ExecutorRunnable.scala:125)
        at org.apache.spark.deploy.yarn.ExecutorRunnable.run(ExecutorRunnable.scala:65)
        at org.apache.spark.deploy.yarn.YarnAllocator$$anonfun$runAllocatedContainers$1$$anon$1.run(YarnAllocator.scala:523)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
    Caused by: java.io.IOException: Failed on local exception: java.io.IOException: java.io.EOFException; Host Details : local host is: "master/10.100.69.207"; destination host is: "slave2":57914;
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:776)
        at org.apache.hadoop.ipc.Client.call(Client.java:1479)
        at org.apache.hadoop.ipc.Client.call(Client.java:1412)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229)
        at com.sun.proxy.$Proxy19.startContainers(Unknown Source)
        at org.apache.hadoop.yarn.api.impl.pb.client.ContainerManagementProtocolPBClientImpl.startContainers(ContainerManagementProtocolPBClientImpl.java:96)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:191)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
        at com.sun.proxy.$Proxy20.startContainers(Unknown Source)
        at org.apache.hadoop.yarn.client.api.impl.NMClientImpl.startContainer(NMClientImpl.java:201)
        at org.apache.spark.deploy.yarn.ExecutorRunnable.startContainer(ExecutorRunnable.scala:122)
        ... 5 more
    Caused by: java.io.IOException: java.io.EOFException
        at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:687)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
        at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:650)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:737)
        at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
        at org.apache.hadoop.ipc.Client.getConnection(Client.java:1528)
        at org.apache.hadoop.ipc.Client.call(Client.java:1451)
        ... 18 more
    Caused by: java.io.EOFException
        at java.io.DataInputStream.readInt(DataInputStream.java:392)
        at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:367)
        at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:560)
        at org.apache.hadoop.ipc.Client$Connection.access$1900(Client.java:375)
        at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:729)
        at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:725)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1698)
        at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:725)
        ... 21 more

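The application is basically a web application built on the Java HTTP server (com.sun.net.httpserver.HttpServer) that uses Spark to process big data. The handler class accepts incoming requests and starts a new thread to run the Spark job in the background. The user can then send another request to check whether the Spark job has finished, so that the finished result can be shown on the web page. The problem is that whenever Spark reports that a job is done (although in this case the job failed), the server gets "killed". I am using Spark 2.2.0 built for Hadoop 2.7, together with Hadoop 2.7.1. All data files are in HDFS.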
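
To make the request flow concrete, here is a minimal sketch of the "start a background job, then poll for its status" pattern described above. It is not the actual RootHandler or MainProcess code: `JobTracker`, `JobStatus`, `submit`, and `status` are hypothetical names, and a plain `() => String` function stands in for the Spark job.

```scala
import java.util.concurrent.Executors
import java.util.concurrent.atomic.AtomicLong
import scala.collection.concurrent.TrieMap
import scala.util.control.NonFatal

// Hypothetical status model for a background job.
sealed trait JobStatus
case object Running extends JobStatus
final case class Finished(result: String) extends JobStatus
final case class Failed(message: String) extends JobStatus

class JobTracker {
  private val pool = Executors.newCachedThreadPool()
  private val statuses = TrieMap.empty[Long, JobStatus]
  private val nextId = new AtomicLong(0)

  // Called by the handler for a "start" request; returns an id to poll with.
  def submit(job: () => String): Long = {
    val id = nextId.incrementAndGet()
    statuses.put(id, Running)
    pool.execute(new Runnable {
      override def run(): Unit = {
        // A failing job is recorded as Failed instead of escaping the thread.
        val outcome: JobStatus =
          try Finished(job())
          catch { case NonFatal(e) => Failed(String.valueOf(e.getMessage)) }
        statuses.put(id, outcome)
      }
    })
    id
  }

  // Called by the handler for a "check" request; None means unknown id.
  def status(id: Long): Option[JobStatus] = statuses.get(id)

  def shutdown(): Unit = pool.shutdown()
}
```

A handler could call `submit` when a processing request arrives, return the id to the browser, and call `status(id)` on each follow-up request. Recording failures as `Failed` keeps a failed job from surfacing as an uncaught exception on the worker thread.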
