socket.timeout: timed out while executing Python with PySpark

7cjasjjr · asked 2021-05-17 · in Spark

I get the exception below while running the job. We do some computation and then write the DataFrame to a Parquet file. The save step fails with a socket timeout; I already tried tuning spark.executor.heartbeatInterval for the run, but the problem is still not resolved.
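For context, here is a minimal sketch of the shape of the job, including the heartbeat setting that was tried. The paths, column names, and the computation itself are placeholders for illustration, not the actual code:

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

# Illustrative values only; Spark requires spark.executor.heartbeatInterval
# to be smaller than spark.network.timeout.
spark = (
    SparkSession.builder
    .appName("parquet-write-timeout")
    .config("spark.executor.heartbeatInterval", "60s")
    .config("spark.network.timeout", "600s")
    .getOrCreate()
)

# Placeholder computation; the real job runs Python (Arrow) UDFs, which is
# why PythonArrowOutput appears in the stack trace below.
df = spark.read.parquet("input_path")                      # hypothetical input path
result = df.withColumn("total", F.col("a") + F.col("b"))   # hypothetical columns

# The socket.timeout is raised while this write drives the Python workers.
result.write.mode("overwrite").parquet("output_path")      # hypothetical output path

And this is the error produced during the Parquet write: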
20/11/17 14:50:28 ERROR Utils: Aborting task
org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "c:\nmr_pyspark\archimedes\pyspark.package\spark-3.0.0-bin-hadoop2.7\python\lib\pyspark.zip\pyspark\worker.py", line 585, in main
  File "c:\nmr_pyspark\archimedes\pyspark.package\spark-3.0.0-bin-hadoop2.7\python\lib\pyspark.zip\pyspark\serializers.py", line 593, in read_int
    length = stream.read(4)
  File "d:\obj\windows-release\37amd64_Release\msi_python\zip\amd64\socket.py", line 589, in readinto
socket.timeout: timed out

at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.handlePythonException(PythonRunner.scala:503)
    at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.read(PythonArrowOutput.scala:99)
    at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.read(PythonArrowOutput.scala:49)
    at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:456)
    at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:489)
    at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
    at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$executeTask$1(FileFormatWriter.scala:272)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1411)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:281)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$write$15(FileFormatWriter.scala:205)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
    at org.apache.spark.scheduler.Task.run(Task.scala:127)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:444)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:447)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

20/11/17 14:50:28 ERROR FileFormatWriter: Job job_2020117145009_0212 aborted.
20/11/17 14:50:28 ERROR Executor: Exception in task 0.0 in stage 212.0 (TID 452)
org.apache.spark.SparkException: Task failed while writing rows.
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:291)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$write$15(FileFormatWriter.scala:205)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
    at org.apache.spark.scheduler.Task.run(Task.scala:127)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:444)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1377)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:447)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.spark.api.python.PythonException: Traceback (most recent call last):
  File "c:\nmr_pyspark\archimedes\pyspark.package\spark-3.0.0-bin-hadoop2.7\python\lib\pyspark.zip\pyspark\worker.py", line 585, in main
  File "c:\nmr_pyspark\archimedes\pyspark.package\spark-3.0.0-bin-hadoop2.7\python\lib\pyspark.zip\pyspark\serializers.py", line 593, in read_int
    length = stream.read(4)
  File "d:\obj\windows-release\37amd64_Release\msi_python\zip\amd64\socket.py", line 589, in readinto
socket.timeout: timed out

at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.handlePythonException(PythonRunner.scala:503)
    at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.read(PythonArrowOutput.scala:99)
    at org.apache.spark.sql.execution.python.PythonArrowOutput$$anon$1.read(PythonArrowOutput.scala:49)
    at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:456)
    at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:489)
    at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
    at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$executeTask$1(FileFormatWriter.scala:272)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1411)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.executeTask(FileFormatWriter.scala:281)
    ... 9 more

No answers yet.
