How to overcome a "No space left on device" error in Spark?

k5ifujac · Posted 2022-12-13 in Apache

When I run a notebook with Apache Spark on Databricks, the job fails with the error: Caused by: Job aborted due to stage failure: No space left on device

  • 2-5 workers, 16-40 GB memory, 4-10 cores
  • 1 driver, 8 GB memory, 2 cores
  • 118 GB storage on each node

I am performing a huge join, followed by a map and a reduceByKey:

calculateDataFrame(sampleDataA).rdd // Finishes in 11 mins
  .join(calculateDataFrame(sampleDataB).rdd) // Finishes in 12 mins
  .map(_._2 -> 1) //Out of Disk space error after 5 hrs
  .reduceByKey(_ + _)

The datasets are such that the join of sampleDataA and sampleDataB should produce very few records (there are very, very few common keys). What is causing the out-of-disk-space error? Is increasing the disk space on each node the only solution?
The full stack trace is attached below:

at org.apache.spark.sql.errors.QueryExecutionErrors$.jobAbortedError(QueryExecutionErrors.scala:882)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$write$1(FileFormatWriter.scala:334)
    at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:80)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:154)
    at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:207)
    at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:126)
    at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:124)
    at org.apache.spark.sql.execution.command.DataWritingCommandExec.executeCollect(commands.scala:138)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.$anonfun$applyOrElse$2(QueryExecution.scala:241)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$8(SQLExecution.scala:243)
    at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:392)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$1(SQLExecution.scala:188)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:985)
    at org.apache.spark.sql.execution.SQLExecution$.withCustomExecutionEnv(SQLExecution.scala:142)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:342)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.$anonfun$applyOrElse$1(QueryExecution.scala:241)
    at org.apache.spark.sql.execution.QueryExecution.org$apache$spark$sql$execution$QueryExecution$$withMVTagsIfNecessary(QueryExecution.scala:226)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.applyOrElse(QueryExecution.scala:239)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.applyOrElse(QueryExecution.scala:232)
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:512)
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:99)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:512)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:31)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:268)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:264)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:31)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:31)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:488)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$eagerlyExecuteCommands$1(QueryExecution.scala:232)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:324)
    at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:232)
    at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:186)
    at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:177)
    at org.apache.spark.sql.execution.QueryExecution.assertCommandExecuted(QueryExecution.scala:268)
    at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:965)
    at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:430)
    at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:397)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:251)
    at org.apache.spark.sql.DataFrameWriter.csv(DataFrameWriter.scala:956)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-2618157713453039:13)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-2618157713453039:95)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-2618157713453039:97)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-2618157713453039:99)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$read$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-2618157713453039:101)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$read$$iw$$iw$$iw$$iw$$iw.<init>(command-2618157713453039:103)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$read$$iw$$iw$$iw$$iw.<init>(command-2618157713453039:105)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$read$$iw$$iw$$iw.<init>(command-2618157713453039:107)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$read$$iw$$iw.<init>(command-2618157713453039:109)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$read$$iw.<init>(command-2618157713453039:111)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$read.<init>(command-2618157713453039:113)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$read$.<init>(command-2618157713453039:117)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$read$.<clinit>(command-2618157713453039)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$eval$.$print$lzycompute(<notebook>:7)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$eval$.$print(<notebook>:6)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$eval.$print(<notebook>)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:747)
    at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1020)
    at scala.tools.nsc.interpreter.IMain.$anonfun$interpret$1(IMain.scala:568)
    at scala.reflect.internal.util.ScalaClassLoader.asContext(ScalaClassLoader.scala:36)
    at scala.reflect.internal.util.ScalaClassLoader.asContext$(ScalaClassLoader.scala:116)
    at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:41)
    at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:567)
    at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:594)
    at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:564)
    at com.databricks.backend.daemon.driver.DriverILoop.execute(DriverILoop.scala:223)
    at com.databricks.backend.daemon.driver.ScalaDriverLocal.$anonfun$repl$1(ScalaDriverLocal.scala:225)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at com.databricks.backend.daemon.driver.DriverLocal$TrapExitInternal$.trapExit(DriverLocal.scala:1113)
    at com.databricks.backend.daemon.driver.DriverLocal$TrapExit$.apply(DriverLocal.scala:1066)
    at com.databricks.backend.daemon.driver.ScalaDriverLocal.repl(ScalaDriverLocal.scala:225)
    at com.databricks.backend.daemon.driver.DriverLocal.$anonfun$execute$23(DriverLocal.scala:725)
    at com.databricks.unity.EmptyHandle$.runWith(UCSHandle.scala:103)
    at com.databricks.backend.daemon.driver.DriverLocal.$anonfun$execute$20(DriverLocal.scala:708)
    at com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:398)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
    at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:147)
    at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:396)
    at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:393)
    at com.databricks.backend.daemon.driver.DriverLocal.withAttributionContext(DriverLocal.scala:62)
    at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:441)
    at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:426)
    at com.databricks.backend.daemon.driver.DriverLocal.withAttributionTags(DriverLocal.scala:62)
    at com.databricks.backend.daemon.driver.DriverLocal.execute(DriverLocal.scala:685)
    at com.databricks.backend.daemon.driver.DriverWrapper.$anonfun$tryExecutingCommand$1(DriverWrapper.scala:622)
    at scala.util.Try$.apply(Try.scala:213)
    at com.databricks.backend.daemon.driver.DriverWrapper.tryExecutingCommand(DriverWrapper.scala:614)
    at com.databricks.backend.daemon.driver.DriverWrapper.executeCommandAndGetError(DriverWrapper.scala:533)
    at com.databricks.backend.daemon.driver.DriverWrapper.executeCommand(DriverWrapper.scala:568)
    at com.databricks.backend.daemon.driver.DriverWrapper.runInnerLoop(DriverWrapper.scala:438)
    at com.databricks.backend.daemon.driver.DriverWrapper.runInner(DriverWrapper.scala:381)
    at com.databricks.backend.daemon.driver.DriverWrapper.run(DriverWrapper.scala:232)
    at java.lang.Thread.run(Thread.java:750)
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 10 in stage 4.0 failed 4 times, most recent failure: Lost task 10.3 in stage 4.0 (TID 428) (10.234.241.133 executor 1): java.io.IOException: No space left on device
    at java.io.FileOutputStream.writeBytes(Native Method)
    at java.io.FileOutputStream.write(FileOutputStream.java:326)
    at org.apache.spark.storage.TimeTrackingOutputStream.write(TimeTrackingOutputStream.java:59)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
    at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
    at net.jpountz.lz4.LZ4BlockOutputStream.flushBufferedData(LZ4BlockOutputStream.java:225)
    at net.jpountz.lz4.LZ4BlockOutputStream.write(LZ4BlockOutputStream.java:178)
    at java.io.ObjectOutputStream$BlockDataOutputStream.drain(ObjectOutputStream.java:1877)
    at java.io.ObjectOutputStream$BlockDataOutputStream.setBlockDataMode(ObjectOutputStream.java:1786)
    at java.io.ObjectOutputStream.writeNonProxyDesc(ObjectOutputStream.java:1286)
    at java.io.ObjectOutputStream.writeClassDesc(ObjectOutputStream.java:1231)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1427)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
    at java.io.ObjectOutputStream.writeFatalException(ObjectOutputStream.java:1577)
    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:351)
    at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:46)
    at org.apache.spark.serializer.SerializationStream.writeKey(Serializer.scala:160)
    at org.apache.spark.storage.DiskBlockObjectWriter.write(DiskBlockObjectWriter.scala:309)
    at org.apache.spark.util.collection.WritablePartitionedIterator.writeNext(WritablePartitionedPairCollection.scala:83)
    at org.apache.spark.util.collection.ExternalSorter.org$apache$spark$util$collection$ExternalSorter$$spillMemoryIteratorToDisk(ExternalSorter.scala:320)
    at org.apache.spark.util.collection.ExternalSorter.spill(ExternalSorter.scala:261)
    at org.apache.spark.util.collection.ExternalSorter.spill(ExternalSorter.scala:94)
    at org.apache.spark.util.collection.Spillable.maybeSpill(Spillable.scala:98)
    at org.apache.spark.util.collection.ExternalSorter.maybeSpillCollection(ExternalSorter.scala:238)
    at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:209)
    at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:74)
    at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
    at org.apache.spark.scheduler.ShuffleMapTask.$anonfun$runTask$3(ShuffleMapTask.scala:81)
    at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
    at org.apache.spark.scheduler.ShuffleMapTask.$anonfun$runTask$1(ShuffleMapTask.scala:81)
    at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.doRunTask(Task.scala:169)
    at org.apache.spark.scheduler.Task.$anonfun$run$4(Task.scala:137)
    at com.databricks.unity.EmptyHandle$.runWithAndClose(UCSHandle.scala:104)
    at org.apache.spark.scheduler.Task.$anonfun$run$1(Task.scala:137)
    at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
    at org.apache.spark.scheduler.Task.run(Task.scala:96)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$13(Executor.scala:902)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1696)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:905)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:760)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)

Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:3312)
    at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:3244)
    at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:3235)
    at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
    at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:3235)
    at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1(DAGScheduler.scala:1424)
    at org.apache.spark.scheduler.DAGScheduler.$anonfun$handleTaskSetFailed$1$adapted(DAGScheduler.scala:1424)
    at scala.Option.foreach(Option.scala:407)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:1424)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:3524)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:3462)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:3450)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:51)
    at org.apache.spark.scheduler.DAGScheduler.$anonfun$runJob$1(DAGScheduler.scala:1169)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:80)
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:1157)
    at org.apache.spark.SparkContext.runJobInternal(SparkContext.scala:2713)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2696)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.$anonfun$write$1(FileFormatWriter.scala:299)
    at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:80)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:154)
    at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:207)
    at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:126)
    at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:124)
    at org.apache.spark.sql.execution.command.DataWritingCommandExec.executeCollect(commands.scala:138)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.$anonfun$applyOrElse$2(QueryExecution.scala:241)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$8(SQLExecution.scala:243)
    at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:392)
    at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$1(SQLExecution.scala:188)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:985)
    at org.apache.spark.sql.execution.SQLExecution$.withCustomExecutionEnv(SQLExecution.scala:142)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:342)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.$anonfun$applyOrElse$1(QueryExecution.scala:241)
    at org.apache.spark.sql.execution.QueryExecution.org$apache$spark$sql$execution$QueryExecution$$withMVTagsIfNecessary(QueryExecution.scala:226)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.applyOrElse(QueryExecution.scala:239)
    at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.applyOrElse(QueryExecution.scala:232)
    at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:512)
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:99)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:512)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:31)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:268)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:264)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:31)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:31)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:488)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$eagerlyExecuteCommands$1(QueryExecution.scala:232)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:324)
    at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:232)
    at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:186)
    at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:177)
    at org.apache.spark.sql.execution.QueryExecution.assertCommandExecuted(QueryExecution.scala:268)
    at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:965)
    at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:430)
    at org.apache.spark.sql.DataFrameWriter.saveInternal(DataFrameWriter.scala:397)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:251)
    at org.apache.spark.sql.DataFrameWriter.csv(DataFrameWriter.scala:956)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-2618157713453039:13)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-2618157713453039:95)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-2618157713453039:97)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$read$$iw$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-2618157713453039:99)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$read$$iw$$iw$$iw$$iw$$iw$$iw.<init>(command-2618157713453039:101)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$read$$iw$$iw$$iw$$iw$$iw.<init>(command-2618157713453039:103)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$read$$iw$$iw$$iw$$iw.<init>(command-2618157713453039:105)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$read$$iw$$iw$$iw.<init>(command-2618157713453039:107)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$read$$iw$$iw.<init>(command-2618157713453039:109)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$read$$iw.<init>(command-2618157713453039:111)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$read.<init>(command-2618157713453039:113)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$read$.<init>(command-2618157713453039:117)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$read$.<clinit>(command-2618157713453039)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$eval$.$print$lzycompute(<notebook>:7)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$eval$.$print(<notebook>:6)
    at $linedd029d55a4834e3bbdf8cda5a124ff2553.$eval.$print(<notebook>)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:747)
    at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1020)
    at scala.tools.nsc.interpreter.IMain.$anonfun$interpret$1(IMain.scala:568)
    at scala.reflect.internal.util.ScalaClassLoader.asContext(ScalaClassLoader.scala:36)
    at scala.reflect.internal.util.ScalaClassLoader.asContext$(ScalaClassLoader.scala:116)
    at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:41)
    at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:567)
    at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:594)
    at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:564)
    at com.databricks.backend.daemon.driver.DriverILoop.execute(DriverILoop.scala:223)
    at com.databricks.backend.daemon.driver.ScalaDriverLocal.$anonfun$repl$1(ScalaDriverLocal.scala:225)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at com.databricks.backend.daemon.driver.DriverLocal$TrapExitInternal$.trapExit(DriverLocal.scala:1113)
    at com.databricks.backend.daemon.driver.DriverLocal$TrapExit$.apply(DriverLocal.scala:1066)
    at com.databricks.backend.daemon.driver.ScalaDriverLocal.repl(ScalaDriverLocal.scala:225)
    at com.databricks.backend.daemon.driver.DriverLocal.$anonfun$execute$23(DriverLocal.scala:725)
    at com.databricks.unity.EmptyHandle$.runWith(UCSHandle.scala:103)
    at com.databricks.backend.daemon.driver.DriverLocal.$anonfun$execute$20(DriverLocal.scala:708)
    at com.databricks.logging.UsageLogging.$anonfun$withAttributionContext$1(UsageLogging.scala:398)
    at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
    at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:147)
    at com.databricks.logging.UsageLogging.withAttributionContext(UsageLogging.scala:396)
    at com.databricks.logging.UsageLogging.withAttributionContext$(UsageLogging.scala:393)
    at com.databricks.backend.daemon.driver.DriverLocal.withAttributionContext(DriverLocal.scala:62)
    at com.databricks.logging.UsageLogging.withAttributionTags(UsageLogging.scala:441)
    at com.databricks.logging.UsageLogging.withAttributionTags$(UsageLogging.scala:426)
    at com.databricks.backend.daemon.driver.DriverLocal.withAttributionTags(DriverLocal.scala:62)
    at com.databricks.backend.daemon.driver.DriverLocal.execute(DriverLocal.scala:685)
    at com.databricks.backend.daemon.driver.DriverWrapper.$anonfun$tryExecutingCommand$1(DriverWrapper.scala:622)
    at scala.util.Try$.apply(Try.scala:213)
    at com.databricks.backend.daemon.driver.DriverWrapper.tryExecutingCommand(DriverWrapper.scala:614)
    at com.databricks.backend.daemon.driver.DriverWrapper.executeCommandAndGetError(DriverWrapper.scala:533)
    at com.databricks.backend.daemon.driver.DriverWrapper.executeCommand(DriverWrapper.scala:568)
    at com.databricks.backend.daemon.driver.DriverWrapper.runInnerLoop(DriverWrapper.scala:438)
    at com.databricks.backend.daemon.driver.DriverWrapper.runInner(DriverWrapper.scala:381)
    at com.databricks.backend.daemon.driver.DriverWrapper.run(DriverWrapper.scala:232)
    at java.lang.Thread.run(Thread.java:750)
Caused by: java.io.IOException: No space left on device
    at java.io.FileOutputStream.writeBytes(Native Method)
    at java.io.FileOutputStream.write(FileOutputStream.java:326)
    at org.apache.spark.storage.TimeTrackingOutputStream.write(TimeTrackingOutputStream.java:59)
    at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
    at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
    at net.jpountz.lz4.LZ4BlockOutputStream.flushBufferedData(LZ4BlockOutputStream.java:225)
    at net.jpountz.lz4.LZ4BlockOutputStream.write(LZ4BlockOutputStream.java:178)
    at java.io.ObjectOutputStream$BlockDataOutputStream.drain(ObjectOutputStream.java:1877)
    at java.io.ObjectOutputStream$BlockDataOutputStream.setBlockDataMode(ObjectOutputStream.java:1786)
    at java.io.ObjectOutputStream.writeNonProxyDesc(ObjectOutputStream.java:1286)
    at java.io.ObjectOutputStream.writeClassDesc(ObjectOutputStream.java:1231)
    at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1427)
    at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1178)
    at java.io.ObjectOutputStream.writeFatalException(ObjectOutputStream.java:1577)
    at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:351)
    at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:46)
    at org.apache.spark.serializer.SerializationStream.writeKey(Serializer.scala:160)
    at org.apache.spark.storage.DiskBlockObjectWriter.write(DiskBlockObjectWriter.scala:309)
    at org.apache.spark.util.collection.WritablePartitionedIterator.writeNext(WritablePartitionedPairCollection.scala:83)
    at org.apache.spark.util.collection.ExternalSorter.org$apache$spark$util$collection$ExternalSorter$$spillMemoryIteratorToDisk(ExternalSorter.scala:320)
    at org.apache.spark.util.collection.ExternalSorter.spill(ExternalSorter.scala:261)
    at org.apache.spark.util.collection.ExternalSorter.spill(ExternalSorter.scala:94)
    at org.apache.spark.util.collection.Spillable.maybeSpill(Spillable.scala:98)
    at org.apache.spark.util.collection.ExternalSorter.maybeSpillCollection(ExternalSorter.scala:238)
    at org.apache.spark.util.collection.ExternalSorter.insertAll(ExternalSorter.scala:209)
    at org.apache.spark.shuffle.sort.SortShuffleWriter.write(SortShuffleWriter.scala:74)
    at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
    at org.apache.spark.scheduler.ShuffleMapTask.$anonfun$runTask$3(ShuffleMapTask.scala:81)
    at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
    at org.apache.spark.scheduler.ShuffleMapTask.$anonfun$runTask$1(ShuffleMapTask.scala:81)
    at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
    at org.apache.spark.scheduler.Task.doRunTask(Task.scala:169)
    at org.apache.spark.scheduler.Task.$anonfun$run$4(Task.scala:137)
    at com.databricks.unity.EmptyHandle$.runWithAndClose(UCSHandle.scala:104)
    at org.apache.spark.scheduler.Task.$anonfun$run$1(Task.scala:137)
    at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
    at org.apache.spark.scheduler.Task.run(Task.scala:96)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$13(Executor.scala:902)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1696)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:905)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:760)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)

f5emj3cl1#

Depending on the size of the join, this is probably caused by disk spill. Adding more code plus the full error stack trace would help pinpoint what is going wrong.
However, if your join is so large that all the shuffle partitions end up completely filling your disk space, you can indeed end up with these errors.
Try monitoring:

  • The Spark UI (usually served on port 4040 of the driver node): check the amount of disk spill reported there; you can find it on the stage details page.
  • The $SPARK_LOCAL_DIR directory on every executor while the job runs: you will likely see its size keep growing until the failure. (A programmatic alternative is sketched right after this list.)
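If you prefer to watch the spill from inside the job, here is a minimal sketch (not part of the original answer) that registers a SparkListener and prints per-task spill metrics; it assumes the usual `spark` session variable available in Databricks notebooks, and the log format is arbitrary:

```scala
import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

// Sketch only: print spill metrics for every finished task so you can see
// whether shuffle spill is what fills the local disks.
spark.sparkContext.addSparkListener(new SparkListener {
  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    val m = taskEnd.taskMetrics
    if (m != null && (m.diskBytesSpilled > 0 || m.memoryBytesSpilled > 0)) {
      println(
        s"stage ${taskEnd.stageId}: spilled " +
          s"${m.diskBytesSpilled / (1024 * 1024)} MB to disk, " +
          s"${m.memoryBytesSpilled / (1024 * 1024)} MB from memory")
    }
  }
})
```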

More importantly, though, I see that you are doing the computation with RDDs. Depending on your Spark version, you should try using DataFrames/Datasets instead. They go through the Catalyst optimizer, which makes a big difference to the efficiency of the generated runtime code.
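As a rough illustration (assuming placeholder column names "key", "valueA" and "valueB", since the actual schema returned by calculateDataFrame is not shown in the question), the same pipeline expressed with DataFrames could look like this:

```scala
// Sketch only: keep the DataFrames instead of dropping to .rdd, so Catalyst
// can plan both the join and the aggregation.
val dfA = calculateDataFrame(sampleDataA)
val dfB = calculateDataFrame(sampleDataB)

val counts = dfA.join(dfB, Seq("key"))   // join on the common key
  .groupBy("valueA", "valueB")           // replaces map(_._2 -> 1)
  .count()                               // replaces reduceByKey(_ + _)
```

A side benefit of staying in the DataFrame API is that Spark can pick a broadcast join on its own if one side turns out to be small enough.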
Hope this helps! :)
EDIT: regarding the extra information you added (the error stack trace):
In the stack trace you can see that the org.apache.spark.util.collection.ExternalSorter.spill method is being triggered, so you are definitely spilling to disk, and that is what is making the tasks fail.
Reducing disk spill is a classic, well-documented problem. I suggest searching for it and adapting the advice to your specific case, but here is a Stack Overflow post that discusses an example.
My first suggestion is still to move to DataFrames/Datasets so you benefit from the Catalyst optimizer. Once that is done, try the usual approaches to reducing disk spill; a sketch of the kind of knobs people typically tune is shown below.
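For completeness, a hedged sketch of such knobs (the numbers are purely illustrative, none of them come from the original post, and they need to be sized against your data volumes and the 118 GB of local storage per node):

```scala
// Runtime setting: more, smaller shuffle partitions keep each task's spill
// files smaller. 800 is an arbitrary example value.
spark.conf.set("spark.sql.shuffle.partitions", "800")

// On the RDD path the equivalent is an explicit partition count on the join:
// calculateDataFrame(sampleDataA).rdd
//   .join(calculateDataFrame(sampleDataB).rdd, 800)

// Cluster-level settings such as spark.shuffle.compress and
// spark.shuffle.spill.compress (both default to true) must be set when the
// cluster starts, e.g. in the Databricks cluster's Spark config, not here.
```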
