I am getting the above exception when I try to apply a method (computeDwt) on an RDD[(Int, ArrayBuffer[(Int, Double)])] input. I am even serializing the objects in Spark using the extends Serialization option. Here is the code snippet.
input: series: RDD[(Int, ArrayBuffer[(Int, Double)])]
DWTsample extends Serialization is a class that has a computeDwt function.
sc: SparkContext
val kk: RDD[(Int, List[Double])] = series.map(t => (t._1, new DWTsample().computeDwt(sc, t._2)))
Error:
org.apache.spark.SparkException: Job failed: java.io.NotSerializableException: org.apache.spark.SparkContext
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:760)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:758)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:60)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:758)
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitMissingTasks(DAGScheduler.scala:556)
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:503)
at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:361)
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$run(DAGScheduler.scala:441)
at org.apache.spark.scheduler.DAGScheduler$$anon$1.run(DAGScheduler.scala:149)
Could anyone tell me what the problem might be and what should be done to overcome it?
1 Answer
Your series.map(...) line references the SparkContext (sc), but the SparkContext is not serializable. The SparkContext is designed to expose operations that run on the driver; code that runs on the workers cannot reference or use it. You will have to restructure your code so that sc is not referenced inside your map function's closure.
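A minimal sketch of one such restructuring, assuming computeDwt does not actually need the SparkContext for its per-record work (the sc-free signature, the placeholder body, and the transform helper below are hypothetical, not the asker's actual implementation):

import scala.collection.mutable.ArrayBuffer
import org.apache.spark.rdd.RDD

// Hypothetical sketch: DWTsample drops the SparkContext parameter, so the
// map closure only captures serializable objects.
class DWTsample extends Serializable {
  // Placeholder standing in for the real wavelet transform; it works
  // purely on the local data of one record, with no SparkContext.
  def computeDwt(data: ArrayBuffer[(Int, Double)]): List[Double] =
    data.map(_._2).toList
}

def transform(series: RDD[(Int, ArrayBuffer[(Int, Double)])]): RDD[(Int, List[Double])] =
  series.map(t => (t._1, new DWTsample().computeDwt(t._2))) // no sc in the closure

If computeDwt genuinely needs some cluster-wide data, ship that data to the workers with a broadcast variable rather than passing the SparkContext itself.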