Spark Structured Streaming checkpointing on Azure Blob Storage: "Error in TaskCompletionListener" is null

pokxtpni · posted 2021-07-13 in Spark

I am using the checkpointing feature of Spark Structured Streaming and storing the checkpoint metadata in Azure Blob Storage.
However, I get the error below; from the log it appears that a temporary state delta file is deleted and then accessed again.
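For reference, a minimal sketch of how such a query is typically wired up follows (this is not the original code: the source, sink path, container, account, and key are illustrative placeholders). The checkpoint location points at an Azure Blob container via the wasb:// scheme, the same path style that appears in the log, and the windowed aggregation is what drives the state store that fails.

  import org.apache.spark.sql.Dataset;
  import org.apache.spark.sql.Row;
  import org.apache.spark.sql.SparkSession;
  import org.apache.spark.sql.functions;
  import org.apache.spark.sql.streaming.StreamingQuery;
  import org.apache.spark.sql.streaming.Trigger;

  public class WasbCheckpointSketch {
      public static void main(String[] args) throws Exception {
          SparkSession spark = SparkSession.builder()
                  .appName("wasb-checkpoint-sketch")
                  // Credentials picked up by NativeAzureFileSystem (hadoop-azure);
                  // account and key are placeholders, not values from the post.
                  .config("spark.hadoop.fs.azure.account.key.<account>.blob.core.windows.net",
                          "<storage-account-key>")
                  .getOrCreate();

          // Placeholder source; the original job reads JSON events from Azure Event Hubs.
          Dataset<Row> stream = spark.readStream().format("rate").load();

          // Windowed aggregation with a watermark, which is what exercises the
          // HDFSBackedStateStoreProvider seen in the stack trace below.
          Dataset<Row> counts = stream
                  .withWatermark("timestamp", "1 minute")
                  .groupBy(functions.window(functions.col("timestamp"), "1 minute"))
                  .count();

          // Checkpoint metadata and state delta files are written to Azure Blob via wasb://.
          StreamingQuery query = counts.writeStream()
                  .format("parquet")
                  .option("path", "wasb://<container>@<account>.blob.core.windows.net/output")
                  .option("checkpointLocation",
                          "wasb://<container>@<account>.blob.core.windows.net/streamCheckpoint")
                  .trigger(Trigger.ProcessingTime("1 minute"))
                  .start();

          query.awaitTermination();
      }
  }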
The detailed log and the logical plan are shown below.
Log

  1. 2021-02-10 20:28:55.633 DEBUG --- [Executor task launch worker for task 15] org.apache.hadoop.fs.azure.NativeAzureFileSystem : Delete Successful for : wasb://mosaic-k8s-hdi-container@mosaick8shdistorage.blob.core.windows.net/k8s_performance/streamCheckpoint/c0c92f4f-2708-412c-82c1-053b32ba63c9pot5/MYSQL1/state/0/2/.1.delta.3554a10d-9d96-4ddb-84f1-a49334757dc9.TID15.tmp
  2. 2021-02-10 20:28:55.634 ERROR --- [Executor task launch worker for task 15] org.apache.spark.internal.Logging$class : Error in TaskCompletionListener
  3. java.lang.NullPointerException
  4. at org.apache.hadoop.fs.azure.NativeAzureFileSystem$NativeAzureFsOutputStream.write(NativeAzureFileSystem.java:1140)
  5. at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58)
  6. at java.io.DataOutputStream.write(DataOutputStream.java:107)
  7. at org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:58)
  8. at java.io.DataOutputStream.write(DataOutputStream.java:107)
  9. at net.jpountz.lz4.LZ4BlockOutputStream.finish(LZ4BlockOutputStream.java:258)
  10. at net.jpountz.lz4.LZ4BlockOutputStream.close(LZ4BlockOutputStream.java:190)
  11. at java.io.FilterOutputStream.close(FilterOutputStream.java:159)
  12. at org.apache.commons.io.IOUtils.closeQuietly(IOUtils.java:363)
  13. at org.apache.commons.io.IOUtils.closeQuietly(IOUtils.java:317)
  14. at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider.org$apache$spark$sql$execution$streaming$state$HDFSBackedStateStoreProvider$$cancelDeltaFile(HDFSBackedStateStoreProvider.scala:508)
  15. at org.apache.spark.sql.execution.streaming.state.HDFSBackedStateStoreProvider$HDFSBackedStateStore.abort(HDFSBackedStateStoreProvider.scala:150)
  16. at org.apache.spark.sql.execution.streaming.state.package$StateStoreOps$$anonfun$1$$anonfun$apply$1.apply(package.scala:65)
  17. at org.apache.spark.sql.execution.streaming.state.package$StateStoreOps$$anonfun$1$$anonfun$apply$1.apply(package.scala:64)
  18. at org.apache.spark.TaskContext$$anon$1.onTaskCompletion(TaskContext.scala:131)
  19. at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:117)
  20. at org.apache.spark.TaskContextImpl$$anonfun$markTaskCompleted$1.apply(TaskContextImpl.scala:117)
  21. at org.apache.spark.TaskContextImpl$$anonfun$invokeListeners$1.apply(TaskContextImpl.scala:130)
  22. at org.apache.spark.TaskContextImpl$$anonfun$invokeListeners$1.apply(TaskContextImpl.scala:128)
  23. at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
  24. at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
  25. at org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:128)
  26. at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:116)
  27. at org.apache.spark.scheduler.Task.run(Task.scala:139)
  28. at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
  29. at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
  30. at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
  31. at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  32. at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  33. at java.lang.Thread.run(Thread.java:745)
  34. 2021-02-10 20:28:55.639 ERROR --- [Executor task launch worker for task 15] org.apache.spark.internal.Logging$class : Exception in task 2.0 in stage 3.0 (TID 15)
  35. org.apache.spark.util.TaskCompletionListenerException: null
  36. at org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:138)
  37. at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:116)
  38. at org.apache.spark.scheduler.Task.run(Task.scala:139)
  39. at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
  40. at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
  41. at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
  42. at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  43. at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  44. at java.lang.Thread.run(Thread.java:745)
  45. 2021-02-10 20:28:55.658 DEBUG --- [dispatcher-event-loop-0] org.apache.spark.internal.Logging$class : parentName: , name: TaskSet_3.0, runningTasks: 3
  46. 2021-02-10 20:28:55.659 DEBUG --- [dispatcher-event-loop-0] org.apache.spark.internal.Logging$class : No tasks for locality level NO_PREF, so moving to locality level ANY
  47. 2021-02-10 20:28:55.661 WARN --- [task-result-getter-1] org.apache.spark.internal.Logging$class : Lost task 2.0 in stage 3.0 (TID 15, localhost, executor driver): org.apache.spark.util.TaskCompletionListenerException: null
  48. at org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:138)
  49. at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:116)
  50. at org.apache.spark.scheduler.Task.run(Task.scala:139)
  51. at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
  52. at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
  53. at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
  54. at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  55. at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  56. at java.lang.Thread.run(Thread.java:745)
  57. 2021-02-10 20:28:55.663 ERROR --- [task-result-getter-1] org.apache.spark.internal.Logging$class : Task 2 in stage 3.0 failed 1 times; aborting job
  58. 2021-02-10 20:28:55.672 INFO --- [dag-scheduler-event-loop] org.apache.spark.internal.Logging$class : Cancelling stage 3
  59. 2021-02-10 20:28:55.673 INFO --- [dag-scheduler-event-loop] org.apache.spark.internal.Logging$class : Killing all running tasks in stage 3: Stage cancelled
  60. 2021-02-10 20:28:55.678 INFO --- [dispatcher-event-loop-2] org.apache.spark.internal.Logging$class : Executor is trying to kill task 3.0 in stage 3.0 (TID 16), reason: Stage cancelled
  61. 2021-02-10 20:28:55.679 INFO --- [dispatcher-event-loop-2] org.apache.spark.internal.Logging$class : Executor is trying to kill task 0.0 in stage 3.0 (TID 13), reason: Stage cancelled
  62. 2021-02-10 20:28:55.679 INFO --- [dispatcher-event-loop-2] org.apache.spark.internal.Logging$class : Executor is trying to kill task 1.0 in stage 3.0 (TID 14), reason: Stage cancelled
  63. 2021-02-10 20:28:55.680 INFO --- [dag-scheduler-event-loop] org.apache.spark.internal.Logging$class : Stage 3 was cancelled
  64. 2021-02-10 20:28:55.681 INFO --- [dag-scheduler-event-loop] org.apache.spark.internal.Logging$class : ResultStage 3 (start at taskCompletion.java:414) failed in 47.809 s due to Job aborted due to stage failure: Task 2 in stage 3.0 failed 1 times, most recent failure: Lost task 2.0 in stage 3.0 (TID 15, localhost, executor driver): org.apache.spark.util.TaskCompletionListenerException: null
  65. at org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:138)
  66. at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:116)
  67. at org.apache.spark.scheduler.Task.run(Task.scala:139)
  68. at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
  69. at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
  70. at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
  71. at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  72. at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  73. at java.lang.Thread.run(Thread.java:745)
  74. Driver stacktrace:
  75. 2021-02-10 20:28:55.687 DEBUG --- [dag-scheduler-event-loop] org.apache.spark.internal.Logging$class : After removal of stage 2, remaining stages = 3
  76. 2021-02-10 20:28:55.687 DEBUG --- [dag-scheduler-event-loop] org.apache.spark.internal.Logging$class : After removal of stage 1, remaining stages = 2
  77. 2021-02-10 20:28:55.687 DEBUG --- [dag-scheduler-event-loop] org.apache.spark.internal.Logging$class : After removal of stage 3, remaining stages = 1
  78. 2021-02-10 20:28:55.688 DEBUG --- [dag-scheduler-event-loop] org.apache.spark.internal.Logging$class : After removal of stage 0, remaining stages = 0
  79. 2021-02-10 20:28:55.689 INFO --- [stream execution thread for [id = 3e8b8ac2-f836-44e1-b71f-9e0c030ccc79, runId = 2222948c-53e5-411a-ad22-7083bc10ed5f]] org.apache.spark.internal.Logging$class : Job 0 failed: start at taskCompletion.java:414, took 244.059233 s
  80. 2021-02-10 20:28:55.691 ERROR --- [stream execution thread for [id = 3e8b8ac2-f836-44e1-b71f-9e0c030ccc79, runId = 2222948c-53e5-411a-ad22-7083bc10ed5f]] org.apache.spark.internal.Logging$class : Aborting job 0b63e841-559f-4106-a3f8-f680fc7fdfcc.
  81. org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 3.0 failed 1 times, most recent failure: Lost task 2.0 in stage 3.0 (TID 15, localhost, executor driver): org.apache.spark.util.TaskCompletionListenerException: null
  82. at org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:138)
  83. at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:116)
  84. at org.apache.spark.scheduler.Task.run(Task.scala:139)
  85. at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
  86. at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
  87. at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
  88. at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  89. at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  90. at java.lang.Thread.run(Thread.java:745)
  91. Driver stacktrace:
  92. at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1889)
  93. at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1877)
  94. at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1876)
  95. at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
  96. at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
  97. at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1876)
  98. at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:926)
  99. at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:926)
  100. at scala.Option.foreach(Option.scala:257)
  101. at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:926)
  102. at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:2110)
  103. at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2059)
  104. at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:2048)
  105. at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:49)
  106. at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:737)
  107. at org.apache.spark.SparkContext.runJob(SparkContext.scala:2061)
  108. at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:167)
  109. at org.apache.spark.sql.execution.streaming.FileStreamSink.addBatch(FileStreamSink.scala:131)
  110. at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$runBatch$5$$anonfun$apply$17.apply(MicroBatchExecution.scala:537)
  111. at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
  112. at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
  113. at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
  114. at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$org$apache$spark$sql$execution$streaming$MicroBatchExecution$$runBatch$5.apply(MicroBatchExecution.scala:535)
  115. at org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:351)
  116. at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58)
  117. at org.apache.spark.sql.execution.streaming.MicroBatchExecution.org$apache$spark$sql$execution$streaming$MicroBatchExecution$$runBatch(MicroBatchExecution.scala:534)
  118. at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply$mcV$sp(MicroBatchExecution.scala:198)
  119. at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply(MicroBatchExecution.scala:166)
  120. at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1$$anonfun$apply$mcZ$sp$1.apply(MicroBatchExecution.scala:166)
  121. at org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:351)
  122. at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58)
  123. at org.apache.spark.sql.execution.streaming.MicroBatchExecution$$anonfun$runActivatedStream$1.apply$mcZ$sp(MicroBatchExecution.scala:166)
  124. at org.apache.spark.sql.execution.streaming.ProcessingTimeExecutor.execute(TriggerExecutor.scala:56)
  125. at org.apache.spark.sql.execution.streaming.MicroBatchExecution.runActivatedStream(MicroBatchExecution.scala:160)
  126. at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:281)
  127. at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:193)
  128. Caused by: org.apache.spark.util.TaskCompletionListenerException: null
  129. at org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:138)
  130. at org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:116)
  131. at org.apache.spark.scheduler.Task.run(Task.scala:139)
  132. at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
  133. at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
  134. at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
  135. at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  136. at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  137. at java.lang.Thread.run(Thread.java:745)

Logical plan

  1. === Streaming Query ===
  2. Identifier: [id = 3e8b8ac2-f836-44e1-b71f-9e0c030ccc79, runId = 2222948c-53e5-411a-ad22-7083bc10ed5f]
  3. Current Committed Offsets: {}
  4. Current Available Offsets: {org.apache.spark.sql.eventhubs.EventHubsSource@5a86ea54: {"test":{"0":49743985}},org.apache.spark.sql.eventhubs.EventHubsSource@108ea531: {"sample":{"2":155436179,"5":155434270,"4":155415112,"7":155434738,"1":155439493,"3":155417225,"6":155429651,"0":155439349}}}
  5. Current State: ACTIVE
  6. Thread State: RUNNABLE
  7. Logical Plan:
  8. Project [GrpCol#1515, COL_1_COUNT#2554L, window#2136-T60000ms.start AS start#2561, window#2136-T60000ms.end AS end#2562]
  9. +- SubqueryAlias `groupby1`
  10. +- Project [GrpCol#1515, window#2136-T60000ms, count(COL_1)#2549L AS COL_1_COUNT#2554L]
  11. +- Aggregate [GrpCol#1515, window#2550-T60000ms], [GrpCol#1515, window#2550-T60000ms AS window#2136-T60000ms, count(COL_1#1310L) AS count(COL_1)#2549L]
  12. +- Filter isnotnull(TIMESTAMP_COL_3#1514-T60000ms)
  13. +- Project [named_struct(start, precisetimestampconversion(((((CASE WHEN (cast(CEIL((cast((precisetimestampconversion(TIMESTAMP_COL_3#1514-T60000ms, TimestampType, LongType) - 0) as double) / cast(60000000 as double))) as double) = (cast((precisetimestampconversion(TIMESTAMP_COL_3#1514-T60000ms, TimestampType, LongType) - 0) as double) / cast(60000000 as double))) THEN (CEIL((cast((precisetimestampconversion(TIMESTAMP_COL_3#1514-T60000ms, TimestampType, LongType) - 0) as double) / cast(60000000 as double))) + cast(1 as bigint)) ELSE CEIL((cast((precisetimestampconversion(TIMESTAMP_COL_3#1514-T60000ms, TimestampType, LongType) - 0) as double) / cast(60000000 as double))) END + cast(0 as bigint)) - cast(1 as bigint)) * 60000000) + 0), LongType, TimestampType), end, precisetimestampconversion((((((CASE WHEN (cast(CEIL((cast((precisetimestampconversion(TIMESTAMP_COL_3#1514-T60000ms, TimestampType, LongType) - 0) as double) / cast(60000000 as double))) as double) = (cast((precisetimestampconversion(TIMESTAMP_COL_3#1514-T60000ms, TimestampType, LongType) - 0) as double) / cast(60000000 as double))) THEN (CEIL((cast((precisetimestampconversion(TIMESTAMP_COL_3#1514-T60000ms, TimestampType, LongType) - 0) as double) / cast(60000000 as double))) + cast(1 as bigint)) ELSE CEIL((cast((precisetimestampconversion(TIMESTAMP_COL_3#1514-T60000ms, TimestampType, LongType) - 0) as double) / cast(60000000 as double))) END + cast(0 as bigint)) - cast(1 as bigint)) * 60000000) + 0) + 60000000), LongType, TimestampType)) AS window#2550-T60000ms, COL_1#1310L, COL_10#1311, COL_100#1312, COL_101#1313L, COL_102#1314L, COL_103#1315L, COL_104#1316L, COL_105#1317, COL_106#1318L, COL_107#1319L, COL_108#1320L, COL_109#1321, COL_11#1322L, COL_110#1323, COL_111#1324, COL_112#1325, COL_113#1326L, COL_114#1327, COL_115#1328L, COL_116#1329L, COL_117#1330L, COL_118#1331, COL_119#1332, ... 389 more fields]
  14. +- EventTimeWatermark TIMESTAMP_COL_3#1514: timestamp, interval 1 minutes
  15. +- Project [COL_1#48L AS COL_1#1310L, COL_10#49 AS COL_10#1311, COL_100#50 AS COL_100#1312, COL_101#51L AS COL_101#1313L, COL_102#52L AS COL_102#1314L, COL_103#53L AS COL_103#1315L, COL_104#54L AS COL_104#1316L, COL_105#55 AS COL_105#1317, COL_106#56L AS COL_106#1318L, COL_107#57L AS COL_107#1319L, COL_108#58L AS COL_108#1320L, COL_109#59 AS COL_109#1321, COL_11#60L AS COL_11#1322L, COL_110#61 AS COL_110#1323, COL_111#62 AS COL_111#1324, COL_112#63 AS COL_112#1325, COL_113#64L AS COL_113#1326L, COL_114#65 AS COL_114#1327, COL_115#66L AS COL_115#1328L, COL_116#67L AS COL_116#1329L, COL_117#68L AS COL_117#1330L, COL_118#69 AS COL_118#1331, COL_119#70 AS COL_119#1332, COL_12#71L AS COL_12#1333L, ... 388 more fields]
  16. +- Join Inner, (COL_1#48L = COL_1#703L)
  17. :- SubqueryAlias `watermark1`
  18. : +- EventTimeWatermark TIMESTAMP_COL_3#447: timestamp, interval 1 minutes
  19. : +- Project [COL_1#48L, COL_10#49, COL_100#50, COL_101#51L, COL_102#52L, COL_103#53L, COL_104#54L, COL_105#55, COL_106#56L, COL_107#57L, COL_108#58L, COL_109#59, COL_11#60L, COL_110#61, COL_111#62, COL_112#63, COL_113#64L, COL_114#65, COL_115#66L, COL_116#67L, COL_117#68L, COL_118#69, COL_119#70, COL_12#71L, ... 182 more fields]
  20. : +- SubqueryAlias `eventhub1`
  21. : +- Project [LIST#37.COL_1 AS COL_1#48L, LIST#37.COL_10 AS COL_10#49, LIST#37.COL_100 AS COL_100#50, LIST#37.COL_101 AS COL_101#51L, LIST#37.COL_102 AS COL_102#52L, LIST#37.COL_103 AS COL_103#53L, LIST#37.COL_104 AS COL_104#54L, LIST#37.COL_105 AS COL_105#55, LIST#37.COL_106 AS COL_106#56L, LIST#37.COL_107 AS COL_107#57L, LIST#37.COL_108 AS COL_108#58L, LIST#37.COL_109 AS COL_109#59, LIST#37.COL_11 AS COL_11#60L, LIST#37.COL_110 AS COL_110#61, LIST#37.COL_111 AS COL_111#62, LIST#37.COL_112 AS COL_112#63, LIST#37.COL_113 AS COL_113#64L, LIST#37.COL_114 AS COL_114#65, LIST#37.COL_115 AS COL_115#66L, LIST#37.COL_116 AS COL_116#67L, LIST#37.COL_117 AS COL_117#68L, LIST#37.COL_118 AS COL_118#69, LIST#37.COL_119 AS COL_119#70, LIST#37.COL_12 AS COL_12#71L, ... 180 more fields]
  22. : +- Project [jsontostructs(StructField(COL_1,LongType,true), StructField(COL_10,BooleanType,true), StructField(COL_100,BooleanType,true), StructField(COL_101,LongType,true), StructField(COL_102,LongType,true), StructField(COL_103,LongType,true), StructField(COL_104,LongType,true), StructField(COL_105,StringType,true), StructField(COL_106,LongType,true), StructField(COL_107,LongType,true), StructField(COL_108,LongType,true), StructField(COL_109,StringType,true), StructField(COL_11,LongType,true), StructField(COL_110,DoubleType,true), StructField(COL_111,DoubleType,true), StructField(COL_112,StringType,true), StructField(COL_113,LongType,true), StructField(COL_114,StringType,true), StructField(COL_115,LongType,true), StructField(COL_116,LongType,true), StructField(COL_117,LongType,true), StructField(COL_118,StringType,true), StructField(COL_119,DoubleType,true), StructField(COL_12,LongType,true), ... 173 more fields) AS LIST#37, body#27, partition#10, offset#11, sequenceNumber#12L, enqueuedTime#13, publisher#14, partitionKey#15, properties#16, systemProperties#17]
  23. : +- Project [cast(body#9 as string) AS body#27, partition#10, offset#11, sequenceNumber#12L, enqueuedTime#13, publisher#14, partitionKey#15, properties#16, systemProperties#17]
  24. : +- StreamingExecutionRelation org.apache.spark.sql.eventhubs.EventHubsSource@108ea531, [body#9, partition#10, offset#11, sequenceNumber#12L, enqueuedTime#13, publisher#14, partitionKey#15, properties#16, systemProperties#17]
  25. +- SubqueryAlias `watermark2`
  26. +- EventTimeWatermark TIMESTAMP_COL_3#1102: timestamp, interval 1 minutes
  27. +- Project [COL_1#703L, COL_10#704, COL_100#705, COL_101#706L, COL_102#707L, COL_103#708L, COL_104#709L, COL_105#710, COL_106#711L, COL_107#712L, COL_108#713L, COL_109#714, COL_11#715L, COL_110#716, COL_111#717, COL_112#718, COL_113#719L, COL_114#720, COL_115#721L, COL_116#722L, COL_117#723L, COL_118#724, COL_119#725, COL_12#726L, ... 182 more fields]
  28. +- SubqueryAlias `eventhub2`
  29. +- Project [LIST#692.COL_1 AS COL_1#703L, LIST#692.COL_10 AS COL_10#704, LIST#692.COL_100 AS COL_100#705, LIST#692.COL_101 AS COL_101#706L, LIST#692.COL_102 AS COL_102#707L, LIST#692.COL_103 AS COL_103#708L, LIST#692.COL_104 AS COL_104#709L, LIST#692.COL_105 AS COL_105#710, LIST#692.COL_106 AS COL_106#711L, LIST#692.COL_107 AS COL_107#712L, LIST#692.COL_108 AS COL_108#713L, LIST#692.COL_109 AS COL_109#714, LIST#692.COL_11 AS COL_11#715L, LIST#692.COL_110 AS COL_110#716, LIST#692.COL_111 AS COL_111#717, LIST#692.COL_112 AS COL_112#718, LIST#692.COL_113 AS COL_113#719L, LIST#692.COL_114 AS COL_114#720, LIST#692.COL_115 AS COL_115#721L, LIST#692.COL_116 AS COL_116#722L, LIST#692.COL_117 AS COL_117#723L, LIST#692.COL_118 AS COL_118#724, LIST#692.COL_119 AS COL_119#725, LIST#692.COL_12 AS COL_12#726L, ... 180 more fields]
  30. +- Project [jsontostructs(StructField(COL_1,LongType,true), StructField(COL_10,BooleanType,true), StructField(COL_100,BooleanType,true), StructField(COL_101,LongType,true), StructField(COL_102,LongType,true), StructField(COL_103,LongType,true), StructField(COL_104,LongType,true), StructField(COL_105,StringType,true), StructField(COL_106,LongType,true), StructField(COL_107,LongType,true), StructField(COL_108,LongType,true), StructField(COL_109,StringType,true), StructField(COL_11,LongType,true), StructField(COL_110,DoubleType,true), StructField(COL_111,DoubleType,true), StructField(COL_112,StringType,true), StructField(COL_113,LongType,true), StructField(COL_114,StringType,true), StructField(COL_115,LongType,true), StructField(COL_116,LongType,true), StructField(COL_117,LongType,true), StructField(COL_118,StringType,true), StructField(COL_119,DoubleType,true), StructField(COL_12,LongType,true), ... 173 more fields) AS LIST#692, body#682, partition#665, offset#666, sequenceNumber#667L, enqueuedTime#668, publisher#669, partitionKey#670, properties#671, systemProperties#672]
  31. +- Project [cast(body#664 as string) AS body#682, partition#665, offset#666, sequenceNumber#667L, enqueuedTime#668, publisher#669, partitionKey#670, properties#671, systemProperties#672]
  32. +- StreamingExecutionRelation org.apache.spark.sql.eventhubs.EventHubsSource@5a86ea54, [body#664, partition#665, offset#666, sequenceNumber#667L, enqueuedTime#668, publisher#669, partitionKey#670, properties#671, systemProperties#672]
  33. Caused by: org.apache.spark.sql.streaming.StreamingQueryException: Job aborted.

Jar versions: azure-storage 8.4.0, hadoop-azure 2.9.2.
