I am trying to build a jar from a Spark/Scala project in IntelliJ and run that jar with a Jar activity in Azure Data Factory. The project in IntelliJ is an sbt project, and the main class, named SimpleApp, has the following code:
import org.apache.spark.sql.SparkSession

object SimpleApp {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder.appName("Simple Application").getOrCreate()

    // Read the raw Parquet file from the ADLS Gen2 mount
    val df = spark.read
      .parquet("/mnt/Raw/ServiceCenter=CSLP/Entity=abc/Year=2019/Month=10/abc-20191031-01.parquet")

    // Write it back to the same mount in Delta format
    df.write
      .format("delta")
      .mode("overwrite")
      .save("/mnt/Raw/Entity=works")

    spark.stop()
  }
}
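For reference, the build.sbt is essentially the following (the Scala and Spark versions shown are placeholders and have to match the Databricks runtime on the job cluster):

name := "SimpleApp"
version := "0.1"
// Placeholder versions; they must line up with the cluster's Databricks runtime
scalaVersion := "2.11.12"
// Spark is provided by the cluster, so it is not bundled into the jar
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.3" % "provided"
// No separate Delta dependency is added; the delta format ships with the Databricks runtime

The jar is built with sbt package.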
The main function reads data (in Parquet format) from Azure Data Lake Storage Gen2 and writes it back to ADLS Gen2 in Delta format. The data is written correctly in Delta format, but the activity itself fails in Azure Data Factory, and I cannot find any meaningful error message in the driver logs of the Azure Databricks cluster. A portion of the stderr driver log is pasted below:
org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:2136)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
at scala.util.Try$.apply(Try.scala:192)
at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:178)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=512m; support was removed in 8.0
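For completeness: the /mnt/Raw paths in the code are DBFS mount points backed by the ADLS Gen2 account, and I do not think the mount itself is the problem. It was created once from a notebook, roughly like this (application id, tenant, storage account, container, and secret scope are placeholders):

// Run once from a Databricks notebook; dbutils is only available there
val configs = Map(
  "fs.azure.account.auth.type" -> "OAuth",
  "fs.azure.account.oauth.provider.type" -> "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
  "fs.azure.account.oauth2.client.id" -> "<application-id>",
  "fs.azure.account.oauth2.client.secret" -> dbutils.secrets.get(scope = "<scope>", key = "<key>"),
  "fs.azure.account.oauth2.client.endpoint" -> "https://login.microsoftonline.com/<tenant-id>/oauth2/token"
)

// Mount the ADLS Gen2 container under /mnt/Raw
dbutils.fs.mount(
  source = "abfss://<container>@<storage-account>.dfs.core.windows.net/",
  mountPoint = "/mnt/Raw",
  extraConfigs = configs
)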
I have hit a roadblock and cannot seem to understand what is going wrong here.
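One thing I am unsure about: I have seen guidance that JAR jobs on Databricks should reuse the shared SparkContext and avoid calling stop() on it, so I wonder whether the explicit spark.stop() at the end is what makes the activity report a failure even though the data lands correctly. A variant without the explicit stop would look like this:

import org.apache.spark.sql.SparkSession

object SimpleApp {
  def main(args: Array[String]): Unit = {
    // getOrCreate() picks up the SparkSession the Databricks cluster already provides
    val spark = SparkSession.builder.appName("Simple Application").getOrCreate()

    val df = spark.read
      .parquet("/mnt/Raw/ServiceCenter=CSLP/Entity=abc/Year=2019/Month=10/abc-20191031-01.parquet")

    df.write
      .format("delta")
      .mode("overwrite")
      .save("/mnt/Raw/Entity=works")

    // No spark.stop(): the shared context is managed by the cluster, not by the job
  }
}

Is that the likely cause, or is something else wrong with how the jar is packaged or invoked?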