I am using Spark 3 with Scala 2.12.3. My application has some dependencies that I want to include in a fat jar file. I saw an example using sbt-assembly at this link. To do that, I had to create a project/assembly.sbt file containing:
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.5")
My build.sbt file has:
name := "explore-spark"
version := "0.2"
scalaVersion := "2.12.3"
val sparkVersion = "3.0.0"
libraryDependencies ++= Seq(
"org.apache.spark" %% "spark-streaming" % sparkVersion % "provided",
"org.apache.spark" %% "spark-sql" % sparkVersion % "provided",
"com.twitter" %% "algebird-core" % "0.13.7",
"joda-time" % "joda-time" % "2.5",
"org.fusesource.mqtt-client" % "mqtt-client" % "1.16"
)
mainClass in(Compile, packageBin) := Some("org.sense.spark.app.App")
mainClass in assembly := Some("org.sense.spark.app.App")
assemblyOption in assembly := (assemblyOption in assembly).value.copy(includeScala = false)
assemblyJarName in assembly := s"${name.value}_${scalaBinaryVersion.value}-fat_${version.value}.jar"
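I have not set a custom merge strategy, so sbt-assembly's default applies. If one wanted to override it, a minimal sketch (using the `in`-style syntax that matches the 0.14.x plugin version above; the strategy choices are illustrative, not what my build actually contains) would be:

```scala
// sketch only: customize how duplicate entries across dependency jars are merged
assemblyMergeStrategy in assembly := {
  // drop archive metadata (manifests, Maven pom files) that commonly conflicts
  case PathList("META-INF", xs @ _*) => MergeStrategy.discard
  // for everything else, keep the first occurrence found on the classpath
  case _ => MergeStrategy.first
}
```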
Then I run the command sbt assembly at the root of the project, and I get warning messages saying that files were discarded:
[info] Merging files...
[warn] Merging 'META-INF/NOTICE.txt' with strategy 'rename'
[warn] Merging 'META-INF/LICENSE.txt' with strategy 'rename'
[warn] Merging 'META-INF/MANIFEST.MF' with strategy 'discard'
[warn] Merging 'META-INF/maven/com.googlecode.javaewah/JavaEWAH/pom.properties' with strategy 'discard'
[warn] Merging 'META-INF/maven/com.googlecode.javaewah/JavaEWAH/pom.xml' with strategy 'discard'
[warn] Merging 'META-INF/maven/joda-time/joda-time/pom.properties' with strategy 'discard'
[warn] Merging 'META-INF/maven/joda-time/joda-time/pom.xml' with strategy 'discard'
[warn] Merging 'META-INF/maven/org.fusesource.hawtbuf/hawtbuf/pom.properties' with strategy 'discard'
[warn] Merging 'META-INF/maven/org.fusesource.hawtbuf/hawtbuf/pom.xml' with strategy 'discard'
[warn] Merging 'META-INF/maven/org.fusesource.hawtdispatch/hawtdispatch-transport/pom.properties' with strategy 'discard'
[warn] Merging 'META-INF/maven/org.fusesource.hawtdispatch/hawtdispatch-transport/pom.xml' with strategy 'discard'
[warn] Merging 'META-INF/maven/org.fusesource.hawtdispatch/hawtdispatch/pom.properties' with strategy 'discard'
[warn] Merging 'META-INF/maven/org.fusesource.hawtdispatch/hawtdispatch/pom.xml' with strategy 'discard'
[warn] Merging 'META-INF/maven/org.fusesource.mqtt-client/mqtt-client/pom.properties' with strategy 'discard'
[warn] Merging 'META-INF/maven/org.fusesource.mqtt-client/mqtt-client/pom.xml' with strategy 'discard'
[warn] Strategy 'discard' was applied to 13 files
[warn] Strategy 'rename' was applied to 2 files
[info] SHA-1: 2f2a311b8c826caae5f65a3670a71aafa12e2dc7
[info] Packaging /home/felipe/workspace-idea/explore-spark/target/scala-2.12/explore-spark_2.12-fat_0.2.jar ...
[info] Done packaging.
[success] Total time: 13 s, completed Jul 20, 2020 12:44:37 PM
Then, when I try to submit my Spark application, I get the error java.lang.NoClassDefFoundError: org/fusesource/hawtbuf/Buffer. I did create the fat jar file, but somehow it discarded a dependency that I need. This is how I submit the application, just to make sure that I am using the fat jar:
$ ./bin/spark-submit --master spark://127.0.0.1:7077 --deploy-mode cluster --driver-cores 4 --name "App" --conf "spark.driver.extraJavaOptions=-javaagent:/home/flink/spark-3.0.0-bin-hadoop2.7/jars/jmx_prometheus_javaagent-0.13.0.jar=8082:/home/flink/spark-3.0.0-bin-hadoop2.7/conf/spark.yml" /home/felipe/workspace-idea/explore-spark/target/scala-2.12/explore-spark_2.12-fat_0.2.jar -app 2
1 Answer
You can debug this in the following order:
First, make sure the missing class is actually included in your fat jar. A jar is just an archive file, so you can inspect its contents with the tools available on your operating system.
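For example, `jar tf <path-to-jar> | grep Buffer` works from the shell; equivalently, a small Scala sketch using the JDK's JarFile can check for the entry programmatically (the path and class name below are taken from the build output and the error above):

```scala
import java.util.jar.JarFile
import scala.collection.JavaConverters._

// return true when the given class-file entry is present in the jar
def jarContains(jarPath: String, classEntry: String): Boolean = {
  val jar = new JarFile(jarPath)
  try jar.entries().asScala.exists(_.getName == classEntry)
  finally jar.close()
}

// usage, with the paths from this question:
// jarContains(
//   "target/scala-2.12/explore-spark_2.12-fat_0.2.jar",
//   "org/fusesource/hawtbuf/Buffer.class")
```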
If it is there, check whether a conflicting copy of it is also present on the cluster used to run your code. If so, you can use shading as a solution (I explained the approach here: https://www.waitingforcode.com/apache-spark/shading-solution-dependency-hell-spark/read) or exclude the conflicting dependency, but that is a bit risky.
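With sbt-assembly, shading is configured through ShadeRule in build.sbt. A minimal sketch (the relocation target `shaded.org.fusesource` is illustrative; adapt the package pattern to whatever actually clashes on your cluster):

```scala
// relocate the org.fusesource classes inside the fat jar so they cannot
// clash with copies already present on the cluster's classpath
assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("org.fusesource.**" -> "shaded.org.fusesource.@1").inAll
)
```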
If it is not there, try to include it explicitly, which you did, but perhaps with a wrong version?
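Declaring the transitive dependency directly in build.sbt would look like this. The coordinates are the real hawtbuf artifact, but the version is an assumption; check which hawtbuf version your mqtt-client 1.16 actually pulls in (for example in its pom) before pinning it:

```scala
// pull in hawtbuf directly instead of relying on mqtt-client's transitive copy;
// 1.11 is an assumed version, verify it against mqtt-client's own dependencies
libraryDependencies += "org.fusesource.hawtbuf" % "hawtbuf" % "1.11"
```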