将spark submit转换为livy rest json协议

ffscu2ro  于 2021-07-12  发布在  Spark
关注(0)|答案(1)|浏览(425)

我正在尝试将spark submit重写到livy rest json协议中,spark submit包含诸如包、存储库、jar、文件、用户定义的参数。请在下面查找更多详细信息。
spark提交命令:

spark-submit \
  --packages com.hortonworks.shc:shc-core:1.1.0.3.1.6.5-3 \
  --repositories http://repo.hortonworks.com/content/groups/public/ \
  --jars /usr/hdp/current/phoenix-client/phoenix-server.jar \
  --files x/y.yml,x/y1.yml $HOME/spark_apps/a/app.py \
  --arg_name value \
  --arg_name2 value

我在利维身上的尝试:

{
    "conf": {"com.hortonworks.shc": "shc-core:1.1.0.3.1.6.5-3"},
    "jars":["wasbs:///phoenix-server.jar"],
    "file": "/home/admin/spark_apps/a/app.py",
    "files": ["/home/admin/x/y.yml,/home/admin/x/y1.yml"], 
    "args": [
         "--arg_name=value", 
         "--arg_name=value"] 

}

错误是:

ls: cannot access '/usr/hdp/current/hadoop/lib': No such file or directory
log4j:ERROR Could not find value for key log4j.appender.tcp
log4j:ERROR Could not instantiate appender named "tcp".
Warning: Ignoring non-spark config property: com.hortonworks.shc=shc-core:1.1.0.3.1.6.5-3
Warning: Master yarn-cluster is deprecated since 2.0. Please use master "yarn" with specified deploy mode instead.
21/03/14 12:05:07 WARN NativeCodeLoader [main]: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
21/03/14 12:05:08 WARN DependencyUtils [main]: Skip remote jar wasbs:///phoenix-server.jar.
21/03/14 12:05:09 INFO RequestHedgingRMFailoverProxyProvider [main]: Created wrapped proxy for [rm1, rm2]
21/03/14 12:05:09 INFO RequestHedgingRMFailoverProxyProvider [main]: Looking for the active RM in [rm1, rm2]...
21/03/14 12:05:09 INFO RequestHedgingRMFailoverProxyProvider [main]: Found active RM [rm2]
21/03/14 12:05:09 INFO Client [main]: Requesting a new application from cluster with 2 NodeManagers
21/03/14 12:05:09 INFO Configuration [main]: found resource resource-types.xml at file:/etc/hadoop/4.1.2.5/0/resource-types.xml
21/03/14 12:05:09 INFO Client [main]: Verifying our application has not requested more than the maximum memory capability of the cluster (51200 MB per container)
21/03/14 12:05:09 INFO Client [main]: Will allocate AM container, with 1408 MB memory including 384 MB overhead
21/03/14 12:05:09 INFO Client [main]: Setting up container launch context for our AM
21/03/14 12:05:10 INFO Client [main]: Setting up the launch environment for our AM container
21/03/14 12:05:10 INFO Client [main]: Preparing resources for our AM container
21/03/14 12:05:10 INFO Client [main]: Falling back to uploading libraries in this host
21/03/14 12:05:10 INFO Client [main]: Uploading resource file:/tmp/spark-c923cab1-6cf5-4fa8-9db3-73d156052819/__hive_libs__5946923461629036475.zip -> wasbs://container-spark-2021-01-12t10-28-51-042z@container.blob.core.windows.net/user/livy/.sparkStaging/application_1615371594106_0446/__hive_libs__5946923461629036475.zip
21/03/14 12:05:12 INFO Client [main]: Source and destination file systems are the same. Not copying wasbs:/phoenix-server.jar
21/03/14 12:05:12 WARN AzureFileSystemThreadPoolExecutor [main]: Disabling threads for Delete operation as thread count 0 is <= 1
21/03/14 12:05:12 INFO AzureFileSystemThreadPoolExecutor [main]: Time taken for Delete operation is: 11 ms with threads: 0
21/03/14 12:05:12 INFO Client [main]: Deleted staging directory wasbs://container-2021-01-12t10-28-51-042z@container.blob.core.windows.net/user/livy/.sparkStaging/application_1615371594106_0446
Exception in thread "main" java.io.FileNotFoundException: wasbs://container-2021-01-12t10-28-51-042z@container.blob.core.windows.net/phoenix-server.jar: No such file or directory.
    at org.apache.hadoop.fs.azure.NativeAzureFileSystem.getFileStatusInternal(NativeAzureFileSystem.java:2716)
    at org.apache.hadoop.fs.azure.NativeAzureFileSystem.getFileStatus(NativeAzureFileSystem.java:2620)
    at org.apache.spark.deploy.yarn.ClientDistributedCacheManager$$anonfun$1.apply(ClientDistributedCacheManager.scala:71)
    at org.apache.spark.deploy.yarn.ClientDistributedCacheManager$$anonfun$1.apply(ClientDistributedCacheManager.scala:71)
    at scala.collection.MapLike$class.getOrElse(MapLike.scala:128)
    at scala.collection.AbstractMap.getOrElse(Map.scala:59)
    at org.apache.spark.deploy.yarn.ClientDistributedCacheManager.addResource(ClientDistributedCacheManager.scala:71)
    at org.apache.spark.deploy.yarn.Client.org$apache$spark$deploy$yarn$Client$$distribute$1(Client.scala:479)
    at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$16$$anonfun$apply$6.apply(Client.scala:651)
    at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$16$$anonfun$apply$6.apply(Client.scala:650)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
    at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$16.apply(Client.scala:650)
    at org.apache.spark.deploy.yarn.Client$$anonfun$prepareLocalResources$16.apply(Client.scala:649)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:649)
    at org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:917)
    at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:179)
    at org.apache.spark.deploy.yarn.Client.run(Client.scala:1239)
    at org.apache.spark.deploy.yarn.YarnClusterApplication.start(Client.scala:1634)
    at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:858)
    at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
    at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
    at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
    at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:933)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:942)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
21/03/14 12:05:12 INFO ShutdownHookManager [shutdown-hook-0]: Shutdown hook called
21/03/14 12:05:12 INFO ShutdownHookManager [shutdown-hook-0]: Deleting directory /tmp/spark-c923cab1-6cf5-4fa8-9db3-73d156052819
21/03/14 12:05:12 INFO ShutdownHookManager [shutdown-hook-0]: Deleting directory /tmp/spark-424d5ba4-718c-470a-b110-9020578aef12

你能帮我把spark submit改写成livy rest json吗。。?
先谢谢你。

dsf9zpds

dsf9zpds1#

也许你应该检查一下你的jvm。检查jvm配置。
日志显示 ls: cannot access '/usr/hdp/current/hadoop/lib': No such file or directory . 还要检查路径。
也许可以查看spark用户指南。要使spark运行时jars可以从yarn端访问,可以指定spark.yarn.archive或spark.yarn.jars。有关详细信息,请参阅公园物业。如果既没有指定spark.yarn.archive也没有指定spark.yarn.jars,spark将创建一个zip文件,其中所有jar都在$spark\u home/jars下,并将其上载到分布式缓存。

相关问题