spark-submit --files hdfs://file gets cached in /tmp on the driver

Posted by xam8gpfp on 2021-06-01 in Hadoop

I am running a spark-submit like this:

spark-submit --deploy-mode client \
             --master yarn \
             --conf spark.files.overwrite=true \
             --conf spark.local.dir='/my/other/tmp/with/more/space' \
             --conf spark.executor.extraJavaOptions='-Djava.io.tmpdir=/my/other/tmp/with/more/space' \
             --conf spark.driver.extraJavaOptions='-Djava.io.tmpdir=/my/other/tmp/with/more/space' \
             --files hdfs:///a_big_file.binary,hdfs:///another_big_file.binary \
              ... etc.

I need to ship these two binary files to the nodes this way because they are parsed by an external *.dll / *.so on the workers, which can only handle local files.
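Because I am running with --master yarn and --deploy-mode client, my node becomes the driver and therefore pulls the files from HDFS into its /tmp directory. Since these files are fairly large, they fill up my limited /tmp directory quite quickly.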
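A quick check along the following lines can confirm which volume is actually short on space before and after a submit. This is a minimal Python sketch that uses only the standard library and does not touch Spark; the function name `free_space_report` is hypothetical, and the second path is simply the placeholder directory from the command above.

```python
import shutil
from pathlib import Path


def free_space_report(paths):
    """Map each path to its free bytes, or None if the path does not exist."""
    report = {}
    for p in paths:
        if Path(p).exists():
            # disk_usage reports on the filesystem that contains the path
            report[p] = shutil.disk_usage(p).free
        else:
            report[p] = None
    return report


if __name__ == "__main__":
    # Placeholder paths taken from the question; adjust to the real directories.
    candidates = ["/tmp", "/my/other/tmp/with/more/space"]
    for path, free in free_space_report(candidates).items():
        label = "missing" if free is None else f"{free / 2**30:.1f} GiB free"
        print(f"{path}: {label}")
```

Running it once before and once during the submit shows whether the free space on /tmp drops while the alternative directory stays unchanged.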
Can anyone point me to the setting that changes this path from /tmp to /my/other/tmp/with/more/space? I have already set spark.local.dir, spark.executor.extraJavaOptions and spark.driver.extraJavaOptions.
Thank you, 马夫