Spark FileAlreadyExistsException when writing to HDFS

8tntrjer · posted 2021-05-17 in Spark

I've run into a problem writing a partitioned table to HDFS with Spark 2.3. I think the problem comes from the partitionOverwriteMode configuration. Could that be it, and how can I fix it while keeping overwrite mode?

spark.conf.set("spark.sql.sources.partitionOverwriteMode", "dynamic")
 df \
            .repartition(3) \
            .write \
            .format('orc') \
            .partitionBy(['data_date_part']) \
            .mode("overwrite") \
            .option("compression",'zlib') \
            .option("path", table_path+ '/' + table_name) \
            .saveAsTable(scheme + '.' + table_name)

The error:

Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 8 in stage 3.0 failed 4 times, most recent failure: Lost task 8.3 in stage 3.0 Task failed while writing rows.
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.fs.FileAlreadyExistsException): file/ for client already exists
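For reference, the failure pattern (task 8 failing 4 times, with the retry dying on FileAlreadyExistsException) looks like the known Spark issue where, under dynamic partition overwrite, a retried task attempt reuses the same staging file name that the failed attempt already created (I believe this is SPARK-27194, fixed after 2.3). Below is a minimal sketch of the two workarounds I'm considering; the conf keys are standard Spark settings, but whether either one actually fixes this job is my assumption, not something I've confirmed.

from pyspark.sql import SparkSession

# Hypothetical session setup illustrating the two workarounds.
spark = (
    SparkSession.builder
    .appName("partitioned-orc-write")
    # Workaround 1: make sure speculative execution is off, so two
    # attempts of the same task never race to create one output file.
    .config("spark.speculation", "false")
    .getOrCreate()
)

# Workaround 2: fall back to static overwrite mode (the 2.3 default).
# Static mode goes through the normal Hadoop commit protocol, where each
# task attempt writes under its own _temporary/<attempt> directory, so a
# retry can't collide with a leftover file; the cost is that the whole
# table location is truncated instead of only the affected partitions.
spark.conf.set("spark.sql.sources.partitionOverwriteMode", "static")

If the retry collision is really the cause, upgrading to a Spark release where the dynamic-overwrite staging file name is unique per attempt would be the cleaner fix than either workaround.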

No answers yet!

