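from pyspark.sql import SparkSession

# Reuse the active session or start a new one
spark = SparkSession.builder.appName("example").getOrCreate()

# Assuming you have a DataFrame with your new data
new_data = ...
table_name = "your_table"
table_path = "path_to_your_table"

# Overwrite the existing table; insertInto matches columns by position, not by name
new_data.write.mode("overwrite").insertInto(table_name)

# Update the partition metadata that the metastore holds for this table
spark.sql(f"MSCK REPAIR TABLE {table_name}")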
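The insertInto method together with mode("overwrite") overwrites the table with the new data. How much gets replaced depends on spark.sql.sources.partitionOverwriteMode: with the default "static" setting the existing contents are replaced, while "dynamic" only replaces the partitions that appear in the new data. To remove partitions that do not exist in the new data, you can use the MSCK REPAIR TABLE command. Note that the plain form of the command only registers partitions that are missing from the metastore; dropping stale entries requires the DROP PARTITIONS or SYNC PARTITIONS clause, which Spark SQL supports from version 3.2.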
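
As a rough sketch of the overwrite-then-repair sequence described above, the snippet below wraps the two calls in small helpers. The names repair_table_sql and overwrite_and_sync are hypothetical and not part of PySpark; the SYNC PARTITIONS clause is assumed to be available (Spark 3.2 or later), and spark and new_data are assumed to be an existing SparkSession and DataFrame.

```python
# Hypothetical helpers, not part of PySpark. Assumes Spark 3.2+ for the
# ADD / DROP / SYNC PARTITIONS clause of MSCK REPAIR TABLE.
VALID_ACTIONS = {"ADD", "DROP", "SYNC"}


def repair_table_sql(table_name, action=None):
    """Build an MSCK REPAIR TABLE statement.

    action=None gives the plain form; "ADD", "DROP" or "SYNC" appends the
    matching PARTITIONS clause.
    """
    if action is None:
        return f"MSCK REPAIR TABLE {table_name}"
    action = action.upper()
    if action not in VALID_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    return f"MSCK REPAIR TABLE {table_name} {action} PARTITIONS"


def overwrite_and_sync(spark, new_data, table_name):
    """Overwrite the table, then resync partition metadata with storage."""
    # Same write call as in the answer above
    new_data.write.mode("overwrite").insertInto(table_name)
    # SYNC adds partitions found on disk and drops ones that no longer exist
    spark.sql(repair_table_sql(table_name, "SYNC"))
```

With a live session this would be called as overwrite_and_sync(spark, new_data, "your_table"). Keeping the statement builder separate from the Spark call makes it easy to check the generated SQL without a cluster.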