Spark catalog.clearCache and unpersisting RDDs

Posted by r1wp621o on 2021-05-17 in Spark


I am new to Spark and am looking for a way to clear the cache completely. I found two options:

SparkInstance.catalog().clearCache(); // removes all the tables from the in-memory cache

Alternatively, I can do this:

persistentRDDs.forEach((k,v) -> {
   v.unpersist(); // unpersists every RDD in the map
});

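What I want to understand is which option is better, and why. I believe unpersisting the RDDs will also remove all the data held in memory for catalog(). Please correct me if I am wrong.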

Java apache-spark spring-boot

Source: https://stackoverflow.com/questions/64932787/spark-catalog-clearcache-and-unpersisting-rdds

1 Answer


txu3uszq1#

If you turn a DataFrame into a table/view with createOrReplaceTempView, you can cache that table with cacheTable. For example:
df.createOrReplaceTempView("test")
spark.catalog.cacheTable("test")
To cache the DataFrame itself, use df.cache() or df.persist(). persist accepts a storage-level argument that controls how the DataFrame is cached.
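Since the question is tagged Java, here is a minimal Java sketch of the two caching routes described above. It assumes Spark SQL (2.1 or later) is on the classpath and a local master; the class name CacheSketch, the helper cacheAsTable, and the sample data are made up for illustration, while the view name "test" comes from the answer.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.storage.StorageLevel;

public class CacheSketch {
    // Hypothetical helper: registers df as the temp view "test", marks the
    // view as cached, and reports whether the catalog sees it as cached.
    static boolean cacheAsTable(SparkSession spark, Dataset<Row> df) {
        df.createOrReplaceTempView("test");
        spark.catalog().cacheTable("test");
        return spark.catalog().isCached("test");
    }

    public static void main(String[] args) {
        // Assumption: a local session is good enough for a demonstration.
        SparkSession spark = SparkSession.builder()
                .appName("cache-sketch")
                .master("local[*]")
                .getOrCreate();

        // Route 1: cache through the catalog, by view name.
        Dataset<Row> df = spark.range(100).toDF("id");
        System.out.println("view cached: " + cacheAsTable(spark, df));

        // Route 2: cache a DataFrame directly, with an explicit storage level.
        Dataset<Row> evens = df.filter("id % 2 = 0");
        evens.persist(StorageLevel.MEMORY_AND_DISK());
        evens.count(); // run an action so the data is actually materialized
        System.out.println("storage level: " + evens.storageLevel());

        spark.stop();
    }
}
```

df.cache() is the same call as persist() with the default storage level, so persist is only needed when you want to pick the level yourself.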
To uncache a DataFrame, use df.unpersist(). To uncache a temp table you created, use spark.catalog.uncacheTable("test"). To uncache all cached temp tables, use spark.catalog.clearCache().
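For the original goal of clearing everything, a rough Java sketch could combine both options from the question. As far as I understand, clearCache() covers what was cached through Spark SQL (tables and DataFrames), while RDDs persisted directly are tracked by the SparkContext, so the sketch handles them separately. The class name UncacheSketch and the helper uncacheEverything are hypothetical; a local master and Spark SQL on the classpath are assumed.

```java
import java.util.Arrays;
import java.util.Map;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.SparkSession;

public class UncacheSketch {
    // Hypothetical helper: clears the Spark SQL cache, then unpersists
    // every RDD the SparkContext still lists as persistent.
    static void uncacheEverything(SparkSession spark) {
        // Drops all cached tables/DataFrames registered with Spark SQL.
        spark.catalog().clearCache();

        // RDDs persisted directly (rdd.cache()/persist()) are listed here.
        JavaSparkContext jsc = new JavaSparkContext(spark.sparkContext());
        Map<Integer, JavaRDD<?>> persistent = jsc.getPersistentRDDs();
        persistent.forEach((id, rdd) -> rdd.unpersist());
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("uncache-sketch")
                .master("local[*]")
                .getOrCreate();
        JavaSparkContext jsc = new JavaSparkContext(spark.sparkContext());

        // One cached temp view and one directly persisted RDD.
        spark.range(100).toDF("id").createOrReplaceTempView("test");
        spark.catalog().cacheTable("test");
        JavaRDD<Integer> numbers = jsc.parallelize(Arrays.asList(1, 2, 3)).cache();
        numbers.count(); // materialize the RDD cache

        uncacheEverything(spark);

        System.out.println("view still cached: " + spark.catalog().isCached("test"));
        System.out.println("persistent RDDs left: " + jsc.getPersistentRDDs().size());

        spark.stop();
    }
}
```

If you only need to release one item, df.unpersist() or spark.catalog().uncacheTable("test") is the narrower tool; the helper above is meant for a full reset.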
For reference, you can also see the answer to this question: temp table caching in Spark SQL

Answered 2021-05-18
