Savepoint on Flink job finish

Posted by epggiuax on 2022-12-09 in Apache


I have a use case where I need to seed a Flink application (both RocksDB state and broadcast state) from bounded S3 sources, and then read other unbounded/bounded S3 sources once the seeding is complete.
I was trying to achieve this in two steps:

1. Seeding: Trigger a Flink job with only the bounded seeding source and take a savepoint after the job finishes.
2. Regular processing: Restore from the seeded savepoint on a new Flink graph to process the other unbounded/bounded S3 sources.
Questions:
1. For step 1: does Flink support taking a savepoint automatically after the job finishes in streaming mode?
2. If only a manually triggered savepoint is supported, what can be used as a done signal that all the seeding data has been processed completely and all tasks have finished processing?
Any other approaches to the seeding use case are appreciated as well. Note: approaches that buffer the regular data until the seeding data is processed are not feasible for my use case.
Thanks

apache-flink

Source: https://stackoverflow.com/questions/74408659/savepoint-on-flink-job-finish

1 answer


2wnc66cl #1

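1. With an unbounded source, you can use externalized checkpoints, and you will be able to start/restore the job from a checkpoint. When this feature is enabled, you must have a process that cleans up the checkpoints when the job is cancelled; otherwise Flink will not delete them.
2. You can do this with the new feature available in Flink 1.15 (checkpoints with finished tasks).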
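
As a rough illustration of the two points above, the fragment below is a minimal flink-conf.yaml sketch, assuming Flink 1.15 option names. The checkpoint interval and the bucket path are placeholder values chosen for illustration, not values from the question. Whether the last checkpoint is still kept once a bounded job reaches FINISHED should be verified for your Flink version and retention mode.

```yaml
# Sketch only: the interval and the S3 path are hypothetical placeholders.
execution.checkpointing.interval: 1min

# Point 1: keep (externalize) checkpoints when the job is cancelled.
# Flink will not delete them, so cleanup is your responsibility.
execution.checkpointing.externalized-checkpoint-retention: RETAIN_ON_CANCELLATION

# Point 2: allow checkpoints to continue after some tasks have finished.
execution.checkpointing.checkpoints-after-tasks-finish.enabled: true

# Where checkpoint data and metadata are written (placeholder bucket).
state.checkpoints.dir: s3://<your-bucket>/flink-checkpoints
```

A retained checkpoint can then be passed to the follow-up job in the same way as a savepoint path, for example with `flink run -s <checkpoint-path> ...`.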

2022-12-09
