How do I configure a Flink job at runtime?

kd3sttzy  posted on 2021-06-25 in Flink

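Is it possible to configure a Flink application at runtime? For example, I have a streaming application that reads input, applies some transformations, and then filters out all elements below a certain threshold. However, I want this threshold to be configurable at runtime, meaning I can change it without restarting the Flink job. Example code: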

DataStream<MyModel> myModelDataStream = // get input ...
                // do some stuff ...
                .filter(new RichFilterFunction<MyModel>() {
                    @Override
                    public boolean filter(MyModel value) throws Exception {
                        return value.someValue() > someGlobalState.getThreshold();
                    }
                })
                // write to some sink ...

DataStream<MyConfig> myConfigDataStream = // get input ...
                // ...
                .process(new RichProcessFunction<MyConfig>() {
                      someGlobalState.setThreshold(MyConfig.getThreshold());
                })
                // ...

Is there a way to achieve this? For example, through some global state that a configuration stream can change.

vuktfyat1#

Yes, you can use a BroadcastProcessFunction. Roughly as follows:

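import org.apache.flink.api.common.state.BroadcastState;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.datastream.BroadcastStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.co.BroadcastProcessFunction;
import org.apache.flink.util.Collector;

// Job wiring: broadcast the threshold stream and connect it to the data stream.
DataStream<MyModel> myModelDataStream = // get input ...
DataStream<Threshold> thresholds = // get input...
BroadcastStream<Threshold> controlStream = thresholds.broadcast(MyFunction.THRESHOLDS);

DataStream<MyModel> result = myModelDataStream
    .connect(controlStream)
    .process(new MyFunction());

public class MyFunction extends BroadcastProcessFunction<MyModel, Threshold, MyModel> {
    // One shared descriptor, so both callbacks access the same broadcast state.
    public static final MapStateDescriptor<Void, Threshold> THRESHOLDS = new MapStateDescriptor<>(
        "thresholds", Types.VOID, TypeInformation.of(Threshold.class));

    @Override
    public void processBroadcastElement(Threshold newThreshold, Context ctx, Collector<MyModel> out) throws Exception {
        // Store the latest threshold under the single null key.
        BroadcastState<Void, Threshold> bcState = ctx.getBroadcastState(THRESHOLDS);
        bcState.put(null, newThreshold);
    }

    @Override
    public void processElement(MyModel model, ReadOnlyContext ctx, Collector<MyModel> out) throws Exception {
        Threshold threshold = ctx.getBroadcastState(THRESHOLDS).get(null);
        // Until a threshold has been broadcast, let every element through.
        if (threshold == null || threshold.value() == null || model.getData() > threshold.value()) {
            out.collect(model);
        }
    }
}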
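
To make the behaviour of the two callbacks easier to see, here is a minimal plain-Java sketch of what a single parallel instance of MyFunction does as control and data elements interleave. It does not use the Flink API: ThresholdFilterModel, onControl and onData are hypothetical names made up for illustration, and the threshold is assumed to be a plain long rather than the Threshold type from the answer.

```java
/** Plain-Java model of the two callbacks above; illustrative only, not Flink API. */
class ThresholdFilterModel {
    // Stands in for the broadcast state entry; null until the first control element arrives.
    private Long threshold;

    // Mirrors processBroadcastElement: a control element replaces the current threshold.
    void onControl(long newThreshold) {
        threshold = newThreshold;
    }

    // Mirrors processElement: returns true when the data element would be emitted.
    boolean onData(long value) {
        return threshold == null || value > threshold;
    }

    public static void main(String[] args) {
        ThresholdFilterModel f = new ThresholdFilterModel();
        System.out.println(f.onData(3));   // true: no threshold has been set yet
        f.onControl(10);
        System.out.println(f.onData(5));   // false: 5 is not above 10
        System.out.println(f.onData(12));  // true: 12 is above 10
        f.onControl(4);
        System.out.println(f.onData(5));   // true: the threshold was lowered to 4
    }
}
```

In a real job each parallel instance receives the broadcast element on its own schedule, so a new threshold takes effect per instance rather than at one global instant. If elements must never pass before the first threshold arrives, the null check would have to drop or buffer them instead.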
