使用自定义配置通过Flink流式传输执行环境时未接收到输出

wbgh16ku 于 2022-12-09 发布在 Apache

关注(0)|答案(1)|浏览(141)

我正在运行Apache Flink版本1.12.7，并将流执行环境配置为任务管理器的任务插槽数= 3（只是试验），但无法看到环境读取的文件的输出。相反，如日志中所示，执行图被卡在SCHEDULED状态，无法进入RUNNING状态。
请注意，如果属性文件中未传递任何配置，则一切正常，并且由于执行图切换到RUNNING状态，环境能够读取文件，因此可以看到输出。
代码如下：

ParameterTool parameters = ParameterTool.fromPropertiesFile("src/main/resources/application.properties");
    Configuration config = Configuration.fromMap(parameters.toMap());
    TaskExecutorResourceUtils.adjustForLocalExecution(config);

    StreamExecutionEnvironment env =
            StreamExecutionEnvironment.getExecutionEnvironment(config);

    System.out.println("Config Params : " + config.toMap());

    DataStream<String> inputStream =
            env.readTextFile(FILEPATH);

    DataStream<String> filteredData = inputStream.filter((String value) -> {
        String[] tokens = value.split(",");
        return Double.parseDouble(tokens[3]) >= 75.0;
    });

    filteredData.print(); // no o/p seen if configuration object is set otherwise everything works as expected

    env.execute("Filter Country Details");

需要帮助来理解这种行为，以及为了打印输出沿着进行一些自定义配置，应该做哪些更改。谢谢。

apache-flink

来源：https://stackoverflow.com/questions/74174810/no-output-received-when-flink-streaming-execution-environment-passed-with-custom

1条答案

按热度按时间

jjjwad0x1#

Okay..So found the answer to the above puzzle by referring to some links mentioned below.

Solution : So I set the parallelism (env.setParallelism) in the above code just after configuring the streaming execution environment and the file was read with output generated as expected.

Post that, experimented with a few things :

set parallelism equal to number of task slots = everything worked
set parallelism greater than number of task slots = intermittent results
set parallelism less than number of task slots = intermittent results.

As per this link corresponding to Flink Architecture,
A Flink cluster needs exactly as many task slots as the highest parallelism used in the job
So its best to go with no. of task slots for a task manager equal to the parallelism configured.

赞(0）回复(0）举报 2022-12-09

我来回答

使用自定义配置通过Flink流式传输执行环境时未接收到输出

1条答案

相关问题

热门标签

最新问答