pyflink 1.16.1 - problem accessing a secured Kafka cluster

cbjzeqam · posted 2023-05-15 in Apache

I am trying to produce to a secured Kafka cluster using PyFlink.
I tried the default JSON producer example provided by the Flink project.
My configuration looks like this:

from pyflink.common.typeinfo import Types
from pyflink.datastream import StreamExecutionEnvironment
from pyflink.datastream.connectors.kafka import FlinkKafkaProducer
from pyflink.datastream.formats.json import JsonRowSerializationSchema

USERNAME = 'username'
PASSWORD = 'password'

def write_to_kafka(env):
    type_info = Types.ROW([Types.INT(), Types.STRING()])
    ds = env.from_collection(
        [(1, 'hi'), (2, 'hello'), (3, 'hi'), (4, 'hello'), (5, 'hi'), (6, 'hello'), (6, 'hello')],
        type_info=type_info)

    serialization_schema = JsonRowSerializationSchema.Builder() \
        .with_type_info(type_info) \
        .build()
    kafka_producer = FlinkKafkaProducer(
        topic='flink-test',
        serialization_schema=serialization_schema,
        producer_config={
            'bootstrap.servers': 'bootstrapxxxxxxxxx:443',
            'properties.security.protocol': 'SASL_SSL',
            'properties.sasl.mechanism': 'PLAIN',
            'properties.ssl.truststore.location': '/home/xxxx/flink/truststore.jks',
            'properties.ssl.truststore.password': 'password',
            'properties.group.id': 'flink-cg',
            'properties.sasl.jaas.config': f'org.apache.kafka.common.security.plain.PlainLoginModule required username="{USERNAME}" password="{PASSWORD}";'
        }
    )

    ds.add_sink(kafka_producer)
    env.execute()

I keep getting the following error:

at org.apache.flink.runtime.scheduler.SchedulerNG.updateTaskExecutionState(SchedulerNG.java:78)
        at org.apache.flink.runtime.jobmaster.JobMaster.updateTaskExecutionState(JobMaster.java:477)
        at jdk.internal.reflect.GeneratedMethodAccessor18.invoke(Unknown Source)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.lambda$handleRpcInvocation$1(AkkaRpcActor.java:309)
        at org.apache.flink.runtime.concurrent.akka.ClassLoadingUtils.runWithContextClassLoader(ClassLoadingUtils.java:83)
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcInvocation(AkkaRpcActor.java:307)
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleRpcMessage(AkkaRpcActor.java:222)
        at org.apache.flink.runtime.rpc.akka.FencedAkkaRpcActor.handleRpcMessage(FencedAkkaRpcActor.java:84)
        at org.apache.flink.runtime.rpc.akka.AkkaRpcActor.handleMessage(AkkaRpcActor.java:168)
        at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:24)
        at akka.japi.pf.UnitCaseStatement.apply(CaseStatements.scala:20)
        at scala.PartialFunction.applyOrElse(PartialFunction.scala:123)
        at scala.PartialFunction.applyOrElse$(PartialFunction.scala:122)
        at akka.japi.pf.UnitCaseStatement.applyOrElse(CaseStatements.scala:20)
        at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:171)
        at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
        at scala.PartialFunction$OrElse.applyOrElse(PartialFunction.scala:172)
        at akka.actor.Actor.aroundReceive(Actor.scala:537)
        at akka.actor.Actor.aroundReceive$(Actor.scala:535)
        at akka.actor.AbstractActor.aroundReceive(AbstractActor.scala:220)
        at akka.actor.ActorCell.receiveMessage(ActorCell.scala:580)
        at akka.actor.ActorCell.invoke(ActorCell.scala:548)
        at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:270)
        at akka.dispatch.Mailbox.run(Mailbox.scala:231)
        at akka.dispatch.Mailbox.exec(Mailbox.scala:243)
        ... 5 more
Caused by: org.apache.flink.kafka.shaded.org.apache.kafka.common.errors.TimeoutException: Topic flink-test not present in metadata after 60000 ms.

My reading of the error is that it does not necessarily mean the topic is missing; it may be related to the access configuration.
I have tried different configurations and gone through the documentation, but I still cannot find the correct one.
https://nightlies.apache.org/flink/flink-docs-release-1.15/api/python/reference/pyflink.datastream/api/pyflink.datastream.connectors.FlinkKafkaConsumer.html#pyflink.datastream.connectors.FlinkKafkaConsumer

Answer by esbemjvw:

The problem was that I was using FlinkKafkaProducer, which according to the documentation is deprecated and has been removed. I am on Flink 1.16.1.
"FlinkKafkaProducer is deprecated and will be removed with Flink 1.15, please use KafkaSink instead." The link is here.
After switching to the new API (from pyflink.datastream.connectors.kafka import KafkaSink), I was able to produce to Kafka without any problems, following the code snippets in the documentation.
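For reference, here is a minimal sketch of what a KafkaSink-based setup can look like with the same SASL_SSL settings. The helper names, broker address, topic, and truststore path are placeholders of mine, not from the original answer. Note that with KafkaSink the Kafka client properties are passed via set_property using plain client keys (e.g. "security.protocol"), not the "properties."-prefixed keys used in the question.

```python
def make_jaas_config(username: str, password: str) -> str:
    """Build the SASL/PLAIN JAAS line (helper name is mine, not PyFlink's)."""
    return (
        'org.apache.kafka.common.security.plain.PlainLoginModule required '
        f'username="{username}" password="{password}";'
    )


def build_secure_kafka_sink(bootstrap, topic, truststore, truststore_pw,
                            username, password):
    # PyFlink imports are kept local so the pure helper above can be used
    # without a Flink installation.
    from pyflink.common.serialization import SimpleStringSchema
    from pyflink.datastream.connectors.kafka import (
        KafkaRecordSerializationSchema,
        KafkaSink,
    )

    return (
        KafkaSink.builder()
        .set_bootstrap_servers(bootstrap)
        .set_record_serializer(
            KafkaRecordSerializationSchema.builder()
            .set_topic(topic)
            .set_value_serialization_schema(SimpleStringSchema())
            .build()
        )
        # Kafka client keys are set directly, without a "properties." prefix.
        .set_property("security.protocol", "SASL_SSL")
        .set_property("sasl.mechanism", "PLAIN")
        .set_property("ssl.truststore.location", truststore)
        .set_property("ssl.truststore.password", truststore_pw)
        .set_property("sasl.jaas.config", make_jaas_config(username, password))
        .build()
    )
```

The resulting sink is attached with ds.sink_to(sink) rather than the old ds.add_sink(producer).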
