我在confluent社区平台中使用kafka connect来保持mysql数据库的同步。源和汇是mysql数据库。它不起作用。
我的情况有些问题:
同一台服务器上的其他数据库中有表,我不想将它们读入kafka,但kafka connect source一直在尝试读取其他数据库。
我想用 org.apache.kafka.connect.json.JsonConverter
在源连接器和接收器连接器中,但接收器连接器无法正确插入。
我想同步多个数据库,不同数据库中的表可能具有相同的表名,如何避免表名冲突和sink连接器正确路由kafka主题,将数据插入到正确的数据库中?mysql同步说明
kafka jdbc源连接器配置文件是:
{
"name": "br-auths-3910472223-source",
"config": {
"connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
"key.converter": "org.apache.kafka.connect.json.JsonConverter",
"key.converter.schemas.enable":"true",
"value.converter": "org.apache.kafka.connect.json.JsonConverter",
"value.converter.schemas.enable":"true",
"tasks.max": "1",
"connection.url": "jdbc:mysql://localhost:3306/br_auths?user=root&password=123456",
"database.whitelist":"br_auths",
"table.blacklist": "br_auths.__migrationversions,br_auths.auths_service_apps",
"mode": "timestamp",
"timestamp.column.name": "utime",
"validate.non.null": "false",
"incrementing.column.name": "id",
"topic.prefix": "br_auths__"
}
}
kafka jdbc接收器连接器配置文件是:
{
"name": "br-auths-3910472223-sink",
"config": {
"connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
"key.converter": "org.apache.kafka.connect.json.JsonConverter",
"key.converter.schemas.enable":"true",
"value.converter": "org.apache.kafka.connect.json.JsonConverter",
"value.converter.schemas.enable":"true",
"tasks.max": "1",
"connection.url": "jdbc:mysql://rm-hp303a0n2vr8970.mysql.huhehaote.rds.aliyuncs.com:3306/dev-br-auths-391047222?user=br_auths&password=@123456",
"topics": "br_auths__auths_roles,br_auths__auths_user_logins,br_auths__auths_user_roles,br_auths__auths_users,br_auths__auths_user_claims,br_auths__auths_user_tokens,br_auths__auths_role_claims",
"auto.create": "true",
"insert.mode": "upsert",
"transforms":"dropTopicPrefix",
"transforms.dropTopicPrefix.type":"org.apache.kafka.connect.transforms.RegexRouter",
"transforms.dropTopicPrefix.regex":"br_auths__(.*)",
"transforms.dropTopicPrefix.replacement":"$1"
}
}
我想为不同的数据库创建几对源和汇连接器,mysql server-a中的数据库a中的一些白名单表可以增量地与mysql server-b中的数据库a同步。
更新1:
我改为连接avro分布式、debezium源连接器和jdbc接收器连接器。源连接器是:
{
"name":"br-auths-3910472223-source",
"config":{
"connector.class": "io.debezium.connector.mysql.MySqlConnector",
"tasks.max": "1",
"database.hostname": "localhost",
"database.port": "3306",
"database.user": "root",
"database.password": "br123456",
"database.useLegacyDatetimeCode": "false",
"database.server.id": "184",
"database.server.name": "local3910472223",
"database.whitelist":"br_auths",
"database.history.kafka.bootstrap.servers": "localhost:9092",
"database.history.kafka.topic": "schema-changes.br-auths.local3910472223" ,
"table.blacklist": "br_auths.__migrationversions,br_auths.auths_service_apps",
"include.schema.changes": "true",
"transforms": "route,TimestampConverter",
"transforms.TimestampConverter.type": "org.apache.kafka.connect.transforms.TimestampConverter$Value",
"transforms.TimestampConverter.target.type": "string",
"transforms.TimestampConverter.field": "payload.after.ctime",
"transforms.TimestampConverter.format": "yyyy-MM-dd HH:mm:ss",
"transforms.route.type": "org.apache.kafka.connect.transforms.RegexRouter",
"transforms.route.regex": "([^.]+)\\.([^.]+)\\.([^.]+)",
"transforms.route.replacement": "$2__$3"
}
}
Flume接头为:
{
"name": "br-auths-3910472223-sink",
"config": {
"connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
"tasks.max": "1",
"connection.url": "jdbc:mysql://rm-hp303a0n2.mysql.huhehaote.rds.aliyuncs.com:3306/dev-br-auths-391047222?useLegacyDatetimeCode=false&user=br_auths&password=123456",
"dialect.name": "MySqlDatabaseDialect",
"topics.regex": "br_auths__(.*)",
"transforms": "dropTopicPrefix,unwrap",
"transforms.dropTopicPrefix.type":"org.apache.kafka.connect.transforms.RegexRouter",
"transforms.dropTopicPrefix.regex":"br_auths__(.*)",
"transforms.dropTopicPrefix.replacement":"$1",
"transforms.unwrap.type": "io.debezium.transforms.UnwrapFromEnvelope",
"insert.mode": "upsert",
"pk.fields": "Id",
"pk.mode": "record_value"
}
}
avro消息转换为json,如下所示:
{
"schema": {
"type": "struct",
"fields": [
{
"type": "struct",
"fields": [
{
"type": "string",
"optional": false,
"field": "Id"
},
{
"type": "string",
"optional": false,
"field": "UserId"
},
{
"type": "string",
"optional": false,
"field": "RoleId"
},
{
"type": "string",
"optional": true,
"field": "APPID"
},
{
"type": "int32",
"optional": false,
"default": 0,
"field": "IsDeleted"
},
{
"type": "int64",
"optional": false,
"name": "io.debezium.time.Timestamp",
"version": 1,
"default": 0,
"field": "ctime"
},
{
"type": "int64",
"optional": false,
"name": "io.debezium.time.Timestamp",
"version": 1,
"default": 0,
"field": "utime"
}
],
"optional": true,
"name": "local3910472223.br_auths.auths_user_roles.Value",
"field": "before"
},
{
"type": "struct",
"fields": [
{
"type": "string",
"optional": false,
"field": "Id"
},
{
"type": "string",
"optional": false,
"field": "UserId"
},
{
"type": "string",
"optional": false,
"field": "RoleId"
},
{
"type": "string",
"optional": true,
"field": "APPID"
},
{
"type": "int32",
"optional": false,
"default": 0,
"field": "IsDeleted"
},
{
"type": "int64",
"optional": false,
"name": "io.debezium.time.Timestamp",
"version": 1,
"default": 0,
"field": "ctime"
},
{
"type": "int64",
"optional": false,
"name": "io.debezium.time.Timestamp",
"version": 1,
"default": 0,
"field": "utime"
}
],
"optional": true,
"name": "local3910472223.br_auths.auths_user_roles.Value",
"field": "after"
},
{
"type": "struct",
"fields": [
{
"type": "string",
"optional": true,
"field": "version"
},
{
"type": "string",
"optional": false,
"field": "name"
},
{
"type": "int64",
"optional": false,
"field": "server_id"
},
{
"type": "int64",
"optional": false,
"field": "ts_sec"
},
{
"type": "string",
"optional": true,
"field": "gtid"
},
{
"type": "string",
"optional": false,
"field": "file"
},
{
"type": "int64",
"optional": false,
"field": "pos"
},
{
"type": "int32",
"optional": false,
"field": "row"
},
{
"type": "boolean",
"optional": true,
"default": false,
"field": "snapshot"
},
{
"type": "int64",
"optional": true,
"field": "thread"
},
{
"type": "string",
"optional": true,
"field": "db"
},
{
"type": "string",
"optional": true,
"field": "table"
},
{
"type": "string",
"optional": true,
"field": "query"
}
],
"optional": false,
"name": "io.debezium.connector.mysql.Source",
"field": "source"
},
{
"type": "string",
"optional": false,
"field": "op"
},
{
"type": "int64",
"optional": true,
"field": "ts_ms"
}
],
"optional": false,
"name": "local3910472223.br_auths.auths_user_roles.Envelope"
},
"payload": {
"before": null,
"after": {
"Id": "DB4DA841364860D112C3C76BDCB36635",
"UserId": "0000000000",
"RoleId": "5b7e5f9b4bc00d89c4cf96ae",
"APPID": "br.region2",
"IsDeleted": 0,
"ctime": 1550138524000,
"utime": 1550138524000
},
"source": {
"version": "0.8.3.Final",
"name": "local3910472223",
"server_id": 0,
"ts_sec": 0,
"gtid": null,
"file": "mysql-bin.000003",
"pos": 64606,
"row": 0,
"snapshot": true,
"thread": null,
"db": "br_auths",
"table": "auths_user_roles",
"query": null
},
"op": "c",
"ts_ms": 1550568556614
}
}
使用mysql datetime类型的列被序列化为一个大整数,jdbc接收器连接器试图插入mysql datetime列,但失败了。
所以我写了 transforms.TimestampConverter
但ctime、utime列没有更改。怎么了?
1条答案
按热度按时间vs91vp4v1#
如果您想保持数据库的同步,那么jdbc源连接器并不是最好的——您需要使用适当的基于日志的cdc,对于mysql,您可以使用debezium获得它。更多细节在这里。
如果你对这些数据不做任何其他事情,你需要Kafka吗?专用的mysql复制工具更合适吗?
你的具体问题。这篇文章将解决你的许多问题。特别地:
同一台服务器上的其他数据库中有表,我不想将它们读入kafka,但kafka connect source一直在尝试读取其他数据库。
结合使用
table.whitelist
,table.blacklist
,和schema.pattern
按要求。如果不能用一个连接器匹配整个图案,则需要使用多个连接器来实现所需的设置。我想在源连接器和接收器连接器中都使用org.apache.kafka.connect.json.jsonconverter,但接收器连接器无法正确插入。
如果你不解释“不能正确插入”,很难回答这个问题。一般来说,我会使用avro,因为它有更丰富的模式支持和更高效的消息(没有嵌入式模式,模式存储在模式注册表中)。请参阅此处了解更多详细信息。
我想同步多个数据库,不同数据库中的表可能具有相同的表名,如何避免表名冲突和sink连接器正确路由kafka主题,将数据插入到正确的数据库中?
您将需要使用
topic.prefix
在源连接器上标记来自特定源的主题,然后进行单个消息转换RegexRouter
(正如您已经发现的)在源连接器和/或接收器连接器中进一步操作主题名称。您可能需要使用多个接收器连接器topics.regex
选择要路由到特定数据库的特定主题。