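I have a streaming application that listens for data and transforms it by pushing it to a new topic. I use an Avro schema to read from and write to the topics. The problem appears when I consume data from the final destination topic using the command below. My data is somewhat complex, containing arrays and nested JSON, and I suspect my Avro schema may not be suited to it. There are no errors, and I can see all the data in the final topic, but the "Pets" field is duplicated for some reason and I don't understand why. In fact, I only add one new field (job_id) to the existing data in the Avro schema, and the transformation makes no other major changes to it.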
./bin/kafka-console-consumer --topic my_topic \
--bootstrap-server localhost:9092 \
Here is my JSON data:
{
"Person":{
"id":"104440",
"Name":"William",
"LastName":"Dorsey",
"archived":false,
"Timezone":"America/Los_Angeles",
"brandCompanyName":"Twitter",
"brandID":"cf545a7b",
"creatorID":"1234",
"currency":"USD",
"dateCreated":"2020-09-07T02:56:22Z",
"dateModified":"2020-09-07T02:57:24Z",
"disabled":false,
"endDate":"2020-11-29T19:51:00-08:00",
"startDate":"2020-08-31T20:55:00-07:00",
"totalBudget":0
},
"Pets":[
{
"Name":"Pawny",
"Id":"4214",
"budget":"0",
"adoptionDate":"2020-09-07T02:56:22Z",
"year":"2",
"type":"Golden",
"gender":"male"
}
],
"CreationTime":"1604036638"
}
My Avro schema:
{
"name": "MyClass",
"type": "record",
"namespace": "com.acme.avro",
"fields": [
{
"name": "Person",
"type": {
"name": "Person",
"type": "record",
"fields": [
{
"name": "id",
"type": "string"
},
{
"name": "Name",
"type": "string"
},
{
"name": "LastName",
"type": "string"
},
{
"name": "archived",
"type": "boolean"
},
{
"name": "Timezone",
"type": "string"
},
{
"name": "brandCompanyName",
"type": "string"
},
{
"name": "brandID",
"type": "string"
},
{
"name": "creatorID",
"type": "string"
},
{
"name": "currency",
"type": "string"
},
{
"name": "dateCreated",
"type": "int",
"logicalType": "date"
},
{
"name": "dateModified",
"type": "int",
"logicalType": "date"
},
{
"name": "disabled",
"type": "boolean"
},
{
"name": "endDate",
"type": "int",
"logicalType": "date"
},
{
"name": "startDate",
"type": "int",
"logicalType": "date"
},
{
"name": "totalBudget",
"type": "int"
}
]
}
},
{
"name": "Pets",
"type": {
"type": "array",
"items": {
"name": "Pets_record",
"type": "record",
"fields": [
{
"name": "Name",
"type": "string"
},
{
"name": "Id",
"type": "string"
},
{
"name": "budget",
"type": "string"
},
{
"name": "adoptionDate",
"type": "int",
"logicalType": "date"
},
{
"name": "year",
"type": "string"
},
{
"name": "type",
"type": "string"
},
{
"name": "gender",
"type": "string"
}
]
}
}
},
{
"name": "CreationTime",
"type": "string"
},
{
"name":"jobID",
"type":"string"
}
]
}
When I consume the topic, the Pets field in the output is duplicated for some reason, and I don't understand why:
{
"id":"104440",
"Name":"William",
"LastName":"Dorsey",
"archived":false,
"Timezone":"America/Los_Angeles",
"brandCompanyName":"Twitter",
"brandID":"cf545a7b",
"creatorID":"1234",
"currency":"USD",
"dateCreated":"2020-09-07T02:56:22Z",
"dateModified":"2020-09-07T02:57:24Z",
"disabled":false,
"endDate":"2020-11-29T19:51:00-08:00",
"startDate":"2020-08-31T20:55:00-07:00",
"totalBudget":0,
"Pets":[
{
"Name":"Pawny",
"Id":"4214",
"budget":"0",
"adoptionDate":"2020-09-07T02:56:22Z",
"year":"2",
"type":"Golden",
"gender":"male"
}
],
"CreationTime":1604036638,
"jobID":12512,
"pets":[
{
"Name":"Pawny",
"Id":"4214",
"budget":"0",
"adoptionDate":"2020-09-07T02:56:22Z",
"year":"2",
"type":"Golden",
"gender":"male"
}
]
}
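To confirm that the two arrays differ only in the letter case of their key, a small check like the one below can be run on a consumed record. This is a minimal sketch: the helper name `case_only_duplicates` and the trimmed `sample` record are illustrative assumptions, not part of the application.

```python
import json
from collections import defaultdict


def case_only_duplicates(record: dict) -> dict:
    """Group top-level keys that are identical except for letter case."""
    groups = defaultdict(list)
    for key in record:
        groups[key.lower()].append(key)
    # Keep only groups where more than one spelling of the key is present.
    return {low: keys for low, keys in groups.items() if len(keys) > 1}


# Trimmed-down stand-in for one consumed message (most fields omitted).
sample = json.loads(
    '{"Name": "William", "Pets": [{"Name": "Pawny"}], "pets": [{"Name": "Pawny"}]}'
)
print(case_only_duplicates(sample))  # {'pets': ['Pets', 'pets']}
```

If the result is non-empty, the same field is being written under two spellings, which points at naming rather than at the data itself.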
1 Answer
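It was because I used uppercase letters in the field names... I spent 24 hours going around in circles before I finally found the answer, so I'm posting it in case anyone runs into the same problem. Please read here, and use lowercase names for your fields. When I changed the field name to "pets", the duplicate disappeared.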
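The answer does not say why uppercase names cause the duplicate. One plausible explanation, stated here only as an assumption, is that a bean-style JSON serializer in the streaming application sees both a field named `Pets` and a getter-derived property named `pets` and writes both. Whatever the cause, the fix described above amounts to lowercasing the field names in the schema. The sketch below shows one way to do that mechanically; `lowercase_field_names` is a hypothetical helper and the schema is a shortened stand-in for the one in the question.

```python
def lowercase_field_names(schema):
    """Return a copy of an Avro schema with every record field name lowercased.

    Only the "name" of entries inside a record's "fields" list is changed;
    record names, namespaces and type names are left untouched.
    """
    if isinstance(schema, list):  # a union, or any other list of schemas
        return [lowercase_field_names(item) for item in schema]
    if not isinstance(schema, dict):  # primitive type name such as "string"
        return schema
    out = {}
    for key, value in schema.items():
        if key == "fields" and schema.get("type") == "record":
            # Recurse into each field's type first, then lowercase its name.
            out[key] = [
                {**lowercase_field_names(field), "name": field["name"].lower()}
                for field in value
            ]
        else:
            out[key] = lowercase_field_names(value)
    return out


# Shortened stand-in for the schema in the question.
schema = {
    "name": "MyClass",
    "type": "record",
    "fields": [
        {
            "name": "Pets",
            "type": {
                "type": "array",
                "items": {
                    "name": "Pets_record",
                    "type": "record",
                    "fields": [{"name": "Name", "type": "string"}],
                },
            },
        },
        {"name": "jobID", "type": "string"},
    ],
}

fixed = lowercase_field_names(schema)
print([field["name"] for field in fixed["fields"]])  # ['pets', 'jobid']
```

Renaming fields changes the schema, so the producer must emit the lowercase keys as well. If existing data has to stay readable, Avro field `aliases` may be worth looking at before renaming.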