如何处理Hive表中的多行记录

qgelzfjb 于 2022-11-05 发布在 Hive

关注(0)|答案(1)|浏览(162)

Json File :

{ 
"buyer": { 
"legalBusinessName": "test1 Company","organisationIdentifications": [{ "type": "abcd",
"identification": "test.bb@tesr"
}, 
{ 
"type": "TXID","identification": "12345678"
}
]
},
"supplier": {
"legalBusinessName": "test Company",
"organisationIdentifications": [
{
"type":"abcd","identification": "test28@test"
}
]
},
"paymentRecommendationId": "1234-5678-9876-2212-123456",
"excludedRemittanceInformation": [],
"recommendedPaymentInstructions": [{
"executionDate": "2022-06-12",
"paymentMethod": "aaaa",
"remittanceInformation": {
"structured": [{
"referredDocumentInformation": [{
"type": "xxx",
"number": "12341234",
"relatedDate": "2022-06-12",
"paymentDueDate": "2022-06-12",
"referredDocumentAmount": {
"remittedAmount": 2600.5,
"duePayableAmount": 3000
}
}]
}]
}
}]
}

创建表语句：

CREATE EXTERNAL TABLE IF NOT EXISTS `test`.`test_rahul` 
(`buyer` STRUCT< `legalBusinessName`:STRING, `organisationIdentifications`:STRUCT<  `type`:STRING, `identification`:STRING>>, 
`supplier` STRUCT< `legalBusinessName`:STRING, `organisationIdentifications`:STRUCT<    `type`:STRING, `identification`:STRING>>,
`paymentRecommendationId` STRING, `recommendedPaymentInstructions` ARRAY< STRUCT<     `executionDate`:STRING, `paymentMethod`:STRING, 
`remittanceInformation`:STRUCT< `structured`:STRUCT<     `referredDocumentInformation`:STRUCT< `type`:STRING, 
`number`:STRING, `relatedDate`:STRING, `paymentDueDate`:STRING,     `referredDocumentAmount`:STRUCT< `remittedAmount`:DOUBLE, 
`duePayableAmount`:INT>>>>>>) 
ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'  
WITH SERDEPROPERTIES ( "field.delim"=",","mapping.ts" = "number") 
STORED AS textFILE LOCATION '/user/hdfs/Jsontest/';

如果我在单行中为每个记录写入Jsonfile数据，那么它工作得很好，但是如果它在多行中，那么会出现下面的错误。
错误消息：
错误：java.io。IO异常：org.apache.hadoop.hive.serde2.SerDeException:行不是有效的JSON对象-JSON异常：JSONObject文本在2 [第1行第3个字符]处必须以'}'结尾（状态=，代码=0）
可以有人善意地建议。看起来我需要添加行/字段分隔符，但不能决定我应该添加什么，以便它可以处理多行也一样的Spark。即。oprtion（多行，真）

Hive

来源：https://stackoverflow.com/questions/71706016/how-to-handle-multiline-record-in-hive-table

1条答案

按热度按时间

qltillow1#

看起来Hive中的JSON serde不支持多行。您可能需要将JSON扁平化为单行，如下所示。

{ "buyer": { "a": "1", "b": "2" }, "c": "3" }
{ "buyer": { "a": "1", "b": "2" }, "c": "3" }
{ "buyer": { "a": "1", "b": "2" }, "c": "3" }
...

赞(0）回复(0）举报 2022-11-05

我来回答

如何处理Hive表中的多行记录

1条答案

相关问题

热门标签

最新问答