在unix中使用jq工具将多级json解析为csv

g6baxovj  于 2023-11-20  发布在  Unix
关注(0)|答案(3)|浏览(160)

我有一个多层次的JSON,如下所示:

{
    "id": "id123",
    "details": {
        "prod": "prod123",
        "etype": "type1"
    },
    "accounts": [
        {
            "bankName": "bank123",
            "accountType": "account123",
            "openingBalance": "bal123",
            "fromDate": "2023-01-01",
            "toDate": "2023-01-01",
            "missingMonths": [],
            "transactions": [
                {
                    "dateTime": "2020-12-01",
                    "description": "a very long string",
                    "amount": -599.0,
                    "bal": 8154.83,
                    "type": "Debit"
                },
                {
                    "dateTime": "2020-12-01",
                    "description": "a very long string; a very long string",
                    "amount": -4000.0,
                    "balanceAfterTransaction": 4154.83,
                    "type": "Debit"
                }
            ]
        }
    ],
    "accountid": "sample123"
}

字符串
我想将其解析为以下格式的csv:

id,prod,etype,bankName,accountType,openingBalance,fromDate,toDate,dateTime,description,amount,bal,type,accountid
id123,prod123,type1,bank123,account123,bal123,2023-01-01,2023-01-01,2020-12-01,a very long string,-599.0,8154.83,Debit,sample123
id123,prod123,type1,bank123,account123,bal123,2023-01-01,2023-01-01,2020-12-01,a very long string; a very long string,-4000.0,4157.83,Debit,sample123


将事务解析为适当的csv,并重复每个事务的公共信息。
到目前为止,我只得到了这么多:
第一个月

6ju8rftf

6ju8rftf1#

这将帮助您获得所需的CSV格式:

jq -r '.[] | "\(.id),\(.details.prod),\(.details.etype),\(.accounts[].bankName)"'

字符串
你可以在它的基础上进一步构建以获得所有想要的值。
注意:对于有效的CSV,您可能需要在某些值(如字符串)周围添加双引号
要解析多个JSON条目,它们应该在一个数组中:

[
{
    "id": "id123",
    "details": {
        "prod": "prod123",
        "etype": "type1"
    },
    "accounts": [
        {
.....
        }
     ]
},
{
    "id": "id124",
    "details": {
        "prod": "prod124",
        "etype": "type1"
    },
    "accounts": [
        {
.....
        }
     ]
}
]

sdnqo3pr

sdnqo3pr2#

您可以将外部数据值保存在变量中,并逐步向下到叶节点:

jq -r '"id,prod,etype,bankName,accountType,openingBalance,fromDate,toDate,dateTime,description,amount,bal,type,accountid",
    ( . as {$id,$accountid}
    | .details as {$prod, $etype}
    | .accounts[]
    | . as { $bankName,$accountType,$openingBalance,$fromDate,$toDate }
    | .transactions[]
    | [$id,$prod,$etype,$bankName,$accountType,$openingBalance,$fromDate,$toDate,.dateTime,.description,.amount,.bal? // .balanceAfterTransaction,.type,$accountid]
    | join(","))' input.json

字符串

sdnqo3pr

sdnqo3pr3#

下面的解决方案使用了jq的@csv过滤器来提高健壮性,并通过使用也可以用于生成标题行的键值数组来避免冗余。fiddle with .bal使用了jq的//=赋值操作符,以避免干扰程序的主要功能。

["id","prod","etype"] as $idHdrs
| ["bankName", "accountType", "openingBalance", "fromDate", "toDate"] as $accountHdrs
| ["dateTime", "description", "amount", "bal", "type"] as $txHdrs
| $idHdrs + $accountHdrs + $txHdrs + ["accountid"],   # the header row
  ( . as {$id,$accountid}
    | .details as {$prod, $etype}
    | .accounts[]
    | [.[$accountHdrs[]]] as $accounts
    | .transactions[]
    | .bal //= .balanceAfterTransaction  # fiddle
    | [.[$txHdrs[]]] as $transactions
    | [$id,$prod,$etype] + $accounts + $transactions + [$accountid] )
| @csv

字符串
输出量:

"id","prod","etype","bankName","accountType","openingBalance","fromDate","toDate","dateTime","description","amount","bal","type","accountid"
"id123","prod123","type1","bank123","account123","bal123","2023-01-01","2023-01-01","2020-12-01","a very long string",-599,8154.83,"Debit","sample123"
"id123","prod123","type1","bank123","account123","bal123","2023-01-01","2023-01-01","2020-12-01","a very long string; a very long string",-4000,4154.83,"Debit","sample123"

相关问题