你能帮我,通过python,json_normalize,把2个数组的json打包吗?
下面是代码:
import json
from pandas.io.json import json_normalize
data5 = {
"id": "0001",
"type": "donut",
"name": "Cake",
"ppu": 0.55,
"batters":
{
"batter":
[
{ "id": "1001", "type": "Regular" },
{ "id": "1002", "type": "Chocolate" },
{ "id": "1003", "type": "Blueberry" },
{ "id": "1004", "type": "Devil's Food" }
]
},
"topping":
[
{ "id": "5001", "type": "None" },
{ "id": "5002", "type": "Glazed" },
{ "id": "5005", "type": "Sugar" },
{ "id": "5007", "type": "Powdered Sugar" },
{ "id": "5006", "type": "Chocolate with Sprinkles" },
{ "id": "5003", "type": "Chocolate" },
{ "id": "5004", "type": "Maple" }
]
}
df2 = json_normalize(data5
, record_path = ['topping']
, meta = ['id', 'type', 'name', 'ppu', 'batters']
, record_prefix='_'
, errors='ignore'
)
这个包"topping"对象,但不包"batters"。要包"batters"可以应用代码:
# parce the part of json string into another dataframe
df3 = json_normalize(data5
,record_path = ['batters', 'batter'])
# cross join 2 dataframes
df2['key_'] = 1
df3['key_'] = 1
result = pd.merge(df2, df3, on ='key_').drop("key_", 1)
但这看起来很复杂。是否可以在一个查询中组合上述两个步骤?例如:
df2 = json_normalize(data5
, record_path = ['topping', ['batters', 'batter']]
, meta = ['id', 'type', 'name', 'ppu', ]
, record_prefix='_'
, errors='ignore'
)
谢谢你。
1条答案
按热度按时间2ic8powd1#
我不认为你可以在
json_normalize
中指定它,但是,你可以通过在pd.merge
中指定how="cross"
来避免创建key_
列(也不需要在df2
中保留batters
):