pandas: How do I reshape a Python dictionary with various list structures?

Posted by qf9go6mv 12 months ago in Python


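I have a sample object from a dictionary, and I am trying to flatten the data so that each key value becomes its own "column" in a pandas DataFrame. As you can see, there are multiple scenarios that need to be handled. I would like each id key inside supplementalFieldValues to become its own column in the DataFrame, with the corresponding value of the 'value' key assigned to that column.
I tried using .json_normalize(), which gets me close (if I exclude supplementalFieldValues, the frame is built the way I need), but I am having trouble converting the key values in the supplementalFieldValues list into their own fields at the same time. Alternatively, it may be an option to take the original list without supplementalFieldValues, reshape supplementalFieldValues separately, and then join it back onto the original list.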

{
        "classification": [
            {
                "classificationId": "OperatingExpense",
                "taxonomyId": "accounting.gp"
            }
        ],
        "supplementalFieldValues": [
            {
                "id": "Account Class",
                "value": "Expense"
            },
            {
                "id": "Account Type",
                "value": "Expense"
            },
            {
                "id": "Account Subtype",
                "value": "PayrollExpenses"
            }
        ],
        "id": "182",
        "name": "Payroll - Admin",
        "userAssignedCode": "60100"
    },

My desired output looks like this:

[Image: the desired DataFrame; not preserved here]

Source: https://stackoverflow.com/questions/77455017/how-do-i-reshape-python-dictionary-with-various-list-structures

2 Answers


c9x0cxw0 #1

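I think one possible solution is to follow these steps:
1. Flatten the dictionary to create a single-level dictionary.
2. Create a DataFrame from the flattened dictionary.
The code is as follows: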

import pandas as pd

# Your sample data
data = {
    "classification": [
        {
            "classificationId": "OperatingExpense",
            "taxonomyId": "accounting.gp"
        }
    ],
    "supplementalFieldValues": [
        {
            "id": "Account Class",
            "value": "Expense"
        },
        {
            "id": "Account Type",
            "value": "Expense"
        },
        {
            "id": "Account Subtype",
            "value": "PayrollExpenses"
        }
    ],
    "id": "182",
    "name": "Payroll - Admin",
    "userAssignedCode": "60100"
}

# Function to flatten the dictionary
def flatten_dict(d, parent_key='', sep='_'):
    items = {}
    for key, value in d.items():
        new_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            items.update(flatten_dict(value, new_key, sep))
        else:
            items[new_key] = value
    return items

flattened_data = flatten_dict(data)

# Create a DataFrame from the flattened dictionary
df = pd.DataFrame([flattened_data])

# Display the DataFrame
print(df)

In any case, you should attach the code you are using so we can see what is not working.
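One caveat with the code above: flatten_dict only recurses into nested dictionaries, so the list-valued keys (classification and supplementalFieldValues) end up as whole lists inside a single cell. Below is a minimal sketch of one way to handle them. It assumes that any list item whose keys are exactly id and value should become a key named after its id, and that every other list item gets a positional suffix. Both rules are my assumptions rather than part of the original answer, and flatten_record is a hypothetical helper name.

```python
def flatten_record(record, parent_key="", sep="_"):
    """Flatten one record; id/value pairs found in lists become their own keys."""
    flat = {}
    for key, value in record.items():
        new_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten_record(value, new_key, sep))
        elif isinstance(value, list):
            for pos, item in enumerate(value):
                if isinstance(item, dict) and set(item) == {"id", "value"}:
                    # Assumption: the "id" string is used as the new key name.
                    flat[item["id"]] = item["value"]
                elif isinstance(item, dict):
                    # Assumption: other dicts are flattened with a positional prefix.
                    flat.update(flatten_record(item, f"{new_key}{sep}{pos}", sep))
                else:
                    flat[f"{new_key}{sep}{pos}"] = item
        else:
            flat[new_key] = value
    return flat


# Sample data copied from the question
data = {
    "classification": [
        {"classificationId": "OperatingExpense", "taxonomyId": "accounting.gp"}
    ],
    "supplementalFieldValues": [
        {"id": "Account Class", "value": "Expense"},
        {"id": "Account Type", "value": "Expense"},
        {"id": "Account Subtype", "value": "PayrollExpenses"},
    ],
    "id": "182",
    "name": "Payroll - Admin",
    "userAssignedCode": "60100",
}

flat = flatten_record(data)
print(flat)
```

Passing [flat] to pd.DataFrame would then give a single row with one column per key. Note that an id string that collides with an existing key would silently overwrite it, so this only suits data where the id values are distinct from the other field names.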

Answered 12 months ago

amrnrhlw #2

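My solution was to follow through on the original idea in my question. In a pandas object, I popped the items out of 'supplementalFieldValues' and stored them in their own list (supp_list). This made it easier to loop over that list using defaultdict, while using the in-place OR operator (|=) to update the dictionaries so that the id values become keys.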

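from collections import defaultdict

# Group the entries by 'i' (presumably the index of the row each entry came from)
# and turn every id/value pair into its own key.
res = defaultdict(dict)
for d in supp_list:
    res[d['i']] |= {'i': d['i'], d['id']: d.get('value')}  # dict |= needs Python 3.9+

supplemental_list = [*res.values()]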
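To make the loop above concrete, here is a small self-contained sketch. It assumes each entry in supp_list carries an 'i' key holding the index of the record it came from. The answer does not show how supp_list was built, so the input below is hypothetical, and the second record's value is made up for illustration.

```python
from collections import defaultdict

# Hypothetical input: "i" is assumed to be the row index of the parent record.
supp_list = [
    {"i": 0, "id": "Account Class", "value": "Expense"},
    {"i": 0, "id": "Account Type", "value": "Expense"},
    {"i": 0, "id": "Account Subtype", "value": "PayrollExpenses"},
    {"i": 1, "id": "Account Class", "value": "Asset"},  # made-up second record
]

res = defaultdict(dict)
for d in supp_list:
    # Merge each id/value pair into the dict that belongs to row d["i"].
    res[d["i"]] |= {"i": d["i"], d["id"]: d.get("value")}

# One dict per parent row, ready to be passed to pd.DataFrame
supplemental_list = [*res.values()]
print(supplemental_list)
```

Each resulting dict has one key per id seen for that row, plus 'i' itself, which is what later allows the result to be indexed and joined back.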

Now that I had the new supplemental_list, I simply joined it back onto the original pandas object (flat_list), from which supplementalFieldValues had been popped.

flat_data = flat_list.join(pd.DataFrame(supplemental_list).set_index('i')).reset_index(drop=True)
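Putting the pieces together, the following is a sketch of the whole approach under some assumptions. The input is taken to be a list of records shaped like the sample in the question (the second record is invented for illustration), pd.json_normalize is used to build flat_list from the remaining fields, and the position of each record in the list plays the role of 'i'. The answer does not show how flat_list or supp_list were actually created, so that part is a guess.

```python
from collections import defaultdict

import pandas as pd

records = [
    {
        "classification": [
            {"classificationId": "OperatingExpense", "taxonomyId": "accounting.gp"}
        ],
        "supplementalFieldValues": [
            {"id": "Account Class", "value": "Expense"},
            {"id": "Account Type", "value": "Expense"},
            {"id": "Account Subtype", "value": "PayrollExpenses"},
        ],
        "id": "182",
        "name": "Payroll - Admin",
        "userAssignedCode": "60100",
    },
    {
        # Invented second record, only here to show how rows line up.
        "classification": [],
        "supplementalFieldValues": [{"id": "Account Class", "value": "Asset"}],
        "id": "183",
        "name": "Cash",
        "userAssignedCode": "10100",
    },
]

# Pop the supplemental values out of each record (this mutates `records`),
# tagging every entry with the position of its parent record.
supp_list = []
for i, rec in enumerate(records):
    for item in rec.pop("supplementalFieldValues", []):
        supp_list.append({"i": i, **item})

# What remains is normalized as usual; list-valued fields stay as lists.
flat_list = pd.json_normalize(records)

# Turn each id/value pair into a key, grouped by parent row.
res = defaultdict(dict)
for d in supp_list:
    res[d["i"]] |= {"i": d["i"], d["id"]: d.get("value")}
supplemental_list = [*res.values()]

# Join on the row index: 'i' matches the default RangeIndex of flat_list.
flat_data = flat_list.join(
    pd.DataFrame(supplemental_list).set_index("i")
).reset_index(drop=True)
print(flat_data)
```

The join works because flat_list keeps its default integer index, which matches the 'i' values. Rows that lack a given supplemental id simply get NaN in that column.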


Answered 12 months ago
