pandas: how to loop over a pivot table to build a list of dictionaries, taking the index and value from each row of each column

gmxoilav  posted on 2023-09-29 in Other

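I have the table below, and I am trying to take the value and the device category from every row of each column, so that I end up with data shaped like this: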

series: [{
        name: 'engaged_sessions',
        data: [{
            name: 'Desktop',
            y: 7765,
        }, {
            name: 'Mobile',
            y: 388
        },...
        name: 'event_count',
        data: [{
            name: 'Desktop',
            y: 51325,
        }, {
            name: 'Mobile',
            y: 4349
        },...

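Basically, I want to iterate over each column and put the device category and the value into a list of dictionaries. This is the pivot table: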

                 engaged_sessions  event_count  new_users  total_revenue  total_users
device_category
Desktop                      7765        51325       6593              9         8021
Mobile                        388         4349        795              0          412
Smart Tv                        2           38          1            250            9
Tablet                         87          111         37              0           97
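
To try the answers below, the table can be rebuilt by hand. This is only a sketch: the question does not show how the pivot was produced, so the constructor call here is an assumption, and only the numbers and labels come from the table above.

```python
import pandas as pd

# Values and labels copied from the pivot table in the question.
# Building it with a plain DataFrame constructor is an assumption;
# the original pivot_table call is not shown.
df = pd.DataFrame(
    {
        "engaged_sessions": [7765, 388, 2, 87],
        "event_count": [51325, 4349, 38, 111],
        "new_users": [6593, 795, 1, 37],
        "total_revenue": [9, 0, 250, 0],
        "total_users": [8021, 412, 9, 97],
    },
    index=pd.Index(
        ["Desktop", "Mobile", "Smart Tv", "Tablet"], name="device_category"
    ),
)
print(df)
```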

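I tried using a for loop and appending each iteration to a list, but the result was not quite right. The closest I have gotten is the to_dict() method, which I think is the best option so far.
The question here (Pandas to_dict data structure, using column as dictionary index) is very similar, but I am trying to group by each column. If I use groupby(df.columns) or groupby(['column'],['column']), I get objects that contain the numbers but nothing that says what they refer to.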

c90pui9n  #1

If df contains the pivoted dataframe from the question, you can do the following:

out = []
for c in df:
    out.append(
        {"name": c, "data": [{"name": k, "y": v} for k, v in df[c].to_dict().items()]}
    )
print(out)

Prints:

[
    {
        "name": "engaged_sessions",
        "data": [
            {"name": "Desktop", "y": 7765},
            {"name": "Mobile", "y": 388},
            {"name": "Smart Tv", "y": 2},
            {"name": "Tablet", "y": 87},
        ],
    },
    {
        "name": "event_count",
        "data": [
            {"name": "Desktop", "y": 51325},
            {"name": "Mobile", "y": 4349},
            {"name": "Smart Tv", "y": 38},
            {"name": "Tablet", "y": 111},
        ],
    },
    {
        "name": "new_users",
        "data": [
            {"name": "Desktop", "y": 6593},
            {"name": "Mobile", "y": 795},
            {"name": "Smart Tv", "y": 1},
            {"name": "Tablet", "y": 37},
        ],
    },
    {
        "name": "total_revenue",
        "data": [
            {"name": "Desktop", "y": 9},
            {"name": "Mobile", "y": 0},
            {"name": "Smart Tv", "y": 250},
            {"name": "Tablet", "y": 0},
        ],
    },
    {
        "name": "total_users",
        "data": [
            {"name": "Desktop", "y": 8021},
            {"name": "Mobile", "y": 412},
            {"name": "Smart Tv", "y": 9},
            {"name": "Tablet", "y": 97},
        ],
    },
]
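
Since the question mentions to_dict() as the closest attempt, the same result can probably also be reached from df.to_dict() directly, which returns a nested {column: {index_label: value}} mapping. The following is a sketch of that variant, not the answerer's code; the two-column df is a hypothetical stand-in for the full pivot table.

```python
import pandas as pd

# Hypothetical two-column stand-in for the pivot table in the question.
df = pd.DataFrame(
    {"engaged_sessions": [7765, 388], "event_count": [51325, 4349]},
    index=pd.Index(["Desktop", "Mobile"], name="device_category"),
)

# df.to_dict() yields {column: {index_label: value}}, so a nested
# comprehension can reshape it into the requested list of dictionaries.
out = [
    {"name": col, "data": [{"name": k, "y": v} for k, v in values.items()]}
    for col, values in df.to_dict().items()
]
print(out)
```
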

ubof19bj  #2

Assuming device_category is the index, you can try:

out = [
    {"name": c, "data": [
        {"name": i, "y": v} for i, v in zip(df.index, r)]}
    for c, r in zip(df.columns, df.T.to_numpy())
]

Output:

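import json

# default=str is needed because the values here are NumPy integers, which
# json cannot serialize on its own; as a result they are written as strings.
print(json.dumps(out, indent=4, default=str))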

[
    {
        "name": "engaged_sessions",
        "data": [
            {
                "name": "Desktop",
                "y": "7765"
            },
            {
                "name": "Mobile",
                "y": "388"
            },
            {
                "name": "Smart Tv",
                "y": "2"
            },
            {
                "name": "Tablet",
                "y": "87"
            }
        ]
    },
    {
        "name": "event_count",
        "data": [
            {
                "name": "Desktop",
                "y": "51325"
...
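
Note that in the output above every "y" is a quoted string, because default=str is applied to each NumPy integer. If numeric values are wanted in the JSON, one option is to convert each column with tolist() first. This is a sketch of that variation rather than the answerer's code, and the two-column df is a hypothetical stand-in for the full pivot table.

```python
import json

import pandas as pd

# Hypothetical two-column stand-in for the pivot table in the question.
df = pd.DataFrame(
    {"engaged_sessions": [7765, 388], "event_count": [51325, 4349]},
    index=pd.Index(["Desktop", "Mobile"], name="device_category"),
)

# Series.tolist() converts NumPy integers to plain Python ints, which
# json.dumps can serialize without a default= fallback.
out = [
    {
        "name": c,
        "data": [{"name": i, "y": v} for i, v in zip(df.index, df[c].tolist())],
    }
    for c in df.columns
]
text = json.dumps(out, indent=4)
print(text)
```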
