pandas: get the dict value in every column of a DataFrame

Posted by zu0ti5jz on 2022-11-20 in Other


Input data:

data = [
    ['0039384', [{'A': 415}, {'A': 228}, {'B': 360}, {'B': 198}, {'C': 300}, {'C': 165}]],
    ['0035584', [{'A': 345}, {'A': 117}, {'B': 223}, {'B': 554}, {'C': 443}, {'C': 143}]]
]

df = pd.DataFrame(data=data, columns=['id', 'prices'])

I want to get a result like this:

id       CurrentPrice_A  LastPrice_A  CurrentPrice_B  LastPrice_B  CurrentPrice_C  LastPrice_C
0039384  415             228          360             198          300             165

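I have already tried splitting the dicts and then replacing and renaming each column to get the prices, but that takes about 10 lines of code. Do you know a shorter and faster way to do this?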

pandas

Source: https://stackoverflow.com/questions/74435173/get-the-value-of-dict-in-every-column-in-dataframe

2 Answers


y1aodyip #1

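It is convenient to iterate over each row of the DataFrame: that way you control the algorithm, can zip the dicts two by two (pairing each current price with its last price), and can assign the column names and their values dynamically.
For convenience, you can collect the rows in a list of temporary dicts instead of using pd.concat().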

import pandas as pd

data = [
    ['0039384', [{'A': 415}, {'A': 228}, {'B': 360}, {'B': 198}, {'C': 300}, {'C': 165}]],
    ['0035584', [{'A': 345}, {'A': 117}, {'B': 223}, {'B': 554}, {'C': 443}, {'C': 143}]]
]

df = pd.DataFrame(data=data, columns=['id', 'prices'])

new_df_rows = []

for index, row in df.iterrows():

    grouped_prices = zip(row.prices[::2], row.prices[1::2])  # create groups two-by-two
    tmp_dict = {'id': row.id}
    for curr_price, last_price in grouped_prices:
        tmp_dict.update({
            'CurrentPrice_' + str(list(curr_price.keys())[0]): int(list(curr_price.values())[0]),
            'LastPrice_' + str(list(last_price.keys())[0]): int(list(last_price.values())[0])
        })
    new_df_rows.append(tmp_dict)

new_df = pd.DataFrame(new_df_rows)
print(new_df)

The output will be:

        id  CurrentPrice_A  LastPrice_A  CurrentPrice_B  LastPrice_B  CurrentPrice_C  LastPrice_C
0  0039384             415          228             360          198             300          165
1  0035584             345          117             223          554             443          143
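
The pairing step can also be looked at in isolation on a plain list, without pandas. The following is only a minimal sketch: `flatten_prices` is a hypothetical helper name (not part of the answer above), and it assumes that every dict holds exactly one key and that the list alternates current price, last price.

```python
def flatten_prices(prices):
    """Flatten [{'A': cur}, {'A': last}, ...] into one dict of named prices.

    Assumption: each dict has exactly one key, and entries alternate
    current price, last price.
    """
    flat = {}
    # prices[::2] holds the current prices, prices[1::2] the last prices
    for current, last in zip(prices[::2], prices[1::2]):
        # Unpack the single (key, value) pair; raises ValueError otherwise
        (cur_key, cur_val), = current.items()
        (last_key, last_val), = last.items()
        flat[f"CurrentPrice_{cur_key}"] = cur_val
        flat[f"LastPrice_{last_key}"] = last_val
    return flat


row = flatten_prices([{'A': 415}, {'A': 228}, {'B': 360}, {'B': 198}])
print(row)
# {'CurrentPrice_A': 415, 'LastPrice_A': 228, 'CurrentPrice_B': 360, 'LastPrice_B': 198}
```

Under the same assumptions, a list of such dicts (each with the row's id added) could be passed to pd.DataFrame, as in the loop above.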

2022-11-20

knpiaxh1 #2

First, expand the list in each row into new columns:

dfx = pd.DataFrame(df['prices'].tolist(),index=df.id)
print(dfx)
'''
                  0           1           2           3           4           5
id                                                                             
0039384  {'A': 415}  {'A': 228}  {'B': 360}  {'B': 198}  {'C': 300}  {'C': 165}
0035584  {'A': 345}  {'A': 117}  {'B': 223}  {'B': 554}  {'C': 443}  {'C': 143}
'''

Then split these columns into odd and even positions: the odd columns hold the last price and the even columns hold the current price:

last=list(filter(lambda x: x % 2, list(dfx.columns))) #[1, 3, 5]
currents=list(sorted(set(dfx.columns) - set(last))) #[0, 2, 4]
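
As a possible alternative, the same split can be obtained by slicing the column index. This is a sketch that assumes the columns carry the default integer labels 0..n-1, as they do right after expanding the lists; the small two-pair frame is made up for illustration.

```python
import pandas as pd

# Illustrative frame with default integer column labels 0..3
dfx = pd.DataFrame([[{'A': 415}, {'A': 228}, {'B': 360}, {'B': 198}]])

currents = list(dfx.columns[::2])   # even positions: current prices
last = list(dfx.columns[1::2])      # odd positions: last prices
print(currents, last)
```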

Now rename the columns:

for i in currents:
    dfx=dfx.rename(columns={i:'CurrentPrice_{}'.format(list(dfx[i].iloc[0].keys())[0])})

for i in last:
    dfx=dfx.rename(columns={i:'LastPrice_{}'.format(list(dfx[i].iloc[0].keys())[0])})
print(dfx)
'''
id      CurrentPrice_A   LastPrice_A    CurrentPrice_B  LastPrice_B  CurrentPrice_C  LastPrice_C
0039384 {'A': 415}       {'A': 228}     {'B': 360}      {'B': 198}   {'C': 300}  {'C': 165}
0035584 {'A': 345}       {'A': 117}     {'B': 223}      {'B': 554}   {'C': 443}  {'C': 143}

'''

Finally, extract the values from the dicts:

for i in dfx.columns:
    dfx[i]=dfx[i].apply(lambda x: list(x.values())[0])

print(dfx)
'''
id      CurrentPrice_A  LastPrice_A CurrentPrice_B  LastPrice_B CurrentPrice_C  LastPrice_C
0039384 415             228         360             198         300             165
0035584 345             117         223             554         443             143

'''
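
The steps above can also be condensed into fewer statements. The version below is a sketch rather than a definitive implementation: like the steps above, it takes the column names from the dict keys in the first row, so it assumes every row lists the same keys in the same order. The variable names `names` and `result` are chosen here for illustration.

```python
import pandas as pd

data = [
    ['0039384', [{'A': 415}, {'A': 228}, {'B': 360}, {'B': 198}, {'C': 300}, {'C': 165}]],
    ['0035584', [{'A': 345}, {'A': 117}, {'B': 223}, {'B': 554}, {'C': 443}, {'C': 143}]]
]
df = pd.DataFrame(data=data, columns=['id', 'prices'])

# One column per list element, indexed by id
dfx = pd.DataFrame(df['prices'].tolist(), index=df['id'])

# Even positions are current prices, odd positions are last prices;
# the suffix is the single key of the dict found in the first row
names = {
    col: ('LastPrice_' if pos % 2 else 'CurrentPrice_') + next(iter(dfx[col].iloc[0]))
    for pos, col in enumerate(dfx.columns)
}

# Replace every single-key dict with its value, then rename the columns
result = dfx.apply(lambda s: s.map(lambda d: next(iter(d.values())))).rename(columns=names)
print(result)
```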

2022-11-20
