pandas: Filling rows based on column partitions and conditions in Python

Posted by qrjkbowd on 2023-01-01 in Python


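My goal is to iterate over a list of possible B values so that, for each ID (column A), a new row with C = 0 is added for every possible B value that does not already exist for that ID in the DataFrame.
I have a DataFrame: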

     A  B   C
0  id1  2  10
1  id1  3  20
2  id2  1  30

Possible B values = [1, 2, 3]
Resulting in:

     A  B   C
0  id1  1   0
1  id1  2  10
2  id1  3  20
3  id2  1  30
4  id2  2   0
5  id2  3   0

Thanks in advance!

pandas

Source: https://stackoverflow.com/questions/74942990/filling-rows-based-on-column-partitions-conditions-in-python

3 Answers


aelbi1ox1#

Using some indexing tricks:

import pandas as pd

df = pd.read_clipboard() # Your df here
possible_B_values = [1, 2, 3]

extrapolate_columns = ["A", "B"]
index = pd.MultiIndex.from_product(
    [df["A"].unique(), possible_B_values],
    names=extrapolate_columns
)

out = df.set_index(extrapolate_columns).reindex(index, fill_value=0).reset_index()

Output:

     A  B   C
0  id1  1   0
1  id1  2  10
2  id1  3  20
3  id2  1  30
4  id2  2   0
5  id2  3   0
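
Since the snippet above reads its input from the clipboard, here is a self-contained sketch of the same reindexing idea with the question's DataFrame built inline. It assumes each (A, B) pair occurs at most once in the input, because reindex raises an error on a duplicated index; the name full_index is only illustrative.

```python
import pandas as pd

# The question's input, built inline so the snippet runs without a clipboard.
df = pd.DataFrame({"A": ["id1", "id1", "id2"], "B": [2, 3, 1], "C": [10, 20, 30]})
possible_B_values = [1, 2, 3]

# Every (A, B) pair that should exist in the result.
full_index = pd.MultiIndex.from_product(
    [df["A"].unique(), possible_B_values], names=["A", "B"]
)

# Pairs missing from df are created with fill_value in every remaining column (here only C).
out = df.set_index(["A", "B"]).reindex(full_index, fill_value=0).reset_index()
print(out)
```

Note that fill_value applies to all non-key columns, so with more columns than C you may want a different fill per column.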

2023-01-01

alen0pnh2#

Maybe you can create a DataFrame from a list of tuples holding the possible B values and merge it with the original one:

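import pandas as pd

# df is the original DataFrame from the question

# Create a list of (A, B) tuples covering every possible B value for each ID
possible_b_values = [1, 2, 3]
possible_b_rows = [(a, b) for a in df['A'].unique() for b in possible_b_values]

# Create a new DataFrame from the list of tuples.
# It holds only the key columns: a 'C' column here would clash with df's 'C'
# and the merge would produce 'C_x' and 'C_y' instead of 'C'.
possible_b_df = pd.DataFrame(possible_b_rows, columns=['A', 'B'])

# Merge the new DataFrame with the original one, using the 'A' and 'B' columns as the keys
df = df.merge(possible_b_df, on=['A', 'B'], how='outer')

# Fill any null values in the 'C' column with 0 and restore the integer dtype
df['C'] = df['C'].fillna(0).astype(int)

# Order the rows by ID and B value
df = df.sort_values(['A', 'B'], ignore_index=True)

print(df)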
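
As a variation on the merge idea, the grid of (A, B) combinations can also be built with a cross merge instead of a list comprehension. This is only a sketch: it assumes pandas 1.2 or later (where how="cross" is available), and the names ids, b_values, grid and merged are illustrative. The grid contains only the key columns, so the second merge keeps a single C column.

```python
import pandas as pd

df = pd.DataFrame({"A": ["id1", "id1", "id2"], "B": [2, 3, 1], "C": [10, 20, 30]})
possible_b_values = [1, 2, 3]

# One row per ID and one row per possible B value.
ids = pd.DataFrame({"A": df["A"].unique()})
b_values = pd.DataFrame({"B": possible_b_values})

# Cartesian product of IDs and B values (assumes pandas >= 1.2).
grid = ids.merge(b_values, how="cross")

# A left merge keeps the grid's row order; pairs absent from df get NaN in C.
merged = grid.merge(df, on=["A", "B"], how="left")

# Replace the NaN values with 0 and go back to integers.
merged["C"] = merged["C"].fillna(0).astype(int)
print(merged)
```

Unlike an outer merge, the left merge drops any rows of df whose B value is not in the list of possible values, which may or may not be what you want.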

2023-01-01

1hdlvixo3#

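Here is a simple pandas approach to solve this:
1. Set the index to B (this helps with reindexing later).
2. Group by column A, then use apply on column C with the lambda function below to reindex B.
The lambda function x.reindex(range(1,4), fill_value=0) takes the C values x of each group (one group per id), reindexes them to range(1,4), i.e. 1, 2, 3, and fills the newly created entries with 0 instead of NaN.
3. Finally, use reset_index to bring A and B back into the DataFrame as columns.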

# Parentheses are used for the chain because a comment cannot follow a backslash line continuation
out = (
    df.set_index('B')                                              # Set index as B
      .groupby(['A'])['C']                                         # Groupby A and use apply on column C
      .apply(lambda x: x.reindex(range(1, 4), fill_value=0))       # Reindex B to range(1,4) for each group and fill 0
      .reset_index()                                               # Reset index
)

print(out)

     A  B   C
0  id1  1   0
1  id1  2  10
2  id1  3  20
3  id2  1  30
4  id2  2   0
5  id2  3   0
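
To see what the lambda does for a single group, the sketch below applies the same reindex call to the C values of id2 only. The DataFrame is the one from the question, built inline; the names group and filled are just for illustration.

```python
import pandas as pd

df = pd.DataFrame({"A": ["id1", "id1", "id2"], "B": [2, 3, 1], "C": [10, 20, 30]})

# C values of the id2 group, indexed by B: a single entry, B=1 -> 30.
group = df[df["A"] == "id2"].set_index("B")["C"]

# Reindexing to 1, 2, 3 keeps B=1 and creates B=2 and B=3 with the fill value 0.
filled = group.reindex(range(1, 4), fill_value=0)
print(filled)
```

groupby(...).apply runs this step once per id and stacks the results, which is why reset_index is needed afterwards to turn A and B back into columns.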

2023-01-01
