Generating a column based on another column in a Pandas DataFrame

iyfjxgzm  posted on 2023-10-14  in Other
import pandas as pd

# Sample data
data = { 'name': ['ray', 'ray', 'ray', 'ray', 'ray', 'ray'], 
            'code': [1, 0, 1, 1, 0, 1] }

# Create a DataFrame
df = pd.DataFrame(data)

# Initialize the 'Period' column
df['Period'] = 0

# Calculate the 'Period' based on the logic
current_period = 0 
for i in range(len(df)): 
    if df.loc[i, 'code'] == 1: 
        current_period += 1 
    else: 
        current_period = 0 
    df.loc[i, 'Period'] = current_period

# Display the DataFrame with the 'Period' column
print(df)

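Above is my code. I want to create the Period column based on the values of the code column. The logic for the Period column is:

  • Row 1 has code=1, so Period is 1,
  • Row 2 has code=0, so Period is 0,
  • Rows 3 and 4 have code=1, so Period is 2 for both rows,
  • Row 5 has code=0, so Period is 0,
  • Row 6 has code=1, so Period is 3, and so on.

Basically, I want to group consecutive rows with code=1 and assign each group its own increasing number.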
Check this for required output and my current output from the code above
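
For reference, here is a minimal sketch that prints the desired values next to what the loop above currently produces. The `expected` list is typed in by hand from the rules listed above, and `current` simply reruns the same counter logic as the loop; both names are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['ray'] * 6,
    'code': [1, 0, 1, 1, 0, 1],
})

# Desired values, written out by hand from the rules in the question
expected = [1, 0, 2, 2, 0, 3]

# Same logic as the loop in the question: the counter grows on every
# row with code == 1 and is reset on every row with code == 0
current = []
current_period = 0
for code in df['code']:
    if code == 1:
        current_period += 1
    else:
        current_period = 0
    current.append(current_period)

comparison = pd.DataFrame({
    'code': df['code'],
    'expected': expected,
    'current': current,
})
print(comparison)
```

The two columns differ on rows 3, 4 and 6: the loop counts rows within a stretch of 1s instead of counting the stretches themselves.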

5lhxktic1#

I would use boolean operations together with shift, cumsum, and where:

# Identify 1s
m1 = df['code'].eq(1)
# Identify first 1 of each stretch
m2 = m1 & ~m1.shift(fill_value=False)

# compute the increment, mask non-1s with 0
df['Period'] = m2.cumsum().where(m1, 0)

Output:

  name  code  Period
0  ray     1       1
1  ray     0       0
2  ray     1       2
3  ray     1       2
4  ray     0       0
5  ray     1       3

Intermediates:

  name  code     m1  ~m1.shift     m2  m2.cumsum  Period
0  ray     1   True       True   True          1       1
1  ray     0  False      False  False          1       0
2  ray     1   True       True   True          2       2
3  ray     1   True      False  False          2       2
4  ray     0  False      False  False          2       0
5  ray     1   True       True   True          3       3
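
If you want to inspect these intermediates yourself, a sketch along the following lines should reproduce the table. The `steps` DataFrame and the `not_prev` variable are names made up for this example; the column labels are just strings chosen to match the table above.

```python
import pandas as pd

df = pd.DataFrame({
    'name': ['ray'] * 6,
    'code': [1, 0, 1, 1, 0, 1],
})

# True where code is 1
m1 = df['code'].eq(1)
# True where the previous row is not a 1 (the first row has no previous row,
# so the shifted value is filled with False and negated to True)
not_prev = ~m1.shift(fill_value=False)
# True only on the first 1 of each stretch
m2 = m1 & not_prev

# Collect every step in one frame for inspection
steps = df.copy()
steps['m1'] = m1
steps['~m1.shift'] = not_prev
steps['m2'] = m2
steps['m2.cumsum'] = m2.cumsum()
steps['Period'] = m2.cumsum().where(m1, 0)

print(steps)
```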

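Fixing your code

You need to keep track of the previous row's value, so that the counter only increases when a new stretch of 1s begins.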

current_period = 0
last_period = 0
for i in range(len(df)): 
    if df.loc[i, 'code'] == 1:
        if last_period == 0:
            current_period += 1
            last_period = 1
        df.loc[i, 'Period'] = current_period
    else: 
        last_period = 0

However, you should avoid explicit loops with pandas; prefer vectorized operations where they are available.
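
As a sanity check, the sketch below wraps both approaches in functions and confirms that they agree. The function names `period_loop` and `period_vectorized`, and the second test input `other`, are made up for this example and are not part of the question.

```python
import pandas as pd

def period_loop(codes):
    """Loop version: number each stretch of consecutive 1s, 0 elsewhere."""
    out = []
    current_period = 0
    last_was_one = False
    for code in codes:
        if code == 1:
            # Only bump the counter on the first 1 of a stretch
            if not last_was_one:
                current_period += 1
            last_was_one = True
            out.append(current_period)
        else:
            last_was_one = False
            out.append(0)
    return pd.Series(out, index=codes.index)

def period_vectorized(codes):
    """Vectorized version using shift, cumsum and where."""
    m1 = codes.eq(1)
    m2 = m1 & ~m1.shift(fill_value=False)
    return m2.cumsum().where(m1, 0)

sample = pd.Series([1, 0, 1, 1, 0, 1])        # data from the question
other = pd.Series([0, 0, 1, 1, 1, 0, 0, 1])   # hypothetical input starting with 0

for s in (sample, other):
    # Compare as plain lists to sidestep integer dtype differences
    assert period_loop(s).tolist() == period_vectorized(s).tolist()

print(period_vectorized(sample).tolist())
print(period_vectorized(other).tolist())
```

On large frames the vectorized version is usually much faster, because the row-by-row work happens inside pandas rather than in a Python loop.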
