pandas 显示索引[1]中第二次出现两个连续零的索引[0]的值

toiithl6 于 12个月前发布在其他

关注(0)|答案(2)|浏览(89)

问题/上下文：我正在尝试编写一段代码，它将读取指定文件夹中的每个CSV文件，在列索引1中查找第二个连续出现的零，并在找到时显示列索引[0]中的相应值。下面是我目前所做的，但它不起作用。它只是说没有带有两个连续零的列。但是，当我打开各个文件时，我可以清楚地看到有两个连续的零的列。我以前问过这个问题，并得到了有效的结果，直到我在自己的 Dataframe 中进行了替换。也许是因为我自己的 Dataframe 既包含数字又包含单词？
数据框架：

我的天|试验编号| 0||未命名：0|文件名|| 0 |0.0|| 1 |1.0|| 2 |0.0|| 3 |1.0|| 4 |1.0|| 5 |0.0|| 6 |1.0|| 7 |1.0|| 8 |1.0|| 9 |1.0|| 10 |0.0|| 11 |0.0|| 12 |1.0|| 13 |0.0|| 14 |1.0|| 15 |1.0|

当前编码：

import os
import pandas as pd

folder_path = "/content/drive/session 1 & 2"

def find_column1_value_for_second_zero(file_path):
    try:
        df = pd.read_csv(file_path)
        consecutive_zeros = 0
        column1_value = None

        for _, row in df.iterrows():
            if row.iloc[1] == 0:
                consecutive_zeros += 1
                if consecutive_zeros == 2:
                    column1_value = row.iloc[0]
                    break
            else:
                consecutive_zeros = 0

        return column1_value
    except Exception as e:
        print(f"Error reading file '{file_path}': {str(e)}")
        return None

for filename in os.listdir(folder_path):
    if filename.endswith(".csv"):  # Assuming your files are CSV format
        file_path = os.path.join(folder_path, filename)
        
        column1_value = find_column1_value_for_second_zero(file_path)
        
        if column1_value is not None:
            print(f"In file '{filename}', the value in column 1 for the second zero in column 2 is: {column1_value}")
        else:
            print(f"In file '{filename}', no second zero in column 2 was found.")

预期效果：获取列[0]中的值，该值与列1中的第二个连续零位于同一行。在上面的dataframe中，它应该返回'11'。
实际结果：每一行返回“在列2中没有找到第二个零”。

pandas

来源：https://stackoverflow.com/questions/77063451/displaying-value-of-index0-for-second-occurrence-of-two-consecutive-zeroes-in

2条答案

按热度按时间

deyfvvtc1#

不要使用复杂的循环，而是使用向量代码。
假设此示例输入：

col1  col2 col3
0      0     0    A
1      1     1    B
2      2     0    C
3      3     1    D
4      4     1    E
5      5     0    F
6      6     0    G  # should be selected
7      7     0    H
8      8     1    I
9      9     0    K
10    10     0    L  # should be selected

使用groupby.cumcount和布尔索引：

N = 2         # nth consecutive 0 to select
col = 'col2'  # target column

grp = df[col].ne(0).cumsum()      # form groups
cnt = df.groupby(grp).cumcount()  # enumerate 0s
out = df[cnt.eq(N)]               # select nth

输出量：

col1  col2 col3
6      6     0    G
10    10     0    L

中间体：

col1  col2 col3  grp  cnt  cnt.eq(N)
0      0     0    A    0    0      False
1      1     1    B    1    0      False
2      2     0    C    1    1      False
3      3     1    D    2    0      False
4      4     1    E    3    0      False
5      5     0    F    3    1      False
6      6     0    G    3    2       True
7      7     0    H    3    3      False
8      8     1    I    4    0      False
9      9     0    K    4    1      False
10    10     0    L    4    2       True

赞(0）回复(0）举报 12个月前

zi8p0yeb2#

另一个基于我认为您的CSV的工作解决方案是：

from pathlib import Path
import csv

folder_path = '.'

def find_column1_value_for_second_zero(file_path):
    with open(file_path, newline='') as f:
        consec = 0
        for row in csv.reader(f):
            try: x, y = row[0], float(row[1])
            except ValueError: continue
            consec = 0 if y != 0 else consec + 1
            if consec == 2: return x

if __name__ == '__main__':
    
    for filename in Path(folder_path).glob('*csv'):
        column1_value = find_column1_value_for_second_zero(filename.resolve())
        if column1_value:
            print(f"In file '{filename}', the value in column 1 for the second zero in column 2 is: {column1_value}")
        else:
            print(f"In file '{filename}', no second zero in column 2 was found.")

赞(0）回复(0）举报 12个月前

我来回答

pandas 显示索引[1]中第二次出现两个连续零的索引[0]的值

2条答案

相关问题

热门标签

最新问答