Python: How to split a Pandas DataFrame based on a regex string

Posted by xxb16uws on 2023-01-29 in Python

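I have questions and results in a CSV file, and I wrote a simple bit of code to turn them into a list of DataFrames for analysis.
But the last one refuses to split out. I think this is because the simple start and end checks can't deal with how each question begins: every question starts with "<Q".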

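def start_and_finish_points(df):
    # Collect the positions of the rows that open and close each question block.
    df_indices_start = []
    df_indices_end = []
    rows = df.iloc[:, 0].to_list()  # first column holds the question text
    for i, row in enumerate(rows):
        # A question block opens on a row beginning with '<Q' ...
        if str(row).startswith('<Q'):
            df_indices_start.append(i)
        # ... and closes on a row ending with '++'.
        if str(row).endswith('++'):
            df_indices_end.append(i)
    return df_indices_start, df_indices_end

# df is the DataFrame loaded from the CSV.
start, finish = start_and_finish_points(df)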

One of the problems is that the code can't deal with "<Q" when whitespace precedes it, as in row 700:

question
    698 <Q8> To what extent are you concerned about of the following.................Climate change
    ... Some data
    700  <Q11e> How often d...

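Is there something I can use to generalise startswith so that it copes with whitespace at the start of the string? I'm sure it's a regex, but I can't see it.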
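One way this could be done, as a minimal sketch rather than a definitive fix: replace the `startswith` check with a compiled pattern that allows optional leading whitespace before `<Q`. The pattern name `QUESTION_START` and the sample rows are hypothetical, and the function is assumed here to take a plain list of cell values (for example `df.iloc[:, 0].to_list()`) instead of the DataFrame itself.

```python
import re

# Optional leading whitespace, then the literal "<Q".
# (QUESTION_START is a hypothetical name chosen for this sketch.)
QUESTION_START = re.compile(r"\s*<Q")


def start_and_finish_points(rows):
    """Return positions of rows that open ('<Q...') or close ('...++') a block."""
    starts, ends = [], []
    for i, row in enumerate(rows):
        text = str(row)
        # Pattern.match() only matches at the beginning of the string,
        # so no explicit '^' anchor is needed.
        if QUESTION_START.match(text):
            starts.append(i)
        if text.endswith("++"):
            ends.append(i)
    return starts, ends


# Hypothetical sample rows, loosely modelled on the data in the question.
sample_rows = [
    "<Q8> To what extent are you concerned ...",
    "Some data",
    "total ++",
    "  <Q11e> How often d...",  # leading spaces defeat plain startswith
    "Some data ++",
]

starts, ends = start_and_finish_points(sample_rows)
print(starts, ends)  # [0, 3] [2, 4]

# With a DataFrame, the call would presumably look like:
# start, finish = start_and_finish_points(df.iloc[:, 0].to_list())
```

If a regex feels like overkill, `str(row).lstrip().startswith('<Q')` should behave the same way for this case. For a vectorised check, `df.iloc[:, 0].astype(str).str.match(r'\s*<Q')` returns a boolean Series marking the start rows.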

python

Source: https://stackoverflow.com/questions/75271094/how-to-split-a-pandas-dataframe-based-on-regex-string