pandas: Adding rows and columns to a pandas DataFrame

myzjeezk  posted on 2023-02-14  in Other

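I have a pandas DataFrame correct_X_test with a column review that contains reviews. I need to add two new columns that contain parts of each review, as follows:
For a row whose review is review = 'x1 x2 x3 x x x xi x x x xn', I need to store sub_review_1_i = 'x1 x2 x3 x x x xi' and sub_review_i_n = 'xi x x x xn' for each i from 1 to n.
I use the following code to extract these two strings: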

for j in correct_y_test.index:
  input_list=correct_X_test["review"][j].split()
  for i in range(len(input_list)):
    # Build the sequence from x1 to xi
    sub_list_1_i=input_list[:i+1]
    sub_str_1_i = ""
    for ele in sub_list_1_i:
      sub_str_1_i += ele + " "
    # Build the sequence from xi to xn
    sub_list_i_n=input_list[i:]
    sub_str_i_n = ""
    for ele in sub_list_i_n:
      sub_str_i_n += ele + " "

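But I don't know how to store this in the DataFrame, because a single review produces one row per value of i, each with 2 columns. Do you have any idea how to do this?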

ttvkxqim #1

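You can create empty lists for the two sub_review columns and, for each value of i, append the corresponding sub-review string to them. Because every review contributes one entry per word, the lists end up longer than correct_X_test, so they cannot be assigned to it directly as new columns (pandas raises a ValueError about mismatched lengths). Instead, also record which row each entry came from, repeat the original rows accordingly, and attach the lists to that expanded DataFrame. Try the following code:

review_index = []
sub_review_1_i = []
sub_review_i_n = []

for j in correct_X_test.index:
    input_list = correct_X_test["review"][j].split()

    for i in range(len(input_list)):
        # Remember which original row this pair of sub-reviews belongs to
        review_index.append(j)

        # Sequence from x1 to xi
        sub_list_1_i = input_list[:i+1]
        sub_str_1_i = " ".join(sub_list_1_i)
        sub_review_1_i.append(sub_str_1_i)

        # Sequence from xi to xn
        sub_list_i_n = input_list[i:]
        sub_str_i_n = " ".join(sub_list_i_n)
        sub_review_i_n.append(sub_str_i_n)

# Repeat each original row once per sub-review, then attach the two lists
expanded = correct_X_test.loc[review_index].copy()
expanded["sub_review_1_i"] = sub_review_1_i
expanded["sub_review_i_n"] = sub_review_i_n

The review_index, sub_review_1_i and sub_review_i_n lists are initialized before the loop and then filled with one entry for every value of i. Finally, correct_X_test.loc[review_index] repeats each original row once per sub-review, and expanded["sub_review_1_i"] = sub_review_1_i and expanded["sub_review_i_n"] = sub_review_i_n add the two lists as new columns. This assumes the index of correct_X_test has no duplicate labels.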
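The same idea can also be written by collecting one record per (review, i) pair and building a fresh DataFrame from those records. The following is only a minimal sketch under stated assumptions: expand_reviews and the source_index column are hypothetical names introduced for illustration, and the sample data is made up.

```python
import pandas as pd


def expand_reviews(df: pd.DataFrame, column: str = "review") -> pd.DataFrame:
    """Return one row per (review, i) pair with both sub-review strings.

    Hypothetical helper for illustration; not part of pandas.
    """
    records = []
    for idx, text in df[column].items():
        words = text.split()
        for i in range(len(words)):
            records.append({
                "source_index": idx,                        # row the pair came from
                column: text,                               # full original review
                "sub_review_1_i": " ".join(words[:i + 1]),  # x1 ... xi
                "sub_review_i_n": " ".join(words[i:]),      # xi ... xn
            })
    return pd.DataFrame(records)


# Made-up sample data, only for demonstration.
sample = pd.DataFrame({"review": ["good movie", "not so good"]})
expanded_sample = expand_reviews(sample)
print(expanded_sample)
```

Building a new DataFrame sidesteps the length mismatch between the per-word lists and the original rows, and the source_index column keeps the link back to the row each pair was derived from.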

tp5buhyn #2

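As I see it, you have two options.

Option 1: Store the sub-reviews as lists

In this option, for each "review" you create two lists: one holding the sub_str_1_i values and one holding the sub_str_i_n values. Each list is then stored in a new column on that review's own row. Here is an example: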

import pandas as pd

# == Create some dummy data ====================================================
correct_X_test = pd.DataFrame({"review": ["This is a review",
                                          "This is another review",
                                          "This is a third review"]})

# == Solution 1 ================================================================
correct_X_test['1_i'] = None
correct_X_test['i_n'] = None

for j, row in correct_X_test.iterrows():
    input_list = row["review"].split()
    sub_list_1_i, sub_list_i_n = [], []
    for i in range(len(input_list)):

        # Build the sequence from x1 to xi
        sub_str_1_i = " ".join(input_list[:i+1])
        
        # Build the sequence from xi to xn
        sub_str_i_n = " ".join(input_list[i:])

        sub_list_1_i.append(sub_str_1_i)
        sub_list_i_n.append(sub_str_i_n)

    correct_X_test.loc[j, '1_i'] = sub_list_1_i
    correct_X_test.loc[j, 'i_n'] = sub_list_i_n

print(correct_X_test)
# Prints:
#
#                    review                                                1_i  \
# 0        This is a review       [This, This is, This is a, This is a review]
# 1  This is another review  [This, This is, This is another, This is anoth...
# 2  This is a third review  [This, This is, This is a, This is a third, Th...
#
#                                                  i_n
# 0  [This is a review, is a review, a review, review]
# 1  [This is another review, is another review, an...
# 2  [This is a third review, is a third review, a ...
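The list-valued columns of Option 1 can also be produced without an explicit iterrows loop. The sketch below is one possible variant, not the method used above: prefixes and suffixes are hypothetical helper names, demo is a made-up sample, and Series.apply is used to call each helper once per review.

```python
import pandas as pd

# Made-up sample data, kept separate from correct_X_test on purpose.
demo = pd.DataFrame({"review": ["This is a review",
                                "This is another review"]})


def prefixes(text: str) -> list:
    """All word prefixes: 'x1', 'x1 x2', ..., 'x1 ... xn'."""
    words = text.split()
    return [" ".join(words[:i + 1]) for i in range(len(words))]


def suffixes(text: str) -> list:
    """All word suffixes: 'x1 ... xn', 'x2 ... xn', ..., 'xn'."""
    words = text.split()
    return [" ".join(words[i:]) for i in range(len(words))]


# Series.apply calls the helper once per review; each cell receives a list.
demo["1_i"] = demo["review"].apply(prefixes)
demo["i_n"] = demo["review"].apply(suffixes)
print(demo)
```

Both columns end up with object dtype and hold one Python list per row, which is the same shape that the explode step in Option 2 expects.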

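Option 2: Create a new row for each combination of sub_str_1_i and sub_str_i_n

In this option, each pair of sub_str_1_i and sub_str_i_n is stored as its own row in the DataFrame. You can use the pd.DataFrame.explode method to turn the output of Option 1 into new rows (passing a list of columns requires pandas 1.3 or later):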

correct_X_test.explode(['i_n', '1_i'])
# Returns:
#
#                    review                     1_i                     i_n
# 0        This is a review                    This        This is a review
# 0        This is a review                 This is             is a review
# 0        This is a review               This is a                a review
# 0        This is a review        This is a review                  review
# 1  This is another review                    This  This is another review
# 1  This is another review                 This is       is another review
# 1  This is another review         This is another          another review
# 1  This is another review  This is another review                  review
# 2  This is a third review                    This  This is a third review
# 2  This is a third review                 This is       is a third review
# 2  This is a third review               This is a          a third review
# 2  This is a third review         This is a third            third review
# 2  This is a third review  This is a third review                  review
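After explode, the rows that came from the same review share one index label, and the position i is no longer stored anywhere. If that position is needed, it can be recovered as sketched below. This is an illustrative sketch only: demo_lists is made-up data in the shape produced by Option 1, and the column names i and review_id are assumptions.

```python
import pandas as pd

# Made-up sample in the shape produced by Option 1 (list-valued columns).
demo_lists = pd.DataFrame({
    "review": ["a b c", "d e"],
    "1_i": [["a", "a b", "a b c"], ["d", "d e"]],
    "i_n": [["a b c", "b c", "c"], ["d e", "e"]],
})

# Multi-column explode needs pandas >= 1.3 and equal-length lists per row.
long_form = demo_lists.explode(["1_i", "i_n"])

# Number the rows within each original review (1-based, matching x1..xn).
# to_numpy() assigns by position, avoiding alignment on the repeated labels.
long_form["i"] = long_form.groupby(level=0).cumcount().to_numpy() + 1

# Turn the repeated index labels into an ordinary column and renumber rows.
long_form = long_form.rename_axis("review_id").reset_index()
print(long_form)
```

Keeping review_id and i as ordinary columns makes it straightforward to filter or join the long-form rows back to the original reviews later.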
