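I have a DataFrame that looks like this (it contains dummy data):
I want to remove the text that appears after the "____________" identifier in every cell. I wrote the code below (logic: add a new column filled with NaN, then save the edited value in that column):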
import pandas as pd
import numpy as np
df = pd.read_excel(r'Desktop\Trial.xlsx')
NaN = np.nan
df["Body2"] = NaN
substring = "____________"
for index, row in df.iterrows():
    if substring in row["Body"]:
        split_string = row["Body"].split(substring,1)
        row["Body2"] = split_string[0]
print(df)
But the Body2 column still shows NaN instead of the edited values.
Any help would be much appreciated!
2 Answers
vs91vp4v1#
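Use at to modify the value. The row yielded by iterrows is generally a copy, so assigning to row["Body2"] does not write back to df.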
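A minimal sketch of that fix, assuming a small in-memory DataFrame as a stand-in for the Excel file (the sample strings are made up for illustration). The only change to the loop is that the result is written through df.at[index, "Body2"] rather than through row. Body2 is created with object dtype here so that storing strings in it does not trigger dtype warnings on recent pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for Desktop\Trial.xlsx.
df = pd.DataFrame({"Body": ["Hello____________signature", "No marker here"]})

# New column filled with NaN; object dtype so it can hold strings later.
df["Body2"] = pd.Series(np.nan, index=df.index, dtype="object")

substring = "____________"
for index, row in df.iterrows():
    if substring in row["Body"]:
        # Write to the DataFrame itself, not to the row copy.
        df.at[index, "Body2"] = row["Body"].split(substring, 1)[0]

print(df)
```

Rows that do not contain the marker keep NaN in Body2, as in the original loop.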
vs91vp4v2#
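Don't iterate over the rows; apply the operation to all rows at once instead. You can use str.split with expand to split the values into multiple columns, which I think is what you want.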
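A rough sketch of the vectorized approach, again assuming made-up sample data in place of the Excel file. It uses Series.str.split with n=1 and expand=True, then keeps the first resulting column, which holds the text before the marker.

```python
import pandas as pd

# Hypothetical sample data standing in for Desktop\Trial.xlsx.
df = pd.DataFrame({"Body": ["Hello____________signature", "No marker here"]})

substring = "____________"

# Split each value at the first occurrence of the marker.
# expand=True returns a DataFrame whose columns are the split pieces.
# The marker consists only of underscores, which have no special
# meaning in a regular expression, so it is matched literally.
parts = df["Body"].str.split(substring, n=1, expand=True)

# Column 0 holds everything before the marker.
df["Body2"] = parts[0]

print(df)
```

Note one difference from the loop in the question: a row without the marker gets its full original text in Body2 rather than NaN.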