Pandas: creating a function to remove links

ltqd579y  posted on 2023-01-07 in Other

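I need a function that removes links from the oldText column (more than 1000 rows) of a pandas DataFrame. I wrote this function using a regex, but it doesn't work. Here is my code: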

def remove_links(text):
    text = re.sub(r'http\S+', '', text) 
    text = text.strip('[link]') 

    return text

df['newText'] = df['oldText'].apply(remove_links)

I get no errors, but the code does nothing.

bkhjykvo1#

Your code works for me. CSV:

oldText
https://abc.xy/oldText asd
https://abc.xy/oldTe asd
https://abc.xy/oldT
https://abc.xy/old
https://abc.xy/ol

Code:

import pandas as pd
import re

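def remove_links(text):
    # Drop every run of non-whitespace characters that starts with "http".
    text = re.sub(r'http\S+', '', text)
    # Note: strip('[link]') trims any of the characters [ l i n k ] from both
    # ends of the string; it does not remove the literal substring "[link]".
    text = text.strip('[link]')

    return text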

df = pd.read_csv('test2.csv')
df['newText'] = df['oldText'].apply(remove_links)
print(df)

Result:

                      oldText newText
0  https://abc.xy/oldText asd     asd
1    https://abc.xy/oldTe asd     asd
2         https://abc.xy/oldT        
3          https://abc.xy/old        
4           https://abc.xy/ol
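
One detail in the function may be worth a closer look: `str.strip('[link]')` treats its argument as a set of characters to trim from both ends, not as a literal substring. The sketch below assumes the intent was to remove a literal `[link]` marker. The name `remove_links_fixed` and the sample strings are hypothetical and not part of the original post.

```python
import re

# Matches "http" followed by any run of non-whitespace characters,
# the same pattern used in the question.
URL_PATTERN = re.compile(r'http\S+')

def remove_links_fixed(text):
    """Hypothetical variant: drop URLs and literal '[link]' markers."""
    text = URL_PATTERN.sub('', text)
    # str.replace removes the exact substring, unlike str.strip('[link]').
    text = text.replace('[link]', '')
    # Trim the whitespace left behind at either end.
    return text.strip()

# str.strip with an argument trims a set of characters from both ends,
# so the leading word "link" is eaten here even though no marker is present.
stripped = 'link to site'.strip('[link]')

# Made-up sample value combining a marker and a URL.
cleaned = remove_links_fixed('[link] https://abc.xy/oldText asd')
```

If that assumption about the intent holds, the function can be applied the same way as before: df['newText'] = df['oldText'].apply(remove_links_fixed).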
