Python: How do I remove specific words from columns?

Posted by llycmphe on 2022-10-30 in Python


I need to remove "km" from the distance column:

[Distance]
0     114 km
1     114 km
2     9.1 km
3    33.1 km
4     182 km
5    93.2 km
6    40.4 km
7        0.0
8        0.0
9    43.4 km
Name: distance, dtype: object

It has to look like this:


Source: https://stackoverflow.com/questions/74248482/how-do-i-remove-specific-words-from-columns

2 Answers


km0tfn4u1#

Assuming the trailing substring to remove is always km, you can use:

df['distance'] = df['distance'].str.replace(r'\s*km$', '', regex=True)
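To try this in isolation, here is a minimal sketch. The sample values are a subset of the question's data, and the names `df` and `cleaned` are assumptions for illustration only.

```python
import pandas as pd

# Sample data mirroring the question; the DataFrame name `df` is an assumption.
df = pd.DataFrame({"distance": ["114 km", "9.1 km", "0.0", "43.4 km"]})

# Remove an optional run of whitespace plus a trailing "km".
# Rows without the suffix (e.g. "0.0") pass through unchanged.
cleaned = df["distance"].str.replace(r"\s*km$", "", regex=True)

print(cleaned.tolist())
```

Note that the values are still strings after this step; converting them to numbers is a separate operation.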

A more generic approach is to extract the number:

df['distance'] = df['distance'].str.extract(r'(\d+(?:\.\d+)?)')

If you only want the number when "km" is present:

df['distance'] = df['distance'].str.extract(r'(\d+(?:\.\d+)?)\s*km')
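The two extraction patterns differ only on rows that lack the "km" suffix. A small sketch on an assumed sample Series `s` makes that visible; passing `expand=False` makes `str.extract` return a Series instead of a one-column DataFrame.

```python
import pandas as pd

# Hypothetical sample: two rows with the suffix, one without.
s = pd.Series(["114 km", "9.1 km", "0.0"], name="distance")

# Any number, with or without a unit after it.
any_number = s.str.extract(r"(\d+(?:\.\d+)?)", expand=False)

# Only numbers that are followed by "km"; other rows become missing.
km_only = s.str.extract(r"(\d+(?:\.\d+)?)\s*km", expand=False)

print(any_number.tolist())
print(km_only.isna().tolist())
```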

And to convert to numeric/NaN:

df['distance'] = pd.to_numeric(df['distance'].str.extract(r'(\d+(?:\.\d+)?)\s*km', expand=False), errors='coerce')
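Whether the bare "0.0" rows should end up as NaN or as 0.0 depends on which pattern feeds `pd.to_numeric`. The sketch below contrasts the two choices on an assumed sample Series; `km_only` and `keep_zero` are illustrative names.

```python
import pandas as pd

# Hypothetical sample: two rows with the suffix, one without.
s = pd.Series(["114 km", "9.1 km", "0.0"], name="distance")

# Option A: keep only numbers followed by "km"; a bare "0.0" becomes NaN.
km_only = pd.to_numeric(
    s.str.extract(r"(\d+(?:\.\d+)?)\s*km", expand=False), errors="coerce"
)

# Option B: strip the suffix first so that a bare "0.0" is kept as 0.0.
keep_zero = pd.to_numeric(
    s.str.replace(r"\s*km$", "", regex=True), errors="coerce"
)

print(km_only.dtype, keep_zero.dtype)
```

Both results are float columns, so they can be used directly in arithmetic or aggregation.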

Summary

df['distance1'] = df['distance'].str.replace(r'\s*km$', '', regex=True)
df['distance2'] = df['distance'].str.extract(r'(\d+(?:\.\d+)?)')
df['distance3'] = df['distance'].str.extract(r'(\d+(?:\.\d+)?)\s*km')
df['distance4'] = pd.to_numeric(df['distance'].str.extract(r'(\d+(?:\.\d+)?)\s*km', expand=False), errors='coerce')
print(df.dtypes)
print(df)

Output:

distance      object
distance1     object
distance2     object
distance3     object
distance4    float64
dtype: object
  distance distance1 distance2 distance3  distance4
0   114 km       114       114       114      114.0
1   114 km       114       114       114      114.0
2   9.1 km       9.1       9.1       9.1        9.1
3  33.1 km      33.1      33.1      33.1       33.1
4   182 km       182       182       182      182.0
5  93.2 km      93.2      93.2      93.2       93.2
6  40.4 km      40.4      40.4      40.4       40.4
7      0.0       0.0       0.0       NaN        NaN
8      0.0       0.0       0.0       NaN        NaN
9  43.4 km      43.4      43.4      43.4       43.4


2022-10-30

7fyelxc52#

Here is another way, which simply drops the observations with 0 and removes "km":

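import numpy as np

# r'[^\d.]+' removes every character that is not a digit or a decimal point
df['distance'] = df['distance'].str.replace(r'[^\d.]+', '', regex=True).astype('float')
# Treat 0 as missing, then drop those rows from the DataFrame itself
df['distance'] = df['distance'].replace(0, np.nan)
df = df.dropna(subset=['distance'])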
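One thing to watch with patterns that strip non-digits: `\D+` removes every non-digit character, including the decimal point, so "9.1 km" would turn into 91. A minimal sketch comparing it with a pattern that keeps the dot, on an assumed sample Series:

```python
import pandas as pd

# Hypothetical sample with decimal values.
s = pd.Series(["9.1 km", "33.1 km", "0.0"])

# \D+ strips every non-digit, including the decimal point.
digits_only = s.str.replace(r"\D+", "", regex=True).astype(float)

# [^\d.]+ keeps digits and dots, so the decimals survive.
with_decimals = s.str.replace(r"[^\d.]+", "", regex=True).astype(float)

print(digits_only.tolist())
print(with_decimals.tolist())
```

Passing `regex=True` explicitly also matters here, since recent pandas versions treat the pattern as a literal string by default.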

2022-10-30
