Python: how do I remove a specific word from a column?

Posted by llycmphe on 2022-10-30 in Python

I need to remove "km" from the distance column:

    [Distance]
    0     114 km
    1     114 km
    2     9.1 km
    3    33.1 km
    4     182 km
    5    93.2 km
    6    40.4 km
    7        0.0
    8        0.0
    9    43.4 km
    Name: distance, dtype: object

The result should look like this:

    [Distance]
    0     114
    1     114
    2     9.1
    3    33.1
    4     182
    5    93.2
    6    40.4
    7
    8
    9    43.4
km0tfn4u #1

Assuming the trailing substring to remove is always "km", you can use:

    df['distance'] = df['distance'].str.replace(r'\s*km$', '', regex=True)
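As a quick check of the trailing-"km" replacement, here is a minimal self-contained sketch. The sample values are a shortened version of the data in the question, not the full column.

```python
import pandas as pd

# Shortened sample modelled on the question's data (an assumption for illustration).
df = pd.DataFrame({"distance": ["114 km", "9.1 km", "0.0", "43.4 km"]})

# Remove optional whitespace followed by "km" at the end of each string.
df["distance"] = df["distance"].str.replace(r"\s*km$", "", regex=True)

print(df["distance"].tolist())
```

The values are still strings after this step. If numbers are needed, the result can be passed through `pd.to_numeric`.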

A more generic approach is to extract the number:

    df['distance'] = df['distance'].str.extract(r'(\d+(?:\.\d+)?)')

If you only want the number when it is followed by "km":

    df['distance'] = df['distance'].str.extract(r'(\d+(?:\.\d+)?)\s*km')

And to convert the result to numeric values (NaN where there is no match):

    df['distance'] = pd.to_numeric(df['distance'].str.extract(r'(\d+(?:\.\d+)?)\s*km', expand=False), errors='coerce')
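If the rows without "km" should stay as numbers rather than become NaN, the generic pattern can be combined with `pd.to_numeric` instead. This is a variant sketch, not part of the original answer; the sample values and the column name `distance_num` are assumptions for illustration.

```python
import pandas as pd

# Shortened sample modelled on the question's data (an assumption for illustration).
df = pd.DataFrame({"distance": ["114 km", "9.1 km", "0.0", "43.4 km"]})

# The generic pattern does not require "km", so "0.0" is captured as well.
# expand=False returns a Series instead of a one-column DataFrame.
numbers = df["distance"].str.extract(r"(\d+(?:\.\d+)?)", expand=False)

# Convert the captured strings to floats; unparseable values would become NaN.
df["distance_num"] = pd.to_numeric(numbers, errors="coerce")

print(df)
```

Here `0.0` is kept as a float, whereas the "km"-anchored pattern turns those rows into NaN.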

Summary:

    import pandas as pd

    df['distance1'] = df['distance'].str.replace(r'\s*km$', '', regex=True)
    df['distance2'] = df['distance'].str.extract(r'(\d+(?:\.\d+)?)')
    df['distance3'] = df['distance'].str.extract(r'(\d+(?:\.\d+)?)\s*km')
    df['distance4'] = pd.to_numeric(df['distance'].str.extract(r'(\d+(?:\.\d+)?)\s*km', expand=False), errors='coerce')
    print(df.dtypes)
    print(df)

Output:

    distance      object
    distance1     object
    distance2     object
    distance3     object
    distance4    float64
    dtype: object
       distance  distance1  distance2  distance3  distance4
    0    114 km        114        114        114      114.0
    1    114 km        114        114        114      114.0
    2    9.1 km        9.1        9.1        9.1        9.1
    3   33.1 km       33.1       33.1       33.1       33.1
    4    182 km        182        182        182      182.0
    5   93.2 km       93.2       93.2       93.2       93.2
    6   40.4 km       40.4       40.4       40.4       40.4
    7       0.0        0.0        0.0        NaN        NaN
    8       0.0        0.0        0.0        NaN        NaN
    9   43.4 km       43.4       43.4       43.4       43.4
7fyelxc5 #2

Here is another way, which removes "km" and then drops the observations whose distance is 0:

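    import numpy as np

    # r'[^\d.]+' removes every character that is not a digit or a decimal point
    # (a plain r'\D+' would also strip the "." and turn "9.1 km" into "91")
    df['distance'] = df['distance'].str.replace(r'[^\d.]+', '', regex=True).astype('float')
    df['distance'] = df['distance'].replace(0, np.nan)
    # dropna on the selected column would not remove rows from df, so drop them on the frame
    df = df.dropna(subset=['distance'])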
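One thing to watch for with digit-only patterns: `r'\D+'` removes every non-digit character, including the decimal point, so "9.1 km" would become "91". The sketch below compares it with a pattern that keeps the decimal point. The sample values and variable names are assumptions for illustration. Calling `dropna(inplace=True)` on a selected column also does not remove rows from `df`, so the sketch filters the zeros out directly.

```python
import pandas as pd

# Shortened sample modelled on the question's data (an assumption for illustration).
s = pd.Series(["9.1 km", "114 km", "0.0"])

# r"\D+" strips every non-digit, so the decimal point is lost.
digits_only = s.str.replace(r"\D+", "", regex=True)

# r"[^\d.]+" keeps digits and the decimal point, so the values parse as floats.
keeps_decimal = s.str.replace(r"[^\d.]+", "", regex=True).astype(float)

# Drop the zero distances with a boolean filter.
nonzero = keeps_decimal[keeps_decimal != 0]

print(digits_only.tolist())
print(nonzero.tolist())
```

Passing `regex=True` explicitly matters because recent pandas versions treat the pattern in `str.replace` as a literal string by default.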
