import pandas as pd
import numpy as np
df = pd.read_csv("dirtydata.csv")
dfn = df.convert_dtypes()
bike_sales_ds = dfn.copy()
# Create new age column with general age range groups
age_conditions = [
    (bike_sales_ds['Age'] <= 30),
    (bike_sales_ds['Age'] >= 31) & (bike_sales_ds['Age'] <= 40),
    (bike_sales_ds['Age'] >= 41) & (bike_sales_ds['Age'] <= 55),
    (bike_sales_ds['Age'] >= 56) & (bike_sales_ds['Age'] <= 69),
    (bike_sales_ds['Age'] >= 70)
]
age_choices = ['30 or Less', '31 to 40', '41 to 55', '56 to 69', '70 or Older']
bike_sales_ds['Age_Range'] = np.select(age_conditions, age_choices, default='error')
The dataset I'm working from
I didn't create this dataset; I got it a while back from a YouTube video, and the video wasn't about Pandas.
Error
Traceback (most recent call last):
  File "C:\Users\dmcfa\PycharmProjects\Bike Sales Data Cleaning 01\main.py", line 43, in <module>
    bike_sales_ds['Age_Range'] = np.select(age_conditions, age_choices, default=0)
TypeError: invalid entry 0 in condlist: should be boolean ndarray
This avoids the error:
df.convert_dtypes(convert_integer=False)
But what causes this in the first place? df.info() says that the column is Int64 when I use df.convert_dtypes().
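A likely explanation, sketched under assumptions: comparing a nullable Int64 column produces pandas' nullable "boolean" dtype rather than NumPy's bool, and depending on the installed pandas and NumPy versions np.select may reject such masks. The sample ages below are hypothetical stand-ins for the real 'Age' column. Converting each mask to a plain NumPy bool array first should sidestep the version-dependent behavior.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the real 'Age' column.
ages = pd.DataFrame({"Age": [25, 35, 50, 60, 75]})

plain = ages["Age"]                      # NumPy-backed int64
nullable = ages.convert_dtypes()["Age"]  # pandas nullable Int64

# The comparison result inherits the "nullable" flavor of the column.
print((plain <= 30).dtype)     # NumPy bool
print((nullable <= 30).dtype)  # pandas nullable boolean

age_conditions = [
    nullable <= 30,
    (nullable >= 31) & (nullable <= 40),
    (nullable >= 41) & (nullable <= 55),
    (nullable >= 56) & (nullable <= 69),
    nullable >= 70,
]
age_choices = ['30 or Less', '31 to 40', '41 to 55', '56 to 69', '70 or Older']

# Treat missing ages as "no match" and hand np.select plain bool arrays.
masks = [c.fillna(False).to_numpy(dtype=bool) for c in age_conditions]
age_range = np.select(masks, age_choices, default='error')
print(age_range)
```

Passing convert_integer=False, as in the workaround above, keeps 'Age' as a NumPy-backed integer column, so the comparisons yield plain bool masks to begin with.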
1 Answer
Your code works fine with my input DataFrame. However, you can use pd.cut to check whether the problem persists:
Output:
Input:
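The answer's own snippet and its input/output tables are not shown here. As a minimal sketch of what a pd.cut version might look like, assuming the same age brackets as the question and a hypothetical five-row 'Age' column:

```python
import pandas as pd

# Hypothetical ages; the answer's actual input DataFrame is not shown.
bike_sales_ds = pd.DataFrame({"Age": [25, 35, 50, 60, 75]}).convert_dtypes()

# Right-closed bins: (-inf, 30], (30, 40], (40, 55], (55, 69], (69, inf)
bins = [-float("inf"), 30, 40, 55, 69, float("inf")]
labels = ['30 or Less', '31 to 40', '41 to 55', '56 to 69', '70 or Older']

# pd.cut bins the values directly, so no boolean masks are built.
bike_sales_ds["Age_Range"] = pd.cut(bike_sales_ds["Age"], bins=bins, labels=labels)
print(bike_sales_ds)
```

Because the bins are right-closed and ages are whole numbers, the edges 30, 40, 55 and 69 reproduce the ranges from the np.select conditions. Missing ages come out as NaN rather than a default label.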