pandas 'df.select_dtypes' works for 'float' but not for 'int'

Posted by tpgth1q7 on 2022-12-21 in Other

I just ran into some odd behavior in pd.DataFrame.select_dtypes.
My pd.DataFrame is:

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4], 'b': ['a', 'b', 'c', 'd'], 'c': [1.2, 3.4, 5.6, 7.8]})

Now, if I want to select the numeric columns, I do the following:

df.select_dtypes([int, float])

But the output contains only the float column.

Why is that? I listed both float and int, so why is the integer column left out?
Here are the dtypes:

>>> df.dtypes
a      int64
b     object
c    float64
dtype: object
>>>

As you can see, both numeric dtypes end in 64, but only float works.
More tests:

>>> df.select_dtypes(int)
Empty DataFrame
Columns: []
Index: [0, 1, 2, 3]
>>> df.select_dtypes(float)
     c
0  1.2
1  3.4
2  5.6
3  7.8
>>>

Why does this happen?
I know I can do:

df.select_dtypes(['int64', 'float64'])

But I would like to know the reason for this behavior.

1 Answer

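If you need all integer columns and all float columns, check against the NumPy types:
That way int16, int32 and int64 all match integer, and the same principle applies to floats:

print(df.select_dtypes(['integer', 'floating']))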
   a    c
0  1  1.2
1  2  3.4
2  3  5.6
3  4  7.8
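
To make the "all widths match" point concrete, here is a minimal sketch. The frame `wide` and its column names are made up for illustration and are not part of the question. It assumes the abstract names 'integer', 'floating' and 'number' are resolved to the NumPy abstract types np.integer, np.floating and np.number.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with several integer and float widths plus a string column.
wide = pd.DataFrame({
    "i16": np.array([1, 2], dtype="int16"),
    "i32": np.array([1, 2], dtype="int32"),
    "i64": np.array([1, 2], dtype="int64"),
    "f32": np.array([1.5, 2.5], dtype="float32"),
    "f64": np.array([1.5, 2.5], dtype="float64"),
    "s": ["x", "y"],
})

# Abstract names match every concrete width of that kind.
ints = wide.select_dtypes(include="integer")
floats = wide.select_dtypes(include="floating")

# 'number' covers both integer and float columns in one call.
numeric = wide.select_dtypes(include="number")

print(list(ints.columns))     # ['i16', 'i32', 'i64']
print(list(floats.columns))   # ['f32', 'f64']
print(list(numeric.columns))  # ['i16', 'i32', 'i64', 'f32', 'f64']
```

If the goal is simply "every numeric column", include="number" is probably the shortest spelling and does not depend on how the built-in int is mapped to a fixed-width dtype.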

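Reason, found in the NumPy types documentation:

In Python 3, the int_ type does not inherit from the built-in int, because type int is no longer a fixed-width integer type.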
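
The quoted note can be checked directly. The sketch below is an illustration, not part of the original answer. It also prints which fixed-width dtype NumPy picks for the built-in int. That mapping depends on platform and NumPy version: as far as I know, the default integer was 32-bit on Windows before NumPy 2.0. A mismatch with the int64 column is therefore a plausible explanation for the empty result, but treat that as an assumption rather than a confirmed diagnosis.

```python
import numpy as np

# NumPy integer scalar types do not subclass Python's int on Python 3 ...
int64_is_int = issubclass(np.int64, int)

# ... whereas np.float64 does subclass Python's float.
float64_is_float = issubclass(np.float64, float)

print(int64_is_int)      # False
print(float64_is_float)  # True

# The dtype that the built-in int resolves to is platform/version dependent,
# so no expected output is given here.
print(np.dtype(int))

# The built-in float always resolves to float64.
float_dtype = np.dtype(float)
print(float_dtype)       # float64

# The abstract type np.integer covers every fixed-width integer dtype.
all_integer = all(
    np.issubdtype(t, np.integer) for t in (np.int16, np.int32, np.int64)
)
print(all_integer)       # True
```

Selecting by the abstract names 'integer' and 'floating' avoids relying on whatever np.dtype(int) happens to be on a given machine.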