excel 为什么最后一列的数据类型是str而不是float？

mmvthczy 于 2023-02-14 发布在其他

关注(0)|答案(2)|浏览(191)

B列至K列的所有数据均为以文本形式存储在excel文件中的数字。

我已经上传了dropbox中的excel文件作为测试样本。
sample data text
下载并保存在/tmp/tsm.xlsx中。
tsm.xlsx for testing
我发现最后一列K的数据类型是string，从B到J的列都是数字类型，我把它阅读一个dataframe：

import pandas as pd
sexcel = '/tmp/tsm.xlsx'
df = pd.read_excel(sexcel,sheet_name='ratios_annual')
row_num = len(df)
for id in range(row_num):
    print('the data type in last column--K is',type(df.iloc[id,-1]))
    print('the data type in  column--J is',type(df.iloc[id,-2]))

the data type in last column--K is <class 'str'>
the data type in  column--J is <class 'numpy.float64'>
the data type in last column--K is <class 'str'>
the data type in  column--J is <class 'numpy.float64'>

明明在excel中打开的时候，B列到K列都是number stored as text，为什么我读入数据框的时候类型不一样呢？
请下载样本数据并进行检查。

excel

来源：https://stackoverflow.com/questions/75425731/why-the-data-type-in-last-columns-is-str-instead-of-float

2条答案

按热度按时间

vfh0ocws1#

您是否尝试过将该列显式转换为浮点型？默认情况下，Pandas会尝试正确推断数据类型，但如果数据不完美（例如，列在某处有字符等），它可能会出错。尝试将该列转换为浮点型，它可能会将您引导到问题所在。

df["2012"] = df["2012"].astype(float)

赞(0）回复(0）举报 2023-02-14

icomxhvb2#

您的列中可能存在一些错误值。若要查找这些错误值，请用途：

out = df.loc[pd.to_numeric(df['2012'], errors='coerce').isna(), '2012']
print(out)

赞(0）回复(0）举报 2023-02-14

我来回答

excel 为什么最后一列的数据类型是str而不是float？

2条答案

相关问题

热门标签

最新问答