Error converting dtype from object to float in Pandas

iszxjhcz  posted on 2023-03-11  in  Other

I am trying to convert a column of dtype object to float in Pandas, but I cannot get past the error below. How can I fix it?
Dataset: https://drive.google.com/file/d/1fWUG__B-11mV2td-eoqS7eF1eADnsdRs/view

Output exceeds the size limit. Open the full output data in a text editor
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
d:\Overdose AI\part_a.ipynb Cell 4 in 1
----> 1 df["quantity tons"] = df["quantity tons"].astype(float)

File c:\Users\Admin\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\generic.py:6245, in NDFrame.astype(self, dtype, copy, errors)
   6238     results = [
   6239         self.iloc[:, i].astype(dtype, copy=copy)
   6240         for i in range(len(self.columns))
   6241     ]
   6243 else:
   6244     # else, only a single dtype is given
-> 6245     new_data = self._mgr.astype(dtype=dtype, copy=copy, errors=errors)
   6246     return self._constructor(new_data).__finalize__(self, method="astype")
   6248 # GH 33113: handle empty frame or series

File c:\Users\Admin\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\internals\managers.py:446, in BaseBlockManager.astype(self, dtype, copy, errors)
    445 def astype(self: T, dtype, copy: bool = False, errors: str = "raise") -> T:
--> 446     return self.apply("astype", dtype=dtype, copy=copy, errors=errors)

File c:\Users\Admin\AppData\Local\Programs\Python\Python310\lib\site-packages\pandas\core\internals\managers.py:348, in BaseBlockManager.apply(self, f, align_keys, ignore_failures, **kwargs)
    346         applied = b.apply(f, **kwargs)
    347     else:
--> 348         applied = getattr(b, f)(**kwargs)
    349 except (TypeError, NotImplementedError):
...
    169     # Explicit copy, or required since NumPy can't view from / to object.
--> 170     return arr.astype(dtype, copy=True)
    172 return arr.astype(dtype, copy=copy)

ValueError: could not convert string to float: 'e'

wrrgggsh1#

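The problem is that the "quantity tons" column in the file you provided contains the string 'e' (at row 173088), which cannot be parsed as a float.
To avoid this, I suggest checking whether the column contains any strings before changing its dtype.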

df[df['quantity tons'].apply(lambda x: isinstance(x, str))]

The output shows only the rows where the "quantity tons" column holds a string.
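If the column was read from a CSV, every value in it may already be a string, in which case the `isinstance` check would match all rows. A possible alternative, sketched here on a small made-up frame rather than the real dataset, is to let `pd.to_numeric` with `errors="coerce"` turn unparseable values into `NaN` and then look at which rows failed. The sample values and the `bad_rows` and `converted` names are assumptions for illustration only.

```python
import pandas as pd

# Toy stand-in for the real dataset; these values are made up.
df = pd.DataFrame({"quantity tons": ["54.15", "768.02", "e", "202.41"]})

# Coerce values that cannot be parsed to NaN instead of raising ValueError.
converted = pd.to_numeric(df["quantity tons"], errors="coerce")

# Rows that had a value originally but failed to parse as a number.
bad_rows = df[converted.isna() & df["quantity tons"].notna()]
print(bad_rows)

# Keep the coerced result; the offending entry becomes NaN.
df["quantity tons"] = converted
print(df["quantity tons"].dtype)
```

Whether to drop, fill, or manually correct the `NaN` rows afterwards depends on what the stray value was meant to be, so it is worth inspecting `bad_rows` before overwriting the column.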
