How do I convert an object-dtype column to int when astype raises ValueError: invalid literal for int() with base 10: '2,156,624,900'?

Problem description

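I am working with a dataset on suicides, part of which is a gdp_for_year column. However, that column has dtype object, and it understandably needs to be int. This is the error I get: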

ValueError                                Traceback (most recent call last)
<ipython-input-10-ec740fbd9849> in <module>
      2 suicides.info()
      3 
----> 4 suicides['gdp_for_year'] = suicides['gdp_for_year'].astype('int')

~\anaconda3\lib\site-packages\pandas\core\generic.py in astype(self, dtype, copy, errors)
   5696         else:
   5697             # else, only a single dtype is given
-> 5698             new_data = self._data.astype(dtype=dtype, copy=copy, errors=errors)
   5699             return self._constructor(new_data).__finalize__(self)
   5700 

~\anaconda3\lib\site-packages\pandas\core\internals\managers.py in astype(self, dtype, copy, errors)
    580 
    581     def astype(self, dtype, copy: bool = False, errors: str = "raise"):
--> 582         return self.apply("astype", dtype=dtype, copy=copy, errors=errors)
    583 
    584     def convert(self, **kwargs):

~\anaconda3\lib\site-packages\pandas\core\internals\managers.py in apply(self, f, filter, **kwargs)
    440                 applied = b.apply(f, **kwargs)
    441             else:
--> 442                 applied = getattr(b, f)(**kwargs)
    443             result_blocks = _extend_blocks(applied, result_blocks)
    444 

~\anaconda3\lib\site-packages\pandas\core\internals\blocks.py in astype(self, dtype, copy, errors)
    623             vals1d = values.ravel()
    624             try:
--> 625                 values = astype_nansafe(vals1d, dtype, copy=True)
    626             except (ValueError, TypeError):
    627                 # e.g. astype_nansafe can fail on object-dtype of strings

~\anaconda3\lib\site-packages\pandas\core\dtypes\cast.py in astype_nansafe(arr, dtype, copy, skipna)
    872         # work around NumPy brokenness, #1987
    873         if np.issubdtype(dtype.type, np.integer):
--> 874             return lib.astype_intsafe(arr.ravel(), dtype).reshape(arr.shape)
    875 
    876         # if we have a datetime/timedelta array of objects

pandas\_libs\lib.pyx in pandas._libs.lib.astype_intsafe()

ValueError: invalid literal for int() with base 10: '2,156,624,900'

(Image: DataFrame info() and head())

Does anyone have a suggestion for what I can do?

Tags: python, pandas

Solution


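Your string

'2,156,624,900'

contains commas. int() cannot convert this string to an integer as it stands; you have to remove the commas first. You can do that like this: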

int('2,156,624,900'.replace(',', ''))
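
To make the difference concrete, here is a minimal sketch that shows the failing call next to the working one. The helper name parse_grouped_int is made up for illustration and is not part of Python or pandas.

```python
def parse_grouped_int(text: str) -> int:
    """Parse an integer written with comma thousands separators.

    Hypothetical helper for illustration only.
    """
    return int(text.replace(",", ""))


raw = "2,156,624,900"

try:
    int(raw)  # raises: commas are not valid in a base-10 integer literal
except ValueError as exc:
    print(f"int() failed: {exc}")

# With the commas removed, the conversion succeeds.
print(parse_grouped_int(raw))
```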

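So, in your case, you can either follow the more detailed locale-based approaches linked in the comments under the post, or first apply this replace to the whole column and then convert it to int.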
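
Applied to the whole column, that could look roughly like the sketch below. The DataFrame here is a small stand-in built by hand; only the names suicides and gdp_for_year come from the question, and the second and third values are invented sample data. The target dtype is spelled out as int64 on the assumption that you want 64-bit integers, because 2,156,624,900 is larger than a 32-bit integer can hold.

```python
import pandas as pd

# Stand-in for the real dataset; only the column name is taken from the question.
suicides = pd.DataFrame(
    {"gdp_for_year": ["2,156,624,900", "63,067,077,179", "1,000"]}
)

# Remove the thousands separators as literal text (regex=False),
# then cast the cleaned strings to 64-bit integers.
suicides["gdp_for_year"] = (
    suicides["gdp_for_year"]
    .str.replace(",", "", regex=False)
    .astype("int64")
)

print(suicides["gdp_for_year"].dtype)
print(suicides["gdp_for_year"].tolist())
```

If the data is loaded from a CSV file, passing thousands="," to pd.read_csv may avoid the problem at load time, since pandas then parses such values as numbers directly.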

