python-3.x - How do I convert a column to millions of dollars in pandas?
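Problem description
I have a column named collection that looks like this:
collection: $5,345,677, 46836214, $533,316,061, " ", 29200000
Some values carry a dollar sign and thousands separators, others are plain numbers. The column also contains NaN. I want to convert the values to millions of dollars.
I tried the following conversion, but it failed: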
df['Boxoffice in US$ (mil)'] = (df2['collection'].astype(float)/1000000).round(2).astype(str)
I get this error: could not convert string to float: '$5,345,677'
Please advise.
Solution
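import pandas as pd
# remove the '$' and ',' from the strings so they can be converted to numerics
# -> notice: the series is cast to strings first so plain numerics (e.g. 29200000) are handled too
# -> regex=True is needed in pandas >= 2.0, where str.replace matches literally by default
collection_tmp = df2['collection'].astype(str).str.replace('[$,]', '', regex=True)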
# convert to numerics (floats) and then to millions
# -> errors='coerce' sets NaN for invalid values
millions = pd.to_numeric(collection_tmp, errors='coerce')/1e6
# create 'Boxoffice in US$ (mil)'
df['Boxoffice in US$ (mil)'] = millions.round(2).astype('str')
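To check the steps above end to end, here is a minimal self-contained sketch. The sample values are copied from the question; the helper name to_millions and the DataFrame construction are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd


def to_millions(series: pd.Series) -> pd.Series:
    """Strip '$' and ',' from each value, then convert to millions (float)."""
    # Cast to str first so values that are already numeric are handled too
    cleaned = series.astype(str).str.replace(r"[$,]", "", regex=True)
    # errors='coerce' turns anything non-numeric (e.g. " ") into NaN
    return pd.to_numeric(cleaned, errors="coerce") / 1e6


# Hypothetical sample mirroring the column shown in the question
df2 = pd.DataFrame(
    {"collection": ["$5,345,677", 46836214, "$533,316,061", " ", 29200000]}
)

millions = to_millions(df2["collection"]).round(2)
df2["Boxoffice in US$ (mil)"] = millions
print(df2)
```

This sketch leaves the new column as floats rather than calling astype('str'), so missing values stay as real NaN instead of becoming the string 'nan'. Casting to strings at the end, as in the answer above, is still reasonable if the column is only meant for display.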