How do I insert a string value into a float column in pandas 0.24.2?

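Question

I have a column of more than a million floats. When a value is above or below certain thresholds, I need to be able to replace it with a string.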

import pandas as pd
import numpy as np

df = pd.DataFrame({'foo': np.random.random(10),
                   'bar': np.random.random(10)})

df
Out[115]: 
        foo       bar
0  0.181262  0.890826
1  0.321260  0.053619
2  0.832247  0.044459
3  0.937769  0.855299
4  0.752133  0.008980
5  0.751948  0.680084
6  0.559528  0.785047
7  0.615597  0.265483
8  0.129505  0.509945
9  0.727209  0.786113

df.at[5, 'foo'] = 'somestring'
Traceback (most recent call last):

  File "<ipython-input-116-bf0f6f9e84ac>", line 1, in <module>
    df.at[5, 'foo'] = 'somestring'

  File "/Users/nate/anaconda3/lib/python3.7/site-packages/pandas/core/indexing.py", line 2287, in __setitem__
    self.obj._set_value(*key, takeable=self._takeable)

  File "/Users/nate/anaconda3/lib/python3.7/site-packages/pandas/core/frame.py", line 2815, in _set_value
    engine.set_value(series._values, index, value)

  File "pandas/_libs/index.pyx", line 95, in pandas._libs.index.IndexEngine.set_value

  File "pandas/_libs/index.pyx", line 106, in pandas._libs.index.IndexEngine.set_value

ValueError: could not convert string to float: 'somestring'

Ultimately I need to write something like this:

# some_value is a threshold defined elsewhere
for idx, row in df.iterrows():
    if row['foo'] > some_value:
        df.at[idx, 'foo'] = 'over_some_value'
    else:
        pass  # else branch left empty in the original

I have tried using iloc, but I suspect it will be slow, and I would like to be able to use at to keep my code consistent.

Tags: python-3.x, pandas

Answer


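To assign a value of a different type to a column, you may need to convert the column to object dtype first.

A word of warning: this is risky, because once a column is object it no longer behaves like a numeric column. Vectorized arithmetic gets slower, and a comparison against a number can raise a TypeError when it reaches a string.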
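To make that warning concrete, here is a minimal sketch on a small stand-alone Series. The names s, obj, comparison_failed and numeric are made up for this illustration and are not part of the original answer.

```python
import pandas as pd

s = pd.Series([0.25, 0.75, 1.5])   # ordinary float64 data
obj = s.astype(object)             # same values, but object dtype
obj.at[1] = 'somestring'           # allowed now that the dtype is object

# A numeric comparison on the mixed column fails once it hits the string.
try:
    obj > 0.5
    comparison_failed = False
except TypeError:
    comparison_failed = True

# One way to get numbers back: non-numeric entries become NaN.
numeric = pd.to_numeric(obj, errors='coerce')
```

If the numeric values are still needed later, it may be simpler to keep the original float column and store the string labels in a separate column.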

df = df.astype(object)  # note: this converts every column, not only 'foo'
df.at[5, 'foo'] = 'somestring'
df
          foo        bar
0    0.163246   0.803071
1    0.946447    0.48324
2    0.777733   0.461704
3    0.996791   0.521338
4    0.320627   0.374384
5  somestring   0.987591
6    0.388765   0.726807
7    0.362077    0.76936
8    0.738139  0.0539076
9    0.208691   0.812568
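For a million rows, the iterrows loop from the question is likely to be slow. As a possible alternative (not something the original answer proposes), the sketch below converts only the 'foo' column to object and replaces values with Series.mask in one pass per threshold. The thresholds upper and lower, the label strings, and the fixed random seed are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.RandomState(0)  # fixed seed so the example is repeatable
df = pd.DataFrame({'foo': rng.random_sample(10),
                   'bar': rng.random_sample(10)})

upper = 0.8  # hypothetical thresholds
lower = 0.2

foo = df['foo']                 # numeric original, used for the comparisons
labelled = foo.astype(object)   # object copy that is able to hold strings
labelled = labelled.mask(foo > upper, 'over_some_value')
labelled = labelled.mask(foo < lower, 'under_some_value')

df['foo'] = labelled            # only 'foo' becomes object; 'bar' stays float64
```

The comparisons are done on the float Series before any strings are introduced, so they stay vectorized, and the 'bar' column keeps its float dtype.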
