SettingWithCopyWarning in pandas

Problem description

I have a very large dataset (test) with roughly 1 million rows. I want to update one column ('Date') so that it contains only three dates:

2014-04-01, 2014-05-01, 2014-06-01

So each row gets one date, and the sequence repeats every three rows.

I tried this:

for i in range(0, len(test), 3):
    if(i <= len(test)):
        test['Date'][i] = '2014-04-01'
        test['Date'][i+1] = '2014-05-01'
        test['Date'][i+2] = '2014-06-01'

I get this warning:

__main__:3: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
__main__:4: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
__main__:5: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

I have gone through the link but could not solve my problem. I also googled some solutions, such as calling copy() on the dataset before slicing, but nothing worked.

Tags: python, pandas

Solution

I believe what you want is np.tile:

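import numpy as np
import pandas as pd

# df is the DataFrame to update (presumably test in the question)
dates = pd.Series(['2014-04-01', '2014-05-01', '2014-06-01'], dtype='datetime64[ns]')
# Tile the three dates enough times to cover every row, then trim to len(df)
repeated_dates = np.tile(dates, len(df) // 3 + 1)[:len(df)]
df['dates'] = repeated_dates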

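This builds an array of the repeating dates, trimmed to the length of the DataFrame, and assigns it to a column of the DataFrame in a single step.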
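To make the idea concrete, here is a minimal self-contained sketch of the same np.tile approach. The small `test` DataFrame and its `value` column are hypothetical stand-ins for the asker's data, and the row count of 7 is chosen only to show that a length that is not a multiple of 3 is handled by the trim.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's `test` DataFrame (7 rows, not a multiple of 3)
test = pd.DataFrame({"value": range(7)})

# The three dates to cycle through, parsed as datetimes
dates = pd.to_datetime(["2014-04-01", "2014-05-01", "2014-06-01"])

# Repeat the pattern enough times to cover every row, then trim the excess
n_repeats = len(test) // len(dates) + 1
repeated = np.tile(dates.to_numpy(), n_repeats)[: len(test)]

# Assign the whole column at once instead of element by element
test["Date"] = repeated
```

Because the column is assigned in one statement, there is no chained indexing of the form test['Date'][i] = ..., which is the pattern that the SettingWithCopyWarning in the question points at. It also avoids a Python-level loop over a million rows.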

