pandas DataFrame: what is the best way to select rows and then modify a column without a SettingWithCopyWarning?

Problem description

I have a large DataFrame object. I want to select some rows first and then convert the timestamp column:

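import numpy as np
import pandas as pd


def choose_loc(data, lat, lon, lat_diff, lon_diff):
    # Keep only the rows whose lat/lon fall strictly inside the box around (lat, lon).
    data = data.loc[
        (data.lat > (lat - lat_diff)) & (data.lat < (lat + lat_diff))
        & (data.lon > (lon - lon_diff)) & (data.lon < (lon + lon_diff))
    ]
    return data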


column_names = np.genfromtxt(header_path, dtype=str, delimiter='\t')
dtypes = {"lat": np.float64, "lon": np.float64, "timeStamp": np.int64}
pos_lat = 0.0
pos_lon = 0.0
size_lat = 0.05
size_lon = 0.05

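# on_bad_lines='warn' (pandas 1.3+) replaces error_bad_lines=False, which was removed in pandas 2.0:
# malformed rows are skipped and a warning is emitted for each one.
data = pd.read_csv(filePath, sep='\t', dtype=dtypes, header=None, names=column_names, on_bad_lines='warn')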

data = choose_loc(data, pos_lat, pos_lon, size_lat / 2, size_lon / 2)

data.loc[:, 'timeStamp'] =  pd.to_datetime(data.loc[:, 'timeStamp'], unit='ms')

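When I run the code above, I get a SettingWithCopyWarning on the last line. I don't really understand why, since I am using .loc and nothing should be copied. I can make it work by running data = choose_loc(data, ...).copy(), but the file is large and I would like to avoid the copy to save time and memory. So what should I do?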

Tags: pandas

Solution


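Try this: make the copy explicit inside choose_loc. Selecting rows with a boolean mask already produces a new object rather than a view, even through .loc, but pandas still marks that object as derived from the original frame, which is why the later assignment warns. Calling .copy() on the filtered result copies only the selected rows, not the whole file, and tells pandas the result is independent: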

data = data.loc[(data.lat > (lat - lat_diff)) & (data.lat < (lat + lat_diff)) & (data.lon > (lon - lon_diff)) & (data.lon < (lon + lon_diff))].copy()
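To put the pieces together, here is a minimal sketch of the filter-then-convert flow. It assumes a small in-memory frame (the `raw` sample data and the `subset` name are hypothetical stand-ins for the large tab-separated file in the question) and uses plain column assignment for the conversion, which is one reasonable choice rather than the only one.

```python
import pandas as pd


def choose_loc(data, lat, lon, lat_diff, lon_diff):
    """Return an independent frame holding the rows inside the lat/lon box."""
    in_box = (
        (data["lat"] > lat - lat_diff) & (data["lat"] < lat + lat_diff)
        & (data["lon"] > lon - lon_diff) & (data["lon"] < lon + lon_diff)
    )
    # .copy() duplicates only the selected rows, not the whole input frame.
    return data.loc[in_box].copy()


# Hypothetical sample data standing in for the large file.
raw = pd.DataFrame(
    {
        "lat": [0.01, 0.50, -0.02],
        "lon": [0.00, 0.00, 0.01],
        "timeStamp": [0, 1_000, 86_400_000],  # milliseconds since the epoch
    }
)

subset = choose_loc(raw, 0.0, 0.0, 0.025, 0.025)

# Assigning to the column replaces the int64 values with datetime64 values.
subset["timeStamp"] = pd.to_datetime(subset["timeStamp"], unit="ms")
```

Because `subset` is an explicit copy, the assignment changes only `subset` and leaves `raw` untouched. The extra memory is proportional to the number of rows that pass the filter, so for a narrow bounding box the cost should be small compared with the full file.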
