python - DataFrame duplicate removal line

Problem description

The code below produces the following Jupyter output:

         date  open  high   low  close  volume
0  1992-04-29  2.21  2.21  1.98   1.99       0
1  1992-04-29  2.21  2.21  1.98   1.98       0
2  1992-04-30  2.02  2.32  1.95   1.98       0
size: 6686
no duplicates? False
         date  open  high   low  close  volume
0  1992-04-29  2.21  2.21  1.98   1.99       0
1  1992-04-29  2.21  2.21  1.98   1.98       0
2  1992-04-30  2.02  2.32  1.95   1.98       0
no duplicates? False
size: 6686

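The output is identical before and after the call: the duplicated date is still there and the size is unchanged. What should I change in the line that removes the duplicates?

Thanks! Fskilnik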

checking = pd.DataFrame(df)

print(checking.head(3))

size2 = len(checking.index)
print('size:', size2)

print('no duplicates?', checking.date.is_unique)

checking.drop_duplicates(['date'], keep='last')

print(checking.head(3))

print('no duplicates?', checking.date.is_unique)

size2 = len(checking.index)
print('size:', size2)

Tags: python, pandas

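Solution

By default, drop_duplicates returns a new DataFrame and leaves the original untouched, so the result of your call is discarded. You should either add inplace=True to the drop_duplicates call or reassign the result to the DataFrame: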

checking.drop_duplicates(['date'], keep='last', inplace=True)

Or:

checking = checking.drop_duplicates(['date'], keep='last')
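
To see the difference in isolation, here is a minimal sketch on a small, made-up frame. The three rows and the `close` values are assumptions that only mimic the first rows of the output in the question; the real data has 6686 rows. It checks that the bare call changes nothing and that reassigning keeps the de-duplicated result.

```python
import pandas as pd

# Hypothetical sample mimicking the first rows shown in the question.
df = pd.DataFrame({
    "date": ["1992-04-29", "1992-04-29", "1992-04-30"],
    "close": [1.99, 1.98, 1.98],
})

# Without inplace=True or reassignment, the returned frame is thrown away
# and df keeps all of its rows.
df.drop_duplicates(["date"], keep="last")
unchanged_size = len(df.index)

# Reassigning stores the result; keep="last" retains the last row per date.
deduped = df.drop_duplicates(["date"], keep="last")
deduped_size = len(deduped.index)
is_unique = deduped["date"].is_unique
kept_close = deduped["close"].tolist()

print("size before:", unchanged_size)
print("size after:", deduped_size)
print("no duplicates?", is_unique)
```

Note that the surviving rows keep their original index labels (here 1 and 2), so head() will no longer start at 0. If a fresh 0-based index is wanted, chaining .reset_index(drop=True) after drop_duplicates is one option.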
