首页 > 解决方案 > 如何获取多索引数据帧第二级的最后 10 行?

问题描述

对于每个 1 级索引,我想只切掉多索引数据帧的第二级的最后 10 行。我已经阅读了 loc、iloc、slice 等,但似乎无法将所有东西放在一起来获得我需要的东西。

这是一些生成我的数据框的代码:

import pandas as pd
df = pd.DataFrame({'Id': {('A', Timestamp('2008-10-31 00:00:00')): 45846, ('A', Timestamp('2009-10-31 00:00:00')): 45846, ('A', Timestamp('2010-10-31 00:00:00')): 45846, ('A', Timestamp('2011-10-31 00:00:00')): 45846, ('A', Timestamp('2012-10-31 00:00:00')): 45846, ('A', Timestamp('2013-10-31 00:00:00')): 45846, ('A', Timestamp('2014-10-31 00:00:00')): 45846, ('A', Timestamp('2015-10-31 00:00:00')): 45846, ('A', Timestamp('2016-10-31 00:00:00')): 45846, ('A', Timestamp('2017-10-31 00:00:00')): 45846, ('A', Timestamp('2018-10-31 00:00:00')): 45846, ('A', Timestamp('2019-10-31 00:00:00')): 45846, ('A', Timestamp('2020-10-31 00:00:00')): 45846, ('AA', Timestamp('2015-12-31 00:00:00')): 367153, ('AA', Timestamp('2016-12-31 00:00:00')): 367153, ('AA', Timestamp('2017-12-31 00:00:00')): 367153, ('AA', Timestamp('2018-12-31 00:00:00')): 367153, ('AA', Timestamp('2019-12-31 00:00:00')): 367153, ('AA', Timestamp('2020-12-31 00:00:00')): 367153}, 'Currency': {('A', Timestamp('2008-10-31 00:00:00')): 'USD', ('A', Timestamp('2009-10-31 00:00:00')): 'USD', ('A', Timestamp('2010-10-31 00:00:00')): 'USD', ('A', Timestamp('2011-10-31 00:00:00')): 'USD', ('A', Timestamp('2012-10-31 00:00:00')): 'USD', ('A', Timestamp('2013-10-31 00:00:00')): 'USD', ('A', Timestamp('2014-10-31 00:00:00')): 'USD', ('A', Timestamp('2015-10-31 00:00:00')): 'USD', ('A', Timestamp('2016-10-31 00:00:00')): 'USD', ('A', Timestamp('2017-10-31 00:00:00')): 'USD', ('A', Timestamp('2018-10-31 00:00:00')): 'USD', ('A', Timestamp('2019-10-31 00:00:00')): 'USD', ('A', Timestamp('2020-10-31 00:00:00')): 'USD', ('AA', Timestamp('2015-12-31 00:00:00')): 'USD', ('AA', Timestamp('2016-12-31 00:00:00')): 'USD', ('AA', Timestamp('2017-12-31 00:00:00')): 'USD', ('AA', Timestamp('2018-12-31 00:00:00')): 'USD', ('AA', Timestamp('2019-12-31 00:00:00')): 'USD', ('AA', Timestamp('2020-12-31 00:00:00')): 'USD'}, 'Fiscal Year': {('A', Timestamp('2008-10-31 00:00:00')): 2008, ('A', Timestamp('2009-10-31 00:00:00')): 2009, ('A', Timestamp('2010-10-31 00:00:00')): 2010, ('A', Timestamp('2011-10-31 00:00:00')): 2011, ('A', Timestamp('2012-10-31 00:00:00')): 2012, ('A', Timestamp('2013-10-31 00:00:00')): 2013, ('A', Timestamp('2014-10-31 00:00:00')): 2014, ('A', Timestamp('2015-10-31 00:00:00')): 2015, ('A', Timestamp('2016-10-31 00:00:00')): 2016, ('A', Timestamp('2017-10-31 00:00:00')): 2017, ('A', Timestamp('2018-10-31 00:00:00')): 2018, ('A', Timestamp('2019-10-31 00:00:00')): 2019, ('A', Timestamp('2020-10-31 00:00:00')): 2020, ('AA', Timestamp('2015-12-31 00:00:00')): 2015, ('AA', Timestamp('2016-12-31 00:00:00')): 2016, ('AA', Timestamp('2017-12-31 00:00:00')): 2017, ('AA', Timestamp('2018-12-31 00:00:00')): 2018, ('AA', Timestamp('2019-12-31 00:00:00')): 2019, ('AA', Timestamp('2020-12-31 00:00:00')): 2020}, 'Fiscal Period': {('A', Timestamp('2008-10-31 00:00:00')): 'FY', ('A', Timestamp('2009-10-31 00:00:00')): 'FY', ('A', Timestamp('2010-10-31 00:00:00')): 'FY', ('A', Timestamp('2011-10-31 00:00:00')): 'FY', ('A', Timestamp('2012-10-31 00:00:00')): 'FY', ('A', Timestamp('2013-10-31 00:00:00')): 'FY', ('A', Timestamp('2014-10-31 00:00:00')): 'FY', ('A', Timestamp('2015-10-31 00:00:00')): 'FY', ('A', Timestamp('2016-10-31 00:00:00')): 'FY', ('A', Timestamp('2017-10-31 00:00:00')): 'FY', ('A', Timestamp('2018-10-31 00:00:00')): 'FY', ('A', Timestamp('2019-10-31 00:00:00')): 'FY', ('A', Timestamp('2020-10-31 00:00:00')): 'FY', ('AA', Timestamp('2015-12-31 00:00:00')): 'FY', ('AA', Timestamp('2016-12-31 00:00:00')): 'FY', ('AA', Timestamp('2017-12-31 00:00:00')): 'FY', ('AA', Timestamp('2018-12-31 00:00:00')): 'FY', ('AA', Timestamp('2019-12-31 00:00:00')): 'FY', ('AA', Timestamp('2020-12-31 00:00:00')): 'FY'}, 'Publish Date': {('A', Timestamp('2008-10-31 00:00:00')): Timestamp('2008-12-19 00:00:00'), ('A', Timestamp('2009-10-31 00:00:00')): Timestamp('2009-12-21 00:00:00'), ('A', Timestamp('2010-10-31 00:00:00')): Timestamp('2010-12-20 00:00:00'), ('A', Timestamp('2011-10-31 00:00:00')): Timestamp('2011-12-16 00:00:00'), ('A', Timestamp('2012-10-31 00:00:00')): Timestamp('2012-12-20 00:00:00'), ('A', Timestamp('2013-10-31 00:00:00')): Timestamp('2013-12-19 00:00:00'), ('A', Timestamp('2014-10-31 00:00:00')): Timestamp('2014-12-22 00:00:00'), ('A', Timestamp('2015-10-31 00:00:00')): Timestamp('2015-12-21 00:00:00'), ('A', Timestamp('2016-10-31 00:00:00')): Timestamp('2016-12-20 00:00:00'), ('A', Timestamp('2017-10-31 00:00:00')): Timestamp('2017-12-21 00:00:00'), ('A', Timestamp('2018-10-31 00:00:00')): Timestamp('2018-12-20 00:00:00'), ('A', Timestamp('2019-10-31 00:00:00')): Timestamp('2019-12-19 00:00:00'), ('A', Timestamp('2020-10-31 00:00:00')): Timestamp('2020-12-18 00:00:00'), ('AA', Timestamp('2015-12-31 00:00:00')): Timestamp('2016-03-02 00:00:00'), ('AA', Timestamp('2016-12-31 00:00:00')): Timestamp('2017-02-03 00:00:00'), ('AA', Timestamp('2017-12-31 00:00:00')): Timestamp('2018-02-26 00:00:00'), ('AA', Timestamp('2018-12-31 00:00:00')): Timestamp('2019-02-26 00:00:00'), ('AA', Timestamp('2019-12-31 00:00:00')): Timestamp('2020-02-21 00:00:00'), ('AA', Timestamp('2020-12-31 00:00:00')): Timestamp('2021-02-25 00:00:00')}, 'Restated Date': {('A', Timestamp('2008-10-31 00:00:00')): Timestamp('2010-12-20 00:00:00'), ('A', Timestamp('2009-10-31 00:00:00')): Timestamp('2011-12-16 00:00:00'), ('A', Timestamp('2010-10-31 00:00:00')): Timestamp('2012-12-20 00:00:00'), ('A', Timestamp('2011-10-31 00:00:00')): Timestamp('2013-12-19 00:00:00'), ('A', Timestamp('2012-10-31 00:00:00')): Timestamp('2014-12-22 00:00:00'), ('A', Timestamp('2013-10-31 00:00:00')): Timestamp('2015-12-21 00:00:00'), ('A', Timestamp('2014-10-31 00:00:00')): Timestamp('2016-12-20 00:00:00'), ('A', Timestamp('2015-10-31 00:00:00')): Timestamp('2017-12-21 00:00:00'), ('A', Timestamp('2016-10-31 00:00:00')): Timestamp('2018-12-20 00:00:00'), ('A', Timestamp('2017-10-31 00:00:00')): Timestamp('2018-12-20 00:00:00'), ('A', Timestamp('2018-10-31 00:00:00')): Timestamp('2018-12-20 00:00:00'), ('A', Timestamp('2019-10-31 00:00:00')): Timestamp('2019-12-19 00:00:00'), ('A', Timestamp('2020-10-31 00:00:00')): Timestamp('2020-12-18 00:00:00'), ('AA', Timestamp('2015-12-31 00:00:00')): Timestamp('2017-03-15 00:00:00'), ('AA', Timestamp('2016-12-31 00:00:00')): Timestamp('2017-03-15 00:00:00'), ('AA', Timestamp('2017-12-31 00:00:00')): Timestamp('2018-02-26 00:00:00'), ('AA', Timestamp('2018-12-31 00:00:00')): Timestamp('2019-02-26 00:00:00'), ('AA', Timestamp('2019-12-31 00:00:00')): Timestamp('2020-02-21 00:00:00'), ('AA', Timestamp('2020-12-31 00:00:00')): Timestamp('2021-02-25 00:00:00')}})

应该产生这个:

                        Id      Currency  Fiscal Year Fiscal Period Publish Date Restated Date
Ticker Report Date                                                                         
A      2008-10-31      45846      USD         2008            FY   2008-12-19    2010-12-20
       2009-10-31      45846      USD         2009            FY   2009-12-21    2011-12-16
       2010-10-31      45846      USD         2010            FY   2010-12-20    2012-12-20
       2011-10-31      45846      USD         2011            FY   2011-12-16    2013-12-19
       2012-10-31      45846      USD         2012            FY   2012-12-20    2014-12-22
       2013-10-31      45846      USD         2013            FY   2013-12-19    2015-12-21
       2014-10-31      45846      USD         2014            FY   2014-12-22    2016-12-20
       2015-10-31      45846      USD         2015            FY   2015-12-21    2017-12-21
       2016-10-31      45846      USD         2016            FY   2016-12-20    2018-12-20
       2017-10-31      45846      USD         2017            FY   2017-12-21    2018-12-20
       2018-10-31      45846      USD         2018            FY   2018-12-20    2018-12-20
       2019-10-31      45846      USD         2019            FY   2019-12-19    2019-12-19
       2020-10-31      45846      USD         2020            FY   2020-12-18    2020-12-18
AA     2015-12-31     367153      USD         2015            FY   2016-03-02    2017-03-15
       2016-12-31     367153      USD         2016            FY   2017-02-03    2017-03-15
       2017-12-31     367153      USD         2017            FY   2018-02-26    2018-02-26
       2018-12-31     367153      USD         2018            FY   2019-02-26    2019-02-26
       2019-12-31     367153      USD         2019            FY   2020-02-21    2020-02-21
       2020-12-31     367153      USD         2020            FY   2021-02-25    2021-02-25

切片后我想要的是:

                        Id      Currency  Fiscal Year Fiscal Period Publish Date Restated Date
Ticker Report Date                                                                         
A      2011-10-31      45846      USD         2011            FY   2011-12-16    2013-12-19
       2012-10-31      45846      USD         2012            FY   2012-12-20    2014-12-22
       2013-10-31      45846      USD         2013            FY   2013-12-19    2015-12-21
       2014-10-31      45846      USD         2014            FY   2014-12-22    2016-12-20
       2015-10-31      45846      USD         2015            FY   2015-12-21    2017-12-21
       2016-10-31      45846      USD         2016            FY   2016-12-20    2018-12-20
       2017-10-31      45846      USD         2017            FY   2017-12-21    2018-12-20
       2018-10-31      45846      USD         2018            FY   2018-12-20    2018-12-20
       2019-10-31      45846      USD         2019            FY   2019-12-19    2019-12-19
       2020-10-31      45846      USD         2020            FY   2020-12-18    2020-12-18
AA     2015-12-31     367153      USD         2015            FY   2016-03-02    2017-03-15
       2016-12-31     367153      USD         2016            FY   2017-02-03    2017-03-15
       2017-12-31     367153      USD         2017            FY   2018-02-26    2018-02-26
       2018-12-31     367153      USD         2018            FY   2019-02-26    2019-02-26
       2019-12-31     367153      USD         2019            FY   2020-02-21    2020-02-21
       2020-12-31     367153      USD         2020            FY   2021-02-25    2021-02-25

标签: pythonpandasmulti-index

解决方案


我们可以groupby tail用来获取n相对于的最后一个元素level=0

filtered_df = df.groupby(level=0).tail(10)

*请注意,如果要分配值以避免在子集框架时SetWithCopyWarning需要:copy

filtered_df = df.groupby(level=0).tail(10).copy()

filtered_df

                   Id Currency  ...  Publish Date Restated Date
A  2011-10-31   45846      USD  ...    2011-12-16    2013-12-19
   2012-10-31   45846      USD  ...    2012-12-20    2014-12-22
   2013-10-31   45846      USD  ...    2013-12-19    2015-12-21
   2014-10-31   45846      USD  ...    2014-12-22    2016-12-20
   2015-10-31   45846      USD  ...    2015-12-21    2017-12-21
   2016-10-31   45846      USD  ...    2016-12-20    2018-12-20
   2017-10-31   45846      USD  ...    2017-12-21    2018-12-20
   2018-10-31   45846      USD  ...    2018-12-20    2018-12-20
   2019-10-31   45846      USD  ...    2019-12-19    2019-12-19
   2020-10-31   45846      USD  ...    2020-12-18    2020-12-18
AA 2015-12-31  367153      USD  ...    2016-03-02    2017-03-15
   2016-12-31  367153      USD  ...    2017-02-03    2017-03-15
   2017-12-31  367153      USD  ...    2018-02-26    2018-02-26
   2018-12-31  367153      USD  ...    2019-02-26    2019-02-26
   2019-12-31  367153      USD  ...    2020-02-21    2020-02-21
   2020-12-31  367153      USD  ...    2021-02-25    2021-02-25

[16 rows x 6 columns]

推荐阅读