How do I get the previous year's median price?

Problem description

Given the following data, how can I get the previous year's median squaremeterPrice for each city?

  city_code createdYear squaremeterPrice squaremeterPrice_grouped_city_for_the_current_year
0   26      2014            33273        39632.0
1   26      2014            37500        39632.0
2   26      2014            47428        39632.0
3   26      2014            39554        39632.0
4   26      2014            38893        39632.0
5   26      2013            34231        28841.0
6   26      2014            34344        39632.0
7   26      2014            44574        39632.0
8   26      2014            25202        39632.0
9   26      2014            39632        39632.0
10  26      2014            44504        39632.0
11  26      2013            23451        28841.0
...

To get squaremeterPrice_grouped_city_for_the_current_year I used the code below:

# adding the median sqm price per city (note: grouped by city_code only, not by year)
median_squaremeterPrice_per_city = df.groupby(["city_code"])["squaremeterPrice"].median().to_frame("squaremeterPrice_grouped_city_for_the_current_year").reset_index()
df = df.merge(median_squaremeterPrice_per_city, left_on=["city_code"], right_on=["city_code"])
df

The expected output is:

  city_code createdYear squaremeterPrice squaremeterPrice_grouped_city_for_the_current_year   squaremeterPrice_grouped_city_for_1_year_prior
0   26      2014            33273        39632.0      28841.0
1   26      2014            37500        39632.0      28841.0
2   26      2014            47428        39632.0      28841.0
3   26      2014            39554        39632.0      28841.0
4   26      2014            38893        39632.0      28841.0
5   26      2013            34231        28841.0      whatever was the 2012 price
6   26      2014            34344        39632.0      28841.0
7   26      2014            44574        39632.0      28841.0
8   26      2014            25202        39632.0      28841.0
9   26      2014            39632        39632.0      28841.0
10  26      2014            44504        39632.0      28841.0
11  26      2013            23451        28841.0      whatever was the 2012 price
... 

Tags: python, pandas, date, group-by

Solution


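Instead of grouping by city_code alone, group by both columns, city_code and createdYear, and take the median. For the previous year, add 1 to the year level of the resulting MultiIndex, so that each year's median is keyed under the following year (the 2013 median is looked up by 2014 rows). Finally, use DataFrame.join to attach both Series as new columns: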

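# Assumes df holds only city_code, createdYear and squaremeterPrice;
# join raises ValueError if a column with the same name already exists.

# Median per (city_code, createdYear): a Series with a two-level MultiIndex.
median_squaremeterPrice_per_city_and_year = (df.groupby(["city_code", "createdYear"])["squaremeterPrice"]
                                               .median()
                                               .rename('squaremeterPrice_grouped_city_for_the_current_year'))
# Same values, but with 1 added to the year level, so year Y holds the median of Y - 1.
median_squaremeterPrice_per_city_and__prev_year = (median_squaremeterPrice_per_city_and_year
                                                   .rename(lambda x: x + 1, level=1)
                                                   .rename('squaremeterPrice_grouped_city_for_the_prev_year'))

print(median_squaremeterPrice_per_city_and__prev_year)

# Join both Series on the key columns; years with no prior-year data get NaN.
df1 = (df.join(median_squaremeterPrice_per_city_and_year, on=['city_code', 'createdYear'])
         .join(median_squaremeterPrice_per_city_and__prev_year, on=['city_code', 'createdYear']))

print(df1)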
    city_code  createdYear  squaremeterPrice  \
0          26         2014             33273   
1          26         2014             37500   
2          26         2014             47428   
3          26         2014             39554   
4          26         2014             38893   
5          26         2013             34231   
6          26         2014             34344   
7          26         2014             44574   
8          26         2014             25202   
9          26         2014             39632   
10         26         2014             44504   
11         26         2013             23451   

    squaremeterPrice_grouped_city_for_the_current_year  \
0                                             39223.5    
1                                             39223.5    
2                                             39223.5    
3                                             39223.5    
4                                             39223.5    
5                                             28841.0    
6                                             39223.5    
7                                             39223.5    
8                                             39223.5    
9                                             39223.5    
10                                            39223.5    
11                                            28841.0    

    squaremeterPrice_grouped_city_for_the_prev_year  
0                                           28841.0  
1                                           28841.0  
2                                           28841.0  
3                                           28841.0  
4                                           28841.0  
5                                               NaN  
6                                           28841.0  
7                                           28841.0  
8                                           28841.0  
9                                           28841.0  
10                                          28841.0  
11                                              NaN  
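
The same shifted-year idea can also be written with a plain left merge instead of a MultiIndex rename. The following is only a minimal sketch under stated assumptions: the helper name add_prev_year_median is hypothetical (it is not part of pandas or of the answer above), and the sample frame is rebuilt from the twelve rows shown in the question.

```python
import pandas as pd

# Sample rows copied from the question (city 26, years 2013 and 2014).
df = pd.DataFrame({
    "city_code": [26] * 12,
    "createdYear": [2014, 2014, 2014, 2014, 2014, 2013,
                    2014, 2014, 2014, 2014, 2014, 2013],
    "squaremeterPrice": [33273, 37500, 47428, 39554, 38893, 34231,
                         34344, 44574, 25202, 39632, 44504, 23451],
})

PRIOR_COL = "squaremeterPrice_grouped_city_for_1_year_prior"


def add_prev_year_median(frame, prior_col=PRIOR_COL):
    """Hypothetical helper: attach the previous calendar year's median per city."""
    # One row per (city_code, createdYear) holding that year's median price.
    yearly = (frame.groupby(["city_code", "createdYear"], as_index=False)["squaremeterPrice"]
                   .median()
                   .rename(columns={"squaremeterPrice": prior_col}))
    # Shift every median forward one year, so the 2013 median is keyed as 2014.
    yearly["createdYear"] += 1
    # A left merge keeps every original row; rows without a prior year get NaN.
    return frame.merge(yearly, on=["city_code", "createdYear"], how="left")


result = add_prev_year_median(df)
print(result)
```

Because the lookup is by calendar year, a city with no rows for the year immediately before gets NaN rather than the median of the nearest earlier year. If the nearest earlier year is wanted instead, a different approach would be needed.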
