Reformat a DataFrame column so that any numeric month substring is replaced with a month-name string

Problem description

I want to reformat a string column that is causing errors in Django. My df:

import pandas as pd
# Year and month strings in yyyy_mm format
data = {'Date_Str': ['2018_11', '2018_12', '2019_01', '2019_02', '2019_03', '2019_04', '2019_05', '2019_06', '2019_07', '2019_08', '2019_09', '2019_10']}
df = pd.DataFrame(data)
print(df)

   Date_Str
0   2018_11
1   2018_12
2   2019_01
3   2019_02
4   2019_03
5   2019_04
6   2019_05
7   2019_06
8   2019_07
9   2019_08
10  2019_09
11  2019_10

My solution:

df['Date_Month'] = df.Date_Str.str[-2:]
mapper = {'01':'Jan', '02':'Feb', '03':'Mar','04':'Apr','05':'May','06':'Jun','07':'Jul','08':'Aug','09':'Sep','10':'Oct','11':'Nov','12':'Dec'}
df['Date_Month_Str'] = df.Date_Str.str[0:4] + '_' + df.Date_Month.map(mapper)

print(df)

The desired output is the column Date_Month_Str, or simply Date_Str updated in place to the yyyy_mmm format:

   Date_Str Date_Month Date_Month_Str
0   2018_11         11       2018_Nov
1   2018_12         12       2018_Dec
2   2019_01         01       2019_Jan
3   2019_02         02       2019_Feb
4   2019_03         03       2019_Mar
5   2019_04         04       2019_Apr
6   2019_05         05       2019_May
7   2019_06         06       2019_Jun
8   2019_07         07       2019_Jul
9   2019_08         08       2019_Aug
10  2019_09         09       2019_Sep
11  2019_10         10       2019_Oct

Can the three lines be reduced to one? Or can Date_Str simply be updated with a one-liner?

Tags: python, pandas, dataframe

Solution


Convert the column to datetimes with pd.to_datetime, then format it back to strings with Series.dt.strftime:

df['Date_Month_Str'] = pd.to_datetime(df.Date_Str, format='%Y_%m').dt.strftime('%Y_%b')
print(df)
   Date_Str Date_Month_Str
0   2018_11       2018_Nov
1   2018_12       2018_Dec
2   2019_01       2019_Jan
3   2019_02       2019_Feb
4   2019_03       2019_Mar
5   2019_04       2019_Apr
6   2019_05       2019_May
7   2019_06       2019_Jun
8   2019_07       2019_Jul
9   2019_08       2019_Aug
10  2019_09       2019_Sep
11  2019_10       2019_Oct
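
If the goal is to overwrite Date_Str itself, one possible variant is sketched below. Note that %b in strftime produces the month abbreviation of the current locale, so on a non-English system the result may not be Jan, Feb, and so on. The sketch sidesteps that by reusing the explicit mapping from the question inside Series.str.replace with a callable replacement. The names MONTH_ABBR and to_month_abbr are hypothetical and used only for illustration, and the sketch assumes every value ends in an underscore followed by a two-digit month.

```python
import pandas as pd

# Hypothetical lookup table: fixed English abbreviations, independent of locale
MONTH_ABBR = {'01': 'Jan', '02': 'Feb', '03': 'Mar', '04': 'Apr',
              '05': 'May', '06': 'Jun', '07': 'Jul', '08': 'Aug',
              '09': 'Sep', '10': 'Oct', '11': 'Nov', '12': 'Dec'}

def to_month_abbr(match):
    """Replace a trailing '_mm' with '_Mon' using the fixed mapping."""
    return '_' + MONTH_ABBR[match.group(1)]

df = pd.DataFrame({'Date_Str': ['2018_11', '2018_12', '2019_01', '2019_02']})

# Overwrite Date_Str in place; a callable replacement requires regex=True
df['Date_Str'] = df['Date_Str'].str.replace(r'_(\d{2})$', to_month_abbr, regex=True)
print(df)
```

Unlike pd.to_datetime, this variant does not validate the dates. A month outside 01 to 12 raises a KeyError from the lookup, and strings that do not match the pattern are left unchanged.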
