python - Transposing pandas columns that contain duplicate values
Problem description
I have a data frame like the one shown below:
import pandas as pd
df1 = pd.DataFrame({'Gender':['Male','Male','Male','Male','Female','Female','Female','Female','Male','Male','Male','Male','Female','Female','Female','Female'],
'Year' :[2008,2008,2009,2009,2008,2008,2009,2009,2008,2008,2009,2009,2008,2008,2009,2009],
'rate':[2.3,3.2,4.5,6.7,5.6,3.2,3.5,2.6,2.3,3.2,4.5,6.7,5.6,3.2,3.5,2.6],
'Heading':['TNMAB123','TNMAB123','TNMAB123','TNMAB123','TNMAB123','TNMAB123','TNMAB123','TNMAB123',
'TNMAB456','TNMAB456','TNMAB456','TNMAB456','TNMAB456','TNMAB456','TNMAB456','TNMAB456'],
'target':[31.2,33.4,33.4,35.2,35.2,36.4,36.4,37.2,31.2,33.4,33.4,35.2,35.2,36.4,36.4,37.2],
'day_type':['wk','wkend','wk','wkend','wk','wkend','wk','wkend','wk','wkend','wk','wkend','wk','wkend','wk','wkend']})
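As you can see, every column contains duplicate values.
I would like to transpose/pivot them to get output like the one shown below. I tried the following, but it did not work.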
df1.pivot(index='Year', columns='Heading', values='rate')
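I would like my output to look like the following, with each year as a row and all the corresponding entries for that year as columns.
Note that I have not filled in the values, because the column structure of the table is what matters.
Can you help me?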
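Solution
You can try this. pivot_table aggregates duplicate entries (taking the mean by default), so it works where pivot fails. You can then use df.unstack() to move Gender into the columns, and flatten the resulting MultiIndex columns with join.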
df1 = df1.pivot_table(index=['Year','Gender'],columns='Heading',values='rate').unstack()
df1.columns = ['_'.join(i) for i in df1.columns.tolist()]
df1
      TDAS3_Female  TDAS3_Male  TNMAB123_Female  TNMAB123_Male  TSAD4_Female  TSAD4_Male  TWQE2_Female  TWQE2_Male
Year
2008           NaN         NaN              6.3            2.3           NaN         NaN           NaN         NaN
2009           NaN         NaN              7.1            3.2           NaN         NaN           2.1         4.5
2010           5.3         5.6              NaN            NaN           NaN         NaN           4.2         6.7
2011           3.6         3.2              NaN            NaN           2.9         3.5           NaN         NaN
2012           NaN         NaN              NaN            NaN           6.2         2.6           NaN         NaN
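The output above appears to come from a different sample than the one posted in the question. As a rough sketch of the same approach on the question's own data (trimmed to the columns that are used, and built with list repetition for brevity), the following shows pivot failing on the duplicates and pivot_table succeeding. The variable names `wide` and `pivot_failed` are made up for illustration, and `aggfunc='mean'` is spelled out only to make the default aggregation visible.

```python
import pandas as pd

# Same values as the question's df1, restricted to the columns used here.
df1 = pd.DataFrame({
    'Gender': (['Male'] * 4 + ['Female'] * 4) * 2,
    'Year': [2008, 2008, 2009, 2009] * 4,
    'rate': [2.3, 3.2, 4.5, 6.7, 5.6, 3.2, 3.5, 2.6] * 2,
    'Heading': ['TNMAB123'] * 8 + ['TNMAB456'] * 8,
})

# pivot cannot reshape when an (index, columns) pair occurs more than once.
try:
    df1.pivot(index='Year', columns='Heading', values='rate')
    pivot_failed = False
except ValueError as exc:
    pivot_failed = True
    print('pivot failed:', exc)

# pivot_table aggregates the duplicates instead (mean is the default).
wide = df1.pivot_table(index=['Year', 'Gender'], columns='Heading',
                       values='rate', aggfunc='mean')

# unstack() moves the innermost index level (Gender) into the columns,
# giving MultiIndex columns of (Heading, Gender) tuples.
wide = wide.unstack()

# Flatten each (Heading, Gender) tuple into a single string label.
wide.columns = ['_'.join(col) for col in wide.columns]
print(wide)
```

Note that the duplicate rates are averaged here, so each cell holds one value per Year, Heading and Gender. If a different summary is wanted, another `aggfunc` (for example `'sum'` or `'first'`) can be passed instead.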
There are several ways to flatten a MultiIndex into a single level: iterate over df.columns directly, use df.columns.tolist(), or use pd.MultiIndex.to_flat_index().
['_'.join(i) for i in df1.columns.tolist()]
['_'.join(i) for i in df1.columns]
['_'.join(i) for i in df1.columns.to_flat_index()]
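A small self-contained check, assuming a toy frame with two-level string columns, suggests that the three variants produce the same labels. The frame and the names `via_tolist`, `via_iter`, `via_flat` and `int_cols` are illustrative only. The last lines cover a case the snippets above do not handle: `str.join` accepts only strings, so a level holding integers (such as years) needs converting first.

```python
import pandas as pd

# Toy frame with (Heading, Gender) MultiIndex columns.
columns = pd.MultiIndex.from_product(
    [['TNMAB123', 'TNMAB456'], ['Female', 'Male']],
    names=['Heading', 'Gender'],
)
df = pd.DataFrame([[4.4, 2.75, 4.4, 2.75]], index=[2008], columns=columns)

# Each approach yields the column labels as tuples, joined with '_'.
via_tolist = ['_'.join(col) for col in df.columns.tolist()]
via_iter = ['_'.join(col) for col in df.columns]
via_flat = ['_'.join(col) for col in df.columns.to_flat_index()]
print(via_tolist)

# With a non-string level, convert each element to str before joining.
int_cols = pd.MultiIndex.from_product([['rate'], [2008, 2009]])
safe_labels = ['_'.join(map(str, col)) for col in int_cols]
print(safe_labels)
```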