No plot for multiple nominal attributes in a for loop with Python

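Problem description

I am currently clustering categorical attributes and want to visualize them as relative frequencies with dexplot. The attributes come from Kaggle's Bank Marketing dataset, from which I created three clusters (kmodes).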

import pandas as pd
from kmodes.kmodes import KModes

# Fit k-modes with Cao initialization and get one cluster label per row
km_cao = KModes(n_clusters=choosed_clusters, init="Cao", n_init=5, verbose=0)
fitClusters_cao = km_cao.fit_predict(df)
clusterDf = pd.DataFrame(fitClusters_cao)

# Attach the labels to the original attributes
clusterDf.columns = ['Cluster']
combinedDf = pd.concat([df, clusterDf], axis=1)

# One frame per cluster, selected from the combined frame
cluster_0 = combinedDf[combinedDf['Cluster'] == 0]
cluster_1 = combinedDf[combinedDf['Cluster'] == 1]
cluster_2 = combinedDf[combinedDf['Cluster'] == 2]
cluster_0.head()

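Output:

[Image: output for cluster 0]

I have run into the following problem with the visualization: I can use dexplot to show the relative frequency of a single attribute (e.g. State), but it does not work when I try to output all attributes. I do not get an error, but I do not get a plot either.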

import dexplot as dxp 
# plot for one attribute - Ok
dxp.count(val='State', data=cluster_0, normalize=True, orientation='v', title='Relative Frequency by Cluster 0')
# All attributes with no plot
for col in cluster_0:
   dxp.count(val=col, data=cluster_0, normalize=True, orientation='v', title='Relative Frequency by Cluster 0')

Output for one attribute: [Image: plot for one attribute]

Tags: python, cluster-analysis, visualization, categorical-data

Solution
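
A likely cause, assuming `dxp.count` returns a matplotlib Figure rather than drawing through pyplot, is that a notebook only renders the value of the last expression in a cell. Figures returned inside a `for` loop are therefore created and then discarded. If that assumption holds, wrapping each call in `IPython.display.display(...)` inside the loop may be enough. The sketch below sidesteps dexplot and uses plain pandas and matplotlib instead, which is a swapped-in technique and not the asker's method: it draws one relative-frequency bar chart per attribute onto a single figure and returns it. The function name `plot_relative_frequencies` and the `demo_cluster` frame are hypothetical stand-ins for illustration.

```python
import pandas as pd
from matplotlib.figure import Figure


def plot_relative_frequencies(cluster_df, columns, title):
    """Draw one relative-frequency bar chart per column on a single figure."""
    fig = Figure(figsize=(5, 3 * len(columns)))
    # squeeze=False always yields a 2-D array, even for a single column
    axes = fig.subplots(nrows=len(columns), ncols=1, squeeze=False)[:, 0]
    for ax, col in zip(axes, columns):
        # normalize=True turns counts into shares that sum to 1
        freqs = cluster_df[col].value_counts(normalize=True).sort_index()
        ax.bar(list(freqs.index.astype(str)), freqs.to_numpy())
        ax.set_title(f"{title}: {col}")
        ax.set_ylabel("relative frequency")
    fig.subplots_adjust(hspace=0.6)
    return fig


# Hypothetical stand-in for cluster_0; the real frame comes from the k-modes step.
demo_cluster = pd.DataFrame({
    "State": ["A", "A", "B", "C"],
    "Job": ["x", "y", "y", "y"],
    "Cluster": [0, 0, 0, 0],
})
# Skip the label column so that only the attributes are plotted
attribute_cols = [c for c in demo_cluster.columns if c != "Cluster"]
fig = plot_relative_frequencies(demo_cluster, attribute_cols, "Cluster 0")
```

In a notebook, ending the cell with `fig` or calling `display(fig)` should render the result; in a script, `fig.savefig(...)` writes it to disk. Putting every attribute on one figure means there is only one object to display, so nothing gets lost inside the loop.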

