How to plot a 3-D graph with tooltips for clusters obtained through K-means

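Problem description

I am trying to inspect the data points in my 3-D clusters. To do that, I need a plot that shows a tooltip when I hover the mouse over a point. I would also like each cluster to have a different color.

This is what I did to get the clusters: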

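from sklearn.cluster import KMeans

# df is the DataFrame shown below
km = KMeans(n_clusters=3, random_state=5)

# Name the feature matrix X so that it matches the fit_predict call
X = df[['Event ID', 'Case ID', 'Num_Resource']]
y_clusters = km.fit_predict(X)  # one cluster label (0 to 2) per row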

My data table looks like this:

    Case ID Event ID    dd-MM-yyyy:HH.mm    Activity           Resource   Costs  Num_Resource
 0    1     35654423    30-12-2010:11.02    register request    Pete       50     0
 1    1     35654424    31-12-2010:10.06    examine thoroughly  Sue        400    1
 2    1     35654425    05-01-2011:15.12    check ticket        Mike       100    2
 3    1     35654426    06-01-2011:11.18    decide              Sara       200    3
 4    1     35654427    07-01-2011:14.24    reject request      Pete       200    0
 5    2     35654483    30-12-2010:11.32    register request    Mike       50     2
 6    2     35654485    30-12-2010:12.12    check ticket        Mike       100    2
 7    2     35654487    30-12-2010:14.16    examine casually    Sean       400    4
 8    2     35654488    05-01-2011:11.22    decide              Sara       200    3
 9    2     35654489    08-01-2011:12.05    pay compensation    Ellen      200    5

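I want Event ID, Activity and Costs to appear in the tooltip. Also, I converted the Resource column to numbers before applying K-Means, and I would like the plot to show the actual Resource value that corresponds to each number.

To achieve this, I tried using code from an answer on SO -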

# 3d scatterplot using matplotlib

fig = plt.figure(figsize = (15,15))
ax = fig.add_subplot(111, projection='3d')
ax.scatter(X[y_clusters == 0,0],X[y_clusters == 0,1],X[y_clusters == 0,2], s = 40 , color = 'blue', label = "cluster 0")
ax.scatter(X[y_clusters == 1,0],X[y_clusters == 1,1],X[y_clusters == 1,2], s = 40 , color = 'orange', label = "cluster 1")
ax.scatter(X[y_clusters == 2,0],X[y_clusters == 2,1],X[y_clusters == 2,2], s = 40 , color = 'green', label = "cluster 2")
ax.scatter(X[y_clusters == 3,0],X[y_clusters == 3,1],X[y_clusters == 3,2], s = 40 , color = '#D12B60', label = "cluster 3")
ax.scatter(X[y_clusters == 4,0],X[y_clusters == 4,1],X[y_clusters == 4,2], s = 40 , color = 'purple', label = "cluster 4")
ax.set_xlabel('Age of a customer-->')
ax.set_ylabel('Anual Income-->')
ax.set_zlabel('Spending Score-->')
ax.legend()
plt.show()

But I get an error:

TypeError                                 Traceback (most recent call last)
<ipython-input-34-d86f593f897b> in <module>()
      3 fig = plt.figure(figsize = (15,15))
      4 ax = fig.add_subplot(111, projection='3d')
----> 5 ax.scatter(X[y_clusters == 0,0],X[y_clusters == 0,1],X[y_clusters == 0,2], s = 40 , color = 'blue', label = "cluster 0")
      6 ax.scatter(X[y_clusters == 1,0],X[y_clusters == 1,1],X[y_clusters == 1,2], s = 40 , color = 'orange', label = "cluster 1")
      7 ax.scatter(X[y_clusters == 2,0],X[y_clusters == 2,1],X[y_clusters == 2,2], s = 40 , color = 'green', label = "cluster 2")

1 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2896             casted_key = self._maybe_cast_indexer(key)
   2897             try:
-> 2898                 return self._engine.get_loc(casted_key)
   2899             except KeyError as err:
   2900                 raise KeyError(key) from err

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

TypeError: '(array([False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False, False, False, False,
       False, False, False, False, False, False,  True,  True,  True,
        True,  True,  True,  True,  True,  True,  True,  True,  True,
        True, False, False, False, False, False]), 0)' is an invalid key

It would be great if someone could help me.

Thanks in advance.

PS Edit: I can now plot the figure. The code is here:

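import matplotlib.pyplot as plt

fig = plt.figure(figsize=(15, 15))
ax = fig.add_subplot(111, projection='3d')

# Color each point by its cluster label
scatter = ax.scatter(df['Num_Resource'], df['Case ID'], df['Event ID'],
                     c=y_clusters, s=20, cmap='winter')

ax.set_title('K-Means Clustering')
ax.set_xlabel('Num_Resource')
ax.set_ylabel('Case ID')
ax.set_zlabel('Event ID')
# A bare ax.legend() finds no labeled artists here, so build the
# legend entries from the cluster colors instead
ax.legend(*scatter.legend_elements(), title='Cluster')
plt.show()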

But now, I would be glad if someone could help me with the tooltips.

Tags: python-3.x, data-visualization, k-means

Solution
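
The TypeError in the question most likely comes from indexing a pandas DataFrame with NumPy-style syntax: `X[mask, 0]` asks the DataFrame for a column whose key is the tuple `(mask, 0)`, which is not a valid key. The sketch below shows two ways around it. It uses a toy stand-in for `df` built from a few rows of the table in the question, and the `y_clusters` array is a hypothetical placeholder for the output of `km.fit_predict(X)`.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the question's df (a few rows from the table).
df = pd.DataFrame({
    "Event ID": [35654423, 35654424, 35654425, 35654483, 35654485, 35654487],
    "Case ID": [1, 1, 1, 2, 2, 2],
    "Num_Resource": [0, 1, 2, 2, 2, 4],
})
X = df[["Event ID", "Case ID", "Num_Resource"]]

# Hypothetical labels standing in for km.fit_predict(X).
y_clusters = np.array([0, 0, 0, 1, 1, 1])

# Boolean mask selecting the rows of cluster 0.
mask = y_clusters == 0

# Option 1: convert to a NumPy array, after which [mask, column] is valid.
X_arr = X.to_numpy()
first_col_np = X_arr[mask, 0]

# Option 2: stay in pandas and use positional indexing with .iloc.
first_col_pd = X.iloc[mask, 0]
```

With either option, the per-cluster calls from the question become valid, for example `ax.scatter(X_arr[mask, 0], X_arr[mask, 1], X_arr[mask, 2], ...)`.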

