How do I inverse-scale an entire DataFrame?

Problem description

When I scale a single feature matrix and keep the scaler object, reversing the process is easy. For example:

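from sklearn.preprocessing import StandardScaler

sc_X = StandardScaler()
X_train = sc_X.fit_transform(X_train)        # scale, keeping the fitted scaler in sc_X
reverse = sc_X.inverse_transform(X_train)    # undo the scaling with the same scaler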
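
To make the round trip concrete, here is a minimal sketch with a small made-up array (the values are illustrative assumptions, not data from the question). It checks that inverse_transform recovers the input up to floating-point error, which only works because the fitted scaler sc_X is still available.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up training data: two features on very different scales (assumption).
X_train = np.array([[1.0, 200.0],
                    [2.0, 300.0],
                    [3.0, 400.0]])

sc_X = StandardScaler()
X_scaled = sc_X.fit_transform(X_train)     # each column now has zero mean, unit variance
X_back = sc_X.inverse_transform(X_scaled)  # uses the mean_ and scale_ stored on sc_X

# True when the round trip reproduces the original values.
print(np.allclose(X_back, X_train))
```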

When I need to scale a whole DataFrame, it is easy to do in a single line:

dfv = StandardScaler().fit_transform(df)

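But how do I reverse this? Specifically, how should I change the code below so that the chart shows the data on its original scale while keeping the correct cluster colours? Part of the difficulty is that dfv is no longer a DataFrame, since scaling appears to turn it into an array.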

import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

# Model
kmeans = KMeans(n_clusters=5, init='k-means++', random_state=15)
y_kmeans = kmeans.fit_predict(dfv)
# 2D visualisation after clustering
X = dfv
fig = plt.figure(figsize=(15, 5))
ax = fig.gca()
ax.grid(which='major', linestyle='-', linewidth=0.5, color='white')
ax.set_facecolor((0.898, 0.898, 0.898))

plt.scatter(X[y_kmeans == 0, 0], X[y_kmeans == 0, 1], s = 20, c = 'red', label = 'Cluster1')
plt.scatter(X[y_kmeans == 1, 0], X[y_kmeans == 1, 1], s = 20, c = 'blue', label = 'Cluster2')
plt.scatter(X[y_kmeans == 2, 0], X[y_kmeans == 2, 1], s = 20, c = 'limegreen', label = 'Cluster3')
plt.scatter(X[y_kmeans == 3, 0], X[y_kmeans == 3, 1], s = 20, c = 'magenta', label = 'Cluster4')
plt.scatter(X[y_kmeans == 4, 0], X[y_kmeans == 4, 1], s = 20, c = 'blueviolet', label = 'Cluster5')

ax.set_title('Features after clustering')
ax.set_xlabel(df.columns[0])
ax.set_ylabel(df.columns[1])
plt.legend()
plt.show()

Tags: python, scikit-learn, scaling

Solution
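
One possible approach, offered as a sketch rather than a definitive answer: keep a reference to the fitted StandardScaler instead of discarding it in a one-liner, cluster on the scaled array, then plot either the inverse-transformed values or the original df. Row order is preserved by fit_transform, so the labels in y_kmeans line up with the rows of df. The DataFrame contents and the column names "income" and "age" below are made-up stand-ins for the asker's data, and n_init=10 is set explicitly as an assumption so the behaviour does not depend on the scikit-learn version's default.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the asker's df: two numeric columns (assumption).
rng = np.random.default_rng(15)
df = pd.DataFrame({
    "income": rng.normal(50_000, 12_000, size=200),
    "age": rng.normal(40, 10, size=200),
})

# Keep the fitted scaler so the transform can be undone later.
scaler = StandardScaler()
dfv = scaler.fit_transform(df)  # NumPy array, rows in the same order as df

# Cluster on the scaled values, as in the question.
kmeans = KMeans(n_clusters=5, init="k-means++", n_init=10, random_state=15)
y_kmeans = kmeans.fit_predict(dfv)

# Undo the scaling and restore the column names and index.
df_restored = pd.DataFrame(
    scaler.inverse_transform(dfv), columns=df.columns, index=df.index
)

# Cluster centres expressed in the original units.
centers = scaler.inverse_transform(kmeans.cluster_centers_)

# Plot on the original scale, coloured by the labels found on the scaled data.
X = df_restored.to_numpy()
fig, ax = plt.subplots(figsize=(15, 5))
ax.grid(which="major", linestyle="-", linewidth=0.5, color="white")
ax.set_facecolor((0.898, 0.898, 0.898))
colors = ["red", "blue", "limegreen", "magenta", "blueviolet"]
for k, color in enumerate(colors):
    mask = y_kmeans == k
    ax.scatter(X[mask, 0], X[mask, 1], s=20, c=color, label=f"Cluster{k + 1}")
ax.scatter(centers[:, 0], centers[:, 1], s=80, c="black", marker="x", label="Centres")
ax.set_title("Features after clustering (original scale)")
ax.set_xlabel(df.columns[0])
ax.set_ylabel(df.columns[1])
ax.legend()
plt.show()
```

Since df_restored matches df up to floating-point error, plotting df.to_numpy() directly with the same masks should give the same picture. The inverse transform is mainly needed for values that exist only in scaled space, such as kmeans.cluster_centers_.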

