IndexError: only integers, slices (`:`)

Problem description

I am trying to run k-means clustering in Python, but I am getting an error. The dataset I am using can be found here:

https://www.fueleconomy.gov/feg/ws/index.shtml#vehicle

import pandas as pd
import seaborn as sns
import numpy as np
#from matplotlib import pyplot as plt
import matplotlib.pyplot as plt

from matplotlib import style
%matplotlib inline

from sklearn.preprocessing import scale, StandardScaler
import sklearn
from sklearn.cluster import KMeans

df = pd.read_csv('vehicles.csv')
df2 = df[['comb08','youSaveSpend']].copy()

scaler = StandardScaler()
scaler.fit(df2)
scaled_array = scaler.transform(df2)
average = np.mean(scaled_array[:,0])
std = np.std(scaled_array[:,0])


df2 = scaled_array
max_clusters = 10
noClusters = range(1, max_clusters + 1)
kmeans = [KMeans(n_clusters = i) for i in noClusters] 
score = [kmeans[i].fit(df2).score(df2) for i in range(len(kmeans))]
plt.plot(noClusters, score)
plt.xlabel('Number of Clusters')
plt.ylabel("Score")
plt.title('Elbow Curve')

kmeans = KMeans(n_clusters = 10, random_state = 0)
kmeans = kmeans.fit(scaled_array) 
unscaled = scaler.inverse_transform(kmeans.cluster_centers_)
unscaled


centroids = pd.DataFrame({'centroidx':unscaled[:,0],'centroidy':unscaled[:,1]})

df2['label'] = kmeans.labels_.astype(int)  # np.int was removed in NumPy 1.24
df2.head() # <======== Error Occurs Here



plt.scatter(df2['comb08'], df2['youSaveSpend'], c=df2.label) # (x,y,color)
plt.scatter(centroids['centroidx'], \
            centroids['centroidy'], c='red') # (x,y,color)

plt.show() # <======== Error also Occurs Here

The error I get is as follows:

[screenshot of the error traceback]

When I try to convert my floats like this:

df2 = df.apply(pd.to_numeric)

I get this error:

[screenshot of the second error traceback]

Tags: python, numpy

Solution
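
A likely cause, judging from the code in the question: `scaler.transform(df2)` returns a plain NumPy array, and the line `df2 = scaled_array` then replaces the DataFrame with that array. A NumPy array has no column names, so `df2['label'] = ...` and `df2['comb08']` index it with a string, which raises the IndexError in the title. The sketch below reproduces the failure and shows one possible fix, which is to wrap the scaled values back into a DataFrame. It is a minimal illustration under assumptions, not the asker's exact pipeline: the four sample rows are made up, the manual standardization stands in for `StandardScaler`, and the `labels` array stands in for `kmeans.labels_`, so that the example runs without `vehicles.csv` or scikit-learn. The name `scaled_df` is introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the two columns read from vehicles.csv.
df2 = pd.DataFrame({
    "comb08": [21.0, 27.0, 18.0, 33.0],
    "youSaveSpend": [-1500.0, 500.0, -3000.0, 2250.0],
})

# Stand-in for StandardScaler().fit_transform(df2). Like the scaler's output,
# this is a plain NumPy array, so the column names are gone.
scaled_array = ((df2 - df2.mean()) / df2.std(ddof=0)).to_numpy()

# Reproduce the failure: a NumPy array cannot be indexed with a string.
try:
    scaled_array["label"] = 0
    error_message = None
except IndexError as exc:
    error_message = str(exc)

# One possible fix: keep a DataFrame by re-attaching the column names and index
# instead of assigning the raw array to df2.
scaled_df = pd.DataFrame(scaled_array, columns=df2.columns, index=df2.index)

# Stand-in for kmeans.labels_. The built-in int replaces np.int.
labels = np.array([0, 1, 0, 1])
scaled_df["label"] = labels.astype(int)

print(error_message)
print(scaled_df.head())
```

In the question's code, the equivalent change would be to replace `df2 = scaled_array` with `df2 = pd.DataFrame(scaled_array, columns=df2.columns, index=df2.index)`; the later `df2['label']`, `df2.head()` and `plt.scatter(df2['comb08'], ...)` lines then operate on a DataFrame. Note that the centroids are plotted in unscaled units, so plotting the original (unscaled) columns alongside them may be the more consistent choice. Separately, `pd.to_numeric` raises on strings it cannot parse unless `errors='coerce'` is passed, so applying it to the whole `df` rather than to the two numeric columns may be what triggers the second error; the screenshot is not available to confirm this.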
