Error in Faiss Kmeans image clustering

Problem description

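I have a set of about 200 images that I want to cluster into groups of images with similar features. I am using ResNet50 to extract a feature vector from each image, and I am trying to cluster those vectors into groups with Faiss Kmeans.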

I have defined a class for Faiss KMeans, as shown in the link here

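import faiss
import numpy as np


class FaissKMeans:
    # scikit-learn style wrapper around faiss.Kmeans
    def __init__(self, n_clusters=8, n_init=10, max_iter=300):
        self.n_clusters = n_clusters
        self.n_init = n_init
        self.max_iter = max_iter
        self.kmeans = None
        self.cluster_centers_ = None
        self.inertia_ = None

    def fit(self, X, y=None):
        # y is unused; it is optional so that fit(X) works on its own
        # d is the length of one feature vector, i.e. the number of columns of X
        self.kmeans = faiss.Kmeans(d=X.shape[1],
                                   k=self.n_clusters,
                                   niter=self.max_iter,
                                   nredo=self.n_init)
        self.kmeans.train(X.astype(np.float32))
        self.cluster_centers_ = self.kmeans.centroids
        # value of the k-means objective after the last iteration
        self.inertia_ = self.kmeans.obj[-1]

    def predict(self, X):
        # index of the nearest centroid for each row, returned with shape (n, 1)
        return self.kmeans.index.search(X.astype(np.float32), 1)[1]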

I store the images and their vectors as key-value pairs in a dictionary.

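import os
import pickle

import numpy as np
from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input
from tensorflow.keras.models import Model
from tensorflow.keras.preprocessing.image import load_img

# mypath (the image folder) and pkl_path (the pickle output file) are assumed to be defined earlier

# function to extract the feature vector of one image
def extract_features(file, model):
    img = load_img(file, target_size=(224, 224))
    img = np.array(img)
    reshaped_img = img.reshape(1, 224, 224, 3)
    imgx = preprocess_input(reshaped_img)
    features = model.predict(imgx)
    return features

# append the .jpg images in the folder to the list "products"
products = []
with os.scandir(mypath) as files:
    for file in files:
        if file.name.endswith('.jpg'):
            products.append(file.name)

# load the ResNet50 model and keep the output of the second-to-last layer (2048 values per image)
model = ResNet50()
model = Model(inputs=model.inputs, outputs=model.layers[-2].output)

# save image name and image vector to the dictionary "feature_dict" as key-value pairs
feature_dict = {}
p = pkl_path

for product in products:
    try:
        # product is a bare file name, so this assumes the working directory is mypath
        feat = extract_features(product, model)
        feature_dict[product] = feat
    except Exception:
        # on failure, save the features extracted so far
        with open(p, 'wb') as file:
            pickle.dump(feature_dict, file)

# convert the dictionary to numpy arrays
filenames = np.array(list(feature_dict.keys()))
feat = np.array(list(feature_dict.values()))
feat = feat.reshape(-1, 2048)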
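
The reshape at the end of the snippet above decides which axis of feat holds the images and which holds the features, and that matters for the clustering step later. The following is a minimal sketch of the shapes involved; it uses zero-filled arrays and made-up file names in place of real model outputs, so those values are assumptions for illustration only.

```python
import numpy as np

# Stand-ins for three model.predict outputs; each one has shape (1, 2048).
feature_dict = {
    f"img_{i}.jpg": np.zeros((1, 2048), dtype=np.float32) for i in range(3)
}

# Stacking the dictionary values keeps the extra axis of length 1.
stacked = np.array(list(feature_dict.values()))
print(stacked.shape)  # (3, 1, 2048)

# The reshape drops that axis: one row per image, one column per feature.
feat = stacked.reshape(-1, 2048)
print(feat.shape)  # (3, 2048)

n_samples, n_features = feat.shape
```

After the reshape, feat.shape[0] is the number of images and feat.shape[1] is the length of each feature vector.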

I am using the "kneed" package to determine the number of clusters

from kneed import KneeLocator
from sklearn.cluster import KMeans

# determine the number of clusters
length = len(filenames)
lim = 25

sse = []
list_k = list(range(1, lim))

for k in list_k:
    # n_jobs is omitted: KMeans no longer accepts it as of scikit-learn 1.0
    km = KMeans(n_clusters=k, random_state=22)
    labels = km.fit_predict(feat)
    sse.append(km.inertia_)

kneedle = KneeLocator(list_k, sse, curve='convex', direction='decreasing')
elbow = kneedle.elbow  # number of clusters

Now I am trying to cluster the images into groups with Faiss Kmeans, but I get the error AttributeError: 'Kmeans' object has no attribute 'fit' on the line kmeans.fit(feat)

kmeans = faiss.Kmeans(d=feat.shape[0] ,k=elbow, niter=200)
kmeans.fit(feat) 
kmeans.train(feat)

When I instead use kmeans.train(feat), as found in the link, I get an AssertionError

Tags: python, k-means, faiss

Solution

