How to plot a KNN decision boundary in Python from scratch?

Problem description

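I need to plot the decision boundary of a KNN classifier without using sklearn. I have implemented the classifier, but I cannot get the decision boundary to plot. The plot should look like the one in the ElemStatLearn book, "The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Second Edition" by Trevor Hastie, Robert Tibshirani and Jerome Friedman. The required plot is shown below:

[Image: KNN k=15 classifier, original]

So far, I have only been able to produce the following plot:

[Image: KNN k=15 classifier, plot produced so far]

I have computed the grid points and the predictions at those points. I also tried to find the points on the boundary, taking a point wherever the prediction differs from the prediction at the previous grid point, and then sorted those points. But when I plot them, they do not look like the required boundary.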

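import numpy as np
from numpy import arange, meshgrid, hstack
from matplotlib.pyplot import figure, axis, scatter, plot, show

def get_grid(X):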
    # Creating grids for decision surface
    ## Define bounds of the surface
    min1, max1 = X[:, 0].min() - 0.2, X[:, 0].max() + 0.2
    min2, max2 = X[:, 1].min() - 0.2, X[:, 1].max() + 0.2
    ## Define the x and y points
    x1grid = arange(min1, max1, 0.1)
    x2grid = arange(min2, max2, 0.1)
    ## Create all of the lines and rows of the grid
    xx, yy = meshgrid(x1grid, x2grid)
    ## Flatten each grid to a vector
    r1, r2 = xx.flatten(), yy.flatten()
    r1, r2 = r1.reshape((len(r1), 1)), r2.reshape((len(r2), 1))
    ## Horizontally stack vectors to create x1, x2 input for the model
    grid_X = hstack((r1, r2))
    return grid_X

X, y = data[:, :-1], data[:, -1].astype(int)
# KNNClassifier is a custom class whose definition is not shown here
model = KNNClassifier(num_neighbors = 5)
model.fit(X, y)
y_pred = model.predict(X)

grid_X = get_grid(X)
grid_yhat = model.predict(grid_X)

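# Take the midpoint wherever two consecutive points of the flattened grid get different predictions
boundary = []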
for i in range(1, len(grid_X)):
    if grid_yhat[i] != grid_yhat[i-1]:
        boundary.append((grid_X[i] + grid_X[i-1]) * 0.5)
boundary_x = [b[0] for b in boundary]
boundary_y = [b[1] for b in boundary]
order = np.argsort(boundary_x)
boundary_x = np.array(boundary_x)[order]
boundary_y = np.array(boundary_y)[order]

def plot_decision_surface(X, y, boundary_X, boundary_y, grid_X, grid_yhat):
    
    figure(figsize=(10,10))    
    axis('off')
    # Plot the ground truth data points in the 2D feature space
    # split_X is a helper not shown in the question; it presumably splits the rows by class label
    X_pos, X_neg = split_X(X, y)
    scatter(X_pos[:, 0], X_pos[:, 1], facecolors='none', edgecolors='orange', marker='o', linewidth=3, s=60)
    scatter(X_neg[:, 0], X_neg[:, 1], facecolors='none', edgecolors='blue', marker='o', linewidth=3, s=60)
    
    
    grid_pos, grid_neg = split_X(grid_X, grid_yhat)
    
    # Plot and color the grid of x, y values with class
    scatter(grid_pos[:, 0], grid_pos[:, 1], color='orange', marker='.', linewidth=0.05)
    scatter(grid_neg[:, 0], grid_neg[:, 1], color='blue', marker='.', linewidth=0.05)
    
    # Plot the decision boundary for the classification

    scatter(boundary_X, boundary_y, color='k')  
    plot(boundary_X, boundary_y, color='k')
    
    # Plot Info
    show()

# The arrays computed above are named boundary_x and boundary_y
plot_decision_surface(X, y, boundary_x, boundary_y, grid_X, grid_yhat)

The failed attempt at plotting the boundary is shown below:

[Image: failed attempt at plotting the boundary]

Tags: python, matplotlib, plot, classification, knn

Solution
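
Two things in the question's attempt probably explain the zigzag, and both can be read off the code. The loop walks the flattened grid, so the last point of one grid row is compared with the first point of the next row. Sorting the midpoints by x and then calling plot also connects points that are not neighbors along the boundary. Below is a minimal sketch of one way around both, assuming the predictions are first reshaped back to the 2D mesh shape. Each cell is compared only with its horizontal and vertical neighbor, and the midpoints are returned unordered so they can be drawn as dots. The function name boundary_midpoints and the toy labeling are hypothetical, used only for illustration.

```python
import numpy as np

def boundary_midpoints(xx, yy, zz):
    """Return midpoints between adjacent grid cells whose predicted labels differ.

    xx, yy: 2D coordinate arrays from np.meshgrid; zz: predicted labels, same shape.
    """
    # Horizontal neighbors: compare column j with column j + 1 inside each row.
    h = zz[:, :-1] != zz[:, 1:]
    hx = (0.5 * (xx[:, :-1] + xx[:, 1:]))[h]
    hy = yy[:, :-1][h]
    # Vertical neighbors: compare row i with row i + 1 inside each column.
    v = zz[:-1, :] != zz[1:, :]
    vx = xx[:-1, :][v]
    vy = (0.5 * (yy[:-1, :] + yy[1:, :]))[v]
    return np.concatenate([hx, vx]), np.concatenate([hy, vy])

# Toy check with a made-up labeling: class 1 to the right of x = 0.25.
xx, yy = np.meshgrid(np.linspace(-1.0, 1.0, 5), np.linspace(-1.0, 1.0, 5))
zz = (xx > 0.25).astype(int)
bx, by = boundary_midpoints(xx, yy, zz)
```

With the question's variables, zz would be grid_yhat.reshape(xx.shape), which assumes get_grid is changed to return xx and yy as well and that predict returns one label per grid row in order. Drawing the result with scatter(bx, by, color='k', s=4) and no plot call avoids joining unrelated points.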

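A common alternative to locating boundary points by hand is to let matplotlib trace the boundary: reshape the grid predictions to the mesh shape and ask contour for the single level 0.5, which lies between the labels 0 and 1. The sketch below is self-contained, so it uses a brute-force stand-in knn_predict and synthetic two-class data. Both are hypothetical placeholders for the question's KNNClassifier and data, and the labels are assumed to be 0 and 1.

```python
import numpy as np
import matplotlib.pyplot as plt

def knn_predict(X_train, y_train, X_query, k=15):
    """Brute-force KNN majority vote for 0/1 labels (stand-in for the custom classifier)."""
    # Squared Euclidean distance from every query point to every training point.
    d2 = ((X_query[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    # Indices of the k nearest training points for each query point (unordered).
    nearest = np.argpartition(d2, k - 1, axis=1)[:, :k]
    # Majority vote: a mean of the 0/1 labels above 0.5 means class 1.
    return (y_train[nearest].mean(axis=1) > 0.5).astype(int)

# Synthetic data, made up for illustration: two Gaussian blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1.0, 1.0, (100, 2)), rng.normal(1.0, 1.0, (100, 2))])
y = np.repeat([0, 1], 100)

# Mesh over the data range, with the same padding and step as get_grid in the question.
x1 = np.arange(X[:, 0].min() - 0.2, X[:, 0].max() + 0.2, 0.1)
x2 = np.arange(X[:, 1].min() - 0.2, X[:, 1].max() + 0.2, 0.1)
xx, yy = np.meshgrid(x1, x2)
grid_X = np.column_stack([xx.ravel(), yy.ravel()])
# Predictions reshaped to the mesh so contour can see the 2D neighborhood structure.
zz = knn_predict(X, y, grid_X, k=15).reshape(xx.shape)

fig, ax = plt.subplots(figsize=(10, 10))
ax.axis("off")
# Small dots colored by the predicted class of each grid point.
ax.scatter(xx[zz == 1], yy[zz == 1], color="orange", marker=".", s=2)
ax.scatter(xx[zz == 0], yy[zz == 0], color="blue", marker=".", s=2)
# Training points as open circles.
ax.scatter(X[y == 1, 0], X[y == 1, 1], facecolors="none", edgecolors="orange", linewidths=2, s=60)
ax.scatter(X[y == 0, 0], X[y == 0, 1], facecolors="none", edgecolors="blue", linewidths=2, s=60)
# The decision boundary is the 0.5 level set between label 0 and label 1.
ax.contour(xx, yy, zz, levels=[0.5], colors="k", linewidths=1)
fig.savefig("knn_boundary.png", dpi=150)  # or plt.show() in an interactive session
```

Because contour works on the 2D arrays directly, no sorting of boundary points is needed. A finer grid step than 0.1 gives a smoother line at the cost of more predictions.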
