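python-3.x - A faster way to compute a confusion matrix?
Problem description
I compute my confusion matrix for image semantic segmentation as shown below, which is a very verbose approach: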
def confusion_matrix(preds, labels, conf_m, sample_size):
    preds = normalize(preds,0.9) # returns [0,1] tensor
    preds = preds.flatten()
    labels = labels.flatten()
    for i in range(len(preds)):
        if preds[i]==1 and labels[i]==1:
            conf_m[0,0] += 1/(len(preds)*sample_size) # TP
        elif preds[i]==1 and labels[i]==0:
            conf_m[0,1] += 1/(len(preds)*sample_size) # FP
        elif preds[i]==0 and labels[i]==0:
            conf_m[1,0] += 1/(len(preds)*sample_size) # TN
        elif preds[i]==0 and labels[i]==1:
            conf_m[1,1] += 1/(len(preds)*sample_size) # FN
    return conf_m
In the prediction loop:
conf_m = torch.zeros(2,2) # two classes (object or no-object)
for img, label in data:
    ...
    out = Net(img)
    conf_m = confusion_matrix(out, label, conf_m, len(data))
    ...
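Is there a faster way (in PyTorch) to efficiently compute the confusion matrix for the input samples of an image semantic segmentation task?
Solution
I use these two functions to compute the confusion matrix (as defined in sklearn):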
import torch

# rewrite sklearn method to torch
def confusion_matrix_1(y_true, y_pred):
    N = max(max(y_true), max(y_pred)) + 1  # number of classes
    y_true = torch.tensor(y_true, dtype=torch.long)
    y_pred = torch.tensor(y_pred, dtype=torch.long)
    # sparse COO tensor (replaces the legacy torch.sparse.LongTensor constructor);
    # duplicate (true, pred) indices are summed by to_dense()
    return torch.sparse_coo_tensor(
        torch.stack([y_true, y_pred]),
        torch.ones_like(y_true, dtype=torch.long),
        torch.Size([N, N])).to_dense()

# weird trick with bincount
def confusion_matrix_2(y_true, y_pred):
    N = max(max(y_true), max(y_pred)) + 1
    y_true = torch.tensor(y_true, dtype=torch.long)
    y_pred = torch.tensor(y_pred, dtype=torch.long)
    y = N * y_true + y_pred  # encode each (true, pred) pair as one index
    y = torch.bincount(y)
    if len(y) < N * N:
        # torch.cat expects a sequence of tensors
        y = torch.cat([y, torch.zeros(N * N - len(y), dtype=torch.long)])
    y = y.reshape(N, N)
    return y
y_true = [2, 0, 2, 2, 0, 1]
y_pred = [0, 0, 2, 2, 0, 2]
confusion_matrix_1(y_true, y_pred)
# tensor([[2, 0, 0],
#         [0, 0, 1],
#         [1, 0, 2]])
With a small number of classes, the second function is faster.
%%timeit
confusion_matrix_1(y_true, y_pred)
# 102 µs ± 30.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%%timeit
confusion_matrix_2(y_true, y_pred)
# 25 µs ± 149 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
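For the segmentation loop in the question, the bincount idea can be applied to tensors directly. The following is only a minimal sketch under some assumptions: predictions are already thresholded to integer class ids, the number of classes is passed explicitly, and the name confusion_matrix_bincount and the random batches are hypothetical stand-ins. Rows index the true class and columns the predicted class, which differs from the layout used in the question.

```python
import torch


def confusion_matrix_bincount(y_true, y_pred, num_classes):
    """Count (true, predicted) pairs; rows are true classes, columns are predictions."""
    # Flatten so that masks of any shape (e.g. B x H x W) are handled uniformly.
    y_true = y_true.reshape(-1).long()
    y_pred = y_pred.reshape(-1).long()
    # Encode each (true, pred) pair as a single index in [0, num_classes**2).
    pair_index = num_classes * y_true + y_pred
    # minlength pads missing trailing counts with zeros, so no manual torch.cat is needed.
    counts = torch.bincount(pair_index, minlength=num_classes * num_classes)
    return counts.reshape(num_classes, num_classes)


# Hypothetical stand-in for a data loader: random binary predictions and labels.
torch.manual_seed(0)
batches = [
    (torch.randint(0, 2, (4, 8, 8)), torch.randint(0, 2, (4, 8, 8)))
    for _ in range(3)
]

# Accumulate raw integer counts over all batches.
conf_m = torch.zeros(2, 2, dtype=torch.long)
for preds, labels in batches:
    conf_m += confusion_matrix_bincount(labels, preds, num_classes=2)

# Normalize once at the end instead of adding a fraction per pixel.
conf_m_normalized = conf_m.float() / conf_m.sum()
```

With two classes and this layout, conf_m[1, 1] holds the true positives, conf_m[0, 1] the false positives, conf_m[1, 0] the false negatives, and conf_m[0, 0] the true negatives. Accumulating integer counts and dividing once avoids the per-pixel Python loop and the repeated floating-point additions.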