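r - Is there a quicker way to make this confusion matrix table in R?
Problem description
I am trying to make a confusion matrix table in R from the following data frame: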
mydf <- structure(list(pred_class = c("dog", "dog", "fish", "cat", "cat",
"dog", "fish", "cat", "dog", "fish"), true_class = c("cat", "cat",
"dog", "cat", "cat", "dog", "dog", "cat", "dog", "fish")), row.names = c(NA,
10L), class = "data.frame")
  pred_class true_class
1        dog        cat
2        dog        cat
3       fish        dog
4        cat        cat
5        cat        cat
6        dog        dog
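I have already written code that does what I want: for each class (dog, cat, or fish), it says whether each row is a true positive, false positive, true negative, or false negative.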
library(dplyr)
conf_mat <- mydf %>%
  mutate(
    dog_conf = case_when(
      true_class == "dog" & pred_class == "dog" ~ "TP",
      true_class == "dog" & pred_class != "dog" ~ "FN",
      true_class != "dog" & pred_class == "dog" ~ "FP",
      true_class != "dog" & pred_class != "dog" ~ "TN"
    ),
    cat_conf = case_when(
      true_class == "cat" & pred_class == "cat" ~ "TP",
      true_class == "cat" & pred_class != "cat" ~ "FN",
      true_class != "cat" & pred_class == "cat" ~ "FP",
      true_class != "cat" & pred_class != "cat" ~ "TN"
    ),
    fish_conf = case_when(
      true_class == "fish" & pred_class == "fish" ~ "TP",
      true_class == "fish" & pred_class != "fish" ~ "FN",
      true_class != "fish" & pred_class == "fish" ~ "FP",
      true_class != "fish" & pred_class != "fish" ~ "TN"
    )
  )
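However, this code is very repetitive and bulky, and I am not sure how to simplify it. Does anyone have any suggestions? Thank you.
Solution
Here is an option with map: loop over the unique values in the dataset, use transmute inside the loop to create one column per value according to the conditions in the OP's post, and bind those columns to the original data.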
library(dplyr)
library(purrr)
library(stringr)
# For each class label, build one "<class>_conf" column, then bind them all to mydf
map_dfc(unique(unlist(mydf)), ~
    mydf %>%
      transmute(!! str_c(.x, '_conf') :=
        case_when(true_class == .x & pred_class == .x ~ "TP",
                  true_class == .x & pred_class != .x ~ "FN",
                  true_class != .x & pred_class == .x ~ "FP",
                  true_class != .x & pred_class != .x ~ "TN"
        ))) %>%
  bind_cols(mydf, .)
Output:
#    pred_class true_class dog_conf fish_conf cat_conf
# 1         dog        cat       FP        TN       FN
# 2         dog        cat       FP        TN       FN
# 3        fish        dog       FN        FP       TN
# 4         cat        cat       TN        TN       TP
# 5         cat        cat       TN        TN       TP
# 6         dog        dog       TP        TN       TN
# 7        fish        dog       FN        FP       TN
# 8         cat        cat       TN        TN       TP
# 9         dog        dog       TP        TN       TN
# 10       fish       fish       TN        TP       TN
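The same per-class labelling can also be sketched in base R, without dplyr or purrr. This is an illustration rather than part of the original answer: label_conf is a hypothetical helper name, and the data frame is rebuilt inline so the snippet runs on its own.

```r
# Same data as in the question, rebuilt so the example is self-contained
mydf <- data.frame(
  pred_class = c("dog", "dog", "fish", "cat", "cat",
                 "dog", "fish", "cat", "dog", "fish"),
  true_class = c("cat", "cat", "dog", "cat", "cat",
                 "dog", "dog", "cat", "dog", "fish"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: label every row relative to a single class `cls`
label_conf <- function(pred, true, cls) {
  ifelse(true == cls,
         ifelse(pred == cls, "TP", "FN"),   # row truly belongs to cls
         ifelse(pred == cls, "FP", "TN"))   # row belongs to another class
}

# One label vector per class, named "<class>_conf"
classes <- unique(unlist(mydf))
conf_cols <- lapply(classes, function(cls) {
  label_conf(mydf$pred_class, mydf$true_class, cls)
})
names(conf_cols) <- paste0(classes, "_conf")

# Bind the new columns to the original data
result <- cbind(mydf, as.data.frame(conf_cols, stringsAsFactors = FALSE))
```

Keeping the four conditions in one small function means a new class needs no extra code; the columns appear in the order the classes first occur in the data.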
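Alternatively, use merge against a key/value lookup table. merge does not keep the original row order, so a row id is carried along and used to restore it:
# pred_class / true_class say whether each value equals the class being scored
keydat <- data.frame(pred_class = c(TRUE, TRUE, FALSE, FALSE),
                     true_class = c(TRUE, FALSE, TRUE, FALSE),
                     conf = c("TP", "FP", "FN", "TN"))
un1 <- unique(unlist(mydf))
mydf[paste0(un1, "_conf")] <- lapply(un1, function(x) {
  flags <- data.frame(id = seq_len(nrow(mydf)), mydf == x)
  out <- merge(flags, keydat, all.x = TRUE)
  out$conf[order(out$id)]  # back to the original row order
})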
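The key/value idea can also be expressed as a named-vector lookup, which keeps the rows in their original order with no join step. This is a minimal sketch under stated assumptions, not the original answer's method: the names lookup and result are illustrative, and the data frame is rebuilt inline so the snippet runs on its own.

```r
# Same data as in the question, rebuilt so the example is self-contained
mydf <- data.frame(
  pred_class = c("dog", "dog", "fish", "cat", "cat",
                 "dog", "fish", "cat", "dog", "fish"),
  true_class = c("cat", "cat", "dog", "cat", "cat",
                 "dog", "dog", "cat", "dog", "fish"),
  stringsAsFactors = FALSE
)

# Keys have the form "<pred == cls> <true == cls>"
lookup <- c("TRUE TRUE"   = "TP",
            "TRUE FALSE"  = "FP",
            "FALSE TRUE"  = "FN",
            "FALSE FALSE" = "TN")

classes <- unique(unlist(mydf[c("pred_class", "true_class")]))
result <- mydf
for (cls in classes) {
  # Build one key per row, then index the lookup vector with it
  key <- paste(mydf$pred_class == cls, mydf$true_class == cls)
  result[[paste0(cls, "_conf")]] <- unname(lookup[key])
}
```

Indexing a named vector returns one value per key in the order the keys are given, so each label stays on the row it was computed from.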