Extracting variable names from a decision tree

Question

I built a decision tree in R with the tree package, and running summary() on the tree gave me:

Classification tree:
tree(formula = High temperature ~ ., data = summer.train)
Variables actually used in tree construction:
[1] "Humidity" "Cloudy"   "Airy"     "Dry"      "Windy"
Number of terminal nodes:  12
Residual mean deviance:  0.3874 = 377.7 / 975 
Misclassification error rate: 0.08909 = 89 / 999 

I would like to get the variables used in tree construction ("Airy", "Dry", and so on) from the summary output above. Is there a way to do this?

Tags: r

Answer


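This question is linked to an existing one:

Variables used in tree

The solution given there does work for me. I tested it on the well-known spam dataset: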

library(kernlab)
library(tree)

data(spam)

spam_tree_def <- tree(type~.,data=spam)
summary(spam_tree_def)

Output of summary():

Classification tree:
tree(formula = type ~ ., data = spam)
Variables actually used in tree construction:
 [1] "charDollar"      "remove"          "charExclamation" "hp"              "capitalLong"     "our"            
 [7] "capitalAve"      "free"            "george"          "edu"            
Number of terminal nodes:  13 
Residual mean deviance:  0.4879 = 2238 / 4588 
Misclassification error rate: 0.08259 = 380 / 4601 

To extract the variable names, take the used component of the summary object:

as.character(summary(spam_tree_def)$used)

 [1] "charDollar"      "remove"          "charExclamation" "hp"              "capitalLong"     "our"            
 [7] "capitalAve"      "free"            "george"          "edu"            
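
As a possible alternative, the same names can be read straight from the fitted object rather than from its summary. The sketch below is not from the linked answer. It assumes the tree object's frame component has a var column holding the split variable for each node, with "<leaf>" marking terminal nodes. It uses the built-in iris data so that it runs without kernlab, and the names iris_tree, split_vars and used_vars are made up for illustration.

```r
library(tree)

# Fit a small classification tree on the built-in iris data
# (illustrative only; any fitted tree object should work the same way).
iris_tree <- tree(Species ~ ., data = iris)

# frame$var is a factor with one entry per node; terminal nodes are
# recorded as "<leaf>", so they are dropped before de-duplicating.
split_vars <- as.character(iris_tree$frame$var)
used_vars <- unique(split_vars[split_vars != "<leaf>"])

used_vars
```

This returns a plain character vector, so no as.character() conversion is needed, and it does not depend on what summary() chooses to report.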
