r - 根据数值变量的水平编码新因子
问题描述
我正在尝试根据另一列的数值创建一个因子列。这是我的数据的一个子集:
> dput(sample)
structure(list(ID = c(1683L, 1684L, 1684L, 1684L, 1684L, 1685L,
1685L, 1685L, 1685L, 1686L, 1686L, 1686L, 1686L, 30759L, 30759L,
30759L, 30759L, 30760L, 30760L, 30760L, 30760L), Month = structure(c(2L,
2L, 3L, 1L, 2L, 2L, 3L, 1L, 2L, 2L, 3L, 1L, 2L, 2L, 3L, 1L, 2L,
2L, 3L, 1L, 2L), .Label = c("Jun", "Jul", "Aug"), class = "factor"),
Year = c(2018, 2017, 2017, 2018, 2018, 2017, 2017, 2018,
2018, 2017, 2017, 2018, 2018, 2017, 2017, 2018, 2018, 2017,
2017, 2018, 2018), Homerange = c(NA, 27.2850594918174, NA,
NA, NA, NA, 30.52684873837, NA, NA, NA, 30.7069481409563,
10.625864752589, 29.2661529202662, 32.3278427642325, NA,
NA, NA, NA, 33.8586876862157, NA, NA)), out.attrs = list(
dim = c(58L, 4L, 2L), dimnames = list(Var1 = c("Var1= 1657",
"Var1= 1658", "Var1= 1659", "Var1= 1660", "Var1= 1661", "Var1= 1662",
"Var1= 1663", "Var1= 1664", "Var1= 1666", "Var1= 1667", "Var1= 1668",
"Var1= 1669", "Var1= 1670", "Var1= 1671", "Var1= 1672", "Var1= 1673",
"Var1= 1674", "Var1= 1675", "Var1= 1676", "Var1= 1678", "Var1= 1679",
"Var1= 1680", "Var1= 1681", "Var1= 1682", "Var1= 1683", "Var1= 1684",
"Var1= 1685", "Var1= 1686", "Var1=30759", "Var1=30760", "Var1=30761",
"Var1=30762", "Var1=30763", "Var1=30764", "Var1=30765", "Var1=30766",
"Var1=30767", "Var1=30768", "Var1=30769", "Var1=30770", "Var1=30771",
"Var1=30772", "Var1=30773", "Var1=30774", "Var1=30775", "Var1=30776",
"Var1=30777", "Var1=30778", "Var1=30779", "Var1=30780", "Var1=30781",
"Var1=30782", "Var1=30783", "Var1=30784", "Var1=30785", "Var1=30786",
"Var1=30787", "Var1=30788"), Var2 = c("Var2=Jun", "Var2=Jul",
"Var2=Aug", "Var2=Sep"), Var3 = c("Var3=2017", "Var3=2018"
))), row.names = c(315L, 84L, 142L, 258L, 316L, 85L, 143L,
259L, 317L, 86L, 144L, 260L, 318L, 87L, 145L, 261L, 319L, 88L,
146L, 262L, 320L), class = "data.frame")
数字列“ID”的值介于 1659-1685 和 30759-30788 之间。我想做的是创建一个因子列“类型”,它有两个级别“V13”,对应于 ID 1659-1685,“V16”对应于 ID 30759-30788。我知道我以前做过,但由于某种原因我不记得怎么做了。谢谢您的帮助!
解决方案
假设在您的范围内没有考虑 ID 1686 是故意的,您可以试试这个:
library(dplyr)
library(forcats)
df %>%
mutate(type = case_when(between(ID, 1659, 1685) ~ "V13",
between(ID, 30759, 30788) ~ "V16")) %>%
mutate(type = as_factor(type))
# A tibble: 21 x 5
ID Month Year Homerange type
<int> <fct> <dbl> <dbl> <fct>
1 1683 Jul 2018 NA V13
2 1684 Jul 2017 27.3 V13
3 1684 Aug 2017 NA V13
4 1684 Jun 2018 NA V13
5 1684 Jul 2018 NA V13
6 1685 Jul 2017 NA V13
7 1685 Aug 2017 30.5 V13
8 1685 Jun 2018 NA V13
9 1685 Jul 2018 NA V13
10 1686 Jul 2017 NA NA
11 1686 Aug 2017 30.7 NA
12 1686 Jun 2018 10.6 NA
13 1686 Jul 2018 29.3 NA
14 30759 Jul 2017 32.3 V16
15 30759 Aug 2017 NA V16
16 30759 Jun 2018 NA V16
17 30759 Jul 2018 NA V16
18 30760 Jul 2017 NA V16
19 30760 Aug 2017 33.9 V16
20 30760 Jun 2018 NA V16
21 30760 Jul 2018 NA V16
推荐阅读
- java - 模拟 Activity 过渡动画
- ionic-framework - 离子输入点击进入文本奇怪的行为
- javascript - 如何将标签更改
为在博主?
- amazon-web-services - 将 Powershell 数组转换为文本,以便将其导出为 CSV 或 HTML
- java - Cucumber java - 将 .txt 文件中的字符串添加到黄瓜步骤定义
- autodesk-forge - 如何从查看器中的选定组件获取 xyz 坐标
- python - 如何将用户输入设置为另一个框架中 Tkinter Entry 小部件的默认文本?
- bash - 从 shell 重定向可执行文件的 stdout/stderr 但不重定向调用时错误
- prometheus - 通过 Rancher 访问 Prometheus 监控值
- python - 美丽的汤断言错误