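r - Error when reshaping from long to wide format in R - all data are NA and the variable names are wrong
Problem description
I have a data frame in long format that I want to reshape to wide format. When I do so, however, something goes badly wrong and I can't figure out how to fix it. Any help is greatly appreciated!
Here is the data (for 2 ID values; I have 300 more):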
# A tibble: 86 x 20
# Groups: ID, Day [12]
ID Day Obs Time1 Time1_1 Time_between Time_minutes PA1 NA1 PA2 NA2 PA3
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 1 1 27154 NA NA NA 4 1 6 1 6
2 1 1 2 30150 27154 2996 49 4 1 5 1 6
3 1 1 3 33266 30150 3116 51 5 1 5 1 6
4 1 1 4 39842 33266 6576 109 5 1 7 1 6
5 1 1 5 46744 39842 6902 115 5 1 6 1 6
6 1 1 6 50643 46744 3899 64 1 1 5 1 4
7 1 1 7 56343 50643 5700 95 3 1 6 1 6
8 1 1 8 61744 56343 5401 90 6 1 6 1 6
9 1 1 9 67205 61744 5461 91 5 1 6 1 6
10 1 1 10 75360 67205 8155 135 4 1 6 1 6
11 1 1 11 78062 75360 2702 45 6 1 6 1 6
12 1 2 12 42844 NA NA NA 1 1 6 1 6
13 1 2 13 47400 42844 4556 75 6 1 6 1 6
14 1 2 14 53522 47400 6122 102 6 1 6 1 6
15 1 2 15 58923 53522 5401 90 5 1 6 1 6
16 1 2 16 63245 58923 4322 72 6 1 6 1 6
17 1 2 17 67562 63245 4317 71 6 1 6 1 6
18 1 2 18 72960 67562 5398 89 5 1 5 1 5
19 1 3 19 43800 NA NA NA 4 1 5 1 7
20 1 3 20 49083 43800 5283 88 4 1 6 1 6
21 1 3 21 54302 49083 5219 86 5 1 6 1 5
22 1 3 22 58324 54302 4022 67 6 1 6 1 6
23 1 3 23 63123 58324 4799 79 5 1 5 1 6
24 1 3 24 70981 63123 7858 130 4 1 6 1 6
25 1 3 25 75603 70981 4622 77 4 1 6 1 5
26 1 3 26 77583 75603 1980 33 5 1 5 1 5
27 1 4 27 27420 NA NA NA 4 1 6 1 5
28 1 4 28 29288 27420 1868 31 4 1 5 1 5
29 1 4 29 35339 29288 6051 100 5 1 4 1 5
30 1 4 30 37744 35339 2405 40 4 1 3 1 5
31 1 4 31 43021 37744 5277 87 4 1 4 1 5
32 1 4 32 51781 43021 8760 146 4 1 4 1 5
33 1 4 33 71460 51781 19679 327 4 1 6 1 6
34 1 4 34 76204 71460 4744 79 4 1 5 1 5
35 1 5 35 33136 NA NA NA 1 1 6 1 5
36 1 5 36 38883 33136 5747 95 4 1 4 1 5
37 1 5 37 45603 38883 6720 112 4 1 5 1 5
38 1 5 38 49445 45603 3842 64 4 1 5 1 5
39 1 5 39 55624 49445 6179 102 5 1 5 1 5
40 1 5 40 67085 55624 11461 191 4 1 5 1 6
41 1 5 41 75724 67085 8639 143 5 1 5 1 5
42 1 6 42 27597 NA NA NA 4 1 5 1 5
43 1 6 43 29711 27597 2114 35 4 1 5 1 5
44 1 6 44 35311 29711 5600 93 4 1 5 1 5
45 1 6 45 45720 35311 10409 173 4 1 5 1 5
46 1 6 46 47880 45720 2160 36 4 1 5 1 5
47 1 6 47 54304 47880 6424 107 4 1 5 1 5
48 1 6 48 62042 54304 7738 128 4 1 5 1 5
49 1 6 49 66725 62042 4683 78 5 1 5 1 5
50 1 6 50 75302 66725 8577 142 4 1 5 1 5
51 2 1 1 31220 NA NA NA 5 1 6 1 7
52 2 1 2 37021 31220 5801 96 4 1 6 1 6
53 2 1 3 38820 37021 1799 29 4 3 5 2 6
54 2 1 4 47041 38820 8221 137 5 3 6 1 6
55 2 1 5 49202 47041 2161 36 4 4 4 2 6
56 2 2 6 27111 NA NA NA 3 1 4 3 5
57 2 2 7 40561 27111 13450 224 2 1 5 1 6
58 2 2 8 45483 40561 4922 82 5 1 5 1 4
59 2 2 9 65582 45483 20099 334 6 1 7 1 7
60 2 2 10 71460 65582 5878 97 6 1 6 1 6
61 2 2 11 77340 71460 5880 98 5 1 6 1 7
62 2 3 12 34566 NA NA NA 4 1 6 1 7
63 2 3 13 41405 34566 6839 113 7 1 5 1 5
64 2 3 14 44223 41405 2818 46 6 1 6 1 6
65 2 3 15 69485 44223 25262 421 5 1 4 1 6
66 2 4 16 37921 NA NA NA 5 1 5 1 6
67 2 4 17 54062 37921 16141 269 5 2 4 4 4
68 2 4 18 60542 54062 6480 108 5 3 5 1 5
69 2 4 19 66360 60542 5818 96 5 1 4 1 5
70 2 4 20 69663 66360 3303 55 4 1 4 1 7
71 2 4 21 76023 69663 6360 106 5 1 5 1 7
72 2 4 22 77463 76023 1440 24 4 1 5 1 5
73 2 5 23 27050 NA NA NA 5 3 5 1 6
74 2 5 24 29400 27050 2350 39 4 1 5 1 5
75 2 5 25 36783 29400 7383 123 5 1 5 1 5
76 2 5 26 42062 36783 5279 87 5 1 4 1 6
77 2 5 27 46984 42062 4922 82 5 1 6 1 5
78 2 5 28 50344 46984 3360 56 4 1 5 1 6
79 2 5 29 56885 50344 6541 109 7 1 7 1 7
80 2 5 30 71101 56885 14216 236 4 1 5 1 7
81 2 6 31 27094 NA NA NA 1 1 4 1 5
82 2 6 32 27559 27094 465 7 1 1 4 1 5
83 2 6 33 40441 27559 12882 214 4 1 5 1 6
84 2 6 34 44763 40441 4322 72 5 1 5 1 6
85 2 6 35 50522 44763 5759 95 5 1 5 1 5
86 2 6 36 60962 50522 10440 174 4 1 5 1 6
# ... with 8 more variables: NA3 <dbl>, PA4 <dbl>, NA4 <dbl>, PA5 <dbl>, NA5 <dbl>,
# PA6 <dbl>, NA6 <dbl>, obs <int>
I then used the following code to reshape it:
datasetSPSSSMESM_wide2 <- reshape(datasetSPSSSMESM_2,
timevar="Obs", idvar="ID", direction="wide")
I would like to get something like this:
ID Time1_1 Time1_2 Time1_3 Time1_4 Time1_5 ....
1 27154 30150 33266 39842 46744
2 31220 37021 38820 47041 49202
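But when I look at the resulting dataset, I get the output below. The variables themselves are all NA, and the data seem to have ended up in the variable names instead.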
# A tibble: 2 x 19
# Groups: ID [2]
ID `Day.c(1, 2, 3,~ `Time1.c(1, 2, ~ `Time1_1.c(1, 2~ `Time_between.c~
<dbl> <dbl> <dbl> <dbl> <dbl>
1 1 NA NA NA NA
2 2 NA NA NA NA
# ... with 14 more variables: `Time_minutes.c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,
# 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
# 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50)` <dbl>, `PA1.c(1, 2,
# 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25,
# 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46,
# 47, 48, 49, 50)` <dbl>, `NA1.c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
# 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37,
# 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50)` <dbl>, `PA2.c(1, 2, 3, 4, 5, 6,
# 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28,
# 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49,
# 50)` <dbl>, `NA2.c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
# 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
# 41, 42, 43, 44, 45, 46, 47, 48, 49, 50)` <dbl>, `PA3.c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
# 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
# 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50)` <dbl>,
# `NA3.c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
# 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43,
# 44, 45, 46, 47, 48, 49, 50)` <dbl>, `PA4.c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,
# 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34,
# 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50)` <dbl>, `NA4.c(1, 2,
# 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25,
# 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46,
# 47, 48, 49, 50)` <dbl>, `PA5.c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
# 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37,
# 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50)` <dbl>, `NA5.c(1, 2, 3, 4, 5, 6,
# 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28,
# 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49,
# 50)` <dbl>, `PA6.c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
# 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
# 41, 42, 43, 44, 45, 46, 47, 48, 49, 50)` <dbl>, `NA6.c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10,
# 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31,
# 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50)` <dbl>,
# `obs.c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22,
# 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43,
# 44, 45, 46, 47, 48, 49, 50)` <int>
> print(datasetSPSSSMESM_wide2)
Does anyone know what I am doing wrong?
Thanks a lot in advance!
Dominic
############################# UPDATE ##########################
Now using dput:
> dput(head(datasetSPSSSMESM_2))
structure(list(ID = c(1, 1, 1, 1, 1, 1), Day = c(1, 1, 1, 1,
1, 1), Obs = c(1, 2, 3, 4, 5, 6), Time1 = c(27154, 30150, 33266,
39842, 46744, 50643), Time1_1 = c(NA, 27154, 30150, 33266, 39842,
46744), Time_between = c(NA, 2996, 3116, 6576, 6902, 3899), Time_minutes = c(NA,
49, 51, 109, 115, 64), PA1 = c(4, 4, 5, 5, 5, 1), NA1 = c(1,
1, 1, 1, 1, 1), PA2 = c(6, 5, 5, 7, 6, 5), NA2 = c(1, 1, 1, 1,
1, 1), PA3 = c(6, 6, 6, 6, 6, 4), NA3 = c(2, 2, 2, 2, 1, 5),
PA4 = c(6, 5, 6, 6, 6, 2), NA4 = c(2, 1, 1, 1, 1, 1), PA5 = c(6,
6, 6, 6, 6, 3), NA5 = c(1, 1, 1, 1, 1, 1), PA6 = c(6, 5,
6, 6, 6, 4), NA6 = c(1, 1, 1, 1, 1, 1), obs = 1:6), row.names = c(NA,
-6L), groups = structure(list(ID = 1, Day = 1, .rows = structure(list(
1:6), ptype = integer(0), class = c("vctrs_list_of", "vctrs_vctr",
"list"))), row.names = 1L, class = c("tbl_df", "tbl", "data.frame"
), .drop = TRUE), class = c("grouped_df", "tbl_df", "tbl", "data.frame"
))
After reshaping:
> dput(head(datasetSPSSSMESM_wide2))
structure(list(ID = c(1, 2), `Day.1:11` = c(NA_real_, NA_real_
), `Obs.1:11` = c(NA_real_, NA_real_), `Time1.1:11` = c(NA_real_,
NA_real_), `Time1_1.1:11` = c(NA_real_, NA_real_), `Time_between.1:11` = c(NA_real_,
NA_real_), `Time_minutes.1:11` = c(NA_real_, NA_real_), `PA1.1:11` = c(NA_real_,
NA_real_), `NA1.1:11` = c(NA_real_, NA_real_), `PA2.1:11` = c(NA_real_,
NA_real_), `NA2.1:11` = c(NA_real_, NA_real_), `PA3.1:11` = c(NA_real_,
NA_real_), `NA3.1:11` = c(NA_real_, NA_real_), `PA4.1:11` = c(NA_real_,
NA_real_), `NA4.1:11` = c(NA_real_, NA_real_), `PA5.1:11` = c(NA_real_,
NA_real_), `NA5.1:11` = c(NA_real_, NA_real_), `PA6.1:11` = c(NA_real_,
NA_real_), `NA6.1:11` = c(NA_real_, NA_real_)), row.names = c(NA,
-2L), groups = structure(list(ID = c(1, 2), .rows = structure(list(
1L, 2L), ptype = integer(0), class = c("vctrs_list_of", "vctrs_vctr",
"list"))), row.names = 1:2, class = c("tbl_df", "tbl", "data.frame"
), .drop = TRUE), reshapeWide = list(v.names = NULL, timevar = "obs",
idvar = "ID", times = structure(list(obs = 1:11), row.names = c(NA,
-11L), class = c("tbl_df", "tbl", "data.frame")), varying = structure(c("Day.1:11",
"Obs.1:11", "Time1.1:11", "Time1_1.1:11", "Time_between.1:11",
"Time_minutes.1:11", "PA1.1:11", "NA1.1:11", "PA2.1:11",
"NA2.1:11", "PA3.1:11", "NA3.1:11", "PA4.1:11", "NA4.1:11",
"PA5.1:11", "NA5.1:11", "PA6.1:11", "NA6.1:11"), .Dim = c(18L,
1L), .Dimnames = list(NULL, "obs"))), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"))
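Solution
I think that when you make this data frame wider, you usually only want to pick one variable to spread across the new columns. For example, if you want to analyze "Time1", you could try the following: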
datasetSPSSSMESM_wide3 <- tidyr::pivot_wider(data = datasetSPSSSMESM_2, id_cols = "ID", names_from = "Obs", values_from = "Time1")
(I am more familiar with this pivot_wider function... perhaps stats::reshape can also do this with some other syntax.)
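On the stats::reshape route: in the dput output of the reshaped object, the stored `times` element is itself a tibble rather than a plain vector, which suggests that reshape() is tripping over the tibble class of the input. The following is only a minimal sketch under that assumption: it converts to a plain data frame first and names the variable to spread via v.names. The objects long_df and wide_df are made up for illustration, and the values are copied from the first rows of the data in the question.

```r
# Toy long-format data (first three observations of ID 1 and ID 2)
long_df <- data.frame(
  ID    = c(1, 1, 1, 2, 2, 2),
  Obs   = c(1, 2, 3, 1, 2, 3),
  Time1 = c(27154, 30150, 33266, 31220, 37021, 38820)
)

# as.data.frame() is a no-op here, but on a tibble / grouped_df it
# drops the tibble classes before reshape() indexes into the data.
wide_df <- reshape(
  as.data.frame(long_df),
  idvar     = "ID",     # one row per ID in the result
  timevar   = "Obs",    # values of Obs become column suffixes
  v.names   = "Time1",  # the variable to spread across columns
  direction = "wide",
  sep       = "_"       # gives Time1_1, Time1_2, ...
)

print(wide_df)
```

For the real data, this would probably mean passing as.data.frame(datasetSPSSSMESM_2[, c("ID", "Obs", "Time1")]) instead of long_df, so that the remaining columns are not carried along.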
Alternatively, if you really want an ultra-wide data frame that includes all the other variables as well, you could use the following:
datasetSPSSSMESM_wide3 <- tidyr::pivot_wider(data = datasetSPSSSMESM_2, id_cols = "ID", names_from = "Obs", values_from = c(-ID, -Obs))
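To see what column names the two calls produce, here is a small sketch on a cut-down copy of the data. The objects long_df, wide_one and wide_all are made up for illustration, and the names_prefix argument is my addition to get names like Time1_1; the values are taken from the first rows of the question's data.

```r
library(tidyr)

# Cut-down long-format data with two measured variables
long_df <- data.frame(
  ID    = c(1, 1, 2, 2),
  Obs   = c(1, 2, 1, 2),
  Time1 = c(27154, 30150, 31220, 37021),
  PA1   = c(4, 4, 5, 4)
)

# One variable: one column per Obs value, prefixed so the names
# read Time1_1, Time1_2, ...
wide_one <- pivot_wider(
  long_df,
  id_cols      = ID,
  names_from   = Obs,
  values_from  = Time1,
  names_prefix = "Time1_"
)

# Several variables: columns are named <variable>_<Obs>
wide_all <- pivot_wider(
  long_df,
  id_cols     = ID,
  names_from  = Obs,
  values_from = c(Time1, PA1)
)

print(names(wide_one))
print(names(wide_all))
```

Since the original data frame is grouped by ID and Day, calling dplyr::ungroup() on it before pivoting may be a sensible precaution, although I have not checked whether it is required.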
Hope this helps!
景梦