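r - Spread multiple columns in R
Problem description
I have 6 genes at 3 time points in long-format data, and I am trying to spread them into 6 columns, one per gene. I always get this error: 'Do you need to create unique ID with tibble::rowid_to_column()? Call rlang::last_error() to see a backtrace'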
fgcrkmtptlog
   timepoint gene treatment value tpt6
1 24 crk10 treated 1.7883197 24 treated
2 24 crk10 treated 1.0605152 24 treated
3 24 crk10 treated 1.0050634 24 treated
4 24 crk10 treated 1.8876708 24 treated
5 24 crk10 treated 1.4960427 24 treated
6 48 crk10 treated 2.4190837 48 treated
7 48 crk10 treated 2.9805329 48 treated
8 48 crk10 treated 3.4241471 48 treated
9 48 crk10 treated 2.3705634 48 treated
10 48 crk10 treated 2.0378527 48 treated
11 72 crk10 treated 2.5438502 72 treated
12 72 crk10 treated 3.7291318 72 treated
13 72 crk10 treated 2.8419034 72 treated
14 72 crk10 treated 3.3363484 72 treated
15 72 crk10 treated 3.2231344 72 treated
16 24 crk18 treated 2.0620297 24 treated
17 24 crk18 treated 1.5837581 24 treated
18 24 crk18 treated 2.1590703 24 treated
19 24 crk18 treated 2.1706227 24 treated
20 24 crk18 treated 2.4964019 24 treated
21 48 crk18 treated 2.6026845 48 treated
22 48 crk18 treated 2.7898342 48 treated
23 48 crk18 treated 2.6719992 48 treated
24 48 crk18 treated 2.7574874 48 treated
25 48 crk18 treated 3.4852919 48 treated
26 72 crk18 treated 3.1710652 72 treated
27 72 crk18 treated 3.3720779 72 treated
28 72 crk18 treated 1.8194282 72 treated
29 72 crk18 treated 2.8221811 72 treated
30 72 crk18 treated 2.8395098 72 treated
31 24 crk23 treated 0.9164792 24 treated
32 24 crk23 treated 0.9580680 24 treated
33 24 crk23 treated 0.5976315 24 treated
34 24 crk23 treated 1.0597296 24 treated
35 24 crk23 treated 1.0389352 24 treated
36 48 crk23 treated 2.1156238 48 treated
37 48 crk23 treated 2.8226339 48 treated
38 48 crk23 treated 3.4533979 48 treated
39 48 crk23 treated 2.7486982 48 treated
40 48 crk23 treated 2.0324462 48 treated
41 72 crk23 treated 3.1622761 72 treated
42 72 crk23 treated 1.7135985 72 treated
43 72 crk23 treated 2.7186619 72 treated
44 72 crk23 treated 2.7810451 72 treated
45 72 crk23 treated 1.4502025 72 treated
46 24 crk24 treated 0.5338245 24 treated
47 24 crk24 treated 0.4759149 24 treated
48 24 crk24 treated 1.1967879 24 treated
49 24 crk24 treated 1.0627795 24 treated
50 24 crk24 treated 1.1429535 24 treated
51 48 crk24 treated 1.4532524 48 treated
52 48 crk24 treated 2.2573031 48 treated
53 48 crk24 treated 2.3474122 48 treated
54 48 crk24 treated 2.2203353 48 treated
55 48 crk24 treated 2.4594710 48 treated
56 72 crk24 treated 2.3058234 72 treated
57 72 crk24 treated 2.4236584 72 treated
58 72 crk24 treated 2.5484249 72 treated
59 72 crk24 treated 2.6685704 72 treated
60 72 crk24 treated 2.0967240 72 treated
61 24 crk40 treated 1.0119949 24 treated
62 24 crk40 treated 1.0813096 24 treated
63 24 crk40 treated 1.7328680 24 treated
64 24 crk40 treated 1.9962639 24 treated
65 24 crk40 treated 2.3567004 24 treated
66 48 crk40 treated 3.5558450 48 treated
67 48 crk40 treated 2.6131649 48 treated
68 48 crk40 treated 2.5299872 48 treated
69 48 crk40 treated 3.4911513 48 treated
70 48 crk40 treated 3.3247960 48 treated
71 72 crk40 treated 4.8381673 72 treated
72 72 crk40 treated 4.9352079 72 treated
73 72 crk40 treated 4.4292105 72 treated
74 72 crk40 treated 3.8631403 72 treated
75 72 crk40 treated 4.0052355 72 treated
76 24 crk47 treated 0.1378544 24 treated
77 24 crk47 treated 1.9212654 24 treated
78 24 crk47 treated 2.3856740 24 treated
79 24 crk47 treated 1.6301435 24 treated
80 24 crk47 treated 1.6994583 24 treated
81 48 crk47 treated 2.8292882 48 treated
82 48 crk47 treated 2.9817805 48 treated
83 48 crk47 treated 2.9055344 48 treated
84 48 crk47 treated 2.9817805 48 treated
85 48 crk47 treated 3.0199036 48 treated
86 72 crk47 treated 2.7876993 72 treated
87 72 crk47 treated 2.9055344 72 treated
88 72 crk47 treated 3.6472018 72 treated
89 72 crk47 treated 2.5866866 72 treated
90 72 crk47 treated 2.6698643 72 treated
I am trying to convert this to a wide format with the time point and one column per gene, covering all six genes at the three time points:
fgcrkmtptlog %>%
group_by(timepoint) %>%
spread(gene, value)
I want the data laid out as in this picture.
After using
fgcrkmtptlog %>%
rowid_to_column() %>%
spread(gene, value)
the df shows a lot of NAs:
1 1 24 treated 24 treated 1.788320 NA NA NA NA NA
2 2 24 treated 24 treated 1.060515 NA NA NA NA NA
3 3 24 treated 24 treated 1.005063 NA NA NA NA NA
4 4 24 treated 24 treated 1.887671 NA NA NA NA NA
5 5 24 treated 24 treated 1.496043 NA NA NA NA NA
6 6 48 treated 48 treated 2.419084 NA NA NA NA NA
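Solution
spread needs a unique row ID, otherwise it cannot work. If your first column (which is used as the ID) contains duplicates, you need to create a new unique row ID.
The error message you posted says exactly that, so add the following to your code: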
fgcrkmtptlog %>%
# group_by(timepoint) %>% I took this out because group_by should be unnecessary here
rowid_to_column() %>%
spread(gene, value)
This will resolve your current error.
Edit:
Depending on your data, spread may introduce NAs. Here is an example:
# Produce sample data
df <- structure(list(Year = c("2014", "2014", "2014", "2014", "2015",
"2015", "2015", "2015", "2016"), Month = c("01", "06", "07",
"12", "01", "06", "07", "12", "01"), Day = c("01", "01", "01",
"01", "01", "01", "01", "01", "01"), test = structure(c(1L, 1L,
1L, 2L, 2L, 2L, 3L, 3L, 3L), .Label = c("A", "B", "C"), class = "factor"),
Halfyear = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L
), .Label = c("2014 First Half", "2015 First Half", "2016 First Half"
), class = "factor")), class = "data.frame", row.names = c(NA,
-9L))
# Your code, applied to the sample data above
library(tibble)  # rowid_to_column()
library(tidyr)   # spread() and the %>% pipe
df %>%
  rowid_to_column() %>%
  spread(Month, test)
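If you test this, you will see that spread correctly introduces NAs, because some Months have no test value. Since spread creates one column for every month that exists in my data, it also has to show NA wherever a given combination of month and test did not exist before.
Before spreading you had a sparse dataset that showed only the data that actually exists, but spreading completes the dataset in order to make it wide.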
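If the goal is one column per gene without the NA padding, one possible approach (not part of the original answer) is to number the rows within each timepoint/gene group instead of across the whole table, so that the n-th measurement of every gene at a time point shares one ID. The sketch below uses made-up toy data shaped like fgcrkmtptlog; the column name replicate is hypothetical, and it assumes dplyr is available and that replicates can sensibly be matched by their order within each group.

```r
library(dplyr)
library(tidyr)

# Toy data shaped like fgcrkmtptlog (values are invented for illustration)
toy <- data.frame(
  timepoint = rep(c(24, 48), each = 4),
  gene      = rep(rep(c("crk10", "crk18"), each = 2), times = 2),
  treatment = "treated",
  value     = c(1.1, 1.2, 2.1, 2.2, 3.1, 3.2, 4.1, 4.2)
)

wide <- toy %>%
  group_by(timepoint, gene) %>%
  mutate(replicate = row_number()) %>%  # 1, 2, ... within each timepoint/gene (hypothetical ID column)
  ungroup() %>%
  spread(gene, value)                   # one column per gene, keyed by timepoint + replicate

print(wide)
```

Because the ID now repeats across genes, rows from different genes line up instead of each landing on a row of its own. In current tidyr, spread is superseded by pivot_wider, so pivot_wider(names_from = gene, values_from = value) could be used in its place.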