r - 如何使用宽数据格式的第二行作为额外的变量名
问题描述
我有一个数据框,其中第一行是列名(当然),但第二行是一个额外的信息和一个我想对其应用统计信息的变量/例如,第一行是植物的 ID,第二行是位置,其余行是时间序列的因变量。重要的是要注意第一个列是我的 x 轴,它代表时间。这是我的一段数据:
Days L-FCS L-DRC L-PCH S-PCH S-PCH S-SSV M-SSV L-SSV S-DRC L-MNS L-DRC
Room 1-BR-SW 1-BR-SW 1-BR-SW 1-BR-SW 1-BR-SW 1-BR-SW 1-BR-SW 2-BR-SE 2-BR-SE 2-BR-SE
0.00 0 0 0 0 0 0 0 0 0 0 0
0.04 0 0 1 0 0 0 1 0 0 0 0
0.08 0 0 1 0 0 1 2 0 0 0 0
0.13 0 0 -1 0 0 2 3 0 0 0 0
0.17 0 0 -1 0 0 3 4 0 0 0 0
0.21 0 0 -1 0 0 4 5 0 0 0 0
0.25 0 0 -1 0 0 4 6 0 0 0 0
0.29 0 0 -2 0 0 4 6 0 0 0 0
0.33 0 0 -1 0 0 4 6 0 0 0 0
0.38 -1 0 -1 0 0 4 6 0 0 0 0
0.42 -2 0 -1 0 0 4 6 0 0 0 0
0.46 -5 0 -1 0 0 4 6 0 0 0 0
0.50 -5 0 -2 0 0 4 6 0 0 -1 0
0.54 -5 0 -2 0 0 4 6 0 0 -2 0
0.58 -6 0 -3 0 0 4 7 0 0 -3 0
0.63 -8 0 -3 0 0 4 8 0 0 -3 0
0.67 -9 0 -3 0 0 4 8 0 0 -3 0
0.71 -9 0 -3 0 0 4 11 0 -1 -3 0
0.75 -9 0 -3 0 0 4 11 0 -1 -4 0
0.79 -9 0 -3 0 0 4 13 0 -1 -5 0
0.83 -10 0 -3 0 0 4 13 0 -1 -5 0
0.88 -12 0 -3 0 0 4 13 0 -1 -5 0
0.92 -13 0 -4 0 0 4 13 0 -1 -6 0
0.96 -14 0 -5 0 -1 4 13 0 -1 -6 0
1.00 -14 0 -5 0 -1 4 13 0 -1 -6 0
1.04 -15 0 -5 0 -1 4 13 0 -2 -6 0
1.08 -16 0 -5 0 -1 4 13 0 -2 -6 0
我省略了一些行名,因为它不适合这里的 col(第一行和第二行的名称对于数字来说太宽了)
为了将来使用,我很想知道如何将任何多行用作我的数据的变量。我试图将其重塑为长格式(并且我将此数据用于其他目的的长格式)但我找不到如何重塑它,所以我也有这些列。
到目前为止我所做的是完全省略第二行,所以我只有 Days col,它的第一个调用是 0.00
再次附上 dput() :
structure(list(Days = c("Room", "0.00", "0.04", "0.08", "0.13",
"0.17"), L.FCS = c("1-BR-SW", "0", "0", "0", "0", "0"), L.DRC = c("1-BR-SW",
"0", "0", "0", "0", "0"), L.PCH = c("1-BR-SW", "0", "0", "0",
"0", "0"), S.PCH = c("1-BR-SW", "0", "0", "0", "0", "0"), S.PCH.1 = c("1-BR-SW",
"0", "0", "0", "0", "0"), S.SSV = c("1-BR-SW", "0", "0", "0",
"0", "0"), Hodaya_M = c("1-BR-SW", "0", "1", "1", "3", "3"),
L.SSV = c("2-BR-SE", "0", "-1", "-1", "-2", "-2"), S.DRC = c("2-BR-SE",
"0", "0", "-1", "-1", "-1")), row.names = c(NA, 6L), class = "data.frame")
解决方案
我建议分别提取标题信息和数据,然后将两者结合起来。
library(tidyverse)
df1_headers <- df1 %>%
janitor::clean_names() %>% # Reformats names and makes them unique
slice(1) %>% # only keep the first data row, your "location"
mutate(across(,as.character)) %>% # make everything character data
pivot_longer(everything(), values_to = "location")
df1_body <- df1 %>%
janitor::clean_names() %>%
slice(-1) %>% # remove only the first row
mutate(across(,as.numeric), # make numeric
row = row_number()) %>% # add row number in case days not unique / ordered
pivot_longer(-c(row, days))
df_long <- df1_body %>% left_join(df1_headers)
这种格式对于进一步的分析和可视化应该非常灵活,例如
ggplot(df_long, aes(days, value, color = name)) +
geom_line() +
facet_wrap(~location)
源数据
df1 <- data.frame(
stringsAsFactors = FALSE,
check.names = FALSE,
Days = c("Room","0","0.04","0.08",
"0.13","0.17","0.21","0.25","0.29","0.33","0.38",
"0.42","0.46","0.5","0.54","0.58","0.63","0.67",
"0.71","0.75","0.79","0.83","0.88","0.92","0.96","1",
"1.04","1.08"),
`L-FCS` = c("1-BR-SW","0","0","0","0",
"0","0","0","0","0","-1","-2","-5","-5","-5",
"-6","-8","-9","-9","-9","-9","-10","-12","-13",
"-14","-14","-15","-16"),
`L-DRC` = c("1-BR-SW","0","0","0","0",
"0","0","0","0","0","0","0","0","0","0","0",
"0","0","0","0","0","0","0","0","0","0","0","0"),
`L-PCH` = c("1-BR-SW","0","1","1","-1",
"-1","-1","-1","-2","-1","-1","-1","-1","-2",
"-2","-3","-3","-3","-3","-3","-3","-3","-3","-4",
"-5","-5","-5","-5"),
`S-PCH` = c("1-BR-SW","0","0","0","0",
"0","0","0","0","0","0","0","0","0","0","0",
"0","0","0","0","0","0","0","0","0","0","0","0"),
`S-PCH` = c("1-BR-SW","0","0","0","0",
"0","0","0","0","0","0","0","0","0","0","0",
"0","0","0","0","0","0","0","0","-1","-1","-1",
"-1"),
`S-SSV` = c("1-BR-SW","0","0","1","2",
"3","4","4","4","4","4","4","4","4","4","4",
"4","4","4","4","4","4","4","4","4","4","4","4"),
`M-SSV` = c("1-BR-SW","0","1","2","3",
"4","5","6","6","6","6","6","6","6","6","7",
"8","8","11","11","13","13","13","13","13","13",
"13","13"),
`L-SSV` = c("2-BR-SE","0","0","0","0",
"0","0","0","0","0","0","0","0","0","0","0",
"0","0","0","0","0","0","0","0","0","0","0","0"),
`S-DRC` = c("2-BR-SE","0","0","0","0",
"0","0","0","0","0","0","0","0","0","0","0",
"0","0","-1","-1","-1","-1","-1","-1","-1","-1",
"-2","-2"),
`L-MNS` = c("2-BR-SE","0","0","0","0",
"0","0","0","0","0","0","0","0","-1","-2","-3",
"-3","-3","-3","-4","-5","-5","-5","-6","-6",
"-6","-6","-6"),
`L-DRC` = c(NA,0L,0L,0L,0L,0L,0L,0L,
0L,0L,0L,0L,0L,0L,0L,0L,0L,0L,0L,0L,0L,0L,
0L,0L,0L,0L,0L,0L)
推荐阅读
- python - 在不导入/运行的情况下查看另一个 Python 文件的变量
- scala - 将scala代码转换为python LDA
- android - JobService 未发送通知
- javascript - 如何在php中使用ckeditor
- android - VertScrollBox 当底部到达 android 和 IOS 时检测
- c++ - 本征和巨大的密集二维阵列
- rust - “借来的数据不能存储在其关闭之外”是什么意思?
- javascript - 如何在 Angular 6 中添加脚本文件?
- python - Keras 在一类 Cifar-10 上过拟合
- java - 如何使用延迟初始化?[延迟初始化异常]