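r - How to gather columns in several steps in R without losing the groupings
Question
I need to convert a wide dataset into a long one, and there are 16 columns that must converge into 4. Every 4 columns contain information that relates to each other, and that information must not get "lost" in the transformation.
I have data from a ranking task with four blocks, which basically gives me a dataset where the information is split into four groups in wide format, i.e. first_image, first_sex, first_score, second_image, second_sex, second_score ...
I have tried various combinations of group_by and gather(), but I am nowhere near a solution.
I have read "Reshaping multiple sets of measurement columns (wide format) into single columns (long format)", but I'm afraid it left me none the wiser.
I have made some sample data showing what the data for one participant looks like, and also an example of how I would like the data to look.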
library(tidyverse)
sample_dat <- data.frame(subject_id = rep("sj1", 4),
first_pick = rep(1, 4),
first_image_pick = (c("a", "b", "c", "d")),
first_pick_neuro = rep("TD", 4),
first_pick_sex = rep("F", 4),
second_pick = rep(2, 4),
second_image_pick = (c("e", "f", "g", "h")),
second_pick_neuro = rep("TD", 4),
second_pick_sex = rep("M", 4),
third_pick = rep(3, 4),
third_image_pick = (c("i", "j", "k", "l")),
third_pick_neuro = rep("DS", 4),
third_pick_sex = rep("F", 4),
fourth_pick = rep(4, 4),
fourth_image_pick = (c("m", "n", "o", "p")),
fourth_pick_neuro = rep("DS", 4),
fourth_pick_sex = rep("M", 4))
Expected output:
final_data <- data.frame(subject_id = rep("sj1", 16),
image = c("a", "b", "c", "d",
"e", "f", "g", "h",
"i", "j", "k", "l",
"m", "n", "o", "p"),
rank = rep(c(1, 2, 3, 4), each = 4), # from the numbers in the first_pick, second_pick etc.
neuro = rep(c("TD", "DS"), each = 8),
sex = rep(c("F", "M", "F", "M"), each = 4))
So far I have tried this, but it just duplicates all of the information:
sample_dat_long <- sample_dat %>%
group_by(subject_id) %>%
gather(Pick, Image,
first_image_pick,
second_image_pick,
third_image_pick,
fourth_image_pick)
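So basically, I don't want to lose the information that belongs to each image (pick, sex, neuro) when gathering the data.
Any help would be great!
Solution
We can do this with melt from data.table, which can reshape multiple measure patterns from 'wide' to 'long' format in a single call. Here, the columns whose names contain the substrings 'image', 'neuro', and 'sex' are reshaped into separate columns to get the expected output.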
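library(data.table)
# Each pattern selects one group of columns; the groups are melted in parallel,
# so the image, neuro and sex values of each pick stay on the same row.
melt(setDT(sample_dat), measure = patterns("image", "neuro", "sex"),
value.name = c("image", "neuro", "sex"), variable.name = 'rank')[,
.(subject_id, rank, image, neuro, sex)]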
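One detail worth checking: with several patterns, melt returns the variable.name column as a factor that indexes the column groups ("1", "2", ...), whereas the expected output in the question stores rank as a number. The following is a minimal sketch of that conversion, not part of the original answer. The table `wide` is a hypothetical two-pick stand-in for sample_dat, shortened only to keep the example small.

```r
library(data.table)

# Hypothetical, shortened stand-in for sample_dat (two picks, two rows)
wide <- data.table(
  subject_id        = rep("sj1", 2),
  first_pick        = rep(1, 2),
  first_image_pick  = c("a", "b"),
  first_pick_neuro  = rep("TD", 2),
  first_pick_sex    = rep("F", 2),
  second_pick       = rep(2, 2),
  second_image_pick = c("e", "f"),
  second_pick_neuro = rep("DS", 2),
  second_pick_sex   = rep("M", 2)
)

# Melt the three column groups in parallel, as in the answer above
long <- melt(
  wide,
  measure.vars  = patterns("image", "neuro", "sex"),
  value.name    = c("image", "neuro", "sex"),
  variable.name = "rank"
)

# rank is a factor indexing the matched column groups; convert it to integer
long[, rank := as.integer(as.character(rank))]

# Keep only the columns wanted in the final long table
long <- long[, .(subject_id, rank, image, neuro, sex)]
print(long)
```

This relies on the first_*, second_*, ... columns appearing in rank order in the wide table, because the group index reflects column order rather than the values stored in first_pick, second_pick, and so on.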