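r - Wide to long with many different columns
Problem description
I have used pivot_longer before, but this time I have a more complex wide data frame that I can't work out how to reshape. The example code below gives you a reproducible data frame. I haven't dealt with anything like this before, so I'm not sure whether putting this kind of df into long format is even the right approach?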
df <- data.frame(
ID = as.numeric(c("7","8","10","11","13","15","16")),
AGE = as.character(c("45 – 54","25 – 34","25 – 34","25 – 34","25 – 34","18 – 24","35 – 44")),
GENDER = as.character(c("Female","Female","Male","Female","Other","Male","Female")),
SD = as.numeric(c("3","0","0","0","3","2","0")),
GAMING = as.numeric(c("0","0","0","0","2","2","0")),
HW = as.numeric(c("2","2","0","2","2","2","2")),
R1_1 = as.numeric(c("10","34","69","53","79","55","28")),
M1_1 = as.numeric(c("65","32","64","53","87","55","27")),
P1_1 = as.numeric(c("65","38","67","54","88","44","26")),
R1_2 = as.numeric(c("15","57","37","54","75","91","37")),
M1_2 = as.numeric(c("90","26","42","56","74","90","37")),
P1_2 = as.numeric(c("90","44","33","54","79","95","37")),
R1_3 = as.numeric(c("5","47","80","27","61","19","57")),
M1_3 = as.numeric(c("30","71","80","34","71","15","57")),
P1_3 = as.numeric(c("30","36","81","35","62","8","56")),
R2_1 = as.numeric(c("10","39","75","31","71","80","59")),
M2_1 = as.numeric(c("90","51","74","15","70","75","61")),
P2_1 = as.numeric(c("90","52","35","34","69","83","60")),
R2_2 = as.numeric(c("10","45","31","54","39","95","77")),
M2_2 = as.numeric(c("60","70","40","78","5","97","75")),
P2_2 = as.numeric(c("60","40","41","58","9","97","76")),
R2_3 = as.numeric(c("5","38","78","45","25","16","22")),
M2_3 = as.numeric(c("30","34","84","62","33","52","20")),
P2_3 = as.numeric(c("30","34","82","45","32","16","22")),
R3_1 = as.numeric(c("10","40","41","42","62","89","41")),
M3_1 = as.numeric(c("90","67","37","40","27","89","42")),
P3_1 = as.numeric(c("90","34","51","44","38","84","43")),
R3_2 = as.numeric(c("10","37","20","54","8","93","69")),
M3_2 = as.numeric(c("60","38","21","62","5","95","71")),
P3_2 = as.numeric(c("60","38","23","65","14","92","69")),
R3_3 = as.numeric(c("5","30","62","11","60","32","52")),
M3_3 = as.numeric(c("30","67","34","55","45","25","45")),
P3_3 = as.numeric(c("30","28","41","24","53","23","52")),
R1_4 = as.numeric(c("10","40","61","17","39","72","25")),
M1_4 = as.numeric(c("45","20","63","25","62","70","23")),
P1_4 = as.numeric(c("45","52","56","16","26","72","27")),
R2_4 = as.numeric(c("5","21","70","33","80","68","30")),
M2_4 = as.numeric(c("35","21","69","27","85","69","23")),
P2_4 = as.numeric(c("35","32","34","25","79","63","29")),
R3_4 = as.numeric(c("10","29","68","21","8","71","41")),
M3_4 = as.numeric(c("50","37","66","28","33","65","41")),
P3_4 = as.numeric(c("50","38","47","28","24","71","41"))
)
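The new column names are extracted from the old column names. Taking R1_1 as an example:
- R is the name of the column that will hold the values previously stored in R1_1
- 1 (the first character after "R" in R1_1) is the value used in the Speed column
- 1 (the last character of "R1_1") is the value used in the Sound column
Basically, each row corresponds to one question answered by one person, and each question is answered with three different ratings (R, M, P).
Thank you!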
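The naming scheme described above can be illustrated with a small base R sketch. It assumes every rating column follows the letter, digit, underscore, digit layout; the vector `nm` and the output column name `Rating` are made up for illustration and are not part of the original data.

```r
# A few column names taken from the question's data frame
nm <- c("R1_1", "M2_3", "P3_4")

# strcapture() (utils, shipped with R) splits each name by regex groups
# and converts each group to the type given in `proto`
parts <- utils::strcapture(
  pattern = "^([RMP])(\\d)_(\\d)$",
  x = nm,
  proto = data.frame(
    Rating = character(),  # hypothetical name for the R/M/P letter
    Speed  = integer(),    # digit right after the letter
    Sound  = integer(),    # digit after the underscore
    stringsAsFactors = FALSE
  )
)

print(parts)
```

Each input name becomes one row holding the rating letter, the Speed value, and the Sound value, which is the same split the reshaped data frame needs.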
Solution
If I understand correctly, the following should work:
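library(tidyr)  # pivot_longer(), pivot_wider(); also re-exports %>% and matches()

df %>%
  # split each name such as R1_1 into the rating letter, Speed, and Sound
  pivot_longer(
    cols = matches('[RMP]\\d_\\d'),
    names_to = c('RMP', 'Speed', 'Sound'),
    values_to = 'Data',
    names_pattern = '([RMP])(\\d)_(\\d)'
  ) %>%
  # spread the rating letter back out into separate R, M, and P columns
  pivot_wider(names_from = RMP, values_from = Data)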
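As a possible alternative, not part of the original answer, tidyr's special `.value` entry in `names_to` can do the same reshape in a single `pivot_longer()` call, so the `pivot_wider()` step is not needed. The sketch below uses a hypothetical `toy` data frame built from the first two rows and six rating columns of the question's data.

```r
library(tidyr)

# Hypothetical subset of the question's df: two people, two Sound levels
toy <- data.frame(
  ID   = c(7, 8),
  R1_1 = c(10, 34), M1_1 = c(65, 32), P1_1 = c(65, 38),
  R1_2 = c(15, 57), M1_2 = c(90, 26), P1_2 = c(90, 44)
)

# ".value" tells pivot_longer() that the first regex group (R, M or P)
# names the output column, while the other groups become Speed and Sound
long <- pivot_longer(
  toy,
  cols = matches('^[RMP]\\d_\\d$'),
  names_to = c('.value', 'Speed', 'Sound'),
  names_pattern = '([RMP])(\\d)_(\\d)'
)

print(long)
```

In both variants Speed and Sound are extracted from column names, so they come out as character columns and may need converting if numeric values are wanted.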
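This assumes that Speed and Sound are both single-digit values. If either can have more than one digit, each \\d in the patterns above needs to be replaced with \\d+.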
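A minimal sketch of the multi-digit case follows. The `toy` data frame and its column names such as `R10_2` are invented for illustration, since the question's data only contains single-digit values.

```r
library(tidyr)

# Hypothetical data in which Speed can have two digits (R10_2 and friends)
toy <- data.frame(
  ID    = c(1, 2),
  R1_1  = c(10, 20), M1_1  = c(11, 21), P1_1  = c(12, 22),
  R10_2 = c(30, 40), M10_2 = c(31, 41), P10_2 = c(32, 42)
)

long <- toy %>%
  pivot_longer(
    cols = matches('^[RMP]\\d+_\\d+$'),      # \\d+ accepts one or more digits
    names_to = c('RMP', 'Speed', 'Sound'),
    values_to = 'Data',
    names_pattern = '([RMP])(\\d+)_(\\d+)'
  ) %>%
  pivot_wider(names_from = RMP, values_from = Data)

print(long)
```

With the single-digit pattern the `R10_2` style columns would not be split into the intended Speed and Sound values, so `\\d+` is the safer choice whenever the number of digits is not fixed.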