Creating a proportion of responses across columns

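Question

Hello lovely Stack Overflow people:

I'm a beginner in R, and I'm currently trying to create a proportion for each participant in my study: the number of their close friends who are different from them, divided by their total number of friends.

I arranged the data from Qualtrics so that participants could classify up to 20 friends as "same" or "different", and these responses sit in columns Q34_1_44 through Q34_20_63. Of course, not every participant rated all 20 available close friends, so many of the columns are almost empty. Regardless, I'm trying to create another column containing a value equal to the number of different friends divided by the total number rated (so the blanks shouldn't matter). I tried the following, but got stuck because I realized this isn't a suitable function for the job (at least as far as I know):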

    dataset.clean2 <- mutate(dataset.clean1, diff.friend=ifelse(Q34_1_44=="Different race",))

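If anyone knows an easy-to-understand way (no matter how tedious) to find the percentage of different friends for each participant, your answer would be greatly appreciated!

Thank you so much in advance <3

Tags: r

Answer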


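On this forum, you should include sample data in the same format as your own. That saves work for the people who want to help you and reduces ambiguity, both of which make it more likely that your question gets answered.

For example: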

my_data <- structure(
  list(
    id  = 1:5,
    Q_1 = c("diff", "same", "diff", "same", "same"),
    Q_2 = c("diff", "diff", "same", "diff", "same"),
    Q_3 = c("same", "diff", "diff", "diff", "same"),
    Q_4 = c("diff", "diff", "same", "same", "same"),
    Q_5 = c("same", "diff", "diff", "diff", "diff")
  ),
  row.names = c(NA, -5L),
  class = c("tbl_df", "tbl", "data.frame")
)

One approach using dplyr:

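library(dplyr)
my_data %>%
  # Evaluate the next step one row (participant) at a time
  rowwise() %>%
  # Count the "diff" answers in every column except id, then divide by
  # the 5 question columns in this sample data
  mutate(diff_share = sum(c_across(-id) == "diff")/5)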

     id Q_1   Q_2   Q_3   Q_4   Q_5   diff_share
  <int> <chr> <chr> <chr> <chr> <chr>      <dbl>
1     1 diff  diff  same  diff  same         0.6
2     2 same  diff  diff  diff  diff         0.8
3     3 diff  same  diff  same  diff         0.6
4     4 same  diff  diff  same  diff         0.6
5     5 same  same  same  same  diff         0.2
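
The sample data above has no blanks, so dividing by a fixed 5 works there. The question mentions that many participants rated fewer than the maximum number of friends, so here is a minimal sketch of one way to divide by the number of friends actually rated instead. It assumes that unanswered slots show up as `NA` or as empty strings; the data frame `my_data_blanks` and the helper columns `n_rated` and `n_diff` are hypothetical names made up for illustration.

```r
library(dplyr)

# Hypothetical data: NA or "" marks a friend slot the participant left blank
my_data_blanks <- tibble(
  id  = 1:3,
  Q_1 = c("diff", "same", "diff"),
  Q_2 = c("diff", NA, "same"),
  Q_3 = c("same", NA, ""),
  Q_4 = c(NA, NA, "diff")
)

result <- my_data_blanks %>%
  # Treat empty strings as missing so they drop out of the denominator
  mutate(across(starts_with("Q_"), ~ na_if(.x, ""))) %>%
  mutate(
    # Number of friends the participant actually rated
    n_rated    = rowSums(!is.na(across(starts_with("Q_")))),
    # Number of those rated as "diff" (NA comparisons are ignored)
    n_diff     = rowSums(across(starts_with("Q_")) == "diff", na.rm = TRUE),
    # Share of rated friends who are "diff"
    diff_share = n_diff / n_rated
  )

result
```

With the columns from the question, the selector would presumably be something like `Q34_1_44:Q34_20_63` and the label `"Different race"` rather than `starts_with("Q_")` and `"diff"`. A participant who rated no friends at all gets `NaN` here, because the calculation is 0 divided by 0.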
