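r - Loop through a data frame and pass rows as arguments to a function
Question
I want to loop through a data frame and pass its rows as arguments to a function, in order to compute totals from a data frame named df3.
I tried code using a traditional for loop, but it produced no result.
I looked at pmap in https://adv-r.hadley.nz/functionals.html#pmap
but I can't see how to apply that example to my code.
Here is a sample of the original data: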
dput(head(df3,n=3))
structure(list(id = c("81", "83", "85"), look_work = c("yes",
"yes", "yes"), current_work = c("no", "yes", "no"), hf_l5k = c("",
"", ""), ac_l5k = c("", "", ""), hf_5_10k = c("", "1", "1"),
ac_5_10k = c("", "1", "1"), hf_11_20k = c("", "", ""), ac_11_20k = c("",
"", ""), hf_21_50k = c("", "", ""), ac_21_50k = c("", "",
""), hf_51_100k = c("", "", ""), ac_51_100k = c("", "", ""
), hf_m100k = c("", "", ""), ac_m100k = c("", "", ""), s_l1000 = c("",
"", ""), se_l1000 = c("", "", "1"), s_1001_1500 = c("", "1",
"1"), se_1001_1500 = c("", "", ""), s_2001_3000 = c("", "",
""), se_2001_3000 = c("", "1", ""), s_3001_4000 = c("", "",
""), se_3001_4000 = c("", "", ""), s_4001_5000 = c("", "",
""), se_4001_5000 = c("", "", ""), s_5001_6000 = c("", "",
""), se_5001_6000 = c("", "", ""), s_m6000 = c("", "", ""
), se_m6000 = c("", "", ""), s_n_ans = c("", "", ""), se_n_ans = c("",
"", ""), before_work = c("no", "NULL", "yes"), keen_move = c("yes",
"yes", "no"), city_size = c("village", "more than 500k inhabitants",
"more than 500k inhabitants"), gender = c("male", "female",
"female"), age = c("18 - 24 years", "18 - 24 years", "more than 50 years"
), education = c("secondary", "vocational", "secondary")), row.names = c(NA,
3L), class = "data.frame")
This is the data frame of arguments, hf_names:
structure(list(hf_names = c("hf_l5k", "hf_5_10k", "hf_11_20k",
"hf_21_50k", "hf_51_100k", "hf_m100k"), job = c("hf_l5k_job",
"hf_5_10k_job", "hf_11_20k_job", "hf_21_50k_job", "hf_51_100k_job",
"hf_m100k_job"), tot = c("hf_l5k_tot", "hf_5_10k_tot", "hf_11_20k_tot",
"hf_21_50k_tot", "hf_51_100k_tot", "hf_m100k_tot")), class = "data.frame", row.names = c(NA,
-6L))
This is the code I tried, using a traditional for loop:
library(dplyr)
tot_function <- function(df, filter_tot, col_name1, col_name2) {
# filter desired columns for all jobs
filter_tot <- df %>% filter(col_name1=="1") %>%
summarise(col_name2 = n())
}
for (i in seq_along(hf_names3)) {
tot_function(df3, hf_names3$tot[i], hf_names3$hf_names[i], hf_names3$job[i])
}
The expected result would be a data frame or a vector:
hf_l5k_jobs hf_l5_10k_jobs
10 193
But this code does not produce anything, and the pmap examples I found only cover simple functions such as trim and runif.
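Answer
I don't think you need to make this complicated. You can take the column names from hf_names, extract each of those columns from df3, and count the number of 1s in that column.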
sapply(hf_names$hf_names, function(x) sum(df3[[x]] == 1))
# hf_l5k hf_5_10k hf_11_20k hf_21_50k hf_51_100k hf_m100k
# 0 2 0 0 0 0
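The same count can also be written as an explicit loop, which may help show what the original attempt was missing. This is only a minimal sketch in base R: the toy df3 and hf_names below and the helper count_ones are made up for illustration. Three things appear to stop the original code from producing output: filter(col_name1 == "1") compares the string stored in col_name1 with "1" rather than the column it names, the function ends in an assignment so its value is returned invisibly, and a for loop discards any value that is not stored or printed. Note also that seq_along() on a data frame iterates over its columns, not its rows.

```r
# Toy stand-ins for df3 and hf_names (hypothetical values, not the real data)
df3 <- data.frame(
  hf_l5k   = c("", "", ""),
  hf_5_10k = c("", "1", "1"),
  stringsAsFactors = FALSE
)
hf_names <- data.frame(
  hf_names = c("hf_l5k", "hf_5_10k"),
  job      = c("hf_l5k_job", "hf_5_10k_job"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: count the rows where the column named by `col` is "1".
# df[[col]] looks the column up by the string held in `col`.
count_ones <- function(df, col) {
  sum(df[[col]] == "1")
}

# Preallocate a named integer vector, then store each result explicitly,
# iterating over the rows of hf_names rather than its columns.
totals <- setNames(integer(nrow(hf_names)), hf_names$job)
for (i in seq_len(nrow(hf_names))) {
  totals[i] <- count_ones(df3, hf_names$hf_names[i])
}

print(totals)
```

Storing the result in totals is what makes it available after the loop finishes.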
If you prefer the tidyverse, you can replace sapply with one of the map_* variants:
purrr::map_int(hf_names$hf_names, ~sum(df3[[.]] == 1))
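If you do want each row of hf_names handed to a function as separate arguments, which is the pattern pmap provides, base R's mapply works the same way and needs no extra packages. The following is a rough sketch under assumptions: the toy df3 and hf_names are invented for illustration, and labelling the result with the job column is my guess at the output layout the question asks for.

```r
# Toy stand-ins for df3 and hf_names (hypothetical values, not the real data)
df3 <- data.frame(
  hf_l5k   = c("", "", ""),
  hf_5_10k = c("", "1", "1"),
  stringsAsFactors = FALSE
)
hf_names <- data.frame(
  hf_names = c("hf_l5k", "hf_5_10k"),
  job      = c("hf_l5k_job", "hf_5_10k_job"),
  stringsAsFactors = FALSE
)

# mapply walks the two columns in parallel, so each call receives the
# values of one row of hf_names as the arguments `col` and `label`.
# `label` is unused inside the function; it is only there to show how
# several columns of a row can be passed along together.
totals <- mapply(
  function(col, label) sum(df3[[col]] == "1"),
  hf_names$hf_names,
  hf_names$job,
  USE.NAMES = FALSE
)
names(totals) <- hf_names$job

# Turn the named vector into a one-row data frame, one column per job.
result <- as.data.frame(as.list(totals))
print(result)
```

With purrr, the equivalent call would be pmap_int over the same columns; the base version is shown here only to keep the sketch dependency-free.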