R - Looping a script that pulls data from an online database and gives each iteration a unique name

Question

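I'm fairly new to R and have been struggling to simplify some code with a for loop. I'm trying to pull water quality data from an online database using the dataRetrieval package. So far I have copied the code for each site, changing the site number and the output name each time. I've been trying to simplify this by putting the script in a for loop, but I'm having trouble creating separate data tables with unique identifiers.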

Here is the original code that creates a data table for each site. The only things that change are siteNumbers and the data table name, "x"_dataTable.

#BW00A
siteNumbers = c("383652091125002")
parameterCode = c("00010","00095", "00300", "00400", "34475", "34485", "45617")
startDate = "1900-01-01"
endDate = "2020-12-01"

BW00A_dataTable <- readNWISqw(siteNumbers, parameterCode,
                             startDate, endDate)
#BW01
siteNumbers = c("383648091124501")
parameterCode = c("00010","00095", "00300", "00400", "34475", "34485", "45617")
startDate = "1900-01-01"
endDate = "2020-12-01"

BW01_dataTable <- readNWISqw(siteNumbers, parameterCode,
                             startDate, endDate)
#BW01A
siteNumbers = c("383648091124502")
parameterCode = c("00010","00095", "00300", "00400", "34475", "34485", "45617")
startDate = "1900-01-01"
endDate = "2020-12-01"

BW01A_dataTable <- readNWISqw(siteNumbers, parameterCode,
                             startDate, endDate)

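Here is the new code, which I can't get to work. I've put siteNumbers and siteName into a data frame. What I want is for the script inside the for loop to iterate over siteNumbers to pull the data, then assign each newly created data table to the corresponding siteName (aka unique_siteName). I'm not sure if this is possible.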

df <- data.frame(
  siteNumbers = c("383652091125001",    "383652091125002",  "383648091124501",  "383648091124502",  "383506091132201",  "383508091132002",  "383508091132004",  "383519091133701",  "383544091132601",  "383544091132502",  "383628091124801",  "383639091125902",  "383639091125901",  "383638091125001",  "383638091125002",  "383631091124803",  "383631091124804",  "383631091124801",  "383631091124802",  "383636091123801",  "383636091123811",  "383616091125701",  "383640091130701",  "383640091130702",  "383621091130701",  "383621091130703",  "383621091130702",  "383624091130501",  "383624091130502",  "383616091130801",  "383616091130802",  "383644091131601",  "383627091130201",  "383622091130604",  "383622091130605",  "383557091132001",  "383614091132801"),
  siteName = c("BW-00", "BW-00A",   "BW-01",    "BW-01A",   "MW-04",    "MW-04A",   "MW-04B",   "MW-11",    "BW-21",    "BW-21A",   "210TB-C6", "Bates Spring", "Bates Spring below dam",   "BW-02",    "BW-02A",   "BW-04A-D", "BW-04A-S", "BW-04D",   "BW-04S",   "BW-05",    "BW-05A",   "BW-07",    "BW-08",    "BW-08A",   "BW-11",    "BW-11A-D", "BW-11A-S", "BW-13",    "BW-13A",   "BW-14",    "BW-14A",   "BW4-15",   "BW4-16",   "BW4-17",   "BW4-18",   "W3",   "W4")
)

parameterCode = c("00010","00095", "00300", "00400", "34475", "34485", "45617")
startDate = "1900-01-01"
endDate = "2020-12-01"

for (row in df)
{
 unique_siteName <- readNWISqw(siteNumbers, parameterCode,
                             startDate, endDate)  
  
}

Thanks for your help!

Tags: r, for-loop, uniqueidentifier, data-retrieval

Answer


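You need to loop over the row indices, index into the data frame by row number inside the loop, and create a list to accumulate the results: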

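results <- list()
# Loop over row indices so each site number can be looked up by position
for (i in seq_len(nrow(df))) {
  results[[i]] <- readNWISqw(df$siteNumbers[i], parameterCode,
                             startDate, endDate)
}
# Label each list element with its site name
names(results) <- df$siteName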
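
To see what the named list gives you without hitting the online database, here is a minimal sketch of the same loop. It assumes a hypothetical stand-in function, fake_read, in place of readNWISqw, and uses only two of the sites from the question; the returned columns are invented for illustration.

```r
# Hypothetical stand-in for readNWISqw(), so the sketch runs offline.
# It only echoes its inputs back as a small data frame.
fake_read <- function(siteNumber, parameterCode, startDate, endDate) {
  data.frame(site_no = siteNumber, parm_cd = parameterCode,
             stringsAsFactors = FALSE)
}

sites <- data.frame(
  siteNumbers = c("383652091125002", "383648091124501"),
  siteName    = c("BW-00A", "BW-01"),
  stringsAsFactors = FALSE
)
parameterCode <- c("00010", "00095")
startDate <- "1900-01-01"
endDate   <- "2020-12-01"

# Pre-allocate one list slot per site, then fill each slot by row index
results <- vector("list", nrow(sites))
for (i in seq_len(nrow(sites))) {
  results[[i]] <- fake_read(sites$siteNumbers[i], parameterCode,
                            startDate, endDate)
}
names(results) <- sites$siteName

# Each site's table is now available by name instead of as a separate variable
results[["BW-01"]]
```

Looking tables up by name, as in results[["BW-01"]], takes the place of separate variables such as BW01_dataTable, and it also works for names like "Bates Spring" that are not valid variable names.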

R also provides lapply as a way to simplify this common pattern. The loop above is equivalent to:

results <- lapply(df$siteNumbers, FUN = readNWISqw, parameterCode, startDate, endDate)
names(results) <- df$siteName
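
As a small variation, lapply keeps the names of its input, so naming the site-number vector first yields a named list in one step. The sketch below assumes a hypothetical stand-in, fake_read, instead of readNWISqw so that it runs without network access; the extra arguments are passed by name purely for readability.

```r
# Hypothetical stand-in for readNWISqw(), so the sketch runs offline.
fake_read <- function(siteNumber, parameterCode, startDate, endDate) {
  data.frame(site_no = siteNumber, parm_cd = parameterCode,
             stringsAsFactors = FALSE)
}

siteNumbers <- c("383652091125002", "383648091124501")
siteName    <- c("BW-00A", "BW-01")

# Attach the site names to the site numbers before iterating
named_sites <- setNames(siteNumbers, siteName)

# Each element of named_sites becomes the first argument of fake_read;
# the remaining arguments are forwarded unchanged on every call
results <- lapply(named_sites, fake_read,
                  parameterCode = c("00010", "00095"),
                  startDate = "1900-01-01",
                  endDate = "2020-12-01")

names(results)
```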

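I'd recommend reading my answer at "How do I make a list of data frames?" for more discussion and explanation, including why we do it this way and what the next steps are (for example, combining the results list into a single data frame).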
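
One possible way to do the combining step in base R is sketched below. The results list here is built by hand with made-up values and invented column names (parm_cd, value), purely to show the mechanics; real output from dataRetrieval will have different columns.

```r
# Made-up example data standing in for the downloaded tables
results <- list(
  "BW-00A" = data.frame(parm_cd = c("00010", "00095"),
                        value = c(14.2, 510),
                        stringsAsFactors = FALSE),
  "BW-01"  = data.frame(parm_cd = "00010",
                        value = 13.8,
                        stringsAsFactors = FALSE)
)

# Add a siteName column to each table so the origin survives the merge
labelled <- Map(function(d, nm) {
  data.frame(siteName = nm, d, stringsAsFactors = FALSE)
}, results, names(results))

# Stack all tables row-wise into one data frame
combined <- do.call(rbind, labelled)
rownames(combined) <- NULL

combined
```

This relies on every table having the same columns. If some sites return different columns, a helper such as dplyr::bind_rows(results, .id = "siteName") may be a better fit, since it fills missing columns with NA.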

