r - Why does code that works on its own fail when run in parallel in R?
Problem description
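I wrote some code (attached below) that works fine when I run it serially, but when I run it with the foreach and doParallel libraries in R, I get an error that reads: "task 1 failed - "missing value where TRUE/FALSE needed"".
Everything inside the foreach loop works on its own, and on smaller batches I can run it serially and it works.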
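
For context on the message itself, here is a minimal base R sketch (separate from my actual code) of how "missing value where TRUE/FALSE needed" arises: `if` receives `NA` instead of `TRUE` or `FALSE`. The `category` vector is a made-up stand-in for `DoubleTempData$CATEGORY`, and whether this is what actually happens on the workers is an assumption, not something I have confirmed.

```r
# Hypothetical toy vector standing in for DoubleTempData$CATEGORY
category <- c("a", NA, "a")

# Comparing a value against NA yields NA, not TRUE or FALSE
cmp <- category[1] == category[2]

# Passing NA to if() raises the error; tryCatch captures its message
msg <- tryCatch(
  {
    if (cmp) "match" else "no match"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Indexing past the end of a vector also yields NA, so a stale row
# index can trigger the same error
out_of_range <- category[5]

# isTRUE() treats NA as "not a match" instead of failing
safe_match <- isTRUE(category[1] == category[3])
```

So any comparison in the loop body that ends up looking at a missing category, a missing column, or a row index past the end of the data would produce this message.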

    ListOfColumns <- colnames(tempdata)
    foreach(i = 1:nSubsets,
            .export = ls(globalenv())) %dopar% {
      DoubleTempData <- get(paste0("Subset", i))
      DoubleTempData <- subset(DoubleTempData, select = -c(subset))
      RowCounter <- 2
      ColumnFigurer <- 2
      LastCATEGORYIndicator <- "THERE IS NO CATEGORY, ONLY ZUUL"
      while (RowCounter <= nrow(DoubleTempData)) {
        print(paste("Checking row ", RowCounter))
        RowChecker <- max(1, RowCounter - 5)
        while (RowChecker < RowCounter) {
          print(paste("Checking row",
                      RowCounter,
                      "against row",
                      RowChecker))
          if (DoubleTempData$CATEGORY[RowChecker] == DoubleTempData$CATEGORY[RowCounter])
          {
            print("The rows match!")
            while (ColumnFigurer > 0) {
              if (DoubleTempData$CATEGORY[RowCounter] != LastCATEGORYIndicator) {
                ColumnFigurer <- 2
              }
              print(paste ("Checking Iteration", ColumnFigurer))
              if (ColumnFigurer * length(ListOfColumns) <= length(colnames(DoubleTempData)))
              {
                print(paste("Iteration", ColumnFigurer, " exists"))
                CellChecker <-
                  ((ColumnFigurer - 1) * length(ListOfColumns) + 1)
                if (is.na(DoubleTempData[[RowChecker, CellChecker]])) {
                  print(paste("Current value is NA. Writing in new value."))
                  ColumnCounter <- 1
                  while (ColumnCounter <= length(ListOfColumns)) {
                    DoubleTempData[[RowChecker, (ColumnFigurer - 1) * length(ListOfColumns) +
                                      ColumnCounter]] <-
                      DoubleTempData[[RowCounter, ColumnCounter]]
                    ColumnCounter <- ColumnCounter + 1
                  }
                  DoubleTempData <- DoubleTempData[-RowCounter]
                  LastCATEGORYIndicator <-
                    DoubleTempData$CATEGORY[RowChecker]
                  RowCounter <- max(2, RowCounter - 1)
                  ColumnFigurer <- ColumnFigurer + 1
                  break
                }
                else
                {
                  print(paste(
                    "Current value is not NA, increasing iteration count."
                  ))
                  ColumnFigurer <- ColumnFigurer + 1
                }
              }
              if (ColumnFigurer * length(ListOfColumns) > length(colnames(DoubleTempData)))
              {
                print(paste(
                  "Iteration ",
                  ColumnFigurer,
                  "does not exist, adding iteration."
                ))
                ColumnAdder <- 1
                while (ColumnAdder <= length(ListOfColumns)) {
                  NewColumnName <-
                    paste0(ListOfColumns[ColumnAdder], "_", ColumnFigurer)
                  DoubleTempData[, NewColumnName] <- NA
                  ColumnAdder <- ColumnAdder + 1
                }
              }
            }
          }
          RowChecker <- RowChecker + 1
        }
        RowCounter <- RowCounter + 1
      }
      assign(paste0("Subset", i), DoubleTempData)
    }

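
One thing that may be relevant to the parallel version: `assign()` inside a `%dopar%` body runs in the worker's session, so it does not update `Subset1`, `Subset2`, ... in the main session. foreach instead returns the value of the last expression of each iteration as a list. Below is a minimal sketch of that pattern, assuming the foreach and doParallel packages are installed; the `subsets` list, the `x2` column, and the two-worker cluster are made up for illustration and are not taken from my real data.

```r
library(foreach)
library(doParallel)

# Small local cluster; two workers is an arbitrary choice for this sketch
cl <- parallel::makeCluster(2)
registerDoParallel(cl)

# Hypothetical stand-ins for Subset1, Subset2, ... kept in a single list
subsets <- list(data.frame(x = 1:3), data.frame(x = 4:6))

# Each iteration returns its modified data frame; foreach collects the
# returned values into a list in the main session
results <- foreach(i = seq_along(subsets)) %dopar% {
  d <- subsets[[i]]
  d$x2 <- d$x * 2
  d  # last expression is the iteration's result
}

parallel::stopCluster(cl)
```

After the loop, `results[[i]]` holds the processed version of the i-th subset. If the loop body relies on a package that is only loaded in the main session, foreach's `.packages` argument is the usual way to load it on each worker.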
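For example, here is a sample of the randomly generated Subset1 that I ran, which contains about 70 observations and 7 columns (one of which is dropped by the program, as intended):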
[![enter image description here][1]][1]

[1]: https://i.stack.imgur.com/Jlytj.png