Writing non-Boolean results to CSV with R

Question

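I've taken over a report from a colleague who left the organization. The report is written in R; it connects to an Oracle database and runs SQL scripts from within R.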

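The R code then pushes the data to a CSV file. The difficulty I'm having is that it writes TRUE, FALSE, or blank cells to the CSV. I'd like to update the code so that it prints Updated, Not Updated, or a blank cell instead. I'm struggling to find the right place to make these changes.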

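Below is the code that generates the CSV, along with a snippet the code references that may be producing the TRUE and FALSE values I mentioned. Any help would be greatly appreciated.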

# prepare a data frame based on Today_Active_address_wf for printing. 
ForPrintONLYToday_Active_address_wf <- Today_Active_address_wf

# add an empty row to dataframe
ForPrintONLYToday_Active_address_wf[nrow(ForPrintONLYToday_Active_address_wf)+1, ] <- NA

# write today Active address wf data frame to .csv file
filename <- paste("Daily Address Change Workflow Report___", format(Sys.time(), "%Y-%m-%d__%Hh%M"), ".csv", sep ="")
filelocation <- "\\\\DCV-PANAPP-P001\\File Share\\BIS\\Data Administration\\Daily Workflow Reports\\"
filewrite <- paste(filelocation, filename, sep = "")
write.csv(ForPrintONLYToday_Active_address_wf, file = filewrite, row.names = FALSE, na="")

# add additional info (totals for each column)
addInfo1 <- c("Total_duplicated_WF"
              ,"Total_inconsistent_wf_addr"
              ,"Total_QAS_validated_addr"
              ,"Total_invalid_Wf_addr_date"
              ,"Postal_unmatched_with_valid_WF"
              ,"Total_No_Policy"
              ,"Total_home_risk"
              ,"Total_motor_risk"
              ,"Total_risk_addr_change"
  )
addInfo2 <- c(sum(ForPrintONLYToday_Active_address_wf$dupCloseFlag, na.rm = TRUE)
              ,sum(ForPrintONLYToday_Active_address_wf$inconsistentDataFlag, na.rm = TRUE)
              ,sum(ForPrintONLYToday_Active_address_wf$WF_ADDRESS_VALIDATED, na.rm = TRUE)
              ,sum(ForPrintONLYToday_Active_address_wf$valid_addr_date == FALSE, na.rm = TRUE)
              ,sum(ForPrintONLYToday_Active_address_wf$MatchedPO_WF_addr == FALSE & ForPrintONLYToday_Active_address_wf$valid_addr_date == TRUE, na.rm = TRUE)
              ,sum(ForPrintONLYToday_Active_address_wf$hasNOpolicy, na.rm = TRUE)
              ,sum(ForPrintONLYToday_Active_address_wf$hasHomeRisk, na.rm = TRUE)
              ,sum(ForPrintONLYToday_Active_address_wf$hasMotorRisk, na.rm = TRUE)
              ,sum(ForPrintONLYToday_Active_address_wf$riskAddrChange_Flag, na.rm = TRUE)
  )
addInfoTable <- as.table(setNames(addInfo2, addInfo1))

write.table(addInfoTable, file = filewrite, row.names = FALSE, col.names = FALSE, na="",append = TRUE, sep = ":,")

### Additional Function ###

# function that converts "Y" to TRUE and "N" to FALSE
yn_to_logical <- function(x) {
  y <- rep.int(NA, length(x))
  y[x == "Y"] <- TRUE
  y[x == "N"] <- FALSE
  y
}

Thanks everyone. I know this is a lot of code, but R is not my strong suit.

Tags: sql, r, csv

Answer


There are many ways to replace one set of values with another. Here is one way:

# Create a simple data set that contains a column with "Y","N" and NAs (missing values) 
df <- data.frame(x = sample(c("Y","N",NA), 10, replace = TRUE), 
                 val = rnorm(10) )
df
#       x         val
# 1     N  0.56554865
# 2  <NA> -1.81437749
# 3     Y -1.21385694
# 4     Y -1.30173545
# 5     N -0.18994710
# 6     N -0.67519801
# 7  <NA>  0.02093869
# 8     N  0.69082204
# 9     Y  0.01715652
# 10    Y  1.34007199

# use function recode() from car package
library(car)
df$x <- recode(df$x, " 'Y' = 'Updated'; 'N' = 'Not Updated'")
df
#              x         val
# 1  Not Updated  0.56554865
# 2         <NA> -1.81437749
# 3      Updated -1.21385694
# 4      Updated -1.30173545
# 5  Not Updated -0.18994710
# 6  Not Updated -0.67519801
# 7         <NA>  0.02093869
# 8  Not Updated  0.69082204
# 9      Updated  0.01715652
# 10     Updated  1.34007199
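
Since the columns in the question already hold TRUE/FALSE by the time the CSV is written, it may be simpler to relabel the logical columns on the copy used for printing. The following is only a sketch in base R under some assumptions: `logical_to_label`, `wf`, `for_print` and `out_file` are hypothetical names that are not in the original script, and `wf` is a small stand-in for Today_Active_address_wf that borrows two column names from the question.

```r
# Hypothetical helper (not part of the original script):
# TRUE -> "Updated", FALSE -> "Not Updated", NA stays NA.
logical_to_label <- function(x, yes = "Updated", no = "Not Updated") {
  ifelse(x, yes, no)
}

# Small stand-in for Today_Active_address_wf (made-up values).
wf <- data.frame(
  id = 1:4,
  dupCloseFlag = c(TRUE, FALSE, NA, TRUE),
  hasHomeRisk = c(FALSE, FALSE, TRUE, NA)
)

# Work on a copy so the logical columns in wf stay untouched.
for_print <- wf

# Find the logical columns and relabel only those.
is_flag <- vapply(for_print, is.logical, logical(1))
for_print[is_flag] <- lapply(for_print[is_flag], logical_to_label)

# na = "" turns the remaining NA values into blank cells.
out_file <- tempfile(fileext = ".csv")
write.csv(for_print, file = out_file, row.names = FALSE, na = "")

# Totals are computed on the original logical columns, not the labels.
total_dup <- sum(wf$dupCloseFlag, na.rm = TRUE)
```

In the script from the question, the equivalent spot would be just before the write.csv() call. Note that the sum() calls for the totals run on the same data frame after that call, and sum() fails on character columns, so they would need to read from the untouched Today_Active_address_wf (or from a separate copy) once the labels are applied.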
