r - run the script according to logical conditions in R
Problem description
In my data set, I work with groups (strata) defined by SKU-acnumber-year. Here is a small example:
df=structure(list(SKU = c(11202L, 11202L, 11202L, 11202L, 11202L,
11202L, 11202L, 11202L, 11202L, 11202L, 11202L, 11202L, 11202L,
11202L, 11202L, 11202L, 11202L, 11202L, 11202L, 11202L, 11202L
), stuff = c(8.85947691, 9.450108704, 10.0407405, 10.0407405,
10.63137229, 11.22200409, 11.22200409, 11.81263588, 12.40326767,
12.40326767, 12.40326767, 12.99389947, 13.58453126, 14.17516306,
14.76579485, 15.94705844, 17.12832203, 17.71895382, 21.26274458,
25.98779894, 63.19760196), action = c(0L, 0L, 0L, 0L, 0L, 0L,
0L, 0L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L),
acnumber = c(137L, 137L, 137L, 137L, 137L, 137L, 137L, 137L,
137L, 137L, 137L, 137L, 137L, 137L, 137L, 137L, 137L, 137L,
137L, 137L, 137L), year = c(2018L, 2018L, 2018L, 2018L, 2018L,
2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L,
2018L, 2018L, 2018L, 2018L, 2018L, 2018L, 2018L)), .Names = c("SKU",
"stuff", "action", "acnumber", "year"), class = "data.frame", row.names = c(NA,
-21L))
Very important:
The action column has only two values, 0 and 1. In this example there are 3 observations of stuff with action = 1 and 18 observations with action = 0.
I need to set a logical condition: for groups that have from 1 to 4 observations of stuff with action = 1, run script1.r,
and for groups that have >= 5 observations of stuff with action = 1, run script2.r.
I imagine it this way: a script3.r is created with the following content (the condition), but I do not know how to set these logical conditions correctly.
# I take the data from SQL
library(RODBC)
dbHandle <- odbcDriverConnect("driver={SQL Server};server=;database=;trusted_connection=true")
sql <- paste0("select needed columns")  # placeholder for the actual query
df <- sqlQuery(dbHandle, sql)
# pseudocode for the conditions:
# if a group has from 1 to 4 observations of stuff with action = 1, then run C:/path to/script1.r
# if a group has >= 5 observations of stuff with action = 1, then run C:/path to/script2.r
How do I implement this? script3.r runs on a schedule, and on each scheduled run it should launch the two scripts as needed. I just do not want to set up a separate schedule for each script.
Solution
Consider if logic inside by, the method to slice a data frame by factor(s), and run the other scripts via the command line with system() calling Rscript (assuming the R bin directory is set in your PATH environment variable):
by_list <- by(df, df[,c("SKU", "acnumber", "year")], function(sub) {
if (sum(sub$action == 1) %in% c(1:4)) system("Rscript /path/to/script1.r")
if (sum(sub$action == 1) >= 5) system("Rscript /path/to/script2.r")
return(sub)
})
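If the external script needs to know which group triggered it, one option is to pass the group keys as command-line arguments. The sketch below is only an illustration under stated assumptions: the temporary script stands in for script1.r, the toy `sub` data frame is made up, and `system2()` with the full path from `R.home("bin")` is swapped in for `system()` so that the PATH assumption is not needed.

```r
# Hypothetical stand-in for script1.r: it reads the group keys from the
# command line and prints them joined by "-".
script_path <- tempfile(fileext = ".r")
writeLines(c(
  "args <- commandArgs(trailingOnly = TRUE)",
  "cat(paste(args, collapse = '-'), '\\n', sep = '')"
), script_path)

# Full path to the Rscript of the running R installation (no PATH lookup).
rscript <- file.path(R.home("bin"), "Rscript")

# Toy group (made up): one SKU-acnumber-year stratum with two action == 1 rows.
sub <- data.frame(SKU = 11202L, acnumber = 137L, year = 2018L,
                  action = c(0L, 1L, 1L))

if (sum(sub$action == 1) %in% 1:4) {
  # stdout = TRUE captures what the child script prints as a character vector.
  out <- system2(rscript,
                 args = c(shQuote(script_path),
                          sub$SKU[[1]], sub$acnumber[[1]], sub$year[[1]]),
                 stdout = TRUE)
  print(out)
}
```

Inside the real script, `commandArgs(trailingOnly = TRUE)` would return the three keys as strings, so they need converting (for example with `as.integer()`) before being used in a filter or query.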
Even better, source() the external scripts in the main script, making sure to wrap the entire process of each script in a function() call, optionally adding arguments such as the specific SKU. Otherwise, source() will simply run those files as soon as they are loaded. With this approach, you can return output.
source("/path/to/script1.r")   # IMPORTS script1_function()
source("/path/to/script2.r")   # IMPORTS script2_function()
by_list <- by(df, df[,c("SKU", "acnumber", "year")], function(sub) {
  current_SKU <- max(sub$SKU)   # OR min(sub$SKU) OR sub$SKU[[1]]
  n_action <- sum(sub$action == 1)   # rows with action == 1 in this group
  output <- NULL   # groups with no action == 1 rows run neither script
  if (n_action %in% c(1:4)) output <- script1_function()
  if (n_action >= 5) output <- script2_function()
  return(output)
})
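To make the function-based approach concrete, here is a minimal self-contained sketch. It assumes the two scripts have been rewritten to define `script1_function(sub)` and `script2_function(sub)`, which take the group's rows as an argument; those signatures, the function bodies, and the toy data are all hypothetical and only stand in for whatever the real scripts do.

```r
# Hypothetical stand-ins for the functions that script1.r and script2.r would
# define once their contents are wrapped in function() calls.
script1_function <- function(sub) {
  data.frame(SKU = sub$SKU[[1]], script = "script1",
             n_action = sum(sub$action == 1))
}
script2_function <- function(sub) {
  data.frame(SKU = sub$SKU[[1]], script = "script2",
             n_action = sum(sub$action == 1))
}

# Toy data (made up): three SKUs with 2, 5 and 0 rows where action == 1.
df <- data.frame(
  SKU      = rep(c(11202L, 11203L, 11204L), times = c(3, 6, 2)),
  acnumber = 137L,
  year     = 2018L,
  action   = c(0L, 1L, 1L,  1L, 1L, 1L, 1L, 1L, 0L,  0L, 0L)
)

by_list <- by(df, df[, c("SKU", "acnumber", "year")], function(sub) {
  n_action <- sum(sub$action == 1)
  if (n_action >= 5) return(script2_function(sub))
  if (n_action >= 1) return(script1_function(sub))
  NULL  # no action == 1 rows: neither script applies
})

# Stack the per-group results; NULL entries are dropped by rbind().
results <- do.call(rbind, by_list)
print(results)
```

Passing `sub` into the functions keeps each script's logic limited to the rows of the group that triggered it, and returning a one-row data frame per group makes the outcome of a scheduled run easy to log or inspect.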