Risk set sampling with the ccwc() function, without replacement

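Question

Does anyone know whether the ccwc() function from the Epi package (risk set sampling) can be used to sample without replacement? If so, how is this specified? I could not find anything about it in the documentation. The reproducible example below shows how the function currently behaves: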

library(lubridate)  # days()
library(dplyr)      # %>%, group_by(), tally()
library(Epi)        # ccwc()

### Seed the generator before simulating, so the data frame itself is reproducible:
set.seed(20140111)

### Create data frame:
Person_IDs <- seq_len(10000)
Example_DF <- as.data.frame(Person_IDs)
Example_DF$Start_Date <- as.Date("2020-01-01")
Example_DF$Exposure_Date <- as.Date("2020-01-01") + days(sample(45:365, size = 10000, replace = TRUE))
Example_DF$End_Date <- as.Date("2021-05-01")
Example_DF$Sex <- sample(c("Male", "Female"), size = 10000, replace = TRUE)
Example_DF$Age <- sample(1:100, size = 10000, replace = TRUE)
Example_DF$Fail <- sample(c(0, 1), size = 10000, replace = TRUE, prob = c(0.9, 0.1))

### Show first rows of data frame:
head(Example_DF)

## Set seed for the random number generator
set.seed(20140111)
## Generate a nested case-control study
Risk_Set_Sampling_With_Replacement <- ccwc(
  entry    = Start_Date,      # Time of entry to follow-up
  exit     = Exposure_Date,   # Time of exit from follow-up
  fail     = Fail,            # Status on exit (1 = Fail, 0 = Censored)
  origin   = Start_Date,      # Origin of analysis time scale
  controls = 5,               # Number of controls to select for each case
  data     = Example_DF,      # Data frame
  include  = Person_IDs,      # Variable carried across into the case-control data
  match    = list(Age, Sex),  # Categorical variables on which to match cases and controls
  silent   = TRUE
)
## Show how many times each person ID is included as a case and as a control, respectively:
Risk_Set_Sampling_With_Replacement[Risk_Set_Sampling_With_Replacement$Fail == 1,] %>% group_by(Person_IDs) %>% tally(sort = TRUE)
## Only once per person, as expected (a person cannot be selected as a case more than once)

Risk_Set_Sampling_With_Replacement[Risk_Set_Sampling_With_Replacement$Fail == 0,] %>% group_by(Person_IDs) %>% tally(sort = TRUE)
## Many person IDs are selected as a control more than once, which is expected when sampling with replacement. However, I want sampling without replacement.

Tags: r, replace, sampling

Answer
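
None of the arguments used in the call above (entry, exit, fail, origin, controls, match, include, data, silent) controls replacement, so one possible workaround is to draw the risk sets by hand. The following is only a minimal base R sketch of that idea, not the Epi implementation: the helper name sample_risk_sets_no_reuse, its arguments, the toy data, and the "at risk" rule (entry <= failure time <= exit, excluding subjects who fail at that same time) are all assumptions made for illustration. Matching variables are assumed to be categorical, and at least one must be supplied.

```r
# Hypothetical helper (not part of Epi): risk-set sampling in which a subject
# can be drawn as a control at most once across all case-control sets.
sample_risk_sets_no_reuse <- function(data, id, entry, exit, fail,
                                      match_vars, controls = 1) {
  t_entry <- as.numeric(data[[entry]])
  t_exit  <- as.numeric(data[[exit]])
  failed  <- data[[fail]] == 1
  # One key per subject combining all matching variables
  key  <- do.call(paste, c(data[match_vars], sep = "|"))
  used <- rep(FALSE, nrow(data))  # becomes TRUE once a row is drawn as a control

  # Process cases in order of failure time
  case_rows <- which(failed)
  case_rows <- case_rows[order(t_exit[case_rows])]
  sets <- vector("list", length(case_rows))

  for (s in seq_along(case_rows)) {
    i  <- case_rows[s]
    ft <- t_exit[i]
    # Eligible: at risk at the failure time, same matching stratum, not failing
    # at that same time (this also excludes the case), not yet used as a control
    eligible <- which(t_entry <= ft & t_exit >= ft & key == key[i] &
                        !(failed & t_exit == ft) & !used)
    n_draw <- min(controls, length(eligible))
    drawn  <- if (n_draw > 0) {
      eligible[sample.int(length(eligible), n_draw)]
    } else {
      integer(0)
    }
    used[drawn] <- TRUE
    rows <- c(i, drawn)
    sets[[s]] <- data.frame(
      Set  = s,
      Map  = rows,                  # row number in the input data frame
      Time = ft,
      Fail = c(1, rep(0, n_draw)),  # 1 = case, 0 = control
      data[rows, c(id, match_vars), drop = FALSE],
      row.names = NULL
    )
  }
  do.call(rbind, sets)
}

# Toy data, simulated only to exercise the sketch
set.seed(1)
n <- 2000
toy <- data.frame(
  Person_IDs = seq_len(n),
  Start_Date = as.Date("2020-01-01"),
  Sex        = sample(c("Male", "Female"), n, replace = TRUE),
  Age_Group  = sample(c("<40", "40-69", "70+"), n, replace = TRUE),
  Fail       = rbinom(n, 1, 0.05)
)
toy$Exit_Date <- toy$Start_Date + sample(45:365, n, replace = TRUE)

ncc <- sample_risk_sets_no_reuse(
  toy, id = "Person_IDs", entry = "Start_Date", exit = "Exit_Date",
  fail = "Fail", match_vars = c("Age_Group", "Sex"), controls = 2
)

# Largest number of times any person appears as a control (should not exceed 1)
max(table(ncc$Person_IDs[ncc$Fail == 0]))
```

Because used controls are removed from later risk sets, sets formed late in follow-up may end up with fewer controls than requested. A person who is drawn as a control can still appear later as a case. Excluding previously drawn controls also changes the sampling scheme compared with ordinary risk set sampling, so whether that suits the planned analysis is a separate question.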

