r - How to locate the first instance of NA in each row of a data frame?
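Problem description
How can I find where the first NA occurs in each row of a data frame (that is, in which column)?
I am studying the point at which participants drop out of a 10-step process.
Each step is identified by a corresponding column, which means 10 columns in total.
I can tell whether someone has completed a step by whether its column holds a datetime value, which records the moment they completed that step.
If they have not completed the step, the column shows NA, and so do the columns that follow.
For example, if I see NA in column 5 for a particular row, then I know that user did not continue past step 4, because the remaining columns also show NA.
The idea is that a participant who completes all 10 steps has completed the whole process.
I want to be able to determine the most common drop-off point.
My dataset is 2,000 rows deep. How can I quickly check and/or identify this?
Sample data: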
structure(list(associate = c("tXQCMHwGFy", "JzObuwUnkJ", "2fM04XFVja",
"uFsZTj2i2M", "ZsI0u5ka2j", "9r98DMXxFE", "NtmXw4qnIa", "oGB0Ugi93h",
"G0r2yOxM7s", "MIpQqbBagS", "HCGJ5kSOlk", "3ljP9FuGcA", "5k7OvbBZUH",
"6DDEbTWhBD", "xuU5Ewninw", "5UGABh3kcg", "G5etNVDoEH", "ejlCBv3dp2",
"2DUWxEFt6o", "sCJeaxCSk5", "sb9QKBDSHl", "E8n3XZSS1x", "Ld7rFWFKag",
"ykziBo9kOx", "Z9mOsGpDNE"), accountCreation = structure(c(1524606379.904,
1528147858.812, 1521994536.637, 1522097826.043, 1528150007.134,
1526575446.645, 1523493362.438, 1528123246.558, 1528135004.808,
1527791947.924, 1526755863.609, 1525455650.394, 1523409400.766,
1524347073.427, 1526134766.407, 1523638698.97, 1527878066.61,
1524855389.236, 1526309009.378, 1520972884.396, 1527180696.03,
1527268883.689, 1521646455.016, 1526837992.595, 1521040859.622
), class = c("POSIXct", "POSIXt")), profileSetup = structure(c(1524606693.345,
1528148032.015, 1521994616.897, 1522097826.043, 1528186485.637,
1526575497.987, 1523493556.798, 1528123314.197, 1528135180.95,
1527792152.877, 1526756131.911, 1525455787.847, 1523409400.766,
1524347073.427, 1526134850.566, 1523638905.289, 1527878482.462,
1524855535.686, 1526309106.294, 1522186725.043, 1527180799.909,
1527269009.143, 1521646455.016, 1526838102.323, 1521040859.622
), class = c("POSIXct", "POSIXt")), profilesetupDuration = c(314,
174, 80, 0, 36478, 51, 194, 68, 176, 205, 268, 137, 0, 0, 84,
207, 416, 146, 97, 1213841, 103, 126, 0, 110, 0), introductionSplash = structure(c(1524872052.263,
1528148043.062, 1521995730.924, 1522097826.043, 1528186496.499,
1526575506.96, 1523493567.959, 1528123329.044, 1528135237.755,
1527792185.349, NA, 1525455815.855, 1523409400.766, 1524347073.427,
1526134861.747, 1523638967.684, 1527878727.235, 1524855546.038,
1526309117.104, 1522186739.397, NA, 1527269018.641, 1521646455.016,
1526838112.374, 1521040859.622), class = c("POSIXct", "POSIXt"
)), introductionSplashDuration = c(265673, 185, 1194, 0, 36489,
60, 205, 83, 233, 238, NA, 165, 0, 0, 95, 269, 661, 157, 108,
1213855, NA, 135, 0, 120, 0), introduction = structure(c(1525124180.491,
1528148744.594, 1521996568.337, 1522097826.043, NA, 1526576050.815,
1523495507, 1528126805.572, NA, 1527792470.951, NA, 1525456759.777,
1523409400.766, 1524347073.427, 1526135265.531, 1523639316.761,
1527878956.368, 1524861227.537, 1526310376.89, 1522187755.31,
NA, 1527269672.153, 1521646455.016, 1526838283.459, 1521040859.622
), class = c("POSIXct", "POSIXt")), introductionDuration = c(517801,
886, 2032, 0, NA, 604, 2145, 3559, NA, 523, NA, 1109, 0, 0, 499,
618, 890, 5838, 1367, 1214871, NA, 789, 0, 291, 0), demoChatSkipped = structure(c(NA,
1528148761.447, NA, 1522097826.043, NA, 1526576060.249, NA, NA,
NA, 1527792487.742, NA, 1525456803.893, 1523409400.766, 1524347073.427,
1526147587.803, NA, NA, NA, NA, NA, NA, 1527269694.132, 1521646455.016,
1526838287.934, 1521040859.622), class = c("POSIXct", "POSIXt"
)), demoChatSkippedDuration = c(NA, 903, NA, 0, NA, 614, NA,
NA, NA, 540, NA, 1153, 0, 0, 12821, NA, NA, NA, NA, NA, NA, 811,
0, 295, 0), approval = structure(c(1525124264.718, 1528148756.313,
1522018833.517, 1522097826.043, NA, 1526576055.489, 1523538955.529,
1528136805.681, NA, 1527792479.256, NA, 1525456805.673, 1523409400.766,
1524347073.427, 1526147585.05, 1523639448.648, 1527879134.158,
1524861732.505, 1526315087.819, 1522188033.261, 1527180827.746,
1527269692.115, 1521646455.016, 1526838288.734, 1521040859.622
), class = c("POSIXct", "POSIXt")), approvalDuration = c(517885,
898, 24297, 0, NA, 609, 45593, 13559, NA, 532, NA, 1155, 0, 0,
12819, 750, 1068, 6343, 6078, 1215149, 131, 809, 0, 296, 0),
tutorial = structure(c(NA, NA, NA, 1522097826.043, NA, NA,
NA, NA, NA, NA, NA, NA, 1523409400.766, 1524347073.427, NA,
NA, NA, NA, NA, NA, NA, NA, 1521646455.016, NA, 1521040859.622
), class = c("POSIXct", "POSIXt")), tutorialDuration = c(NA,
NA, NA, 0, NA, NA, NA, NA, NA, NA, NA, NA, 0, 0, NA, NA,
NA, NA, NA, NA, NA, NA, 0, NA, 0), letsbegin = structure(c(1525124456.616,
1528148773.37, 1522031049.317, 1522097826.043, NA, 1526576071.6,
1523538956.159, 1528136822.297, NA, 1527794019.564, NA, 1525456849.582,
1523409400.766, 1524347073.427, 1526312517.824, 1523639449.148,
1527879134.675, 1524861750.153, 1526317200.235, 1522188066.352,
1527180828.158, NA, 1521646455.016, 1527015876.057, 1521040859.622
), class = c("POSIXct", "POSIXt")), letsbeginDuration = c(518077,
915, 36513, 0, NA, 625, 45594, 13576, NA, 2072, NA, 1199,
0, 0, 177751, 751, 1068, 6361, 8191, 1215182, 132, NA, 0,
177884, 0), demoChatDuration = c(517884, NA, 24297, NA, NA,
NA, 2499, 13559, NA, NA, NA, NA, NA, NA, 13201, 729, 1029,
6342, 6078, 1215148, NA, 967, NA, NA, NA)), row.names = c(937L,
1941L, 396L, 30L, 1950L, 1337L, 602L, 1812L, 1872L, 1719L, 1423L,
1077L, 173L, 234L, 1204L, 680L, 1748L, 989L, 1243L, 251L, 1568L,
1615L, 196L, 1451L, 154L), class = "data.frame")
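Solution
If you want a fast solution, I would vectorize it using max.col:
res <- max.col(is.na(df), ties.method = "first")
is.na(df) is a logical matrix, so the maximum of a row is TRUE wherever that row has an NA, and ties.method = "first" picks the leftmost such column.
max.col also returns 1 when a row contains no NAs at all, so you can add the following line to handle those cases:
if(any(res == 1)) is.na(res) <- (res == 1) & !is.na(df[[1]])
This converts those cases to NA, meaning that no column index was found for that row.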
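As a minimal sketch of how this could be carried through to the most common drop-off point, the example below applies the same two lines to a small toy data frame and then tabulates the result. The data frame `steps` and its column names are hypothetical and are not taken from the sample data; it assumes that only the step columns are passed in, ordered by step, with any identifier or duration columns dropped beforehand.

```r
# Toy data (hypothetical): three step columns, NA marks a step not completed.
steps <- data.frame(
  step1 = c(1, 1, 1, NA),
  step2 = c(1, NA, 1, NA),
  step3 = c(1, NA, NA, NA)
)

# Column index of the first NA in each row.
# Rows without any NA also come back as 1 at this point.
first_na <- max.col(is.na(steps), ties.method = "first")

# A result of 1 with a non-NA first column means the row has no NA at all,
# so mark those rows as NA (no drop-off found).
is.na(first_na) <- (first_na == 1) & !is.na(steps[[1]])

# Map the indices to column names and count each drop-off point.
# The NA group holds the rows that completed every step.
drop_off <- names(steps)[first_na]
counts <- table(drop_off, useNA = "ifany")
print(counts)
```

Here the first row completes every step, so it ends up in the NA group, while the other rows are counted under the first column in which they show NA.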