Repeating a sequence in a column

Problem description

I want to repeat the sequence given below over and over to fill about 5000 rows in R.

Time dataset:

8.00.00 AM
9.00.00 AM
10.00.00 AM
11.00.00 AM
12.00.00 PM
1.00.00 PM
2.00.00 PM
3.00.00 PM
4.00.00 PM
5.00.00 PM
6.00.00 PM
7.00.00 PM
8.00.00 PM
9.00.00 PM

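Tags: r

Solution

There are several possible reasons you might be seeing blanks. I'll focus on two of them: NA values and literal empty strings.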

srcvec <- c("8.00.00 AM", "9.00.00 AM", "10.00.00 AM", "11.00.00 AM", "12.00.00 PM", 
"1.00.00 PM", "2.00.00 PM", "3.00.00 PM", "4.00.00 PM", "5.00.00 PM", 
"6.00.00 PM", "7.00.00 PM", "8.00.00 PM", "9.00.00 PM", NA, ""
)
rep(srcvec, length.out = 30)
#  [1] "8.00.00 AM"  "9.00.00 AM"  "10.00.00 AM" "11.00.00 AM" "12.00.00 PM" "1.00.00 PM" 
#  [7] "2.00.00 PM"  "3.00.00 PM"  "4.00.00 PM"  "5.00.00 PM"  "6.00.00 PM"  "7.00.00 PM" 
# [13] "8.00.00 PM"  "9.00.00 PM"  NA            ""            "8.00.00 AM"  "9.00.00 AM" 
# [19] "10.00.00 AM" "11.00.00 AM" "12.00.00 PM" "1.00.00 PM"  "2.00.00 PM"  "3.00.00 PM" 
# [25] "4.00.00 PM"  "5.00.00 PM"  "6.00.00 PM"  "7.00.00 PM"  "8.00.00 PM"  "9.00.00 PM" 
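
Before removing anything, it can help to check which of the two cases is actually present. The following is a minimal sketch on a small hypothetical vector `x` (not the original data) that locates NA values and empty strings separately:

```r
# Hypothetical example vector containing one NA and one empty string
x <- c("8.00.00 AM", "9.00.00 AM", NA, "")

# Positions of missing values
na_pos <- which(is.na(x))

# Positions of empty strings; NA entries are excluded explicitly so that
# they are never counted as blanks
empty_pos <- which(!is.na(x) & !nzchar(x))

na_pos
# [1] 3
empty_pos
# [1] 4
```

If both results are empty, the blanks probably come from somewhere else, for example strings that contain only spaces.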

To drop the NAs, we can simply use na.omit:

rep(na.omit(srcvec), length.out = 30)
#  [1] "8.00.00 AM"  "9.00.00 AM"  "10.00.00 AM" "11.00.00 AM" "12.00.00 PM" "1.00.00 PM" 
#  [7] "2.00.00 PM"  "3.00.00 PM"  "4.00.00 PM"  "5.00.00 PM"  "6.00.00 PM"  "7.00.00 PM" 
# [13] "8.00.00 PM"  "9.00.00 PM"  ""            "8.00.00 AM"  "9.00.00 AM"  "10.00.00 AM"
# [19] "11.00.00 AM" "12.00.00 PM" "1.00.00 PM"  "2.00.00 PM"  "3.00.00 PM"  "4.00.00 PM" 
# [25] "5.00.00 PM"  "6.00.00 PM"  "7.00.00 PM"  "8.00.00 PM"  "9.00.00 PM"  ""           

To drop the empty strings, we can filter with nzchar, which returns TRUE when a string contains one or more characters:

rep(Filter(nzchar, na.omit(srcvec)), length.out = 30)
#  [1] "8.00.00 AM"  "9.00.00 AM"  "10.00.00 AM" "11.00.00 AM" "12.00.00 PM" "1.00.00 PM" 
#  [7] "2.00.00 PM"  "3.00.00 PM"  "4.00.00 PM"  "5.00.00 PM"  "6.00.00 PM"  "7.00.00 PM" 
# [13] "8.00.00 PM"  "9.00.00 PM"  "8.00.00 AM"  "9.00.00 AM"  "10.00.00 AM" "11.00.00 AM"
# [19] "12.00.00 PM" "1.00.00 PM"  "2.00.00 PM"  "3.00.00 PM"  "4.00.00 PM"  "5.00.00 PM" 
# [25] "6.00.00 PM"  "7.00.00 PM"  "8.00.00 PM"  "9.00.00 PM"  "8.00.00 AM"  "9.00.00 AM" 
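
The same filtering can also be written with plain logical indexing instead of Filter. This is only an alternative sketch, shown on a short hypothetical vector `x` so the result is easy to verify by eye:

```r
# Hypothetical vector with one NA and one empty string mixed in
x <- c("8.00.00 AM", NA, "", "9.00.00 AM")

# Keep entries that are neither NA nor empty
clean <- x[!is.na(x) & nzchar(x)]

# Recycle the cleaned values to the requested length
rep(clean, length.out = 5)
# [1] "8.00.00 AM" "9.00.00 AM" "8.00.00 AM" "9.00.00 AM" "8.00.00 AM"
```

Both forms give the same values; the indexing version avoids calling a function once per element.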

If you have strings that are not empty but contain only whitespace (for example, spaces), you can use this:

srcvec <- c(srcvec, "   ")
rep(Filter(function(a) !is.na(a) && nzchar(gsub("\\s", "", a)), srcvec), length.out = 30)
#  [1] "8.00.00 AM"  "9.00.00 AM"  "10.00.00 AM" "11.00.00 AM" "12.00.00 PM" "1.00.00 PM" 
#  [7] "2.00.00 PM"  "3.00.00 PM"  "4.00.00 PM"  "5.00.00 PM"  "6.00.00 PM"  "7.00.00 PM" 
# [13] "8.00.00 PM"  "9.00.00 PM"  "8.00.00 AM"  "9.00.00 AM"  "10.00.00 AM" "11.00.00 AM"
# [19] "12.00.00 PM" "1.00.00 PM"  "2.00.00 PM"  "3.00.00 PM"  "4.00.00 PM"  "5.00.00 PM" 
# [25] "6.00.00 PM"  "7.00.00 PM"  "8.00.00 PM"  "9.00.00 PM"  "8.00.00 AM"  "9.00.00 AM" 
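
Applied to the original question of filling about 5000 rows, the cleaned sequence can be recycled to the number of rows in the target table. The sketch below assumes a data frame called `df` with a placeholder column `id`; both names are hypothetical and stand in for whatever data is actually being filled:

```r
# The 14 hourly labels from the question
times <- c("8.00.00 AM", "9.00.00 AM", "10.00.00 AM", "11.00.00 AM",
           "12.00.00 PM", "1.00.00 PM", "2.00.00 PM", "3.00.00 PM",
           "4.00.00 PM", "5.00.00 PM", "6.00.00 PM", "7.00.00 PM",
           "8.00.00 PM", "9.00.00 PM")

# Hypothetical data frame with 5000 rows; `id` is only a placeholder column
df <- data.frame(id = seq_len(5000))

# Recycle the labels so the new column has exactly nrow(df) values
df$time <- rep(times, length.out = nrow(df))

# The sequence starts over at row 15
df$time[15]
# [1] "8.00.00 AM"
```

Because 5000 is not a multiple of 14, the final cycle is cut short: the last row holds "9.00.00 AM", the second label in the sequence.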
