Split a colon-separated string in a column into separate columns in R

Problem description

data <- data.frame(col1 = c('0/1:60,4:0.044:4:0:1.00:2352,160:32:28', '0/1:58,4:0.041:4:0:1.00:2304,150:28:30', '0/1:25,2:0.095:1:1:0.500:908,78:9:16'))

data

                                    col1
1 0/1:60,4:0.044:4:0:1.00:2352,160:32:28
2 0/1:58,4:0.041:4:0:1.00:2304,150:28:30
3   0/1:25,2:0.095:1:1:0.500:908,78:9:16

I want to extract the values before the second colon (0/1, 0/1, 0/1 and 60,4, 58,4, 25,2) and split them into separate columns, like this:

data
                                    col1 col2 col3 col4 col5
1 0/1:60,4:0.044:4:0:1.00:2352,160:32:28    0    1   60    4
2 0/1:58,4:0.041:4:0:1.00:2304,150:28:30    0    1   58    4
3   0/1:25,2:0.095:1:1:0.500:908,78:9:16    0    1   25    2

Tags: r

Solution


Calling strsplit twice (first on ":", then on "[/,]"), with [ extraction in between to keep only the first two fields, works like this:

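# Assumes col1 is a character column (the data.frame default since R 4.0.0);
# wrap it in as.character() if it is a factor.
# Split each row on ":", keep the first two fields, split those on "/" or ",", and convert to integer.
tmp <- do.call(rbind.data.frame, lapply(strsplit(data$col1, ":"), function(st) as.integer(unlist(strsplit(st, "[/,]")[1:2]))))
# Name the new columns col2, col3, ... and bind them to the original data.
cbind(data, setNames(tmp, paste0("col", 1+seq_len(ncol(tmp)))))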
#                                     col1 col2 col3 col4 col5
# 1 0/1:60,4:0.044:4:0:1.00:2352,160:32:28    0    1   60    4
# 2 0/1:58,4:0.041:4:0:1.00:2304,150:28:30    0    1   58    4
# 3   0/1:25,2:0.095:1:1:0.500:908,78:9:16    0    1   25    2
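
The one-liner above packs several steps into a single expression. As a rough sketch of the same idea, the steps can be written out one at a time. The variable names (x, fields, first_two, parts, m, result) are made up for this illustration, and the input strings are copied from the data.frame call in the question. It stacks the rows into an integer matrix with rbind rather than rbind.data.frame, which is a small substitution and not what the answer itself does.

```r
# Input strings copied from the question's data.frame() call.
x <- c("0/1:60,4:0.044:4:0:1.00:2352,160:32:28",
       "0/1:58,4:0.041:4:0:1.00:2304,150:28:30",
       "0/1:25,2:0.095:1:1:0.500:908,78:9:16")

# Step 1: split each string on ":" to get one character vector per row.
fields <- strsplit(x, ":", fixed = TRUE)

# Step 2: keep only the first two fields of each row, e.g. c("0/1", "60,4").
first_two <- lapply(fields, function(f) f[1:2])

# Step 3: split those two fields on "/" or "," and convert the pieces to integer.
parts <- lapply(first_two, function(f) as.integer(unlist(strsplit(f, "[/,]"))))

# Step 4: stack the per-row vectors into an integer matrix and name its columns.
m <- do.call(rbind, parts)
colnames(m) <- paste0("col", 1 + seq_len(ncol(m)))

# Step 5: bind the new columns to the original strings.
result <- cbind(data.frame(col1 = x), m)
print(result)
```

Keeping the intermediate objects around makes it easier to inspect a row that does not parse as expected, for example one where a field before the second colon is missing or non-numeric.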
