Splitting a string by index

Problem description

I have a data frame with one column of alphanumeric values. I want to split each value at fixed positions and store the pieces in separate new columns, as in the example below.

str <- rep("1001AA00100BC300AA01111000AA0299F40400F4053DF40C0000F4030000F40680F4077", 6)

Desired output:

AA00 100BC300 AA01 111000 AA02 99 F40400F4053DF40C0000F4030000F40680F4077
AA00 100BC300 AA01 111000 AA02 99 F40400F4053DF40C0000F4030000F40680F4077
AA00 100BC300 AA01 111000 AA02 99 F40400F4053DF40C0000F4030000F40680F4077
AA00 100BC300 AA01 111000 AA02 99 F40400F4053DF40C0000F4030000F40680F4077
AA00 100BC300 AA01 111000 AA02 99 F40400F4053DF40C0000F4030000F40680F4077
AA00 100BC300 AA01 111000 AA02 99 F40400F4053DF40C0000F4030000F40680F4077

Tags: r

Solution


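Use one row of the sample output to find the field widths. The widths vector starts with 4 because the first 4 characters of the input appear to be missing from the sample output. Then pass the widths to read.fwf. If you really do not want the first 4 characters of the input to appear in the result, replace the read.fwf line with read.fwf(textConnection(str), widths)[-1]. No packages are used.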

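# one row of the desired output, with fields separated by spaces
sample.out <- "AA00 100BC300 AA01 111000 AA02 99 F40400F4053DF40C0000F4030000F40680F4077"
# width of each field, preceded by 4 for the leading characters missing from the sample
widths <- c(4, sapply(read.table(text = sample.out, as.is = TRUE), nchar))

# read each string as a fixed-width record
read.fwf(textConnection(str), widths)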

giving:

    V1   V2       V3   V4     V5   V6 V7                                      V8
1 1001 AA00 100BC300 AA01 111000 AA02 99 F40400F4053DF40C0000F4030000F40680F4077
2 1001 AA00 100BC300 AA01 111000 AA02 99 F40400F4053DF40C0000F4030000F40680F4077
3 1001 AA00 100BC300 AA01 111000 AA02 99 F40400F4053DF40C0000F4030000F40680F4077
4 1001 AA00 100BC300 AA01 111000 AA02 99 F40400F4053DF40C0000F4030000F40680F4077
5 1001 AA00 100BC300 AA01 111000 AA02 99 F40400F4053DF40C0000F4030000F40680F4077
6 1001 AA00 100BC300 AA01 111000 AA02 99 F40400F4053DF40C0000F4030000F40680F4077
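The same index-based split can also be sketched with substring(), which may be preferable when every field should stay as text: read.table and read.fwf convert digit-only fields such as 111000 and 99 to numbers, which would drop leading zeros if a field had any. This is only a minimal sketch, not part of the original answer; the helper name split_by_widths is hypothetical, and the widths are taken from the sample output with strsplit instead of read.table.

```r
# Hypothetical helper (not from the original answer): split each string in `x`
# into consecutive fields of the given widths, keeping every field as text.
split_by_widths <- function(x, widths) {
  ends   <- cumsum(widths)        # last character position of each field
  starts <- ends - widths + 1     # first character position of each field
  pieces <- lapply(seq_along(widths),
                   function(i) substring(x, starts[i], ends[i]))
  names(pieces) <- paste0("V", seq_along(widths))
  as.data.frame(pieces, stringsAsFactors = FALSE)
}

sample.out <- "AA00 100BC300 AA01 111000 AA02 99 F40400F4053DF40C0000F4030000F40680F4077"
# Splitting on spaces keeps the fields as character, so nchar() counts every digit.
widths <- c(4, nchar(strsplit(sample.out, " ", fixed = TRUE)[[1]]))

x <- rep("1001AA00100BC300AA01111000AA0299F40400F4053DF40C0000F4030000F40680F4077", 2)
result <- split_by_widths(x, widths)[-1]  # [-1] drops the leading "1001" column
result
```

Because substring() is vectorised over its first argument, the helper handles a whole column at once; a data frame column can be passed in place of x.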
