How do I split a column into two columns by extraction?

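Problem description

I want to split a column into two columns, extracting the number on its own and keeping it in a separate column.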

df <- data.frame(V1 = c("[1] Strongly disagree", "[2] Somewhat disagree", "[3] Neither", "[4] Somewhat agree", "[5] Strongly agree"))
                  V1
 [1] Strongly disagree
 [2] Somewhat disagree
 [3] Neither
 [4] Somewhat agree
 [5] Strongly agree

I tried the separate function from tidyr:

tidyr::separate(df, V1, into = c("Value", "Label"), sep = "] ")

Value   Label
[1      Strongly disagree           
[2      Somewhat disagree           
[3      Neither         
[4      Somewhat agree          
[5      Strongly agree

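I could probably remove the [ with another function, but I would like to know whether this can be solved in one step, and whether there is another function that does the job.

This is what I want to end up with: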

        Label        Value
 Strongly disagree     1
 Somewhat disagree     2
 Neither               3
 Somewhat agree        4
 Strongly agree        5

Tags: r

Solution


If you prefer base R, here is a base R solution:

df <- data.frame(V1 = c("[1] Strongly disagree", "[2] Somewhat disagree", "[3] Neither", "[4] Somewhat agree", "[5] Strongly agree"))

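# Extract the first digit from each string and convert it to numeric
# (the raw string syntax r"(...)" needs R >= 4.0.0)
df$value <- as.numeric(regmatches(df$V1, regexpr(r"(\d)", df$V1)))

# Keep everything after "] "; the lookbehind (?<=...) requires perl = TRUE
df$V1 <- regmatches(df$V1, regexpr("(?<=] ).*", df$V1, perl = TRUE))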
df
#>                  V1 value
#> 1 Strongly disagree     1
#> 2 Somewhat disagree     2
#> 3           Neither     3
#> 4    Somewhat agree     4
#> 5    Strongly agree     5

Created on 2020-09-05 by the reprex package (v0.3.0)

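regmatches is a base R function that returns the matched substrings from a character vector. It takes the vector and the match data returned by regexpr as input.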
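To make that concrete, here is a minimal sketch of what regexpr returns and how regmatches consumes it. The vector x and the variable names are made up for illustration and are not part of the original answer.

```r
# Illustrative input (hypothetical, mirrors the format of df$V1)
x <- c("[1] Strongly disagree", "[3] Neither")

# regexpr() returns the start position of the first match in each element,
# with the match lengths stored in the "match.length" attribute
m <- regexpr("[0-9]", x)
starts <- as.integer(m)
lens <- attr(m, "match.length")

# regmatches() uses those positions to cut the matched text out of x
digits <- regmatches(x, m)

# With regexpr match data, an element that has no match is dropped
y <- c("[1] a", "no digit")
short <- regmatches(y, regexpr("[0-9]", y))
```

Because non-matching elements are dropped, the result can be shorter than the input. Assigning it back to a data frame column only works when every row contains a match, as it does in this example data.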

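In the first case (the value column), \d matches a single digit, which is then extracted. In the second case, (?<=] ).* uses a lookbehind to return everything that comes after the "] " following the number, which is why perl=TRUE is needed.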
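Since the question asks for a single step, one possible alternative (not part of the original answer) is utils::strcapture from base R, which pulls several capture groups into a data frame at once. The sketch below assumes every row follows the "[number] label" format; the pattern and the column names Value and Label are chosen here for illustration.

```r
df <- data.frame(V1 = c("[1] Strongly disagree", "[2] Somewhat disagree",
                        "[3] Neither", "[4] Somewhat agree", "[5] Strongly agree"),
                 stringsAsFactors = FALSE)

# Group 1 captures the number inside the brackets, group 2 the label after "] "
pattern <- "\\[([0-9]+)\\] (.*)"

# proto gives the name and type of the column built from each capture group
proto <- data.frame(Value = integer(), Label = character())

out <- strcapture(pattern, df$V1, proto)
out
```

Using [0-9]+ rather than a single digit also covers codes with more than one digit, such as "[10]". Rows that do not match the pattern should come back as NA rather than being dropped, so the result keeps one row per input string.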

