Create column, separated by "," as a numeric output

Problem description

I have data stored in rows as "XXX-XX-0001, YY-YY-0001" and want a new column giving the number of items in each row (2 in this example).

I have managed to mutate a new column, but it shows up as a character output, chr [2]; I need it to be just 2.

bill <- bill %>%
  mutate(NO_IA = strsplit(as.character(IA_YES), ","))

When I try to use as.numeric, it fails because my input contains ",". Combining the two (as.numeric and as.character in the same line) is rejected as well.

Tags: r, strsplit

Solution


After some clarification, here is a better answer:

Data (from the comments)

string <- scan(text = "
AAA-GB-0001 
BBB-ES-0005,ADD-GB-0001 
BSC-ES-0005,HQQ-GB-0001,REE-GB-0001 
BDD-GB-0001,BSC-ES-0005,HQQ-GB-0001,UZZ-DE-0001 
BDD-GB-0001,UEE-DE-0001 
BDD-GB-0001,BRE-EE-0005,CTT-DE-0002,LZZ-DE-0011,UZZ-DE-0001", 
               what = character(), sep = "\n")

library(dplyr)
bill <- tibble(IA_YES = string)

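Next time it would make sense to provide some sample data, for example by using dput() (in this case, by pasting the result of dput(bill)).

Solution

Note that the strsplit command in your code creates a list. That list is stored in the newly created column and can be used like any other list in R. We can operate on it with the purrr package, which provides better versions of R's *apply functions: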
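As a minimal sketch of how dput() can be used to share sample data: the data frame `bill_sample` below is a hypothetical two-row stand-in for `bill`, not the asker's real data. dput() prints R code that recreates the object, and evaluating that code should give back an equivalent object.

```r
# Hypothetical stand-in for the first rows of `bill` (plain base R data frame)
bill_sample <- data.frame(
  IA_YES = c("AAA-GB-0001", "BBB-ES-0005,ADD-GB-0001"),
  stringsAsFactors = FALSE
)

# dput() prints R code that recreates the object;
# that printed text is what gets pasted into a question
dput(bill_sample)

# Round trip: capture the printed code and evaluate it again
code <- capture.output(dput(bill_sample))
rebuilt <- eval(parse(text = code))

# Check that the rebuilt object matches the original
print(all.equal(bill_sample, rebuilt))
```

For a large data frame, something like dput(head(bill, 10)) keeps the pasted text short.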

library(purrr)  # provides map_int()

bill %>%
  mutate(NO_IA = strsplit(as.character(IA_YES), ",")) %>% 
  mutate(length = map_int(NO_IA, length))
#> # A tibble: 6 x 3
#>   IA_YES                                                    NO_IA    length
#>   <chr>                                                     <list>    <int>
#> 1 "AAA-GB-0001 "                                            <chr [1~      1
#> 2 "BBB-ES-0005,ADD-GB-0001 "                                <chr [2~      2
#> 3 "BSC-ES-0005,HQQ-GB-0001,REE-GB-0001 "                    <chr [3~      3
#> 4 "BDD-GB-0001,BSC-ES-0005,HQQ-GB-0001,UZZ-DE-0001 "        <chr [4~      4
#> 5 "BDD-GB-0001,UEE-DE-0001 "                                <chr [2~      2
#> 6 BDD-GB-0001,BRE-EE-0005,CTT-DE-0002,LZZ-DE-0011,UZZ-DE-0~ <chr [5~      5

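A short explanation of map_int(NO_IA, length): the map functions all work the same way. You provide a list, or a vector that can be converted to a list, and apply a function to it. In this case we measure the length() of each entry in the list. Another way to write it is map_int(NO_IA, function(x) length(x)). The advantage of purrr over the apply functions is better control over the output: map_int returns an integer vector, while map_chr, for example, returns a character vector.

Old answer

You can replace the comma with a dot before converting: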
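The equivalence described above can be sketched on a small example. The vector `ids` is a hypothetical two-element sample in the same shape as IA_YES. The base R calls lengths() and vapply() are shown only for comparison and are not part of the original answer.

```r
library(purrr)

# Hypothetical sample in the same shape as the IA_YES column
ids <- c("AAA-GB-0001", "BBB-ES-0005,ADD-GB-0001")

# strsplit() returns a list with one character vector per input string
parts <- strsplit(ids, ",", fixed = TRUE)

# Three purrr/base spellings of "count the entries in each list element"
n_named  <- map_int(parts, length)                 # function passed by name
n_anon   <- map_int(parts, function(x) length(x))  # explicit anonymous function
n_base   <- lengths(parts)                         # base R shortcut
n_vapply <- vapply(parts, length, integer(1))      # base R with a declared return type

# map_chr() must return one string per element, e.g. the first ID of each row
first_id <- map_chr(parts, function(x) x[[1]])

print(n_named)
print(first_id)
```

If only the count is needed, lengths(strsplit(...)) avoids keeping the intermediate list column at all; whether that is preferable depends on whether the split IDs are used later.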

library(dplyr)
df <- tibble(num = c("12,3", "10.7"))
df %>% 
  mutate(num = as.numeric(sub(",", ".", num, fixed = TRUE)))
#> # A tibble: 2 x 1
#>     num
#>   <dbl>
#> 1  12.3
#> 2  10.7

A more "tidy" version:

library(tidyverse)
df <- tibble(num = c("12,3", "10.7"))
df %>% 
  mutate(num = str_replace(num, fixed(","), ".") %>%  
           as.numeric())
#> # A tibble: 2 x 1
#>     num
#>   <dbl>
#> 1  12.3
#> 2  10.7
