Split a string column into multiple new columns in R

Problem description

I have a column of data that looks like this:

data.frame(Weather = c("Breezy Temp: 68° F, Humidity: 66%, Wind: W 15 mph", "N/A Temp: ° F, Wind:   mph"))

I want to extract all the numbers from each string, but I want to store them in separate columns.

Ideally, the result would look like this:

Row 1: 68 66 15
Row 2: NA NA NA (blanks will do, too)

So far, I have been able to do this:

(str_extract_all(Data$Column,"\\(?[0-9,.]+\\)?"))

But I just get a list that looks like this:

[[1]]
[1] "68" "," "," "," "66" "," "," "," "15"

[[2]]
[1] ","

instead of being able to split each row into 3 columns.

Thanks!

Tags: r, regex, string, list

Solution


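Here is one option with base R. We replace each run of one or more non-digits (\\D+) with a comma using gsub, read the result with read.csv, and then use Filter to drop the columns that contain only NA.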

Filter(function(x) any(!is.na(x)), 
  read.csv(text = gsub("\\D+", ",", df1$Weather), 
         fill = TRUE, header = FALSE))

Output

#  V2 V3 V4
#1 68 66 15
#2 NA NA NA
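
To make the one-liner above easier to follow, here is a minimal sketch that runs the same three steps one at a time. The variable names (weather, csv_lines, parsed, result) are illustrative and not part of the original answer, and the degree sign is written as the escape \u00b0 only so that the snippet does not depend on the file encoding.

```r
# Same two example strings as df1 (see the Data section);
# "\u00b0" is the degree sign.
weather <- c("Breezy Temp: 68\u00b0 F, Humidity: 66%, Wind: W 15 mph",
             "N/A Temp: \u00b0 F, Wind:   mph")

# Step 1: collapse every run of non-digits into a single comma.
csv_lines <- gsub("\\D+", ",", weather)
csv_lines
# [1] ",68,66,15," ","

# Step 2: parse the comma-separated lines; fill = TRUE pads short rows with NA.
parsed <- read.csv(text = csv_lines, fill = TRUE, header = FALSE)

# Step 3: keep only the columns that have at least one non-NA value.
result <- Filter(function(x) any(!is.na(x)), parsed)
result
```

Because each string starts and ends with non-digit text, every line gets a leading and a trailing comma. That produces an empty first and last column, which Filter removes, and this is why the remaining columns are named V2 to V4 rather than V1 to V3.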

With the new data

Filter(function(x) any(!is.na(x)), 
   read.csv(text = gsub("\\D+", ",", df2$Weather), 
          fill = TRUE, header = FALSE))
#  V2 V3 V4 V5
#1 68 66 15 NA
#2 NA NA NA NA
#3 76 68  6 10
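
As an alternative sketch (not part of the original answer), the numbers can also be extracted directly with regmatches and gregexpr and then padded to a common length, which avoids the round trip through read.csv. The names weather, nums, n_max, padded and result are illustrative, and the sketch assumes that at least one string contains a number.

```r
# Same three example strings as df2 (see the Data section);
# "\u00b0" is the degree sign.
weather <- c(
  "Breezy Temp: 68\u00b0 F, Humidity: 66%, Wind: W 15 mph",
  "N/A Temp: \u00b0 F, Wind:   mph",
  "Cloudy Temp: 76\u00b0 F, Humidity: 68%, Wind: SW 6 mph, Gusts to 10 mph"
)

# Pull out every run of digits; the result is a list with one
# character vector per input string (possibly of length zero).
nums <- regmatches(weather, gregexpr("[0-9]+", weather))

# Pad each vector with NA up to the longest one, then convert to numeric.
n_max <- max(lengths(nums))
padded <- lapply(nums, function(x) as.numeric(`length<-`(x, n_max)))

# Stack the padded vectors row-wise into a data frame (columns V1, V2, ...).
result <- as.data.frame(do.call(rbind, padded))
result
```

With this approach the columns are numbered from V1, since no empty leading column is created. The result can be attached to the original data with cbind if the Weather column should be kept alongside the extracted values.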

Data

df1 <- structure(list(Weather = c("Breezy Temp: 68° F, Humidity: 66%, Wind: W 15 mph", 
"N/A Temp: ° F, Wind:   mph")), class = "data.frame", row.names = c(NA, 
-2L))

df2 <- structure(list(Weather = c("Breezy Temp: 68° F, Humidity: 66%, Wind: W 15 mph", 
"N/A Temp: ° F, Wind:   mph", "Cloudy Temp: 76° F, Humidity: 68%, Wind: SW 6 mph, Gusts to 10 mph"
 )), row.names = c(NA, -3L), class = "data.frame")
