r - How to extract part of the cell values across all columns?
Problem description
I have a data frame like this:
df1<-structure(list(q006_1 = c("1098686880", "18493806","9892464","96193586",
"37723803","13925456","37713534","1085246853"),
q006_2 = c("1098160170","89009521","9726314","28076230","63451251",
"1090421499","37124019"),
q006_3 = c("52118967","41915062","1088245358","79277706","91478662",
"80048634")),
class = "data.frame", row.names = c(NA, -8L))
I know how to use substr on a single column of a data.table to extract the last five digits of each number, but I would like to do this for all columns.
n_last <- 5
df1[, `q006_1`:= substr(q006_1, nchar(q006_1) - n_last + 1, nchar(q006_1))]
How can I do this for all columns?
Solution
With data.table, this can be done as follows. (Your example data is incomplete: the first column has 8 entries, the second has 7, and the third has 6.)
library(data.table)
# or `cols <- names(df1)` if you want to apply it to all columns and this is not just an example
cols <- c("q006_1", "q006_2", "q006_3")
# convert to a data.table by reference, then keep only the last five characters of each selected column
setDT(df1)[, (cols) := lapply(.SD, function(x) {
  sub('.*(?=.{5}$)', '', x, perl = TRUE)
}), .SDcols = cols][]
#    q006_1 q006_2 q006_3
# 1:  86880  60170  18967
# 2:  93806  09521  15062
# 3:  92464  26314  45358
# 4:  93586  76230  77706
# 5:  23803  51251  78662
# 6:  25456  21499  48634
# 7:  13534  24019  76230
# 8:  46853  76230  76230
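To see what the regular expression in the call above does, it may help to try it on a plain character vector first. The following is a minimal sketch; the vector `x` and the name `last5` are illustrative and not part of the original post. The greedy `.*` consumes as much as it can, while the lookahead `(?=.{5}$)` requires that exactly five characters remain before the end of the string, so everything except the last five characters is removed. A string with fewer than five characters does not match and is returned unchanged.

```r
# Illustrative input (hypothetical), including one value shorter than five characters
x <- c("1098686880", "9892464", "123")

# Remove everything that is followed by exactly five characters and the end of the string;
# perl = TRUE is needed because lookahead is a Perl-style regex feature
last5 <- sub(".*(?=.{5}$)", "", x, perl = TRUE)

print(last5)
# [1] "86880" "92464" "123"
```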
Data:
df1<-structure(list(q006_1 = c("1098686880", "18493806","9892464","96193586",
"37723803","13925456","37713534","1085246853"),
q006_2 = c("1098160170","89009521","9726314","28076230",
"63451251","1090421499","37124019","28076230"),
q006_3 = c("52118967","41915062","1088245358","79277706",
"91478662","80048634","28076230","28076230")),
class = c("data.frame"), row.names = c(NA, -8L))
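Since the question started from substr, the same result can probably also be obtained by keeping that call and applying it to every column through .SD. This is only a sketch under stated assumptions: the small table `dt` is a hypothetical subset of the data above, and `cols <- names(dt)` assumes every column should be processed.

```r
library(data.table)

# Hypothetical small table built from the first rows of the example data
dt <- data.table(
  q006_1 = c("1098686880", "18493806", "9892464"),
  q006_2 = c("1098160170", "89009521", "9726314")
)

n_last <- 5
cols <- names(dt)  # assumption: apply the extraction to all columns

# Apply the question's substr() call to each column listed in .SDcols
dt[, (cols) := lapply(.SD, function(x) {
  substr(x, nchar(x) - n_last + 1, nchar(x))
}), .SDcols = cols]

print(dt)
```

For values shorter than n_last characters, the computed start position is zero or negative, and substr then returns the whole string, which matches the behaviour of the regular expression approach.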