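r - Remove individual characters without changing the numbers in an R data frame
Question
My data frame contains many arrows and ">" and "<" signs attached to some of the element values. I want to remove these characters but keep the numbers. I only know how to replace the whole element with NA, using the code below.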
df <- apply(df, 1:2, gsub, pattern = "<|>", replacement = "")
Can someone help me edit it so that it keeps the number in each element instead of throwing the whole thing away?
Data frame:
structure(list(`Analyte Sample` = c(1, 2, 3, 4, 5, 6, 7, 8,
9, 10, 11, 12, 13, 14), A = c("4190", "6665", "7435", "2052",
"783", "322", "199", "90", "46", "17", "8", "3", "3", "<1↓"
), B = c("11569", "6677", "3852", "983.88", "589", "359", "203",
"68", "33", "12", "6", "<2↓", "4", "<1↓"), C = c("20453",
"7699", "2499", "707.98", "412", "328", "156", "88", "39", "27",
"17", "<1↓", "<3↓", "<1↓"), D = c("7893", ">20000↑",
"1623", "685.64", "321", "644", "112", "65", "35", "29", "9",
"5", "<3↓", "<1↓"), E = c("320", "15444", "2049", "1065",
"389", "365", "145", "77", "38", "16", "9", "6", "<2↓", "<2↓"
), F = c("7438", ">21999↑", "3472", "1057", "563", "401", "167",
"89", "46", "19", "6", "<1↓", "<1↓", "<1↓"), G = c(7345,
9001, 2473, 1138, 516, 403, 134, 81, 37, 17, 8, 6, 4, 3), H = c("9004",
"3998", "2299", "964.88", "499", "341", "112", "88", "39", "32",
"<29↓", "<30↓", "<31↓", "<29↓"), I = c("8434", "8700",
"2217", "1263", "567", "352", "153", "80", "43", "18", "9", "2",
"3", "<1↓"), J = c("7734", "6733", "2092", "1115", "637", "332",
"155", "82", "37", "17", "10", "4", "1", "<1↓"), K = c(">3718↑",
">3000↑", "2118", "862.13", "426", "355", "143", "78", "44",
"22", "11", "<4↓", "<4↓", "<3↓"), L = c(6345, 7688, 2311,
1195, 647, 366, 177, 83, 41, 20, 8, 6, 3, 2), M = c("4222", ">25587↑",
"1846", "814.61", "422", "314", "154", "86", "41", "27", "21",
"<2↓", "<2↓", "<3↓"), N = c("6773", "8934", "2381", "1221",
"677", "356", "146", "89", "40", "17", "10", "5", "2", "<2↓"
), O = c(">2200↑", ">2133↑", ">2000↑", "564.5", "226",
"476", "111", "60", "32", "36", "18", "<10↓", "<1↓", "<2↓"
)), class = c("spec_tbl_df", "tbl_df", "tbl", "data.frame"), row.names = c(NA,
-14L), spec = structure(list(cols = list(`Analyte Sample` = structure(list(), class = c("collector_double",
"collector")), A = structure(list(), class = c("collector_character",
"collector")), B = structure(list(), class = c("collector_character",
"collector")), C = structure(list(), class = c("collector_character",
"collector")), D = structure(list(), class = c("collector_character",
"collector")), E = structure(list(), class = c("collector_character",
"collector")), F = structure(list(), class = c("collector_character",
"collector")), G = structure(list(), class = c("collector_double",
"collector")), H = structure(list(), class = c("collector_character",
"collector")), I = structure(list(), class = c("collector_character",
"collector")), J = structure(list(), class = c("collector_character",
"collector")), K = structure(list(), class = c("collector_character",
"collector")), L = structure(list(), class = c("collector_double",
"collector")), M = structure(list(), class = c("collector_character",
"collector")), N = structure(list(), class = c("collector_character",
"collector")), O = structure(list(), class = c("collector_character",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), skip = 1), class = "col_spec"))
Answer
I think the best approach in your case is a regular expression. Using the tidyverse:
df %>% mutate_at(vars(A:O), ~ as.numeric(gsub("[^0-9]*([0-9]*).*", "\\1", .)))
If you only want to change values that start with a < or >, do the following:
df %>% mutate_at(vars(A:O), ~ as.numeric(gsub("[<>]*([0-9]*).*", "\\1", .)))
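Of course, you can also use apply, but note that apply converts the data frame to a matrix before applying the function. Columns that were numeric end up prefixed with spaces, so the pattern needs to include a space: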
apply(df, 2, function(x) gsub("[ <>]*([0-9]*).*", "\\1", x))
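To make the matrix conversion concrete, here is a minimal base R sketch. The two-column data frame is a made-up stand-in for the question's data, and the lapply version is only one possible alternative, not the answer's method. It shows the space padding and a column-wise approach that avoids the matrix and returns numeric columns.

```r
# Hypothetical stand-in for the question's data: one character column, one numeric.
df <- data.frame(A = c("<1", "20"), G = c(7345, 3), stringsAsFactors = FALSE)

# apply() works on as.matrix(df). Because one column is character, the numeric
# column is formatted to a common width, padding shorter numbers with spaces.
m <- as.matrix(df)
print(m[, "G"])

# Column-wise alternative: strip "<" and ">" per column and convert to numeric,
# without ever building the character matrix.
cols <- c("A", "G")
df[cols] <- lapply(df[cols], function(x) as.numeric(gsub("[<>]", "", x)))
str(df)
```

Assigning back into df[cols] keeps the result a data frame, whereas apply returns a character matrix that would still need converting.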
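Explanation:
The pattern [0-9]* matches a digit any number of times. The pattern [^0-9]* matches anything other than a digit any number of times. The parentheses capture the run of digits, and \\1 in the replacement puts that captured run back in place of the whole match.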
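As a quick check of what the patterns do, the sketch below runs the first pattern on a few values from the question's data. Because the capture group holds only the first run of digits, a value such as "983.88" comes back as "983". The second pattern, which also keeps the decimal point, is an assumed variant for that case and not part of the original answer.

```r
# Sample values from the question's data. The arrows are written as Unicode
# escapes: \u2193 is the down arrow, \u2191 is the up arrow.
x <- c("4190", "<1\u2193", ">20000\u2191", "983.88")

# The answer's pattern: skip leading non-digits, capture the first run of
# digits, and discard everything after it (including any decimal part).
first_digits <- gsub("[^0-9]*([0-9]*).*", "\\1", x)

# Assumed variant: delete every character that is not a digit or a dot,
# so decimal values survive intact.
digits_and_dot <- gsub("[^0-9.]", "", x)

print(first_digits)
print(as.numeric(digits_and_dot))
```

The variant assumes each value contains at most one dot. A string with several dots would turn into NA under as.numeric.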