r - Merging two list columns based on a condition
问题描述
I have data which looks like:
extractedNames1 extractedNames2
1
2 synopsys
3
4 somero
5 cbiz medical management professionals, cbiz, cbiz, cbiz, cbiz, cbiz
6
7
8 johnson, johnson, johnson
9 iridium
10 skillsoft, skillsoft, skillsoft skillsoft, skillsoft, skillsoft
It has two columns which contain lists. I want to merge the two columns into a single column.
First I would like to only keep the unique
values. So cbiz medical management professionals
would be a unique value in column extractedNames1
but cbiz, cbiz, cbiz, cbiz, cbiz
would be collapsed down to cbiz
in column extractedNames2
.
Second I would like to merge based on a condition. If there is no result in extractedNames1
then take the unique
value from extractedNames2
. If there is a result in both columns then just keep the result from extractedNames1
. (So I want to merge the two columns but only when there is an empty space in the column extractedNames1
.)
Expected output:
extractedNamesFina
l
1
2 synopsys
3
4 somero
5 cbiz medical management professions,
6
7
8 Johnson
9 iridium
10 skillsoft
Data
structure(list(extractedNames1 = list(character(0), character(0),
character(0), character(0), " cbiz medical management professionals,",
character(0), character(0), character(0), character(0), c(" skillsoft",
" skillsoft", " skillsoft")), extractedNames2 = list(character(0),
"synopsys", character(0), "somero", c("cbiz", "cbiz", "cbiz",
"cbiz", "cbiz"), character(0), character(0), c("johnson",
"johnson", "johnson"), "iridium", c("skillsoft", "skillsoft",
"skillsoft"))), row.names = c(NA, -10L), class = "data.frame")
解决方案
You can use the map()
family in purrr
.
library(purrr)
df2 <- as.data.frame(matrix(nrow = nrow(df)))
df2[[1]] <- pmap(map(df, map, unique), ~ if(length(.x)) .x else .y)
df2
# V1
# 1
# 2 synopsys
# 3
# 4 somero
# 5 cbiz medical management professionals,
# 6
# 7
# 8 johnson
# 9 iridium
# 10 skillsoft
If you want to insert a list into a data.frame
, you need to create a data.frame
with the same length as the list. A more convenient way is to use tibble()
, which can take a list as input.
library(tibble)
tibble(new = pmap(map(df, map, unique), ~ if(length(.x)) .x else .y))
# # A tibble: 10 x 1
# new
# <list>
# 1 <chr [0]>
# 2 <chr [1]>
# 3 <chr [0]>
# 4 <chr [1]>
# 5 <chr [1]>
# 6 <chr [0]>
# 7 <chr [0]>
# 8 <chr [1]>
# 9 <chr [1]>
# 10 <chr [1]>
or
as_tibble_col(pmap(map(df, map, unique), ~ if(length(.x)) .x else .y), column_name = "new")
推荐阅读
- c# - 如何在 XAMPP 服务器上上传数据库?
- ruby-on-rails - 添加404错误页面后的ConnectionTimeoutError?
- vb.net - If Condition 中的拆分字符串不显示所有拆分值。VB.NET
- r - 用标点符号解析字符串,在r中的一个大列表中的不均匀行中
- android - 单击立即尝试按钮时即时应用程序崩溃
- android - 添加 Barcode_Scan 包后 Flutter 应用程序无法构建
- c - 我删除结构条目的代码没有删除,我不知道为什么
- php - 在 Laravel 中显示类别、子类别和子类别时出错
- wordpress - .htaccess 网站的 WordPress 问题转到示例域
- css - HTML如何增加