Delete columns whose header contains specific characters, while moving the values under those columns to the nearest column on the left

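Problem description

Well, as you can probably tell from the title, my logic has gone out the window, so I will do my best to make the goal clear.

I have 10 columns and 2 rows: one row holds the column names and the other holds the topic names.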

1       2       3       4       5       6       7       8       9       10         #(Column Count)
Name1   ---     ---     Name2   ---     ---     Name3   ---     ---     Name4      #(Column Names)[Row1]
Topic1  Topic2  Topic3  Topic4  Topic5  Topic6  Topic7  Topic8  Topic9  Topic10    #(Topic Names)[Row2]

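Basically, I want to delete every column whose name contains "---", but move the value under each deleted column to the nearest column on its left that is not deleted. The expected result looks like this: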

1       2       3       4     
Name1   Name2   Name3   Name4           
Topic1  Topic4  Topic7  Topic10
Topic2  Topic5  Topic8
Topic3  Topic6  Topic9           

Tags: r, rstudio

Solution


We can use:

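library(zoo)
# Stack to long format; set '---' headers to NA, then carry the last real header forward
df2 <- transform(stack(df1),
      ind = na.locf0(replace(ind, grepl('---', ind), NA)))
# Group the values by their filled-in header
lst1 <- split(df2$values, as.character(df2$ind))
# Pad each group with NA up to the longest group and bind the groups as columns
out <- do.call(cbind, lapply(lst1, `length<-`, max(lengths(lst1))))
out
#     Name1    Name2    Name3    Name4    
#[1,] "Topic1" "Topic4" "Topic7" "Topic10"
#[2,] "Topic2" "Topic5" "Topic8" NA       
#[3,] "Topic3" "Topic6" "Topic9" NA   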
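
The grouping idea behind this approach can also be sketched in base R without extra packages. This is only an illustrative sketch, not part of the original answer: it assumes the first column has a real header (not "---"), rebuilds the sample data inline so it runs on its own, and the names `hdr`, `keep`, `grp`, `vals` and `lst` are made up for the example.

```r
# Sample data with the same shape as df1 (one row, duplicate '---' headers).
df1 <- setNames(
  as.data.frame(as.list(paste0("Topic", 1:10)), stringsAsFactors = FALSE),
  c("Name1", "---", "---", "Name2", "---", "---", "Name3", "---", "---", "Name4")
)

hdr  <- names(df1)
keep <- !grepl("---", hdr, fixed = TRUE)  # TRUE for headers that survive
grp  <- cumsum(keep)                      # each '---' column joins the kept column to its left

# Flatten the values column by column and repeat the group id once per row.
vals <- unlist(df1, use.names = FALSE)
lst  <- split(vals, rep(grp, each = nrow(df1)))
names(lst) <- hdr[keep]                   # assumes the first header is a kept one

# Pad shorter groups with NA so that all columns have equal length.
n   <- max(lengths(lst))
out <- as.data.frame(lapply(lst, `length<-`, n), stringsAsFactors = FALSE)
out
```

Here `cumsum(keep)` plays the role that `na.locf0` plays above: every "---" column receives the group number of the nearest kept column on its left.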

Another option is to reshape to "long" format and then convert back to "wide" format:

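library(dplyr)
library(tidyr)
library(data.table)  # provides rowid()
df1 %>% 
    pivot_longer(everything()) %>%           # long format: columns `name` and `value`
    mutate(name = na_if(name, "---")) %>%    # '---' headers become NA
    fill(name) %>%                           # carry the last real header forward
    mutate(rn = rowid(name)) %>%             # position of each value within its header group
    select(name, value, rn) %>%
    pivot_wider(names_from = name, values_from = value) %>%
    select(-rn)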
# A tibble: 3 x 4
#  Name1  Name2  Name3  Name4  
#  <chr>  <chr>  <chr>  <chr>  
#1 Topic1 Topic4 Topic7 Topic10
#2 Topic2 Topic5 Topic8 <NA>   
#3 Topic3 Topic6 Topic9 <NA>  

Data

df1 <- structure(list(Name1 = "Topic1", `---` = "Topic2", `---` = "Topic3", 
    Name2 = "Topic4", `---` = "Topic5", `---` = "Topic6", Name3 = "Topic7", 
    `---` = "Topic8", `---` = "Topic9", Name4 = "Topic10"), row.names = c(NA, 
-1L), class = "data.frame")
