r - Remove "The" from the start of a character variable and move it to the end

Problem description

I have some data that looks like this (the code to create the data is at the end):

Year    Movie
2012    The Avengers
2015    Furious 7    
2017    The Fate of the Furious

My desired output is:

Year    Movie
2012    Avengers, The
2015    Furious 7
2017    Fate of the Furious, The

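Should I use stringr and regular expressions? Could you recommend a link that explains regex more simply than most websites or help documentation do?

This is poor, but it is all I have managed so far: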

str_replace(df$Movie, pattern = "The", replacement = "")

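Even just some hints about which commands to look for in the help documentation, or an explanation of where to find what I should be looking for, would help.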

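library(stringr)

df <- data.frame(stringsAsFactors = FALSE,
        Year = c(2012L, 2015L, 2017L),
       Movie = c("The Avengers", "Furious 7", "The Fate of the Furious")
)

df

# Current attempt: removes the first "The" in each title but does not move it to the end
str_replace(df$Movie, pattern = "The", replacement = "")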

Tags: r, regex, stringr

Try

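sub("^([Tt]he) (.*)", "\\2, \\1", df$Movie)
#[1] "Avengers, The"           
#[2] "Furious 7"               
#[3] "Fate of the Furious, The"

^ - anchors the match at the start of the string, so only a leading "The" is moved.
[Tt] - matches either "T" or "t", so a string that starts with "the" is matched as well. Thanks @rawr!
.* - . matches any character, and * repeats it zero or more times.
() - captures the text matched by the enclosed regex into a numbered group that can be reused through numbered backreferences, i.e. \\1 and \\2. See regular-expressions.info.

No ? is needed after "he": when a string does not start with "The", sub() finds no match and returns it unchanged. Writing [Tt]he? would only make the "e" optional, so the pattern would also match a leading "Th".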
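
Since the question asks about stringr, here is a minimal sketch of the same idea using str_replace(). It assumes the stringr package is installed and that the result should be written back into the Movie column. The trailing $ is an addition for illustration and is not part of the pattern above.

```r
library(stringr)

df <- data.frame(
  Year = c(2012L, 2015L, 2017L),
  Movie = c("The Avengers", "Furious 7", "The Fate of the Furious"),
  stringsAsFactors = FALSE
)

# Group 1 captures the leading "The"/"the", group 2 captures the rest of the title.
# The replacement swaps them; titles that do not match are returned unchanged.
df$Movie <- str_replace(df$Movie, "^([Tt]he) (.*)$", "\\2, \\1")

df$Movie
```

Because the pattern requires a space after "The", a title such as "Theory of Everything" is left as it is.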

I hope this makes some sense to you.
