r - Expanding word contractions
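Problem description
I am writing a function to expand word contractions. It takes a data frame as its input argument and returns the data frame with a "text_clean" column in which the contractions in the text have been expanded. I can do this by using the qdap mgsub function to replace the patterns in the text. However, I would like to know whether there is a better solution.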
contrap_pattern <- c("i'm","you're","he's","she's","it's", "we're", "they're","i've","you've","we've","they've","i'd","you'd","he'd","she'd","we'd","they'd","i'll","you'll","he'll","she'll","we'll","they'll","isn't","aren't","wasn't","weren't","hasn't","haven't","hadn't","doesn't","don't","didn't","won't","wouldn't","shan't","shouldn't","can't","couldn't","mustn't","let's","that's","who's","what's","here's","there's","when's","where's","why's","how's")
replacement_pattern <- c("I am","you are","he is" ,"she is" ,"it is","we are" , "they are", "I have","you have","we have", "they have","I would","you would","he would", "she would","we would","they would", "I will","you will","he will", "she will" ,"we will","they will","is not","are not","was not","were not","has not" , "have not","had not","does not","do not", "did not" ,"will not","would not", "shall not","should not","can not","could not","must not","let us","that is", "who is","what is","here is", "there is","when is","where is","why is","how is")
clean$text_clean <- qdap::mgsub(pattern = contrap_pattern, replacement = replacement_pattern, clean$text_clean)
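Update: the function replace_contraction() does the job without the patterns having to be written out explicitly in the code. Thanks to @phiver for the suggestion.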
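For readers who would rather not depend on a package, the same idea can be sketched in base R with a named lookup vector and `gsub()`. This is a minimal sketch, not the approach from the question or the update: the helper name `expand_contractions`, the short `lookup` vector, and the sample data frame are all hypothetical and only illustrate the technique. It assumes plain ASCII apostrophes (typographic ones such as ’ are not handled), and because matching is case-insensitive, a capitalized contraction like "You're" is replaced by the lowercase "you are".

```r
# Hypothetical helper (not part of qdap): expand contractions with base R only.
# `lookup` is a named character vector: names are contractions, values are expansions.
expand_contractions <- function(text, lookup) {
  for (i in seq_along(lookup)) {
    # Word boundaries stop "he's" from matching inside "she's".
    pattern <- paste0("\\b", names(lookup)[i], "\\b")
    text <- gsub(pattern, lookup[[i]], text, ignore.case = TRUE, perl = TRUE)
  }
  text
}

# A shortened lookup for illustration; the full vectors from the question
# could be combined with setNames(replacement_pattern, contrap_pattern).
lookup <- c("i'm" = "I am", "can't" = "can not", "she's" = "she is",
            "he's" = "he is", "won't" = "will not")

# Hypothetical sample data frame standing in for `clean`.
clean <- data.frame(text = c("i'm sure she's right", "he's late but won't say"),
                    stringsAsFactors = FALSE)
clean$text_clean <- expand_contractions(clean$text, lookup)
print(clean$text_clean)
```

The loop applies one `gsub()` call per contraction, which is fine for a list of about fifty patterns. Since none of the expansions contains an apostrophe, an earlier replacement cannot be matched again by a later pattern.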
Solution