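r - Remove duplicate country names within each row in R
Problem description
I have a dataset with a column like the sample shown below.
I need to remove the duplicate country names within each row (main request),
and then create a column for each country (supplementary request).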
data<-read.table(text="
LocationCountry
United States, Belgium, France, Ireland, Netherlands, Netherlands, Netherlands, Sweden
Spain, Spain, Spain, Spain
Korea, Republic of
Korea, Republic of
Austria, Austria, Austria
United States, United States, United States, United States, United States, United States
Italy, Italy
Korea, Republic of, Korea, Republic of, Korea, Republic of, Korea, Republic of, Korea, Republic of, Korea, Republic of, Korea, Republic of, Korea, Republic of
India, Iran, Islamic Republic of
Spain, Spain, Spain, Spain
Korea, Republic of
Turkey, Turkey", header = TRUE, sep = "\n", stringsAsFactors = FALSE)
Any suggestions would be appreciated.
Solution
In base R, we can use strsplit to split each row into a list, keep the unique elements, and paste them back together with toString.
data$LocationCountry <- sapply(strsplit(data$LocationCountry, ",\\s*"),
                               function(x) toString(unique(x)))
-output
data
# LocationCountry
#1 United States, Belgium, France, Ireland, Netherlands, Sweden
#2 Spain
#3 Korea, Republic of
#4 Korea, Republic of
#5 Austria
#6 United States
#7 Italy
#8 Korea, Republic of
#9 India, Iran, Islamic Republic of
#10 Spain
#11 Korea, Republic of
#12 Turkey
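To see what each step contributes, here is a minimal sketch of the same pipeline applied to a single string. The variable names (x, parts, deduped, result) are only for illustration and are not part of the original answer.

```r
# One row from the example data
x <- "United States, Belgium, France, Ireland, Netherlands, Netherlands, Netherlands, Sweden"

# 1. Split on a comma followed by optional whitespace.
#    strsplit() returns a list, so take its first (only) element.
parts <- strsplit(x, ",\\s*")[[1]]

# 2. Keep the first occurrence of each element, preserving order
deduped <- unique(parts)

# 3. Collapse back into one string; toString(v) is paste(v, collapse = ", ")
result <- toString(deduped)
print(result)
```

Note that a name such as "Korea, Republic of" is split into two pieces ("Korea" and "Republic of") by this pattern. Deduplicating and pasting puts them back in their original order, which is why those rows still read correctly in the output above.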
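For the supplementary part, if we need a binary column for each element in 'LocationCountry', take the updated 'LocationCountry' column with unique names, split it, and apply mtabulate from the qdapTools package.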
library(qdapTools)
cbind(data, mtabulate(strsplit(data$LocationCountry, ",\\s+")))
-output
LocationCountry Austria Belgium France India Iran Ireland Islamic Republic of Italy
1 United States, Belgium, France, Ireland, Netherlands, Sweden 0 1 1 0 0 1 0 0
2 Spain 0 0 0 0 0 0 0 0
3 Korea, Republic of 0 0 0 0 0 0 0 0
4 Korea, Republic of 0 0 0 0 0 0 0 0
5 Austria 1 0 0 0 0 0 0 0
6 United States 0 0 0 0 0 0 0 0
7 Italy 0 0 0 0 0 0 0 1
8 Korea, Republic of 0 0 0 0 0 0 0 0
9 India, Iran, Islamic Republic of 0 0 0 1 1 0 1 0
10 Spain 0 0 0 0 0 0 0 0
11 Korea, Republic of 0 0 0 0 0 0 0 0
12 Turkey 0 0 0 0 0 0 0 0
Korea Netherlands Republic of Spain Sweden Turkey United States
1 0 1 0 0 1 0 1
2 0 0 0 1 0 0 0
3 1 0 1 0 0 0 0
4 1 0 1 0 0 0 0
5 0 0 0 0 0 0 0
6 0 0 0 0 0 0 1
7 0 0 0 0 0 0 0
8 1 0 1 0 0 0 0
9 0 0 0 0 0 0 0
10 0 0 0 1 0 0 0
11 1 0 1 0 0 0 0
12 0 0 0 0 0 1 0
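In the output above, "Korea, Republic of" and "Iran, Islamic Republic of" each produce two columns, because the comma inside the name is treated as a separator. If one column per country is wanted, one possible approach is to replace the comma in known multi-part names with a placeholder before splitting. The following is a sketch in base R only, not the answer's method: the multi_part vector, the "|" placeholder, the protect helper, and the three-row sample data are all assumptions for illustration, and the list of names would need to be adapted to the real data.

```r
# Assumption: names whose official form contains a comma are known in advance
multi_part <- c("Korea, Republic of", "Iran, Islamic Republic of")

# Small illustrative sample, not the full dataset from the question
data <- data.frame(
  LocationCountry = c("United States, Belgium, Belgium",
                      "Korea, Republic of, Korea, Republic of",
                      "India, Iran, Islamic Republic of"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: swap the inner ", " for "|" so the name survives splitting
protect <- function(x) {
  for (nm in multi_part) {
    x <- gsub(nm, sub(", ", "|", nm, fixed = TRUE), x, fixed = TRUE)
  }
  x
}

# Split, restore the original names, and deduplicate within each row
countries <- lapply(strsplit(protect(data$LocationCountry), ",\\s*"),
                    function(x) unique(sub("|", ", ", x, fixed = TRUE)))

# Deduplicated column, as in the main request
data$LocationCountry <- sapply(countries, toString)

# One 0/1 column per distinct country, one matrix row per data row
all_countries <- sort(unique(unlist(countries)))
indicators <- do.call(rbind, lapply(countries, function(x) {
  as.integer(all_countries %in% x)
}))
colnames(indicators) <- all_countries

result <- cbind(data, indicators)
print(result)
```

The placeholder must be a character that never occurs in the data. If the set of comma-containing names is not known ahead of time, this sketch does not apply as written.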