Combining datasets based on a common column

Problem description

I want to add a new column to my dataset that holds the Twitter ID for each username. With the rtweet package, I can look up the IDs like this:

library(rtweet)
library(dplyr)  # provides %>% and select()
usr_df <- lookup_users(df2$Username) %>% 
  select(user_id, screen_name)

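So I think the operation is fairly simple: I just want to attach the correct Twitter ID to the correct username in a single data frame. I have been experimenting with inner_join without any success. In addition, some rows have NA for the username.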

Dataset 1:

# A tibble: 6 x 2
  Name           Username       
  <chr>          <chr>          
1 ZiadAboultaif  ziad_aboultaif 
2 ScottAitchison ScottAAitchison
3 DanAlbas       DanAlbas       
4 JohnAldag      jwaldag        
5 OmarAlghabra   OmarAlghabra   
6 ShafqatAli     Shafqat_Ali_1  

# For reproducibility:
structure(list(Name = c("ZiadAboultaif", "ScottAitchison", "DanAlbas", 
"JohnAldag", "OmarAlghabra", "ShafqatAli"), Username = c("ziad_aboultaif", 
"ScottAAitchison", "DanAlbas", "jwaldag", "OmarAlghabra", "Shafqat_Ali_1"
)), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"
))

Dataset 2:

  user_id    screen_name    
  <chr>      <chr>          
1 4568748862 ziad_aboultaif 
2 172004509  ScottAAitchison
3 16278177   DanAlbas       
4 335769776  jwaldag        
5 20199202   OmarAlghabra   
6 578640179  Shafqat_Ali_1 

# For reproducibility:
structure(list(user_id = c("4568748862", "172004509", "16278177", 
"335769776", "20199202", "578640179"), screen_name = c("ziad_aboultaif", 
"ScottAAitchison", "DanAlbas", "jwaldag", "OmarAlghabra", "Shafqat_Ali_1"
)), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"
))

Tags: r

Solution
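
One possible approach, sketched under a few assumptions: df2 is Dataset 1, usr_df is Dataset 2, and the key columns simply have different names (Username vs. screen_name). The question does not show the failing inner_join call, so this is only a likely cause: without a `by` argument, dplyr looks for columns with the same name in both tables and finds none here. Naming the key pair explicitly avoids that, and left_join (rather than inner_join) keeps rows whose username is NA. The extra "NoAccount" row below is hypothetical and is added only to show the NA case.

```r
library(dplyr)

# Dataset 1 from the question, plus one hypothetical row with a missing
# username (not in the original data; added to illustrate NA handling).
df2 <- tibble(
  Name = c("ZiadAboultaif", "ScottAitchison", "DanAlbas",
           "JohnAldag", "OmarAlghabra", "ShafqatAli", "NoAccount"),
  Username = c("ziad_aboultaif", "ScottAAitchison", "DanAlbas",
               "jwaldag", "OmarAlghabra", "Shafqat_Ali_1", NA)
)

# Dataset 2 from the question (the output of lookup_users()).
usr_df <- tibble(
  user_id = c("4568748862", "172004509", "16278177",
              "335769776", "20199202", "578640179"),
  screen_name = c("ziad_aboultaif", "ScottAAitchison", "DanAlbas",
                  "jwaldag", "OmarAlghabra", "Shafqat_Ali_1")
)

# Keep every row of df2 and attach the matching user_id.
# The named vector maps df2$Username onto usr_df$screen_name.
result <- df2 %>%
  left_join(usr_df, by = c("Username" = "screen_name"))

print(result)
```

The result keeps the Name and Username columns and gains user_id; rows with no match in usr_df, including the NA username, get NA for user_id. Switching to inner_join with the same `by` argument would drop those rows instead. The match is case-sensitive, so if the casing in usr_df ever differs from what was typed in df2, joining on a lowercased copy of both keys is one way around it.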

