r - Standardize customer IDs based on identical company names

Problem description

I need to take one of the several customer IDs and apply it to every row that has exactly the same company name.

Before

    Customer.Ids    Company       Location
    1211            Lightz        New York
    1325            Comput.Inc    Seattle
    1756            Lightz        California

After

    Customer.Ids    Company       Location
    1211            Lightz        New York
    1325            Comput.Inc    Seattle
    1211            Lightz        California

Both Lightz rows now share the same customer ID. What code would work best for this?

Tags: r

We can use match here, because it returns the position of the first match. We match Company against Company itself. From ?match:

match returns a vector of the positions of (first) matches of its first argument in its second.

df$Customer.Ids <- df$Customer.Ids[match(df$Company, df$Company)]
df

#  Customer.Ids    Company   Location
#1         1211     Lightz    NewYork
#2         1325 Comput.Inc    Seattle
#3         1211     Lightz California

where

match(df$Company, df$Company) #returns
#[1] 1 2 1
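
The match approach can be tried end to end with the sketch below. It rebuilds the question's table by hand; the stringsAsFactors = FALSE setting and the helper name first_pos are assumptions made for illustration, not part of the original answer.

```r
# Rebuild the example data from the question.
# stringsAsFactors = FALSE is an assumption; the original df may hold factors.
df <- data.frame(
  Customer.Ids = c(1211, 1325, 1756),
  Company      = c("Lightz", "Comput.Inc", "Lightz"),
  Location     = c("New York", "Seattle", "California"),
  stringsAsFactors = FALSE
)

# For each row, the position of the first row with the same company name.
first_pos <- match(df$Company, df$Company)
first_pos
# [1] 1 2 1

# Index the IDs by those positions so each row takes its company's first ID.
df$Customer.Ids <- df$Customer.Ids[first_pos]
df$Customer.Ids
# [1] 1211 1325 1211
```

Row 3 points back to row 1, so its ID 1756 is replaced by 1211, while the single Comput.Inc row keeps its own ID.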

Some other options, using sapply:

df$Customer.Ids <- df$Customer.Ids[sapply(df$Company, function(x)
                               which.max(x == df$Company))]

Here we loop over each Company and take the position of its first occurrence.
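
The reason which.max finds the first occurrence is that a logical vector is treated as 0/1, and which.max returns the index of the first maximum. A small sketch, using a hypothetical stand-alone vector company in place of df$Company:

```r
# Hypothetical stand-alone copy of the Company column.
company <- c("Lightz", "Comput.Inc", "Lightz")

# Comparing one name against the whole column gives a logical vector.
hits <- "Lightz" == company
hits
# [1]  TRUE FALSE  TRUE

# TRUE counts as 1, and which.max returns the first position of the maximum.
first_lightz <- which.max(hits)
first_lightz
# [1] 1

first_comput <- which.max("Comput.Inc" == company)
first_comput
# [1] 2
```

One caveat: if a name matched nothing at all, which.max would still return 1, because every element would be FALSE. That cannot happen here, since each name is compared against the column it came from.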

Or another option using ave, following the same logic as @Shree, to get the first occurrence by group:

with(df, ave(Customer.Ids, Company, FUN = function(x) head(x, 1)))
#[1] 1211 1325 1211
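
As a quick check, the sketch below runs all three approaches on the question's data and confirms they agree. The variable names by_match, by_sapply and by_ave are made up for illustration, and stringsAsFactors = FALSE is an assumption about how the data was read.

```r
# Example data from the question (character columns assumed).
df <- data.frame(
  Customer.Ids = c(1211, 1325, 1756),
  Company      = c("Lightz", "Comput.Inc", "Lightz"),
  Location     = c("New York", "Seattle", "California"),
  stringsAsFactors = FALSE
)

# 1. match: position of the first row with the same company.
by_match <- df$Customer.Ids[match(df$Company, df$Company)]

# 2. sapply + which.max: same idea, one company at a time.
by_sapply <- df$Customer.Ids[sapply(df$Company, function(x)
  which.max(x == df$Company))]

# 3. ave: first ID within each Company group, recycled over the group.
by_ave <- with(df, ave(Customer.Ids, Company, FUN = function(x) head(x, 1)))

# All three give the same standardized IDs.
stopifnot(identical(by_match, by_sapply), identical(by_match, by_ave))
by_match
# [1] 1211 1325 1211
```

Note that the ave call only returns the new vector; to update the data frame, assign the result back, for example df$Customer.Ids <- by_ave.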
