Why does the rank function give every country the same rank?

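Question

This is a follow-up to a question that has already been answered: "Create a ranking variable with dplyr?". For some reason, the approach from that answer does not work on my data. I am ranking countries by the difference in unemployment rates between two periods.

I used this code, as suggested: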

df %>% mutate(rank = dense_rank(desc(difference)))

But I get 1 as the rank for every country. Can someone tell me what is going wrong?

Here is my data:

structure(list(cntry = structure(1:23, .Label = c("Austria", 
"Belgium", "Switzerland", "Czech Republic", "Germany", "Denmark", 
"Estonia", "Greece", "Spain", "Finland", "France", "Hungary", 
"Ireland", "Iceland", "Italy", "Luxembourg", "Netherlands", "Norway", 
"Poland", "Portugal", "Sweden", "Slovakia", "United Kingdom"), class = "factor"), 
    difference = c(0.0321271618815491, -0.0251554839428438, 1.15072942999273, 
    1.33128598731325, -2.26400160811014, 3.15779980836141, 6.80457896869579, 
    6.70389987400804, 10.8919891165462, 0.547460084552159, 0.906834874234579, 
    3.01112447330944, 8.5885631447415, 3.75206570820895, 1.58794503937105, 
    0.334356006591187, 0.664766564981566, 0.0155501469693973, 
    -0.984605793974606, 4.28470580541735, 1.11996749834057, 1.67278245779503, 
    1.93783051552776)), row.names = c(NA, -23L), groups = structure(list(
    cntry = structure(1:23, .Label = c("Austria", "Belgium", 
    "Switzerland", "Czech Republic", "Germany", "Denmark", "Estonia", 
    "Greece", "Spain", "Finland", "France", "Hungary", "Ireland", 
    "Iceland", "Italy", "Luxembourg", "Netherlands", "Norway", 
    "Poland", "Portugal", "Sweden", "Slovakia", "United Kingdom"
    ), class = "factor"), .rows = structure(list(1L, 2L, 3L, 
        4L, 5L, 6L, 7L, 8L, 9L, 10L, 11L, 12L, 13L, 14L, 15L, 
        16L, 17L, 18L, 19L, 20L, 21L, 22L, 23L), ptype = integer(0), class = c("vctrs_list_of", 
    "vctrs_vctr", "list"))), row.names = c(NA, 23L), class = c("tbl_df", 
"tbl", "data.frame"), .drop = TRUE), class = c("grouped_df", 
"tbl_df", "tbl", "data.frame"))

Tags: r, dplyr

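Answer

The dput output shows that this is a grouped dataset (class "grouped_df"), grouped by 'cntry', with only one observation per 'cntry'. Because mutate() works within each group, dense_rank() only ever sees a single value at a time, and the rank of a single value is always 1. We can call ungroup() before applying the ranking: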

library(dplyr)

# Drop the grouping first so ranks are computed across all countries
df %>%
    ungroup() %>%
    mutate(rank = dense_rank(desc(difference)))
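To see the effect in isolation, here is a minimal sketch on a made-up three-row table. The `toy` data frame and its values are hypothetical and are not taken from the question's data. It compares ranking with the grouping in place against ranking without it, and shows two dplyr helpers, `is_grouped_df()` and `group_vars()`, that can be used to check whether a data frame still carries grouping.

```r
library(dplyr)

# Hypothetical toy data: three countries, one row each
toy <- tibble(
  cntry = c("A", "B", "C"),
  difference = c(0.5, 3.2, -1.0)
)

# Grouped by cntry: each group holds a single row,
# so dense_rank() returns 1 for every row
grouped <- toy %>%
  group_by(cntry) %>%
  mutate(rank = dense_rank(desc(difference)))

# No grouping: ranks are computed across all rows,
# with the largest difference ranked first
ungrouped <- toy %>%
  mutate(rank = dense_rank(desc(difference)))

# Check whether a data frame is grouped, and by which columns
is_grouped_df(grouped)
group_vars(grouped)
```

Running `is_grouped_df(df)` on the data from the question is a quick way to confirm the diagnosis before adding `ungroup()`.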
