Get a list of all object names in the datasets R package?

Problem description

How can I get a list of the exact names of the objects in the datasets package?

I found many of them this way:

data_package = data(package="datasets")
datasets <- as.data.frame(data_package[[3]])$Item
datasets

#   [1] "AirPassengers"          "BJsales"                "BJsales.lead (BJsales)" "BOD"                    "CO2"                    "ChickWeight"           
#   [7] "DNase"                  "EuStockMarkets"         "Formaldehyde"           "HairEyeColor"           "Harman23.cor"           "Harman74.cor"          
#  [13] "Indometh"               "InsectSprays"           "JohnsonJohnson"         "LakeHuron"              "LifeCycleSavings"       "Loblolly"              
#  [19] "Nile"                   "Orange"                 "OrchardSprays"          "PlantGrowth"            "Puromycin"              "Seatbelts"             
#  [25] "Theoph"                 "Titanic"                "ToothGrowth"            "UCBAdmissions"          "UKDriverDeaths"         "UKgas"                 
#  [31] "USAccDeaths"            "USArrests"              "USJudgeRatings"         "USPersonalExpenditure"  "UScitiesD"              "VADeaths"              
#  [37] "WWWusage"               "WorldPhones"            "ability.cov"            "airmiles"               "airquality"             "anscombe"              
#  [43] "attenu"                 "attitude"               "austres"                "beaver1 (beavers)"      "beaver2 (beavers)"      "cars"                  
#  [49] "chickwts"               "co2"                    "crimtab"                "discoveries"            "esoph"                  "euro"                  
#  [55] "euro.cross (euro)"      "eurodist"               "faithful"               "fdeaths (UKLungDeaths)" "freeny"                 "freeny.x (freeny)"     
#  [61] "freeny.y (freeny)"      "infert"                 "iris"                   "iris3"                  "islands"                "ldeaths (UKLungDeaths)"
#  [67] "lh"                     "longley"                "lynx"                   "mdeaths (UKLungDeaths)" "morley"                 "mtcars"                
#  [73] "nhtemp"                 "nottem"                 "npk"                    "occupationalStatus"     "precip"                 "presidents"            
#  [79] "pressure"               "quakes"                 "randu"                  "rivers"                 "rock"                   "sleep"                 
#  [85] "stack.loss (stackloss)" "stack.x (stackloss)"    "stackloss"              "state.abb (state)"      "state.area (state)"     "state.center (state)"  
#  [91] "state.division (state)" "state.name (state)"     "state.region (state)"   "state.x77 (state)"      "sunspot.month"          "sunspot.year"          
#  [97] "sunspots"               "swiss"                  "treering"               "trees"                  "uspop"                  "volcano"               
# [103] "warpbreaks"             "women" 

So something like this would loop over each one:

for(i in 1:length(datasets)) {
  print(get(datasets[i]))
  cat("\n\n")
}

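It works for the first two datasets (AirPassengers and BJsales), but it fails on BJsales.lead (BJsales), because that object should be referred to as datasets::BJsales.lead.

I suppose I could use string splitting or something similar to discard everything from the space onward, but is there a cleaner way to get a list of all objects in the datasets package?

Note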

ls(getNamespace("datasets"), all.names=TRUE)
# [1] ".__NAMESPACE__."      ".__S3MethodsTable__." ".packageName" 

Tags: r

Solution


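The ?data help page has a note that says:

Where the datasets have a different name from the argument that should be used to retrieve them, the index will have an entry like beaver1 (beavers), which tells us that dataset beaver1 can be retrieved by the call data(beavers).

So the actual object name is whatever comes before the final parenthesized part. Since the value is only returned as a string, you unfortunately need to strip that part off yourself. But you can do that with gsub: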

datanames <- data(package="datasets")$results[,"Item"]
objnames <- gsub("\\s+\\(.*\\)","", datanames)

for(ds in objnames) {
  print(get(ds))
  cat("\n\n")
}
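
As a minimal sketch of the same idea, the cleanup can be wrapped in a small helper and the result checked before looping. The function name dataset_object_names is hypothetical (it is not part of base R), and the pattern is anchored with $ on the assumption that the topic name always appears as a trailing " (topic)" suffix, as in the index entries shown above.

```r
# Hypothetical helper (not part of base R): returns the object names listed
# in a package's data index, with any trailing " (topic)" suffix removed.
dataset_object_names <- function(pkg = "datasets") {
  items <- data(package = pkg)$results[, "Item"]
  # Anchor at the end of the string so only a trailing parenthesized
  # group (and the whitespace before it) is dropped.
  sub("\\s+\\(.*\\)$", "", items)
}

# The pattern applied to two literal index entries
cleaned <- sub("\\s+\\(.*\\)$", "", c("beaver1 (beavers)", "iris"))

objnames <- dataset_object_names()

# Check that every cleaned name resolves to an object in the attached
# datasets package before calling get() on it.
found <- vapply(objnames, exists, logical(1), where = "package:datasets")
missing_names <- objnames[!found]
```

If missing_names is empty, every entry in objnames can be passed to get() as in the loop above. The check relies on the datasets package being attached, which is the default in a standard R session.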
