r - shape file > data frame > using ggplot and joining on GEO ID with other data sets
Problem description
The county shape file is only available as a national shape file (2017 TIGER/Line® Shapefiles: Counties (and equivalent), https://www.census.gov/cgi-bin/geo/shapefiles/index.php?year=2017&layergroup=Counties+%28and+equivalent%29).
I want to select just one state (e.g. Pennsylvania). So,
# read county polygons
counties <- readOGR(dsn="tl_2017_us_county", layer="tl_2017_us_county")
# subset to PA counties
PA_counties <- subset(counties, counties@data$STATEFP == "42")
HOWEVER, when I try and create a data frame and map, I'm getting the error: Error in FUN(X[[i]], ...) : object 'lon' not found
# create a data frame
PA_counties.df <- as.data.frame(PA_counties)
PA_counties.dfFORT <-fortify(PA_counties.df, region = "GEOID")
gg<-ggplot()
gg <- gg +geom_polygon(data =PA_counties.dfFORT, aes(x=lon, y=lat, group=group,
fill=NA), color = "blue", fill=NA, size = 0.5)
gg <- gg +coord_map()
gg
Help? I'm hoping to create this map; merge the data with another file by GeoID, and fill some of the counties (e.g. for GeoID xxx if =1 then fill with blue, etc).
This must be a very common mapping use case scenario? Grateful for any tips?
Best, Lori
Solution
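I did some troubleshooting, and I'd encourage you to do the same. First, look at the names of the fortified data frame: you have columns such as INTPTLAT, INTPTLON, and GROUP rather than lat, long, and group. (Note also that fortify names the longitude column long, not lon.)
When you call as.data.frame on the spatial data frame and then call fortify, you aren't getting what I think you expect. If you look closely at the output of those two functions, it appears to be a centroid or some other point, so you have only one point per county, and the coordinates are factors, not numerics. Instead, you need to call fortify on the spatial data frame itself. You should expect a data frame with several thousand rows, because it takes many points to make up the polygon shape of each county.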
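To make the factor problem concrete, here is a minimal base R sketch. The data frame `attrs` is a hypothetical stand-in for what as.data.frame returns on the county layer, and the coordinate values are made up for illustration; only the column names follow the TIGER/Line layout.

```r
# Hypothetical stand-in for the as.data.frame() output:
# one row per county, interior-point coordinates stored as factors.
# The coordinate values are illustrative, not real Census values.
attrs <- data.frame(
  GEOID    = c("42001", "42003"),
  INTPTLAT = factor(c("+39.87", "+40.47")),
  INTPTLON = factor(c("-077.22", "-079.98"))
)

nrow(attrs)           # one row per county, far too few points for an outline
sapply(attrs, class)  # the coordinate columns are factors, not numeric

# as.numeric() on a factor returns the internal level codes, not the values
codes <- as.numeric(attrs$INTPTLAT)

# Going through character recovers the actual numbers
lat <- as.numeric(as.character(attrs$INTPTLAT))
lon <- as.numeric(as.character(attrs$INTPTLON))
```

Even after conversion, these are single points per county, so they could be drawn with geom_point but never with geom_polygon.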
Note that I used tigris::counties to get the shapefile, since I couldn't read in a downloaded file to make a reprex, but I'm pretty sure the shapefile is the same.
library(tidyverse)
library(sf)
library(rgdal)
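# In 2018 this returned an sp object; newer tigris versions return sf by default,
# in which case the @data subsetting below does not apply (use the sf approach instead)
counties <- tigris::counties(cb = TRUE)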
# counties <- readOGR(dsn="tl_2017_us_county", layer="tl_2017_us_county")
PA_counties <- subset(counties, counties@data$STATEFP == "42")
PA_counties.dfFORT <- fortify(PA_counties, region = "GEOID")
names(PA_counties.dfFORT)
#> [1] "long" "lat" "order" "hole" "piece" "id" "group"
Then you can use geom_polygon as expected:
ggplot(PA_counties.dfFORT, aes(x = long, y = lat, group = group)) +
geom_polygon(fill = NA, color = "blue") +
coord_map()
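An easier and more flexible approach is to use sf. Again, you could use sf::read_sf on the shapefile; to make a reprex, I called sf::st_as_sf on the spatial data frame I got from tigris. sf lets you use dplyr-style functions to do things like filter, add columns, and compute summaries.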
counties_sf <- st_as_sf(counties)
# counties_sf <- read_sf("tl_2017_us_county")
pa_counties_sf <- counties_sf %>%
filter(STATEFP == "42")
head(pa_counties_sf)
#> Simple feature collection with 6 features and 9 fields
#> geometry type: MULTIPOLYGON
#> dimension: XY
#> bbox: xmin: -80.36087 ymin: 39.72002 xmax: -74.7215 ymax: 40.74368
#> epsg (SRID): 4269
#> proj4string: +proj=longlat +ellps=GRS80 +towgs84=0,0,0,0,0,0,0 +no_defs
#> STATEFP COUNTYFP COUNTYNS AFFGEOID GEOID NAME LSAD
#> 1 42 003 01213657 0500000US42003 42003 Allegheny 06
#> 2 42 017 01209173 0500000US42017 42017 Bucks 06
#> 3 42 041 01209176 0500000US42041 42041 Cumberland 06
#> 4 42 055 01213670 0500000US42055 42055 Franklin 06
#> 5 42 061 01213672 0500000US42061 42061 Huntingdon 06
#> 6 42 071 01209181 0500000US42071 42071 Lancaster 06
#> ALAND AWATER geometry
#> 1 1890889706 37411488 MULTIPOLYGON (((-80.36078 4...
#> 2 1565408723 45201371 MULTIPOLYGON (((-75.48406 4...
#> 3 1412834155 12521844 MULTIPOLYGON (((-77.62503 4...
#> 4 2000052118 1544300 MULTIPOLYGON (((-78.09914 3...
#> 5 2265337403 37851955 MULTIPOLYGON (((-78.2567 40...
#> 6 2444606396 103423673 MULTIPOLYGON (((-76.72162 4...
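As a rough illustration of the "add columns and compute summaries" part, here is a small sketch on a toy sf object. The object `toy`, its attribute values, and the point geometries are all made up so the example runs without downloading anything; it also assumes ALAND is numeric and in square meters, which may not hold for every version of the file.

```r
library(sf)
library(dplyr)

# Toy stand-in for counties_sf: made-up values, points instead of polygons
toy <- st_sf(
  STATEFP  = c("42", "42", "36"),
  NAME     = c("A", "B", "C"),
  ALAND    = c(2e9, 1e9, 3e9),
  geometry = st_sfc(st_point(c(0, 0)), st_point(c(1, 0)), st_point(c(2, 0)))
)

# filter() and mutate() keep the geometry column, so the result is still sf
pa <- toy %>%
  filter(STATEFP == "42") %>%
  mutate(land_km2 = ALAND / 1e6)

# Drop the geometry first when only an attribute summary is wanted
total <- pa %>%
  st_drop_geometry() %>%
  summarise(n = n(), land_km2 = sum(land_km2))
```

Dropping the geometry before summarising is a design choice here: calling summarise directly on an sf object also unions the geometries, which is slower and not needed for a plain table.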
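sf is also easy to use with ggplot, because as of version 3.0.0 of ggplot2, released a few weeks ago, ggplot ships with a geom_sf function that automatically reads the geometry column of an sf object. You can go directly from reading the shapefile to plotting it.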
ggplot(pa_counties_sf) +
geom_sf(fill = NA, color = "blue")
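As an example of how you might merge your shapes with another dataset, I made some dummy data with a letter for each county, then called dplyr::left_join to join it to the sf data frame, and then plotted it. Hope that helps you get started!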
# dummy data: one row per county, randomly labeled "A" or "B"
pa_data <- tibble(
GEOID = pa_counties_sf$GEOID
) %>%
mutate(type = sample(c("A", "B"), size = nrow(.), replace = TRUE))
pa_counties_sf %>%
left_join(pa_data, by = "GEOID") %>%
ggplot() +
geom_sf(aes(fill = type), color = "blue")
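For the specific case in the question (fill a county blue when a flag equals 1), one possible sketch is below. Everything in it is hypothetical: `shapes` is a pair of made-up squares standing in for pa_counties_sf, `flags` stands in for the external data set, and the column name `flag` is an assumption. The key points are that GEOID should be character on both sides of the join and that the flag is turned into a factor so it can be mapped to discrete colors.

```r
library(sf)
library(dplyr)
library(ggplot2)

# Helper that builds a unit square starting at x0 (made-up geometry)
square <- function(x0) {
  st_polygon(list(rbind(
    c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0)
  )))
}

# Stand-in for pa_counties_sf with two fake "counties"
shapes <- st_sf(
  GEOID    = c("42001", "42003"),
  geometry = st_sfc(square(0), square(1), crs = 4269)
)

# Hypothetical external data keyed by GEOID (kept as character, not numeric)
flags <- data.frame(GEOID = c("42001", "42003"), flag = c(1, 0))

joined <- shapes %>%
  left_join(flags, by = "GEOID") %>%
  mutate(flag = factor(flag))

# Map flag 1 to blue and flag 0 to white; unmatched counties show as grey
p <- ggplot(joined) +
  geom_sf(aes(fill = flag), color = "blue") +
  scale_fill_manual(values = c("0" = "white", "1" = "blue"), na.value = "grey90")
```

Printing `p` draws the map. If the other file stores GEOID as a number, leading zeros are lost for some states, so converting it back to a zero-padded string before joining is usually necessary.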
Created on 2018-07-15 by the reprex package (v0.2.0).