r - Using an API to pull census data in RStudio
Problem description
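I'm new to R, so apologies if this question seems a bit basic!
My job has me using an API to look at census data, identify some variables for each area, and then create a CSV file that they can review. I believe the code was written out completely for me, but I need to change the variables to: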
S2602_C01_023E - black / his
S2602_C01_081E - unemployment rate
S2602_C01_070E - not US citizen (divide by total population)
S0101_C01_030E - # over 65 (divide by total pop)
S1603_C01_009E - # below poverty (divide by total pop)
S1251_C01_010E - # child under 18 (divide by # households)
S2503_C01_013E - median income
S0101_C01_001E - total population
S2602_C01_078E - in labor force
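I also need to divide some of the variables, as noted above, and export all of it to a single CSV file. I really don't know how to work with the code; I'm lost because I have never used R. I tried changing the variables to the ones I need, but I got an error. Any help would be greatly appreciated!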
library(tidycensus)
library(tidyverse)
library(stringr)
library(haven)
library(profvis)
# list of possible variables
v18 <- load_variables(year = 2018,
                      dataset = "acs5",
                      cache = TRUE)
# function to get variables for all states. Year, variables can be easily edited.
get_census_data <- function(st) {
  Sys.sleep(5)
  df <- get_acs(year = 2018,
                variables = c(totpop = "B01003_001",
                              male = "B01001_002",
                              female = "B01001_026",
                              white_alone = "B02001_002",
                              black_alone = "B02001_003",
                              americanindian_alone = "B02001_004",
                              asian_alone = "B02001_005",
                              nativehaw_alone = "B02001_006",
                              other_alone = "B02001_007",
                              twoormore = "B02001_008",
                              nh = "B03003_002",
                              his = "B03003_003",
                              noncit = "B05001_006",
                              povstatus = "B17001_002",
                              num_households = "B19058_001",
                              SNAP_households = "B19058_002",
                              medhhi = "B19013_001",
                              hsdiploma_25plus = "B15003_017",
                              bachelors_25plus = "B15003_022",
                              greater25 = "B15003_001",
                              inlaborforce = "B23025_002",
                              notinlaborforce = "B23025_007",
                              greater16 = "B23025_001",
                              civnoninstitutional = "B27010_001",
                              withmedicare_male_0to19 = "C27006_004",
                              withmedicare_male_19to64 = "C27006_007",
                              withmedicare_male_65plus = "C27006_010",
                              withmedicare_female_0to19 = "C27006_014",
                              withmedicare_female_19to64 = "C27006_017",
                              withmedicare_female_65plus = "C27006_020",
                              withmedicaid_male_0to19 = "C27007_004",
                              withmedicaid_male_19to64 = "C27007_007",
                              withmedicaid_male_65plus = "C27007_010",
                              withmedicaid_female_0to19 = "C27007_014",
                              withmedicaid_female_19to64 = "C27007_017",
                              withmedicaid_female_65plus = "C27007_020"),
                geography = "tract",
                state = st)
  return(df)
}
#loops over all states
df_list <- setNames(lapply(states, get_census_data), states)
## if you want to keep margin of error, remove everything after %>% in next two lines
final_df <- bind_rows(df_list) %>%
  select(-moe)
colnames(final_df)[3] <- "varname"
# cleaning up final data, making it wide instead of long
final_df_wide <- final_df %>%
  gather(variable, value, -(GEOID:varname)) %>%
  unite(temp, varname, variable) %>%
  spread(temp, value)
# exporting to csv file, adjust your path
# (forward slashes are used because "\U" is not a valid escape in an R string)
write.csv(final_df, "C:/Users/NAME/Documents/acs_2018_tractlevel_dat.a.csv")
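Solution
Since you can't really give a reproducible example without revealing your API key, I'll do my best to work out what should work here:
Let's start by editing the function that pulls the data from the API: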
get_census_data <- function(st) {
  Sys.sleep(5)
  df <- get_acs(year = 2018,
                variables = c(blackHis = "S2602_C01_023E",
                              unEmployRate = "S2602_C01_081E",
                              notUSCit = "S2602_C01_070E"),  # comma needed before the next argument
                geography = "tract",
                state = st)
  return(df)
}
I only entered three of the variables, but you should get the idea.
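For completeness, the full list from the question could be collected in one named vector and passed to `get_acs()` as `variables = acs_vars`. This is only a sketch: the short names on the left are hypothetical labels chosen for illustration, and only the codes themselves come from the question. The S-prefixed codes belong to the ACS subject tables, so their labels can be checked with `load_variables(2018, "acs5/subject")` rather than the `"acs5"` dataset loaded in the original script.

```r
# Named vector of the variable codes listed in the question.
# The names (left-hand side) are hypothetical labels, not official Census names.
acs_vars <- c(
  blackHis     = "S2602_C01_023E",  # black / his
  unEmployRate = "S2602_C01_081E",  # unemployment rate
  notUSCit     = "S2602_C01_070E",  # not US citizen
  over65       = "S0101_C01_030E",  # number over 65
  belowPoverty = "S1603_C01_009E",  # number below poverty
  childUnder18 = "S1251_C01_010E",  # number of children under 18
  medIncome    = "S2503_C01_013E",  # median income
  totPop       = "S0101_C01_001E",  # total population
  inLaborForce = "S2602_C01_078E"   # in labor force
)

# Quick sanity check: every label and every code should be unique.
stopifnot(!anyDuplicated(names(acs_vars)), !anyDuplicated(acs_vars))
```

Keeping the codes in one vector means the function body does not need to change when a variable is added or removed.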
Try whether this works for you and returns the data for each of those variables.
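The question also asks for some variables to be divided by total population and for everything to be exported to CSV. A minimal sketch of that step follows, assuming the long format that `get_acs()` returns (columns `GEOID`, `NAME`, `variable`, `estimate`, `moe`). The data frame `toy_long`, its numbers, the tract names, and the labels `totPop`, `over65`, and `notUSCit` are all made up for illustration so the example runs without an API key. It is worth checking each variable's label first, since some subject-table estimates may already be percentages and should not be divided again.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the output of get_acs(); all values are invented.
toy_long <- data.frame(
  GEOID    = rep(c("01001020100", "01001020200"), each = 3),
  NAME     = rep(c("Tract A", "Tract B"), each = 3),
  variable = rep(c("totPop", "over65", "notUSCit"), times = 2),
  estimate = c(2000, 300, 50, 4000, 1000, 200),
  moe      = c(150, 60, 30, 200, 90, 40),
  stringsAsFactors = FALSE
)

toy_wide <- toy_long %>%
  select(-moe) %>%                               # drop margins of error
  pivot_wider(names_from = variable,             # one column per variable,
              values_from = estimate) %>%        # one row per tract
  mutate(
    share_over65   = over65 / totPop,            # divide by total population
    share_notUSCit = notUSCit / totPop
  )

# Write to a temporary file here; replace with your own path.
out_file <- file.path(tempdir(), "acs_toy_example.csv")
write.csv(toy_wide, out_file, row.names = FALSE)
```

With real data, `toy_long` would be replaced by the combined result of calling `get_census_data()` for each state, and the `mutate()` step would list whichever ratios are actually needed.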