r - How to loop multiple .Rdata objects from AWS/S3 into a list in R
Problem description
So I have several large .Rdata objects stored in one bucket of PUMS data. There are 10 files I need to load into R (10 years of data) to do analyses. I am having issues loading or looping multiple files from S3 into R.
Here is how to load one object at a time:
s3load("dataUS19.Rdata", bucket = "my bucket")
Loading them one at a time creates RAM issues, so I built a data frame of the bucket's contents and then tried this loop:
library(aws.s3)
awsDF <- get_bucket_df("my bucket") # list the bucket contents as a data frame
data <- list() # creating list (overwritten on the next line)
data <- awsDF$Key[grep("dataUS", awsDF$Key)] # keep only the .Rdata objects whose names contain dataUS
for (match in data) {
  s3load(object = match, bucket = "my bucket")
}
The issue is that the loop does load multiple objects, but it does not store them in a list. They load as separate data frames/objects in the workspace, which creates RAM issues (I am able to load about 6 of the files).
I am not a programmer and was trained in Stata so any help to get multiple .Rdata objects in a list from S3 would be greatly appreciated.
Solution
Consider loading with environments. Similar to base R's load, aws.s3's s3load maintains an envir argument.
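library(aws.s3)

# Keys of the .Rdata objects whose names contain "dataUS"
rdata <- awsDF$Key[grep("dataUS", awsDF$Key)]
data_list <- lapply(rdata, function(file) {
  # Load into a fresh environment so nothing lands in the global workspace
  s3load(file, bucket = "my bucket", envir = (temp_env <- new.env()))
  as.list.environment(temp_env)
})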
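The same pattern can be tried locally before touching S3. The sketch below is only an illustration: it uses base R's load in place of s3load, and the temporary files and toy data frames (dataUS18, dataUS19) are made-up stand-ins for the real bucket objects.

```r
# Local stand-ins for the S3 objects: two temporary .Rdata files,
# each holding one small, made-up data frame.
tmp_dir <- tempfile("rdata_demo")
dir.create(tmp_dir)
dataUS18 <- data.frame(year = 2018, x = 1:3)
dataUS19 <- data.frame(year = 2019, x = 4:6)
save(dataUS18, file = file.path(tmp_dir, "dataUS18.Rdata"))
save(dataUS19, file = file.path(tmp_dir, "dataUS19.Rdata"))
rm(dataUS18, dataUS19)  # make sure nothing is left in the workspace

files <- list.files(tmp_dir, pattern = "^dataUS.*\\.Rdata$", full.names = TRUE)

demo_list <- lapply(files, function(file) {
  temp_env <- new.env()
  load(file, envir = temp_env)   # objects land in temp_env, not the global environment
  as.list.environment(temp_env)  # convert the environment to a named list
})

str(demo_list, max.level = 2)
```

Each element of demo_list is itself a named list holding whatever objects the corresponding file contained, and the global workspace stays clean.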
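If each .Rdata file contains only one object, extract the first item as the last line of the function. Using [[1]] returns the object itself, whereas [1] would return a one-element list:
as.list.environment(temp_env)[[1]]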
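For the one-object-per-file case, a flat list of data frames named after their files may be easier to work with. This is a minimal local sketch, not the answer's own code: load_single is a hypothetical helper, base R's load stands in for s3load, and the files and data are invented for the demonstration.

```r
# Hypothetical helper: load a file that holds exactly one object and return that object.
load_single <- function(file) {
  temp_env <- new.env()
  load(file, envir = temp_env)   # with S3, an s3load(..., envir = temp_env) call would go here
  objs <- as.list.environment(temp_env)
  stopifnot(length(objs) == 1)   # fail loudly if the one-object assumption does not hold
  objs[[1]]
}

# Local stand-ins for the S3 objects (made-up data).
tmp_dir <- tempfile("single_demo")
dir.create(tmp_dir)
for (yr in 18:19) {
  df <- data.frame(year = 2000 + yr, x = 1:3)
  save(df, file = file.path(tmp_dir, paste0("dataUS", yr, ".Rdata")))
}
rm(df, yr)

files <- list.files(tmp_dir, pattern = "\\.Rdata$", full.names = TRUE)
flat_list <- lapply(files, load_single)

# Name each element after its file, e.g. "dataUS18", "dataUS19".
names(flat_list) <- tools::file_path_sans_ext(basename(files))
```

With names attached, a single year can be pulled out as flat_list$dataUS19 or flat_list[["dataUS19"]].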