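r - How to "serialize" non-R objects together with R objects
Problem description
Some objects in R are really pointers to lower-level structures (I am not sure that is the right term), and those structures need specialized functions to be saved to disk. For example, saveRDS is not enough to preserve a lightgbm boosted tree: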
## Create a lightgbm booster
library(lightgbm)
data(agaricus.train, package = "lightgbm")
train = agaricus.train
bst = lightgbm(data = train$data,label = train$label,
nrounds = 1, objective = "binary")
## but suppose bst is only one part of a bigger analysis
results = list(bst = bst, metadata = 'other stuff')
## then it would be nice if this IO cycle worked, but the last line crashes R
# saveRDS(results, file = 'so_post_temp')
# rm(results)
# rm(bst)
# lgb.unloader(wipe = TRUE)
# results = readRDS('so_post_temp')
# predict(results$bst, train$data)
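The standard solution is not terrible, but it is enough to annoy me. It requires a separate, lightgbm-specific save function and a separate "companion" file for every analysis I want to save: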
results = list(lgbpath = 'bst.lightgbm', metadata = 'other stuff')
saveRDS(results, file = 'so_post_temp')
lgb.save(bst, file = 'bst.lightgbm')
# destruct:
rm(results)
rm(bst)
lgb.unloader(wipe = TRUE)
# reconstruct:
results = readRDS('so_post_temp')
bst = lgb.load(results$lgbpath)
predict(bst, train$data)
Is there a way to clean this up by somehow bundling the R objects and the other objects into a single file? Something like
fake_pointer_to_disk = [points to some kind of R object instead]
fake_file_object = lgb.save(bst, file = fake_pointer_to_disk)
results = list(bst = fake_file_object, metadata = 'other stuff')
# later loaded as
bst = lgb.load(results$bst)
Solution
I think readBin should be enough:
tf <- tempfile()
lgb.save(bst, file=tf)
# since I don't have lightgbm loaded, this is my fake model/save
bst <- 100:150 # my fake data
writeBin(bst, tf) # poor man's lgb.save :-) (writeBin's second argument is `con`, not `file`)
Now read the file back in as a raw blob:
rawbst <- readBin(tf, raw(), n=file.size(tf))
file.remove(tf)
and save it however you like:
saveRDS(list(bst = rawbst, metadata = 'other stuff'), file = 'so_post_temp')
When you are ready to rehydrate your results and model:
tf2 <- tempfile()
results <- readRDS('so_post_temp')
writeBin(results$bst, tf2)
bst <- lgb.load(tf2)
file.remove(tf2)
(Caveat: under-tested. It works with the fake data; I have not tried it with a real bst-like object.)
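The round trip above can be wrapped in two small helpers so that the temporary files are cleaned up automatically. This is only a sketch: pack_raw and unpack_raw are hypothetical names (they are not part of lightgbm or base R), and the "model" here is a plain character vector saved with writeLines, standing in for a booster so that the example runs without lightgbm installed.

```r
# Hypothetical helpers, not part of lightgbm or base R.

# Serialize `obj` with `save_fun(obj, path)` and return the file's bytes.
pack_raw <- function(obj, save_fun) {
  tf <- tempfile()
  on.exit(unlink(tf), add = TRUE)   # remove the temp file when done
  save_fun(obj, tf)
  readBin(tf, what = "raw", n = file.size(tf))
}

# Write the bytes to a temp file and rebuild the object with `load_fun(path)`.
unpack_raw <- function(bytes, load_fun) {
  tf <- tempfile()
  on.exit(unlink(tf), add = TRUE)
  writeBin(bytes, tf)
  load_fun(tf)
}

# Stand-in "model" and saver/loader (assumptions for illustration only).
fake_model <- c("tree 1", "tree 2")
fake_save <- function(obj, path) writeLines(obj, path)
fake_load <- function(path) readLines(path)

# Bundle the raw bytes with ordinary R objects in a single RDS file.
rds_path <- tempfile(fileext = ".rds")
saveRDS(list(bst = pack_raw(fake_model, fake_save), metadata = "other stuff"),
        file = rds_path)

# Later: read everything back and rebuild the model from the bytes.
results <- readRDS(rds_path)
restored <- unpack_raw(results$bst, fake_load)
identical(restored, fake_model)
```

With lightgbm, the same pattern would presumably be pack_raw(bst, lgb.save) on the way out and unpack_raw(results$bst, lgb.load) on the way back in, since both functions take the file path as their second and first argument respectively; like the answer above, this has not been tested here against a real booster.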