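ruby-on-rails - Best way to populate a DB from a large Ruby/JSON file?
Problem description
Suppose I have a Ruby or JSON file made up of hashes, 14-20 MB in size (300K lines unminified). I created a rake task that iterates over each hash and creates AR objects from the values in each hash.
Unfortunately, because of the file's size, I get a stack level too deep error every time I run the task. The only way I have actually gotten the script to run is by splitting the file into smaller files. That works, but splitting the file and re-running the task over and over gets very tedious. What are good options for loading/running a large file?
Rake task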
namespace :db do
  task populate: :environment do
    $restaurants.each_with_index do |r, index|
      uri = URI(r[:website])
      restaurant = Restaurant.find_or_create_by(name: r[:name], website: "#{uri.scheme}://#{uri.host}")
      restaurant.cuisines = r[:cuisines].map { |c| Cuisine.find_or_create_by(name: c) }
      location = Location.create(
        restaurant: restaurant,
        city_id: 1,
        address: r[:address],
        latitude: r[:latitude],
        longitude: r[:longitude],
        phone_number: r[:phone_number]
      )
      r[:hours].each do |h|
        Hour.create(
          location: location,
          day: Date::DAYNAMES.index(h[:day]),
          opens: h[:opens],
          closes: h[:closes]
        )
      end
      menu_group = MenuGroup.create(
        restaurant: restaurant,
        locations: [location],
        address: r[:address]
      )
      r[:menus].each do |m|
        menu = Menu.create(
          menu_group: menu_group,
          position: m[:position],
          name: m[:name]
        )
        m[:sections].each do |s|
          section = Section.create(
            menu: menu,
            position: s[:position],
            name: s[:name]
          )
          s[:dishes].each do |d|
            tag = Tag.find_or_create_by(
              name: d[:name].downcase.strip
            )
            Dish.find_or_create_by(
              restaurant: restaurant,
              sections: [section],
              tags: [tag],
              name: d[:name],
              description: d[:description]
            )
          end
        end
      end
      puts "#{index + 1} of #{$restaurants.size} completed"
    end
  end
end
Error
rake aborted!
SystemStackError: stack level too deep
/usr/local/lib/ruby/gems/2.5.0/gems/bootsnap-1.3.1/lib/bootsnap/compile_cache/iseq.rb:12:in `to_binary'
/usr/local/lib/ruby/gems/2.5.0/gems/bootsnap-1.3.1/lib/bootsnap/compile_cache/iseq.rb:12:in `input_to_storage'
/usr/local/lib/ruby/gems/2.5.0/gems/bootsnap-1.3.1/lib/bootsnap/compile_cache/iseq.rb:37:in `fetch'
/usr/local/lib/ruby/gems/2.5.0/gems/bootsnap-1.3.1/lib/bootsnap/compile_cache/iseq.rb:37:in `load_iseq'
/usr/local/lib/ruby/gems/2.5.0/gems/bootsnap-1.3.1/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:21:in `require'
/usr/local/lib/ruby/gems/2.5.0/gems/bootsnap-1.3.1/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:21:in `block in require_with_bootsnap_lfi'
/usr/local/lib/ruby/gems/2.5.0/gems/bootsnap-1.3.1/lib/bootsnap/load_path_cache/loaded_features_index.rb:65:in `register'
/usr/local/lib/ruby/gems/2.5.0/gems/bootsnap-1.3.1/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:20:in `require_with_bootsnap_lfi'
/usr/local/lib/ruby/gems/2.5.0/gems/bootsnap-1.3.1/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:29:in `require'
/usr/local/lib/ruby/gems/2.5.0/gems/activesupport-5.2.0/lib/active_support/dependencies.rb:283:in `block in require'
/usr/local/lib/ruby/gems/2.5.0/gems/activesupport-5.2.0/lib/active_support/dependencies.rb:249:in `load_dependency'
/usr/local/lib/ruby/gems/2.5.0/gems/activesupport-5.2.0/lib/active_support/dependencies.rb:283:in `require'
/Users/user/app/lib/tasks/populate.rake:1:in `<main>'
/usr/local/lib/ruby/gems/2.5.0/gems/bootsnap-1.3.1/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:50:in `load'
/usr/local/lib/ruby/gems/2.5.0/gems/bootsnap-1.3.1/lib/bootsnap/load_path_cache/core_ext/kernel_require.rb:50:in `load'
/usr/local/lib/ruby/gems/2.5.0/gems/activesupport-5.2.0/lib/active_support/dependencies.rb:277:in `block in load'
/usr/local/lib/ruby/gems/2.5.0/gems/activesupport-5.2.0/lib/active_support/dependencies.rb:249:in `load_dependency'
/usr/local/lib/ruby/gems/2.5.0/gems/activesupport-5.2.0/lib/active_support/dependencies.rb:277:in `load'
/usr/local/lib/ruby/gems/2.5.0/gems/railties-5.2.0/lib/rails/engine.rb:650:in `block in run_tasks_blocks'
/usr/local/lib/ruby/gems/2.5.0/gems/railties-5.2.0/lib/rails/engine.rb:650:in `each'
/usr/local/lib/ruby/gems/2.5.0/gems/railties-5.2.0/lib/rails/engine.rb:650:in `run_tasks_blocks'
/usr/local/lib/ruby/gems/2.5.0/gems/railties-5.2.0/lib/rails/application.rb:515:in `run_tasks_blocks'
/usr/local/lib/ruby/gems/2.5.0/gems/railties-5.2.0/lib/rails/engine.rb:459:in `load_tasks'
/Users/user/app/Rakefile:6:in `<top (required)>'
/usr/local/lib/ruby/gems/2.5.0/gems/rake-12.3.1/exe/rake:27:in `<top (required)>'
(See full trace by running task with --trace)
Solution
I would use something like Sidekiq to break the work up into workers that can run concurrently.
For example:
$restaurants.each_with_index do |r, index|
  RestaurantParser.perform_async(r, index)
end
Inside RestaurantParser, perform the steps you would normally take.
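As a rough illustration, the worker's perform method might look something like the sketch below. It is written as plain Ruby so that it runs without Sidekiq or the app's models; the class body, the sample record, and the returned hash are assumptions made for illustration, not code from the original answer. One detail worth knowing: Sidekiq serializes job arguments as JSON, so a hash enqueued with symbol keys reaches the worker with string keys. The sketch simulates that with a JSON round trip.

```ruby
require "json"
require "uri"

# Hypothetical worker. In a real app this class would `include Sidekiq::Worker`
# and #perform would create the ActiveRecord objects from the rake task.
class RestaurantParser
  # r arrives with string keys because job arguments are serialized as JSON.
  def perform(r, index)
    uri = URI(r.fetch("website"))
    attrs = { name: r.fetch("name"), website: "#{uri.scheme}://#{uri.host}" }
    # Restaurant.find_or_create_by(attrs) and the nested creates would go here.
    puts "restaurant #{index + 1}: #{attrs[:name]}"
    attrs
  end
end

# Simulate the queue's JSON round trip on a made-up sample record.
sample = { name: "Example Diner", website: "https://example.com/menu" }
payload = JSON.parse(JSON.generate(sample))
RestaurantParser.new.perform(payload, 0)
```

In other words, code moved from the rake task into the worker would need r["name"] rather than r[:name], or would need to convert the keys first.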
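As long as a restaurant does not depend on other restaurants already existing in the database, you can run the workers concurrently to speed up the process.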
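A separate point, not part of the answer above: the trace shows the error being raised from a require at populate.rake:1, inside Bootsnap's instruction-sequence cache (to_binary), which suggests the failure may happen while the data file is compiled as Ruby source rather than inside the loop. If the data is available as JSON, one option to try is parsing it as data instead of requiring it as code. The sketch below assumes a JSON file containing an array of restaurant objects; load_restaurants is a hypothetical helper name.

```ruby
require "json"

# Hypothetical helper: read a JSON array of restaurant objects from disk.
# symbolize_names: true keeps the r[:name]-style access used in the rake task,
# including inside nested hashes and arrays.
def load_restaurants(path)
  JSON.parse(File.read(path), symbolize_names: true)
end
```

With a helper like this, the task could assign the result of load_restaurants to a local variable inside the task block and iterate over it, instead of relying on a required file to set $restaurants.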