ruby-on-rails - Do I have to declare a constant to import my module's data from my scraper file?
Problem description
I'm not sure what I'm doing wrong. I want to import the data from my scraper file, scraper.rb, into the application. I can't understand why I'm getting this error, or why I would have to declare a constant named Scraper, as the error suggests.
Puma caught this error: expected file /Users/jmwofford/Desktop/Dev/scratchpad/scratch2_PRIMARY/projects/rails_scraper/scraperProj/app/controllers/scraper.rb to define constant Scraper, but didn't (Zeitwerk::NameError)
My code is given below.
scraper.rb
require 'net/http'
require 'uri'
require 'json'
require "awesome_print"
require 'nokogiri'
require 'httparty'
require 'mechanize'
module ScraperFinder
  def scrape_essential_data
    uri = URI.parse("https://buildout.com/plugins/4b4283d94258de190a1a5163c34c456f6b1294a2/inventory")
    request = Net::HTTP::Get.new(uri)
    request.content_type = "application/x-www-form-urlencoded; charset=UTF-8"
    request["Authority"] = "buildout.com"
    request["Accept"] = "application/json, text/javascript, */*; q=0.01"
    request["X-Newrelic-Id"] = "Vg4GU1RRGwIJUVJUAwY="
    request["Dnt"] = "1"
    request["X-Requested-With"] = "XMLHttpRequest"
    request["User-Agent"] = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36"
    request["Origin"] = "https://buildout.com"
    request["Sec-Fetch-Site"] = "same-origin"
    request["Sec-Fetch-Mode"] = "cors"
    request["Sec-Fetch-Dest"] = "empty"
    request["Referer"] = "https://buildout.com/plugins/4b4283d94258de190a1a5163c34c456f6b1294a2/leasespaces.jll.com/inventory/?pluginId=0&iframe=true&embedded=true&cacheSearch=true&=undefined"
    request["Accept-Language"] = "en-US,en;q=0.9"
    req_options = {
      use_ssl: uri.scheme == "https",
    }
    response = Net::HTTP.start(uri.hostname, uri.port, req_options) do |http|
      http.request(request)
    end
    json = JSON.parse(response.body)
    props = json['inventory']
    props.each do |listing|
      property = {
        'id' => listing['id'],
        'name' => listing['name'],
        'address' => listing['address_one_line'],
        'description' => listing['id'],
        'property_type' => listing['property_sub_type_name'],
        'attr' => listing['index_attributes'],
        'latitude' => listing['latitude'],
        'longitude' => listing['longitude'],
        'picture' => listing['photo_url'],
        'sizing' => listing['size_summary'],
        'link' => listing['show_link'],
        'brokerContacts' => listing['broker_contacts']
      }
      Property.create(
        name: property.name,
        address: property.address,
        description: property.description,
        property_type: property.property_type,
        lat: property.latitude,
        lon: property.longitude,
        pic: property.picture,
        size: property.sizing,
        link: property.link,
        brokerContact: property.brokerContacts
      )
      p "==========================================================================================="
      # pp property
    end
  end
end
users_controller.rb
require_relative ("./scraper.rb")
include ScraperFinder

class UsersController < ApplicationController
  def index
    @scraped = ScraperFinder.scrape_essential_data
  end
end
index.html.erb
<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <title></title>
    <meta name="description" content="">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <link rel="stylesheet" href="">
  </head>
  <body>
    <% @scraped.each do |s| %>
      <div class="prop_container"> <%= s %> </div>
    <% end %>
    <script src=""></script>
  </body>
</html>
Schema
create_table "properties", force: :cascade do |t|
  t.string "name"
  t.string "address"
  t.string "description"
  t.string "property_type"
  t.string "attr"
  t.string "lat"
  t.string "lon"
  t.string "pic"
  t.string "size"
  t.string "link"
  t.string "brokerContact"
  t.datetime "created_at", precision: 6, null: false
  t.datetime "updated_at", precision: 6, null: false
end

create_table "users", force: :cascade do |t|
  t.string "name"
  t.string "email"
  t.datetime "created_at", precision: 6, null: false
  t.datetime "updated_at", precision: 6, null: false
end
Solution
Zeitwerk (the autoloader used in Rails 6+) assumes that you declare constants in files named after the constant, so scraper.rb is expected to declare the constant Scraper. Unlike the classic autoloader, Zeitwerk walks your autoload directories at boot and indexes every file, which is why it complains about the Scraper constant even though you never reference it.
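The naming convention can be illustrated with a tiny sketch (the paths are illustrative, and this is only an approximation of Zeitwerk's real inflection logic):

```ruby
# Zeitwerk derives the expected constant from the file path by camelizing
# the basename:
#
#   app/controllers/scraper.rb          -> Scraper
#   app/controllers/users_controller.rb -> UsersController
#   app/lib/scraper_finder.rb           -> ScraperFinder
#
# A minimal approximation of that mapping:
def expected_constant(path)
  File.basename(path, ".rb")             # strip directory and extension
      .split("_").map(&:capitalize).join # camelize: scraper_finder -> ScraperFinder
end

puts expected_constant("app/controllers/scraper.rb") # "Scraper"
puts expected_constant("app/lib/scraper_finder.rb")  # "ScraperFinder"
```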
You can configure Zeitwerk to ignore certain folders, but you really should get with the program and adapt your code to the autoloader. Start by renaming the file to scraper_finder.rb. It also doesn't belong in the controllers directory, since it isn't a controller; placing it in app/lib or app/clients or just about anywhere else is more appropriate.
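If you did decide to opt the file out instead (not recommended here), Rails exposes Zeitwerk's ignore API; a sketch, assuming the file stays at its current path and the snippet lives in a hypothetical initializer:

```ruby
# config/initializers/zeitwerk.rb (hypothetical location)
# Tell the main autoloader to skip this file entirely; you then become
# responsible for loading it yourself with require/require_relative.
Rails.autoloaders.main.ignore(
  Rails.root.join("app/controllers/scraper.rb")
)
```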
That's really just the tip of the iceberg though, because this code is pretty rough. What you actually want is something like this:
# app/lib/scraper_finder.rb
require 'net/http'
# You don't need to require gems as they are required by bundler during startup

module ScraperFinder
  URL = "https://buildout.com/plugins/4b4283d94258de190a1a5163c34c456f6b1294a2/inventory"

  # You need to use self to make the method callable as `ScraperFinder.scrape_essential_data`
  def self.scrape_essential_data
    uri = URI.parse(URL) # parse once so the scheme/host are available here and in `get`
    req_options = {
      use_ssl: uri.scheme == "https",
    }
    response = Net::HTTP.start(uri.hostname, uri.port, req_options) do |http|
      http.request(get(uri))
    end
    json = JSON.parse(response.body)
    json['inventory'].map do |listing|
      Property.create(extract_attributes(listing))
    end
  end

  def self.extract_attributes(listing)
    # `slice` keeps only the listed keys; note the key is `brokerContact`,
    # matching the column in the schema above
    listing.slice('name').symbolize_keys.merge(
      description: listing['id'],
      property_type: listing['property_sub_type_name'],
      attr: listing['index_attributes'],
      lat: listing['latitude'],
      lon: listing['longitude'],
      pic: listing['photo_url'],
      size: listing['size_summary'],
      link: listing['show_link'],
      brokerContact: listing['broker_contacts']
    )
  end

  def self.get(uri)
    Net::HTTP::Get.new(uri).then do |req|
      req.content_type = "application/x-www-form-urlencoded; charset=UTF-8"
      req["Authority"] = "buildout.com"
      req["Accept"] = "application/json, text/javascript, */*; q=0.01"
      req["X-Newrelic-Id"] = "Vg4GU1RRGwIJUVJUAwY="
      req["Dnt"] = "1"
      req["X-Requested-With"] = "XMLHttpRequest"
      req["User-Agent"] = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/85.0.4183.102 Safari/537.36"
      req["Origin"] = "https://buildout.com"
      req["Sec-Fetch-Site"] = "same-origin"
      req["Sec-Fetch-Mode"] = "cors"
      req["Sec-Fetch-Dest"] = "empty"
      req["Referer"] = "https://buildout.com/plugins/4b4283d94258de190a1a5163c34c456f6b1294a2/leasespaces.jll.com/inventory/?pluginId=0&iframe=true&embedded=true&cacheSearch=true&=undefined"
      req["Accept-Language"] = "en-US,en;q=0.9"
    end
  end

  # `private` has no effect on singleton methods, so mark them explicitly
  private_class_method :extract_attributes, :get
end
# app/controllers/users_controller.rb
class UsersController < ApplicationController
  def index
    @scraped = ScraperFinder.scrape_essential_data
  end
end
Hashes in Ruby are not like objects in JavaScript or a Struct, so your code will raise a NoMethodError on property.name. To read a hash key you use brackets: property['name']. But as you can see, all that duplication was never warranted in the first place, since Ruby has excellent methods for working with hashes.
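The difference is easy to demonstrate in plain Ruby (the hash below is made up for illustration; `symbolize_keys` comes from ActiveSupport, and outside Rails `transform_keys(&:to_sym)` does the same job):

```ruby
property = { 'name' => 'Tower One', 'address' => '1 Main St' }

# property.name            # NoMethodError: hashes have no attribute readers
property['name']           # => "Tower One"  (bracket access works)

# slice picks a subset of keys, transform_keys converts them to symbols,
# and merge bolts on the renamed/derived keys -- the pattern used above:
attrs = property.slice('name')
                .transform_keys(&:to_sym)
                .merge(address: property['address'])
attrs # => { name: "Tower One", address: "1 Main St" }
```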
Your method is also declared as an instance method, yet you call it as ScraperFinder.scrape_essential_data. To make it a module method you need to declare it with def self.scrape_essential_data.
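The distinction in a nutshell (module and class names here are made up for illustration):

```ruby
module Finder
  def instance_style     # instance method: only usable where Finder is included
    :instance
  end

  def self.module_style  # singleton/module method: callable on Finder itself
    :module
  end
end

Finder.module_style      # => :module
# Finder.instance_style  # NoMethodError: not defined on the module itself

class Importer
  include Finder         # mixes the instance method into Importer
end
Importer.new.instance_style # => :instance
```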
A quick refactor also splits this clunky monster into three separate methods that are far easier to reason about.
You don't need to manually require your own code when it lives in the app directory, and doing so only introduces bugs. Use the autoloader.
include ScraperFinder copies all the methods of the module into the global scope, because you're calling it outside any module/class! And since your module apparently only needs singleton methods (methods called on the module itself), you don't need to include it anywhere.
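That side effect is easy to reproduce: a top-level include mixes the module into Object, so its methods leak onto everything (a minimal sketch with a made-up module):

```ruby
module Greeter
  def greet
    "hello"
  end
end

include Greeter   # at the top level this reopens Object!

greet                    # "hello" -- now callable everywhere...
[].greet                 # "hello" -- ...including on every object
Object.include?(Greeter) # true
```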