nil is not a symbol nor a string

Question

I am trying to use Kimurai to scrape a website. When I hit /scrape, I get this error.

def scrape
  url = "https://www.tripadvisor.com/Restaurants-g31892-Rogers_Arkansas.html"
  response = RestaurantsScraper.parse!(response, url, data: {})
  if response[:status] == :completed && response[:error].nil?
    flash.now[:notice] = "Successfully scraped url"
  else
    flash.now[:alert] = response[:error]
  end
end

Here is my scraper class:

class RestaurantsScraper < Kimurai::Base
  @name = "restaurants_scraper"
  @driver = :selenium_chrome
  @start_urls = ["https://www.tripadvisor.com/Restaurants-g31892-Rogers_Arkansas.html"]

  def parse(response, url:, data: {})
    # The class name must be quoted inside the XPath predicate.
    response.xpath("//div[@class='_1llCuDZj']").each do |a|
      request_to :parse_repo_page, url: absolute_url(a[:href], base: url)
    end
  end

  def parse_repo_page(response, url:, data: {})
    item = {}
    # Query the response document passed to this method.
    item["title"] = response.css('a._15_ydu6b')&.text&.squish&.gsub('[^0-9].', '')
    item["type"] = response.css('span._1p0FLy4t')&.text&.squish
    item["reviews"] = response.css('span.w726Ki5B').text&.squish
    # Two classes on the same element are chained without a space.
    item["top_reviews"] = response.css('a._2uEVo25r._3mPt7dFq').text&.squish

    Restaurant.where(item).first_or_create
  end
end

This is the error I get: nil is not a symbol nor a string

Tags: ruby-on-rails, ruby, web-scraping, kimurai

Answer


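This is because the response passed to RestaurantsScraper.parse!(response, url, data: {}) is not defined: it is read on the same line that first assigns it, so its value is still nil.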
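
To see why an undefined response can surface as this particular error, here is a minimal sketch in plain Ruby with no Kimurai involved. FakeSpider and run are hypothetical names, and the sketch assumes the callee dispatches on its first argument with public_send; that is a guess about what parse! does internally, not something confirmed here.

```ruby
# FakeSpider is a hypothetical stand-in, not Kimurai::Base.
class FakeSpider
  def parse(url)
    "parsed #{url}"
  end

  # Assumption: the first argument is treated as a method name.
  def self.run(handler, url)
    new.public_send(handler, url)
  end
end

url = "https://example.com/"

# Ruby creates the local variable `response` as soon as it parses the
# assignment, so `response` on the right-hand side is nil rather than
# raising a NameError.
response =
  begin
    FakeSpider.run(response, url)
  rescue TypeError => e
    e.message
  end

puts response

# Passing an actual method name dispatches normally.
ok = FakeSpider.run(:parse, url)
puts ok
```

If that assumption holds for Kimurai, the first argument to parse! would need to be a method name such as :parse rather than a document, so it is worth checking the Kimurai README before relying on either form.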

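According to the Kimurai documentation, you need to pass a Nokogiri::HTML::Document object.

I have not used Kimurai, and I suspect there is a better way to do this, but the following may be enough to get you to the next step: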

require 'open-uri'

def scrape
  url = "https://www.tripadvisor.com/Restaurants-g31892-Rogers_Arkansas.html"
  # Kernel#open no longer opens URLs as of Ruby 3.0; use URI.open instead.
  html = Nokogiri::HTML(URI.open(url))
  response = RestaurantsScraper.parse!(html, url, data: {})
  if response[:status] == :completed && response[:error].nil?
    flash.now[:notice] = "Successfully scraped url"
  else
    flash.now[:alert] = response[:error]
  end
end
