Can you automatically add products from an online store to WooCommerce?

Question

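I want to take all the products from one account on this Dutch online marketplace (similar to eBay/Amazon) and add them to this WordPress online store using WooCommerce. I started web development about 2 to 3 weeks ago, and I know the basics of HTML, CSS, JavaScript, Node.js, and Express. I think I have a rough idea of how to do it, which is: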

My questions are:

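  1. Is this possible?
  2. Can I do it with the languages I already know?
  3. What approach would you use? (For example: how would you scrape the HTML, is there an easier way than the steps I described, and would you write code or use some automation software?)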

This is a big project for me, so any help (on how to get started) is welcome!

Tags: javascript, node.js, web-scraping, woocommerce, automation

Answer


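Your steps are right, and yes, this is possible. You can scrape the data with Node.js, which you already know. My personal preference for data scraping is Python, but it can be done in Node.js, which has HTML parsers and so on. I would suggest a few things: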

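  • Use a parser to parse the HTML so that you can reach the elements holding the data more easily.
  • Use some kind of structured format to store the data properly, for example JSON, XML, or CSV.
  • If fetching the data is a long process, fetch and save the raw data first and parse it afterwards. If you parse while fetching and any part of the parsing logic fails, you may lose everything you have fetched so far.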
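
The third point can be sketched as follows. This is only a minimal illustration under stated assumptions, not the code from this answer: `save_raw_page`, `parse_saved_pages`, `LinkTitleParser`, and the sample HTML are hypothetical names and data, and the standard-library `html.parser` stands in for whichever parser you prefer.

```python
import json
import tempfile
from html.parser import HTMLParser
from pathlib import Path


def save_raw_page(cache_dir: Path, page_number: int, html: str) -> Path:
    """Write the untouched HTML of one page to disk before any parsing."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    path = cache_dir / f"page-{page_number}.html"
    path.write_text(html, encoding="utf-8")
    return path


class LinkTitleParser(HTMLParser):
    """Collect the title attribute of every <a> element that has one."""

    def __init__(self):
        super().__init__()
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            title = dict(attrs).get("title")
            if title:
                self.titles.append(title)


def parse_saved_pages(cache_dir: Path) -> list:
    """Parse every cached page, so a parsing bug never forces a re-download."""
    titles = []
    for path in sorted(cache_dir.glob("page-*.html")):
        parser = LinkTitleParser()
        parser.feed(path.read_text(encoding="utf-8"))
        titles.extend(parser.titles)
    return titles


# Stand-in for a downloaded page (hypothetical); in practice this would be
# the body of the HTTP response.
SAMPLE_HTML = '<div><a href="/item-1.html" title="Sample Book">Sample Book</a></div>'

cache_dir = Path(tempfile.mkdtemp()) / "raw_pages"
save_raw_page(cache_dir, 1, SAMPLE_HTML)
titles = parse_saved_pages(cache_dir)
print(json.dumps(titles))
```

Because the raw pages stay on disk, the parsing step can be fixed and rerun as often as needed without contacting the site again.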

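Below is the code I wrote to get the data from the site you linked. It is written in Python, but I added comments so that you can see how the data is obtained and rewrite it in another language. You can also use split to cut the parts you need out of the HTML, in which case you do not even need a parser.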
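
As a side note on the split approach, here is a rough sketch of what it could look like. The helper `between` and the sample markup are hypothetical; the markup is only loosely modeled on the fields this answer scrapes, and the real page may differ.

```python
def between(text: str, start: str, end: str) -> str:
    """Return the text between the first start marker and the next end marker."""
    return text.split(start, 1)[1].split(end, 1)[0]


# Hypothetical product markup, not copied from the real site.
SAMPLE_HTML = (
    '<div itemscope="itemscope">'
    '<a href="http://example.com/item-1.html" title="Sample Book">'
    '<img src="http://example.com/item-1.jpg"></a>'
    '<span class="subtext">\u20ac\xa012,50</span>'
    '</div>'
)

title = between(SAMPLE_HTML, 'title="', '"')
image_url = between(SAMPLE_HTML, '<img src="', '"')
price_text = between(SAMPLE_HTML, '<span class="subtext">', "</span>")
# The price text is the euro sign, a non-breaking space, then "12,50":
# keep the part after the non-breaking space and use a dot as the separator.
price = float(price_text.split("\xa0")[1].replace(",", "."))

print(title, image_url, price)
```

This works only as long as the markers appear exactly as written, so it tends to break sooner than a parser when the page layout changes.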

Example (parsing with BeautifulSoup):

import json

import requests
from bs4 import BeautifulSoup

endpoint = "http://johndevisser.marktplaza.nl/?p=1"

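# Send a GET request to the page to fetch its HTML.
data = requests.get(endpoint).content

# Parse the HTML with BeautifulSoup. Naming the parser explicitly avoids
# the "no parser was explicitly specified" warning.
page = BeautifulSoup(data, "html.parser")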

# Find the 'div' elements whose 'itemscope' attribute is 'itemscope'.
# The [1:] slice skips the first match, which presumably is not a product.
products = page.find_all("div", {"itemscope": "itemscope"})[1:]

# Create an empty list to store the prepared data.
finalProductList = []

# Iterate over the products.
for product in products:
    # Create a dictionary to store the data of the current product.
    productData = {}
    # Get the title attribute from the 'a' element of the current product.
    productData["title"] = product.find("a").get("title")
    # Get the href attribute from the same 'a' element, because the original
    # source URL can be useful in the future.
    productData["origin"] = product.find("a").get("href")
    # Get the image URL from the 'img' element so the image can be downloaded.
    productData["imageURL"] = product.find("img").get("src")
    # This may look complicated, but it just finds the 'span' element whose
    # 'class' attribute is 'subtext', takes its inner text, splits it on the
    # non-breaking space (\xa0) into ['€', '15,00'], takes the second part
    # (the price), and replaces the comma with a dot so it parses as a float.
    productData["price"] = float(product.find("span", {"class": "subtext"}).get_text().split("\xa0")[1].replace(",", "."))
    # Append the product to the final list.
    finalProductList.append(productData)

# Print the JSON representation of the list of dictionaries.
print(json.dumps(finalProductList))

Output:

[
  {
    "title": "Sieb Posthuma  -  Mannetje Jas  (Hardcover/Gebonden) Kinderjury",
    "origin": "http://www.marktplaza.nl/boeken/kinderboeken/sieb-posthuma-mannetje-jas-hardcover-gebonden-kinderjury-92409632.html",
    "imageURL": "http://www.marktplaza.nl/M92409632/1/sieb-posthuma-mannetje-jas-hardcover-gebonden-kinderjury-92409632.jpg",
    "price": 12.5
  },
  {
    "title": "Estefhan Meijer  -  United Wraps    Wraps Uit De Hele Wereld",
    "origin": "http://www.marktplaza.nl/boeken/kookboeken/estefhan-meijer-united-wraps-wraps-uit-de-hele-wereld-92390218.html",
    "imageURL": "http://www.marktplaza.nl/M92390218/1/estefhan-meijer-united-wraps-wraps-uit-de-hele-wereld-92390218.jpg",
    "price": 15
  },
  {
    "title": "Daphne Deckers  -  De Verschrikkelijke Ijstaart  (Hardcover/Gebonden)",
    "origin": "http://www.marktplaza.nl/boeken/kookboeken/daphne-deckers-de-verschrikkelijke-ijstaart-hardcover-gebonden-92390182.html",
    "imageURL": "http://www.marktplaza.nl/M92390182/1/daphne-deckers-de-verschrikkelijke-ijstaart-hardcover-gebonden-92390182.jpg",
    "price": 10
  },
  {
    "title": "Adelene Fletcher  -   Bomen Aquarelleren Van A Tot Z",
    "origin": "http://www.marktplaza.nl/boeken/hobby-techniek/adelene-fletcher-bomen-aquarelleren-van-a-tot-z-92390124.html",
    "imageURL": "http://www.marktplaza.nl/M92390124/1/adelene-fletcher-bomen-aquarelleren-van-a-tot-z-92390124.jpg",
    "price": 12.5
  },
  {
    "title": "Razorlight ‎– America  (2 Track CDSingle)",
    "origin": "http://www.marktplaza.nl/cd-vinyl/singles/razorlight-america-2-track-cdsingle-92390118.html",
    "imageURL": "http://www.marktplaza.nl/M92390118/1/razorlight-america-2-track-cdsingle-92390118.jpg",
    "price": 5
  },
  {
    "title": "Twarres ‎– Children  (2 Track CDSingle)",
    "origin": "http://www.marktplaza.nl/cd-vinyl/singles/twarres-children-2-track-cdsingle-92390078.html",
    "imageURL": "http://www.marktplaza.nl/M92390078/1/twarres-children-2-track-cdsingle-92390078.jpg",
    "price": 5
  },
  {
    "title": "Tower Of Power ‎– The Very Best Of Tower Of Power - The Warner Years  (CD)",
    "origin": "http://www.marktplaza.nl/cd-vinyl/pop/tower-of-power-the-very-best-of-tower-of-power-the-warner-years-cd-92389836.html",
    "imageURL": "http://www.marktplaza.nl/M92389836/1/tower-of-power-the-very-best-of-tower-of-power-the-warner-years-cd-92389836.jpg",
    "price": 10
  },
  {
    "title": "Red Hot Chili Peppers ‎– Dani California  (2 Track CDSingle)",
    "origin": "http://www.marktplaza.nl/cd-vinyl/singles/red-hot-chili-peppers-dani-california-2-track-cdsingle-92389742.html",
    "imageURL": "http://www.marktplaza.nl/M92389742/1/red-hot-chili-peppers-dani-california-2-track-cdsingle-92389742.jpg",
    "price": 5
  },
  {
    "title": "Seth Godin  -  Icarus Deception  (Engelstalig)",
    "origin": "http://www.marktplaza.nl/boeken/management-en-economie/seth-godin-icarus-deception-engelstalig-92389542.html",
    "imageURL": "http://www.marktplaza.nl/M92389542/1/seth-godin-icarus-deception-engelstalig-92389542.jpg",
    "price": 12.5
  },
  {
    "title": "Rob Gifford  -  De Chinese Weg",
    "origin": "http://www.marktplaza.nl/boeken/reizen/rob-gifford-de-chinese-weg-92389500.html",
    "imageURL": "http://www.marktplaza.nl/M92389500/1/rob-gifford-de-chinese-weg-92389500.jpg",
    "price": 12.5
  },
  {
    "title": "Bart Leeuwenburgh  -   Darwin In Domineesland",
    "origin": "http://www.marktplaza.nl/boeken/informatief/bart-leeuwenburgh-darwin-in-domineesland-92386128.html",
    "imageURL": "http://www.marktplaza.nl/M92386128/1/bart-leeuwenburgh-darwin-in-domineesland-92386128.jpg",
    "price": 12.5
  },
  {
    "title": "Per Olov Enquist  -  Het Record  (Hardcover/Gebonden)",
    "origin": "http://www.marktplaza.nl/boeken/romans/per-olov-enquist-het-record-hardcover-gebonden-92386080.html",
    "imageURL": "http://www.marktplaza.nl/M92386080/1/per-olov-enquist-het-record-hardcover-gebonden-92386080.jpg",
    "price": 10
  },
  {
    "title": "Fred Vargas - Uit De Dood Herrezen (Hardcover/Gebonden) blauw/groene achtergrond",
    "origin": "http://www.marktplaza.nl/boeken/romans/fred-vargas-uit-de-dood-herrezen-hardcover-gebonden-blauw-groene-achtergrond-92385368.html",
    "imageURL": "http://www.marktplaza.nl/M92385368/1/fred-vargas-uit-de-dood-herrezen-hardcover-gebonden-blauw-groene-achtergrond-92385368.jpg",
    "price": 12.5
  },
  {
    "title": "Fred Vargas  -   De Omgekeerde Man   (Hardcover/Gebonden)",
    "origin": "http://www.marktplaza.nl/boeken/romans/fred-vargas-de-omgekeerde-man-hardcover-gebonden-92385304.html",
    "imageURL": "http://www.marktplaza.nl/M92385304/1/fred-vargas-de-omgekeerde-man-hardcover-gebonden-92385304.jpg",
    "price": 15
  },
  {
    "title": "David Sandes  -  Sergei Bubka's Wondermethode  (Hardcover/Gebonden)",
    "origin": "http://www.marktplaza.nl/boeken/romans/david-sandes-sergei-bubkas-wondermethode-hardcover-gebonden-92385090.html",
    "imageURL": "http://www.marktplaza.nl/M92385090/1/david-sandes-sergei-bubkas-wondermethode-hardcover-gebonden-92385090.jpg",
    "price": 10
  },
  {
    "title": "Sjoerd Kuyper  -  Sjaantje Doet Alsof  (Hardcover/Gebonden)",
    "origin": "http://www.marktplaza.nl/boeken/kinderboeken/sjoerd-kuyper-sjaantje-doet-alsof-hardcover-gebonden-92384948.html",
    "imageURL": "http://www.marktplaza.nl/M92384948/1/sjoerd-kuyper-sjaantje-doet-alsof-hardcover-gebonden-92384948.jpg",
    "price": 10
  },
  {
    "title": "Het Piratenschip     Klap Open En Bekijk  (Hardcover/Gebonden)",
    "origin": "http://www.marktplaza.nl/boeken/kinderboeken/het-piratenschip-klap-open-en-bekijk-hardcover-gebonden-92371996.html",
    "imageURL": "http://www.marktplaza.nl/M92371996/1/het-piratenschip-klap-open-en-bekijk-hardcover-gebonden-92371996.jpg",
    "price": 12.5
  },
  {
    "title": "John Topsell  -  Draken Trainen En Verzorgen (Hardcover/Gebonden)",
    "origin": "http://www.marktplaza.nl/boeken/kinderboeken/john-topsell-draken-trainen-en-verzorgen-hardcover-gebonden-92371928.html",
    "imageURL": "http://www.marktplaza.nl/M92371928/1/john-topsell-draken-trainen-en-verzorgen-hardcover-gebonden-92371928.jpg",
    "price": 15
  }
]
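
The code above stops at scraping. As a rough sketch of the remaining step, the records could be mapped to product payloads for the store. The helper `to_woocommerce_product` is hypothetical, and the field names (`name`, `type`, `regular_price`, `images`) are my assumption about what the WooCommerce REST API's product endpoint expects, so check them against the API documentation for your version before sending anything.

```python
import json


def to_woocommerce_product(scraped: dict) -> dict:
    """Map one scraped record to a product payload (field names are assumptions)."""
    return {
        "name": scraped["title"],
        "type": "simple",
        # Assumption: prices are sent as strings with a dot as decimal separator.
        "regular_price": f"{scraped['price']:.2f}",
        "images": [{"src": scraped["imageURL"]}],
    }


# One record copied from the scraped output.
scraped_products = [
    {
        "title": "Rob Gifford  -  De Chinese Weg",
        "origin": "http://www.marktplaza.nl/boeken/reizen/rob-gifford-de-chinese-weg-92389500.html",
        "imageURL": "http://www.marktplaza.nl/M92389500/1/rob-gifford-de-chinese-weg-92389500.jpg",
        "price": 12.5,
    }
]

payloads = [to_woocommerce_product(p) for p in scraped_products]
print(json.dumps(payloads, indent=2))
```

Each payload would then be sent to the store in an authenticated request. That part is left out here because it depends on your store URL and API keys.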
