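javascript - Can you automatically add products from an online store to WooCommerce?
Question
I want to take all the products from one account on this Dutch online store (similar to eBay/Amazon) and add them to this WordPress webshop using WooCommerce. I started web development about 2 to 3 weeks ago, and I know the basics of HTML, CSS, JavaScript, Node.js and Express. I think I roughly know what needs to be done, namely:
- Loop through all the products on each page.
- Get the title, description, category, price and photos.
- Store that information as product objects in an array.
- Access the WooCommerce API.
- Loop through all the products and add them to WooCommerce.
My questions are:
- Is this possible?
- Can I do it with the languages I already know?
- What approach would you use? (For example: how would you scrape the HTML, is there an easier way than the steps I described, would you write code or use some automation software, etc.)
This is a big project for me, so any help (on how to get started) is welcome!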
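Answer
Your steps are right, and yes, this is possible. You can scrape the data with Node.js, which you already know. My personal preference for scraping is Python, but Node.js will do the job too; it has HTML parsers and everything else you need. I would suggest a few things:
- Use a parser on the HTML so that you can reach the elements holding the data more easily.
- Store the data in a proper structured format, for example JSON, XML or CSV.
- If fetching the data takes a long time, fetch all of it first and parse it afterwards. If you parse while fetching and some part of the parsing code fails, you can lose everything you have downloaded so far.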
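As a rough sketch of the "fetch first, parse later" suggestion, the raw pages can be written to disk before any parsing happens. The helper names (`page_urls`, `save_raw_pages`, `load_raw_pages`) and the file naming are made up for illustration, and it is an assumption that every listing page follows the `?p=N` pattern seen in the first page's URL.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def page_urls(base_url: str, last_page: int) -> List[str]:
    """Build listing URLs for pages 1..last_page (assumes a ?p=N scheme)."""
    return [f"{base_url}?p={n}" for n in range(1, last_page + 1)]


def save_raw_pages(
    urls: Iterable[str],
    fetch: Callable[[str], bytes],
    out_dir: Path,
) -> List[Path]:
    """Download every URL once and write the untouched bytes to disk."""
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for number, url in enumerate(urls, start=1):
        target = out_dir / f"page-{number:04d}.html"
        if not target.exists():  # skip pages already fetched in an earlier run
            target.write_bytes(fetch(url))
        saved.append(target)
    return saved


def load_raw_pages(out_dir: Path) -> List[bytes]:
    """Read the saved pages back, in page order, for the parsing step."""
    return [path.read_bytes() for path in sorted(out_dir.glob("page-*.html"))]
```

With the `requests` library, `fetch` could be something like `lambda url: requests.get(url, timeout=30).content`. If the parsing code crashes later, the downloaded pages are still on disk and the parse can simply be rerun.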
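Below is the code I wrote to get the data from the site you linked. It is written in Python, but I added comments so that you can see how the data is extracted and rewrite it in another language. You can also use split to cut the parts you need out of the HTML, in which case you do not even need a parser.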
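To illustrate the split idea, here is a minimal sketch that cuts a value out of an HTML string without any parser. The `between` helper and the sample markup are made up for illustration; the real page's markup may differ.

```python
def between(text: str, start: str, end: str) -> str:
    """Return the part of text after the first `start` and before the next `end`."""
    return text.split(start, 1)[1].split(end, 1)[0]


# Hypothetical markup, loosely modeled on a product link.
snippet = '<a href="/boeken/example-123.html" title="Example Book">Example Book</a>'

title = between(snippet, 'title="', '"')
href = between(snippet, 'href="', '"')

print(title)
print(href)
```

This is quick to write but fragile: if the start marker is missing, `between` raises an `IndexError`, and any change in attribute order or quoting can break it, which is why a parser is usually the safer choice.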
Example script:
import json

import requests
from bs4 import BeautifulSoup

endpoint = "http://johndevisser.marktplaza.nl/?p=1"
# Send a GET request to the page to get the HTML.
data = requests.get(endpoint).content
# Parse the HTML with BeautifulSoup, naming the parser explicitly.
page = BeautifulSoup(data, "html.parser")
# Find the 'div' elements whose 'itemscope' attribute is 'itemscope', skipping the first match.
products = page.find_all("div", {"itemscope": "itemscope"})[1:]
# Create an empty list to store the prepared data.
finalProductList = []
# Iterate over the products.
for i in products:
    # Create a dictionary to hold the data for this product.
    productData = {}
    # Get the 'title' attribute of the product's first 'a' element.
    productData["title"] = i.find("a").get("title")
    # Get the 'href' attribute of the same 'a' element; the original URL can be useful later.
    productData["origin"] = i.find("a").get("href")
    # Get the image URL from the 'img' element so the image can be downloaded.
    productData["imageURL"] = i.find("img").get("src")
    # This may look complicated, but it only finds the 'span' element whose 'class' is 'subtext',
    # takes its inner text, splits it on the non-breaking space (\xa0) into ['€', '15,00'],
    # takes the second part (the price) and replaces the comma with a dot so it parses as a float.
    productData["price"] = float(i.find("span", {"class": "subtext"}).get_text().split(u"\xa0")[1].replace(",", "."))
    # Append the product to the final list.
    finalProductList.append(productData)
# Print the JSON representation of the list.
print(json.dumps(finalProductList))
Output:
[
{
"title": "Sieb Posthuma - Mannetje Jas (Hardcover/Gebonden) Kinderjury",
"origin": "http://www.marktplaza.nl/boeken/kinderboeken/sieb-posthuma-mannetje-jas-hardcover-gebonden-kinderjury-92409632.html",
"imageURL": "http://www.marktplaza.nl/M92409632/1/sieb-posthuma-mannetje-jas-hardcover-gebonden-kinderjury-92409632.jpg",
"price": 12.5
},
{
"title": "Estefhan Meijer - United Wraps Wraps Uit De Hele Wereld",
"origin": "http://www.marktplaza.nl/boeken/kookboeken/estefhan-meijer-united-wraps-wraps-uit-de-hele-wereld-92390218.html",
"imageURL": "http://www.marktplaza.nl/M92390218/1/estefhan-meijer-united-wraps-wraps-uit-de-hele-wereld-92390218.jpg",
"price": 15
},
{
"title": "Daphne Deckers - De Verschrikkelijke Ijstaart (Hardcover/Gebonden)",
"origin": "http://www.marktplaza.nl/boeken/kookboeken/daphne-deckers-de-verschrikkelijke-ijstaart-hardcover-gebonden-92390182.html",
"imageURL": "http://www.marktplaza.nl/M92390182/1/daphne-deckers-de-verschrikkelijke-ijstaart-hardcover-gebonden-92390182.jpg",
"price": 10
},
{
"title": "Adelene Fletcher - Bomen Aquarelleren Van A Tot Z",
"origin": "http://www.marktplaza.nl/boeken/hobby-techniek/adelene-fletcher-bomen-aquarelleren-van-a-tot-z-92390124.html",
"imageURL": "http://www.marktplaza.nl/M92390124/1/adelene-fletcher-bomen-aquarelleren-van-a-tot-z-92390124.jpg",
"price": 12.5
},
{
"title": "Razorlight – America (2 Track CDSingle)",
"origin": "http://www.marktplaza.nl/cd-vinyl/singles/razorlight-america-2-track-cdsingle-92390118.html",
"imageURL": "http://www.marktplaza.nl/M92390118/1/razorlight-america-2-track-cdsingle-92390118.jpg",
"price": 5
},
{
"title": "Twarres – Children (2 Track CDSingle)",
"origin": "http://www.marktplaza.nl/cd-vinyl/singles/twarres-children-2-track-cdsingle-92390078.html",
"imageURL": "http://www.marktplaza.nl/M92390078/1/twarres-children-2-track-cdsingle-92390078.jpg",
"price": 5
},
{
"title": "Tower Of Power – The Very Best Of Tower Of Power - The Warner Years (CD)",
"origin": "http://www.marktplaza.nl/cd-vinyl/pop/tower-of-power-the-very-best-of-tower-of-power-the-warner-years-cd-92389836.html",
"imageURL": "http://www.marktplaza.nl/M92389836/1/tower-of-power-the-very-best-of-tower-of-power-the-warner-years-cd-92389836.jpg",
"price": 10
},
{
"title": "Red Hot Chili Peppers – Dani California (2 Track CDSingle)",
"origin": "http://www.marktplaza.nl/cd-vinyl/singles/red-hot-chili-peppers-dani-california-2-track-cdsingle-92389742.html",
"imageURL": "http://www.marktplaza.nl/M92389742/1/red-hot-chili-peppers-dani-california-2-track-cdsingle-92389742.jpg",
"price": 5
},
{
"title": "Seth Godin - Icarus Deception (Engelstalig)",
"origin": "http://www.marktplaza.nl/boeken/management-en-economie/seth-godin-icarus-deception-engelstalig-92389542.html",
"imageURL": "http://www.marktplaza.nl/M92389542/1/seth-godin-icarus-deception-engelstalig-92389542.jpg",
"price": 12.5
},
{
"title": "Rob Gifford - De Chinese Weg",
"origin": "http://www.marktplaza.nl/boeken/reizen/rob-gifford-de-chinese-weg-92389500.html",
"imageURL": "http://www.marktplaza.nl/M92389500/1/rob-gifford-de-chinese-weg-92389500.jpg",
"price": 12.5
},
{
"title": "Bart Leeuwenburgh - Darwin In Domineesland",
"origin": "http://www.marktplaza.nl/boeken/informatief/bart-leeuwenburgh-darwin-in-domineesland-92386128.html",
"imageURL": "http://www.marktplaza.nl/M92386128/1/bart-leeuwenburgh-darwin-in-domineesland-92386128.jpg",
"price": 12.5
},
{
"title": "Per Olov Enquist - Het Record (Hardcover/Gebonden)",
"origin": "http://www.marktplaza.nl/boeken/romans/per-olov-enquist-het-record-hardcover-gebonden-92386080.html",
"imageURL": "http://www.marktplaza.nl/M92386080/1/per-olov-enquist-het-record-hardcover-gebonden-92386080.jpg",
"price": 10
},
{
"title": "Fred Vargas - Uit De Dood Herrezen (Hardcover/Gebonden) blauw/groene achtergrond",
"origin": "http://www.marktplaza.nl/boeken/romans/fred-vargas-uit-de-dood-herrezen-hardcover-gebonden-blauw-groene-achtergrond-92385368.html",
"imageURL": "http://www.marktplaza.nl/M92385368/1/fred-vargas-uit-de-dood-herrezen-hardcover-gebonden-blauw-groene-achtergrond-92385368.jpg",
"price": 12.5
},
{
"title": "Fred Vargas - De Omgekeerde Man (Hardcover/Gebonden)",
"origin": "http://www.marktplaza.nl/boeken/romans/fred-vargas-de-omgekeerde-man-hardcover-gebonden-92385304.html",
"imageURL": "http://www.marktplaza.nl/M92385304/1/fred-vargas-de-omgekeerde-man-hardcover-gebonden-92385304.jpg",
"price": 15
},
{
"title": "David Sandes - Sergei Bubka's Wondermethode (Hardcover/Gebonden)",
"origin": "http://www.marktplaza.nl/boeken/romans/david-sandes-sergei-bubkas-wondermethode-hardcover-gebonden-92385090.html",
"imageURL": "http://www.marktplaza.nl/M92385090/1/david-sandes-sergei-bubkas-wondermethode-hardcover-gebonden-92385090.jpg",
"price": 10
},
{
"title": "Sjoerd Kuyper - Sjaantje Doet Alsof (Hardcover/Gebonden)",
"origin": "http://www.marktplaza.nl/boeken/kinderboeken/sjoerd-kuyper-sjaantje-doet-alsof-hardcover-gebonden-92384948.html",
"imageURL": "http://www.marktplaza.nl/M92384948/1/sjoerd-kuyper-sjaantje-doet-alsof-hardcover-gebonden-92384948.jpg",
"price": 10
},
{
"title": "Het Piratenschip Klap Open En Bekijk (Hardcover/Gebonden)",
"origin": "http://www.marktplaza.nl/boeken/kinderboeken/het-piratenschip-klap-open-en-bekijk-hardcover-gebonden-92371996.html",
"imageURL": "http://www.marktplaza.nl/M92371996/1/het-piratenschip-klap-open-en-bekijk-hardcover-gebonden-92371996.jpg",
"price": 12.5
},
{
"title": "John Topsell - Draken Trainen En Verzorgen (Hardcover/Gebonden)",
"origin": "http://www.marktplaza.nl/boeken/kinderboeken/john-topsell-draken-trainen-en-verzorgen-hardcover-gebonden-92371928.html",
"imageURL": "http://www.marktplaza.nl/M92371928/1/john-topsell-draken-trainen-en-verzorgen-hardcover-gebonden-92371928.jpg",
"price": 15
}
]
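The last step from the question, adding each scraped product to WooCommerce, could look roughly like the sketch below. It assumes the WooCommerce REST API v3 product endpoint (`/wp-json/wc/v3/products`) with consumer key and secret sent as HTTP Basic auth over HTTPS. The function names are made up for illustration, and the store URL and credentials would be your own. Only the fields present in the output above are mapped; description and categories would need to be scraped separately.

```python
import base64
import json
import urllib.request
from typing import Any, Dict


def to_woocommerce_product(scraped: Dict[str, Any]) -> Dict[str, Any]:
    """Map one scraped product dict to a WooCommerce product payload."""
    return {
        "name": scraped["title"],
        "type": "simple",
        # WooCommerce expects prices as strings, e.g. "12.50".
        "regular_price": f"{scraped['price']:.2f}",
        # WooCommerce downloads the image from this URL itself.
        "images": [{"src": scraped["imageURL"]}],
    }


def create_product(
    store_url: str,
    consumer_key: str,
    consumer_secret: str,
    product: Dict[str, Any],
) -> Dict[str, Any]:
    """POST one product to the store and return the decoded JSON response."""
    token = base64.b64encode(f"{consumer_key}:{consumer_secret}".encode()).decode()
    request = urllib.request.Request(
        f"{store_url.rstrip('/')}/wp-json/wc/v3/products",
        data=json.dumps(product).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `create_product` once per entry of `finalProductList` (after passing it through `to_woocommerce_product`) would add the products one by one. Testing against a staging shop first is a good idea, since mistakes create real products.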