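javascript - Cheerio: how do I extract this into JSON?
Problem description
I have an HTML page like the one below. It contains 2 products, each listed in several versions by condition; in this case each product has 4 conditions.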
<div class="listing_item">
<ul class="listedItem listedItem--searched">
<li>
<a href="itemproduct.html">
<div class="itemImg">
<img class="lazy" alt="Item name 1" src="some.jpg" data-original="some.jpg" style="display: block;">
</div>
<div class="icon">
</div>
</a>
<a href="item1productdetails.html" class="itemName">Item name 1</a>
<div class="tableHere product">
<div class="sameSearchButton">
<a href="etctectect">
Search item here
</a>
</div>
<div class="row not-first ng-star-inserted">
<div class="col-xs-1 ng-star-inserted">
<strong>
Condition 1
</strong>
</div>
<div class="col-xs-4 ng-star-inserted">10 USD</div>
<div class="col-xs-2">stock 10</div>
<div class="col-xs-2 ng-star-inserted">
<div class="asSpinner">
<select id="quantity_511377" class="form-control product-listing__qty-to-buy" name="quantityToBuy">
<option value="1">1</option>
<option value="2">2</option>
<option value="3">3</option>
<option value="4">4</option>
<option value="5">5</option>
<option value="6">6</option>
<option value="7">7</option>
<option value="8">8</option>
</select>
</div>
</div>
<div class="col-xs-3 ng-star-inserted">
<button id="cart_511377" class="btn btn-primary btn-sm addCart" data-productclass="511377" title="Add to Cart" type="button">
<i aria-hidden="true" class="fas fa-shopping-cart"/>
</button>
</div>
</div>
<div class="row not-first ng-star-inserted">
<div class="col-xs-1 ng-star-inserted">
<strong>
Condition 2
</strong>
</div>
<div class="col-xs-4 ng-star-inserted">20 USD</div>
<div class="col-xs-2">stock 120</div>
<div class="col-xs-2 ng-star-inserted">
<div class="asSpinner">
</div>
</div>
<div class="col-xs-3 ng-star-inserted">
<button class="btn btn-primary btn-sm notifyme" type="button" title="Request for Items Waiting for Arrival." data-productclass="511378">
<i aria-hidden="true" class="icon fas fa-exclamation-circle"/>
</button>
</div>
</div>
<div class="row not-first ng-star-inserted">
<div class="col-xs-1 ng-star-inserted">
<strong>
Condition 3
</strong>
</div>
<div class="col-xs-4 ng-star-inserted">9 usd</div>
<div class="col-xs-2">stock 5</div>
<div class="col-xs-2 ng-star-inserted">
<div class="asSpinner">
</div>
</div>
<div class="col-xs-3 ng-star-inserted">
<button class="btn btn-primary btn-sm notifyme" type="button" title="Request for Items Waiting for Arrival." data-productclass="511379">
<i aria-hidden="true" class="icon fas fa-exclamation-circle"/>
</button>
</div>
</div>
<div class="row not-first ng-star-inserted">
<div class="col-xs-1 ng-star-inserted">
<strong>
Condition 4
</strong>
</div>
<div class="col-xs-4 ng-star-inserted">700 USD</div>
<div class="col-xs-2">stock 10</div>
<div class="col-xs-2 ng-star-inserted">
<div class="asSpinner">
</div>
</div>
<div class="col-xs-3 ng-star-inserted">
<button class="btn btn-primary btn-sm notifyme" type="button" title="Request for Items Waiting for Arrival." data-productclass="511380">
<i aria-hidden="true" class="icon fas fa-exclamation-circle"/>
</button>
</div>
</div>
</div>
<div class="soldout" style="display: none;" data-soldout="false"/>
</li>
<li>
<a href="itemproduct.html">
<div class="itemImg">
<img class="lazy" alt="Item name 1" src="some.jpg" data-original="some.jpg" style="display: block;">
</div>
<div class="icon">
</div>
</a>
<a href="item2productdetails.html" class="itemName">Item name 2</a>
<div class="tableHere product">
<div class="sameSearchButton">
<a href="etctectect">
Search item here
</a>
</div>
<div class="row not-first ng-star-inserted">
<div class="col-xs-1 ng-star-inserted">
<strong>
Condition 1
</strong>
</div>
<div class="col-xs-4 ng-star-inserted">2,000 USD</div>
<div class="col-xs-2">stock 29</div>
<div class="col-xs-2 ng-star-inserted">
<div class="asSpinner">
<select id="quantity_511982" class="form-control product-listing__qty-to-buy" name="quantityToBuy">
<option value="1">1</option>
<option value="2">2</option>
<option value="3">3</option>
<option value="4">4</option>
<option value="5">5</option>
<option value="6">6</option>
<option value="7">7</option>
<option value="8">8</option>
</select>
</div>
</div>
</div>
<div class="row not-first ng-star-inserted">
<div class="col-xs-1 ng-star-inserted">
<strong>
Condition 2
</strong>
</div>
<div class="col-xs-4 ng-star-inserted">1,800 USD</div>
<div class="col-xs-2">stock 20</div>
</div>
<div class="row not-first ng-star-inserted">
<div class="col-xs-1 ng-star-inserted">
<strong>
Condition 3
</strong>
</div>
<div class="col-xs-4 ng-star-inserted">1,600 USD</div>
<div class="col-xs-2">stock 4</div>
<div class="col-xs-2 ng-star-inserted">
<div class="asSpinner">
</div>
</div>
</div>
<div class="row not-first ng-star-inserted">
<div class="col-xs-1 ng-star-inserted">
<strong>
Condition 4
</strong>
</div>
<div class="col-xs-4 ng-star-inserted">1,400 USD</div>
<div class="col-xs-2">stock 2</div>
<div class="col-xs-2 ng-star-inserted">
<div class="asSpinner">
</div>
</div>
</div>
</div>
<div class="soldout" style="display: none;" data-soldout="false"/>
</li>
</ul>
</div>
Using cheerio, I am having trouble extracting these and turning them into JSON data like this:
{
  "Item 1": [
    {
      "price": 1,
      "condition": "condition 1",
      "stock": "1"
    },
    {
      "price": 1,
      "condition": "condition 2",
      "stock": "1"
    },
    {
      "price": 1,
      "condition": "condition 3",
      "stock": "1"
    },
    {
      "price": 1,
      "condition": "condition 4",
      "stock": "1"
    }
  ],
  "Item 2": [
    {
      "price": 1,
      "condition": "condition 1",
      "stock": "1"
    },
    {
      "price": 1,
      "condition": "condition 2",
      "stock": "1"
    },
    {
      "price": 1,
      "condition": "condition 3",
      "stock": "1"
    },
    {
      "price": 1,
      "condition": "condition 4",
      "stock": "1"
    }
  ]
}
Currently I am doing something like this:
$('.listing_item ul li').each(function(i, elm) {
  cardName = $(this).find('a[class=itemName]').text().trim()
  $('div[class="row not-first ng-star-inserted"]').each(function(){
    var obj = {
      condition: $(this).find('div[class="col-xs-1 ng-star-inserted"]').text().trim(),
      price: $(this).find('div[class="col-xs-4 ng-star-inserted"]').text().trim(),
      stock: $(this).find('div[class="col-xs-2"]').text().trim()
    };
  })
});
But my output contains many duplicates: item 1 should have 4 conditions, but it ends up with 8. I don't know how to extract this into JSON with cheerio.
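Solution
$('div[class="row not-first ng-star-inserted"]')
selects every div in the whole document whose class attribute is row not-first ng-star-inserted. The mistake is that you search the entire document for these rows instead of narrowing the search to the current li. With 2 products of 4 rows each, the selector returns all 8 rows on every iteration of the outer loop, which is why item 1 appears to have 8 conditions. Scope the inner search to the current li: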
$('.listing_item ul li').each(function(i, elm) {
  // Name of the product in the current li
  var cardName = $(this).find('a[class=itemName]').text().trim();
  // Search for rows only inside the current li, not the whole document
  $(this).find('div[class="row not-first ng-star-inserted"]').each(function() {
    var obj = {
      condition: $(this).find('div[class="col-xs-1 ng-star-inserted"]').text().trim(),
      price: $(this).find('div[class="col-xs-4 ng-star-inserted"]').text().trim(),
      stock: $(this).find('div[class="col-xs-2"]').text().trim()
    };
  });
});
So for the inner selection, $(...) becomes $(this).find(...).
Edit
To build the JSON from the question, you can do something like this:
var myJSON = {};
$('.listing_item ul li').each(function(i, elm) {
  var cardName = $(this).find('a[class=itemName]').text().trim();
  // One array of condition rows per product name
  myJSON[cardName] = [];
  $(this).find('div[class="row not-first ng-star-inserted"]').each(function() {
    var obj = {
      condition: $(this).find('div[class="col-xs-1 ng-star-inserted"]').text().trim(),
      price: $(this).find('div[class="col-xs-4 ng-star-inserted"]').text().trim(),
      stock: $(this).find('div[class="col-xs-2"]').text().trim()
    };
    myJSON[cardName].push(obj);
  });
});
Not tested.
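The extracted values are raw strings such as "2,000 USD" and "stock 29", while the desired output in the question shows price as a number and stock as a string of digits. One possible cleanup step is sketched below in plain JavaScript. The helper names parsePrice, parseStock and normalizeRow are hypothetical (they are not part of cheerio or of the code above), and the sketch assumes prices use a comma as the thousands separator, as in the sample page.

```javascript
// Hypothetical helpers for cleaning up the strings extracted from each row.

// "2,000 USD" -> 2000; returns null when the text contains no number.
function parsePrice(text) {
  const match = text.replace(/,/g, '').match(/\d+(\.\d+)?/);
  return match ? Number(match[0]) : null;
}

// "stock 29" -> "29" (kept as a string, as in the question's desired output).
function parseStock(text) {
  const match = text.match(/\d+/);
  return match ? match[0] : null;
}

// Takes an object with the raw strings and returns a cleaned copy.
function normalizeRow(row) {
  return {
    condition: row.condition,
    price: parsePrice(row.price),
    stock: parseStock(row.stock)
  };
}

const sample = { condition: 'Condition 1', price: '2,000 USD', stock: 'stock 29' };
console.log(JSON.stringify(normalizeRow(sample)));
// {"condition":"Condition 1","price":2000,"stock":"29"}
```

With helpers like these, myJSON[cardName].push(normalizeRow(obj)) could replace the plain push, and JSON.stringify(myJSON, null, 2) would produce the final JSON text.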