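python - Regex to find a match inside a JS object
Question
I am scraping a site, and the data I want is inside a script tag in the HTML page. I wrote some re
code to find a match, but I seem to be doing something wrong.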
Hub = {};
Hub.config = {
config: {},
get: function(key) {
if (key in this.config) {
return this.config[key];
} else {
return null;
}
},
set: function(key, val) {
this.config[key] = val;
}
};
Hub.config.set('sku', {
valCartInfo : {
itemId : '576938415361',
cartUrl: '//cart.mangolane.com/cart.htm'
},
apiRelateMarket : '//tui.mangolane.com/recommend?appid=16&count=4&itemid=576938415361',
apiAddCart : '//cart.mangolane.com/add_cart_item.htm?item_id=576938415361',
apiInsurance : '',
wholeSibUrl : '//detailskip.mangolane.com/service/getData/1/p1/item/detail/sib.htm?itemId=576938415361&sellerId=499095250&modules=dynStock,qrcode,viewer,price,duty,xmpPromotion,delivery,upp,activity,fqg,zjys,amountRestriction,couponActivity,soldQuantity,page,originalPrice,tradeContract',
areaLimit : '',
bigGroupUrl : '',
valPostFee : '',
coupon : {
couponApi : '//detailskip.mangolane.com/json/activity.htm?itemId=576938415361&sellerId=499095250',
couponWidgetDomain: '//assets.mgcdn.com',
cbUrl : '/cross.htm?type=weibo'
},
valItemInfo : {
defSelected: -1,
skuMap : {";20549:103189693;1627207:811754571;":{"price":"528.00","stock":"2","skuId":"4301611864655","oversold":false},
";20549:59280855;1627207:412796441;":{"price":"528.00","stock":"2","skuId":"4432149803707","oversold":false},
";20549:59280855;1627207:196576508;":{"price":"528.00","stock":"2","skuId":"4018119863100","oversold":false},
";20549:72380707;1627207:28341;":{"price":"528.00","stock":"2","skuId":"4166690818570","oversold":false},
";20549:418624880;1627207:28341;":{"price":"528.00","stock":"2","skuId":"4166690818566","oversold":false},
";20549:418624880;1627207:196576508;":{"price":"528.00","stock":"2","skuId":"4018119863098","oversold":false},
";20549:72380707;1627207:3224419;":{"price":"528.00","stock":"2","skuId":"4166690818571","oversold":false},
";20549:147478970;1627207:196576508;":{"price":"528.00","stock":"2","skuId":"4018119863094","oversold":false},
";20549:72380707;1627207:384366805;":{"price":"528.00","stock":"2","skuId":"4432149803708","oversold":false},
";20549:296172561;1627207:811754571;":{"price":"528.00","stock":"2","skuId":"4301611864659","oversold":false},
";20549:72380707;1627207:1150336209;":{"price":"528.00","stock":"2","skuId":"4301611864664","oversold":false},
";20549:147478970;1627207:93586002;":{"price":"528.00","stock":"2","skuId":"4018119863095","oversold":false}}
,propertyMemoMap: {"1627207:811754571":"黑色单里(预售) 年后2.29发货","1627207:93586002":"黑色加绒 现货","1627207:412796441":"黑色(兔毛) 现货","1627207:384366805":"米白色(兔毛) 现货","1627207:3224419":"驼色 现货","1627207:1150336209":"驼色单里(预售) 年后2.29发货","1627207:28341":"黑色 现货","1627207:196576508":"驼色加绒 现货"}
}
});
I only need the data passed in the Hub.config.set('sku', ...) call.
I tried this, but it didn't work:
config_base_str = re.findall("Hub.config.set ({[\s\S]*?});", config)
where config is the string containing the page data.
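Answer
Periods and parentheses have special meaning in regular expressions: an unescaped . matches any character, and an unescaped ( opens a capture group. To match them as literal characters, escape each one with a backslash.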
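To make the difference concrete, here is a minimal sketch comparing the pattern from the question with an escaped version. The one-line `sample` string is a made-up stand-in for the real script content, and the use of `re.escape` to build the literal prefix is just one possible convenience, not something the question requires.

```python
import re

# Hypothetical, shortened stand-in for the script text.
sample = "Hub.config.set('sku', {a: 1});"

# Pattern from the question: the "(" opens a capture group instead of
# matching a literal "(", so the regex looks for "set {" and finds nothing.
unescaped = re.findall(r"Hub.config.set ({[\s\S]*?});", sample)

# Escaped version: "\." and "\(" match the literal characters.
escaped = re.findall(r"Hub\.config\.set\('sku',\s*(\{[\s\S]*?\})\);", sample)

# re.escape can build the escaped literal prefix automatically.
prefix = re.escape("Hub.config.set('sku',")
built = re.findall(prefix + r"\s*(\{[\s\S]*?\})\);", sample)

print(unescaped)
print(escaped)
print(built)
```

Raw strings (the `r"..."` prefix) keep Python from interpreting the backslashes before the regex engine sees them.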
For example, given the string:
config = """
Hub.config.set('sku', {
valCartInfo : {
itemId : '576938415361',
cartUrl: '//cart.mangolane.com/cart.htm'
},
.........
});
"""
If you only want the key, you can do this:
config_base_str = re.findall(r"Hub\.config\.set\('(\w*)", config)  # ['sku']
If you want everything that follows the key inside the parentheses, you can do this:
config_base_str = re.findall(r"Hub\.config\.set\('\w*',\s*(\{[\s\S]*\})", config)  # one string: the object literal, from the first { to the last }
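Note that the captured text is a JavaScript object literal, not JSON (unquoted keys, single-quoted strings), so `json.loads` will generally not accept it as a whole. In the page shown in the question, though, the `skuMap` value does look like valid JSON. The following is a rough sketch, under that assumption, of pulling out just that part by counting braces. The helper `extract_balanced` is hypothetical (not part of any library), and `config` here is a shortened stand-in for the real page data.

```python
import json
import re


def extract_balanced(text, start):
    """Return the substring from text[start] (a "{") to its matching "}".

    Braces inside single- or double-quoted strings are ignored.
    """
    depth = 0
    quote = None  # the quote character we are currently inside, if any
    i = start
    while i < len(text):
        ch = text[i]
        if quote:
            if ch == "\\":
                i += 1  # skip the escaped character
            elif ch == quote:
                quote = None
        elif ch in "'\"":
            quote = ch
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return text[start:i + 1]
        i += 1
    raise ValueError("unbalanced braces")


# Shortened stand-in for the script content from the question.
config = """
Hub.config.set('sku', {
valItemInfo : {
defSelected: -1,
skuMap : {";20549:103189693;1627207:811754571;":{"price":"528.00","stock":"2","skuId":"4301611864655","oversold":false}}
}
});
"""

# Find where the skuMap value starts, then take the balanced {...} block.
match = re.search(r"skuMap\s*:\s*", config)
sku_map_text = extract_balanced(config, match.end())

# Assumption: this sub-object is valid JSON, as it appears to be in the question.
sku_map = json.loads(sku_map_text)
prices = {key: entry["price"] for key, entry in sku_map.items()}
print(prices)
```

Counting braces avoids relying on a greedy or lazy `[\s\S]*` stopping at the right `}`, which becomes fragile once the object is nested. If the site changes `skuMap` to non-JSON syntax, the `json.loads` step would need a different parser.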