How to scrape data from a website with a JS "load more" button

Problem description

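I am trying to scrape Google Scholar with JS. The table on the page loads more rows through a "load more" button at the bottom. For reference, see this page: https://scholar.google.com/citations?hl=en&user=m8dFEawAAAAJ

So far I have been calling btn.click() on the "load more" button until the button becomes disabled, and then scraping the data. Can someone suggest a better way to scrape the entire table?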

Tags: javascript, web, dom, web-scraping

Solution


If you click the Show more button, you can see that the next request is made with the following query string appended to the URL:

&cstart=20&pagesize=80

With cstart probably standing for something like "count start" (the offset of the first row to return) and pagesize for the number of rows per request, you could change the parameter values to something like the following, which should return up to 1000 items in a single request:

https://scholar.google.com/citations?hl=en&user=m8dFEawAAAAJ&cstart=1&pagesize=1000
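If a single large request does not return everything (the server may cap pagesize), the same two parameters can be stepped through in a loop. The following is a minimal sketch, not a tested scraper: buildPageUrl, collectAllRows and the fetchRows callback are hypothetical names introduced here, and starting cstart at 0 is an assumption inferred from the first follow-up request using cstart=20.

```javascript
// Build the URL for one page of the table.
// `cstart` and `pagesize` are the parameters seen in the "Show more" request;
// their meaning (row offset / rows per request) is assumed, not documented.
function buildPageUrl(user, cstart, pagesize) {
  const url = new URL("https://scholar.google.com/citations");
  url.searchParams.set("hl", "en");
  url.searchParams.set("user", user);
  url.searchParams.set("cstart", String(cstart));
  url.searchParams.set("pagesize", String(pagesize));
  return url.toString();
}

// Request page after page until a page comes back shorter than requested.
// `fetchRows` is a hypothetical callback supplied by the caller:
// given a URL, it resolves to an array of rows extracted from that page.
async function collectAllRows(user, fetchRows, pagesize = 100, maxPages = 50) {
  const rows = [];
  for (let page = 0; page < maxPages; page++) {
    const batch = await fetchRows(buildPageUrl(user, page * pagesize, pagesize));
    rows.push(...batch);
    // A short (or empty) page is taken to mean the table has ended.
    if (batch.length < pagesize) break;
  }
  return rows;
}
```

In a browser, fetchRows could call fetch(url), parse the response text with DOMParser, and return the table rows; the row selector is not given in the question, so it is left to the caller. The maxPages limit is only a safety stop so the loop cannot run forever if the end-of-table assumption is wrong.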
