python - How do I scrape data from the second table at a URL?
Problem description
Sorry if this is a really silly question, but I did try a few things, and my attempt is shown below.
from bs4 import BeautifulSoup
import pandas as pd
import requests

# One list per output column, filled across all pages
A = []
B = []
C = []
D = []
E = []
F = []
G = []
H = []

for page in range(1, 5):
    # page is an int, so it has to be converted before concatenating
    r = requests.get('https://etfdb.com/screener/#tab=returns&page=' + str(page))
    data = r.text
    soup = BeautifulSoup(data, "html.parser")
    # find() returns only the first table with this class
    table = soup.find("table", {"class": "table table-bordered table-hover table-striped mm-mobile-table"})
    for row in table.find_all("tr"):
        cells = row("td")
        if len(cells) < 8:
            continue  # skip header rows and rows without all eight columns
        A.append(cells[0].get_text().strip())
        B.append(cells[1].get_text().strip())
        C.append(cells[2].get_text().strip())
        D.append(cells[3].get_text().strip())
        E.append(cells[4].get_text().strip())
        F.append(cells[5].get_text().strip())
        G.append(cells[6].get_text().strip())
        H.append(cells[7].get_text().strip())

df = pd.DataFrame(A, columns=['Symbol'])
df['ETF_Name'] = B
df['1_Week'] = C
df['4_Week'] = D
df['YTD'] = E
df['1_Year'] = F
df['3_Year'] = G
df['5_Year'] = H
df
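I believe the class of the table in question is "table table-bordered table-hover table-striped mm-mobile-table". The problem is that there seem to be several tables with the same class. My code picks up the data from the first one, but I want the data from another table, which I think is the second. The table I want to download data from is the "Returns" one, not "Overview".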
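Solution
The data is loaded dynamically via JavaScript, so it is not in the HTML that requests.get returns for the screener page. You can use the requests module to load the data from the site's API instead, for example: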
import json
import requests

url = 'https://etfdb.com/api/screener/'
json_data = {"tab":"returns","page":1,"only":["meta","data",None]}
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:77.0) Gecko/20100101 Firefox/77.0', 'Accept':'application/json'}

for page in range(1, 5):  # <-- increase this to desired number of pages
    json_data['page'] = page
    data = requests.post(url, json=json_data, headers=headers).json()

    # uncomment this to print all data:
    # print(json.dumps(data, indent=4))

    # print some data to screen:
    for d in data['data']:
        print('{:<5}{:<50}{:>9}{:>9}{:>9}'.format(d['symbol']['text'], d['name']['text'], d['ytd'], d['one_week_return'], d['four_week_return']))
Prints:
SPY  SPDR S&P 500 ETF                                     -2.19%   -2.44%    7.19%
IVV  iShares Core S&P 500 ETF                             -2.22%   -2.46%    7.20%
VTI  Vanguard Total Stock Market ETF                      -2.58%   -2.45%    7.88%
VOO  Vanguard S&P 500 ETF                                 -2.28%   -2.49%    7.15%
QQQ  Invesco QQQ                                          14.47%   -0.18%    7.05%
AGG  iShares Core U.S. Aggregate Bond ETF                  5.85%    0.48%    0.82%
VEA  Vanguard FTSE Developed Markets ETF                 -10.34%   -2.46%    9.68%
IEFA iShares Core MSCI EAFE ETF                          -10.36%   -2.37%   10.05%
GLD  SPDR Gold Trust                                      13.54%    0.61%   -1.22%
VUG  Vanguard Growth ETF                                  10.29%   -0.73%    7.44%
VWO  Vanguard FTSE Emerging Markets ETF                  -11.31%   -2.48%    7.16%
BND  Vanguard Total Bond Market ETF                        5.84%    0.42%    0.82%
IWF  iShares Russell 1000 Growth ETF                       8.62%   -0.90%    6.99%
... etc.
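Since the question was after a table rather than printed text, here is a minimal sketch of flattening the nested rows into plain records. It assumes each entry of data['data'] has the shape implied by the print statement above (nested 'text' under 'symbol' and 'name', and string values for the return fields). The helper name flatten_rows and the COLUMNS mapping are hypothetical, and only keys that appear in the answer are used; the keys for the 1-year, 3-year and 5-year columns are not shown there, so they are left out.

```python
# Column label -> how to read that value from one row of data['data'].
# Assumption: rows look like the ones the answer's print statement reads.
COLUMNS = {
    "Symbol": lambda d: d["symbol"]["text"],
    "ETF_Name": lambda d: d["name"]["text"],
    "YTD": lambda d: d["ytd"],
    "1_Week": lambda d: d["one_week_return"],
    "4_Week": lambda d: d["four_week_return"],
}


def flatten_rows(rows):
    """Turn nested API rows into flat dicts keyed by column label."""
    return [{label: get(row) for label, get in COLUMNS.items()} for row in rows]


# Sample row built by hand from the first line of the printed output above.
sample = [
    {
        "symbol": {"text": "SPY"},
        "name": {"text": "SPDR S&P 500 ETF"},
        "ytd": "-2.19%",
        "one_week_return": "-2.44%",
        "four_week_return": "7.19%",
    },
]

records = flatten_rows(sample)
print(records[0]["Symbol"], records[0]["YTD"])
```

In the paging loop, the records from each page could be collected with all_records.extend(flatten_rows(data['data'])). If pandas is installed, pd.DataFrame(all_records) then builds a DataFrame with one column per label, since the DataFrame constructor accepts a list of dicts.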