首页 > 解决方案 > 使用漂亮的汤从表格中的行中的单元格获取值显示 {{row.value}} 作为结果

问题描述

使用来自https://www.mubasher.info/countries/eg/stock-prices的 HTML我正在尝试获取股票价格公司,它的值来自 HTML 中表格的原始值,

我在 python 3.7 中尝试了以下代码

from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as bs
import re

quotes_page = 'https://www.mubasher.info/countries/eg/stock-prices'
uClient = uReq(quotes_page)
page_content = uClient.read()
uClient.close()

soup = bs(page_content, 'html.parser')

table = soup.findChildren('table')[0]

rows = table.findChildren('tr')

for row in rows:
    cells = row.findChildren('td')
    for cell in cells:
        cell_content = cell.getText()
        clean_content = re.sub( '\s+', ' ', cell_content).strip()
        print(clean_content)

# 它显示以下结果而不是页面中的实际值

{{row.name | limitTo : 20}}
{{row.value}}
{{row.changePercentage}}
{{row.change}}
{{row.turnover}}
{{row.volume}}
{{row.open}}
{{row.high}}
{{row.low}}

标签: pythonweb-scraping

解决方案


该数据/表格是动态的。它是在初始请求之后呈现的。有一个 API,您可以直接访问源代码:

您可以通过在呈现页面时“检查”页面并找到适当的 XHR 来找到它:

在此处输入图像描述

import requests
from pandas.io.json import json_normalize

url = 'https://www.mubasher.info/api/1/stocks/prices'

headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36'}
payload = {'country': 'eg'}

jsonData = requests.get(url, headers=headers, params=payload).json()

prices_df = json_normalize(jsonData['prices'])

输出:

print (prices_df)
     change changePercentage  ...   value     volume
0      0.43            9.56%  ...    4.93     12,390
1      0.93            8.15%  ...   12.34     47,648
2      0.47            7.21%  ...    6.99         85
3      0.01            7.05%  ...    0.17     20,530
4      0.08            6.73%  ...    1.30     90,177
5      0.77            6.17%  ...   13.25     28,350
6      0.00            5.77%  ...    0.06     69,885
7      0.10            5.03%  ...    2.09    174,638
8      0.21            5.00%  ...    4.41    241,000
9      0.27            4.59%  ...    6.15         50
10     0.30            4.54%  ...    6.91  1,078,168
11     2.01            4.53%  ...   46.39      1,050
12     0.03            4.45%  ...    0.61     67,938
13     0.38            4.21%  ...    9.40     57,518
14     0.43            3.72%  ...   11.98         20
15     0.10            3.66%  ...    2.83    192,764
16     0.60            3.63%  ...   17.15    258,029
17     2.30            3.09%  ...   76.85  3,892,116
18     0.34            3.02%  ...   11.59    132,480
19     0.01            3.00%  ...    0.38    412,967
20     0.22            2.88%  ...    7.87    385,284
21     0.11            2.44%  ...    4.62     11,048
22     0.03            2.36%  ...    1.08     71,343
23     1.64            2.18%  ...   76.75         90
24     0.03            2.18%  ...    1.45    675,330
25     0.24            2.12%  ...   11.54    348,092
26     0.31            2.04%  ...   15.48      4,450
27     0.08            1.81%  ...    4.51  3,393,121
28    15.26            1.68%  ...  925.00         15
29     0.69            1.66%  ...   42.20    585,712
..      ...              ...  ...     ...        ...
131   -0.09           -4.86%  ...    1.67     37,000
132   -0.63           -9.86%  ...    5.76      7,000
133  -12.25         -100.00%  ...   12.10      1,000
134   -0.95         -100.00%  ...    0.93      5,000
135  -44.04         -100.00%  ...   44.05        150
136   -9.01         -100.00%  ...    9.30        850
137   -8.06         -100.00%  ...    8.00      1,055
138   -6.31         -100.00%  ...    6.31        589
139   -3.87         -100.00%  ...    3.81        333
140  -12.92         -100.00%  ...   12.02        325
141   -7.80         -100.00%  ...    7.80        500
142   -0.68         -100.00%  ...    0.68     38,000
143   -0.86         -100.00%  ...    0.82     32,870
144   -0.94         -100.00%  ...    0.93      4,000
145   -1.71         -100.00%  ...    1.70      7,250
146  -38.27         -100.00%  ...   38.13        401
147   -1.20         -100.00%  ...    1.22        480
148   -2.37         -100.00%  ...    2.30     12,755
149  -11.43         -100.00%  ...   11.29      1,115
150   -0.39         -100.00%  ...    0.38     14,000
151   -5.28         -100.00%  ...    5.21      1,200
152   -9.00         -100.00%  ...    9.00      1,510
153   -1.28         -100.00%  ...    1.28      2,000
154   -4.45         -100.00%  ...    4.50      6,350
155  -31.70         -100.00%  ...   32.97        500
156  -14.52         -100.00%  ...   14.41        453
157   -4.60         -100.00%  ...    4.78      3,670
158   -6.79         -100.00%  ...    6.81      4,002
159   -3.84         -100.00%  ...    3.76      5,800
160   -0.80         -100.00%  ...    0.80     14,890

[161 rows x 15 columns]

推荐阅读