How can I parse the coronavirus data from this site?

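Question

I am a beginner in Python and know very little about fetching data from the Internet. The approach I use here is one I used to fetch and print the IMDB Top 250 movies, so I wanted to do the same with this coronavirus data. Unlike with the IMDB data, however, the program does not treat the items as a list, even though I cannot see much difference from the IMDB page. How can I print at least the country names using plain requests and Beautiful Soup like this?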

import requests
from bs4 import BeautifulSoup
url = requests.get("https://www.worldometers.info/coronavirus/")
soup = BeautifulSoup(url.content, "html.parser")
new_soup = soup.find_all("table", {"id":"main_table_countries_today"})
country_table = new_soup[0].contents[3]
country_table = country_table.find_all("tr")
for country in country_table:
    country_name = country.find_all("td", {"style":"font-weight: bold; font-size:15px; text-align:left;"})
    print(country_name[0].text)

Tags: python, python-3.x, python-requests

Answer


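I have been getting the data from the Johns Hopkins University GitHub repository, which is considered a reputable source:

names = ('confirmed', 'deaths', 'recovered')
src_base = 'https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_{name}_global.csv'
# Map each dataset name to its URL; the loops below iterate over this dict.
src = {name: src_base.format(name=name) for name in names}

The data can be fetched with requests: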

import requests


for name, url in src.items():
    response = requests.get(url)

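and conveniently converted to Pandas dataframes:

import io
import pandas as pd
import requests


dfs = {}
for name, url in src.items():
    response = requests.get(url)
    # Parse the downloaded CSV bytes into a DataFrame.
    dfs[name] = pd.read_csv(io.BytesIO(response.content))
    print(name, url)
    print(dfs[name])

Output:
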
confirmed https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv
                Province/State         Country/Region  ...  4/13/20  4/14/20
0                          NaN            Afghanistan  ...      665      714
1                          NaN                Albania  ...      467      475
2                          NaN                Algeria  ...     1983     2070
3                          NaN                Andorra  ...      646      659
4                          NaN                 Angola  ...       19       19
..                         ...                    ...  ...      ...      ...
259  Saint Pierre and Miquelon                 France  ...        1        1
260                        NaN            South Sudan  ...        4        4
261                        NaN         Western Sahara  ...        6        6
262                        NaN  Sao Tome and Principe  ...        4        4
263                        NaN                  Yemen  ...        1        1

[264 rows x 88 columns]
deaths https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_deaths_global.csv
                Province/State         Country/Region  ...  4/13/20  4/14/20
0                          NaN            Afghanistan  ...       21       23
1                          NaN                Albania  ...       23       24
2                          NaN                Algeria  ...      313      326
3                          NaN                Andorra  ...       29       31
4                          NaN                 Angola  ...        2        2
..                         ...                    ...  ...      ...      ...
259  Saint Pierre and Miquelon                 France  ...        0        0
260                        NaN            South Sudan  ...        0        0
261                        NaN         Western Sahara  ...        0        0
262                        NaN  Sao Tome and Principe  ...        0        0
263                        NaN                  Yemen  ...        0        0

[264 rows x 88 columns]
recovered https://raw.githubusercontent.com/CSSEGISandData/COVID-19/master/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_recovered_global.csv
                Province/State         Country/Region  ...  4/13/20  4/14/20
0                          NaN            Afghanistan  ...       32       40
1                          NaN                Albania  ...      232      248
2                          NaN                Algeria  ...      601      691
3                          NaN                Andorra  ...      128      128
4                          NaN                 Angola  ...        4        5
..                         ...                    ...  ...      ...      ...
245  Saint Pierre and Miquelon                 France  ...        0        0
246                        NaN            South Sudan  ...        0        0
247                        NaN         Western Sahara  ...        0        0
248                        NaN  Sao Tome and Principe  ...        0        0
249                        NaN                  Yemen  ...        0        0

[250 rows x 88 columns]
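
Since the question asked for at least the country names, here is a minimal sketch of how they could be read from one of these dataframes. The helper name `country_names` is hypothetical, and the sample CSV is only a small stand-in built from a few rows of the output above so that the snippet runs without a network connection.

```python
import io

import pandas as pd

# Small stand-in for a downloaded file, using rows shown in the output above.
# The real files contain more columns; they are omitted here.
SAMPLE_CSV = """Province/State,Country/Region,4/13/20,4/14/20
,Afghanistan,665,714
,Albania,467,475
,Algeria,1983,2070
Saint Pierre and Miquelon,France,1,1
,Yemen,1,1
"""


def country_names(df):
    """Return the sorted, de-duplicated values of the Country/Region column."""
    # A country split into several provinces occupies several rows,
    # so unique() is used to list each name only once.
    return sorted(df["Country/Region"].dropna().unique())


confirmed = pd.read_csv(io.StringIO(SAMPLE_CSV))
for country in country_names(confirmed):
    print(country)
```

With the real data, the same helper could be called as `country_names(dfs['confirmed'])`.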

In the end you can make some quick plots:

[Plots: confirmed, deaths, recovered]

The full code is available here.
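
The plotting code itself is not shown in this answer, so the following is only a rough sketch of one way such plots might be produced, not the author's method. The helpers `global_totals` and `plot_totals` are hypothetical, the `Lat` and `Long` column names are an assumption about the columns hidden behind `...` in the output, and the sample rows are copied from the confirmed table above.

```python
import io

import pandas as pd

# Stand-in data taken from the confirmed table shown earlier.
SAMPLE_CSV = """Province/State,Country/Region,4/13/20,4/14/20
,Afghanistan,665,714
,Albania,467,475
,Algeria,1983,2070
"""

# Columns that are not dates. Lat and Long are assumed, not shown above.
NON_DATE_COLUMNS = ("Province/State", "Country/Region", "Lat", "Long")


def global_totals(df):
    """Sum every date column over all rows, giving one value per date."""
    date_columns = [c for c in df.columns if c not in NON_DATE_COLUMNS]
    totals = df[date_columns].sum()
    # Column labels such as 4/13/20 are month/day/two-digit-year.
    totals.index = pd.to_datetime(totals.index, format="%m/%d/%y")
    return totals


def plot_totals(totals, title):
    """Draw a simple line plot; matplotlib is imported only when plotting."""
    import matplotlib.pyplot as plt

    ax = totals.plot(title=title)
    ax.set_xlabel("date")
    ax.set_ylabel("count")
    plt.show()


sample = pd.read_csv(io.StringIO(SAMPLE_CSV))
totals = global_totals(sample)
print(totals)
```

With the downloaded data, one plot per dataset could then be drawn with `for name, df in dfs.items(): plot_totals(global_totals(df), name)`.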

