(Python) Beautiful Soup and encoding (utf-8, cp1252, ascii...)

Problem description

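Please help, this is stressing me out. I have run into this problem ever since I started learning Python. It is always the same issue, and nobody online has given an answer that works.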

My code:

from bs4 import BeautifulSoup
import requests

page = requests.get(
    'https://forecast.weather.gov/MapClick.php?lat=34.05349000000007&lon=-118.24531999999999#.XswiwMCxWUk')
soup = BeautifulSoup(page.content, 'html.parser')
week = soup.find(id='seven-day-forecast-body')
items = week.find_all(class_='forecast-tombstone')

print(items[0].find(class_='period-name').get_text())
print(items[0].find(class_='short-desc').get_text())
print(items[0].find(class_='temp temp-high').get_text())

period_names = [item.find(class_='period-name').get_text() for item in items]
short_descp = [item.find(class_='short-desc').get_text() for item in items]
temp = [item.find(class_='temp temp-high').get_text() for item in items]
print(period_names)
print(short_descp)
print(temp)

Output:

[Running] python -u "c:\Users\dukasu\Documents\Python\test.py"
ThisAfternoon
Partly Sunny
High: 76 �F
Traceback (most recent call last):
  File "c:\Users\dukasu\Documents\Python\test.py", line 20, in <module>
    temp = [item.find(class_='temp temp-high').get_text() for item in items]
  File "c:\Users\dukasu\Documents\Python\test.py", line 20, in <listcomp>
    temp = [item.find(class_='temp temp-high').get_text() for item in items]
AttributeError: 'NoneType' object has no attribute 'get_text'

[Done] exited with code=1 in 0.69 seconds

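I think the problem is caused by utf-8 encoding (my computer uses cp1252), but how do I fix it for good? I believe it fails because it cannot handle the degree symbol. There was a simple snippet for this in Python 2, but how do I solve it in Python 3.x? How can I set the encoding at the top of my code and forget about this problem? And please excuse my English, it is not my native language.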

Tags: python, python-3.x, beautifulsoup, utf-8, cp1252

Solution


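The error comes from the class lookup returning None, not from the encoding. Use only class_='temp', not class_='temp temp-high'. Beautiful Soup compares a multi-word class_ string against the element's whole class attribute, so every forecast block whose temperature element is not exactly 'temp temp-high' (the 'Low: ...' periods) makes find() return None, and calling .get_text() on None raises the AttributeError.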

Example

temp = [item.find(class_='temp').get_text() for item in items]
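
The difference between the two lookups can be reproduced offline. The sketch below is only an illustration: the HTML snippet is hypothetical markup modelled on the forecast page (the 'temp temp-low' class in particular is an assumption), and text_or_none is a helper introduced here, not part of Beautiful Soup.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modelled on the forecast page; the real page may differ.
html = """
<div id="seven-day-forecast-body">
  <div class="forecast-tombstone">
    <p class="period-name">Today</p>
    <p class="temp temp-high">High: 76 °F</p>
  </div>
  <div class="forecast-tombstone">
    <p class="period-name">Tonight</p>
    <p class="temp temp-low">Low: 58 °F</p>
  </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
items = soup.find_all(class_="forecast-tombstone")


def text_or_none(tag):
    """Return the tag's text, or None when find() found nothing."""
    return tag.get_text() if tag is not None else None


# A multi-word class_ string must equal the whole class attribute,
# so the "Tonight" block (class="temp temp-low") yields None here.
exact_text = [text_or_none(item.find(class_="temp temp-high")) for item in items]

# A single class name matches any element that carries that class.
loose_text = [text_or_none(item.find(class_="temp")) for item in items]

print(exact_text)
print(loose_text)
```

Checking for None before calling get_text() also keeps the script from crashing if a block has no temperature element at all.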

Full code

from bs4 import BeautifulSoup
import requests

page = requests.get(
    'https://forecast.weather.gov/MapClick.php?lat=34.05349000000007&lon=-118.24531999999999#.XswiwMCxWUk')
soup = BeautifulSoup(page.content, 'html.parser')
week = soup.find(id='seven-day-forecast-body')
items = week.find_all(class_='forecast-tombstone')

print(items[0].find(class_='period-name').get_text())
print(items[0].find(class_='short-desc').get_text())
print(items[0].find(class_='temp').get_text())

period_names = [item.find(class_='period-name').get_text() for item in items]
short_descp = [item.find(class_='short-desc').get_text() for item in items]
temp = [item.find(class_='temp').get_text() for item in items]
print(period_names)
print(short_descp)
print(temp)

This prints:

ThisAfternoon
Partly Sunny
High: 76 °F
['ThisAfternoon', 'Tonight', 'Saturday', 'SaturdayNight', 'Sunday', 'SundayNight', 'Monday', 'MondayNight', 'Tuesday']
['Partly Sunny', 'Patchy Fog', 'Patchy Fogthen MostlySunny', 'Patchy Fog', 'Patchy Fogthen PartlySunny', 'Patchy Fog', 'Patchy Fogthen MostlyCloudy', 'Mostly Cloudy', 'Partly Sunny']
['High: 76 °F', 'Low: 58 °F', 'High: 75 °F', 'Low: 59 °F', 'High: 80 °F', 'Low: 61 °F', 'High: 78 °F', 'Low: 61 °F', 'High: 77 °F']
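
The fix above removes the AttributeError but does not touch the garbled degree sign ('76 �F') in the question's output. One plausible cause, stated here as an assumption, is that Python writes cp1252 bytes while the editor's output panel decodes them as UTF-8. The sketch below reproduces that mismatch in memory and then uses sys.stdout.reconfigure (available since Python 3.7) to make the script emit UTF-8; whether this cures the display depends on what the terminal or output panel expects.

```python
import sys

text = "High: 76 °F"

# Assumed cause: the degree sign is written as the single cp1252 byte 0xB0,
# which is not valid UTF-8 on its own, so a UTF-8 reader shows U+FFFD instead.
raw = text.encode("cp1252")
garbled = raw.decode("utf-8", errors="replace")

# Python 3.7+: switch stdout to UTF-8 once, near the top of the script.
# The hasattr guard covers streams that are not TextIOWrapper objects.
if hasattr(sys.stdout, "reconfigure"):
    sys.stdout.reconfigure(encoding="utf-8")

print(text)
```

Setting the environment variable PYTHONIOENCODING=utf-8 before launching Python is an alternative that needs no change to the code.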
