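python - (Python) Beautiful Soup and encodings (utf-8, cp1252, ascii...)
Problem description
Please help, I'm getting really stressed. I have run into this problem ever since I started learning Python. It is always the same problem, and nobody online has given an answer that works.
My code: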
from bs4 import BeautifulSoup
import requests
page = requests.get(
'https://forecast.weather.gov/MapClick.php?lat=34.05349000000007&lon=-118.24531999999999#.XswiwMCxWUk')
soup = BeautifulSoup(page.content, 'html.parser')
week = soup.find(id='seven-day-forecast-body')
items = week.find_all(class_='forecast-tombstone')
print(items[0].find(class_='period-name').get_text())
print(items[0].find(class_='short-desc').get_text())
print(items[0].find(class_='temp temp-high').get_text())
period_names = [item.find(class_='period-name').get_text() for item in items]
short_descp = [item.find(class_='short-desc').get_text() for item in items]
temp = [item.find(class_='temp temp-high').get_text() for item in items]
print(period_names)
print(short_descp)
print(temp)
Output:
[Running] python -u "c:\Users\dukasu\Documents\Python\test.py"
ThisAfternoon
Partly Sunny
High: 76 �F
Traceback (most recent call last):
File "c:\Users\dukasu\Documents\Python\test.py", line 20, in <module>
temp = [item.find(class_='temp temp-high').get_text() for item in items]
File "c:\Users\dukasu\Documents\Python\test.py", line 20, in <listcomp>
temp = [item.find(class_='temp temp-high').get_text() for item in items]
AttributeError: 'NoneType' object has no attribute 'get_text'
[Done] exited with code=1 in 0.69 seconds
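I think the problem is caused by UTF-8 encoding (my computer uses cp1252), but how do I fix it for good? I suspect it fails because it cannot handle the degree sign. There was a simple fix in Python 2, but how do I solve it in Python 3.x? How can I set the encoding at the top of my code and forget about this problem? And please forgive my English; it is not my native language.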
Solution
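The AttributeError is not an encoding problem. It comes from a class-name lookup that returns None. Use only class_='temp', not class_='temp temp-high'.
The string 'temp temp-high' matches only the items that show a high temperature. The items that show a low (for example 'Low: 58 °F') evidently do not carry that class string, so find() returns None for them and the call to .get_text() fails.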
Example
temp = [item.find(class_='temp').get_text() for item in items]
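To see why the lookup returns None, here is a minimal sketch that runs without any network access. The HTML is a hypothetical, simplified stand-in for the forecast page (the 'temp temp-low' class is my assumption about what the low-temperature items carry), so treat it as an illustration rather than the page's real markup.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified markup modeled on the forecast page.
# The "temp temp-low" class is an assumption for illustration.
html = """
<div id="seven-day-forecast-body">
  <div class="forecast-tombstone">
    <p class="period-name">Today</p>
    <p class="temp temp-high">High: 76 °F</p>
  </div>
  <div class="forecast-tombstone">
    <p class="period-name">Tonight</p>
    <p class="temp temp-low">Low: 58 °F</p>
  </div>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
items = soup.find_all(class_="forecast-tombstone")

# Matching the full string "temp temp-high" only succeeds for the first item;
# the second item yields None.
exact = [item.find(class_="temp temp-high") for item in items]

# Matching the single class "temp" succeeds for both items.
# The None check keeps the loop from crashing if an item has no temperature.
temps = []
for item in items:
    tag = item.find(class_="temp")
    temps.append(tag.get_text() if tag is not None else None)

print(exact[1])
print(temps)
```

Guarding against None is optional here, but it turns a crash into a visible gap in the list if the page layout ever changes.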
Full code
from bs4 import BeautifulSoup
import requests
page = requests.get(
'https://forecast.weather.gov/MapClick.php?lat=34.05349000000007&lon=-118.24531999999999#.XswiwMCxWUk')
soup = BeautifulSoup(page.content, 'html.parser')
week = soup.find(id='seven-day-forecast-body')
items = week.find_all(class_='forecast-tombstone')
print(items[0].find(class_='period-name').get_text())
print(items[0].find(class_='short-desc').get_text())
print(items[0].find(class_='temp temp-high').get_text())
period_names = [item.find(class_='period-name').get_text() for item in items]
short_descp = [item.find(class_='short-desc').get_text() for item in items]
temp = [item.find(class_='temp').get_text() for item in items]
print(period_names)
print(short_descp)
print(temp)
This prints
ThisAfternoon
Partly Sunny
High: 76 °F
['ThisAfternoon', 'Tonight', 'Saturday', 'SaturdayNight', 'Sunday', 'SundayNight', 'Monday', 'MondayNight', 'Tuesday']
['Partly Sunny', 'Patchy Fog', 'Patchy Fogthen MostlySunny', 'Patchy Fog', 'Patchy Fogthen PartlySunny', 'Patchy Fog', 'Patchy Fogthen MostlyCloudy', 'Mostly Cloudy', 'Partly Sunny']
['High: 76 °F', 'Low: 58 °F', 'High: 75 °F', 'Low: 59 °F', 'High: 80 °F', 'Low: 61 °F', 'High: 78 °F', 'Low: 61 °F', 'High: 77 °F']
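On the encoding part of the question: the garbled character in the first run ('High: 76 �F') looks like a separate display issue, not the cause of the traceback. My guess, and it is only an assumption, is that Python wrote the degree sign as a cp1252 byte while the output pane read it as UTF-8. A rough sketch of that mismatch, and of one way to switch stdout to UTF-8 on Python 3.7+:

```python
import sys

degree = "\u00b0"  # the degree sign

# The same character becomes different bytes in each encoding.
utf8_bytes = degree.encode("utf-8")      # two bytes
cp1252_bytes = degree.encode("cp1252")   # one byte

# A lone 0xB0 byte is not valid UTF-8, so a UTF-8 reader
# substitutes the replacement character U+FFFD.
misread = cp1252_bytes.decode("utf-8", errors="replace")

# Python 3.7+: ask stdout to encode as UTF-8, if the stream supports it.
if hasattr(sys.stdout, "reconfigure"):
    sys.stdout.reconfigure(encoding="utf-8")

print("High: 76 " + degree + "F")
```

If you would rather not touch the script, setting the PYTHONIOENCODING=utf-8 environment variable or running with python -X utf8 should have a similar effect on the standard streams.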