Interacting with HTML variables in BeautifulSoup

Problem description

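I wrote code that fetches hotel names and prices from a specific booking.com URL. I want the tool to output the name and price of only the one hotel I'm looking for. I can print the names and prices of all the hotels on the page, but when I add an if statement to print a single one, it doesn't work. I tried wrapping the code that selects the hotel name and price in str(), but that produces no output at all. The current code only prints "Wrong Hotel". Can I not manipulate the variables once they have been scraped? I ask because I also want to compare hotel prices.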

from bs4 import BeautifulSoup
import requests

headers = {'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.128 Safari/537.36'}

url = 'https://www.booking.com/searchresults.en-gb.html?aid=355028&sid=d2a902f346650dc0b748848763652bdc&sb=1&src=searchresults&src_elem=sb&error_url=https%3A%2F%2Fwww.booking.com%2Fsearchresults.en-gb.html%3Faid%3D355028%3Bsid%3Dd2a902f346650dc0b748848763652bdc%3Btmpl%3Dsearchresults%3Bcheckin_month%3D5%3Bcheckin_monthday%3D8%3Bcheckin_year%3D2021%3Bcheckout_month%3D5%3Bcheckout_monthday%3D13%3Bcheckout_year%3D2021%3Bcity%3D-2601889%3Bclass_interval%3D1%3Bdest_id%3D-2601889%3Bdest_type%3Dcity%3Bdtdisc%3D0%3Bfrom_sf%3D1%3Bgroup_adults%3D1%3Bgroup_children%3D0%3Binac%3D0%3Bindex_postcard%3D0%3Blabel_click%3Dundef%3Bno_rooms%3D1%3Boffset%3D0%3Bpostcard%3D0%3Broom1%3DA%3Bsb_price_type%3Dtotal%3Bshw_aparth%3D1%3Bslp_r_match%3D0%3Bsrc%3Dsearchresults%3Bsrc_elem%3Dsb%3Bsrpvid%3D6eda76c5afe000a5%3Bss%3DLondon%3Bss_all%3D0%3Bssb%3Dempty%3Bsshis%3D0%3Bssne%3DLondon%3Bssne_untouched%3DLondon%3Btop_ufis%3D1%3Bsig%3Dv1yWyN9mHA%3B&ss=London+Marriott+Hotel+County+Hall%2C+London%2C+Greater+London%2C+United+Kingdom&is_ski_area=&ssne=London&ssne_untouched=London&city=-2601889&checkin_year=2021&checkin_month=5&checkin_monthday=8&checkout_year=2021&checkout_month=5&checkout_monthday=13&group_adults=1&group_children=0&no_rooms=1&from_sf=1&ss_raw=Marriott+London&ac_position=1&ac_langcode=en&ac_click_type=b&dest_id=36867&dest_type=hotel&place_id_lat=51.5010959924622&place_id_lon=-0.119165182113647&search_pageview_id=6eda76c5afe000a5&search_selected=true&search_pageview_id=6eda76c5afe000a5&ac_suggestion_list_length=5&ac_suggestion_theme_list_length=0'


response=requests.get(url, headers=headers)

soup=BeautifulSoup(response.content, "lxml")


for item in soup.select('.sr_property_block'):
    try:
        hotelname = item.select('.sr-hotel__name')[0].get_text()
        hotelprice = item.select('.bui-price-display__value')[0].get_text()

        if hotelname == 'London Marriott Hotel County Hall':
            print(hotelname)
            print(hotelprice)
        else:
            print('Wrong Hotel')        
       
        #print('---------------')
        
    except Exception as e:
        print('')

Tags: python, web-scraping, beautifulsoup

Solution


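After scraping, hotelname contains two extra whitespace characters: one leading and one trailing. Because == compares the strings exactly, the padded value never matches 'London Marriott Hotel County Hall', so the else branch runs and prints "Wrong Hotel". Use strip() to remove the surrounding whitespace from hotelname before comparing.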

hotelname = hotelname.strip()
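The question also asks about comparing prices, which needs the same kind of cleanup plus a conversion from text to a number. The following is a minimal sketch, not the asker's code: the helper names clean_text and parse_price are hypothetical, and the sample strings are made-up stand-ins for what get_text() might return, assuming prices look like "£1,250".

```python
import re
from decimal import Decimal


def clean_text(raw):
    """Trim the ends of scraped text and collapse inner whitespace runs."""
    return " ".join(raw.split())


def parse_price(raw):
    """Pull the first numeric amount out of a price string (assumed format: '£1,250')."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", raw)
    if match is None:
        raise ValueError(f"no numeric amount in {raw!r}")
    # Drop thousands separators before converting.
    return Decimal(match.group().replace(",", ""))


# Hypothetical scraped values, padded with whitespace like the real ones.
scraped = [
    ("\nLondon Marriott Hotel County Hall\n", "\n£1,250\n"),
    ("\nSome Other Hotel\n", "\n£980\n"),
]

target = "London Marriott Hotel County Hall"
prices = {clean_text(name): parse_price(price) for name, price in scraped}

if target in prices:
    print(target, prices[target])
```

Printing repr(hotelname) before the comparison is a quick way to confirm the stray whitespace, since repr shows characters such as \n that print hides.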

