Scraping a popup window that has a URL with Beautiful Soup (and other errors)

Problem description

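I am working on a science project that scrapes skyward.smsd.org. The site opens in a popup window, but there is a URL at the top of the page. When I go to that URL directly, it does not open in a popup; instead it says my session has expired, and I cannot find a workaround. I am also getting an invalid syntax error at else: msg. I would appreciate any help solving these problems.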

while True:

    import requests
    from bs4 import BeautifulSoup
    import time
    from time import sleep
    url = "https://skyward.smsd.org/scripts/wsisa.dll/WService=wsEAplus/sfcalendar002.w"

    headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/39.0.2171.95 Safari/537.36'}

    response = requests.get(url, headers=headers)

    soup = BeautifulSoup(response.text, "lxml")
from requests.packages.urllib3 import add_stderr_logger

add_stderr_logger()
s = requests.Session()

s.headers['User-Agent'] = 'Mozilla/5.0'

login = {login: 3078774, password: (MY PASSWORD)}
login_response = s.post(url, data=login)
for r in login_response.history:
    if r.status_code == 401:  # 401 means authentication failed
        sys.exit(1)  # abort

pdf_response = s.get(pdf_url)  # Your cookies and headers are automatically included

if str(soup).find("skyward") == -1:
    continue

time.sleep(60)



else:
     msg = 'Subject: This is the script talking, check Skyward'

# Possibility: make this tell you exactly what changed
# A text feature that goes out daily for missing assignments
fromaddr = '3078774@smsd.org'

toaddrs  = ['3078774@smsd.org']



print('From: ' + fromaddr)
print('To: ' + str(toaddrs))
print('Message: ' + msg)

break

Tags: python, beautifulsoup

Solution
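
The invalid syntax error comes from the layout of the script. The else: line sits at column 0 directly after time.sleep(60), so it is not attached to any if. In addition, the body of while True: ends at the first unindented line (the from requests.packages... import), which leaves continue and break outside the loop. The "session has expired" page is probably returned because the plain requests.get call does not carry the cookies created at login, although that is a guess about how the site behaves. The sketch below restructures the script under these assumptions: the intent is "check the page, wait 60 seconds if the keyword is missing, otherwise build the message and stop", and a single session object is used for both the login POST and the later GET. The names make_fetcher, wait_for_keyword, LOGIN_URL and PAGE_URL are hypothetical, and the real Skyward login form may need different field names or extra hidden fields.

```python
import time


def make_fetcher(session, login_url, page_url, credentials):
    """Log in once, then return a function that fetches page_url.

    session is expected to behave like requests.Session: the cookies set by
    the login POST are reused automatically on every later GET.
    """
    response = session.post(login_url, data=credentials)
    response.raise_for_status()  # stop early if the login request failed

    def fetch():
        page = session.get(page_url)
        page.raise_for_status()
        return page.text

    return fetch


def wait_for_keyword(fetch, keyword, interval=60, sleep=time.sleep, max_checks=None):
    """Poll fetch() until the page source contains keyword.

    Returns the notification message, or None if max_checks runs out first.
    """
    checks = 0
    while max_checks is None or checks < max_checks:
        checks += 1
        html = fetch()
        if keyword not in html:
            # Keyword missing: wait, then go around the loop again.
            sleep(interval)
            continue
        else:
            # This else is paired with the if directly above it,
            # and both live inside the while loop.
            return "Subject: This is the script talking, check Skyward"
    return None


# Example wiring (not executed here; needs the third-party requests package,
# and LOGIN_URL / PAGE_URL are placeholders you would have to fill in):
#   import requests
#   session = requests.Session()
#   session.headers["User-Agent"] = "Mozilla/5.0"
#   fetch = make_fetcher(session, LOGIN_URL, PAGE_URL,
#                        {"login": "...", "password": "..."})
#   print(wait_for_keyword(fetch, "skyward"))
```

Passing fetch and sleep in as arguments keeps the loop independent of the network, so the control flow can be checked with a fake page before pointing it at the real site. If the page content is needed as parsed HTML rather than raw text, the string returned by fetch() can be handed to BeautifulSoup as in the original script.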

