python - MissingSchema error from the requests module in a tkinter GUI because a variable cannot be populated before the main loop runs: how do I solve this?
Problem description
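I am trying to build a GUI on top of some existing code, but I am running into a MissingSchema error. I understand the general problem, but not the best solution.
Basically, before tkinter's mainloop() I try to make a requests call to create a BeautifulSoup object that many of my functions need. To make that request, however, I need a url variable populated with the URL chosen by the user, and that variable cannot be populated until after mainloop() has started running. As a result, the requests call fails because url is empty, which gives me the MissingSchema error. You can run the code below to see what I mean: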
from tkinter import *
from tkinter import scrolledtext as st
import requests
import re
from bs4 import BeautifulSoup

root = Tk()
url_entry = Entry(root)
url = url_entry.get()
log_text = st.ScrolledText(root, state='disabled')
start_button = Button(root, text='Run program', command=lambda: [seo_find_stopwords(urlSoup)])
url_entry.grid(column=0, row=1)
log_text.grid(column=2, row=0, rowspan=3)
start_button.grid(column=1, row=5)
agent = "Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:10.0) Gecko/20100101 Firefox/10.0"

# attempts to access provided URL, returns errors if unable
try:
    # 'agent' added as part of effort to avoid HTTP Error 403: Forbidden
    url_request = requests.get(url, headers={'User-Agent': agent})
    url_request.raise_for_status()
    urlSoup = BeautifulSoup(url_request.text, 'lxml')
except requests.exceptions.MissingSchema as exc:
    log_text.insert(INSERT, "ERROR: Invalid URL provided. Please try again with a valid URL.")
    raise exc

# searches HTML page title for SEO stop words from stopwords.txt, then provides number and list of present stop words
def seo_find_stopwords(urlSoup):
    stopwords_count = 0
    stopwords_list = []
    if urlSoup.title:
        with open('stopwords.txt', 'r', encoding='utf-8') as file:
            for line in file:
                if re.search(r'\b' + line.rstrip('\n') + r'\b', urlSoup.title.text.casefold()):
                    stopwords_count += 1
                    stopwords_list.append(line.rstrip('\n'))
        if stopwords_count > 0:
            log_text.insert(INSERT, "{0} stop words were found in your page title. If possible, it would be good to "
                            "reduce them. The stop words found are: {1}".format(stopwords_count, stopwords_list))

root.mainloop()
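Sorry if this is a bit long; I tried to condense it as much as I could. I would like to know the best way to correct this error. My impression is that it might be to move the part that makes the requests.get() call into a function and have it return urlSoup in some way for use in the functions that need it.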
Solution
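You are trying to fetch the URL before the user has had a chance to enter anything. Instead, put the URL request in a function and call it once the Entry widget contains text, for example by binding it as an event handler on the widget or by calling it from the button's command.
Here is a demo. (After typing text into the Entry widget, you can either press Enter or click the Run button.)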
from tkinter import *
from tkinter import scrolledtext as st
import re
import requests
from bs4 import BeautifulSoup

agent = "Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:10.0) Gecko/20100101 Firefox/10.0"
urlSoup = None  # populated by request_url() once the user has supplied a URL

# log_text is created with state='disabled', and a disabled Text widget ignores insert(),
# so it is enabled briefly for each message
def log(message):
    log_text.configure(state='normal')
    log_text.insert(END, message + "\n")
    log_text.configure(state='disabled')

# searches HTML page title for SEO stop words from stopwords.txt, then provides number and list of present stop words
def seo_find_stopwords(urlSoup):
    stopwords_count = 0
    stopwords_list = []
    if urlSoup is not None and urlSoup.title:
        title = urlSoup.title.text.casefold()
        with open('stopwords.txt', 'r', encoding='utf-8') as file:
            for line in file:
                word = line.strip()
                # re.escape keeps punctuation in a stop word from being read as regex syntax
                if word and re.search(r'\b' + re.escape(word) + r'\b', title):
                    stopwords_count += 1
                    stopwords_list.append(word)
        if stopwords_count > 0:
            log("{0} stop words were found in your page title. If possible, it would be good to "
                "reduce them. The stop words found are: {1}".format(stopwords_count, stopwords_list))

# attempts to access the URL currently in the Entry widget, logs an error if it has no schema
def request_url(event=None):
    global urlSoup
    urlSoup = None  # discard any result left over from a previous URL
    try:
        # 'agent' added as part of effort to avoid HTTP Error 403: Forbidden
        url_request = requests.get(url_entry.get(), headers={'User-Agent': agent})
        url_request.raise_for_status()
        urlSoup = BeautifulSoup(url_request.text, 'lxml')
    except requests.exceptions.MissingSchema:
        log("ERROR: Invalid URL provided. Please try again with a valid URL.")

# button handler: fetch the page first, then analyse it (nothing is analysed if the fetch failed)
def run_program():
    request_url()
    seo_find_stopwords(urlSoup)

root = Tk()
url_entry = Entry(root)
url_entry.bind('<Return>', request_url)
log_text = st.ScrolledText(root, state='disabled')
start_button = Button(root, text='Run program', command=run_program)
url_entry.grid(column=0, row=1)
log_text.grid(column=2, row=0, rowspan=3)
start_button.grid(column=1, row=5)
root.mainloop()
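As a possible complement to catching MissingSchema, the text from the Entry widget could be checked before requests.get() is called at all. The following is only a minimal sketch using the standard library; has_http_scheme is a hypothetical helper name that does not appear in the original code, and the example URLs are placeholders.

```python
from urllib.parse import urlparse


def has_http_scheme(url):
    """Return True if url looks like an absolute http(s) URL (hypothetical helper)."""
    parts = urlparse(url.strip())
    # Both a scheme and a host are required; a bare 'example.com' has neither.
    return parts.scheme in ("http", "https") and bool(parts.netloc)


# An empty Entry yields '', which is the value that triggered MissingSchema in the question.
for candidate in ["", "example.com", "https://example.com/page"]:
    print(repr(candidate), has_http_scheme(candidate))
```

Under these assumptions, request_url() could call has_http_scheme(url_entry.get()) first and write the error message to the log when it returns False, so that no request is attempted for an empty or schema-less string.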