Python threads seem to be frozen

Problem description

I have a web crawler. Here is its code.

# I truncated this code because of the comment.
from threading import Thread

no_of_threads = int(input("No. of threads : "))
if no_of_threads > 70:
    no_of_threads = 70
threads = []



def log(data, thid, info=True):
    if info:
        # print information to the user
        ...
    else:
        # inform the user about the error
        ...


def crawl(t_id):
    # The long-running loop: it runs until there are no items left in the list of URLs to be scraped
    while True:
        # do some complex operations like getting the title, description, etc.
        ...

lines = []  # placeholder: a list of one million links read from a file (loading code omitted)

for line in lines:
    # get each link and put it in the list of URLs to be scraped
    ...

# Create and start as many threads as the user asked for
for t_id in range(no_of_threads):
    t = Thread(target=crawl, args=(t_id,))
    t.start()
    threads.append(t)

for thread in threads:
    thread.join()

It instantiates the number of threads entered by the user. Then, in an infinite while loop, each thread crawls URLs.

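I ran it for about 4 hours. At first all the threads were working, but after 4 hours only one thread was still working. Eventually that one became inactive too. Why is that?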

Tags: python, multithreading, python-requests

Solution
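
The truncated code does not reveal the cause, so what follows is only a sketch of two things commonly checked in this situation, not a confirmed diagnosis. In Python, an exception that escapes a thread's target function ends that thread while the rest of the program keeps running, and a blocking network call made without a timeout can wait indefinitely. Either would look like threads going quiet one by one. The sketch assumes the URLs are handed out through a `queue.Queue` and that each page is processed by a hypothetical `fetch(url)` callable; `fetch`, `run_crawler`, `results`, and `errors` are illustrative names that do not come from the question.

```python
import queue
import threading


def crawl(t_id, url_queue, fetch, results, errors):
    """Worker loop: take URLs until the queue is empty.

    `fetch` is a hypothetical callable standing in for the real download
    and parsing step; it is an assumption, not code from the question.
    """
    while True:
        try:
            url = url_queue.get_nowait()
        except queue.Empty:
            return  # no URLs left, so this thread finishes normally
        try:
            results.append((t_id, url, fetch(url)))
        except Exception as exc:
            # Record the failure and move on; without this handler, one bad
            # URL would raise out of crawl() and end the thread.
            errors.append((t_id, url, repr(exc)))


def run_crawler(urls, fetch, no_of_threads=4):
    """Start the workers, wait for them, and return (results, errors)."""
    url_queue = queue.Queue()
    for url in urls:
        url_queue.put(url)

    results, errors = [], []  # appended to from several threads
    threads = [
        threading.Thread(target=crawl, args=(i, url_queue, fetch, results, errors))
        for i in range(no_of_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results, errors
```

In a real crawler, `fetch` might wrap something like `requests.get(url, timeout=10)`, so that a stalled server raises an exception instead of blocking the thread forever; the value 10 is only illustrative. Inspecting `errors` after a run would then show whether the threads were failing or hanging.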

