Python - Run part of the code multiple times at once

Problem description

I have this code:

import csv
import requests

configurationsFile = "file.csv"
configurations = []


def loadConfigurations():
    with open(configurationsFile) as csv_file:
        csv_reader = csv.reader(csv_file, delimiter=';')
        line_count = 0
        for row in csv_reader:
            url = row[0]
            line_count += 1
            configurations.append({"url": url})
        print(f'{line_count} urls loaded.')

loadConfigurations()


failedConfigs = []

session_requests = requests.session()

for config in configurations:
    try:
        "Do something with the url loaded from file.csv"

    except Exception as e:
        print(e)
        failedConfigs.append(config)

if len(failedConfigs) > 0:
    print("These errored out:")
    for theConfig in failedConfigs:
        print("ERROR: {}".format(theConfig['url']))

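It reads URLs from a CSV file and then runs the code for each URL listed in the file. The only "problem" is that if the CSV file contains a lot of URLs, it takes a long time to run through them. So I'm looking for a way to process more than one URL at a time.

I'm not that good with Python, so I don't even know if this is possible. But the question is: is there some way to tell the code to process, say, 5 URLs at a time instead of just 1?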

Tags: python

Solution


You can use the threading.Thread class. Here is an example:

from threading import Thread

def read(file, start, end):
    with open(file, 'r') as r:
        for i, v in enumerate(r):
            if start <= i < end:
                print(v)

file = "file.txt"

t1 = Thread(target=read, args=(file, 0, 100))
t2 = Thread(target=read, args=(file, 100, 200))
t3 = Thread(target=read, args=(file, 200, 300))
t4 = Thread(target=read, args=(file, 300, 400))
t5 = Thread(target=read, args=(file, 400, 500))

t1.start()
t2.start()
t3.start()
t4.start()
t5.start()

t1.join()
t2.join()
t3.join()
t4.join()
t5.join()

Or use a loop:

from threading import Thread

def read(file, start, end):
    with open(file, 'r') as r:
        for i, v in enumerate(r):
            if start <= i < end:
                print(v)

file = "file.txt"

threads = []
for i in range(5):
    threads.append(Thread(target=read, args=(file, i * 100, (i + 1) * 100)))
for t in threads:
    t.start()
for t in threads:
    t.join()

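Basically, the read() function defined above reads the file line by line and prints only the lines whose index falls between start (inclusive) and end (exclusive). The reading task is split into 5 segments of 100 lines each, so that 5 threads can read the file at the same time.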
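
As a rough, self-contained sketch of the same splitting idea (not part of the original answer): the helpers below, with the hypothetical names read_segment and read_in_segments, compute the segment size from the line count instead of hard-coding 100, and have each thread collect its lines into its own list so the combined result keeps file order. It assumes the caller already knows the total number of lines.

```python
from threading import Thread


def read_segment(path, start, end, out):
    """Append the lines with index in [start, end) to the list `out`."""
    with open(path, "r") as f:
        for i, line in enumerate(f):
            if i >= end:
                break  # past this segment, nothing more to collect
            if i >= start:
                out.append(line.rstrip("\n"))


def read_in_segments(path, total_lines, num_threads=5):
    """Read `path` with `num_threads` threads; return its lines in file order."""
    # Ceiling division, so the segments cover every line
    size = -(-total_lines // num_threads)
    results = [[] for _ in range(num_threads)]  # one private list per thread
    threads = [
        Thread(target=read_segment,
               args=(path, i * size, (i + 1) * size, results[i]))
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Concatenate the per-thread lists in segment order
    return [line for segment in results for line in segment]
```

Note that every thread still opens the file and scans it from the beginning, so this illustrates how the work is divided rather than a way to make file reading itself faster.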


Update upon request

In your code, the loop

for config in configurations:
    try:
        "Do something with the url loaded from file.csv"

    except Exception as e:
        print(e)
        failedConfigs.append(config)

can be converted into a function that lets you specify which range of indices in configurations to process:

def process(start, end):
    for i in range(start, end):
        config = configurations[i]
        try:
            "Do something with the url loaded from file.csv"

        except Exception as e:
            print(e)
            failedConfigs.append(config)

Then you can add

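num_threads = 5
total = len(configurations)
# Ceiling division, so the chunks cover every entry in configurations
chunk = -(-total // num_threads)
threads = []
for i in range(num_threads):
    threads.append(Thread(target=process, args=(i * chunk, min((i + 1) * chunk, total))))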
for t in threads:
    t.start()
for t in threads:
    t.join()

So you might end up with something like:

import csv
from threading import Thread

import requests

configurationsFile = "file.csv"
configurations = []


def loadConfigurations():
    with open(configurationsFile) as csv_file:
        csv_reader = csv.reader(csv_file, delimiter=';')
        line_count = 0
        for row in csv_reader:
            url = row[0]
            line_count += 1
            configurations.append({"url": url})
        print(f'{line_count} urls loaded.')

loadConfigurations()

failedConfigs = []

session_requests = requests.session()

def process(start, end):
    for i in range(start, end):
        config = configurations[i]
        try:
            "Do something with the url loaded from file.csv"

        except Exception as e:
            print(e)
            failedConfigs.append(config)

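num_threads = 5
total = len(configurations)
# Ceiling division, so the chunks cover every entry in configurations
chunk = -(-total // num_threads)
threads = []
for i in range(num_threads):
    threads.append(Thread(target=process, args=(i * chunk, min((i + 1) * chunk, total))))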
for t in threads:
    t.start()
for t in threads:
    t.join()

if len(failedConfigs) > 0:
    print("These errored out:")
    for theConfig in failedConfigs:
        print("ERROR: {}".format(theConfig['url']))
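
If the goal is simply "at most 5 URLs in flight at once", the standard library's concurrent.futures.ThreadPoolExecutor is an alternative to creating Thread objects by hand. The following is a minimal sketch, not the approach from the answer above: run_all and handle are hypothetical names, and handle only stands in for the real per-URL work.

```python
from concurrent.futures import ThreadPoolExecutor


def handle(config):
    # Hypothetical placeholder for "do something with the url"
    if not config["url"].startswith("http"):
        raise ValueError(f"bad url: {config['url']}")
    return config["url"]


def run_all(configs, worker, max_workers=5):
    """Call worker(config) for every config, at most max_workers at a time.

    Returns the list of configs whose worker call raised an exception.
    """
    failed = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Remember which config each future belongs to
        futures = {pool.submit(worker, c): c for c in configs}
        for future, config in futures.items():
            try:
                future.result()  # re-raises any exception from the worker
            except Exception as e:
                print(e)
                failed.append(config)
    return failed
```

With this shape there is no need to compute start and end indices: the pool hands the next config to whichever worker thread becomes free, and the failures are gathered in the calling thread rather than appended from several threads.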
