Why does this nested for loop iterate twice in the second loop and then not at all?

Problem description

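I'm trying to process some data by running it through the function defined below. The program appears to run fine, but the loops don't iterate as many times as I expect.

It doesn't seem to matter where I put the return statement, as long as it is inside the function and not under the if statement.

I tried writing rows independently under each for loop, and in each case the expected number of rows was written.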

import csv

def _ManhattanDistance(x,y):
    a = 0
    for i in range(0,len(x)):
        a += abs(float(x[i])-float(y[i]))
    return a

def _CabFare(x,y,z):
    with open(x, 'r') as f:
        with open(y, 'r') as g:
            with open(z, 'wb') as h:
                reader_1 = csv.reader(f)
                reader_2 = csv.reader(g)
                writer = csv.writer(h)
                for row_b in reader_2:
                    for row_a in reader_1:
                        if _ManhattanDistance(row_a,row_b) > 0:
                            writer.writerow(row_a)
                            writer.writerow(row_b)
                return

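For reference, given my input, reader_1 should have 200 rows and reader_2 should have 17145 rows. Since our inclusion threshold is zero, I expected 17145*200 = 3429000 rows in the output file; what I get is an output of 400 rows.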

Tags: python, csv

Solution


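csv.reader returns a one-shot iterator. The inner loop consumes reader_1 completely during the first pass of the outer loop, so for every later row of reader_2 the inner loop has nothing left to iterate over; that is why only 200 pairs (400 rows) are written.

This seems to work: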

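import csv
from itertools import product

def _CabFare(x,y,z):
    # Python 3: text mode with newline='' replaces the Python 2 'wb' mode.
    with open(x, 'r', newline='') as f, open(y, 'r', newline='') as g, open(z, 'w', newline='') as h:
        writer = csv.writer(h)
        # product() reads both readers fully into memory before pairing rows.
        for row_a, row_b in product(csv.reader(f), csv.reader(g)):
            if _ManhattanDistance(row_a, row_b) > 0:
                writer.writerow(row_a)
                writer.writerow(row_b)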
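
To see why the original nested loops stop early, here is a minimal sketch of the same loop structure. The in-memory data (io.StringIO objects named small and large) is hypothetical and only stands in for the real files; the point is that the inner reader is empty after its first full pass.

```python
import csv
import io

# Hypothetical in-memory stand-ins for the two CSV files.
small = io.StringIO("1,2\n3,4\n")
large = io.StringIO("0,0\n5,5\n9,9\n")

reader_1 = csv.reader(small)
reader_2 = csv.reader(large)

pairs = []
for row_b in reader_2:
    # reader_1 is consumed completely on the first outer pass;
    # on later passes this inner loop body never runs.
    for row_a in reader_1:
        pairs.append((row_a, row_b))

# Only the first row of reader_2 was paired with the rows of reader_1.
print(len(pairs))
print(list(reader_1))  # the exhausted reader yields nothing more
```

itertools.product avoids this because it copies each input into a tuple before producing any pairs, at the cost of holding both files in memory.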

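Slower than the itertools.product version, but uses less memory, because it re-reads the second file for each row of the first instead of holding both files in memory: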

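import csv

def _CabFare(x,y,z):
    # Python 3: text mode with newline='' replaces the Python 2 'wb' mode.
    with open(x, 'r', newline='') as f, open(z, 'w', newline='') as h:
        writer = csv.writer(h)
        for row_a in csv.reader(f):
            # Reopen y for each row_a so the inner reader starts from the top.
            with open(y, 'r', newline='') as g:
                for row_b in csv.reader(g):
                    if _ManhattanDistance(row_a, row_b) > 0:
                        writer.writerow(row_a)
                        writer.writerow(row_b)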
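
A possible middle ground, sketched here under the assumption that one of the two files is small enough to keep in memory (the question mentions 200 rows): read that file into a list once and stream the other. The names cab_fare_cached, manhattan_distance, small_path, large_path and out_path are hypothetical and not from the original post, and the sketch assumes Python 3.

```python
import csv

def manhattan_distance(x, y):
    # Sum of absolute coordinate differences, like _ManhattanDistance.
    # Note: zip() stops at the shorter row instead of raising IndexError.
    return sum(abs(float(a) - float(b)) for a, b in zip(x, y))

def cab_fare_cached(small_path, large_path, out_path):
    # Cache the smaller file as a list so it can be iterated repeatedly.
    with open(small_path, newline='') as f:
        small_rows = list(csv.reader(f))

    # Stream the larger file once; only one of its rows is held at a time.
    with open(large_path, newline='') as g, \
         open(out_path, 'w', newline='') as h:
        writer = csv.writer(h)
        for row_b in csv.reader(g):
            for row_a in small_rows:
                if manhattan_distance(row_a, row_b) > 0:
                    writer.writerow(row_a)
                    writer.writerow(row_b)
```

Unlike a reader, a list can be looped over any number of times, so the larger file is opened and parsed only once.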
