Optimize the time it takes to read a txt file

Problem description

I would like to reduce the time it takes to read a txt file. The file contains x and y coordinates like this:

{52.52, 13.38}
{53.55, 10.}
{48.14, 11.58}
{50.95, 6.97}
...

At the moment it takes approximately 0.06 s to read and compute the real positions for 12,000 coordinates, but I would like to do it in half that time.

import numpy as np
from math import pi


def read_coordinate_file(filename):
    points = []
    file = open(filename, "r")
    for line in file:
        a, b = line.strip("{}\n").split(",")
        points.append([get_x(float(b)), get_y(float(a))])
    file.close()
    return np.array(points)


def get_x(b, R=1):
    return R * (pi * b) / 180


def get_y(a, R=1):
    temp = 90 * pi + pi * a
    return R * np.log(np.tan(temp / 360))

If I have understood it correctly, this could be done with numpy arrays. I have tried np.loadtxt, but it is slower than my current code. Is there any way to reduce the time for this?
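For reference, one plausible way to apply np.loadtxt to this format is with per-column converters that strip the braces; this is a sketch, and read_with_loadtxt and _unbrace are illustrative names, not code from the post. The per-field Python-level converter calls are a likely reason this path ends up slower than the hand-written loop:

import numpy as np


def _unbrace(token):
    # Depending on the NumPy version, converters receive bytes or str.
    if isinstance(token, bytes):
        token = token.decode()
    return float(token.strip(' {}'))


def read_with_loadtxt(filename):
    # Every field still goes through a Python-level converter call,
    # which is the likely bottleneck.
    return np.loadtxt(filename, delimiter=',',
                      converters={0: _unbrace, 1: _unbrace})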

Tags: python-3.x, numpy

Solution


I would definitely agree with the comments that doing all of the calculations in Numpy should be faster:

import numpy as np
from math import pi


def read_coordinate_file(filename):
    with open(filename, "r") as f:
        # Each line is "{a, b}": x is computed from the second value and y
        # from the first, matching the original get_x/get_y calls.
        points = [tuple(map(float, reversed(line.strip("{}\n").split(','))))
                  for line in f if line.strip()]
    arr = np.array(points, dtype=[('x', '<f4'), ('y', '<f4')])
    # Vectorised equivalents of get_x and get_y (with R = 1).
    arr['x'] = arr['x'] * pi / 180
    arr['y'] = np.log(np.tan((90 * pi + pi * arr['y']) / 360))
    return arr


print(read_coordinate_file('data.txt'))

I don't have a dataset to test with, so I can't verify that this is necessarily faster, but it at least moves the calculations into Numpy.
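A minimal timing harness along these lines could check the claim; it assumes read_coordinate_file above is in scope, and make_test_file (writing the 12,000 points mentioned in the question) is an illustrative helper:

import random
import timeit


def make_test_file(filename, n=12000):
    # Write n "{a, b}" lines shaped like the sample data.
    with open(filename, "w") as f:
        for _ in range(n):
            f.write("{%.2f, %.2f}\n"
                    % (random.uniform(-90, 90), random.uniform(-180, 180)))


make_test_file("data.txt")
print(timeit.timeit(lambda: read_coordinate_file("data.txt"), number=10))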

(I've ignored R, since it isn't obvious to me where you would supply a value other than the default of 1.)
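If R does need to vary, one option is to thread it through as an argument; a sketch based on the structured-array version above (read_coordinate_file_r is an illustrative name):

import numpy as np
from math import pi


def read_coordinate_file_r(filename, R=1):
    with open(filename, "r") as f:
        points = [tuple(map(float, reversed(line.strip("{}\n").split(','))))
                  for line in f if line.strip()]
    arr = np.array(points, dtype=[('x', '<f4'), ('y', '<f4')])
    # Same formulas as get_x/get_y, scaled by R.
    arr['x'] = R * arr['x'] * pi / 180
    arr['y'] = R * np.log(np.tan((90 * pi + pi * arr['y']) / 360))
    return arr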

