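python - Reading the rows of a .txt or Excel file into tuples

Problem description

I want to read two .txt files line by line. The data in each file is arranged in five columns.

FILE_1: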

843.19598 2396.10278 3579.13778 4210.15674 4209.37549
841.93976 2397.21948 3573.11963 4205.89209 4226.73926
842.01642 2397.72266 3573.06494 4202.88379 4226.93799
842.22083 2397.47974 3574.27515 4204.19043 4223.82088
842.42065 2397.20142 3575.47437 4205.52246 4220.64795

FILE_2:

3586.02124 2391.50342 837.45227 -837.29681 -2385.97513
3587.69238 2387.48218 836.60445 -840.75067 -2390.17529
3588.44531 2387.44556 836.00555 -840.79022 -2389.77612
3588.08203 2388.25439 836.26544 -840.17017 -2389.07544
3587.66553 2389.05566 836.53046 -839.53912 -2388.40405

Each line of each file must be converted into a tuple. For example, for the first line of both files, the output should be:

FILE_1/1stLine = (843.19598, 2396.10278, 3579.13778, 4210.15674, 4209.37549)  

FILE_2/1stline = (3586.02124, 2391.50342, 837.45227, -837.29681, -2385.97513)

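Then I need to combine the rows of the two files into a new variable called aux, whose first element is a row of FILE_1 and whose second element is the row at the same position in FILE_2: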

aux = (FILE_1/1stLine, FILE_2/1stline) ----- aux 1stLine
aux = (FILE_1/2ndLine, FILE_2/2ndline) ----- aux 2ndLine
.
.
aux = (FILE_1/LastLine, FILE_2/Lastline) ----- aux LastLine

For example, taking the first line of both files, the first aux must be:

((843.19598, 2396.10278, 3579.13778, 4210.15674, 4209.37549), (3586.02124, 2391.50342, 837.45227, -837.29681, -2385.97513))

Any ideas?

f1 = open("FILE_1.txt", "r")
f2 = open("FILE_2.txt", "r")
for a in f1:
    for b in f2:
        x = tuple(a)
        y = tuple(b)
        aux = (x, y)

The result of this code is:

('8', '4', '3', '.', '1', '9', '5', '9', '8', ' ', '2', '3', '9', '6', '.', '1', '0', '2', '7', '8', ' ', '3', '5', '7', '9', '.', '1', '3', '7', '7', '8', ' ', '4', '2', '1', '0', '.', '1', '5', '6', '7', '4', ' ', '4', '2', '0', '9', '.', '3', '7', '5', '4', '9', '\n')
('3', '5', '8', '6', '.', '0', '2', '1', '2', '4', ' ', '2', '3', '9', '1', '.', '5', '0', '3', '4', '2', ' ', '8', '3', '7', '.', '4', '5', '2', '2', '7', ' ', '-', '8', '3', '7', '.', '2', '9', '6', '8', '1', ' ', '-', '2', '3', '8', '5', '.', '9', '7', '5', '1', '3', '\n')
(('8', '4', '3', '.', '1', '9', '5', '9', '8', ' ', '2', '3', '9', '6', '.', '1', '0', '2', '7', '8', ' ', '3', '5', '7', '9', '.', '1', '3', '7', '7', '8', ' ', '4', '2', '1', '0', '.', '1', '5', '6', '7', '4', ' ', '4', '2', '0', '9', '.', '3', '7', '5', '4', '9', '\n'), ('3', '5', '8', '6', '.', '0', '2', '1', '2', '4', ' ', '2', '3', '9', '1', '.', '5', '0', '3', '4', '2', ' ', '8', '3', '7', '.', '4', '5', '2', '2', '7', ' ', '-', '8', '3', '7', '.', '2', '9', '6', '8', '1', ' ', '-', '2', '3', '8', '5', '.', '9', '7', '5', '1', '3', '\n'))

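Thank you very much!

I don't need each element of f1/f2 as a quoted string like '843.19598'; I need it as an unquoted number like 843.19598.

Let me show the code that takes this data as input (using one set of points as an example).

The point is that I have to read x and y from these files, and I need to fit an ellipse to each set.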

import ellipses as el
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Ellipse



x = (5727.53135,  7147.62235, 10330.93573,  8711.17228, 7630.40262,
        4777.24983,  4828.27655,  9449.94416,  5203.81323,  6299.44811,
        6494.21906)

y = (67157.77567 , 66568.50068 , 55922.56257 , 54887.47348 ,
       65150.14064 , 66529.91705 , 65934.25548 , 55351.57612 ,
       63123.5103  , 67181.141725, 56321.36025)

data = (x, y)

lsqe = el.LSqEllipse()
lsqe.fit(data)
center, width, height, phi = lsqe.parameters()

print (center, width, height, phi)

plt.close('all')
fig = plt.figure(figsize=(6,6))
ax = fig.add_subplot(111)
ax.axis('equal')
ax.plot(data[0], data[1], 'ro', label='test data', zorder=1)

ellipse = Ellipse(xy=center, width=2*width, height=2*height, angle=np.rad2deg(phi),
               edgecolor='b', fc='None', lw=2, label='Fit', zorder = 2)
ax.add_patch(ellipse)

plt.legend()
plt.show()

Tags: python, python-3.x, pandas, tuples

Solution

Dataset

FILE 1 (saved as f1.csv and f1.xls)
843.19598 2396.10278 3579.13778 4210.15674 4209.37549
841.93976 2397.21948 3573.11963 4205.89209 4226.73926
842.01642 2397.72266 3573.06494 4202.88379 4226.93799
842.22083 2397.47974 3574.27515 4204.19043 4223.82088
842.42065 2397.20142 3575.47437 4205.52246 4220.64795

FILE 2 (saved as f2.csv and f2.xls)
3586.02124 2391.50342 837.45227 -837.29681 -2385.97513
3587.69238 2387.48218 836.60445 -840.75067 -2390.17529
3588.44531 2387.44556 836.00555 -840.79022 -2389.77612
3588.08203 2388.25439 836.26544 -840.17017 -2389.07544
3587.66553 2389.05566 836.53046 -839.53912 -2388.40405

Using import csv (works for ASCII files, i.e. .csv, .txt, etc.)

import csv

# Files to read
files = ['f1.csv', 'f2.csv']
tup_files = ()
aux = ()

# Read each file and concatenate to tup_files
for file in files:
    with open(file) as csv_file:
        csv_reader = csv.reader(csv_file, delimiter=' ')
        tmp_rows = ()
        for row in csv_reader:
            tmp_rows += (tuple(row), )  

    tup_files += (tmp_rows, )

for row_f1, row_f2 in zip(tup_files[0], tup_files[1]):
    aux += (row_f1, row_f2)

print(f'printing f1\n{tup_files[0]}\n')
print(f'printing f2\n{tup_files[1]}\n')
print(f'printing aux\n{aux}')
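
If the values are needed as numbers, and each aux as a (FILE_1 row, FILE_2 row) pair as the question asks, a minimal sketch without the csv module could look like the following. The helper name parse_rows and the io.StringIO stand-ins for the two opened files are assumptions made for illustration, and only the first two rows of each dataset are included.

```python
import io


def parse_rows(lines):
    """Turn whitespace-separated text lines into a tuple of float tuples."""
    return tuple(
        tuple(float(value) for value in line.split())
        for line in lines
        if line.strip()  # skip blank lines
    )


# Stand-ins for open('f1.csv') and open('f2.csv') (assumption for the demo)
f1 = io.StringIO(
    "843.19598 2396.10278 3579.13778 4210.15674 4209.37549\n"
    "841.93976 2397.21948 3573.11963 4205.89209 4226.73926\n"
)
f2 = io.StringIO(
    "3586.02124 2391.50342 837.45227 -837.29681 -2385.97513\n"
    "3587.69238 2387.48218 836.60445 -840.75067 -2390.17529\n"
)

rows_f1 = parse_rows(f1)
rows_f2 = parse_rows(f2)

# zip pairs the rows at the same position; each element is (FILE_1 row, FILE_2 row)
aux_rows = tuple(zip(rows_f1, rows_f2))

for aux in aux_rows:
    print(aux)
```

With real files, the same helper can be called inside a with open(...) block, since a file object also yields its lines one at a time.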

Using pandas (works for .xls)

import pandas as pd

# Files to read
files = ['f1.xls', 'f2.xls']
tup_files = ()
aux = ()

# Read each file and concatenate to tup_files
for file in files:
    data = pd.read_excel(file, header=None)
    tup_files += (tuple(data.itertuples(index=False, name=None)), )

for row_f1, row_f2 in zip(tup_files[0], tup_files[1]):
    aux += (row_f1, row_f2)

print(f'printing f1\n{tup_files[0]}\n')
print(f'printing f2\n{tup_files[1]}\n')
print(f'printing aux\n{aux}')
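
For whitespace-delimited text files, pandas can also parse the numbers directly. The sketch below assumes the data is plain text rather than .xls and uses io.StringIO in place of real file paths (an assumption for the demo, limited to the first two rows of each dataset). pd.read_csv with sep=r"\s+" and header=None reads the columns as floats, and zip keeps each pair of rows together.

```python
import io

import pandas as pd

# Stand-ins for the two text files (assumption for the demo)
text_f1 = (
    "843.19598 2396.10278 3579.13778 4210.15674 4209.37549\n"
    "841.93976 2397.21948 3573.11963 4205.89209 4226.73926\n"
)
text_f2 = (
    "3586.02124 2391.50342 837.45227 -837.29681 -2385.97513\n"
    "3587.69238 2387.48218 836.60445 -840.75067 -2390.17529\n"
)

tup_files = ()
for text in (text_f1, text_f2):
    # sep=r"\s+" splits on any run of whitespace; header=None keeps the first row as data
    data = pd.read_csv(io.StringIO(text), sep=r"\s+", header=None)
    tup_files += (tuple(data.itertuples(index=False, name=None)), )

# Each element of aux_rows is a (FILE_1 row, FILE_2 row) pair of float tuples
aux_rows = tuple(zip(tup_files[0], tup_files[1]))

for aux in aux_rows:
    print(aux)
```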

This produces (output shown for the csv version):

printing f1
(('843.19598', '2396.10278', '3579.13778', '4210.15674', '4209.37549'), 
 ('841.93976', '2397.21948', '3573.11963', '4205.89209', '4226.73926'),
 ('842.01642', '2397.72266', '3573.06494', '4202.88379', '4226.93799'),
 ('842.22083', '2397.47974', '3574.27515', '4204.19043', '4223.82088'),
 ('842.42065', '2397.20142', '3575.47437', '4205.52246', '4220.64795'))

printing f2
(('3586.02124', '2391.50342', '837.45227', '-837.29681', '-2385.97513'),
 ('3587.69238', '2387.48218', '836.60445', '-840.75067', '-2390.17529'), 
 ('3588.44531', '2387.44556', '836.00555', '-840.79022', '-2389.77612'), 
 ('3588.08203', '2388.25439', '836.26544', '-840.17017', '-2389.07544'), 
 ('3587.66553', '2389.05566', '836.53046', '-839.53912', '-2388.40405'))

printing aux
(('843.19598', '2396.10278', '3579.13778', '4210.15674', '4209.37549'), 
 ('3586.02124', '2391.50342', '837.45227', '-837.29681', '-2385.97513'), 
 ('841.93976', '2397.21948', '3573.11963', '4205.89209', '4226.73926'), 
 ('3587.69238', '2387.48218', '836.60445', '-840.75067', '-2390.17529'), 
 ('842.01642', '2397.72266', '3573.06494', '4202.88379', '4226.93799'), 
 ('3588.44531', '2387.44556', '836.00555', '-840.79022', '-2389.77612'), 
 ('842.22083', '2397.47974', '3574.27515', '4204.19043', '4223.82088'), 
 ('3588.08203', '2388.25439', '836.26544', '-840.17017', '-2389.07544'), 
 ('842.42065', '2397.20142', '3575.47437', '4205.52246', '4220.64795'), 
 ('3587.66553', '2389.05566', '836.53046', '-839.53912', '-2388.40405'))

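The result is built from tuples, as requested. Note that aux here is a flat tuple in which the rows of the two files alternate, and that the csv version leaves every value as a string.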
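
As for why the attempt in the question printed single characters: tuple() applied to a string iterates over its characters, and the nested loops consume the second file entirely during the first pass of the outer loop. The short sketch below illustrates both effects on made-up miniature inputs (the io.StringIO objects are stand-ins for opened files, an assumption for the demo).

```python
import io

line = "843.19598 2396.10278 3579.13778\n"

# tuple() on a string yields one element per character, newline included
as_chars = tuple(line)
print(as_chars[:4])

# split() breaks the line on whitespace; float() converts each piece to a number
as_floats = tuple(float(value) for value in line.split())
print(as_floats)

# A file object is a one-shot iterator: the inner loop drains f2 while the
# outer loop is still on its first line, so later f1 lines get no partner.
f1 = io.StringIO("1 2\n3 4\n")
f2 = io.StringIO("5 6\n7 8\n")
nested = [(a.split(), b.split()) for a in f1 for b in f2]
print(nested)

# zip advances both files together, pairing lines at the same position
f1.seek(0)
f2.seek(0)
paired = [(a.split(), b.split()) for a, b in zip(f1, f2)]
print(paired)
```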