python: How do I convert txt data into a nested list of arrays?

Problem description

In my Python script, I have atomic coordinates stored in a variable in the following format:

ATOM      1  C   LIG    1       77.036  58.294  67.981 -0.14 +0.02    +0.345 C \nATOM      2  O   LIG    1       77.559  57.961  69.048 -0.13 -0.02    -0.247 OA\nATOM      3  O   LIG    1       76.286  59.397  67.781 -0.14 -0.00    -0.462 OA\nATOM      4  C   LIG    1       75.371  59.837  68.775 -0.17 -0.01    +0.232 C \n

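This is an example with 4 atoms, where each block (atom) is terminated by '\n'. From this initial data, I need to extract the XYZ coordinates of each block and store them in a nested list. For these 4 atoms, the expected result would be: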

[[77.036, 58.294, 67.981],[77.559,57.961,69.048],[76.286,59.397,67.781],[75.371,59.837,68.775]]

Tags: python, arrays

Solution


You can use a regular expression to extract the required values:

import re
data = "ATOM      1  C   LIG    1       77.036  58.294  67.981 -0.14 +0.02    +0.345 C \nATOM      2  O   LIG    1       77.559  57.961  69.048 -0.13 -0.02    -0.247 OA\nATOM      3  O   LIG    1       76.286  59.397  67.781 -0.14 -0.00    -0.462 OA\nATOM      4  C   LIG    1       75.371  59.837  68.775 -0.17 -0.01    +0.232 C \n"

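res = []
for line in data.splitlines():
    # After a run of 7+ non-word characters, capture the block of numeric columns
    m = re.search(r'\W{7,}([\d .+-]+)\D+', line)
    if m is None:
        continue  # skip lines that do not match the expected layout
    # The first three numbers in the captured block are X, Y and Z
    coords = [float(item) for item in m.group(1).split()]
    res.append(coords[:3])
print(res)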

expectedResult = [
    [77.036, 58.294, 67.981], [77.559, 57.961, 69.048],
    [76.286, 59.397, 67.781], [75.371, 59.837, 68.775]
]
print(res == expectedResult)

Output:

[[77.036, 58.294, 67.981], [77.559, 57.961, 69.048], [76.286, 59.397, 67.781], [75.371, 59.837, 68.775]]
True
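
If every record has the same whitespace-separated fields as the sample above, a plain split may be easier to read than the regular expression. The following is only a sketch under that assumption: the function name `extract_xyz` and the field positions 5 to 7 are illustrative choices based on the four sample lines, and they would not hold for files where columns run together or where an extra field (such as a chain identifier) is present.

```python
def extract_xyz(text):
    """Return [[x, y, z], ...] for each ATOM/HETATM line in text.

    Assumes whitespace-separated fields with X, Y, Z at positions 5, 6, 7
    (0-based), as in the sample data; this is not a general parser.
    """
    coords = []
    for line in text.splitlines():
        fields = line.split()
        # Ignore blank lines and anything that is not an atom record
        if len(fields) < 8 or fields[0] not in ("ATOM", "HETATM"):
            continue
        coords.append([float(value) for value in fields[5:8]])
    return coords


data = (
    "ATOM      1  C   LIG    1       77.036  58.294  67.981 -0.14 +0.02    +0.345 C \n"
    "ATOM      2  O   LIG    1       77.559  57.961  69.048 -0.13 -0.02    -0.247 OA\n"
    "ATOM      3  O   LIG    1       76.286  59.397  67.781 -0.14 -0.00    -0.462 OA\n"
    "ATOM      4  C   LIG    1       75.371  59.837  68.775 -0.17 -0.01    +0.232 C \n"
)

print(extract_xyz(data))
```

For the four sample atoms this produces the same nested list as the regular-expression version.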
