Mathematically subtracting the values in one file from another file

Question

I have two files of the same size (100x12), each containing comma-separated numeric values, both positive and negative.

Sample output from file 1:

-14.99,-15.6,8.0 ->
-9.0,34.87,98.98 ->
(and so on)

Sample output from file 2:

-15.99,-18.6,8.00 ->
-3.0,34.34,-98.88 ->
(and so on)

I tried:

awk '{getline t<"file1"; print $0-t}' file2

However, this only subtracts the first column. How can I extend it to subtract file1/column1 from file2/column2?

I'm open to using pandas for this. Thanks in advance!

Tags: python, bash, pandas, shell, awk

Answer


First, the data:

file1 = """
-14.99,-15.6,8.0 ->
-9.0,34.87,98.98 ->
"""
file2 = """
-15.99,-18.6,8.00 ->
-3.0,34.34,-98.88 ->
"""

from io import StringIO # faking file on disk

The pandas answer:

import pandas as pd
converter = {2: lambda s: float(s.split(' ')[0])}
df1 = pd.read_csv(StringIO(file1), header=None, converters=converter)
df2 = pd.read_csv(StringIO(file2), header=None, converters=converter)
(df1-df2).to_csv('pddiff12.csv', header=False, index=False)    
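
The converter above is keyed to column index 2, which matches the three-column sample; the question describes the real files as 100x12, where the last column would be index 11 instead. As a minimal sketch, assuming the trailing ` ->` marker really is present in the files and only appears at the end of a line, the marker can be stripped before parsing so the column count no longer matters. The helper name `load_frame` is hypothetical and not part of pandas.

```python
from io import StringIO

import pandas as pd

file1 = """
-14.99,-15.6,8.0 ->
-9.0,34.87,98.98 ->
"""
file2 = """
-15.99,-18.6,8.00 ->
-3.0,34.34,-98.88 ->
"""

def load_frame(text):
    """Hypothetical helper: drop a trailing ' ->' from each line, then parse as CSV."""
    cleaned = "\n".join(
        line.split(" ->")[0]      # keep everything before the marker, if present
        for line in text.splitlines()
        if line.strip()           # skip blank lines
    )
    return pd.read_csv(StringIO(cleaned), header=None)

df1 = load_frame(file1)
df2 = load_frame(file2)
diff = df1 - df2                  # element-wise, same direction as the answer above
print(diff.round(2))              # rounding only hides floating-point noise
```

Swapping the operands (`df2 - df1`) would give file 2 minus file 1, which is the direction the awk attempt in the question computes.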

Or roll your own in pure Python:

# note 1: when reading from disk, indent this line under the with-statement

def read_csv(file_name):
    #with open(file_name, 'rt') as f1: # uncomment when reading from disk
    f1 = StringIO(file_name) # comment out when reading from disk
    rows = [r for r in f1.readlines() if r.strip()] # note 1; skips blank lines
    # parse one comma-separated line into a list of floats
    crunch = lambda row: [float(r) for r in row.split(',')]
    # r.split(' ')[0] drops the trailing ' ->' before parsing
    rows = [crunch(r.split(' ')[0]) for r in rows]
    return rows

data1 = read_csv(file1)
data2 = read_csv(file2)

diff = []
for row1, row2 in zip(data1, data2):
    diff.append([i-j for i, j in zip(row1, row2)])

with open('diff12.csv', 'wt') as d12:
    for row in diff:
        d12.write(', '.join((str(v) for v in row)) + '\n')
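
The pure-Python version above can also be sketched with the standard-library `csv` module, which handles the splitting and joining of fields. This is only an illustration under the same assumption that each line may end in ` ->`; the names `parse_rows` and `subtract` are hypothetical, and rounding to two decimals is an added choice to hide floating-point noise.

```python
import csv
from io import StringIO

def parse_rows(text):
    """Return a list of float rows, ignoring blank lines and a trailing ' ->'."""
    lines = (line.split(" ->")[0] for line in text.splitlines() if line.strip())
    return [[float(cell) for cell in row] for row in csv.reader(lines)]

def subtract(a, b):
    """Element-wise a - b for two equally shaped lists of rows."""
    return [[x - y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

sample1 = "-14.99,-15.6,8.0 ->\n-9.0,34.87,98.98 ->\n"
sample2 = "-15.99,-18.6,8.00 ->\n-3.0,34.34,-98.88 ->\n"

result = subtract(parse_rows(sample1), parse_rows(sample2))

out = StringIO()  # stands in for a file opened with open(..., 'w', newline='')
csv.writer(out).writerows([[round(v, 2) for v in row] for row in result])
print(out.getvalue())
```

Calling `subtract(parse_rows(sample2), parse_rows(sample1))` instead gives file 2 minus file 1.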

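Pandas is definitely the easiest to read and work with, although it is a significant dependency if one is inclined to avoid those. In this case, I don't think I would.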

