Pandas: row-wise conditional operations

Problem description

I'm stuck with this bathymetry data collected from an echo sounder. It looks like this:

ID No Time Lat Lon Alt East North Count Fix
LL 0 589105179.00 24.156741 -110.321346 -31.50 4898039.453 -3406895.053 9 2 
ED 0 1.12 0.00
ED 0 1.53 0.00
ED 0 1.60 0.00
ED 0 1.08 0.00
ED 0 1.51 0.00
ED 0 1.06 0.00
LL 0 589105180.00 24.156741 -110.321346 -31.50 4898039.836 -3406894.045 9 2
ED 0 1.06 0.00
ED 0 1.12 0.00
ED 0 0.98 0.00
ED 0 0.96 0.00
ED 0 0.91 0.00
ED 0 0.90 0.00
LL 0 589105181.00 24.156741 -110.321346 -31.50 4898039.433 -3406894.003 9 2
ED 0 1.04 0.00
ED 0 1.04 0.00
ED 0 0.93 0.00
ED 0 0.99 0.00
ED 0 0.99 0.00
ED 0 1.01 0.00
LL 0 589105182.00 24.156741 -110.321346 -31.51 4898038.460 -3406894.841 9 2
ED 0 0.99 0.00
ED 0 0.96 0.00
ED 0 0.96 0.00
ED 0 0.96 0.00
ED 0 0.98 0.00
ED 0 0.98 0.00
LL 0 589105183.00 24.156741 -110.321346 -31.51 4898039.804 -3406894.107 9 2
ED 0 1.01 0.00
ED 0 1.01 0.00
ED 0 0.91 0.00
ED 0 1.04 0.00
ED 0 1.04 0.00
ED 0 0.96 0.00

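Each LL line gives the time (seconds since 2000), coordinates, heading, etc. for the ED lines of sounding measurements that follow it.

We need to compute the mean of each group of ED measurements and assign it to the LL row that precedes the group. The problem is that in the full file there are not always 6 ED measurements per group; sometimes there are 5 or 4.

So far I have done this: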

import numpy as np
import pandas as pd

data = pd.read_csv('Echosounder.txt', sep='\t')
LLs = data[data['ID'] == 'LL']
EDs = data[data['ID'] == 'ED']

What I like about this is that it preserves the index order. I noticed that the number of ED measurements varies because, after doing this:

EDs.groupby(np.arange(len(EDs))//6).mean()

and attaching the resulting means to the LL rows, the last LL rows have no depth value.

Please help.

Tags: python, pandas

Solution


Parse the file

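from collections import defaultdict
from io import StringIO as sio  # pandas.io.common no longer provides StringIO
from itertools import count

import pandas as pd

c = count()
# LL lines are kept whole; ED lines are grouped under the number of the LL line before them
text = dict(LL=[], ED=defaultdict(list))

with open('file.txt', 'r') as fh:
  cols = fh.readline().strip()  # header row

  for line in fh:
    if not line.strip():
      continue  # skip blank lines
    k, t = line.strip().split(None, 1)

    if k == 'LL':
      i = next(c)  # a new LL line starts a new block
      text[k].append(line.strip())
    else:
      text[k][i].append(t)  # ED line: store it without its 'ED' prefix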
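
To make the result of the parsing step easier to inspect, here is a minimal sketch of the same grouping idea run on an in-memory sample. The helper name `split_blocks` and the two-block `SAMPLE` are hypothetical and only for illustration; the sketch assumes every ED line belongs to the most recent LL line, as described in the question.

```python
import io

# Hypothetical two-block sample in the same layout as the file above.
SAMPLE = """\
ID No Time Lat Lon Alt East North Count Fix
LL 0 589105179.00 24.156741 -110.321346 -31.50 4898039.453 -3406895.053 9 2
ED 0 1.12 0.00
ED 0 1.53 0.00
LL 0 589105180.00 24.156741 -110.321346 -31.50 4898039.836 -3406894.045 9 2
ED 0 1.06 0.00
ED 0 1.12 0.00
ED 0 0.98 0.00
"""


def split_blocks(lines):
    """Split an iterable of lines into (header, LL lines, ED lines per LL block)."""
    it = iter(lines)
    header = next(it).strip()
    ll_lines, ed_groups = [], []
    for line in it:
        if not line.strip():
            continue  # ignore blank lines
        key, rest = line.strip().split(None, 1)
        if key == "LL":
            ll_lines.append(line.strip())
            ed_groups.append([])  # open a new, initially empty ED block
        elif key == "ED" and ed_groups:
            ed_groups[-1].append(rest)  # attach to the most recent LL line
    return header, ll_lines, ed_groups


header, ll_lines, ed_groups = split_blocks(io.StringIO(SAMPLE))
block_sizes = [len(group) for group in ed_groups]
print(block_sizes)  # [2, 3]
```

Because each block is a separate list, the number of ED lines per LL line can differ freely, which is exactly the case that fixed-size chunking with `// 6` cannot handle.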

Build the DataFrames

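# LL rows: the header plus every LL line
ll = pd.read_csv(sio('\n'.join([cols, *text['LL']])), sep=r'\s+')

# ED rows: one small frame per LL block, then the mean per block
# (level 0 of the concatenated index is the block number)
ed = pd.concat({
    i: pd.read_csv(sio('\n'.join(v)), sep=r'\s+', header=None)
    for i, v in text['ED'].items()
}).groupby(level=0).mean().add_prefix('ed_')

ll.join(ed)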

   ID  No         Time        Lat         Lon    Alt         East        North  Count  Fix  ed_0      ed_1  ed_2
0  LL   0  589105179.0  24.156741 -110.321346 -31.50  4898039.453 -3406895.053      9    2     0  1.316667   0.0
1  LL   0  589105180.0  24.156741 -110.321346 -31.50  4898039.836 -3406894.045      9    2     0  0.988333   0.0
2  LL   0  589105181.0  24.156741 -110.321346 -31.50  4898039.433 -3406894.003      9    2     0  1.000000   0.0
3  LL   0  589105182.0  24.156741 -110.321346 -31.51  4898038.460 -3406894.841      9    2     0  0.971667   0.0
4  LL   0  589105183.0  24.156741 -110.321346 -31.51  4898039.804 -3406894.107      9    2     0  0.995000   0.0
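
As an alternative to parsing the file by hand, the grouping can probably be done inside pandas by numbering the blocks with a cumulative sum over the LL rows. This is a sketch under stated assumptions, not the method used above: it assumes the file is whitespace-separated, and that `read_csv` pads the short ED rows with NaN so that the depth value ends up in the `Time` column. The sample data and the column name `ed_mean` are hypothetical.

```python
import io

import pandas as pd

# Hypothetical two-block sample in the same layout as the file above.
SAMPLE = """\
ID No Time Lat Lon Alt East North Count Fix
LL 0 589105179.00 24.156741 -110.321346 -31.50 4898039.453 -3406895.053 9 2
ED 0 1.12 0.00
ED 0 1.53 0.00
LL 0 589105180.00 24.156741 -110.321346 -31.50 4898039.836 -3406894.045 9 2
ED 0 1.06 0.00
ED 0 1.12 0.00
ED 0 0.98 0.00
"""

# Short ED rows are padded with NaN in the trailing columns.
data = pd.read_csv(io.StringIO(SAMPLE), sep=r"\s+")

is_ll = data["ID"].eq("LL")
# Every LL row starts a new block; its ED rows share the same block number.
block = is_ll.cumsum() - 1

# For ED rows the third field (the depth) sits under the "Time" header.
depth = data.loc[~is_ll, "Time"].groupby(block[~is_ll]).mean()

lls = data[is_ll].reset_index(drop=True)
# Align by block number; an LL row without any ED rows gets NaN.
lls["ed_mean"] = depth.reindex(lls.index).to_numpy()
print(lls[["Time", "ed_mean"]])
```

Since the block number comes from the position of the LL rows rather than from a fixed group size, blocks with 4, 5, or 6 ED rows are handled the same way.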
