How do I append a list to a DataFrame?

Problem description

I'm trying to read an ASCII file line by line into a Pandas DataFrame.

I wrote the following script:

import pandas as pd

col_labels = ['Sg', 'Krg', 'Krw', 'Pc']

df = pd.DataFrame(columns=col_labels)

f = open('EPS.INC', 'r')
for line in f:
    if 'SGWFN' in line:
        print('Reading relative permeability table')
        for line in f:
            line = line.strip()
            if (line.split() and not line.startswith('/') and not line.startswith('--')):
                cols = line.split()
                print(repr(cols))
                df=df.append(cols)

print('Resulting Dataframe')
print(df)

The file I'm parsing looks like this:

SGWFN            

--Facies 1 Drainage SATNUM 1            
--Sg    Krg    Krw    J
0.000000    0.000000    1.000000    0.000000
0.030000    0.000000    0.500000    0.091233
0.040000    0.000518    0.484212    0.093203
0.050000    0.001624    0.468759    0.095237
/

I was hoping to get four values added per DataFrame row. Instead, each value ends up as its own row in a new column, like this:

Resulting Dataframe
      Sg  Krg  Krw   Pc           0
0    NaN  NaN  NaN  NaN    0.000000
1    NaN  NaN  NaN  NaN    0.000000
2    NaN  NaN  NaN  NaN    1.000000
3    NaN  NaN  NaN  NaN    0.000000
4    NaN  NaN  NaN  NaN    0.030000
5    NaN  NaN  NaN  NaN    0.000000
6    NaN  NaN  NaN  NaN    0.500000

Can someone explain what I'm doing wrong?

Thanks! D

Tags: python, pandas, dataframe, append

Solution
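
To see why the values piled up in a column named 0: as far as I can tell, older pandas versions passed a plain list given to `DataFrame.append` through the `DataFrame` constructor, and the constructor reads a flat list as one column. A minimal sketch of that constructor behaviour (the `cols` values are copied from the first data line in the question):

```python
import pandas as pd

cols = ['0.000000', '0.000000', '1.000000', '0.000000']

# A flat list is interpreted as one column with four rows ...
as_column = pd.DataFrame(cols)
print(as_column.shape)   # (4, 1)

# ... whereas a list that contains one list is a single row with four columns.
as_row = pd.DataFrame([cols], columns=['Sg', 'Krg', 'Krw', 'Pc'])
print(as_row.shape)      # (1, 4)
```

So each appended line needs to arrive as a row-shaped object (a list of lists, or a Series indexed by the column labels), not as a bare list.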


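I suggest creating an empty list L, appending each parsed line to it inside the loop, and calling the DataFrame constructor once at the end: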

import pandas as pd

L = []
# a with block closes the file automatically, even if an error occurs
with open("EPS.INC") as f:
    for line in f:
        if 'SGWFN' in line:
            print('Reading relative permeability table')
            for line in f:
                line = line.strip()
                if (line.split() and not line.startswith('/') and not line.startswith('--')):
                    cols = line.split()
                    print(repr(cols))
                    L.append(cols)

print('Resulting Dataframe')
col_labels = ['Sg', 'Krg', 'Krw', 'Pc']

df = pd.DataFrame(L, columns=col_labels)
print(df)
         Sg       Krg       Krw        Pc
0  0.000000  0.000000  1.000000  0.000000
1  0.030000  0.000000  0.500000  0.091233
2  0.040000  0.000518  0.484212  0.093203
3  0.050000  0.001624  0.468759  0.095237
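
Note that `line.split()` yields strings, so the columns above hold text rather than numbers. Here is a self-contained sketch of the same list-then-construct idea that also converts to float. The in-memory `SAMPLE` text stands in for `EPS.INC`, `read_sgwfn_rows` is a hypothetical helper name, and stopping at the first `/` is my assumption about where the table ends (the loop above keeps reading past it):

```python
import io
import pandas as pd

# Sample text copied from the question; it stands in for the EPS.INC file.
SAMPLE = """SGWFN

--Facies 1 Drainage SATNUM 1
--Sg    Krg    Krw    J
0.000000    0.000000    1.000000    0.000000
0.030000    0.000000    0.500000    0.091233
0.040000    0.000518    0.484212    0.093203
0.050000    0.001624    0.468759    0.095237
/
"""

def read_sgwfn_rows(lines):
    """Collect the data rows that follow the SGWFN keyword (hypothetical helper)."""
    rows = []
    in_table = False
    for line in lines:
        line = line.strip()
        if not in_table:
            # Skip everything until the keyword line is seen.
            in_table = 'SGWFN' in line
            continue
        if line.startswith('/'):
            break  # assumption: '/' terminates the table
        if line and not line.startswith('--'):
            rows.append(line.split())
    return rows

col_labels = ['Sg', 'Krg', 'Krw', 'Pc']
rows = read_sgwfn_rows(io.StringIO(SAMPLE))

# Build the frame once, then convert the string cells to floats.
df = pd.DataFrame(rows, columns=col_labels).astype(float)
print(df.dtypes)
print(df)
```

To use it on the real file, pass an open file object instead of the `io.StringIO` wrapper.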

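Your own approach can be fixed by appending a Series whose index is the list of column labels. Be aware that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so this variant only runs on older versions: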

import pandas as pd

col_labels = ['Sg', 'Krg', 'Krw', 'Pc']

df = pd.DataFrame()
# open the file in a with block so it is closed when the loop finishes
with open('EPS.INC', 'r') as f:
    for line in f:
        if 'SGWFN' in line:
            print('Reading relative permeability table')
            for line in f:
                line = line.strip()
                if (line.split() and not line.startswith('/') and not line.startswith('--')):
                    cols = line.split()
                    print(repr(cols))
                    # the Series index maps each value to its column
                    df = df.append(pd.Series(cols, index=col_labels), ignore_index=True)

print('Resulting Dataframe')
print(df)
        Krg       Krw        Pc        Sg
0  0.000000  1.000000  0.000000  0.000000
1  0.000000  0.500000  0.091233  0.030000
2  0.000518  0.484212  0.093203  0.040000
3  0.001624  0.468759  0.095237  0.050000
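
On pandas 2.x, where `DataFrame.append` no longer exists, a row-by-row build can be sketched with `pd.concat` instead. This is only an illustration: `parsed` is a hypothetical stand-in for the `cols` lists produced by the loop above, and collecting plain lists and constructing the frame once is usually the cheaper route.

```python
import pandas as pd

col_labels = ['Sg', 'Krg', 'Krw', 'Pc']

# Hypothetical stand-in for the lists produced by line.split() in the loop.
parsed = [
    ['0.000000', '0.000000', '1.000000', '0.000000'],
    ['0.030000', '0.000000', '0.500000', '0.091233'],
]

# Wrap each parsed line as a one-row frame, then concatenate them in one call.
frames = [pd.DataFrame([cols], columns=col_labels) for cols in parsed]
df = pd.concat(frames, ignore_index=True)
print(df)
```

Because the column labels are passed explicitly, the columns keep the order given in `col_labels`.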
