How to create multiple DataFrames from an Excel data sheet

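Problem description

After selecting the columns I needed, I extracted this DataFrame from an Excel spreadsheet with the pandas library. My table looks like this: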

    REF PLAYERS
0   103368  Andrés Posada Sanmiguel
1   300552  Diego Posada Sanmiguel
2   103304  Roberto Motta Stanziola
3   NaN NaN
4   REF PLAYERS
5   1047012 ANABELLA EISMANN DE AMAYA
6   104701  FERNANDO ENRIQUE AMAYA CASTRO
7   103451  AUGUSTO ANTONIO ALVARADO AZCARRAGA
8   103484  Kevin Adrian Villarreal Kam
9   REF PLAYERS
10  NaN NaN
11  NaN NaN
12  NaN NaN
13  NaN NaN
14  REF PLAYERS
15  NaN NaN
16  NaN NaN
17  NaN NaN
18  NaN NaN
19  REF PLAYERS

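I want to create multiple DataFrames, using each [['REF', 'PLAYERS']] row as the column headers of a new DataFrame. Suggestions are welcome. I also need to keep the blank rows. I'm new to pandas.

Tags: python, pandas, dataframe

Solution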


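To do this, you first have to read the DataFrame from the file differently: pass header=None to pd.read_excel(). Right now your columns are named "REF" and "PLAYERS", but those are exactly the values we want to group on, so the header rows have to stay in the data as ordinary rows.

With header=None the first column will most likely be named 0, so the first line looks like this, where df is the name of your DataFrame: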

# Set unique index for each group
df["group_id"] = (df[0] == "REF").cumsum()
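To see why the cumulative sum works, here is a minimal sketch on a small in-memory frame. The frame is a hypothetical stand-in for what pd.read_excel(..., header=None) would return; the rows are shortened from the question and the variable names is_header and group_id are only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for pd.read_excel(..., header=None):
# the columns are named 0 and 1, and every "REF"/"PLAYERS"
# header row is still an ordinary data row.
df = pd.DataFrame(
    [
        ["REF", "PLAYERS"],
        [103368, "Andrés Posada Sanmiguel"],
        [300552, "Diego Posada Sanmiguel"],
        [np.nan, np.nan],
        ["REF", "PLAYERS"],
        [1047012, "ANABELLA EISMANN DE AMAYA"],
        [np.nan, np.nan],
        ["REF", "PLAYERS"],
    ]
)

# True on header rows, False everywhere else (NaN compares as False)
is_header = df[0] == "REF"

# Summing the booleans cumulatively bumps the counter at each header row,
# so every row gets the id of the header that precedes it
group_id = is_header.cumsum()

print(group_id.tolist())
```

Each header row starts a new id, and the blank rows simply inherit the id of the block they sit in, which is what keeps them attached to the right table.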

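Full solution:

# Assumes df was read with header=None, so the first column is named 0
# Set unique index for each group: the counter goes up at every "REF" row
df["group_id"] = (df[0] == "REF").cumsum()

# Iterate over groups
dataframes = []
for _, group in df.groupby("group_id"):
    # work on a copy so the original df is not modified
    df_ = group.copy()
    # promote 1st row ("REF", "PLAYERS" and the group id) to column names
    df_.columns = df_.iloc[0]
    # and drop that header row
    df_ = df_.iloc[1:]
    # keep only the data columns, which drops the helper group_id column
    df_ = df_[["REF", "PLAYERS"]]
    # append to the list of dataframes
    dataframes.append(df_)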

All of your DataFrames are now stored in the list dataframes.
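If it helps, the same idea can be wrapped in a small reusable function. This is only a sketch under a few assumptions: the frame was read with header=None so the first column is named 0, and the function name split_on_header, its marker parameter, and the sample rows are hypothetical. It also checks that blank rows survive the split.

```python
import numpy as np
import pandas as pd


def split_on_header(raw, marker="REF"):
    """Split a header-less frame into one DataFrame per repeated header row."""
    # The counter increases at every row whose first cell equals the marker
    group_id = (raw[0] == marker).cumsum()
    frames = []
    for _, group in raw.groupby(group_id):
        # Rows that come before the first header row belong to no table
        if group.iloc[0, 0] != marker:
            continue
        header = group.iloc[0].tolist()
        # Drop the header row and restart the row index at 0
        body = group.iloc[1:].reset_index(drop=True)
        body.columns = header
        frames.append(body)
    return frames


# Hypothetical sample shaped like the sheet in the question
raw = pd.DataFrame(
    [
        ["REF", "PLAYERS"],
        [103368, "Andrés Posada Sanmiguel"],
        [300552, "Diego Posada Sanmiguel"],
        [np.nan, np.nan],
        ["REF", "PLAYERS"],
        [np.nan, np.nan],
        [np.nan, np.nan],
        ["REF", "PLAYERS"],
    ]
)

frames = split_on_header(raw)
for i, frame in enumerate(frames):
    print(i, frame.shape)
```

A header row with nothing after it, like the last one in the question, produces an empty DataFrame that still has the REF and PLAYERS columns, and blocks that contain only blanks stay as all-NaN rows.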

