How to fix IndexError: invalid index to scalar variable

Problem description

I have several CSV files that I have appended to mySeries. I need to find the sum of the second column of every file. My code is below.

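import glob
import os

import pandas as pd

# directory is the folder that holds the CSV files
all_files = glob.glob(os.path.join(directory, "*.csv"))
mySeries = []
for f in all_files:
    df = pd.read_csv(f)
    mySeries.append(df)

for i in range(len(mySeries)):
    # this sum is where the error is raised
    total = sum(int(row[1]) for row in mySeries[i])
    print(total)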

The sum raises the error IndexError: invalid index to scalar variable.

My data looks like this:

                  Flow
Hour                  
01-02-2021 20:00   374
01-02-2021 21:00   283
01-02-2021 22:00   108
01-02-2021 23:00    21
01-12-2020 20:00   400
01-12-2020 21:00   199
01-12-2020 22:00    92
01-12-2020 23:00     4
02-02-2021 00:00     1
02-02-2021 01:00     2
                  Flow
Hour                  
01-02-2021 20:00   605
01-02-2021 21:00   449
01-02-2021 22:00   334
01-02-2021 23:00   204
01-12-2020 20:00   668
01-12-2020 21:00   505
01-12-2020 22:00   391
01-12-2020 23:00   222
02-02-2021 00:00   137
02-02-2021 01:00    76

Tags: python, list, csv, sum

Solution

Just concatenate your DataFrames and sum the Flow column:

all_files = glob.glob(os.path.join(directory, "*.csv"))
pd.concat([pd.read_csv(file) for file in all_files])['Flow'].sum()
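
As background on the error message itself: NumPy raises "invalid index to scalar variable" when a scalar value (for example a numpy.int64) is indexed as if it were a sequence. The question does not show exactly what each element of mySeries holds, so the sketch below only assumes that the loop variable row ends up being a NumPy scalar. It also shows that iterating over a DataFrame yields column labels rather than rows, which is why row[1] does not reach the second column. The sample values are taken from the data in the question.

```python
import numpy as np
import pandas as pd

# Iterating a 1-D NumPy array yields NumPy scalars, not rows.
values = np.array([374, 283, 108])
first = next(iter(values))

try:
    first[1]  # indexing a scalar as if it were a row
    message = None
except IndexError as exc:
    message = str(exc)

# Iterating a DataFrame yields its column labels, not its rows.
df = pd.DataFrame({"Hour": ["20:00", "21:00"], "Flow": [374, 283]})
labels = list(df)

print(message)
print(labels)
```

Selecting the column by name, as in the line above, avoids row-by-row indexing altogether.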

Working example below:

import pandas as pd
from io import StringIO

s1 = """Hour,Flow
01-02-2021 20:00,374
01-02-2021 21:00,283
01-02-2021 22:00,108
01-02-2021 23:00,21
01-12-2020 20:00,400
01-12-2020 21:00,199
01-12-2020 22:00,92
01-12-2020 23:00,4
02-02-2021 00:00,1
02-02-2021 01:00,2"""

s2 = """Hour,Flow
01-02-2021 20:00,605
01-02-2021 21:00,449
01-02-2021 22:00,334
01-02-2021 23:00,204
01-12-2020 20:00,668
01-12-2020 21:00,505
01-12-2020 22:00,391
01-12-2020 23:00,222
02-02-2021 00:00,137
02-02-2021 01:00,76"""

pd.concat(pd.read_csv(StringIO(file)) for file in [s1,s2])['Flow'].sum()

# 5075
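
If a separate total per file is wanted, as the loop in the question prints, one possible sketch is shown below. It is not part of the original answer. The names a.csv and b.csv are hypothetical, in-memory strings stand in for files on disk, and the rows are the first two rows of each sample above. Here iloc[:, 1] selects the second column by position, so the column name does not need to be known.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-ins for CSV files on disk (names are made up).
sources = {
    "a.csv": "Hour,Flow\n01-02-2021 20:00,374\n01-02-2021 21:00,283\n",
    "b.csv": "Hour,Flow\n01-02-2021 20:00,605\n01-02-2021 21:00,449\n",
}

per_file = {}
for name, text in sources.items():
    df = pd.read_csv(StringIO(text))
    # iloc[:, 1] is the second column, selected by position
    per_file[name] = int(df.iloc[:, 1].sum())

# Sum of the per-file totals
grand_total = sum(per_file.values())

print(per_file)
print(grand_total)
```

With real files, the loop would iterate over the paths returned by glob.glob and pass each path to pd.read_csv directly.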
