Row comparison and append loop by column

Problem description

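I have a bunch of school data that I keep in a master list of monthly exam scores. Every time a child is scored and there is an update to 'Age', 'Score', or 'School', I insert a new row with the updated data so that all changes are tracked. I am trying to work out a Python script to do this, but since I am new to Python I keep running into problems.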

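I tried writing a loop, but I keep getting errors, including "False", "The truth value of a Series is ambiguous", and "tuple indices must be integers, not str".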
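For reference, the last two messages can be reproduced in isolation. The sketch below uses a small hypothetical frame `df` (not the data from this question) and only illustrates the likely causes: a trailing `, something` silently builds a tuple, and a comparison between two Series cannot be used directly in an `if`. The exact wording of the messages may differ between Python and pandas versions.

```python
import pandas as pd

# Hypothetical two-row frame, used only to trigger the errors.
df = pd.DataFrame({'ID': ['A', 'B'], 'Score': [80, 75]})

# 1) "expr, other" creates a tuple (DataFrame, Series), not a filtered column.
pair = df[df['ID'] == 'A'], df['Score']
print(type(pair).__name__)  # tuple

try:
    pair['Score']  # a tuple cannot be indexed with a string
except TypeError as exc:
    tuple_error = str(exc)

# 2) Comparing two Series yields a Series of booleans; `if` needs a single bool.
same = df['Score'] == df['Score']
try:
    if same:
        pass
except ValueError as exc:
    series_error = str(exc)

# Reducing with .all() (or .any()) gives one True/False value.
all_same = bool(same.all())
```

Selecting a single column from the filtered rows with `df.loc[df['ID'] == 'A', 'Score']` and reducing comparisons with `.all()` avoids both problems.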

import pandas as pd

master_df = pd.DataFrame({'ID': ['A', 'B', 'C', 'D'],
             'Age':[15,14,17,13],
             'School':['AB', 'CD', 'EF', 'GH'],
             'Score':[80, 75, 62, 100],
             'Date': ['3/1/2019', '3/1/2019', '3/1/2019', '3/1/2019']})

updates_df = pd.DataFrame({'ID': ['A', 'B', 'C', 'D'],
             'Age':[16,14,17,13],
             'School':['AB', 'ZX', 'EF', 'GH'],
             'Score':[80, 90, 62, 100],
             'Date': ['4/1/2019', '4/1/2019', '4/1/2019', '4/1/2019']})

# What I am trying to get is:  

updated_master = pd.DataFrame({'ID': ['A', 'A', 'B', 'B', 'C','D'],
             'Age':[15,16,14,14,17,13],
             'School':['AB', 'AB', 'CD', 'ZX', 'EF', 'GH'],
             'Score':[80, 80, 75, 90, 62, 100],
             'Date': ['3/1/2019', '4/1/2019', '3/1/2019', '4/1/2019', '3/1/2019', '3/1/2019']})

compare_cols = ['Age', 'School', 'Score']
temp_delta_list = []

# Index both frames by ID so each row can be looked up directly.
# This assumes each ID appears once per frame, as in the sample data.
m_indexed = master_df.set_index('ID')
u_indexed = updates_df.set_index('ID')

for i in updates_df['ID'].values:
    updated_row = u_indexed.loc[i, compare_cols]
    if i in m_indexed.index:
        master_row = m_indexed.loc[i, compare_cols]
        # .all() collapses the element-wise comparison to a single bool,
        # which avoids "The truth value of a Series is ambiguous".
        unchanged = (updated_row == master_row).all()
    else:
        unchanged = False  # an ID not yet in master counts as a change

    if not unchanged:
        temp_deltas = updates_df[updates_df['ID'] == i]
        temp_delta_list.append(temp_deltas)

# Append the changed rows to the master list; the stable sort keeps
# each ID's older row ahead of its newer one.
new_master = pd.concat([master_df] + temp_delta_list, ignore_index=True)
new_master = new_master.sort_values('ID', kind='mergesort').reset_index(drop=True)

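Ultimately, I want the loop to compare the values in each row for each ID, return the rows that differ in any way, and then append them to master_df.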

Tags: python, pandas, loops

Solution
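One possible approach, sketched here rather than offered as the definitive answer, is to skip the explicit loop and let `DataFrame.merge` find the changed rows. The idea is a left join of the updates against the master on every compared column; with `indicator=True`, rows marked `left_only` are updates that have no identical (ID, Age, School, Score) row in the master. The names `key_cols`, `flagged`, and `changed` are illustrative, and the data is the sample from the question.

```python
import pandas as pd

master_df = pd.DataFrame({'ID': ['A', 'B', 'C', 'D'],
                          'Age': [15, 14, 17, 13],
                          'School': ['AB', 'CD', 'EF', 'GH'],
                          'Score': [80, 75, 62, 100],
                          'Date': ['3/1/2019'] * 4})

updates_df = pd.DataFrame({'ID': ['A', 'B', 'C', 'D'],
                           'Age': [16, 14, 17, 13],
                           'School': ['AB', 'ZX', 'EF', 'GH'],
                           'Score': [80, 90, 62, 100],
                           'Date': ['4/1/2019'] * 4})

# Columns that define "the same record"; Date is deliberately excluded.
key_cols = ['ID', 'Age', 'School', 'Score']

# Left join: every update row is kept, and "_merge" records whether an
# identical row (on key_cols) exists in the master.
flagged = updates_df.merge(master_df[key_cols].drop_duplicates(),
                           on=key_cols, how='left', indicator=True)

# Keep only the updates with no identical row in the master.
changed = flagged.loc[flagged['_merge'] == 'left_only', updates_df.columns]

# Append the changed rows; the stable sort keeps each ID's history in order.
updated_master = (
    pd.concat([master_df, changed], ignore_index=True)
      .sort_values('ID', kind='mergesort')
      .reset_index(drop=True)
)
print(updated_master)
```

Note that this compares each update against every historical row for that ID, not only the most recent one. If a value can revert to an earlier state and that should count as a change, the master would first need to be reduced to the latest row per ID.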

