Pairing missing values between a DataFrame and a DataFrame of unique values and exporting to CSV

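Question

I have a list of schools and the classes they offer. I also have a list of unique classes, only some of which are offered at each school. I want to return the missing classes for each school, paired with the school name.

I have been able to return the list of missing classes for each school, but I cannot pair each school's missing classes with the corresponding school name.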

Read in the DataFrames

import pandas as pd

schools = {'School': ['School A', 'School A', 'School A', 'School B', 'School B', 'School B', 'School C', 'School C', 'School D'], 'Class': ['Math', 'Chemistry', 'English', 'Math', 'Chemistry', 'English', 'Math', 'Chemistry', 'Physics']}
dfSchool = pd.DataFrame(data=schools)
dfSchool

classes = {'Class': ['Math', 'Chemistry', 'English', 'History', 'Physics']}
dfClasses = pd.DataFrame(data=classes)
dfClasses

Group by school

grouped = dfSchool.groupby('School')

newdflist = []

for name, group in grouped:
    newdflist.append(group)
    print(name)
    print(group)

Return the missing classes for each school

# Iterate over every group instead of hard-coding the number of schools
for group in newdflist:
    missingClasses = dfClasses[~dfClasses['Class'].isin(group['Class'])]
    print(missingClasses)

Actual result:

     Class
3  History
4  Physics

     Class
3  History
4  Physics

     Class
2  English
3  History
4  Physics

       Class
0       Math
1  Chemistry
2    English
3    History

Desired result:

     School    Class
3  School A  History
4  School A  Physics

     School    Class
3  School B  History
4  School B  Physics

     School    Class
2  School C  English
3  School C  History
4  School C  Physics

     School      Class
0  School D       Math
1  School D  Chemistry
2  School D    English
3  School D    History

Tags: python, python-3.x, pandas

Answer


Here is a quick way to print the desired result:

    for name, group in grouped:
        print(name)
        print(dfClasses[~(dfClasses.Class.isin(group["Class"]))])

The output I get from this is:

    School A
         Class
    3  History
    4  Physics
    School B
         Class
    3  History
    4  Physics
    School C
         Class
    2  English
    3  History
    4  Physics
    School D
           Class
    0       Math
    1  Chemistry
    2    English
    3    History

All you have to do is collect these results into a DataFrame instead of printing them.
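A minimal sketch of that last step, assuming the same `dfSchool` and `dfClasses` as in the question. The names `frames`, `dfMissing` and `csv_text` are illustrative and not part of the original post. Each group's missing classes get a `School` column holding the group name, and the pieces are then concatenated:

```python
import pandas as pd

schools = {
    'School': ['School A', 'School A', 'School A',
               'School B', 'School B', 'School B',
               'School C', 'School C', 'School D'],
    'Class': ['Math', 'Chemistry', 'English',
              'Math', 'Chemistry', 'English',
              'Math', 'Chemistry', 'Physics'],
}
dfSchool = pd.DataFrame(data=schools)
dfClasses = pd.DataFrame(
    data={'Class': ['Math', 'Chemistry', 'English', 'History', 'Physics']}
)

frames = []  # illustrative name: one DataFrame of missing classes per school
for name, group in dfSchool.groupby('School'):
    # Classes from the full list that this school does not offer
    missing = dfClasses[~dfClasses['Class'].isin(group['Class'])].copy()
    # Pair every missing class with the school name as the first column
    missing.insert(0, 'School', name)
    frames.append(missing)

# Stack the per-school pieces; the original row labels from dfClasses are kept
dfMissing = pd.concat(frames)
print(dfMissing)

# With no path, to_csv returns the CSV text; pass a file path
# (for example a hypothetical 'missing_classes.csv') to write a file instead
csv_text = dfMissing.to_csv(index=False)
```

Keeping the row labels from `dfClasses` matches the index values shown in the desired result; pass `ignore_index=True` to `pd.concat` if a fresh 0..n-1 index is preferred.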

Hope this helps :)

