How to convert a pandas DataFrame into nested namedtuples with an aggregation level

Question

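I am looking for a way to create nested namedtuples from a pandas DataFrame. The object d below is the expected output. I am not sure whether the aggregation has to be done in pandas first, with the conversion to NamedTuple done afterwards.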

from typing import NamedTuple
from typing import List
import pandas as pd

if __name__ == "__main__":
    data = [["tom", 10, "ab 11"], ["nick", 15, "ab 22"], ["juli", 14, "ab 11"]]
    People = pd.DataFrame(data, columns=["Name", "Age", "PostalCode"])

    names = list(People[["Name"]].itertuples(name="Names", index=False))
    postal_codes = list(
        People[["PostalCode"]].itertuples(name="PostalCode", index=False)
    )

    # ...
    # ... The code below builds the expected output by hand; the names of the NamedTuple classes do not matter

    PeopleName = NamedTuple("PeopleName", [("Name", str)])
    PeoplePC = NamedTuple("PeoplePC", [("PostalCode", str)])
    Demography = NamedTuple(
        "Demography", [("names", List[PeopleName]), ("postalcodes", PeoplePC)]
    )

    d = [
        Demography(
            [PeopleName(Name="tom"), PeopleName(Name="juli")],
            PeoplePC(PostalCode="ab 11"),
        ),
        Demography([PeopleName(Name="nick")], PeoplePC(PostalCode="ab 22"),),
    ]

Tags: python, pandas, tuples

Answer


You can use groupby and then apply a function (to_nested_tuple) to each group:

from typing import NamedTuple, List

import pandas as pd

data = [["tom", 10, "ab 11"], ["nick", 15, "ab 22"], ["juli", 14, "ab 11"]]
people = pd.DataFrame(data, columns=["Name", "Age", "PostalCode"])

PeopleName = NamedTuple("PeopleName", [("Name", str)])
PeoplePC = NamedTuple("PeoplePC", [("PostalCode", str)])
Demography = NamedTuple("Demography", [("names", List[PeopleName]), ("postalcodes", PeoplePC)])


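def to_nested_tuple(k, g):
    # k is the group key (the postal code); g is the sub-DataFrame for that group.
    # itertuples(name='Person') creates its own namedtuple class, so items print as Person, not PeopleName.
    peoples = list(g['Name'].to_frame().itertuples(name='Person', index=False))
    return Demography(peoples, PeoplePC(k))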


d = [to_nested_tuple(*item) for item in people.groupby('PostalCode')]

print(d)

Output

[Demography(names=[Person(Name='tom'), Person(Name='juli')], postalcodes=PeoplePC(PostalCode='ab 11')), Demography(names=[Person(Name='nick')], postalcodes=PeoplePC(PostalCode='ab 22'))]
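If the inner items should be actual PeopleName instances, as in the expected d from the question, one possible variant is to do the aggregation in pandas first and convert afterwards. The following is only a sketch under that assumption: the intermediate variable names_by_pc is a name introduced here for illustration, and the approach relies on groupby sorting the group keys by default.

```python
from typing import List, NamedTuple

import pandas as pd

PeopleName = NamedTuple("PeopleName", [("Name", str)])
PeoplePC = NamedTuple("PeoplePC", [("PostalCode", str)])
Demography = NamedTuple(
    "Demography", [("names", List[PeopleName]), ("postalcodes", PeoplePC)]
)

data = [["tom", 10, "ab 11"], ["nick", 15, "ab 22"], ["juli", 14, "ab 11"]]
people = pd.DataFrame(data, columns=["Name", "Age", "PostalCode"])

# Step 1: aggregate in pandas, collecting the names for each postal code into a list.
# names_by_pc is a hypothetical helper name; it is a Series indexed by postal code.
names_by_pc = people.groupby("PostalCode")["Name"].agg(list)

# Step 2: convert each (postal code, names) pair into the nested NamedTuple,
# wrapping every name in PeopleName so the items match the declared field type.
d = [
    Demography([PeopleName(name) for name in names], PeoplePC(pc))
    for pc, names in names_by_pc.items()
]

print(d)
```

Because namedtuples compare like plain tuples, the result of this sketch can be compared directly against the hand-written d from the question with ==.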
