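python - How to convert a pandas DataFrame to nested named tuples with an aggregation level
Question
I'm looking for a way to create nested named tuples from a pandas DataFrame. The object d below is the expected output. I'm not sure whether the aggregation has to be done directly in pandas, with the conversion to NamedTuple done afterwards.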
from typing import NamedTuple
from typing import List
import pandas as pd
if __name__ == "__main__":
    data = [["tom", 10, "ab 11"], ["nick", 15, "ab 22"], ["juli", 14, "ab 11"]]
    People = pd.DataFrame(data, columns=["Name", "Age", "PostalCode"])
    names = list(People[["Name"]].itertuples(name="Names", index=False))
    postal_codes = list(
        People[["PostalCode"]].itertuples(name="PostalCode", index=False)
    )
    # ...
    # ... The code below produces the expected output; the names of the NamedTuple classes don't matter
    PeopleName = NamedTuple("PeopleName", [("Name", str)])
    PeoplePC = NamedTuple("PeoplePC", [("PostalCode", str)])
    Demography = NamedTuple(
        "Demography", [("names", List[PeopleName]), ("postalcodes", PeoplePC)]
    )
    d = [
        Demography(
            [PeopleName(Name="tom"), PeopleName(Name="juli")],
            PeoplePC(PostalCode="ab 11"),
        ),
        Demography([PeopleName(Name="nick")], PeoplePC(PostalCode="ab 22"),),
    ]
Solution
You can use groupby and then apply a function (to_nested_tuple) to each group:
from typing import NamedTuple, List
import pandas as pd
data = [["tom", 10, "ab 11"], ["nick", 15, "ab 22"], ["juli", 14, "ab 11"]]
people = pd.DataFrame(data, columns=["Name", "Age", "PostalCode"])
PeopleName = NamedTuple("PeopleName", [("Name", str)])
PeoplePC = NamedTuple("PeoplePC", [("PostalCode", str)])
Demography = NamedTuple("Demography", [("names", List[PeopleName]), ("postalcodes", PeoplePC)])
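def to_nested_tuple(k, g):
    # k is the group key (the postal code), g is the sub-frame for that group.
    # itertuples builds its own namedtuple class, here named 'Person'.
    peoples = list(g['Name'].to_frame().itertuples(name='Person', index=False))
    return Demography(peoples, PeoplePC(k))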
d = [to_nested_tuple(*item) for item in people.groupby('PostalCode')]
print(d)
Output
[Demography(names=[Person(Name='tom'), Person(Name='juli')], postalcodes=PeoplePC(PostalCode='ab 11')), Demography(names=[Person(Name='nick')], postalcodes=PeoplePC(PostalCode='ab 22'))]
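Note that the inner tuples above are instances of the Person class generated by itertuples, not of the declared PeopleName type. If that matters, the variant below is a minimal sketch that does the aggregation in pandas first (one list of names per postal code) and only then converts to the declared NamedTuple classes. The helper name to_demography and the use of sort=False are assumptions made for illustration; they are not part of the original answer.

```python
from typing import List, NamedTuple

import pandas as pd


class PeopleName(NamedTuple):
    Name: str


class PeoplePC(NamedTuple):
    PostalCode: str


class Demography(NamedTuple):
    names: List[PeopleName]
    postalcodes: PeoplePC


def to_demography(frame: pd.DataFrame) -> List[Demography]:
    """Hypothetical helper: group names by postal code, then build the tuples."""
    # Aggregate in pandas: a Series indexed by postal code whose values are
    # plain Python lists of names. sort=False keeps first-appearance order.
    grouped = frame.groupby("PostalCode", sort=False)["Name"].agg(list)
    # Convert each (postal code, names) pair into the declared types.
    return [
        Demography([PeopleName(n) for n in names], PeoplePC(code))
        for code, names in grouped.items()
    ]


data = [["tom", 10, "ab 11"], ["nick", 15, "ab 22"], ["juli", 14, "ab 11"]]
people = pd.DataFrame(data, columns=["Name", "Age", "PostalCode"])
d = to_demography(people)
print(d)
```

With this approach every element of names is a PeopleName, so d compares equal to the hand-written expected output from the question.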