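python - Merging rows into a single cell (as a tuple) for specific columns
Problem description
Suppose I have a table like the one below. For example, the row under id 2 has no id of its own; it only holds the status and date history for that id.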
id   status          date        Type
2    dissolved       2016/03/19  T1
nan  active          NaT         nan
3    dissolved       2016/03/19  T3
nan  active          2012/03/16  nan
4    in liquidation  2017/03/19  T2
nan  dissolved       NaT         nan
What I want to do is combine the rows by id, so that for the first id (2), for example, I get:
id   status              date               Type
2    [dissolved,active]  [2016/03/19,None]  T1
I tried:
data.groupby(['id']).agg(lambda x: tuple(x)).applymap(list).reset_index()
But this sets the id column to [2, nan] when all I want is 2. How can I do this? I just want my status and date values as lists, not all added together!
Input df:
df = pd.DataFrame([[2,"dissolved","2016/03/19","T1" ],
[float("nan"),"active","NaT",float("nan")],
[3,"dissolved","2016/03/19","T3" ],
[float("nan"),"active","2012/03/16",float("nan")],
[4,"in liquidation","2017/03/19","T2" ],
[float("nan"),"dissolved","NaT",float("nan")]],columns = ["id","status","date","Type"])
Solution
Try the following:
import pandas as pd

df = pd.DataFrame([[2,"dissolved","2016/03/19","T1" ],
[float("nan"),"active","NaT",float("nan")],
[3,"dissolved","2016/03/19","T3" ],
[float("nan"),"active","2012/03/16",float("nan")],
[4,"in liquidation","2017/03/19","T2" ],
[float("nan"),"dissolved","NaT",float("nan")]],columns = ["id","status","date","Type"])
# Propagate each id and Type down to the history rows that follow it
df = df.ffill()
# Append a separator so the concatenated strings can be split again later
df["status"] = df["status"] + ","
df["date"] = df["date"] + ","
# Summing string columns concatenates them within each group
df2 = df.groupby(["id","Type"]).sum()
# Split on the separator and drop the empty piece after the trailing comma
df2["status"] = df2["status"].apply(lambda x: x.split(",")[:-1])
df2["date"] = df2["date"].apply(lambda x: x.split(",")[:-1])
df2
Depending on your dataset, you may need to adjust it a bit. It outputs the following:
                               status                      date
id  Type
2.0 T1            [dissolved, active]         [2016/03/19, NaT]
3.0 T3            [dissolved, active]  [2016/03/19, 2012/03/16]
4.0 T2       [in liquidation, dissolved]      [2017/03/19, NaT]
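As a possible alternative to the string-concatenation trick, here is a minimal sketch that collects the values into lists directly. It assumes, as in the sample input, that a missing date is stored as the string "NaT" and that every history row belongs to the nearest id above it. The name `result` and the choice to map "NaT" to None (to match the output asked for in the question) are my own additions, not part of the answer above.

```python
import pandas as pd

df = pd.DataFrame(
    [[2, "dissolved", "2016/03/19", "T1"],
     [float("nan"), "active", "NaT", float("nan")],
     [3, "dissolved", "2016/03/19", "T3"],
     [float("nan"), "active", "2012/03/16", float("nan")],
     [4, "in liquidation", "2017/03/19", "T2"],
     [float("nan"), "dissolved", "NaT", float("nan")]],
    columns=["id", "status", "date", "Type"],
)

# Forward-fill only the key columns, so genuine gaps in status/date
# are never overwritten by the previous row's value.
df[["id", "Type"]] = df[["id", "Type"]].ffill()
df["id"] = df["id"].astype(int)  # 2.0 -> 2

# Aggregate each group into Python lists. For the date column,
# the placeholder string "NaT" is mapped to None (an assumption
# about how missing dates are encoded in the input).
result = (
    df.groupby(["id", "Type"], sort=False)
      .agg({
          "status": list,
          "date": lambda s: [None if v == "NaT" else v for v in s],
      })
      .reset_index()
)
print(result)
```

Because only id and Type are forward-filled, this keeps id as a plain scalar per group, and values that themselves contain commas are not split by accident.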