Creating a pivot table in Pandas (SQLAlchemy)

Question

I am trying to create a pandas DataFrame that combines all of a parent's children into a single row.

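from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship

Base = declarative_base()

class Parent(Base):
  __tablename__ = 'parent'
  id = Column(Integer, primary_key=True)
  name = Column(String())
  # `class` is a reserved word in Python, so the attribute is named class_
  # and mapped explicitly to the "class" column.
  class_ = Column('class', String())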

  all_distance = relationship('Distance', back_populates='parent')
  all_weight = relationship('Weight', back_populates='parent')

class Distance(Base):
  __tablename__ = 'distance'
  id = Column(Integer, primary_key=True)
  distance = Column(String())
  finished = Column(String())

  parent_id = Column(Integer, ForeignKey('parent.id'))
  parent = relationship('Parent', back_populates='all_distance')

class Weight(Base):
  __tablename__ = 'weight'
  id = Column(Integer, primary_key=True)
  weight = Column(String())
  height = Column(String())

  parent_id = Column(Integer, ForeignKey('parent.id'))
  parent = relationship('Parent', back_populates='all_weight')

The tables, with some sample data:

parent
ID | Name | Class
1  | Joe  | Paladin
2  | Ron  | Mage
3  | Sara | Knight

distance
ID | distance | finished | parent_id
1  | 2 miles  | yes      | 1
2  | 3 miles  | yes      | 1 
3  | 1 miles  | yes      | 1
4  | 10 miles | no       | 2

weight
ID | Weight | height | parent_id
1  | 5 lbs  | 5'3    | 1
2  | 10 lbs | 5'5    | 2

The goal is to create a pandas DataFrame that looks like this:

1 | Joe  | Paladin | 2 miles  | yes  | 3 miles | yes  | 1 miles | yes  | 5lbs  | 5'3
2 | Ron  | Mage    | 10 miles | no   | None    | None | None    | None | 10lbs | 5'5
3 | Sara | Knight  | None     | None | None    | None | None    | None | None  | None

How can I do this?

I have gotten somewhat close:

df = pd.read_sql(db_session.query(Parent, Distance, Weight).join(Distance).join(Weight).statement, db_session.bind)

This gives me a DataFrame with all the tables joined together.

list(df.columns.values)

['id', 'name', 'class', 'id', 'distance', 'finished', 'id', 'weight', 'height']

How do I prevent duplicate column headers? For example, id now appears three times.
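One way to avoid the repeated id header is to give every selected column an explicit alias in the query. The sketch below is only an illustration: it uses the standard-library sqlite3 module with an in-memory database and raw SQL rather than the ORM session from the question, and the alias names (parent_id, distance_id) are assumptions. In the ORM query, Column.label() plays the same role as AS. The sketch also uses a LEFT JOIN so that parents without children (Sara) are kept, which an inner join would drop.

```python
import sqlite3

import pandas as pd

# In-memory stand-in for the question's database (illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (id INTEGER PRIMARY KEY, name TEXT, class TEXT);
    CREATE TABLE distance (id INTEGER PRIMARY KEY, distance TEXT,
                           finished TEXT, parent_id INTEGER);
    INSERT INTO parent VALUES
        (1, 'Joe', 'Paladin'), (2, 'Ron', 'Mage'), (3, 'Sara', 'Knight');
    INSERT INTO distance VALUES
        (1, '2 miles', 'yes', 1), (2, '3 miles', 'yes', 1),
        (3, '1 miles', 'yes', 1), (4, '10 miles', 'no', 2);
""")

# Every column gets a unique alias, so the DataFrame has no duplicate headers.
sql = """
    SELECT parent.id         AS parent_id,
           parent.name       AS name,
           parent.class      AS class,
           distance.id       AS distance_id,
           distance.distance AS distance,
           distance.finished AS finished
    FROM parent
    LEFT JOIN distance ON distance.parent_id = parent.id
    ORDER BY parent.id, distance.id
"""
df = pd.read_sql(sql, conn)
print(list(df.columns))
```

With unique headers, expressions such as df["parent_id"] select a single column instead of several columns that share one name.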

However, when I try to build the pivot table:

df.pivot(index="id")

it returns an error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/anaconda2/lib/python2.7/site-packages/pandas/core/frame.py", line 5194, in pivot
    return pivot(self, index=index, columns=columns, values=values)
  File "/anaconda2/lib/python2.7/site-packages/pandas/core/reshape/reshape.py", line 400, in pivot
    indexed = self.set_index(cols, append=append)
  File "/anaconda2/lib/python2.7/site-packages/pandas/core/frame.py", line 3909, in set_index
    level = frame[col]._values
  File "/anaconda2/lib/python2.7/site-packages/pandas/core/frame.py", line 2688, in __getitem__
    return self._getitem_column(key)
  File "/anaconda2/lib/python2.7/site-packages/pandas/core/frame.py", line 2698, in _getitem_column
    result = self._constructor(self._data.get(key))
  File "/anaconda2/lib/python2.7/site-packages/pandas/core/internals.py", line 4130, in get
    raise TypeError("cannot label index with a null key")
TypeError: cannot label index with a null key

Tags: python, pandas, sqlalchemy

Answer


You are trying to pass "id" as the index, which is why the pivot fails. It should be:

df.pivot(df.index,"id")
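For reference, the traceback in the question ends in set_index being handed a null key, which is consistent with pivot having been called without a columns argument. To actually reach the one-row-per-parent layout, one possible approach is to number each parent's children and spread them into numbered columns before joining back to the parent table. The following is a minimal sketch under stated assumptions: the DataFrames are rebuilt by hand from the sample data, and the helper name widen and the column suffixes (distance_1, finished_1, and so on) are hypothetical, not part of the original question or answer.

```python
import pandas as pd

# Sample data from the question, rebuilt by hand for illustration.
parent = pd.DataFrame({
    "id": [1, 2, 3],
    "name": ["Joe", "Ron", "Sara"],
    "class": ["Paladin", "Mage", "Knight"],
})
distance = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "distance": ["2 miles", "3 miles", "1 miles", "10 miles"],
    "finished": ["yes", "yes", "yes", "no"],
    "parent_id": [1, 1, 1, 2],
})
weight = pd.DataFrame({
    "id": [1, 2],
    "weight": ["5 lbs", "10 lbs"],
    "height": ["5'3", "5'5"],
    "parent_id": [1, 2],
})


def widen(child, value_cols):
    """Spread each parent's child rows across numbered columns (hypothetical helper)."""
    child = child.copy()
    # Number the children of each parent 1, 2, 3, ... in row order.
    child["n"] = child.groupby("parent_id").cumcount() + 1
    wide = child.set_index(["parent_id", "n"])[value_cols].unstack("n")
    # Reorder so that the fields of each child sit next to each other.
    numbers = range(1, int(child["n"].max()) + 1)
    wide = wide[[(col, n) for n in numbers for col in value_cols]]
    # Flatten the two-level header into names such as "distance_1".
    wide.columns = ["{}_{}".format(col, n) for col, n in wide.columns]
    return wide


# Left-join on the parent id so parents without children are kept.
result = (
    parent.set_index("id")
    .join(widen(distance, ["distance", "finished"]))
    .join(widen(weight, ["weight", "height"]))
)
print(result)
```

Missing children show up as NaN rather than None; result.astype(object).where(result.notna(), None) is one way to convert them if None is required.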
