How to sum an object column in Python

Problem description

I have a dataset represented as a Pandas object, shown below:

datetime       season  holiday  workingday  weather  temp  atemp   humidity  windspeed  casual  registered  count
1/1/2011 0:00  1       0        0           1        9.84  14.395  81        0          3       13          16
1/1/2011 1:00  1       0        0           2        9.02  13.635  80        0          8       32          40
1/1/2011 2:00  1       0        0           3        9.02  13.635  80        0          5       27          32

p_type_1 = pd.read_csv("Bike Share Demand.csv")

p_type_1 = (p_type_1 >> 
            rename(date = X.datetime))

p_type_1.date.str.split(expand=True,)
p_type_1[['Date','Hour']] = p_type_1.date.str.split(" ",expand=True,)

p_type_1['date'] = pd.to_datetime(p_type_1['date'])

p_hour = p_type_1["Hour"]
p_hour

Now I am trying to get the sum of the Hour column I created (p_hour):

p_hours = p_type_1["Hour"].sum()
p_hours

and get this error: TypeError: must be str, not int

So I then tried:

p_hours = p_type_1(str["Hour"].sum())
p_hours

and get this error: TypeError: 'type' object is not subscriptable

I just want the sum. What gives?

Tags: python, pandas

Solution


Several things here are incorrect, so I'll break down the issues and offer alternatives.

Here:

p_hours = p_type_1(str["Hour"].sum())
p_hours

What you were presumably trying to write is this:

p_hours = p_type_1[str("Hour")].sum()
p_hours

Instead, the str["Hour"] in your code subscripts the built-in str type itself, as if asking it for an item named 'Hour'. That is not what you intended. This crash is unrelated to your core problem; it is simply a mistake in how the expression is written, and it surfaces as a TypeError at runtime rather than as a true syntax error.

The actual problem is that your dataframe column mixes string and integer values. The sum operation concatenates strings or adds numeric values; on a column that mixes the two, it fails.

To verify that this is the issue, however, we would need to see your actual dataframe, as I suspect the sample you posted is not the one that produced the error.
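
One way to check this on your side is to count the Python types actually stored in the column. The following is a minimal sketch on a hypothetical frame: the data is made up for illustration (three strings plus one stray integer) and is not taken from your file.

```python
import pandas as pd

# Hypothetical data: three string values and one stray integer.
# dtype=object keeps the values exactly as given, without conversion.
frame = pd.DataFrame(
    {"Hour": pd.Series(["0:00", "1:00", "2:00", 3], dtype=object)}
)

# Map each cell to the name of its Python type, then count occurrences.
type_counts = frame["Hour"].map(lambda value: type(value).__name__).value_counts()
print(type_counts)
```

If more than one type name shows up in the counts, the column is mixed and sum() cannot combine its values.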

As a proof of concept, I created the following example:

import pandas as pd
dta = [str(x) for x in range(20)]
dta.append(12)
frame = pd.DataFrame.from_dict({
    "data": dta})

print(frame["data"].sum())

>>> TypeError: can only concatenate str (not "int") to str

Note that the wording differs from the message in your question: the text comes from Python's own string concatenation, and newer Python versions word it more clearly.
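
If the goal is a numeric total of the hours, one option is to turn the values into numbers before summing. This is a sketch only, assuming the timestamps look like the sample rows in the question; the frame and variable names are illustrative, and the format string is my assumption about how the dates are laid out.

```python
import pandas as pd

# Three timestamps copied from the sample rows in the question.
frame = pd.DataFrame(
    {"date": ["1/1/2011 0:00", "1/1/2011 1:00", "1/1/2011 2:00"]}
)

# Parse the strings into real datetimes (assumed month/day/year layout),
# then read the hour of each timestamp as an integer.
frame["date"] = pd.to_datetime(frame["date"], format="%m/%d/%Y %H:%M")
frame["Hour"] = frame["date"].dt.hour

# The column is now numeric, so sum() adds instead of concatenating.
total_hours = frame["Hour"].sum()
print(total_hours)  # prints 3
```

For a column that already holds mostly numeric text, pd.to_numeric(frame["Hour"], errors="coerce") is another option; it converts what it can and turns unparseable entries into NaN.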

