Pandas groupby and apply function to numeric columns

Problem description

I'm trying to apply the Shapiro-Wilk test to my dataframe, which is split into groups based on two categorical variables:

df.groupby(['category 1', 'category 2']).apply(stats.shapiro)

This results in an error saying that it couldn't convert string to float. The only non-numeric columns in there are the two categories which I'm using to split the dataframe.

How do I fix it?

EDIT:

example data:

cat1    cat2    purchases    sales
A       B       20           25
C       A       30           45
B       B       35           20
A       A       40           50

I want to get the Shapiro-Wilk statistic and a p-value for each of the numeric columns without having to write out all possible combinations of each category.

Tags: python, python-3.x, pandas, pandas-groupby, pandas-apply

Solution

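This should work. Select the numeric columns before calling apply, so the string grouping columns are never passed to stats.shapiro. Use a list inside the brackets (double brackets); selecting several columns with a bare tuple was deprecated in pandas 1.0 and no longer works in pandas 2.0:

df.groupby(['cat1', 'cat2'])[['purchases', 'sales']].apply(stats.shapiro)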
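
If you want a separate statistic and p-value for each numeric column, as the question asks, one option is to loop over the columns inside the applied function. The sketch below is only an illustration under some assumptions: the helper name `shapiro_per_column`, the `_W` / `_p` column suffixes, and the randomly generated data are all made up for the example. As far as I know, stats.shapiro treats whatever it receives as one sample, so passing a two-column group directly would pool both columns into a single test. It also needs at least 3 observations, so the four-row sample in the question (one row per group) is too small to test; groups that are too small get NaN here.

```python
import numpy as np
import pandas as pd
from scipy import stats


def shapiro_per_column(group: pd.DataFrame) -> pd.Series:
    """Hypothetical helper: run Shapiro-Wilk on each column of one group."""
    out = {}
    for col in group.columns:
        values = group[col].dropna()
        if len(values) >= 3:
            # shapiro returns (statistic, p-value)
            w, p = stats.shapiro(values)
        else:
            # too few observations for the test
            w, p = np.nan, np.nan
        out[f"{col}_W"] = w
        out[f"{col}_p"] = p
    return pd.Series(out)


# Made-up data: 4 groups of 10 rows each (not the asker's data)
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "cat1": np.repeat(["A", "B"], 20),
    "cat2": np.tile(np.repeat(["A", "B"], 10), 2),
    "purchases": rng.normal(30, 5, 40),
    "sales": rng.normal(40, 8, 40),
})

numeric_cols = ["purchases", "sales"]
result = df.groupby(["cat1", "cat2"])[numeric_cols].apply(shapiro_per_column)
print(result)
```

The result has one row per (cat1, cat2) combination and two columns (W and p) per numeric column, so there is no need to enumerate the category combinations by hand.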
