Categorizing data with pandas

Problem description

I am trying to run a chi-squared test on a dataset, and to do that I need to use pd.cut() to create categories in the dataset. However, I get this error:

ufunc 'subtract' did not contain a loop with signature matching types dtype('

My code:

import pandas as pd
import numpy as np
import scipy as sp
import math

data_main = pd.read_csv("sample_survey.csv")
data = data_main.iloc[:, [1,2]]

data["wrkstat"] = data["wrkstat"].astype(str)
data["marital"] = data["marital"].astype(str)
cols = ['wrkstat', 'marital']

cut_points = ['Divorced', 'Married', 'Never Married', 'Seperated','Widowed']
label_names = ['Divorced1', 'Married', 'Never Married', 
'Seperated','Widowed']
data["Marital_Categories"] = pd.cut(data["marital"], cut_points)

marital_by_wrkstat = data[['wrkstat', 'marital_categories']]
marital_by_wrkstat.head()

Tags: python, pandas, machine-learning, categorical-data, chi-squared

Solution
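
One likely cause, sketched here under stated assumptions: pd.cut() bins numeric values into intervals, so passing it a string column and string bin edges makes NumPy attempt arithmetic on strings, which would produce a ufunc 'subtract' error of this kind. Because the marital values are already category labels, converting the column with astype("category") and building a contingency table with pd.crosstab() may be all that is needed before calling scipy.stats.chi2_contingency(). The small DataFrame below is a hypothetical stand-in for sample_survey.csv, whose real contents are not shown in the question.

```python
import pandas as pd
from scipy.stats import chi2_contingency

# Hypothetical stand-in for sample_survey.csv; the real file is not shown.
data = pd.DataFrame({
    "wrkstat": ["Full time", "Part time", "Full time", "Retired",
                "Full time", "Part time", "Retired", "Full time"],
    "marital": ["Married", "Divorced", "Never Married", "Widowed",
                "Married", "Never Married", "Married", "Divorced"],
})

# pd.cut bins numeric values; string labels are already categories,
# so convert the column to the categorical dtype instead.
data["marital_categories"] = data["marital"].astype("category")

# Contingency table of observed counts for the two categorical variables.
observed = pd.crosstab(data["wrkstat"], data["marital_categories"])

# Chi-squared test of independence on the contingency table.
chi2, p_value, dof, expected = chi2_contingency(observed)
print(observed)
print(f"chi2={chi2:.3f}, p={p_value:.3f}, dof={dof}")
```

Note also that the posted code assigns data["Marital_Categories"] but later selects 'marital_categories'. Column names are case-sensitive, so the two spellings must match.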

