Error when trying to incorporate target data from sklearn in Python

Problem description

I am trying to build a matrix to evaluate how different features affect the data set's target. I am using sklearn's breast cancer data. My code is below, but it raises an error and I cannot figure out how to fix it. Can someone help me here?

    import numpy as np
    import seaborn as sns; sns.set(style="ticks", color_codes=True)
    import sklearn.datasets
    import pandas as pd
    from sklearn.datasets import load_breast_cancer
    WBC_dataset = load_breast_cancer()
    
    WBC_df = pd.DataFrame(
    data= np.c_[WBC_dataset['data'],WBC_dataset['target']],
    columns= np.append(WBC_dataset['feature_names'], ['Condition']))
    cols = WBC_dataset.columns.drop('Condition')
    WBC_df[cols] = WBC_df[cols].apply(pd.to_numeric)
    g = sns.pairplot(WBC_df, hue='Condition')

Here is the error:

KeyError                                  Traceback (most recent call last)
~\Anaconda3\lib\site-packages\sklearn\utils\__init__.py in __getattr__(self, key)
    104         try:
--> 105             return self[key]
    106         except KeyError:

KeyError: 'columns'

During handling of the above exception, another exception occurred:

AttributeError                            Traceback (most recent call last)
<ipython-input-245-1123cb0b25bb> in <module>
      9 data= np.c_[WBC_dataset['data'],WBC_dataset['target']],
     10 columns= np.append(WBC_dataset['feature_names'], ['Condition']))
---> 11 cols = WBC_dataset.columns.drop('Condition')
     12 WBC_df[cols] = WBC_df[cols].apply(pd.to_numeric)
     13 g = sns.pairplot(WBC_df, hue='Condition')

~\Anaconda3\lib\site-packages\sklearn\utils\__init__.py in __getattr__(self, key)
    105             return self[key]
    106         except KeyError:
--> 107             raise AttributeError(key)
    108 
    109     def __setstate__(self, state):

AttributeError: columns

Tags: python

Solution


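WBC_dataset is not a DataFrame. It is the dictionary-like Bunch object returned by load_breast_cancer(), holding several values (data, target, feature_names and so on), so it has no columns attribute. Only WBC_df, the DataFrame built from it, has columns. Check the documentation.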

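    import pandas as pd
    from sklearn.datasets import load_breast_cancer

    data = load_breast_cancer()
    # Build the DataFrame from the feature matrix and the feature names
    WBC_df = pd.DataFrame(data.data, columns=data.feature_names)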
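To keep the target in the frame, as the question intended, the following is a minimal sketch of one way to do it. It reuses the question's names (WBC_dataset, WBC_df, Condition, cols) and assigns the target as its own column rather than stacking it with np.c_. The plotting step is left out so the snippet depends only on pandas and scikit-learn.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer

# The loader returns a dictionary-like Bunch, not a DataFrame.
WBC_dataset = load_breast_cancer()

# Features become the columns; the target is added as a separate column.
WBC_df = pd.DataFrame(WBC_dataset.data, columns=WBC_dataset.feature_names)
WBC_df["Condition"] = WBC_dataset.target

# .columns is looked up on the DataFrame, which is where the original code failed.
cols = WBC_df.columns.drop("Condition")

print(WBC_df.shape)                     # (569, 31)
print(hasattr(WBC_dataset, "columns"))  # False
```

From here, sns.pairplot(WBC_df, hue='Condition') should work as in the question. With 30 features the full grid is very large, so passing a subset through the vars argument, for example vars=list(cols[:4]), is probably more practical. On scikit-learn 0.23 or later, load_breast_cancer(as_frame=True).frame is another option, although the target column is then named target rather than Condition.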
