
Problem description

import pandas as pd
import numpy as np
import seaborn as sb
from matplotlib import pyplot as plt

phoneinfo=np.array([['galaxy s8','android',64,4,140,'samsung',6],['lumia','windows',32,3,150,'microsoft',6],
                    ['xperia l1','android',16,2,180,'sony',5],['iphone7','ios',128,2,138,'apple',4],
                    ['u ultra','android',64,4,170,'htc',5],['galaxy s5','android',16,2,145,'samsung',5],
                    ['iphone 5s','ios',32,1,112,'apple',4],['moto g5','android',16,3,144.7,'motorola',5],
                    ['pixel','android',128,4,143,'google',5]])

phDF=pd.DataFrame(phoneinfo,index=[1,2,3,4,5,6,7,8,9],columns=['name','os','capacity','ram','weight','company','inch'])

sb.pairplot(phDF)
plt.show()


num_var=phDF.drop(['name','os','capacity','ram','company'],axis=1)
corr=num_var.corr()
corr

Tags: python, pandas, numpy, seaborn

Solution


The problem lies in how the DataFrame is constructed.

A NumPy array must have a single dtype, so every element of this one is coerced to a common type:

phoneinfo = np.array([['galaxy s8', 'android', 64, 4, 140, 'samsung', 6],
                      ['lumia', 'windows', 32, 3, 150, 'microsoft', 6],
                      ['xperia l1', 'android', 16, 2, 180, 'sony', 5],
                      ['iphone7', 'ios', 128, 2, 138, 'apple', 4],
                      ['u ultra', 'android', 64, 4, 170, 'htc', 5],
                      ['galaxy s5', 'android', 16, 2, 145, 'samsung', 5],
                      ['iphone 5s', 'ios', 32, 1, 112, 'apple', 4],
                      ['moto g5', 'android', 16, 3, 144.7, 'motorola', 5],
                      ['pixel', 'android', 128, 4, 143, 'google', 5]])

Its dtype is <U32, a Unicode string type:

print(phoneinfo.dtype)  # <U32
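To see the coercion in isolation, here is a minimal sketch using a made-up two-row array (the name `mixed` and its values are illustrative, not part of the question). The exact string width may differ between setups, so the sketch only inspects the dtype kind.

```python
import numpy as np

# Hypothetical two-row sample mixing strings, ints, and a float.
mixed = np.array([['moto g5', 16, 144.7],
                  ['pixel', 128, 143]])

# NumPy picks one dtype for the whole array; with strings present,
# every element is stored as a Unicode string.
print(mixed.dtype.kind)               # U
print(isinstance(mixed[0, 1], str))   # True: 16 became the string '16'
```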

This means that when the DataFrame is built:

phDF = pd.DataFrame(phoneinfo, index=[1, 2, 3, 4, 5, 6, 7, 8, 9],
                    columns=['name', 'os', 'capacity', 'ram', 'weight',
                             'company', 'inch'])

every entry in phDF.dtypes is object:

name        object
os          object
capacity    object
ram         object
weight      object
company     object
inch        object
dtype: object

seaborn.pairplot only handles numeric columns, and here there are none.
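One way to check which columns count as numeric is `DataFrame.select_dtypes`. The sketch below uses a hypothetical two-row frame (`rows` and `df` are illustrative names) built through a NumPy array, as in the question, and compares the numeric columns before and after a type conversion.

```python
import numpy as np
import pandas as pd

# Hypothetical small frame built the same way as in the question.
rows = np.array([['iphone7', 138, 4],
                 ['pixel', 143, 5]])
df = pd.DataFrame(rows, columns=['name', 'weight', 'inch'])

# Before conversion: every column holds strings, so none is numeric.
numeric_before = df.select_dtypes(include='number').columns.tolist()
print(numeric_before)  # []

# After conversion: the two measurement columns are numeric.
df = df.astype({'weight': int, 'inch': int})
numeric_after = df.select_dtypes(include='number').columns.tolist()
print(numeric_after)   # ['weight', 'inch']
```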

To fix this, first convert the column types:

# Convert int cols to int
phDF[['capacity', 'ram', 'inch']] = \
    phDF[['capacity', 'ram', 'inch']].astype(int)
# Convert float cols to float
phDF[['weight']] = phDF[['weight']].astype(float)
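As an alternative to converting afterwards (this is not what the answer does, only a sketch), the rows can be passed to pd.DataFrame as a plain nested list, skipping the NumPy array. pandas then infers a dtype per column. The example uses two rows from the question's data; `rows`, `cols`, and `df` are illustrative names.

```python
import pandas as pd

# Two rows from the question's data, kept as a plain nested list.
rows = [['moto g5', 'android', 16, 3, 144.7, 'motorola', 5],
        ['pixel', 'android', 128, 4, 143, 'google', 5]]
cols = ['name', 'os', 'capacity', 'ram', 'weight', 'company', 'inch']

# pandas infers a dtype per column, so the numbers stay numeric.
df = pd.DataFrame(rows, columns=cols)
print(df.dtypes)
```

With this construction no astype call should be needed before plotting.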

Then plot:

sns.pairplot(phDF)
plt.show()



Complete example:

import pandas as pd
import numpy as np
import seaborn as sns
from matplotlib import pyplot as plt

phoneinfo = np.array([['galaxy s8', 'android', 64, 4, 140, 'samsung', 6],
                      ['lumia', 'windows', 32, 3, 150, 'microsoft', 6],
                      ['xperia l1', 'android', 16, 2, 180, 'sony', 5],
                      ['iphone7', 'ios', 128, 2, 138, 'apple', 4],
                      ['u ultra', 'android', 64, 4, 170, 'htc', 5],
                      ['galaxy s5', 'android', 16, 2, 145, 'samsung', 5],
                      ['iphone 5s', 'ios', 32, 1, 112, 'apple', 4],
                      ['moto g5', 'android', 16, 3, 144.7, 'motorola', 5],
                      ['pixel', 'android', 128, 4, 143, 'google', 5]])

phDF = pd.DataFrame(phoneinfo, index=[1, 2, 3, 4, 5, 6, 7, 8, 9],
                    columns=['name', 'os', 'capacity', 'ram', 'weight',
                             'company', 'inch'])


# Convert int cols to int
phDF[['capacity', 'ram', 'inch']] = \
    phDF[['capacity', 'ram', 'inch']].astype(int)
# Convert float cols to float
phDF[['weight']] = phDF[['weight']].astype(float)


sns.pairplot(phDF)
plt.show()

num_var = phDF.drop(['name', 'os', 'capacity', 'ram', 'company'], axis=1)
corr = num_var.corr()
print(corr)

The resulting corr:

          weight      inch
weight  1.000000  0.364706
inch    0.364706  1.000000
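The off-diagonal value can be cross-checked without pandas. The sketch below recomputes the Pearson correlation with np.corrcoef, using the weight and inch values copied from the question's data (the variable names are illustrative).

```python
import numpy as np

# Values copied from the 'weight' and 'inch' columns of the question's data.
weight = np.array([140, 150, 180, 138, 170, 145, 112, 144.7, 143])
inch = np.array([6, 6, 5, 4, 5, 5, 4, 5, 5])

# np.corrcoef returns the 2x2 Pearson correlation matrix;
# the off-diagonal entry is the weight/inch correlation.
r = np.corrcoef(weight, inch)[0, 1]
print(f"{r:.6f}")  # 0.364706
```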
