python - Correlation heatmap turned values into nan in Python
Problem description
I want to plot a correlation heatmap of my table df, which looks normal at the beginning:
Total Paid Post Engaged Negative like
1 2178 0 0 66 0 1207
2 1042 0 0 60 0 921
3 2096 0 0 112 0 1744
4 1832 0 0 109 0 1718
5 1341 0 0 38 0 889
6 1933 0 0 123 0 1501
...
but after I applied:
df= full_Data.iloc[1:,4:10]
df= pd.DataFrame(df,columns=['A','B','C', 'D', 'E', 'F'])
corrMatrix = df.corr()
sn.heatmap(corrMatrix, annot=True)
plt.show()
it returned an empty graph:
C:\Users\User\Anaconda3\lib\site-packages\seaborn\matrix.py:204: RuntimeWarning: All-NaN slice encountered
vmin = np.nanmin(calc_data)
C:\Users\User\Anaconda3\lib\site-packages\seaborn\matrix.py:209: RuntimeWarning: All-NaN slice encountered
vmax = np.nanmax(calc_data)
and df returned:
A B C D E F
1 nan nan nan nan nan nan
2 nan nan nan nan nan nan
3 nan nan nan nan nan nan
4 nan nan nan nan nan nan
5 nan nan nan nan nan nan
...
Why are all the values turned into nan?
Update:
I tried renaming the columns of df instead of building it the old way:
df.columns = ['A','B','C', 'D', 'E', 'F']
and
df= pd.DataFrame(df.to_numpy(),columns=['A','B','C', 'D', 'E', 'F'])
and both raised an error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-43-3a27f095066b> in <module>
12
13 corrMatrix = df.corr()
---> 14 sn.heatmap(corrMatrix, annot=True)
15 plt.show()
16
~\Anaconda3\lib\site-packages\seaborn\_decorators.py in inner_f(*args, **kwargs)
44 )
45 kwargs.update({k: arg for k, arg in zip(sig.parameters, args)})
---> 46 return f(**kwargs)
47 return inner_f
48
~\Anaconda3\lib\site-packages\seaborn\matrix.py in heatmap(data, vmin, vmax, cmap, center, robust, annot, fmt, annot_kws, linewidths, linecolor, cbar, cbar_kws, cbar_ax, square, xticklabels, yticklabels, mask, ax, **kwargs)
545 plotter = _HeatMapper(data, vmin, vmax, cmap, center, robust, annot, fmt,
546 annot_kws, cbar, cbar_kws, xticklabels,
--> 547 yticklabels, mask)
548
549 # Add the pcolormesh kwargs here
~\Anaconda3\lib\site-packages\seaborn\matrix.py in __init__(self, data, vmin, vmax, cmap, center, robust, annot, fmt, annot_kws, cbar, cbar_kws, xticklabels, yticklabels, mask)
164 # Determine good default values for the colormapping
165 self._determine_cmap_params(plot_data, vmin, vmax,
--> 166 cmap, center, robust)
167
168 # Sort out the annotations
~\Anaconda3\lib\site-packages\seaborn\matrix.py in _determine_cmap_params(self, plot_data, vmin, vmax, cmap, center, robust)
202 vmin = np.nanpercentile(calc_data, 2)
203 else:
--> 204 vmin = np.nanmin(calc_data)
205 if vmax is None:
206 if robust:
<__array_function__ internals> in nanmin(*args, **kwargs)
~\Anaconda3\lib\site-packages\numpy\lib\nanfunctions.py in nanmin(a, axis, out, keepdims)
317 # Fast, but not safe for subclasses of ndarray, or object arrays,
318 # which do not implement isnan (gh-9009), or fmin correctly (gh-8975)
--> 319 res = np.fmin.reduce(a, axis=axis, out=out, **kwargs)
320 if np.isnan(res).any():
321 warnings.warn("All-NaN slice encountered", RuntimeWarning,
ValueError: zero-size array to reduction operation fmin which has no identity
Solution
I think the problem is that a DataFrame object is passed to the pd.DataFrame constructor. The constructor then selects columns by the new names from the list, and since none of them match the original column names, only NaNs are created.
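The label-matching behaviour described above can be reproduced with a minimal sketch. The small frame src and its values are a hypothetical stand-in for full_Data.iloc[1:,4:10], not the asker's real data:

```python
import pandas as pd

# Hypothetical stand-in for the sliced table (only two columns for brevity).
src = pd.DataFrame({"Total": [2178, 1042, 2096], "like": [1207, 921, 1744]})

# Labels that do not exist in src: the constructor aligns by name,
# finds nothing to copy, and fills every cell with NaN.
missing = pd.DataFrame(src, columns=["A", "B"])

# A label that does exist: the column is selected and its data is kept.
existing = pd.DataFrame(src, columns=["Total"])

print(missing.isna().all().all())   # True
print(existing["Total"].tolist())   # [2178, 1042, 2096]
```

So columns= acts as a selection by label when the input is already a DataFrame, and only assigns new names when the input is a plain array.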
One solution is to convert it to a NumPy array first:
df= pd.DataFrame(df.to_numpy(),columns=['A','B','C', 'D', 'E', 'F'])
Or set the new column names in a separate step, without the DataFrame constructor:
df = full_Data.iloc[1:,4:10]
df.columns = ['A','B','C', 'D', 'E', 'F']
Another solution is to build a dict that maps the existing column names to the new ones and rename:
old = df.columns
new = ['A','B','C', 'D', 'E', 'F']
df = df.rename(columns=dict(zip(old, new)))
print (df)
A B C D E F
1 2178 0 0 66 0 1207
2 1042 0 0 60 0 921
3 2096 0 0 112 0 1744
4 1832 0 0 109 0 1718
5 1341 0 0 38 0 889
6 1933 0 0 123 0 1501
print (df.corr())
A B C D E F
A 1.000000 NaN NaN 0.606808 NaN 0.727034
B NaN NaN NaN NaN NaN NaN
C NaN NaN NaN NaN NaN NaN
D 0.606808 NaN NaN 1.000000 NaN 0.916325
E NaN NaN NaN NaN NaN NaN
F 0.727034 NaN NaN 0.916325 NaN 1.000000
EDIT:
The problem was that the columns were not numeric, so convert them first:
df = df.astype(int)
Or:
df = df.apply(pd.to_numeric, errors='coerce')
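A rough way to check whether this applies to your data is to inspect the dtypes before and after the conversion. The frame raw below is hypothetical (numbers stored as strings, which is one common way to end up with non-numeric columns); it is not taken from the question:

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical data: numeric-looking values stored as strings.
raw = pd.DataFrame({"A": ["2178", "1042", "2096", "1832"],
                    "D": ["66", "60", "112", "109"]})
print(raw.dtypes)  # non-numeric dtypes

# Convert every column; values that cannot be parsed become NaN.
numeric = raw.apply(pd.to_numeric, errors="coerce")
print(numeric.dtypes)  # numeric dtypes

# With numeric columns, corr() has real data to work on.
corr = numeric.corr()
print(corr)
```

Note that errors='coerce' silently turns unparseable cells into NaN, so it is worth checking numeric.isna().sum() afterwards, whereas astype(int) raises on the first bad value.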