An efficient way to plot a large data set and compute slopes in Python

Problem description

I have a text file with 32 columns.

Ev, col 2, col 3, col 4, col 5, col 6, col 7, col 8, col 9, col 10, ... (32 columns in total)
-0.08, 8.300, 8.300, 8.300, 8.301, 8.300, 8.300, 8.300, 8.301, 3.405, ... (32 columns in total)
-0.04, 8.300, 8.300, 8.300, 8.301, 8.300, 8.300, 8.300, 8.301, 3.405, ... (32 columns in total)
0.00, 8.300, 8.300, 8.300, 8.301, 8.300, 8.300, 8.300, 8.301, 3.405, ... (32 columns in total)
0.04, 8.300, 8.300, 8.300, 8.301, 8.300, 8.300, 8.300, 8.301, 3.405, ... (32 columns in total)
0.08, 8.300, 8.300, 8.300, 8.301, 8.300, 8.300, 8.300, 8.301, 3.405, ... (32 columns in total)

I want to plot every column on the right against the leftmost column (i.e. Ev vs col 2, Ev vs col 3, ..., Ev vs col 32) and compute the slope of each.

So I tried the brute-force approach:

import matplotlib.pyplot as plt
import numpy as np

x1, y1, y2, y3, y4, y5, y6, y7, y8, y9, y10, y11, y12, y13, y14, y15, y16, y17, y18, y19, y20, y21, y22, y23, y24, y25, y26, y27, y28, y29, y30, y31, y32 = np.loadtxt('mydata.txt', delimiter=',', unpack=True)


slope1, intercept1 = np.polyfit(x1, y1, 1)
slope2, intercept2 = np.polyfit(x1, y2, 1)
slope3, intercept3 = np.polyfit(x1, y3, 1)
slope4, intercept4 = np.polyfit(x1, y4, 1)
# ... and so on, all the way up to the 32nd column
print('slope 1 =', slope1)

plt.plot(x1,y1, label='With ',marker='o')
plt.plot(x1,y2, label='With ',marker='o')
plt.plot(x1,y3, label='With ',marker='o')
plt.plot(x1,y4, label='With ',marker='o')
plt.plot(x1,y5, label='With ',marker='o')
# ... and so on, all the way up to the 32nd column:
# plt.plot(x1, y32, label='mydata ', marker='o')

plt.show()

Even though this code works, I know it is not an efficient approach. Is there a better way to plot this data and get the slopes?

Tags: python, numpy, matplotlib, plot, large-data

Solution


You should not unpack the result of np.loadtxt into 33 separate variables. Keep it in a single array, called data for example, and index it however you need.

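# skiprows=1 skips the header line; data has one row per line of the file
data = np.loadtxt('mydata.txt', delimiter=',', skiprows=1)
# Column 0 (Ev) is x; each remaining column is drawn as its own line
plt.plot(data[:, 0], data[:, 1:])

If you would rather work with the transposed array (one row per column of the file), transpose the y block back when plotting, because plt.plot draws each column of a 2-D y as a separate line:

dataT = data.T
plt.plot(dataT[0], dataT[1:].T)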
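To make the single-call plot easier to read, each line can be given a legend label. The following is only a sketch: it reads a shortened copy of the sample (the first 8 of the 32 columns) from a string instead of mydata.txt, the Agg backend is chosen so that it runs without a display, and the names sample, x, Y and lines are made up for illustration.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

# Shortened copy of the sample from the question (first 8 of 32 columns).
sample = """Ev, col 2, col 3, col 4, col 5, col 6, col 7, col 8
-0.08, 8.300, 8.300, 8.300, 8.301, 8.300, 8.300, 8.300
-0.04, 8.300, 8.300, 8.300, 8.301, 8.300, 8.300, 8.300
0.00, 8.300, 8.300, 8.300, 8.301, 8.300, 8.300, 8.300
0.04, 8.300, 8.300, 8.300, 8.301, 8.300, 8.300, 8.300
0.08, 8.300, 8.300, 8.300, 8.301, 8.300, 8.300, 8.300
"""

data = np.loadtxt(io.StringIO(sample), delimiter=",", skiprows=1)
x, Y = data[:, 0], data[:, 1:]

fig, ax = plt.subplots()
# A 2-D Y yields one Line2D object per column.
lines = ax.plot(x, Y, marker="o")
# File column numbering starts at 2 because column 1 is Ev.
for number, line in enumerate(lines, start=2):
    line.set_label(f"col {number}")
ax.set_xlabel("Ev")
ax.legend()
```

With the real file, replacing io.StringIO(sample) by 'mydata.txt' should be enough; call fig.savefig(...) or switch to an interactive backend and plt.show() to view the result.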

However, I would recommend using pandas:

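import pandas as pd
# index_col=0 makes Ev the index, which df.plot() uses as the x axis;
# skipinitialspace=True drops the blank after each comma in the header
df = pd.read_csv('mydata.txt', index_col=0, skipinitialspace=True)
df.plot()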

Regarding your slope calculation:

data = data.T

array([[-0.08 , -0.04 ,  0.   ,  0.04 ,  0.08 ],
       [ 8.3  ,  8.3  ,  8.3  ,  8.3  ,  8.3  ],
       [ 8.3  ,  8.3  ,  8.3  ,  8.3  ,  8.3  ],
       [ 8.3  ,  8.3  ,  8.3  ,  8.3  ,  8.3  ],
       [ 8.301,  8.301,  8.301,  8.301,  8.301],
       [ 8.3  ,  8.3  ,  8.3  ,  8.3  ,  8.3  ],
       [ 8.3  ,  8.3  ,  8.3  ,  8.3  ,  8.3  ],
       [ 8.3  ,  8.3  ,  8.3  ,  8.3  ,  8.3  ]])

for y in data[1:]:
    print(np.polyfit(data[0], y, 1))

[ -1.12054027e-14   8.30000000e+00]
[ -1.12054027e-14   8.30000000e+00]
[ -1.12054027e-14   8.30000000e+00]
[  7.45493703e-15   8.30100000e+00]
[ -1.12054027e-14   8.30000000e+00]
[ -1.12054027e-14   8.30000000e+00]
[ -1.12054027e-14   8.30000000e+00]
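The loop can also be dropped altogether: np.polyfit accepts a 2-D y with one data set per column, as long as all data sets share the same x. A minimal sketch, again using a shortened copy of the sample (first 8 of 32 columns) read from a string; the names sample, x, Y, slopes and intercepts are illustrative.

```python
import io

import numpy as np

# Shortened copy of the sample from the question (first 8 of 32 columns).
sample = """Ev, col 2, col 3, col 4, col 5, col 6, col 7, col 8
-0.08, 8.300, 8.300, 8.300, 8.301, 8.300, 8.300, 8.300
-0.04, 8.300, 8.300, 8.300, 8.301, 8.300, 8.300, 8.300
0.00, 8.300, 8.300, 8.300, 8.301, 8.300, 8.300, 8.300
0.04, 8.300, 8.300, 8.300, 8.301, 8.300, 8.300, 8.300
0.08, 8.300, 8.300, 8.300, 8.301, 8.300, 8.300, 8.300
"""

data = np.loadtxt(io.StringIO(sample), delimiter=",", skiprows=1)
x = data[:, 0]   # Ev
Y = data[:, 1:]  # shape (n_rows, n_columns - 1), one data set per column

# One call fits every column; the result has shape (2, n_columns - 1):
# row 0 holds the slopes, row 1 the intercepts.
coeffs = np.polyfit(x, Y, 1)
slopes, intercepts = coeffs

for number, (m, b) in enumerate(zip(slopes, intercepts), start=2):
    print(f"col {number}: slope={m:.3e}, intercept={b:.4f}")
```

Because the sample columns are constant, the slopes come out as floating-point noise around zero, consistent with the values printed by the loop above.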

Or, using the pandas DataFrame:

df

        col 2   col 3   col 4   col 5   col 6   col 7  col 8
Ev                                                          
-0.08     8.3     8.3     8.3   8.301     8.3     8.3    8.3
-0.04     8.3     8.3     8.3   8.301     8.3     8.3    8.3
 0.00     8.3     8.3     8.3   8.301     8.3     8.3    8.3
 0.04     8.3     8.3     8.3   8.301     8.3     8.3    8.3
 0.08     8.3     8.3     8.3   8.301     8.3     8.3    8.3

# df.items() replaces df.iteritems(), which was removed in pandas 2.0
for name, col in df.items():
    print(name, np.polyfit(col.index, col.values, 1))

col 2 [ -1.12054027e-14   8.30000000e+00]
col 3 [ -1.12054027e-14   8.30000000e+00]
col 4 [ -1.12054027e-14   8.30000000e+00]
col 5 [  7.45493703e-15   8.30100000e+00]
col 6 [ -1.12054027e-14   8.30000000e+00]
col 7 [ -1.12054027e-14   8.30000000e+00]
col 8 [ -1.12054027e-14   8.30000000e+00]
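If the slopes are needed for further work rather than just printed, one option is to collect them in a small table indexed by column name. This is a sketch under the same assumptions as before (shortened sample read from a string; the names sample, fits, slope and intercept are chosen for illustration).

```python
import io

import numpy as np
import pandas as pd

# Shortened copy of the sample from the question (first 8 of 32 columns).
sample = """Ev, col 2, col 3, col 4, col 5, col 6, col 7, col 8
-0.08, 8.300, 8.300, 8.300, 8.301, 8.300, 8.300, 8.300
-0.04, 8.300, 8.300, 8.300, 8.301, 8.300, 8.300, 8.300
0.00, 8.300, 8.300, 8.300, 8.301, 8.300, 8.300, 8.300
0.04, 8.300, 8.300, 8.300, 8.301, 8.300, 8.300, 8.300
0.08, 8.300, 8.300, 8.300, 8.301, 8.300, 8.300, 8.300
"""

df = pd.read_csv(io.StringIO(sample), index_col=0, skipinitialspace=True)

# Fit all columns at once against the Ev index; coeffs has shape (2, n_columns).
coeffs = np.polyfit(df.index.to_numpy(), df.to_numpy(), 1)

# One row per data column, with the fitted slope and intercept side by side.
fits = pd.DataFrame(coeffs.T, index=df.columns, columns=["slope", "intercept"])
print(fits)
```

A lookup such as fits.loc["col 5", "slope"] then returns the slope of a single column.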
