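python - How to assign a (row) numpy.matrix to a column in a numpy.matrix
Question
The relevant code is below. I create a zero matrix, which here happens to be 2x2. I then loop over a data file and fill the matrix with random numbers drawn from the range of each column of the input data set. This works, except that the output matrix is transposed, and I would rather get it right. Please see the comments in the code.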
centroids = mat(zeros((k,n))) #create centroid mat
print('centroids: \n', centroids, '\n', type(centroids))
print('starting for loop for j in range(%s)' %(n))
for j in range(n):#create random cluster centers, within bounds of each dimension
    print('\n')
    # get the min value in the jth column
    minJ = min(dataSet[:,j])
    # get the max value in the jth column
    maxJ = max(dataSet[:,j])
    # the range of values is max - min
    rangeJ = maxJ - minJ
    print('col %s, min = %s, max = %s, range = %s' %(j, minJ, maxJ, rangeJ))
    # create a 'column' of random values for each column
    col = mat(minJ + rangeJ * random.rand(1,k))
    print('column %s is %s, type is %s' %(j, col, type(col)))
    # assign columns to column in centroids
    # DOES NOT WORK, assigns to rows.
    centroids[j] = col
    print(' ==> centroids: \n', centroids)
return centroids
Here is the output. Note that the output array /should/ be [[3.08,.434],[-1.36,-.203]].
centroids:
[[0. 0.]
[0. 0.]]
<class 'numpy.matrix'>
starting for loop for j in range(2)
col 0, min = [[-5.379713]], max = [[4.838138]], range = [[10.217851]]
column 0 is [[ 3.08228829 -1.35924539]], type is <class 'numpy.matrix'>
==> centroids:
[[ 3.08228829 -1.35924539]
[ 0. 0. ]]
col 1, min = [[-4.232586]], max = [[5.1904]], range = [[9.422986]]
column 1 is [[ 0.4342251 -0.2026065]], type is <class 'numpy.matrix'>
==> centroids:
[[ 3.08228829 -1.35924539]
[ 0.4342251 -0.2026065 ]]
================
centroids follows:
[[3.08228829]
[0.4342251 ]]
[[-1.35924539]
[-0.2026065 ]]
Here is what I tried:
centroids[:,j] = col
centroids[0:1,j] = col
Here is the error message:
Traceback (most recent call last):
File "run.py", line 68, in <module>
centroids = randCent(dataList, 2)
File "run.py", line 51, in randCent
centroids[0:1,j] = col
ValueError: could not broadcast input array from shape (1,2) into shape (1,1)
How can I do this without transposing the matrix? Thanks.
My script file is below:
#run.py
from numpy import *
import sys
import importlib

#fun defs############################################
def testFun(name):
    print("Hello, %s" %(name))

def getData(fileName):
    data = []
    fr = open(fileName)
    for line in fr.readlines():
        curLine = line.strip().split('\t')
        #print('A ',curLine, type(curLine)) ## gets a list of strings as a list
        fltLine = [float(i) for i in curLine] ## converts strings to floats
        #print('B ', fltLine, type(fltLine)) ## returns a list of floats
        data.append(fltLine)
    return data

def distEuclid(vecA, vecB):
    return sqrt(sum(power(vecA - vecB, 2))) #la.norm(vecA-vecB)

dataSet = [[3.141592653589793, 1.4142135623730951], [2.718281828459045, 1.618033988749895]]

def randCent(dataSet, k):
    print('calling randCent(dataset, %s)' %(k))
    dataSet = mat(dataSet)
    n = shape(dataSet)[1]
    print('columns n is %s and groups k is %s and type(dataSet) is %s' %(n, k, type(dataSet)))
    #print(dataSet)
    centroids = mat(zeros((k,n))) #create centroid mat
    print('centroids: \n', centroids, '\n', type(centroids))
    print('starting for loop for j in range(%s)' %(n))
    for j in range(n):#create random cluster centers, within bounds of each dimension
        print('\n')
        # get the min value in the jth column
        minJ = min(dataSet[:,j])
        # get the max value in the jth column
        maxJ = max(dataSet[:,j])
        # the range of values is max - min
        rangeJ = maxJ - minJ
        print('col %s, min = %s, max = %s, range = %s' %(j, minJ, maxJ, rangeJ))
        # create a 'column' of random values for each column
        col = mat(minJ + rangeJ * random.rand(1,k))
        print('column %s is %s, type is %s' %(j, col, type(col)))
        # assign columns to column in centroids
        # DOES NOT WORK, assigns to rows.
        centroids[0:1,j] = col
        print(' ==> centroids: \n', centroids)
    # print('==> centroids: ', centroids)
    return centroids

#exe code#############################################
print("loading file run.py")
testFun('Bob')
dataList = None
dataList = getData('testSet.txt')
#print(dataList, type(dataList))
print('variable dataList has been initialized: %s' %(dataList is not None))
centroids = randCent(dataList, 2)
print('================\n')
print('centroids follows:')
print(centroids[:,0])
print(centroids[:,1])
Answer
Assigning to a 2-D array:
In [500]: A = np.zeros((2,3), int)
In [501]: A[0,:] = np.arange(3)
In [502]: A[:,1] = [10,20]
In [503]: A
Out[503]:
array([[ 0, 10, 2],
[ 0, 20, 0]])
In [504]: A = np.zeros((2,3), int)
In [505]: A[0,:] = [1,2,3]
In [506]: A[:,1] = [10,20]
In [507]: A
Out[507]:
array([[ 1, 10, 3],
[ 0, 20, 0]])
Trying the same with np.matrix:
In [512]: M = np.matrix(np.zeros((2,3),int))
In [513]: M
Out[513]:
matrix([[0, 0, 0],
[0, 0, 0]])
In [514]: M[0,:] = [1,2,3]
In [515]: M[:,1] = [10,20]
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-515-e95a3ab21d7f> in <module>
----> 1 M[:,1] = [10,20]
ValueError: could not broadcast input array from shape (2) into shape (2,1)
In [516]: M[:,1] = [[10],[20]]
In [517]: M
Out[517]:
matrix([[ 1, 10, 3],
[ 0, 20, 0]])
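Why the difference? Because once a matrix, always a matrix: an np.matrix stays 2-D, so selecting a column returns a (2,1) matrix rather than a 1-D array: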
In [518]: A[:,1]
Out[518]: array([10, 20])
In [519]: M[:,1]
Out[519]:
matrix([[10],
[20]])
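To assign values to a (2,1)-shaped slot, you need a (2,1)-shaped value. Broadcasting can add a leading dimension, turning (2,) into (1,2), but it cannot add a trailing one to make (2,1).
A flatiter can be used to assign a 1-D array: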
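Applied to the question's loop, the shape rule above suggests giving the column slot a column-shaped value. The following is a minimal sketch, not the asker's exact code: the seed, the variable names, and the use of `np.random.default_rng` and `np.asmatrix` (in place of `mat`) are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only for reproducibility

k, n = 2, 2
centroids = np.asmatrix(np.zeros((k, n)))

# A (1, k) row of random values, shaped like `col` in the question.
col = np.asmatrix(rng.random((1, k)))

# centroids[:, j] is a (k, 1) slot, so it needs a (k, 1) value.
centroids[:, 0] = col.T                # transpose only the small row vector
centroids[:, 1] = rng.random((k, 1))   # or build the values as a column directly

print(centroids.shape)
```

Only the temporary row vector is transposed here; `centroids` itself keeps its (k, n) layout.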
In [520]: M[:,1].flat
Out[520]: <numpy.flatiter at 0x7f57b127dda0>
In [521]: M[:,1].flat = [100,200]
In [522]: M
Out[522]:
matrix([[ 1, 100, 3],
[ 0, 200, 0]])
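As a rough sketch of how the `.flat` assignment could be used in the question's function, the version below keeps `np.matrix` but fills each column from a 1-D array. The name `rand_cent`, the optional `rng` parameter, and the sample data are hypothetical and chosen for illustration; they are not taken from the original script.

```python
import numpy as np

def rand_cent(data_set, k, rng=None):
    """Return a (k, n) matrix of random centroids within each column's bounds."""
    rng = np.random.default_rng() if rng is None else rng
    data_set = np.asmatrix(data_set)
    n = data_set.shape[1]
    centroids = np.asmatrix(np.zeros((k, n)))
    for j in range(n):
        min_j = data_set[:, j].min()            # scalar minimum of column j
        range_j = data_set[:, j].max() - min_j  # scalar width of column j
        # k random values in [min_j, min_j + range_j), as a 1-D array
        values = min_j + range_j * rng.random(k)
        # .flat accepts the 1-D array even though the slot has shape (k, 1)
        centroids[:, j].flat = values
    return centroids

sample = [[3.14, 1.41], [2.72, 1.62]]  # hypothetical data, two rows and two columns
result = rand_cent(sample, 3, np.random.default_rng(0))
print(result.shape)
```

Each row of the result is one centroid and each column stays within the bounds of the matching data column, so no transpose of the final matrix is needed.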