python - Vectorized linear regression
Problem Description
Here is my attempt to perform linear regression using just NumPy and linear algebra:
def linear_function(w, x, b):
    return np.dot(w, x) + b

x = np.array([[1, 1, 1], [0, 0, 0]])
y = np.array([0, 1])
w = np.random.uniform(-1, 1, (1, 3))
print(w)

learning_rate = .0001
xT = x.T
yT = y.T

for i in range(30000):
    h_of_x = linear_function(w, xT, 1)
    loss = h_of_x - yT
    if i % 10000 == 0:
        print(loss, w)
    w = w + np.multiply(-learning_rate, loss)

linear_function(w, x, 1)
This causes an error:
ValueError Traceback (most recent call last)
<ipython-input-137-130a39956c7f> in <module>()
24 if i % 10000 == 0:
25 print(loss , w)
---> 26 w = w + np.multiply(-learning_rate , loss)
27
28 linear_function(w , x , 1)
ValueError: operands could not be broadcast together with shapes (1,3) (1,2)
This appears to work with reduced training-set dimensionality:
import numpy as np

def linear_function(w, x, b):
    return np.dot(w, x) + b

x = np.array([[1, 1], [0, 0]])
y = np.array([0, 1])
w = np.random.uniform(-1, 1, (1, 2))
print(w)

learning_rate = .0001
xT = x.T
yT = y.T

for i in range(30000):
    h_of_x = linear_function(w, xT, 1)
    loss = h_of_x - yT
    if i % 10000 == 0:
        print(loss, w)
    w = w + np.multiply(-learning_rate, loss)

linear_function(w, x, 1)
print(linear_function(w, x[0], 1))
print(linear_function(w, x[1], 1))
Which returns:
[[ 0.68255806 -0.49717912]]
[[ 1.18537894 0. ]] [[ 0.68255806 -0.49717912]]
[[ 0.43605474 0. ]] [[-0.06676614 -0.49717912]]
[[ 0.16040755 0. ]] [[-0.34241333 -0.49717912]]
[ 0.05900769]
[ 1.]
[ 0.05900769] & [ 1.]
are close to the training targets, so the implementation appears correct. What is the issue with the implementation that is throwing the error? Have I not extended the dimensionality from 2 to 3 correctly?
Solution
I've outlined the issues below:

1. Your array shapes are inconsistent, which leads to broadcasting problems in the dot products and, especially, in the gradient-descent update: w has shape (1, 3) but loss has shape (1, 2), so w + np.multiply(-learning_rate, loss) cannot broadcast. Fix your initialisation. I would also recommend augmenting w with b, and x with a column of ones.
2. Your loss function and gradient calculation don't look right to me. In general, using Manhattan distance as a loss function is not recommended, as it is not a sufficient distance metric. Go with Euclidean distance instead and minimise the sum of squared errors (this is OLS regression). The gradient of the sum-of-squares loss with respect to w is then proportional to (h(x) - y).dot(x).
3. Your update rule changes accordingly based on (2): subtract learning_rate times that gradient, not the raw loss.
4. Make sure to add a stopping condition to your code; you don't want to overshoot the optimum. Usually, you should stop when the loss stops changing by much.
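As a sanity check on the sum-of-squares gradient above, you can compare the analytic form 2 * (w.dot(x.T) - y).dot(x) against central finite differences. This is just an illustrative sketch: the shapes mirror the augmented setup used below, and the factor of 2 is kept explicit here.

```python
import numpy as np

# Compare the analytic OLS gradient against finite differences.
rng = np.random.default_rng(0)
X = np.column_stack((np.ones(2), np.array([[1, 1, 1], [0, 0, 0]])))  # (2, 4), augmented
y = np.array([[0.0, 1.0]])                                           # (1, 2)
w = rng.uniform(-1, 1, (1, 4))

def loss(w):
    return ((w @ X.T - y) ** 2).sum()

analytic = 2 * (w @ X.T - y) @ X    # analytic gradient, shape (1, 4)

# central finite differences, one weight at a time
eps = 1e-6
numeric = np.zeros_like(w)
for j in range(w.shape[1]):
    e = np.zeros_like(w)
    e[0, j] = eps
    numeric[0, j] = (loss(w + e) - loss(w - e)) / (2 * eps)

print(np.allclose(analytic, numeric, atol=1e-5))  # → True
```

Because the loss is quadratic in w, the central difference agrees with the analytic gradient up to floating-point error.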
Full listing:
import numpy as np

# input, augmented with a column of ones
x = np.array([[1, 1, 1], [0, 0, 0]])
x = np.column_stack((np.ones(len(x)), x))

# targets
y = np.array([[0, 1]])

# weights, augmented with bias
w = np.random.uniform(-1, 1, (1, 4))

learning_rate = .0001
loss_old = np.inf

for i in range(30000):
    h_of_x = w.dot(x.T)
    loss = ((h_of_x - y) ** 2).sum()
    if abs(loss_old - loss) < 1e-5:
        break
    w = w - learning_rate * (h_of_x - y).dot(x)
    loss_old = loss
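For reference, here is the same loop run end to end with the initial loss recorded, so you can see the squared error fall during training. The seed is an arbitrary choice added here for reproducibility.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, for reproducibility only

x = np.array([[1, 1, 1], [0, 0, 0]])
x = np.column_stack((np.ones(len(x)), x))
y = np.array([[0, 1]])
w = np.random.uniform(-1, 1, (1, 4))

learning_rate = .0001
loss_old = np.inf
loss_initial = ((w.dot(x.T) - y) ** 2).sum()  # loss before any updates

for i in range(30000):
    h_of_x = w.dot(x.T)
    loss = ((h_of_x - y) ** 2).sum()
    if abs(loss_old - loss) < 1e-5:
        break
    w = w - learning_rate * (h_of_x - y).dot(x)
    loss_old = loss

print(loss_initial, loss)  # the final loss is below the initial loss
```

With this small learning rate the updates are stable, so the loss decreases monotonically until the stopping condition fires.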
Other Recommendations/Enhancements
Next, consider the use of regularisation here. L1 (lasso) and L2 (ridge) are both good options.
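As a sketch of how an L2 (ridge) penalty slots into the gradient step: adding lam * ||w||^2 to the sum-of-squares loss contributes 2 * lam * w to the gradient. The lam value and learning rate below are illustrative, not tuned.

```python
import numpy as np

np.random.seed(1)  # arbitrary seed, for reproducibility only

x = np.column_stack((np.ones(2), np.array([[1, 1, 1], [0, 0, 0]])))
y = np.array([[0, 1]])
w = np.random.uniform(-1, 1, (1, 4))

lam, lr = 0.1, 1e-3  # illustrative penalty strength and learning rate
for _ in range(10000):
    # OLS gradient plus the ridge term from lam * ||w||^2
    grad = 2 * (w.dot(x.T) - y).dot(x) + 2 * lam * w
    w = w - lr * grad

residual = ((w.dot(x.T) - y) ** 2).sum()
print(residual)  # small but non-zero: the penalty trades fit for smaller weights
```

Unlike plain OLS on this data, the ridge fit does not drive the residual to zero; that is the bias the penalty deliberately introduces.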
Finally, there is a closed-form solution for linear regression that yields the optimum directly, whereas gradient descent in general only guarantees convergence to a local optimum. It is simple to write down, but computationally expensive for large problems, since it involves calculating a matrix inverse. Consider the tradeoffs here.
w = y.dot(np.linalg.inv(x.dot(x.T)).dot(x))
When x.dot(x.T) is not invertible, you will need to regularise.
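A quick check of this closed form on the question's augmented data, plus a ridge-regularised variant for the singular case (the lam value is illustrative):

```python
import numpy as np

x = np.column_stack((np.ones(2), np.array([[1, 1, 1], [0, 0, 0]])))  # (2, 4)
y = np.array([[0, 1]])

# closed-form solution: exact fit on this data
w = y.dot(np.linalg.inv(x.dot(x.T)).dot(x))
print(w.dot(x.T))  # → [[0., 1.]] up to floating-point error

# regularised variant for when x.dot(x.T) is singular
lam = 0.1  # illustrative penalty strength
w_reg = y.dot(np.linalg.inv(x.dot(x.T) + lam * np.eye(2)).dot(x))
print(w_reg.dot(x.T))  # predictions shrunk slightly towards zero
```

The unregularised solution reproduces the targets exactly; the regularised one stays well-defined even when the plain inverse does not exist.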
Keep in mind that linear regression can only model linear relationships. If you're convinced your implementation is correct but your loss is still bad, your data may not be fittable in its current vector space, so you will need non-linear basis functions to transform it (this is effectively non-linear regression).
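To illustrate what basis expansion buys you, here is a sketch of fitting a quadratic target with the same linear machinery, using a polynomial basis [1, x, x^2]. The data and basis are invented for illustration.

```python
import numpy as np

# 1-D inputs and a non-linear (quadratic) target
xs = np.linspace(-2, 2, 20)
y = np.array([xs ** 2])                                 # shape (1, 20)

# basis expansion: each sample becomes [1, x, x^2]
X = np.column_stack((np.ones_like(xs), xs, xs ** 2))    # shape (20, 3)

# ordinary least squares on the expanded features (closed form)
w = y.dot(X).dot(np.linalg.inv(X.T.dot(X)))             # shape (1, 3)

print(np.allclose(w.dot(X.T), y))  # → True: the quadratic is fit exactly
```

The model is still linear in w; the non-linearity lives entirely in the feature map, which is why the same solver works unchanged.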