更快的SciPy优化

ccgok5k5 于 2023-08-05 发布在其他

关注(0)|答案(2)|浏览(172)

我需要找到一个有几千个变量的成本函数的最小值。成本函数只是一个最小二乘计算，可以通过numpy矢量化轻松快速地计算。尽管如此，优化仍然需要很长时间。我的猜测是，运行速度慢是发生在SciPy的最小化器中，而不是我的成本函数中。如何更改SciPy最小化器的参数以加快运行速度？
样本代码：

import numpy as np
from scipy.optimize import minimize

# random data
x = np.random.randn(100, 75)

# initial weights guess
startingWeights = np.ones(shape=(100, 75))

# random y vector 
y = np.random.randn(100)

def costFunction(weights):
   # reshapes flattened weights into 2d matrix
   weights = np.reshape(weights, newshape=(100, 75))

   # weighted row-wise sum
   weighted = np.sum(x * weights, axis=1)

   # squared residuals
   residualsSquared = (y - weighted) ** 2

   return np.sum(residualsSquared)

result = minimize(costFunction, startingWeights.flatten())

字符串

scipy

来源：https://stackoverflow.com/questions/68507176/faster-scipy-optimizations

2条答案

按热度按时间

y53ybaqx1#

正如在注解中已经提到的，强烈建议为具有N = 100*75 = 7500变量的大型问题提供精确的目标梯度。在没有提供梯度的情况下，它将通过有限差分和approx_derivative函数来近似。然而，有限差分是容易出错的，并且由于梯度的每个评估需要目标函数的2*N评估（没有缓存），计算成本很高。
这可以通过对目标和近似梯度进行定时来容易地说明：

In [7]: %timeit costFunction(startingWeights.flatten())
23.5 µs ± 2.03 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [8]: from scipy.optimize._numdiff import approx_derivative

In [9]: %timeit approx_derivative(costFunction, startingWeights.flatten())
633 ms ± 33.3 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

字符串
因此，在我的机器上，每个梯度评估都需要半秒钟以上！评估梯度的更有效的方法是算法微分。使用JAX库非常简单：

import jax.numpy as jnp
from jax import jit, value_and_grad

def costFunction(weights):
   # reshapes flattened weights into 2d matrix
   weights = jnp.reshape(weights, newshape=(100, 75))

   # weighted row-wise sum
   weighted = jnp.sum(x * weights, axis=1)

   # squared residuals
   residualsSquared = (y - weighted) ** 2

   return jnp.sum(residualsSquared)

# create the derivatives
obj_and_grad = jit(value_and_grad(costFunction))

型
在这里，value_and_grad创建一个函数，用于评估目标和梯度，并返回两者，即obj_value, grad_values = obj_and_grad(x0)。让我们计算这个函数的时间：

In [12]: %timeit obj_and_grad(startingWeights.flatten())
132 µs ± 6.62 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)

型
因此，我们现在评估目标和梯度的速度比以前快了近5000倍。最后，我们可以通过设置jac=True来告诉minimize我们的目标函数返回目标和梯度。所以呢

minimize(obj_and_grad, x0=startingWeights.flatten(), jac=True)

型
应该显著地加速优化。
PS：您也可以尝试通过cyipopt软件包连接的最先进的Ipopt求解器。它还提供了一个类似于scipy的接口，类似于scipy. optimize. minimize。

赞(0）回复(0）举报 2023-08-05

8hhllhi22#

成本函数只是一个最小二乘计算
不要使用最小化，使用least-squares-如果它确实是，你需要的，-示例：

import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import least_squares

x= np.array([0.23, 0.66, 0.93, 1.25, 1.75, 2.03, 2.24, 2.57, 2.87, 2.98])
y= np.array([0.25, -0.27, -1.12, -0.45, 0.28, 0.13, -0.27, 0.26, 0.58, 1.03])    
plt.plot(x,y, 'bv')
plt.show()

def model(theta, t):
    return theta[0]* np.exp(t) + theta[1]*np.log(t) + theta[2]*np.sin(t) + theta[3]*np.cos(t)      #  a ∗ e^x   + b ∗ l n x + c ∗ s i n x + d ∗ c o s x

def residual(theta, t, y):       # f-a
     return (model(theta, t) - y).flatten()

theta = np.array([0.5, 0.5, 0.5, 0.5])

res = least_squares(residual, theta,   args=(x, y),  verbose=1)        # jac=jac,
print(res)

x_test = np.linspace(0, 3)
y_test = model(res.x, x_test)
plt.plot(x,y, 'bo', x_test, y_test, 'k-')
plt.show()

字符串

正如你所看到的，没有的地方可以放置巨大的x 0 =startingWeights= shape(100,75)（同样，* 对于你的最小化方法，你只需要x0=shape(75)作为coeffs_* 的init_guess-如果你真的需要的话）-因为x 0只是init_coeffs_param（或theta）来优化与您的init_data（100，75）-xs...
优化过程正在 * 本地 * 为每个data_point执行（100个数据点中有ys），即使添加了the logics of regularization
如果您问题的formalization是正确的？...您是否可以在优化之前选择Distance metric？..或选择Global_Optimization算法来最小化您拥有的整个空间？..或手动创建Jacobian_fx的最小化方法（作为参数-作为joni回答）？..
无论如何，我认为，你最好不要把你的凸问题转化为线性LS问题，如果在现实中它真的是凸的（数据形状（100，75）的全局最优）…
再一次，我怀疑，你的观点包括这样的数量（75）！* 线性无关尺寸 *... - 要么改变你的目标最小化或应用任何提示我提到

如果是-可以看到LS-Optimization-描述的一些机会（包括Jacobi方法（e.g. impl），预处理，稀疏矩阵的共轭梯度[成为您的多维数据，使用2-范数条件数，如果您的数据是空间的] -进行旋转以更好地选择最小化中的步长方向等）......
P.S. When matrices grow up-需要矩阵分解

赞(0）回复(0）举报 2023-08-05

我来回答

更快的SciPy优化

2条答案

相关问题

热门标签

最新问答