We have four matrices A, B, C, D, all of the same shape (m * n * p). We need to find coefficients x1, x2, x3 that

minimize (D - x1 * A - x2 * B - x3 * C)**2

subject to

x1 + x2 + x3 = 1
0 <= x1, x2, x3 <= 1

Is there an efficient way in Python to minimize this sum of squares?
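Before reaching for a solver, the objective and constraints can be written down directly in NumPy. A minimal sketch with illustrative random data (the shapes and the helper name `sum_of_squares` are arbitrary, not from the question):

```python
import numpy as np

# Illustrative shapes; any common (m, n, p) works.
rng = np.random.default_rng(0)
A, B, C, D = rng.random((4, 5, 7, 6))  # four matrices of shape (5, 7, 6)

def sum_of_squares(x1, x2, x3):
    # Residual of the weighted combination against D, squared and summed.
    residual = D - (x1 * A + x2 * B + x3 * C)
    return float((residual ** 2).sum())

# Equal weights satisfy x1 + x2 + x3 == 1 and the bounds.
print(sum_of_squares(1/3, 1/3, 1/3))
```

The task is then to minimize this function over the simplex defined by the two constraints.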
jaxagkaj1:
If you want a true sum-of-squares minimization, it is possible in SciPy with a linear constraint:
import numpy as np
from numpy.random import default_rng
from scipy.optimize import minimize, LinearConstraint

rand = default_rng(seed=0)
D = rand.random(size=(5, 7, 6))
ABC = rand.random(size=(3, *D.shape))  # A, B, C stacked along the first axis

def objective(x: np.ndarray) -> float:
    # Residual of the combination x1*A + x2*B + x3*C against D.
    delta = (
        D - (x[:, np.newaxis, np.newaxis, np.newaxis] * ABC).sum(axis=0)
    ).ravel()
    return delta.dot(delta)

result = minimize(
    fun=objective,
    x0=[1/3, 1/3, 1/3],
    bounds=[[0, 1], [0, 1], [0, 1]],
    constraints=LinearConstraint(A=[1, 1, 1], lb=1, ub=1),
)
print(result)
message: Optimization terminated successfully
success: True
status: 0
fun: 146.34193150290096
x: [ 3.660e-01  3.163e-01  3.177e-01]
nit: 4
jac: [-6.803e+01 -6.803e+01 -6.803e+01]
nfev: 20
njev: 4
An alternative, which is not faster here and will not have the same numerical characteristics, is to change the problem to least absolute error and use linear programming:
import numpy as np
from numpy.random import default_rng
from scipy.optimize import linprog

rand = default_rng(seed=0)
m, n, p, q = 5, 7, 6, 3
mnp = m*n*p
D = rand.random(size=(m, n, p))
ABC = rand.random(size=(m, n, p, q))

# x, and objective auxiliaries for each element of D
c = np.ones(shape=q + mnp)
c[:q] = 0  # x is not minimized, objectives are

bounds = np.zeros((q + mnp, 2))
bounds[:q, 1] = 1       # 0 <= x <= 1
bounds[q:, 1] = np.inf  # 0 <= o

A_eq = np.zeros((1, q + mnp))
A_eq[0, :q] = 1  # sum(x) == 1

# For each element of D and its corresponding objective o:
# -D + x.abc <= o     A  B  C -1  0  0  0  0 ... <=  D
#  D - x.abc <= o    -A -B -C -1  0  0  0  0 ... <= -D
A_ub = np.block([
    [+ABC.reshape((-1, q)), -np.eye(mnp)],
    [-ABC.reshape((-1, q)), -np.eye(mnp)],
])
b_ub = np.concatenate((+D.ravel(), -D.ravel()))

result = linprog(
    c=c, bounds=bounds, A_eq=A_eq, b_eq=[1], A_ub=A_ub, b_ub=b_ub)
x = result.x[:q]
obj = result.x[q:]
print(result)
print('x =', x)

# Verification
assert result.success
assert np.isclose(1, x.sum())
assert np.all(0 <= x)
assert np.all(x <= 1)
delta = np.abs(
    D - (
        x[np.newaxis, np.newaxis, np.newaxis, :]*ABC
    ).sum(axis=-1)
).ravel()
assert np.allclose(delta, obj)
message: Optimization terminated successfully. (HiGHS Status 7: Optimal)
success: True
status: 0
fun: 58.41554582615268
x: [ 1.614e-01  5.168e-01 ...  6.642e-02  3.075e-01]
nit: 212
lower:   residual:  [ 1.614e-01  5.168e-01 ...  6.642e-02  3.075e-01]
         marginals: [ 0.000e+00  0.000e+00 ...  0.000e+00  0.000e+00]
upper:   residual:  [ 8.386e-01  4.832e-01 ...        inf        inf]
         marginals: [ 0.000e+00  0.000e+00 ...  0.000e+00  0.000e+00]
eqlin:   residual:  [ 0.000e+00]
         marginals: [ 1.100e+00]
ineqlin: residual:  [ 7.751e-01  0.000e+00 ...  0.000e+00  0.000e+00]
         marginals: [-0.000e+00 -1.000e+00 ... -1.000e+00 -1.000e+00]
mip_node_count: 0
mip_dual_bound: 0.0
mip_gap: 0.0
x = [0.16136498 0.51681836 0.32181666]
The linear program runs in about 15 ms, versus about 1.5 ms for the general minimization. You asked: what characteristics would differ if we changed it to least absolute error and used linear programming? See, for example, Why does regression use least "squares" instead of least "absolute values"?. The trade-offs are non-trivial.
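One difference can be shown in a hypothetical one-dimensional example (the data values below are made up for illustration): least squares is pulled toward outliers, while least absolute error is robust to them.

```python
import numpy as np

# Made-up data with one gross outlier.
data = np.array([1.0, 1.1, 0.9, 1.0, 10.0])

# The constant minimizing sum((data - c)**2) is the mean;
# the constant minimizing sum(abs(data - c)) is the median.
ls_fit = data.mean()
lad_fit = np.median(data)

print(ls_fit)   # 2.8  (dragged toward the outlier)
print(lad_fit)  # 1.0  (stays near the bulk of the data)
```

The same effect carries over to the matrix problem: the squared-error and absolute-error formulations generally return different coefficients when some elements of D are far from any attainable combination.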