scipy curve_fit引发“OptimizeWarning:无法估计参数的协方差”

7eumitmz  于 2023-06-29  发布在  其他
关注(0)|答案(2)|浏览(286)

我试着把这个函数拟合到一些数据上:

但是当我用我的代码

import numpy as np
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt

def f(x, start, end):
    res = np.empty_like(x)
    res[x < start] =-1
    res[x > end] = 1
    linear = np.all([[start <= x], [x <= end]], axis=0)[0]
    res[linear] = np.linspace(-1., 1., num=np.sum(linear))
    return res

if __name__ == '__main__':

    xdata = np.linspace(0., 1000., 1000)
    ydata = -np.ones(1000)
    ydata[500:1000] = 1.
    ydata = ydata + np.random.normal(0., 0.25, len(ydata))

    popt, pcov = curve_fit(f, xdata, ydata, p0=[495., 505.])
    print(popt, pcov)
    plt.figure()
    plt.plot(xdata, f(xdata, *popt), 'r-', label='fit')
    plt.plot(xdata, ydata, 'b-', label='data')
    plt.show()

我收到警告了

OptimizeWarning: Covariance of the parameters could not be estimated

输出:

在这个例子中,开始和结束应该接近500,但它们与我最初的猜测没有任何变化。

kjthegm6

kjthegm61#

警告(不是错误)

OptimizeWarning: Covariance of the parameters could not be estimated

意味着拟合不能确定拟合参数的不确定性(方差)。
主要问题是模型函数f将参数startend视为离散值--它们被用作函数形式变化的整数位置。scipy的curve_fit(以及scipy.optimize中的所有其他优化例程)假设参数是 * 连续 * 变量,而不是离散的。
拟合过程将尝试在参数中采取小步骤(通常围绕机器精度)以获得残差相对于变量(雅可比矩阵)的数值导数。在值用作离散变量的情况下,这些导数将为零,并且拟合过程将不知道如何改变值以改进拟合。
它看起来像你试图拟合一个阶跃函数的一些数据。请允许我推荐尝试lmfithttps://lmfit.github.io/lmfit-py),它提供了一个更高级别的曲线拟合接口,并且有许多内置模型。例如,它包括一个StepModel,它应该能够对您的数据进行建模。
对于您的数据的轻微修改(以便它具有有限的步长),以下带有lmfit的脚本可以适应这样的数据:

#!/usr/bin/python
import numpy as np
from lmfit.models import StepModel, LinearModel
import matplotlib.pyplot as plt

np.random.seed(0)
xdata = np.linspace(0., 1000., 1000)
ydata = -np.ones(1000)
ydata[500:1000] = 1.
# note that a linear step is added here:
ydata[490:510] = -1 + np.arange(20)/10.0
ydata = ydata + np.random.normal(size=len(xdata), scale=0.1)

# model data as Step + Line
step_mod = StepModel(form='linear', prefix='step_')
line_mod = LinearModel(prefix='line_')

model = step_mod + line_mod

# make named parameters, giving initial values:
pars = model.make_params(line_intercept=ydata.min(),
                         line_slope=0,
                         step_center=xdata.mean(),
                         step_amplitude=ydata.std(),
                         step_sigma=2.0)

# fit data to this model with these parameters
out = model.fit(ydata, pars, x=xdata)

# print results
print(out.fit_report())

# plot data and best-fit
plt.plot(xdata, ydata, 'b')
plt.plot(xdata, out.best_fit, 'r-')
plt.show()

打印出一份报告

[[Model]]
    (Model(step, prefix='step_', form='linear') + Model(linear, prefix='line_'))
[[Fit Statistics]]
    # fitting method   = leastsq
    # function evals   = 49
    # data points      = 1000
    # variables        = 5
    chi-square         = 9.72660131
    reduced chi-square = 0.00977548
    Akaike info crit   = -4622.89074
    Bayesian info crit = -4598.35197
[[Variables]]
    step_sigma:      20.6227793 +/- 0.77214167 (3.74%) (init = 2)
    step_center:     490.167878 +/- 0.44804412 (0.09%) (init = 500)
    step_amplitude:  1.98946656 +/- 0.01304854 (0.66%) (init = 0.996283)
    line_intercept: -1.00628058 +/- 0.00706005 (0.70%) (init = -1.277259)
    line_slope:      1.3947e-05 +/- 2.2340e-05 (160.18%) (init = 0)
[[Correlations]] (unreported correlations are < 0.100)
    C(step_amplitude, line_slope)     = -0.875
    C(step_sigma, step_center)        = -0.863
    C(line_intercept, line_slope)     = -0.774
    C(step_amplitude, line_intercept) =  0.461
    C(step_sigma, step_amplitude)     =  0.170
    C(step_sigma, line_slope)         = -0.147
    C(step_center, step_amplitude)    = -0.146
    C(step_center, line_slope)        =  0.127

并生成x1c 0d1x的图
Lmfit有很多额外的功能。例如,如果要设置某些参数值的界限或使某些参数值不发生变化,可以执行以下操作:

# make named parameters, giving initial values:
pars = model.make_params(line_intercept=ydata.min(),
                         line_slope=0,
                         step_center=xdata.mean(),
                         step_amplitude=ydata.std(),
                         step_sigma=2.0)

# now set max and min values for step amplitude"
pars['step_amplitude'].min = 0
pars['step_amplitude'].max = 100

# fix the offset of the line to be -1.0
pars['line_offset'].value = -1.0
pars['line_offset'].vary = False

# then run fit with these parameters
out = model.fit(ydata, pars, x=xdata)

如果您知道模型应该是Step+Constant并且常数应该是固定的,那么您也可以将模型修改为

from lmfit.models import ConstantModel
# model data as Step + Constant
step_mod = StepModel(form='linear', prefix='step_')
const_mod = ConstantModel(prefix='const_')

model = step_mod + const_mod

pars = model.make_params(const_c=-1,
                         step_center=xdata.mean(),
                         step_amplitude=ydata.std(),
                         step_sigma=2.0)
pars['const_c'].vary = False
agxfikkp

agxfikkp2#

这个答案太晚了,但是如果你想坚持使用scipy中的curve_fit,那么重写这个函数,使它不显式地依赖于startend作为截止点,就可以完成这项工作。例如,if x < start, then -1可以通过将x移位start并检查其符号来写入,即np.sign(x - start)。然后就变成了为函数的每个条件编写单独的定义并将它们添加到单个函数中。

def f(x, start, end):
    left_tail = np.sign(x - start)
    left_tail[left_tail > -1] = 0         # only -1s are needed from here
    right_tail = np.sign(x - end)
    right_tail[right_tail < 1] = 0        # only 1s are needed from here
    rest = 1 / (end-start) * (x - start)
    rest[(rest < 0) | (rest > 1)] = 0     # only the values between 0 and 1 are needed from here
    return left_tail + rest + right_tail  # sum for a single value

popt, pcov = curve_fit(f, xdata, ydata, p0=[495., 505.])

上面的函数可以用np.clip()(限制数组中的值)更简洁地编写,它可以取代上面的布尔索引和替换。

import numpy as np
from scipy.optimize import curve_fit
import matplotlib.pyplot as plt

def f(x, start, end):
    left_tail = np.clip(np.sign(x - start), -1, 0)
    rest = np.clip(1 / (end-start) * (x - start), 0, 1)
    return left_tail + rest

# sample data (as in the OP)
xdata = np.linspace(0, 1000, 1000)
ydata = np.r_[[-1.]*500, [1]*500]
ydata += np.random.normal(0, 0.25, len(ydata))

# fit function `f` to the data
popt, pcov = curve_fit(f, xdata, ydata, p0=[495., 505.])
print(popt, pcov)

# plot data along with the fitted function
plt.plot(xdata, ydata, 'b-', label='data')
plt.plot(xdata, f(xdata, *popt), 'r-', label='fit')
plt.legend();

# [499.4995098  501.51244722] [[ 1.24195553 -0.25654186]
#  [-0.25654186  0.2538896 ]]

然后使用与OP中相同的方式构建的数据,我们得到系数(499.499,501.51)(非常接近(500,500)),图如下所示。

相关问题