numpy 在同一数据中的不同属性域上查找两个线性拟合

c86crjj0  于 2022-11-10  发布在  其他
关注(0)|答案(1)|浏览(134)

我试图绘制一个三次多项式,在同一组数据上有两个线性拟合。我的数据是这样的:

,Frequency,Flux Density,log_freq,log_flux
0,1.25e+18,1.86e-07,18.096910013008056,-6.730487055782084
1,699000000000000.0,1.07e-06,14.84447717574568,-5.97061622231479
2,541000000000000.0,1.1e-06,14.73319726510657,-5.958607314841775
3,468000000000000.0,1e-06,14.670245853074125,-6.0
4,458000000000000.0,1.77e-06,14.660865478003869,-5.752026733638194
5,89400000000000.0,3.01e-05,13.951337518795917,-4.521433504406157
6,89400000000000.0,9.3e-05,13.951337518795917,-4.031517051446065
7,89400000000000.0,0.00187,13.951337518795917,-2.728158393463501
8,65100000000000.0,2.44e-05,13.813580988568193,-4.61261017366127
9,65100000000000.0,6.28e-05,13.813580988568193,-4.2020403562628035
10,65100000000000.0,0.00108,13.813580988568193,-2.96657624451305
11,25900000000000.0,0.000785,13.413299764081252,-3.1051303432547472
12,25900000000000.0,0.00106,13.413299764081252,-2.9746941347352296
13,25900000000000.0,0.000796,13.413299764081252,-3.099086932262331
14,13600000000000.0,0.00339,13.133538908370218,-2.469800301796918
15,13600000000000.0,0.00372,13.133538908370218,-2.4294570601181023
16,13600000000000.0,0.00308,13.133538908370218,-2.5114492834995557
17,12700000000000.0,0.00222,13.103803720955957,-2.653647025549361
18,12700000000000.0,0.00204,13.103803720955957,-2.6903698325741012
19,230000000000.0,0.133,11.361727836017593,-0.8761483590329142
22,90000000000.0,0.518,10.954242509439325,-0.28567024025476695
23,61000000000.0,1.0,10.785329835010767,0.0
24,61000000000.0,0.1,10.785329835010767,-1.0
25,61000000000.0,0.4,10.785329835010767,-0.3979400086720376
26,42400000000.0,0.8,10.627365856592732,-0.09691001300805639
27,41000000000.0,0.9,10.612783856719735,-0.045757490560675115
28,41000000000.0,0.7,10.612783856719735,-0.1549019599857432
29,41000000000.0,0.8,10.612783856719735,-0.09691001300805639
30,41000000000.0,0.6,10.612783856719735,-0.2218487496163564
31,41000000000.0,0.7,10.612783856719735,-0.1549019599857432
32,37000000000.0,1.0,10.568201724066995,0.0
33,36800000000.0,1.0,10.565847818673518,0.0
34,36800000000.0,0.98,10.565847818673518,-0.00877392430750515
35,33000000000.0,0.8,10.518513939877888,-0.09691001300805639
36,33000000000.0,1.0,10.518513939877888,0.0
37,31400000000.0,0.92,10.496929648073214,-0.036212172654444715
38,23000000000.0,1.4,10.361727836017593,0.146128035678238
39,23000000000.0,1.1,10.361727836017593,0.04139268515822508
40,23000000000.0,1.11,10.361727836017593,0.045322978786657475
41,23000000000.0,1.1,10.361727836017593,0.04139268515822508
42,22200000000.0,1.23,10.346352974450639,0.08990511143939793
43,22200000000.0,1.24,10.346352974450639,0.09342168516223506
44,21700000000.0,0.98,10.33645973384853,-0.00877392430750515
45,21700000000.0,1.07,10.33645973384853,0.029383777685209667
46,20000000000.0,1.44,10.301029995663981,0.15836249209524964
47,15400000000.0,1.32,10.187520720836464,0.12057393120584989
48,15000000000.0,1.5,10.176091259055681,0.17609125905568124
49,15000000000.0,1.5,10.176091259055681,0.17609125905568124
50,15000000000.0,1.42,10.176091259055681,0.15228834438305647
51,15000000000.0,1.43,10.176091259055681,0.1553360374650618
52,15000000000.0,1.42,10.176091259055681,0.15228834438305647
53,15000000000.0,1.47,10.176091259055681,0.1673173347481761
54,15000000000.0,1.38,10.176091259055681,0.13987908640123647
55,10700000000.0,2.59,10.02938377768521,0.4132997640812518
56,8870000000.0,2.79,9.947923619831727,0.44560420327359757
57,8460000000.0,2.69,9.927370363039023,0.42975228000240795
58,8400000000.0,2.8,9.924279286061882,0.4471580313422192
59,8400000000.0,2.53,9.924279286061882,0.40312052117581787
60,8400000000.0,2.06,9.924279286061882,0.31386722036915343
61,8300000000.0,2.58,9.919078092376074,0.41161970596323016
62,8080000000.0,2.76,9.907411360774587,0.4409090820652177
63,5010000000.0,3.68,9.699837725867246,0.5658478186735176
64,5000000000.0,0.81,9.698970004336019,-0.09151498112135022
65,5000000000.0,3.5,9.698970004336019,0.5440680443502757
66,5000000000.0,3.57,9.698970004336019,0.5526682161121932
67,4980000000.0,3.46,9.697229342759718,0.5390760987927766
68,4900000000.0,2.95,9.690196080028514,0.46982201597816303
69,4850000000.0,3.46,9.685741738602264,0.5390760987927766
70,4850000000.0,3.45,9.685741738602264,0.5378190950732742
71,4780000000.0,2.16,9.679427896612118,0.3344537511509309
72,4540000000.0,3.61,9.657055852857104,0.557507201905658
73,2700000000.0,3.5,9.431363764158988,0.5440680443502757
74,2700000000.0,3.7,9.431363764158988,0.568201724066995
75,2700000000.0,3.92,9.431363764158988,0.5932860670204573
76,2700000000.0,3.92,9.431363764158988,0.5932860670204573
77,2250000000.0,4.21,9.352182518111363,0.6242820958356683
78,1660000000.0,3.69,9.220108088040055,0.5670263661590603
79,1660000000.0,3.8,9.220108088040055,0.5797835966168101
80,1410000000.0,3.5,9.14921911265538,0.5440680443502757
81,1400000000.0,3.45,9.146128035678238,0.5378190950732742
82,1400000000.0,3.28,9.146128035678238,0.5158738437116791
83,1400000000.0,3.19,9.146128035678238,0.5037906830571811
84,1400000000.0,3.51,9.146128035678238,0.5453071164658241
85,1340000000.0,3.31,9.127104798364808,0.5198279937757188
86,1340000000.0,3.31,9.127104798364808,0.5198279937757188
87,750000000.0,3.14,8.8750612633917,0.49692964807321494
88,408000000.0,1.46,8.61066016308988,0.1643528557844371
89,408000000.0,1.46,8.61066016308988,0.1643528557844371
90,365000000.0,1.62,8.562292864456476,0.20951501454263097
91,365000000.0,1.56,8.562292864456476,0.1931245983544616
92,333000000.0,1.32,8.52244423350632,0.12057393120584989
93,302000000.0,1.23,8.48000694295715,0.08990511143939793
94,151000000.0,2.13,8.178976947293169,0.3283796034387377
95,73800000.0,3.58,7.868056361823042,0.5538830266438743

我的代码是

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import numpy.polynomial.polynomial as poly

def find_extrema(poly, bounds):
    '''
    Finds the extrema of the polynomial; ensure real.
    https://stackoverflow.com/questions/72932816/python-finding-local-maxima-minima-for-multiple-polynomials-efficiently
    '''
    deriv = poly.deriv()
    extrema = deriv.roots()
    # Filter out complex roots
    extrema = extrema[np.isreal(extrema)]
    # Get real part of root
    extrema = np.real(extrema)
    # Apply bounds check
    lb, ub = bounds
    extrema = extrema[(lb <= extrema) & (extrema <= ub)]
    return extrema

def find_maximum(poly, bounds):
    '''
    Find the maximum point; returns the value of the turnover frequency.
    https://stackoverflow.com/questions/72932816/python-finding-local-maxima-minima-for-multiple-polynomials-efficiently
    '''
    extrema = find_extrema(poly, bounds)
    # Either bound could end up being the minimum. Check those too.
    extrema = np.concatenate((extrema, bounds))
    value_at_extrema = poly(extrema)
    maximum_index = np.argmax(value_at_extrema)
    return extrema[maximum_index]

# LOAD THE DATA FROM FILE HERE

# CARRY ON...

xvar = 'log_freq'
yvar = 'log_flux'

x, y = pks[xvar], pks[yvar]
lower = min(x)
upper = max(x)

# Find the 3rd-order polynomial which fits the SED

coefs = poly.polyfit(x, y, 3) # find the coeffs
x_new = np.linspace(lower, upper, num=len(x)*10) # space to plot the fit
ffit = poly.Polynomial(coefs) # find the polynomial

# Find turnover frequency and peak flux

nu_to = find_maximum(ffit, (lower, upper))
F_p = ffit(nu_to)

# HERE'S THE TRICKY BIT

# Find the straight line to fit to the left of nu_to

left_linefit = poly.polyfit(x, y, 1)
x_left = np.linspace(lower, nu_to, num=len(x)*10) # space to plot the fit
ffit_thin = poly.Polynomial(left_linefit,
                        domain = (lower, nu_to)
                        )

# PLOTS THE POLYNOMIAL WELL

ax1 = plt.subplot(1, 1, 1)
ax1.scatter(pks[xvar], pks[yvar], label = 'PKS 0742+10', c = 'b')
ax1.plot(x_new, ffit(x_new), color = 'r')

ax1.plot(x_left, ffit_left(x_left), color = 'gold')

ax1.set_yscale('linear')
ax1.set_xscale('linear')
ax1.legend()

ax1.set_xlabel(r'$\log\nu$ ($\nu$ in Hz)')
ax1.set_ylabel(r'$\log F_{\nu}$ ($F_{\nu}$ in Jy)')
ax1.grid(axis = 'both', which = 'major')

该代码很好地生成了多边形拟合:

我正在尝试绘制最大值两侧的点的直线拟合,如下图所示:

我以为我能做到

ffit_left = poly.Polynomial(left_linefit,
                        domain = (lower, nu_to)
                        )

ffit_right类似,但这会产生

这实际上是适合整个数据集的直线,仅为该域绘制。我不想操纵数据集,因为最终我将不得不对许多数据集进行操作。
代码的合适部分来自an answer to this question

如何在不操作数据集的情况下将一条直线拟合到一组点?

我的猜测是,我必须让left_linefit = poly.polyfit(x, y, 1)识别一个域,但我在numpy polyfit docs中看不到任何东西。
很抱歉问这么长的问题!

bq3bfh9z

bq3bfh9z1#

我不一定能很好地理解你的要求。如果你想拟合一个由三个线性段组成的分段函数,在https://fr.scribd.com/document/380941024/Regression-par-morceaux-Piecewise-Regression-pdf中描述了一种方法,并给出了理论和数值例子。
我们考虑了几个案例。其中,下面的案例可能会给您带来方便。

H(*)是Heaviside阶跃函数。

相关问题