bounty将在17小时后过期。回答此问题可获得+50声望奖励。Rafnus希望引起更多人对此问题的关注：我并不关心任何具体的集成方法，我只是想把下面给出的代码加速一个数量级。这应该是绝对可能的，我只是不知道怎么做。任何人谁管理的加速水平是有资格的赏金。

我想通过结合Numba的cfunc和Scipy的LowLevelCallable功能来优化基于时间的ODE系统的求解，以大大提高odeint()的速度。但是，我很难找到正确解的确切语法。
我的（未优化的）代码看起来像这样：

import numpy as np
import numba as nb
from numba import types
from matplotlib import pyplot as plt
import scipy.integrate as si

# hours in the simulation
T_sim = 72 
# simulate fluctuating outside temp for 3 days
T_out = np.sin(np.linspace(0, T_sim, T_sim)/3.8) * 8 + 18 
# heat capacity of the air and floor
cap_air, cap_flr = 1, 10

def func(T, t):
    T_air, T_flr= T[0], T[1]
    t_ = int(t)
    H_airOut = 4 * U_vent[t_] * (T_air - T_out[t_])
    H_blowAir = U_blow[t_]
    H_airFlr = (T_air - T_flr)
    dT_air = ((H_blowAir 
               - H_airOut
               - H_airFlr) / cap_air)
    dT_flr = H_airFlr / cap_flr
    return dT_air, dT_flr

T0 = [16, 18] # starting temperature for T_air and T_flr
t_eval = np.linspace(0, T_sim-1, T_sim-1)
U_blow = np.full(T_sim, 0.1) # control for the heating
U_vent = np.full(T_sim, 0.3) # control for the ventilation
result = si.odeint(func,T0,t_eval)

# this piece of code is a standin for the real code, where func() is run 
# tens of thousands of times. All variables are constant, except for the 
# two arrays U_blow and U_vent
for _ in range(10):
    U_blow = np.random.rand(T_sim)
    U_vent = np.random.rand(T_sim)
    result = si.odeint(func,T0,t_eval)

plt.plot(result[:, 0], label='T_air')
plt.plot(result[:, 1], label='T_flr')
plt.plot(T_out[:-1], label='T_out')
plt.legend()

根据其他SO帖子，我非常确定解决方案应该看起来像下面的代码，但我自己不能让它工作。如果只有一些变量在运行之间改变，问题会容易得多，但这不是事实，因为我需要能够在运行之间改变数组U_blow和U_vent。

import scipy as sp
from numba.types import float64, CPointer, intc

def jit_integrand_function(integrand_function):
    jitted_function = nb.jit(integrand_function, nopython=True)

    @nb.cfunc(float64(intc, CPointer(float64)))
    def wrapped(n, xx):
        return jitted_function(xx[0], xx[1])
    return sp.LowLevelCallable(wrapped.ctypes)

@jit_integrand_function
def func(T, t):
    T_air, T_flr= T[0], T[1]
    t_ = int(t)
    H_airOut = 4 * U_vent[t_] * (T_air - T_out[t_])
    H_blowAir = U_blow[t_]
    H_airFlr = (T_air - T_flr)
    dT_air = ((H_blowAir 
               - H_airOut
               - H_airFlr) / cap_air)
    dT_flr = H_airFlr / cap_flr
    return dT_air, dT_flr

T0 = [16, 18] # starting temperature for T_air and T_flr
t_eval = np.linspace(0, T_sim-1, T_sim-1)

for _ in range(10):
    U_blow = np.random.rand(T_sim)
    U_vent = np.random.rand(T_sim)
    result = si.odeint(func,T0,t_eval, (U_blow, U_vent))

plt.plot(result[:, 0], label='T_air')
plt.plot(result[:, 1], label='T_flr')
plt.plot(T_out[:-1], label='T_out')
plt.legend()

顺便说一句，我的完整模型在数值上不够稳定，欧拉前向积分法无法工作，所以这不是一个选择。

import scipy as sp import scipy.integrate as si import numpy as np # hours in the simulation T_sim = 72 # simulate fluctuating outside temp for 3 days T_out = np.sin(np.linspace(0, T_sim, T_sim) / 3.8) * 8 + 18 # heat capacity of the air and floor cap_air, cap_flr = 1, 10 U_blow = np.random.rand(T_sim) U_vent = np.random.rand(T_sim) def example_rhs(T, t, U_blow, U_vent): T_air, T_flr = T[0], T[1] t_ = int(t) H_airOut = 4 * U_vent[t_] * (T_air - T_out[t_]) H_blowAir = U_blow[t_] H_airFlr = (T_air - T_flr) dT_air = ((H_blowAir - H_airOut - H_airFlr) / cap_air) dT_flr = H_airFlr / cap_flr return dT_air, dT_flr # starting temperature for T_air and T_flr T0 = [16, 18] t_eval = np.linspace(0, T_sim - 1, T_sim - 1) # Repeat the integration a couple of times to increase accuracy for i in range(100): si.odeint(example_rhs, T0, t_eval, args=(U_blow, U_vent))

ncalls tottime percall cumtime percall filename:lineno(function) [...] 100 0.001 0.000 2.021 0.020 _odepack_py.py:28(odeint) 100 0.843 0.008 2.017 0.020 {built-in method scipy.integrate._odepack.odeint} [...] 973000 1.174 0.000 1.174 0.000 test.py:16(example_rhs)

1条答案

按热度按时间

iyfamqjs1#

我试图回答如下，工作的这个完整的例子：

要分析任何python代码的性能，我将通过python -mcProfile -s cumtime test.py使用cProfile
这给了我以下内容（节略）：

正如我们所看到的，代码在odeint函数中花费了大约2秒，这是一个built-in函数，它调用Fortran代码来进行求解。在这段时间内，基于Python的rhs被调用了973000次，即每次积分9730次，尽管这对于不同的输入可能会有所不同。然而，只有大约一半~1.174秒花费在右边。2所以，即使你的回调优化到运行时间为零，你仍然需要~0.8秒来完成整个操作。3因为这个时间是花费在Fortran代码中的，我不认为有什么方法可以加快这部分的速度，所以，我不认为一个数量级是可能的，至少对于这个特定的示例来说是这样（但也许你的原始示例的行为不同？）
遗憾的是LowLevelCallable s显然不支持在scipy中集成ODE，这似乎是一个明显的补充（尽管可能有什么东西阻止了这一点？！）

赞(0）回复(0）举报 2023-01-26

使用Scipy的LowLevelCallable和numba cfunc优化一个基于时间的代码，该代码将多个数组作为输入

1条答案

相关问题

热门标签

最新问答