python-3.x: Time series forecasting with multiple covariates using Darts

Posted by zi8p0yeb on 2023-05-30 in Python

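I am trying to follow the tutorial here, in which two covariates are used to predict the target.
The tutorial uses .stack() to combine the two covariates. It is not clear to me how to use this function to add more than two covariates to the model.
I tried the following code: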

from darts.models import BlockRNNModel

my_covariate = (tg.sine_timeseries(length=LENGTH, 
                                value_frequency=(1/14), 
                                freq='D', 
                                column_name='my_covariate')
             + 0.4 * tg.gaussian_timeseries(length=LENGTH, freq='D'))

brnn_melting_and_rain = BlockRNNModel(input_chunk_length=30, 
                                      output_chunk_length=10, 
                                      n_rnn_layers=2)

brnn_melting_and_rain.fit(flow_train, 
                          # past_covariates=melting.stack(rainfalls).stack(rainfalls), 
                          past_covariates=[melting,rainfalls,my_covariate],
                          epochs=10, 
                          verbose=True)

eval_model(brnn_melting_and_rain, 
           past_covariates=melting.stack(rainfalls))

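But I got the following error:
ValueError: The provided sequence of target series must have the same length as the provided sequence of covariate series.
I tried reading the Darts documentation, but it gives no clear explanation of how to use past_covariates: specifically, what data type I should pass here and whether there are any other requirements.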


xmjla07d #1

By copying and pasting from your code and the link you provided, I managed to get working code (no errors), but I am not sure it is what you want.

from darts.models import BlockRNNModel
from darts.utils import timeseries_generation as tg
from darts.metrics import rmse

def eval_model(model, past_covariates=None, future_covariates=None):
    # Past and future covariates are optional because they won't always be used in our tests
    
    # We backtest the model on the last 20% of the flow series, with a horizon of 10 steps:
    backtest = model.historical_forecasts(series=flow, 
                                          past_covariates=past_covariates,
                                          future_covariates=future_covariates,
                                          start=0.8, 
                                          retrain=False,
                                          verbose=True, 
                                          forecast_horizon=10)
    
    flow[-len(backtest)-100:].plot()
    backtest.plot(label='backtest (n=10)')
    print('Backtest RMSE = {}'.format(rmse(flow, backtest)))

LENGTH = 3 * 365

my_covariate = (tg.sine_timeseries(length=LENGTH, 
                                   value_frequency=(1/14), 
                                   freq='D', 
                                   column_name='my_covariate')
             + 0.4 * tg.gaussian_timeseries(length=LENGTH, freq='D'))

melting = (tg.sine_timeseries(length=LENGTH, 
                              value_frequency=(1/365), 
                              freq='D', 
                              column_name='melting')
           + 0.15 * tg.gaussian_timeseries(length=LENGTH, freq='D'))

rainfalls = (tg.sine_timeseries(length=LENGTH, 
                                value_frequency=(1/14), 
                                freq='D', 
                                column_name='rainfall')
             + 0.3 * tg.gaussian_timeseries(length=LENGTH, freq='D'))

melting_contribution = 0.5 * melting.shift(5)

all_contributions = [melting_contribution] + [0.1 * rainfalls.shift(lag) for lag in range(5)]

flow = sum(
    [series[melting_contribution.start_time():][:melting.end_time()] 
     for series in all_contributions]
).with_columns_renamed('melting', 'flow')

brnn_melting_and_rain = BlockRNNModel(input_chunk_length=30, 
                                      output_chunk_length=10, 
                                      n_rnn_layers=2)

# Hold out the last 20% of the target so the backtest runs on data the model has not seen
flow_train, _ = flow.split_before(0.8)

brnn_melting_and_rain.fit(flow_train, 
                          # past_covariates=melting.stack(rainfalls).stack(rainfalls), 
                          # past_covariates=[melting,rainfalls,my_covariate],
                          past_covariates=melting.stack(rainfalls).stack(my_covariate),
                          epochs=10, 
                          verbose=True)

eval_model(brnn_melting_and_rain, 
           #past_covariates=melting.stack(rainfalls)
           past_covariates=melting.stack(rainfalls).stack(my_covariate)
          )

The key is to stack the time series with melting.stack(rainfalls).stack(my_covariate) and to pass that stacked series to eval_model as well as to fit.
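The ValueError in the question appears to come from how a list is interpreted: as far as I can tell, Darts reads a list passed as past_covariates as one covariate series per target series, so a single target needs a single (possibly multivariate) covariate series. The pure-Python sketch below only mimics that rule. `as_sequence` and `check_covariates` are hypothetical helpers written for illustration, not Darts functions, and plain strings stand in for TimeSeries objects.

```python
def as_sequence(x):
    """Wrap a single series in a list; leave lists and tuples as lists."""
    return list(x) if isinstance(x, (list, tuple)) else [x]


def check_covariates(series, past_covariates):
    """Hypothetical stand-in for the length check behind the ValueError."""
    targets = as_sequence(series)
    covariates = as_sequence(past_covariates)
    if len(targets) != len(covariates):
        raise ValueError(
            "The provided sequence of target series must have the same "
            "length as the provided sequence of covariate series."
        )
    return len(targets)


# One target with one stacked (multivariate) covariate series: accepted.
n_ok = check_covariates("flow_train", "melting+rainfalls+my_covariate")

# One target with a list of three covariate series: rejected (1 != 3).
try:
    check_covariates("flow_train", ["melting", "rainfalls", "my_covariate"])
    list_rejected = False
except ValueError:
    list_rejected = True
```

Under this reading, a list of covariates only makes sense when the target is itself a list of the same length, with one covariate series per target.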


w51jfk4q #2

This should solve the problem and let you fit the model with the covariate time series.

from darts.models import BlockRNNModel

my_covariate = (tg.sine_timeseries(length=LENGTH, 
                                value_frequency=(1/14), 
                                freq='D', 
                                column_name='my_covariate')
             + 0.4 * tg.gaussian_timeseries(length=LENGTH, freq='D'))

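# Align all covariates on their common time span so they can be stacked
common = melting.slice_intersect(rainfalls).slice_intersect(my_covariate)
melting = melting.slice_intersect(common)
rainfalls = rainfalls.slice_intersect(common)
my_covariate = my_covariate.slice_intersect(common)

# Combine the covariates into one multivariate series; a list would be read
# as one covariate series per target series
past_covariates = melting.stack(rainfalls).stack(my_covariate)

brnn_melting_and_rain = BlockRNNModel(input_chunk_length=30, 
                                      output_chunk_length=10, 
                                      n_rnn_layers=2)

brnn_melting_and_rain.fit(flow_train, 
                          past_covariates=past_covariates,
                          epochs=10, 
                          verbose=True)

eval_model(brnn_melting_and_rain, 
           past_covariates=past_covariates)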
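
Bringing several series onto a common time span, as the step above intends, amounts to intersecting their time indexes. The sketch below illustrates the idea with plain dicts keyed by date as stand-ins for time series. `daily_series` and `align_on_common_dates` are hypothetical helpers for illustration only and are not part of Darts.

```python
from datetime import date, timedelta


def daily_series(start, n, value=0.0):
    """Build a toy daily series as {date: value}."""
    return {start + timedelta(days=i): value for i in range(n)}


def align_on_common_dates(*series):
    """Keep only the dates that every series shares, in sorted order."""
    common = set(series[0])
    for s in series[1:]:
        common &= set(s)
    ordered = sorted(common)
    return [{d: s[d] for d in ordered} for s in series]


# The target starts five days after the covariate and ends three days later.
target = daily_series(date(2000, 1, 6), 10)     # 2000-01-06 .. 2000-01-15
covariate = daily_series(date(2000, 1, 1), 12)  # 2000-01-01 .. 2000-01-12

aligned_target, aligned_covariate = align_on_common_dates(target, covariate)
```

After alignment both toy series cover only the overlapping dates, which is the precondition for stacking them column by column.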

u59ebvdq #3

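Use the API's concatenate to stack many time series properly, instead of passing a list or chaining many stack calls (that is, avoid "stacking stacks").
For example, build the multivariate data together with the custom feature like this: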

from darts import concatenate
past_covariates = concatenate([melting, rainfalls, my_covariate], axis=1)
past_covariates.plot()  # inspect visually

and then train on this multivariate time series:

brnn_melting.fit(flow_train, 
                 past_covariates=past_covariates, 
                 epochs=100, 
                 verbose=True)

Note: although it is not recommended, "stacking stacks" is essentially equivalent to concatenation:

from darts import concatenate

past_covariates = concatenate([melting, rainfalls, my_covariate], axis=1)
past_covariates_old = melting.stack(rainfalls).stack(my_covariate)
assert past_covariates == past_covariates_old
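
The equivalence can be pictured with plain NumPy arrays, where each univariate series is a column of shape (time, 1). This is only an analogy under that assumption, not the Darts implementation: one concatenation along the column axis and a chain of pairwise joins produce the same (time, 3) array.

```python
import numpy as np

n_steps = 6

# Stand-ins for three univariate series: shape (time, components) = (n_steps, 1)
melting = np.arange(n_steps, dtype=float).reshape(-1, 1)
rainfalls = 10 + np.arange(n_steps, dtype=float).reshape(-1, 1)
my_covariate = 100 + np.arange(n_steps, dtype=float).reshape(-1, 1)

# One call joining all columns, analogous to concatenate([...], axis=1)
joined_once = np.concatenate([melting, rainfalls, my_covariate], axis=1)

# Pairwise joins, analogous to melting.stack(rainfalls).stack(my_covariate)
joined_pairwise = np.hstack([np.hstack([melting, rainfalls]), my_covariate])
```

The single call scales more cleanly as the number of covariates grows, since the list can be built programmatically instead of extending a chain of calls.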
