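I am trying to use skforecast for time series analysis, but I get a warning saying that the DataFrame has no frequency because the index is not a DatetimeIndex, when in fact it is.
Here is the code: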
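import yfinance as yf
from lightgbm import LGBMRegressor
# Import paths as in the skforecast releases that ship ForecasterAutoreg
from skforecast.ForecasterAutoreg import ForecasterAutoreg
from skforecast.model_selection import grid_search_forecaster

# Download the price history for SPXL
spxl = yf.Ticker("SPXL")
hist = spxl.history(start="2015-01-01")
hist = hist.asfreq("D")
data = hist.dropna()
type(data.index)
# Output: pandas.core.indexes.datetimes.DatetimeIndex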
# Split data into train-val-test
#==============================================================================
data = data.loc['2015-01-01': '2022-12-31']
end_train = '2019-12-31'
end_validation = '2020-12-31'
data_train = data.loc[: end_train, :].copy()
data_val = data.loc[end_train:end_validation, :].copy()
data_test = data.loc[end_validation:, :].copy()
# Create forecaster
#==============================================================================
forecaster = ForecasterAutoreg(
    regressor = LGBMRegressor(),
    lags = 7
)
# Grid search of hyper-parameters and lags
#==============================================================================
# Regressor hyper-parameters
param_grid = {
    'n_estimators': [100, 500],
    'max_depth': [3, 5, 10],
    'learning_rate': [0.01, 0.1]
}
# Lags used as predictors
lags_grid = [7]
The warning is triggered at this point, in the call to grid_search_forecaster:
results_grid_q10 = grid_search_forecaster(
    forecaster = forecaster,
    y = data.loc[:end_validation, 'Close'],
    param_grid = param_grid,
    lags_grid = lags_grid,
    steps = 7,
    refit = True,
    metric = 'mean_squared_error',
    initial_train_size = int(len(data_train)),
    fixed_train_size = False,
    return_best = True,
    verbose = False
)
I can't figure out what I am doing wrong!
1 Answer
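In case anyone runs into the same problem with skforecast or another time series forecasting library, here is the solution from the library's creator: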
https://github.com/JoaquinAmatRodrigo/skforecast/issues/329
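For readers who cannot open the link, here is a minimal sketch of what most likely triggers the warning. It is an assumption based on how pandas behaves, not a summary of the linked issue: calling dropna() after asfreq("D") removes the weekend rows again, so the index is still a DatetimeIndex but its freq attribute is no longer set. The data below is synthetic (a business-day range standing in for the yfinance download), and forward-filling is only one possible way to keep a regular daily index.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the downloaded prices (assumption): business days
# only, so weekends are missing, as with real market data.
idx = pd.bdate_range("2015-01-01", periods=20)
hist = pd.DataFrame({"Close": np.arange(20, dtype=float)}, index=idx)

# What the question does: reindex to daily, then drop the NaN rows again.
# The index type is unchanged, but the frequency information is lost.
dropped = hist.asfreq("D").dropna()
print(type(dropped.index).__name__, dropped.index.freq)

# One possible alternative: keep the daily grid and fill the gaps
# instead of dropping them, so the index keeps its daily frequency.
filled = hist.asfreq("D").ffill()
print(type(filled.index).__name__, filled.index.freq)
```

If the warning does come from the missing frequency, passing the 'Close' column of a frame built like `filled` to grid_search_forecaster should avoid it. Whether forward-filled weekend prices are acceptable is a modelling decision.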