pandas: data frame definition error when running statistical forecasts for multiple time series in Python

aiazj4mn  posted on 2022-11-20 in Python

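I'm trying to replicate this statistical forecasting code in Python and ran into a strange error: ***"name 'forecasts' is not defined"***. It is odd because I was previously able to replicate the code without any errors.
The difference from the reference code (linked below, which I implemented successfully) is that, instead of fitting on a training set and holding out the last 6 months for evaluation, I am now using the entire dataset to create the statistical forecasts.
For example: my time series runs through September 2022, and I want to use all of the data through September 2022 as the training set for the statistical models. Previously, the training data ran through March 2022 and the remaining 6 months were the test data. Now I get the error, and I can't understand why, since the logic is the same.
Below is a simplified version of the data frame used in the calculation: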

    {'Key': {0: 65162552161356, 1: 65162552635756, 2: 65162552843456, 3: 65162552842856, 4: 65162552736856}, '2021-04-01': {0: 31, 1: 0, 2: 281, 3: 207, 4: 55}, '2021-05-01': {0: 25, 1: 0, 2: 72, 3: 104, 4: 6}, '2021-06-01': {0: 16, 1: 0, 2: 108, 3: 32, 4: 14}, '2021-07-01': {0: 8, 1: 0, 2: 107, 3: 78, 4: 10}, '2021-08-01': {0: 21, 1: 0, 2: 80, 3: 40, 4: 9}, '2021-09-01': {0: 24, 1: 0, 2: 40, 3: 73, 4: 3}, '2021-10-01': {0: 13, 1: 0, 2: 36, 3: 79, 4: 11}, '2021-11-01': {0: 59, 1: 0, 2: 65, 3: 139, 4: 14}, '2021-12-01': {0: 51, 1: 0, 2: 41, 3: 87, 4: 10}, '2022-01-01': {0: 2, 1: 0, 2: 43, 3: 47, 4: 6}, '2022-02-01': {0: 0, 1: 0, 2: 0, 3: 63, 4: 3}, '2022-03-01': {0: 0, 1: 0, 2: 16, 3: 76, 4: 18}, '2022-04-01': {0: 0, 1: 0, 2: 37, 3: 32, 4: 8}, '2022-05-01': {0: 0, 1: 0, 2: 106, 3: 96, 4: 40}, '2022-06-01': {0: 0, 1: 0, 2: 101, 3: 75, 4: 16}, '2022-07-01': {0: 0, 1: 0, 2: 60, 3: 46, 4: 14}, '2022-08-01': {0: 0, 1: 0, 2: 73, 3: 91, 4: 13}, '2022-09-01': {0: 0, 1: 0, 2: 19, 3: 17, 4: 2}}
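To make the two setups described above concrete, here is a minimal sketch using pandas only. It takes two of the series from the data above (truncated to four months for brevity), reshapes them from wide to long format with `unique_id`, `ds` and `y` columns, and then contrasts a holdout split with training on everything. The names `wide`, `long_df`, `df_train`, `df_full` and the cutoff date are illustrative assumptions, not part of the original code.

```python
import pandas as pd

# Two series from the data above, truncated to four months for brevity.
wide = pd.DataFrame({
    "Key": [65162552161356, 65162552843456],
    "2022-06-01": [0, 101],
    "2022-07-01": [0, 60],
    "2022-08-01": [0, 73],
    "2022-09-01": [0, 19],
})

# Wide -> long: one row per (series, month).
long_df = (
    wide.melt(id_vars="Key", var_name="ds", value_name="y")
        .rename(columns={"Key": "unique_id"})
)
long_df["unique_id"] = long_df["unique_id"].astype(str)
long_df["ds"] = pd.to_datetime(long_df["ds"], format="%Y-%m-%d")
long_df = long_df.sort_values(["unique_id", "ds"]).reset_index(drop=True)

# Earlier setup: hold out the most recent months as a test set.
# The cutoff here is illustrative, chosen to fit the truncated data.
cutoff = pd.Timestamp("2022-07-01")
df_train = long_df[long_df["ds"] <= cutoff]
df_test = long_df[long_df["ds"] > cutoff]

# Current setup: train on everything, so no df_test is produced.
df_full = long_df

print(long_df)
```

Note that in the second setup nothing named `df_test` exists, so a later merge of the forecasts with `df_test` would have nothing to merge against unless that frame is defined elsewhere.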

Here is the reference link: https://towardsdatascience.com/time-series-forecasting-with-statistical-models-f08dcd1d24d1

    import random
    from itertools import product
    from IPython.display import display, Markdown
    from multiprocessing import cpu_count
    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd
    from statsforecast import StatsForecast
    from nixtlats.data.datasets.m4 import M4, M4Info
    from statsforecast.models import (
        adida,
        croston_classic,
        croston_sba,
        croston_optimized,
        historic_average,
        imapa,
        naive,
        random_walk_with_drift,
        seasonal_exponential_smoothing,
        seasonal_naive,
        seasonal_window_average,
        ses,
        tsb,
        window_average
    )
    df = pd.read_excel('C:/X/X/X/2.1 Demand_Data_Used.xlsx')
    df['Key'] = df['Key'].astype(str)
    df = pd.melt(df, id_vars='Key', value_vars=list(df.columns[1:]), var_name='ds')
    df.columns = df.columns.str.replace('Key', 'unique_id')
    df.columns = df.columns.str.replace('value', 'y')
    df["ds"] = pd.to_datetime(df["ds"], format='%Y-%m-%d')
    df = df[["ds", "unique_id", "y"]]
    df['unique_id'] = df['unique_id'].astype('object')
    df = df.set_index('unique_id')
    df.reset_index()
    seasonality = 30 #Monthly data
    models = [
        adida,
        croston_classic,
        croston_sba,
        croston_optimized,
        historic_average,
        imapa,
        naive,
        random_walk_with_drift,
        (seasonal_exponential_smoothing, seasonality, 0.2),
        (seasonal_naive, seasonality),
        (seasonal_window_average, seasonality, 2 * seasonality),
        (ses, 0.1),
        (tsb, 0.3, 0.2),
        (window_average, 2 * seasonality)
    ]
    fcst = StatsForecast(df=df, models=models, freq='MS', n_jobs=cpu_count())
    %time forecasts = fcst.forecast(6)
    forecasts.reset_index()
    forecasts = forecasts.reset_index().merge(df_test, how='left', on=['unique_id', 'ds'])
    models = forecasts.drop(columns=['unique_id', 'ds', 'y']).columns.to_list()

A screenshot of the error was attached to the original post.

Can anyone tell me what I am doing wrong? I would be very grateful.

bkhjykvo1#

The problem is caused by the Croston family of models. I have opened an issue to get it fixed. In the meantime, skipping those models works.

    models = [
        adida,
        #croston_classic,
        #croston_sba,
        #croston_optimized,
        historic_average,
        imapa,
        naive,
        random_walk_with_drift,
        (seasonal_exponential_smoothing, seasonality, 0.2),
        (seasonal_naive, seasonality),
        (seasonal_window_average, seasonality, 2 * seasonality),
        (ses, 0.1),
        (tsb, 0.3, 0.2),
        (window_average, 2 * seasonality)
    ]
    fcst = StatsForecast(df=df, models=models, freq='MS', n_jobs=cpu_count())
    fcst.forecast(6)
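This would also be consistent with the error message reported in the question. In Python, a name is bound only if the right-hand side of the assignment finishes without raising. If the forecast call failed, `forecasts` would never have been created, and any later line using it would report that the name is not defined. The thread does not confirm that this is the exact sequence, so treat the following as a minimal sketch; `failing_forecast` is a hypothetical stand-in, not a StatsForecast function.

```python
def failing_forecast(horizon):
    # Hypothetical stand-in for a forecast call that raises partway through.
    raise ValueError("model failed on one of the series")


try:
    # The exception is raised before the assignment happens,
    # so the name `forecasts` is never bound.
    forecasts = failing_forecast(6)
except ValueError as exc:
    first_error = str(exc)

try:
    forecasts  # using the name afterwards triggers the secondary error
    second_error = None
except NameError as exc:
    second_error = str(exc)

print("first error: ", first_error)
print("second error:", second_error)
```

In a notebook, the first error appears in the cell that runs the forecast, and the second one only shows up in later cells, so it is worth scrolling back to the earlier traceback when a "not defined" message looks out of place.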

Update:
The latest version of StatsForecast fixes this issue. You can use the Croston models as follows:

    from multiprocessing import cpu_count
    from statsforecast import StatsForecast
    from statsforecast.models import CrostonClassic, CrostonSBA, CrostonOptimized
    models = [
        CrostonClassic(),
        CrostonSBA(),
        CrostonOptimized()
    ]
    # df is the long-format frame built in the question
    fcst = StatsForecast(df=df, models=models, freq='MS', n_jobs=cpu_count())
    fcst.forecast(6)