Python: error when splitting a DataFrame into training and test sets

enxuqcxy  posted on 2023-01-24  in  Python

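I am teaching myself ML & DS. I got stuck trying to split the DataFrame (dfc). The error below, and various posts on this site, suggest that the error occurs because the DataFrame has not been converted to an integer. However, as far as I know and understand, I have already done this step ("split = int(0.80*len(dfc))").
I would be grateful if someone could point me in the right direction.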

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
plt.style.use("seaborn-v0_8")
import warnings
warnings.filterwarnings("ignore")
import yfinance as yf
import ta

df = yf.download("GOOG")
df = df[["Adj Close"]]

df.columns = ["close"]
df = df.sort_index(ascending=False)

df["returns"] = df['close'].pct_change(1)
df["SMA 15"] = df[["close"]].rolling(15).mean().shift(1)
df["SMA 60"] = df[["close"]].rolling(60).mean().shift(1)
df["MSD 15"] = df[["returns"]].rolling(15).std().shift(1)
df["MSD 60"] = df[["returns"]].rolling(60).std().shift(1)

RSI = ta.momentum.RSIIndicator(df["close"], window=14, fillna=False)
df["rsi"] = RSI.rsi()
df["rsi"].loc["2010"].plot(figsize=(15,8))

dfc =df.columns

# Percentage of train set
split = int(0.80*len(dfc))

# Train set creation
X_train = dfc[['SMA 15', 'SMA 60', 'MSD 15', 'MSD 30', 'rsi']].iloc[:split] # From beginning to split
Y_train = dfc[['returns']].iloc[:split]

# Test set creation
X_test = dfc[['SMA 15', 'SMA 60', 'MSD 15', 'MSD 30', 'rsi']].iloc[split:] # From split to end
Y_test = dfc[['returns']].iloc[split:]

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

camsedfj  #1

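One possible issue is that you are not indexing the DataFrame df; you are trying to index its columns, dfc. In your code, dfc = df.columns is a pandas Index of column labels, not a DataFrame, so dfc[[...]] raises the IndexError shown above.
So when splitting into train and test, try using df instead, as follows: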

X_train = df[['SMA 15', 'SMA 60', 'MSD 15', 'MSD 30', 'rsi']].iloc[:split]
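
For reference, here is a minimal sketch of the whole split done on df rather than on dfc. It makes a few assumptions that are not in the original post: synthetic random-walk prices stand in for the yfinance download, the rsi column is left out so the example does not depend on the ta package, and rows with NaN from the rolling windows are dropped before splitting. Two further points from the question's code are reflected here: split is computed from len(df) (len(dfc) would be the number of columns, not rows), and the feature list uses 'MSD 60', because the question creates that column but then selects 'MSD 30', which would likely raise a KeyError once the first error is fixed.

```python
import numpy as np
import pandas as pd

# Assumption: synthetic random-walk prices replace the yfinance download.
rng = np.random.default_rng(0)
dates = pd.date_range("2020-01-01", periods=300, freq="D")
df = pd.DataFrame({"close": 100 + rng.normal(0, 1, 300).cumsum()}, index=dates)

# Same feature construction as in the question (rsi omitted here).
df["returns"] = df["close"].pct_change(1)
df["SMA 15"] = df["close"].rolling(15).mean().shift(1)
df["SMA 60"] = df["close"].rolling(60).mean().shift(1)
df["MSD 15"] = df["returns"].rolling(15).std().shift(1)
df["MSD 60"] = df["returns"].rolling(60).std().shift(1)

# Assumption: drop the warm-up rows that contain NaN from the rolling windows.
df = df.dropna()

# Column names must match the ones actually created above.
features = ["SMA 15", "SMA 60", "MSD 15", "MSD 60"]

# Split on the number of rows of the DataFrame, not on df.columns.
split = int(0.80 * len(df))

# Train set: from the beginning up to split
X_train = df[features].iloc[:split]
Y_train = df[["returns"]].iloc[:split]

# Test set: from split to the end
X_test = df[features].iloc[split:]
Y_test = df[["returns"]].iloc[split:]

print(X_train.shape, X_test.shape)
```

Because the rows are in ascending date order here, the training set covers the earlier 80% of the dates and the test set the later 20%.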
