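I have two numpy arrays:
x = np.linspace(1,10,100) + np.random.randn(n)/5
y = np.sin(x)+x/6 + np.random.randn(n)/10
I want to train a linear regression model on this data. To compare how model complexity relates to generalization, I preprocess the input with polynomial features for a set of four degrees (1, 3, 6, 9). After fitting the model, I want to test it on the array x = np.linspace(1, 10, 100).
After many attempts, I figured out that the x and y arrays need to be reshaped, and I did that. However, when I create the new x dataset to predict on, it complains that the dimensions are not aligned. Prediction does work on the test split taken from the original x array.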
**Below is my code**
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

np.random.seed(0)
n = 100
x = np.linspace(0,10,n) + np.random.randn(n)/5
y = np.sin(x)+x/6 + np.random.randn(n)/10

X_train, X_test, y_train, y_test = train_test_split(x, y, random_state=0)

def fn_one():
    from sklearn.linear_model import LinearRegression
    from sklearn.preprocessing import PolynomialFeatures
    x_predict = np.linspace(0,10,100)
    x_predict = x_predict.reshape(-1, 1)
    degrees = [1, 3, 6, 9]
    predictions = []
    for i, deg in enumerate(degrees):
        linReg = LinearRegression()
        pf = PolynomialFeatures(degree=deg)
        xt = x.reshape(-1, 1)
        yt = y.reshape(-1, 1)
        X_transformed = pf.fit_transform(xt)
        X_train_transformed, X_test_transformed, y_train_temp, y_test_temp = train_test_split(X_transformed, yt, random_state=0)
        linReg.fit(X_train_transformed, y_train_temp)
        predictions.append(linReg.predict(x_predict))
    np.array(predictions)
    return predictions
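**Shapes of the different arrays (for degree 3 in the loop)**
x_predict = (100, 1)
xt = (100, 1)
yt = (100, 1)
X_train_transformed = (75, 4)
y_train_temp = (75, 1)
X_test_transformed = (25, 4)
y_test_temp = (25, 1)
predictions for X_test_transformed = (4, 25, 1)
predictions for x_predict = not working:
Error = ValueError: shapes (100,1) and (2,1) not aligned: 1 (dim 1) != 2 (dim 0)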
1 Answer
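You forgot to transform your x_predict. The model is fitted on the polynomial features, so x_predict has to go through the same fitted PolynomialFeatures object (pf.transform(x_predict)) before it is passed to linReg.predict. I have updated your code accordingly. Now when you call fn_one(), you get the predictions.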
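A minimal sketch of that fix, assuming the only change needed is to pass x_predict through pf.transform before predict. The rest mirrors the code in the question, except that y is left one-dimensional here (an assumption made for simplicity), so each prediction has shape (100,) rather than (100, 1).

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import PolynomialFeatures

np.random.seed(0)
n = 100
x = np.linspace(0, 10, n) + np.random.randn(n) / 5
y = np.sin(x) + x / 6 + np.random.randn(n) / 10


def fn_one():
    # Points to predict on, as a single-feature column.
    x_predict = np.linspace(0, 10, 100).reshape(-1, 1)
    degrees = [1, 3, 6, 9]
    predictions = []
    for deg in degrees:
        pf = PolynomialFeatures(degree=deg)
        # Expand the single feature into deg + 1 polynomial columns.
        X_transformed = pf.fit_transform(x.reshape(-1, 1))
        X_train_t, X_test_t, y_train_t, y_test_t = train_test_split(
            X_transformed, y, random_state=0
        )
        lin_reg = LinearRegression()
        lin_reg.fit(X_train_t, y_train_t)
        # The fix: expand x_predict with the same fitted transformer,
        # so its column count matches what the model was trained on.
        predictions.append(lin_reg.predict(pf.transform(x_predict)))
    return np.array(predictions)


result = fn_one()
print(result.shape)  # (4, 100): one row of predictions per degree
```

Another option is to wrap both steps with sklearn.pipeline.make_pipeline, which applies the transform automatically when predicting.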