Python ValueError: Expected 2D array, got 1D array instead: array=[-1]

nhn9ugyo  posted on 2023-06-20  in Python

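Here is the problem:
Extract just the median_income column from the independent variables (from X_train and X_test). Perform linear regression to predict housing values based on median_income. Predict output for the test dataset using the fitted model. Plot the fitted model for the training data as well as the test data to check whether the fitted model satisfies the test data.
I did a linear regression earlier. Here is the code:
import pandas as pd
import os
os.getcwd()
os.chdir('/Users/saurabhsaha/Documents/PGP-AI:ML-Purdue/New/datasets')
df = pd.read_excel('California_housing.xlsx')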

df.total_bedrooms=df.total_bedrooms.fillna(df.total_bedrooms.mean())
x = df.iloc[:,2:8]
y = df.median_house_value

from sklearn.model_selection import train_test_split

x_train,x_test,y_train,y_test = train_test_split(x,y,test_size=.20)

from sklearn.linear_model import LinearRegression

california_model = LinearRegression().fit(x_train,y_train)

california_model.predict(x_test)

Prdicted_values = pd.DataFrame(california_model.predict(x_test),columns=['Pred'])

Prdicted_values

Final = pd.concat([x_test.reset_index(drop=True),y_test.reset_index(drop=True),Prdicted_values],axis=1)
Final['Err_pct'] = abs(Final.median_house_value- 
Final.Pred)/Final.median_house_value

Here is my dataset: https://docs.google.com/spreadsheets/d/1vYngxWw7tqX8FpwkWB5G7Q9axhe9ipTu/edit?usp=sharing&ouid=114925088866643320785&rtpof=true&sd=true
Below is my code.

```python
x1_train=x_train.median_income
x1_train
x1_train.shape
x1_test=x_test.median_income
x1_test
type(x1_test)
x1_test.shape
from sklearn.linear_model import LinearRegression
california_model_new = LinearRegression().fit(x1_train,y_train)
```

I get an error right here, and when I try to reshape my 1D data into a 2D array as follows, I cannot:
```python
import numpy as np
x1_train= x1_train.reshape(-1, 1)
x1_test = x1_train.reshape(-1, 1)
```

This is the error I get:

AttributeError: 'Series' object has no attribute 'reshape'

I am new to data science, so it would be really helpful if you could explain.

jfgube3f  #1

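The error occurs because x1_train is an instance of pd.Series, and Series objects have no .reshape() method. The .reshape() method is available on NumPy arrays.
Here is some code to illustrate this: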

# A sample series
X = pd.Series([1,2,3,1,2])
0    1
1    2
2    3
3    1
4    2
dtype: int64

# converting Series to numpy array
X = X.values

array([1, 2, 3, 1, 2])

# converting 1-D array to 2-D array
X.reshape(-1,1)

array([[1],
       [2],
       [3],
       [1],
       [2]])

For the problem at hand, the following code should help resolve the error:

x1_train = x_train.median_income
type(x1_train)
pandas.core.series.Series

x1_train = x1_train.values #converting series to numpy array
x1_train
array([8.3252, 8.3014, 7.2574, ..., 1.7   , 1.8672, 2.3886])

# in case you need to convert it to a 2-D array
x1_train = x1_train.reshape(-1,1)
x1_train
array([[8.3252],
       [8.3014],
       [7.2574],
       ...,
       [1.7   ],
       [1.8672],
       [2.3886]])
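
To tie the reshape back to the original ValueError, here is a minimal sketch on made-up data. The `income` and `value` Series are hypothetical stand-ins for median_income and median_house_value, not the asker's dataset. It shows that scikit-learn rejects the 1-D Series, and that the reshaped `(n, 1)` array is accepted.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical stand-in data (not the real housing file): value = 3 * income + 1
income = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], name="median_income")
value = 3.0 * income + 1.0

print(income.shape)  # (5,)  a Series is one-dimensional

# Passing the 1-D Series as the feature matrix raises ValueError
raised = False
try:
    LinearRegression().fit(income, value)
except ValueError:
    raised = True
print("1-D input rejected:", raised)

# Convert to a NumPy array, then reshape to one column and n rows
X = income.to_numpy().reshape(-1, 1)
print(X.shape)  # (5, 1)

# The 2-D array is accepted as a feature matrix
model = LinearRegression().fit(X, value)
print(model.coef_, model.intercept_)
```

The target `y` can stay one-dimensional; only the feature matrix has to be 2-D, with one row per sample and one column per feature.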
​
jk9hmnmh  #2

x1_train and x1_test are pandas Series objects, whereas the reshape() method applies to NumPy arrays.
Do the following instead:

x1_train = x1_train.to_numpy().reshape(-1, 1)
x1_test = x1_test.to_numpy().reshape(-1, 1)
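
Another option that avoids reshape altogether is to select the column with double brackets, which returns a one-column DataFrame (already 2-D) instead of a Series. The sketch below uses a small made-up DataFrame. The values and the `households` column are assumptions for illustration only, and `random_state=0` is added just to make the split repeatable.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the housing data (values are made up)
df = pd.DataFrame({
    "median_income": [1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0],
    "households": [120, 340, 210, 560, 430, 310, 275, 198, 402, 365],
})
df["median_house_value"] = 40000.0 * df["median_income"] + 50000.0

x = df[["median_income", "households"]]
y = df["median_house_value"]
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.20, random_state=0
)

# Double brackets keep a DataFrame with one column, so the shape is (n, 1)
x1_train = x_train[["median_income"]]
x1_test = x_test[["median_income"]]
print(x1_train.shape, x1_test.shape)  # (8, 1) (2, 1)

# Fit on the single feature and predict for the test rows
model = LinearRegression().fit(x1_train, y_train)
pred = model.predict(x1_test)
print(pred.shape)  # (2,)
```

The predictions in `pred` can then be plotted against `x1_test` alongside the training points to check the fit, as the exercise asks.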
