This question already has answers here:
Given parallel lists, how can I sort one while permuting (rearranging) the other in the same way? (16 answers)
data points connected in wrong order in line graph (1 answer)
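I built a model that predicts electricity demand from temperature using sklearn's polynomial regression. However, after fitting it, when I plot the result with matplotlib.pyplot, the fitted line comes out as a tangle of zigzagging segments rather than one curve.
I want a model with only a single curve. What is wrong, and what should I do? Here is the full code.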
import pandas as pd
dt = pd.read_csv("complete_dataset.csv")
dt.isnull().sum()
dt = dt.dropna()
dt.head()
dt = dt[["demand", "solar_exposure", "max_temperature","rainfall"]]
dt.head()
### Correlation between sun exposure and electricity demand --> weak
x = dt.iloc[:, 0].values
y = dt.iloc[:, 1].values
import matplotlib.pyplot as plt
plt.scatter(x, y, s = 2, color = "black")
plt.xlabel("demand")
plt.ylabel("solar exposure")
### Correlation between maximum temperature and electricity demand --> Demand tends to increase as temperature moves toward either extreme (very low or very high).
y = dt.iloc[:, 2]
plt.scatter(x, y, s = 1, color = "black")
plt.xlabel("demand")
plt.ylabel("max temperature")
### There appears to be no correlation between rainfall and electricity demand.
y = dt.iloc[:, 3].values
plt.scatter(x, y, s = 2, color = "black")
plt.xlabel("demand")
plt.ylabel("rainfall")
dt = dt[["demand", "max_temperature"]]
dt.rename(columns={'max_temperature': 'temp'}, inplace=True)
## model
x = dt["demand"].values.reshape(-1, 1)
y = dt["temp"].values
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
poly_reg = PolynomialFeatures(degree = 2)
X_poly = poly_reg.fit_transform(x)
X_poly[:5]
poly_reg.get_feature_names_out()
lin_reg = LinearRegression()
lin_reg.fit(X_poly,y)
plt.scatter(x, y, color = "black", s = 2)
plt.plot(x, lin_reg.predict(poly_reg.fit_transform(x)), color = 'red')
plt.xlabel("demand")
plt.ylabel("max temperature")
plt.show()
### Problem: The lines come out strangely because they are split sideways.
### Solution: Should we change the x-axis and y-axis to make a V-shape?
x = dt["demand"].values.reshape(-1, 1)
y = dt["temp"].values.reshape(-1, 1)
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
poly_reg = PolynomialFeatures(degree = 2)
y_poly = poly_reg.fit_transform(y)
y_poly[:5]
poly_reg.get_feature_names_out()
lin_reg = LinearRegression()
lin_reg.fit(y_poly,x)
plt.scatter(y, x, color = "black", s = 2)
plt.plot(y, lin_reg.predict(poly_reg.fit_transform(y)), color = 'red')
plt.ylabel("demand")
plt.xlabel("max temperature")
plt.show()
1 Answer
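This happens because plot() assumes the data points are already in order. A plotted curve is really just a sequence of points joined by line segments; since your points are not sorted by x, matplotlib joins them in the order they appear in the array, which produces the tangle you see.
You just need to sort the data points after fitting the model: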
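A minimal sketch of that sorting step, assuming NumPy is available. `sort_for_plot` is a hypothetical helper name and the arrays are synthetic stand-ins for the question's data; this is an illustration, not the answerer's original code.

```python
import numpy as np

def sort_for_plot(x, y_pred):
    """Return x and y_pred reordered so that x is ascending (hypothetical helper)."""
    x_flat = np.asarray(x).ravel()       # x has shape (n, 1) in the question
    y_flat = np.asarray(y_pred).ravel()
    order = np.argsort(x_flat)           # indices that would sort x
    return x_flat[order], y_flat[order]  # apply the same permutation to both

# Synthetic, unsorted stand-in data (not taken from complete_dataset.csv).
x = np.array([[3.0], [1.0], [2.0], [0.0]])
y_pred = (x.ravel() - 1.5) ** 2          # stands in for the model's predictions

x_sorted, y_sorted = sort_for_plot(x, y_pred)
```

With the question's variables, the red line could then be drawn with something like `xs, ys = sort_for_plot(x, lin_reg.predict(X_poly))` followed by `plt.plot(xs, ys, color='red')`. The scatter plot needs no change, because unconnected points look the same in any order.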
This will display a single smooth curve, as expected.