How can I fix the error I'm getting when using Pandas in Jupyter?

pod7payv posted on 2023-06-04 in Other


I get an error when subsetting a DataFrame in pandas. Here is my code:

import pandas as pd
import matplotlib.pyplot as plt
import scipy

from gluonts.dataset.pandas import PandasDataset
from gluonts.dataset.split import split
from gluonts.torch import DeepAREstimator

# Load data from a CSV file into a PandasDataset
df = pd.read_csv(
    "city_temperature.csv"
)
df.head()
df2 = df[df["City"]=="Algiers"]
dataset = PandasDataset(df2)

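I am trying to subset a global city temperature dataset down to the rows where the city name is "Algiers". City is a column in the dataset, and I am trying to use the PandasDataset() function. I don't know how to use this function: I searched the help files but couldn't find it. After running my code, I get this error: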

/tmp/ipykernel_712/289762879.py:9: DtypeWarning: Columns (2) have mixed types. Specify dtype option on import or set low_memory=False.
  df = pd.read_csv(
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[5], line 15
     13 df.dropna()
     14 df2 = df[df["City"]=="Algiers"]
---> 15 dataset = PandasDataset(df2)
     17 # Split the data for training and testing
     18 #training_data, test_gen = split(df, offset=-36)
     19 #test_data = test_gen.generate_instances(prediction_length=12, windows=3)
   (...)
     31 #  forecast.plot()
     32 #plt.legend(["True values"], loc="upper left", fontsize="xx-large")

File <string>:12, in __init__(self, dataframes, target, feat_dynamic_real, past_feat_dynamic_real, timestamp, freq, static_features, future_length, unchecked, assume_sorted, dtype)

File /opt/conda/lib/python3.10/site-packages/gluonts/dataset/pandas.py:119, in PandasDataset.__post_init__(self, dataframes, static_features)
    114 if self.freq is None:
    115     assert (
    116         self.timestamp is None
    117     ), "You need to provide `freq` along with `timestamp`"
--> 119     self.freq = infer_freq(first(pairs)[1].index)
    121 static_features = Maybe(static_features).unwrap_or_else(pd.DataFrame)
    123 object_columns = static_features.select_dtypes(
    124     "object"
    125 ).columns.tolist()

File /opt/conda/lib/python3.10/site-packages/gluonts/dataset/pandas.py:319, in infer_freq(index)
    316 if isinstance(index, pd.PeriodIndex):
    317     return index.freqstr
--> 319 freq = pd.infer_freq(index)
    320 # pandas likes to infer the `start of x` frequency, however when doing
    321 # df.to_period("<x>S"), it fails, so we avoid using it. It's enough to
    322 # remove the trailing S, e.g `MS` -> `M
    323 if len(freq) > 1 and freq.endswith("S"):

File /opt/conda/lib/python3.10/site-packages/pandas/tseries/frequencies.py:193, in infer_freq(index, warn)
    191 if isinstance(index, Index) and not isinstance(index, DatetimeIndex):
    192     if isinstance(index, (Int64Index, Float64Index)):
--> 193         raise TypeError(
    194             f"cannot infer freq from a non-convertible index type {type(index)}"
    195         )
    196     index = index._values
    198 if not isinstance(index, DatetimeIndex):

TypeError: cannot infer freq from a non-convertible index type <class 'pandas.core.indexes.numeric.Int64Index'>

Can anyone help me?
I tried searching Google for this error, but I couldn't find an answer.

pandas

Source: https://stackoverflow.com/questions/76397888/how-can-i-fix-the-error-im-getting-when-using-pandas-in-jupyter

1 answer


relj7zay1#

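I think the DataFrame's index is causing the problem, because it is of type Int64Index.
To fix this, you can try resetting the DataFrame's index before passing it to the PandasDataset constructor. Try something like this: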

# Subset the DataFrame for the desired city
df2 = df[df["City"] == "Algiers"]

# Reset the index of the DataFrame
df2.reset_index(drop=True, inplace=True)

# Create the PandasDataset
dataset = PandasDataset(df2)

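Resetting the index with reset_index(drop=True) discards the existing index and creates a new integer-based index starting from 0. Keep in mind that the new index is still integer-based rather than date-based, so if the TypeError persists, the index probably needs to be a datetime index (or you can pass `timestamp` and `freq`, both of which appear in the constructor signature in the traceback). Let me know if it works!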
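Since the traceback fails inside `pd.infer_freq`, one way to check the index independently of GluonTS is to build a datetime index and call `pd.infer_freq` on it directly. The following is a minimal sketch, not a confirmed fix for the asker's file: the `Year`, `Month`, `Day`, and `AvgTemperature` column names and the temperature values are assumptions made up for illustration, since the question only confirms the `City` column.

```python
import pandas as pd

# Synthetic stand-in for city_temperature.csv. Only "City" is confirmed by
# the question; the Year/Month/Day/AvgTemperature columns and all values
# here are assumptions for illustration.
df = pd.DataFrame(
    {
        "City": ["Algiers"] * 5 + ["Cairo"] * 5,
        "Year": [1995] * 10,
        "Month": [1] * 10,
        "Day": [1, 2, 3, 4, 5] * 2,
        "AvgTemperature": [50.0, 51.0, 52.0, 53.0, 54.0,
                           60.0, 61.0, 62.0, 63.0, 64.0],
    }
)

# Take an explicit copy so later column assignments do not touch `df`
df2 = df[df["City"] == "Algiers"].copy()

# Assemble one timestamp per row from the separate date columns
date_parts = df2[["Year", "Month", "Day"]].rename(columns=str.lower)
df2["Date"] = pd.to_datetime(date_parts)

# Use the timestamps as the index, sorted chronologically
df2 = df2.set_index("Date").sort_index()

# The same call that raised TypeError in the traceback
inferred = pd.infer_freq(df2.index)
print(type(df2.index).__name__, inferred)
```

If the real data has gaps or duplicate dates, `pd.infer_freq` may return `None`; in that case passing `freq` explicitly is likely needed. With a datetime index in place, a call such as `PandasDataset(df2, target="AvgTemperature")` would be the next thing to try, where the target column name is again an assumption.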

Answered 2023-06-04
