gold = pd.read_csv('gold.csv')
gold['Date'] = gold['Date'].apply(pd.to_datetime, utc=True)
gold
Date Price
0 1950-01-01 00:00:00+00:00 34.730
1 1950-02-01 00:00:00+00:00 34.730
2 1950-03-01 00:00:00+00:00 34.730
3 1950-04-01 00:00:00+00:00 34.730
4 1950-05-01 00:00:00+00:00 34.730
... ... ...
842 2020-03-01 00:00:00+00:00 1593.764
843 2020-04-01 00:00:00+00:00 1680.030
844 2020-05-01 00:00:00+00:00 1715.697
845 2020-06-01 00:00:00+00:00 1734.032
846 2020-07-01 00:00:00+00:00 1840.807
gold.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 847 entries, 0 to 846
Data columns (total 2 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Date 847 non-null datetime64[ns, UTC]
1 Price 847 non-null float64
dtypes: datetime64[ns, UTC](1), float64(1)
memory usage: 13.4 KB
gold = gold.set_index('Date')
gold.info()
<class 'pandas.core.frame.DataFrame'>
DatetimeIndex: 847 entries, 1950-01-01 00:00:00+00:00 to 2020-07-01 00:00:00+00:00
Data columns (total 1 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Price 847 non-null float64
dtypes: float64(1)
memory usage: 13.2 KB
gold
Date Price
1950-01-01 00:00:00+00:00 34.730
1950-02-01 00:00:00+00:00 34.730
1950-03-01 00:00:00+00:00 34.730
1950-04-01 00:00:00+00:00 34.730
1950-05-01 00:00:00+00:00 34.730
... ...
2020-03-01 00:00:00+00:00 1593.764
2020-04-01 00:00:00+00:00 1680.030
2020-05-01 00:00:00+00:00 1715.697
2020-06-01 00:00:00+00:00 1734.032
2020-07-01 00:00:00+00:00 1840.807
years = pd.date_range('01/01/1950', '01/01/2020', freq='YS')
years
DatetimeIndex(['1950-01-01', '1951-01-01', '1952-01-01', '1953-01-01',
'1954-01-01', '1955-01-01', '1956-01-01', '1957-01-01',
'1958-01-01', '1959-01-01', '1960-01-01', '1961-01-01',
'1962-01-01', '1963-01-01', '1964-01-01', '1965-01-01',
'1966-01-01', '1967-01-01', '1968-01-01', '1969-01-01',
'1970-01-01', '1971-01-01', '1972-01-01', '1973-01-01',
'1974-01-01', '1975-01-01', '1976-01-01', '1977-01-01',
'1978-01-01', '1979-01-01', '1980-01-01', '1981-01-01',
'1982-01-01', '1983-01-01', '1984-01-01', '1985-01-01',
'1986-01-01', '1987-01-01', '1988-01-01', '1989-01-01',
'1990-01-01', '1991-01-01', '1992-01-01', '1993-01-01',
'1994-01-01', '1995-01-01', '1996-01-01', '1997-01-01',
'1998-01-01', '1999-01-01', '2000-01-01', '2001-01-01',
'2002-01-01', '2003-01-01', '2004-01-01', '2005-01-01',
'2006-01-01', '2007-01-01', '2008-01-01', '2009-01-01',
'2010-01-01', '2011-01-01', '2012-01-01', '2013-01-01',
'2014-01-01', '2015-01-01', '2016-01-01', '2017-01-01',
'2018-01-01', '2019-01-01', '2020-01-01'],
dtype='datetime64[ns]', freq='AS-JAN')
reindexed = gold.reindex(years)
reindexed
Price
1950-01-01 NaN
1951-01-01 NaN
1952-01-01 NaN
1953-01-01 NaN
1954-01-01 NaN
... ...
2016-01-01 NaN
2017-01-01 NaN
2018-01-01 NaN
2019-01-01 NaN
2020-01-01 NaN
当我绘制重新索引后,价格没有重新索引后的值。我试图在1950年至2020年的每一年的第一天重新索引黄金,然后将它们绘制在pandas生成的线图上。我已经看过Stack Overflow的各种帖子,但我仍然不确定我做错了什么。
2条答案
按热度按时间j0pj023g1#
您需要使用UTC本地化索引,使用
tz_localize
进行转换:输出:
您也可以尝试
resample
获取每年的first
值:输出:
rqdpfwrv2#
mozway的回答是正确的,问题出在UTC参数上。我最初设置它的方式需要使用UTC本地化索引。
我想补充的是,mozway的回答帮助我弄清楚了,我也可以删除UTC参数,因为我不需要这个数据的时区感知的UTC本地化时间戳。
第二行代码可能是:
而不是:
两种解决方案都在发挥应有的作用。