Pandas/matplotlib newbie: aggregating time series data with differing indices?

hpxqektj  posted on 2023-03-09 in Other

I'm getting to grips with pandas/matplotlib, and looking to aggregate multiple data series with (marginally) differing indices. For example:

Series 1

seconds_since_start  Value
0.0                  35
0.8                  41
1.1                  48

Series 2

seconds_since_start  Value
0.0                  31
0.7                  37
1.1                  41

At present, I'm plotting both series as 2 separate line graphs. Ultimately, I'm looking to create a single line that shows, for any given x value, the mean y of both series. The values between specified points can be assumed to be linear.
I assume this is a common task, but the ways I'm trying involve a lot more complexity than I suspect is necessary.
In short: is there a straightforward way to plot the mean of series that have differing index values?
Notes:

  • While the only immediate need is graphing, ideally the aggregation would be calculated in pandas, not matplotlib
  • The solution will aggregate >100 different series, not just 2
c6ubokkw  #1

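One solution is to take the union of the series' indices and interpolate any missing values. The series can then be concatenated and the mean computed at each index value. The code below assumes the series are in a list named series.
First, get the union of the indices: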

from functools import reduce

import numpy as np

# Sorted union of every index value that appears in any series
new_index = reduce(np.union1d, [s.index.values for s in series])

In the example, new_index will be array([0. , 0.7, 0.8, 1.1]).
Now reindex the series and concat them together:
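To make this step concrete, here is a minimal sketch that builds the series list from the two example series in the question and computes the union index. The names s1 and s2 are illustrative and do not appear in the original post.

```python
from functools import reduce

import numpy as np
import pandas as pd

# Example data from the question; s1 and s2 are illustrative names.
s1 = pd.Series([35, 41, 48],
               index=pd.Index([0.0, 0.8, 1.1], name='seconds_since_start'),
               name='Value')
s2 = pd.Series([31, 37, 41],
               index=pd.Index([0.0, 0.7, 1.1], name='seconds_since_start'),
               name='Value')
series = [s1, s2]

# np.union1d returns the sorted, de-duplicated union of two arrays;
# reduce applies it pairwise across the whole list.
new_index = reduce(np.union1d, [s.index.values for s in series])
print(new_index)
```

The same reduce call works unchanged for a list of 100+ series, since it only ever combines two index arrays at a time.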

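import pandas as pd

df = pd.concat([s.reindex(new_index).rename(f'Value_{i}') for i, s in enumerate(series)], axis=1)
# Note: 'linear' ignores the index and treats rows as equally spaced; method='index' weights by the actual index values
df = df.interpolate('linear')
df['Avg'] = df.mean(axis=1)  # row-wise mean across all series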

Result:

                     Value_0  Value_1   Avg
seconds_since_start                        
0.0                     35.0     31.0  33.0
0.7                     38.0     37.0  37.5
0.8                     41.0     39.0  40.0
1.1                     48.0     41.0  44.5
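In the result above, Value_0 at 0.7 is 38.0, the midpoint of 35 and 41, because interpolate('linear') treats the rows as equally spaced. If the values should instead be interpolated according to their actual position along seconds_since_start (which is how the question describes the data), one option is method='index'. The following is a sketch under that assumption, using the example data from the question; s1 and s2 are illustrative names.

```python
import numpy as np
import pandas as pd

# Example data from the question; s1 and s2 are illustrative names.
s1 = pd.Series([35.0, 41.0, 48.0], index=[0.0, 0.8, 1.1], name='Value')
s2 = pd.Series([31.0, 37.0, 41.0], index=[0.0, 0.7, 1.1], name='Value')
series = [s1, s2]

# Sorted union of all index values across the series.
new_index = np.unique(np.concatenate([s.index.values for s in series]))

# Align every series on the shared index; missing points become NaN.
df = pd.concat(
    [s.reindex(new_index).rename(f'Value_{i}') for i, s in enumerate(series)],
    axis=1,
)
df.index.name = 'seconds_since_start'

# method='index' uses the numeric index values as the x coordinates,
# so gaps are filled in proportion to the distance between points.
df = df.interpolate(method='index')
df['Avg'] = df.mean(axis=1)
print(df)
```

With this variant, Value_0 at 0.7 comes out at about 40.25 (seven eighths of the way from 35 to 41) rather than 38.0, so the averages at the interpolated rows differ from the table above.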
7vhp5slm  #2

You can aggregate the 100+ series with pd.concat, then group by seconds_since_start before computing the mean:

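import pandas as pd

dfs = [df1, df2]  # all your data here; each frame has seconds_since_start and Value columns
# Stack the frames, then average Value over rows sharing the same seconds_since_start (no interpolation)
df = pd.concat(dfs, axis=0).groupby('seconds_since_start', as_index=False)['Value'].mean()
df.plot(x='seconds_since_start', y='Value', marker='o')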

Output:

>>> df
   seconds_since_start  Value
0                  0.0   33.0
1                  0.7   37.0
2                  0.8   41.0
3                  1.1   44.5
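In the output above, the rows at 0.7 and 0.8 each come from a single series, because groupby only averages rows that share exactly the same seconds_since_start. If every series should contribute at every x value, one possible approach is to resample each frame onto a common grid with np.interp before averaging. This is a sketch rather than part of the original answer: the helper resample and the choice of grid are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Example data from the question, as DataFrames with the column names used above.
df1 = pd.DataFrame({'seconds_since_start': [0.0, 0.8, 1.1], 'Value': [35, 41, 48]})
df2 = pd.DataFrame({'seconds_since_start': [0.0, 0.7, 1.1], 'Value': [31, 37, 41]})
dfs = [df1, df2]

# Assumed common grid: every x value seen in any frame.
grid = np.unique(np.concatenate([d['seconds_since_start'].to_numpy() for d in dfs]))


def resample(frame, grid):
    """Hypothetical helper: linearly interpolate one frame onto the grid."""
    frame = frame.sort_values('seconds_since_start')  # np.interp needs increasing x
    return np.interp(grid,
                     frame['seconds_since_start'].to_numpy(),
                     frame['Value'].to_numpy())


# One row per series, one column per grid point; average down the rows.
resampled = np.vstack([resample(d, grid) for d in dfs])
mean_df = pd.DataFrame({'seconds_since_start': grid,
                        'Value': resampled.mean(axis=0)})
print(mean_df)
```

Note that np.interp holds the first and last values constant outside a series' own x range, which may or may not be appropriate when series start or end at different times. The resulting mean_df can be plotted with the same df.plot call as above.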
