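I need to iterate over a stream of pandas.Series objects (I think the object type is irrelevant here). Optionally, an arbitrary function is applied to each series, and, here is the clincher, this arbitrary function can be a generator function that yields two (or more) values. I had high hopes for the more_itertools.flatten function, but it does not help, because it breaks when a regular function, or no function at all, is mapped over the generator. Is there a way to turn this iterable into a simple generator of series objects? Here is a simple example that illustrates the problem: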
In [1]: from more_itertools import flatten
   ...:
   ...: def generator():
   ...:     for i in range(10):
   ...:         yield i
   ...:
   ...: def postprocess1(i):
   ...:     yield 2*i
   ...:
   ...: def postprocess1_return(i):
   ...:     return 2*i
   ...:
   ...: def postprocess2(i):
   ...:     yield from (i, 2*i)
   ...:
In [2]: list(generator())
...:
Out[2]: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
In [3]: list(map(postprocess1, generator()))
...:
Out[3]:
[<generator object postprocess1 at 0x7f5a402916d0>,
<generator object postprocess1 at 0x7f5a40291e40>,
<generator object postprocess1 at 0x7f5a40291f20>,
<generator object postprocess1 at 0x7f5a40291dd0>,
<generator object postprocess1 at 0x7f5a40291eb0>,
<generator object postprocess1 at 0x7f5a40209040>,
<generator object postprocess1 at 0x7f5a40209190>,
<generator object postprocess1 at 0x7f5a402092e0>,
<generator object postprocess1 at 0x7f5a402090b0>,
<generator object postprocess1 at 0x7f5a40209350>]
In [4]: list(map(postprocess1_return, generator()))
...:
Out[4]: [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
In [5]: list(map(postprocess2, generator()))
...:
Out[5]:
[<generator object postprocess2 at 0x7f5a403ad430>,
<generator object postprocess2 at 0x7f5a40209580>,
<generator object postprocess2 at 0x7f5a402097b0>,
<generator object postprocess2 at 0x7f5a40209510>,
<generator object postprocess2 at 0x7f5a40209430>,
<generator object postprocess2 at 0x7f5a40209740>,
<generator object postprocess2 at 0x7f5a402096d0>,
<generator object postprocess2 at 0x7f5a40209820>,
<generator object postprocess2 at 0x7f5a40209660>,
<generator object postprocess2 at 0x7f5a40209890>]
In [6]: list(flatten(generator()))
...:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-6-7cd770547fa4> in <module>
----> 1 list(flatten(generator()))
TypeError: 'int' object is not iterable
In [7]: list(flatten(map(postprocess1, generator())))
...:
Out[7]: [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
In [8]: list(flatten(map(postprocess1_return, generator())))
...:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-8-35ce9aef7285> in <module>
----> 1 list(flatten(map(postprocess1_return, generator())))
TypeError: 'int' object is not iterable
In [9]: list(flatten(map(postprocess2, generator())))
Out[9]: [0, 0, 1, 2, 2, 4, 3, 6, 4, 8, 5, 10, 6, 12, 7, 14, 8, 16, 9, 18]
1 Answer
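I figured it out:
more_itertools.collapse(generator, base_type=pd.Series)
does the trick! Apparently the type of the base values actually matters: without
base_type=pd.Series
all the elements of each series were yielded one after another in my actual code, which is not what I wanted.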
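To make the role of base_type more concrete, here is a minimal standard-library sketch of the idea. It is not the more_itertools implementation: collapse_sketch, double, and the use of a tuple as a stand-in for pd.Series are all assumptions made for illustration. The point is only that an iterable item gets flattened element by element unless its type is declared atomic.

```python
def collapse_sketch(iterable, base_type=None):
    """Hypothetical, simplified stand-in for more_itertools.collapse.

    Recursively flattens nested iterables. str/bytes and instances of
    `base_type` are treated as atomic and yielded whole.
    """
    def walk(node):
        if isinstance(node, (str, bytes)) or (
            base_type is not None and isinstance(node, base_type)
        ):
            yield node          # atomic: do not descend into it
            return
        try:
            children = iter(node)
        except TypeError:
            yield node          # not iterable at all: a plain value
            return
        for child in children:
            yield from walk(child)

    yield from walk(iterable)


def generator():
    # A tuple stands in for a pd.Series: it is iterable, but we want it whole.
    for i in range(3):
        yield (i, i + 1)


def double(s):
    # Regular function: returns one value per input.
    return tuple(2 * x for x in s)


def postprocess2(s):
    # Generator function: yields two values per input.
    yield s
    yield tuple(2 * x for x in s)


no_function = list(collapse_sketch(generator(), base_type=tuple))
regular = list(collapse_sketch(map(double, generator()), base_type=tuple))
two_values = list(collapse_sketch(map(postprocess2, generator()), base_type=tuple))

# Without base_type the tuples themselves are flattened into their elements.
without_base_type = list(collapse_sketch(generator()))
```

With base_type=tuple, all three cases (no function, a regular function, a generator function) produce a flat list of whole tuples, while without_base_type contains the individual integers. With the real library, the analogous call for the example in the question would presumably be collapse(map(postprocess2, generator()), base_type=pd.Series), passing the iterable itself rather than the generator function.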