LINQ: forward-only, one-pass, stateful IEnumerable<T> to IEnumerable<IEnumerable<T>> without using temporary storage?

Posted by gojuced7 on 2022-12-06 in Other


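I want to take a streaming IEnumerable of values such as the following. (The tuples are shown for illustration only. The real application will stream DataRecords from a DataReader.)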

var tuples = new (int, int)[]
{
    (0, 0), (0, 1), (0, 2), (0, 3), (1, 0), (1, 1), (2, 0), (2, 1), (2, 2),
};

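I want to maintain state and watch for every change in the left field. Each run of items that share a left field should be returned as one IEnumerable in a sequence of IEnumerables (the data will be pre-sorted, so I don't have to worry about sorting on my side).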

(0,0) (0,1) (0,2) (0,3)
(1,0) (1,1)
(2,0) (2,1) (2,2)

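This may not be possible, but I would like to do it without creating any per-group temporary lists that hold records in RAM, because each group will be fairly large. In other words, each group should somehow pull its items directly from the original IEnumerable, so that the whole thing is strictly forward-only and one-pass.
An inner TakeWhile *looks* like the way to go, but it always restarts the iteration over tg from zero.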

private int currentGroup;

public IEnumerator<IEnumerable<Tuple<int, int>>> GetEnumerator()
{
    var tg = TupleGenerator();
    foreach (Tuple<int, int> item in tg)
    {
        currentGroup = item.Item1;
        yield return tg.TakeWhile((x) => x.Item1 == currentGroup);
    }
}

static IEnumerable<Tuple<int, int>> TupleGenerator()
{
    for (int i = 0; i < 10; i++)
    {
        for (int j = 0; j < 10; j++)
        {
            yield return new Tuple<int, int>(i, j);
        }
    }
}
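
The restart described above can be reproduced in isolation. The following is a minimal sketch, not part of the original post; `RestartDemo`, `Numbers`, and `Starts` are made-up names for illustration. It counts how often the body of an iterator method begins executing: the outer foreach uses one enumerator, and every call to TakeWhile asks the same IEnumerable for a fresh enumerator, which starts again at the first element.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class RestartDemo
{
    // Hypothetical counter: how many times the generator body has started running.
    public static int Starts;

    public static IEnumerable<int> Numbers()
    {
        Starts++;
        for (int i = 0; i < 3; i++)
            yield return i;
    }

    static void Main()
    {
        var source = Numbers();
        foreach (var item in source) // first enumerator over source
        {
            // TakeWhile calls source.GetEnumerator() again, so this is a second,
            // independent enumerator that begins at the first element.
            int first = source.TakeWhile(x => x < 2).First();
            Console.WriteLine($"outer={item}, inner starts at {first}");
        }
        // One start for the outer loop plus one per inner TakeWhile.
        Console.WriteLine($"generator body started {Starts} times");
    }
}
```

With three outer items the counter ends at 4, and the inner sequence reports 0 as its first element every time, regardless of where the outer loop currently is.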

linq

Source: https://stackoverflow.com/questions/74367513/forward-only-one-pass-stateful-ienumerablet-to-ienumerableienumerablet-wit

2 Answers


tyg4sfes1#

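So while it is possible to avoid storing a whole group's data in memory, and to stream each item in constant memory only as it is requested, there are downsides. It is important to understand why buffering is the typical solution to this kind of problem. First, the code that avoids buffering is more complex. Second, the consumer must iterate each inner IEnumerable to completion before requesting the next group, and no inner IEnumerable may be iterated more than once. If you violate these rules (and the code does not explicitly check for them), things silently go wrong: items that belong in one group end up spread across several groups, you get the wrong number of groups, and so on. Given how easy these mistakes are to make, and how confusing the consequences are, it is well worth checking for them explicitly and throwing an exception, so that the consumer at least knows the usage is wrong and needs fixing.
As for the actual code, you do need to do a lot by hand so that you control the enumerator directly and decide exactly when the next item is fetched, but at its core the code *mostly* boils down to keeping track of the previous item and continuing to yield items into the same group for as long as the current item "matches" it.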

public static IEnumerable<IEnumerable<T>> GroupWhile<T>(
    this IEnumerable<T> source,
    Func<T, T, bool> predicate)
{
    using (var iterator = source.GetEnumerator())
    {
        bool previousGroupFinished = true;
        bool sourceExhausted = !iterator.MoveNext();
        while (!sourceExhausted)
        {
            // Guard: the consumer must drain each group before asking for the next one.
            if (!previousGroupFinished)
                throw new InvalidOperationException("It is not valid to request the next group until the previous group has run to completion");
            previousGroupFinished = false;
            bool startedIteratingCurrentGroup = false;

            yield return NextGroup();

            // Streams items straight from the shared iterator; nothing is buffered.
            IEnumerable<T> NextGroup()
            {
                // Guard: a group can only be enumerated once.
                if (startedIteratingCurrentGroup)
                    throw new InvalidOperationException("This sequence doesn't support being iterated multiple times.");
                startedIteratingCurrentGroup = true;

                T previous;
                do
                {
                    yield return iterator.Current;
                    previous = iterator.Current;
                    sourceExhausted = !iterator.MoveNext();
                }
                while (!sourceExhausted && predicate(previous, iterator.Current));
                previousGroupFinished = true;
            }
        }
    }
}

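Using it on your example is simple: items are grouped for as long as the first element of each pair stays equal, but you can use any grouping condition you like.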

var data = new[] { (0, 0), (0, 1), (0, 2), (0, 3), (1, 0), (1, 1), (2, 0), (2, 1), (2, 2) };
var grouped = data.GroupWhile((previous, current) => previous.Item1 == current.Item1);
foreach (var group in grouped)
{
    Console.WriteLine(String.Join(", ", group));
}
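
The two consumer rules described earlier (finish each group before requesting the next, and never iterate a group twice) are easier to see when they are broken on purpose. The sketch below is illustrative only: it repeats a condensed copy of GroupWhile with shortened exception messages so that it compiles on its own, and `GroupWhileDemo` and `Throws` are hypothetical names that are not part of the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class GroupWhileDemo
{
    // Condensed copy of the GroupWhile method shown above, guards included.
    public static IEnumerable<IEnumerable<T>> GroupWhile<T>(
        this IEnumerable<T> source, Func<T, T, bool> predicate)
    {
        using (var iterator = source.GetEnumerator())
        {
            bool previousGroupFinished = true;
            bool sourceExhausted = !iterator.MoveNext();
            while (!sourceExhausted)
            {
                if (!previousGroupFinished)
                    throw new InvalidOperationException("Previous group was not run to completion.");
                previousGroupFinished = false;
                bool started = false;

                yield return NextGroup();

                IEnumerable<T> NextGroup()
                {
                    if (started)
                        throw new InvalidOperationException("Group iterated more than once.");
                    started = true;
                    T previous;
                    do
                    {
                        yield return iterator.Current;
                        previous = iterator.Current;
                        sourceExhausted = !iterator.MoveNext();
                    }
                    while (!sourceExhausted && predicate(previous, iterator.Current));
                    previousGroupFinished = true;
                }
            }
        }
    }

    // Hypothetical helper: true if the action throws InvalidOperationException.
    public static bool Throws(Action action)
    {
        try { action(); return false; }
        catch (InvalidOperationException) { return true; }
    }

    static void Main()
    {
        var data = new[] { (0, 0), (0, 1), (1, 0) };
        Func<(int, int), (int, int), bool> sameLeft = (p, c) => p.Item1 == c.Item1;

        // Rule 1 broken: ToList() on the outer sequence asks for the second
        // group before the first one has been read.
        Console.WriteLine(Throws(() => data.GroupWhile(sameLeft).ToList()));

        // Rule 2 broken: Count() and then ToList() iterate the same group twice.
        Console.WriteLine(Throws(() =>
        {
            foreach (var g in data.GroupWhile(sameLeft)) { g.Count(); g.ToList(); }
        }));

        // Correct use: each group is consumed fully, exactly once.
        Console.WriteLine(Throws(() =>
        {
            foreach (var g in data.GroupWhile(sameLeft)) g.ToList();
        }));
    }
}
```

This prints True, True, False. Note that the correct case still calls ToList() on each group, which buffers that group; it is used here only to drain the group, and a real consumer would process items as they stream past.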

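Rather than grouping on a predicate, it is often convenient to group on a key. In your example the key is just the first item in the tuple. If the key is expensive to compute, or cannot be computed more than once, you can change the grouping mechanism to use a key selector instead of a predicate, and store the previous key instead of the previous item. The resulting code is very similar, but subtly different: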

public static IEnumerable<IEnumerable<TSource>> GroupAdjacent<TSource, TKey>(
    this IEnumerable<TSource> source,
    Func<TSource, TKey> keySelector,
    IEqualityComparer<TKey> keyComparer = null)
{
    keyComparer = keyComparer ?? EqualityComparer<TKey>.Default;

    using (var iterator = source.GetEnumerator())
    {
        bool previousGroupFinished = true;
        bool sourceExhausted = !iterator.MoveNext();
        // Only read Current when MoveNext succeeded; it is undefined for an empty source.
        TKey nextKey = sourceExhausted ? default(TKey) : keySelector(iterator.Current);
        while (!sourceExhausted)
        {
            // Guard: the consumer must drain each group before asking for the next one.
            if (!previousGroupFinished)
                throw new InvalidOperationException("It is not valid to request the next group until the previous group has run to completion");
            previousGroupFinished = false;
            bool startedIteratingCurrentGroup = false;

            yield return NextGroup();

            IEnumerable<TSource> NextGroup()
            {
                // Guard: a group can only be enumerated once.
                if (startedIteratingCurrentGroup)
                    throw new InvalidOperationException("This sequence doesn't support being iterated multiple times.");
                startedIteratingCurrentGroup = true;

                TKey previousKey;
                do
                {
                    yield return iterator.Current;
                    sourceExhausted = !iterator.MoveNext();
                    previousKey = nextKey;
                    // The key of each item is computed exactly once.
                    if (!sourceExhausted)
                        nextKey = keySelector(iterator.Current);
                }
                while (!sourceExhausted && keyComparer.Equals(previousKey, nextKey));
                previousGroupFinished = true;
            }
        }
    }
}

Which allows you to write:

var data = new[] { (0, 0), (0, 1), (0, 2), (0, 3), (1, 0), (1, 1), (2, 0), (2, 1), (2, 2) };
var grouped = data.GroupAdjacent(pair => pair.Item1);
foreach (var group in grouped)
{
    Console.WriteLine(String.Join(", ", group));
}
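
The question mentions that the real data comes from a DataReader, and that is a case where storing the key rather than the previous item matters. If the records are streamed by yielding the reader object itself, "previous" and "current" are the same object, so a predicate comparing the two would always see equal values. The sketch below is one possible arrangement under stated assumptions: a DataTable stands in for a real database reader, the misuse guards are left out for brevity (so a consumer that skips a group would loop forever), and `ReaderGroupingSketch`, `GroupAdjacentSketch`, `Records`, `SampleReader`, and `FormatGroups` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

static class ReaderGroupingSketch
{
    // Condensed key-based grouping in the spirit of GroupAdjacent above (no guards).
    public static IEnumerable<IEnumerable<T>> GroupAdjacentSketch<T, TKey>(
        this IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        var comparer = EqualityComparer<TKey>.Default;
        using (var iterator = source.GetEnumerator())
        {
            bool more = iterator.MoveNext();
            TKey nextKey = more ? keySelector(iterator.Current) : default(TKey);
            while (more)
            {
                yield return NextGroup();

                IEnumerable<T> NextGroup()
                {
                    TKey groupKey = nextKey; // key of the group being streamed
                    do
                    {
                        yield return iterator.Current;
                        more = iterator.MoveNext();
                        if (more) nextKey = keySelector(iterator.Current);
                    }
                    while (more && comparer.Equals(groupKey, nextKey));
                }
            }
        }
    }

    // Yields the reader itself for each row, so no row is copied.
    public static IEnumerable<IDataRecord> Records(IDataReader reader)
    {
        while (reader.Read())
            yield return reader;
    }

    // In-memory stand-in for a real database reader (assumption for the demo).
    public static IDataReader SampleReader()
    {
        var table = new DataTable();
        table.Columns.Add("Left", typeof(int));
        table.Columns.Add("Right", typeof(int));
        foreach (var (left, right) in new[] { (0, 0), (0, 1), (1, 0), (2, 0), (2, 1) })
            table.Rows.Add(left, right);
        return table.CreateDataReader();
    }

    // Formats each group while it streams past; every group is read exactly once.
    public static List<string> FormatGroups(IDataReader reader)
    {
        var lines = new List<string>();
        foreach (var group in Records(reader).GroupAdjacentSketch(r => r.GetInt32(0)))
            lines.Add(string.Join(" ", group.Select(r => $"({r.GetInt32(0)},{r.GetInt32(1)})")));
        return lines;
    }

    static void Main()
    {
        using (var reader = SampleReader())
            foreach (var line in FormatGroups(reader))
                Console.WriteLine(line);
    }
}
```

For the five sample rows this prints three lines: `(0,0) (0,1)`, `(1,0)`, and `(2,0) (2,1)`. Each record must be used while the reader is still positioned on it, because the yielded object is the reader, not a snapshot of the row.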


wnvonmuf2#

Use MoreLINQ's GroupAdjacent method:
"Groups the adjacent elements of a sequence according to a specified key selector function."

var groupings = GetValues().GroupAdjacent(tuple => tuple.Item1);

foreach (var grouping in groupings)
{
    Console.WriteLine($"Value: {grouping.Key}. Elements: {string.Join(", ", grouping)}");
}

IEnumerable<(int, int)> GetValues()
{
    yield return (0, 0);
    yield return (0, 1);
    yield return (0, 2);
    yield return (0, 3);
    yield return (1, 0);
    yield return (1, 1);
    yield return (2, 0);
    yield return (2, 1);
    yield return (2, 2);
}

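This outputs the following:
Value: 0. Elements: (0, 0), (0, 1), (0, 2), (0, 3)
Value: 1. Elements: (1, 0), (1, 1)
Value: 2. Elements: (2, 0), (2, 1), (2, 2)
Each element of the groupings enumerable is an IGrouping<int, (int, int)>; enumerating it gives you the result you need.
(P.S.: this implementation only hits the iterator once, so it is completely forward-only.)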
