string.Split() for Span in .NET?

Posted by wr98u20j on 2023-11-20 in .NET


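I would like to know how to implement the string.Split() method, or whether there is any workaround, but for ReadOnlySpan<T> or Span<T> in C#, because unfortunately ReadOnlySpan<T>.Split() does not seem to exist.
I am not quite sure how to achieve the behaviour I want. It could probably be done by combining ReadOnlySpan<T>.IndexOf() and ReadOnlySpan<T>.Slice(), but because even the support for ReadOnlySpan<T>.IndexOf() is not great (it is not possible to specify a startIndex or count), I would rather avoid that entirely.
I am also aware that the problem with a ReadOnlySpan<T>.Split() method is that it could not return ReadOnlySpan<T>[] or ReadOnlySpan<ReadOnlySpan<T>>, because ReadOnlySpan<T> is a ref struct and must therefore live on the stack, while putting it into any collection would require a heap allocation.
Does anyone have an idea how I could do this?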

Edit: it should work without knowing in advance how many parts will be returned.
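
The missing startIndex parameter mentioned in the question can be emulated with Slice followed by IndexOf. The following is only a minimal sketch of that idea; the helper name IndexOfFrom and the class name are hypothetical and not part of .NET.

```csharp
using System;

public static class SpanIndexOfSketch
{
    // Hypothetical helper: IndexOf with a start offset, built from Slice + IndexOf.
    public static int IndexOfFrom<T>(this ReadOnlySpan<T> span, T value, int startIndex)
        where T : IEquatable<T>
    {
        // Slicing does not copy or allocate; it only produces a narrower view.
        int relative = span.Slice(startIndex).IndexOf(value);

        // Translate the hit back into the coordinates of the original span.
        return relative < 0 ? -1 : relative + startIndex;
    }
}
```

Because slicing a span does not allocate, searching from an offset this way should cost about the same as a built-in overload would.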

.net

Source: https://stackoverflow.com/questions/77343199/string-split-for-span

3 Answers


omjgkv6w #1

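As I already commented, I have been working on a similar problem over the past few weeks. Sorry for being late!
@Moho has already given a solution, which is the basis of the one I propose:
You can write an enumerator in the shape of IEnumerator<T> as a ref struct that splits a (ReadOnly)Span<T>, and then consume it with a foreach loop or, if you need an index, by driving the enumerator pattern by hand.

Now, you can either implement it yourself
or
use SpanExtensions.Net.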

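What is `SpanExtensions.Net`?

SpanExtensions.Net is a C# library, written by me and available on both NuGet and GitHub, that makes working with ReadOnlySpan<T> and Span<T> more convenient. Notably, it provides a custom enumerator-based Split method for ReadOnlySpan<T> and Span<T>, which has many overloads, is performant and highly customizable.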

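How does it actually work?

Note that the full source code of SpanExtensions.Net can be found on GitHub.
Because ref structs cannot implement interfaces (a restriction that keeps them from escaping the stack; C# 13 later relaxed it), it is enough to provide the methods and properties of IEnumerator<T> and IEnumerable<T>, without implementing the interfaces themselves, for a type to be usable in a foreach statement. In other words, we can enumerate a ref struct, and we can also write the enumerator itself as a ref struct. That matters because a ref struct is the only kind of type that can hold another ref struct, such as ReadOnlySpan<T>, as a field or property, which our enumerator requires.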

You can implement it as follows:

1. Create the enumerator:

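public ref struct SpanSplitEnumerator<T> where T : IEquatable<T>
{
    // The part of the input that has not been split yet.
    ReadOnlySpan<T> Span;
    readonly T Delimiter;

    // The segment produced by the most recent successful MoveNext() call.
    public ReadOnlySpan<T> Current { get; private set; }

    public SpanSplitEnumerator(ReadOnlySpan<T> span, T delimiter)
    {
        Span = span;
        Delimiter = delimiter;
        Current = default; // explicit assignment is required before C# 11
    }

    // Returning this lets one type act as both the enumerable and the enumerator.
    public SpanSplitEnumerator<T> GetEnumerator()
    {
        return this;
    }

    public bool MoveNext()
    {
        var span = Span;
        // Nothing left to split. Note: unlike string.Split(), an empty input or a
        // trailing delimiter therefore produces no final empty segment.
        if (span.IsEmpty)
        {
            return false;
        }
        int index = span.IndexOf(Delimiter);
        if (index == -1)
        {
            // No further delimiter: the remainder is the last segment.
            Span = ReadOnlySpan<T>.Empty;
            Current = span;
            return true;
        }
        Current = span[..index];     // everything before the delimiter
        Span = span[(index + 1)..];  // continue after the delimiter
        return true;
    }
}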

2. Create an extension method that creates the enumerator:

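// Extension methods must live in a non-generic static class; the class name is arbitrary.
public static class SpanExtensions
{
    public static SpanSplitEnumerator<T> Split<T>(this ReadOnlySpan<T> span, T delimiter) where T : IEquatable<T>
    {
        return new SpanSplitEnumerator<T>(span, delimiter);
    }
}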

3. Use the new extension method to split a ReadOnlySpan:

foreach(var span in source.Split('.')) 
{ 
    Console.WriteLine(span.ToString()); 
}

4. If you need an index, use it like this:

var enumerator = source.Split('.');
int index = 0;
while (enumerator.MoveNext())
{
    Console.WriteLine($"{enumerator.Current.ToString()}:{index}");
    index++;
}


Explanation:

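For our return type to be usable in a foreach loop, it must provide all the members that the IEnumerable<T> interface requires. Since we do not want to create an unnecessary number of types, we make the ref struct act as both IEnumerable<T> and IEnumerator<T>, combining the two in a single type.
We start by creating SpanSplitEnumerator<T> as a ref struct.
Constraining T to IEquatable<T> is necessary because we want to compare items without boxing them for Equals(object).
We need to store the ReadOnlySpan<T> we want to operate on, which is possible because our enumerator is a ref struct.
Of course, we also need to store the Delimiter we want to split on.
The imaginary IEnumerator<T> interface requires us to implement a property named Current.
Next we need to implement IEnumerable<T>.GetEnumerator(), where we can simply return this, since we want this ref struct to be both IEnumerable<T> and IEnumerator<T>.
With that, everything the IEnumerable<T> interface describes is implemented, and we can now use the type in a foreach loop.
Then we implement MoveNext(), as our imaginary IEnumerator<T> interface requires. The implementation should be self-explanatory.
Finally we create another, non-generic class that contains the extension method. It simply takes the constructor arguments as parameters and returns the newly created type.
Congratulations, you now have your 'string.Split(), but for Span'.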
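
One difference from string.Split() is worth knowing: the enumerator above stops as soon as the remaining span is empty, so an empty input or a trailing delimiter yields no final empty segment. If you need the same segment count as string.Split(), one possible variant tracks completion with a flag instead. This is only a sketch; the names StrictSpanSplitEnumerator, StrictSplitSketch, CountSegments and the _done field are hypothetical and are not taken from SpanExtensions.Net.

```csharp
using System;

// Hypothetical variant of the enumerator above that also yields a trailing empty segment.
public ref struct StrictSpanSplitEnumerator<T> where T : IEquatable<T>
{
    private ReadOnlySpan<T> _remaining;
    private readonly T _delimiter;
    private bool _done; // set once the last segment has been handed out

    public ReadOnlySpan<T> Current { get; private set; }

    public StrictSpanSplitEnumerator(ReadOnlySpan<T> span, T delimiter)
    {
        _remaining = span;
        _delimiter = delimiter;
        _done = false;
        Current = default;
    }

    public StrictSpanSplitEnumerator<T> GetEnumerator() => this;

    public bool MoveNext()
    {
        if (_done) return false;

        int index = _remaining.IndexOf(_delimiter);
        if (index < 0)
        {
            // Last segment; it may be empty (trailing delimiter or empty input).
            Current = _remaining;
            _remaining = default;
            _done = true;
            return true;
        }

        Current = _remaining.Slice(0, index);
        _remaining = _remaining.Slice(index + 1);
        return true;
    }
}

public static class StrictSplitSketch
{
    // Counts the segments the variant produces for a given input.
    public static int CountSegments(ReadOnlySpan<char> text, char delimiter)
    {
        int count = 0;
        foreach (var segment in new StrictSpanSplitEnumerator<char>(text, delimiter))
        {
            count++;
        }
        return count;
    }
}
```

Which behaviour is preferable depends on the caller: dropping the trailing empty segment is often convenient for line-oriented input, while matching string.Split() makes it easier to swap one for the other.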

Again, if this seems overwhelming, I suggest you take a look at my library SpanExtensions.Net on NuGet or GitHub. It extends spans with the Split method described above, and more.

Hope this helps!


watbbzwu #2

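The implementation below yields readonly ref structs that hold the start index and length of each segment (along with the source span), expose a method that returns the segment as a ReadOnlySpan<T>, and can also be implicitly converted to one.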
Usage example:

foreach( var value in "0|1|2|3".AsSpan().Split( "|" ) )
{
    Console.WriteLine( $"{value}" );
}

Code:

using System;
using System.Runtime.CompilerServices; // MethodImpl, MethodImplOptions

public readonly ref struct SpanSplitter<T>
    where T : IEquatable<T>
{
    private readonly ReadOnlySpan<T> _source;
    private readonly ReadOnlySpan<T> _separator;
    [MethodImpl( MethodImplOptions.AggressiveInlining )]
    public SpanSplitter( ReadOnlySpan<T> source, ReadOnlySpan<T> separator )
    {
        if( 0 == separator.Length )
        {
            throw new ArgumentException( "Requires non-empty value", nameof( separator ) );
        }
        _source = source;
        _separator = separator;
    }
    [MethodImpl( MethodImplOptions.AggressiveInlining )]
    public SpanSplitEnumerator<T> GetEnumerator()
    {
        return new SpanSplitEnumerator<T>( _source, _separator );
    }
}
public ref struct SpanSplitEnumerator<T>
    where T : IEquatable<T>
{
    private int _nextStartIndex = 0;
    private readonly ReadOnlySpan<T> _separator;
    private readonly ReadOnlySpan<T> _source;
    private SpanSplitValue _current;
    [MethodImpl( MethodImplOptions.AggressiveInlining )]
    public SpanSplitEnumerator( ReadOnlySpan<T> source, ReadOnlySpan<T> separator )
    {
        _source = source;
        _separator = separator;
        if( 0 == separator.Length )
        {
            throw new ArgumentException( "Requires non-empty value", nameof( separator ) );
        }
    }
    public bool MoveNext()
    {
        if( _nextStartIndex > _source.Length )
        {
            return false;
        }
        var nextSource = _source.Slice( _nextStartIndex );
        var foundIndex = nextSource.IndexOf( _separator );
        var length = -1 < foundIndex
            ? foundIndex
            : nextSource.Length;
        _current = new SpanSplitValue
        {
            StartIndex = _nextStartIndex,
            Length = length,
            Source = _source,
        };
        _nextStartIndex += _separator.Length + _current.Length;
        return true;
    }
    public SpanSplitValue Current
    {
        [MethodImpl( MethodImplOptions.AggressiveInlining )]
        get => _current;
    }
    public readonly ref struct SpanSplitValue
    {
        public int StartIndex { get; init; }
        public int Length { get; init; }
        public ReadOnlySpan<T> Source { get; init; }
        public ReadOnlySpan<T> AsSpan() => Source.Slice( StartIndex, Length );
        public static implicit operator ReadOnlySpan<T>( SpanSplitValue value )
            => value.AsSpan();
    }
}
public static class ExtensionMethods
{
    [MethodImpl( MethodImplOptions.AggressiveInlining )]
    public static SpanSplitter<T> Split<T>( this ReadOnlySpan<T> source, ReadOnlySpan<T> separator )
        where T : IEquatable<T>
    {
        return new SpanSplitter<T>( source, separator );
    }
    [MethodImpl( MethodImplOptions.AggressiveInlining )]
    public static SpanSplitter<T> Split<T>( this Span<T> source, ReadOnlySpan<T> separator )
        where T : IEquatable<T>
    {
        return new SpanSplitter<T>( source, separator );
    }
}


t30tvxxf #3

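As you have already pointed out, such a Split method cannot return ReadOnlySpan<T>[]. The results of the split must be consumed on the stack.
In my opinion, the most general design in this situation is to take a delegate parameter that lets the caller say what to do with each result. The method can also return the total number of segments the input was split into.
Here is an example: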

delegate void ReadOnlySpanAction<T>(ReadOnlySpan<T> span, int n);
public static int Split<T>(ReadOnlySpan<T> s, T c, ReadOnlySpanAction<T> action) where T: IEquatable<T> {
    int index;
    int n = 0;
    var currentSlice = s;
    while((index = currentSlice.IndexOf(c)) != -1) {
        action(currentSlice[..index], n++);
        currentSlice = currentSlice[(index + 1)..];
    }
    action(currentSlice, n++);
    return n;
}

The caller can pass in a lambda and check the n parameter to get the result at a particular index.
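
As an illustration of that calling pattern, here is a rough sketch that picks one segment by index and parses it. The delegate and Split method are repeated from above so the example compiles on its own; the class name DelegateSplitSketch and the helper ParseSegment are hypothetical.

```csharp
using System;

public delegate void ReadOnlySpanAction<T>(ReadOnlySpan<T> span, int n);

public static class DelegateSplitSketch
{
    // Same logic as the Split method above, repeated so the sketch is self-contained.
    public static int Split<T>(ReadOnlySpan<T> s, T c, ReadOnlySpanAction<T> action)
        where T : IEquatable<T>
    {
        int index;
        int n = 0;
        var currentSlice = s;
        while ((index = currentSlice.IndexOf(c)) != -1)
        {
            action(currentSlice[..index], n++);
            currentSlice = currentSlice[(index + 1)..];
        }
        action(currentSlice, n++);
        return n;
    }

    // Hypothetical helper: parses the segment at position `wanted` as an int.
    // Returns 0 if the input has fewer segments than that.
    public static int ParseSegment(ReadOnlySpan<char> text, char delimiter, int wanted)
    {
        int result = 0;
        Split(text, delimiter, (segment, n) =>
        {
            // The span is only valid inside the callback, so it is consumed here.
            if (n == wanted) result = int.Parse(segment);
        });
        return result;
    }
}
```

Note that a lambda capturing local variables, as this one does, allocates a closure object, so this design trades a small heap allocation per call for flexibility.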
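With this basic design you can easily come up with other variations. If you want a List<(int, int)> holding the start and exclusive end offset of every slice, you can do this: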

public static List<(int, int)> Split<T>(ReadOnlySpan<T> s, T c) where T: IEquatable<T> {
    int index;
    int start = 0;
    var currentSlice = s;
    var list = new List<(int, int)>();
    while((index = currentSlice.IndexOf(c)) != -1) {
        list.Add((start, start + index));
        currentSlice = currentSlice[(index + 1)..];
        start += index + 1;
    }
    list.Add((start, start + currentSlice.Length));
    return list;
}

Or, if you want the ranges of the segments to be written into a Span<Range> (similar to how the built-in Split in .NET 8 works), you can do the following:

// returns the number of segments found
public static int Split<T>(ReadOnlySpan<T> s, T c, Span<Range> destination) where T: IEquatable<T> {
    int index;
    int start = 0;
    var currentSlice = s;
    var n = 0;
    if (destination.IsEmpty) return 0;
    while((index = currentSlice.IndexOf(c)) != -1 && n < destination.Length - 1) {
        destination[n] = new Range(start, start + index);
        currentSlice = currentSlice[(index + 1)..];
        start += index + 1;
        n++;
    }
    destination[n] = new Range(start, start + currentSlice.Length);
    return n + 1;
}
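
A caller would typically pair this overload with a stack-allocated buffer of ranges. The sketch below shows one way that might look; the Split method is repeated from above so the example compiles on its own, and the class name RangeSplitSketch, the helper SumCsv, and the buffer size of 8 are assumptions made purely for illustration.

```csharp
using System;

public static class RangeSplitSketch
{
    // Same logic as the Span<Range> overload above, repeated so the sketch is self-contained.
    public static int Split<T>(ReadOnlySpan<T> s, T c, Span<Range> destination)
        where T : IEquatable<T>
    {
        int index;
        int start = 0;
        var currentSlice = s;
        var n = 0;
        if (destination.IsEmpty) return 0;
        while ((index = currentSlice.IndexOf(c)) != -1 && n < destination.Length - 1)
        {
            destination[n] = new Range(start, start + index);
            currentSlice = currentSlice[(index + 1)..];
            start += index + 1;
            n++;
        }
        destination[n] = new Range(start, start + currentSlice.Length);
        return n + 1;
    }

    // Hypothetical helper: sums up to 8 comma-separated integers.
    public static int SumCsv(ReadOnlySpan<char> line)
    {
        // The ranges live on the stack, so no array is allocated on the heap.
        Span<Range> ranges = stackalloc Range[8];
        int count = Split(line, ',', ranges);

        int sum = 0;
        for (int i = 0; i < count; i++)
        {
            // Indexing a span with a Range slices it without copying.
            sum += int.Parse(line[ranges[i]]);
        }
        return sum;
    }
}
```

If the input has more segments than the buffer can hold, the last range covers the entire unsplit remainder, so in this sketch int.Parse would throw on it; a real caller would need to size the buffer or handle that case.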
