Locating a subarray in a byte array

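Question

Summary

I read bytes from a file in chunks (no fixed size yet, somewhere between 128 and 1024 bytes), and I want to search each buffer to see whether it contains the signature (pattern) given by another byte array. If part of the pattern is found at the very end of the buffer, the method should read the next few bytes from the file to see whether they complete a match.

What I've tried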

public static bool Contains(byte[] buffer, byte[] signiture, FileStream file)
{
    for (var i = buffer.Length - 1; i >= signiture.Length - 1; i--) //move backwards through array stop if < signature
    {
        var found = true; //set found to true at start
        for (var j = signiture.Length - 1; j >= 0 && found; j--) //loop backwards through signature
        {
            found = buffer[i - (signiture.Length - 1 - j)] == signiture[j];// compare signature's element with corresponding element of buffer
        }
        if (found)
            return true; //if signature is found return true
    }


    //check the end of the buffer for a partial signature
    for (var x = signiture.Length - 1; x >= 1; x--)
    {
        if (buffer.Skip(buffer.Length - x).Take(x).SequenceEqual(signiture.Skip(0).Take(x))) //check if partial is equal to partial signiture
        {
            byte[] nextBytes = new byte[signiture.Length - x];
            file.Read(nextBytes, 0, signiture.Length - x); //read next needed bytes from file
            if (!signiture.Skip(0).Take(x).ToArray().Concat(nextBytes).SequenceEqual(signiture))
                return false; //return false if not a match
            return true; //return true if a match
        }
    }
    return false; //if not found return false
}

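This works, but I've been told that LINQ is slow and that I should use Array.IndexOf() instead. I've tried that, but I can't figure out how to implement it.

Tags: c#, search, buffer, streamreader, indexof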

Answer


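You can use Span<T>, AsSpan, and MemoryExtensions.SequenceEqual. The latter is not LINQ; it is optimized, particularly for byte arrays. It unrolls the loop and uses unsafe code to perform what is essentially a memcmp.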
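As a small illustration of the calls named above, here is a minimal sketch that compares a window of a buffer against a signature with AsSpan and SequenceEqual. It also uses the span-based IndexOf extension (part of the same MemoryExtensions class), which is probably the closest fit to the Array.IndexOf idea from the question. The class name SpanSearchDemo and the method names MatchesAt and Find are made up for this example.

```csharp
using System;

public static class SpanSearchDemo
{
    // True when buffer[offset .. offset + signature.Length) equals signature.
    public static bool MatchesAt(byte[] buffer, int offset, byte[] signature)
    {
        if (offset < 0 || offset > buffer.Length - signature.Length)
            return false; // the window would not fit inside the buffer

        ReadOnlySpan<byte> window = buffer.AsSpan(offset, signature.Length);
        ReadOnlySpan<byte> needle = signature;
        return window.SequenceEqual(needle);
    }

    // Index of the first occurrence of signature in buffer, or -1 if absent.
    public static int Find(byte[] buffer, byte[] signature)
    {
        ReadOnlySpan<byte> haystack = buffer;
        ReadOnlySpan<byte> needle = signature;
        return haystack.IndexOf(needle);
    }
}
```

Under these assumptions, a check such as SpanSearchDemo.Find(buffer, signiture) >= 0 could stand in for a hand-written loop over every offset.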

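If you are not on a framework that includes these types and methods by default (.NET Core 2.1+, .NET Standard 2.1), you can add the System.Memory NuGet package. Its implementation of SequenceEqual is a little different (the so-called "slow version"), but it is still faster than LINQ's SequenceEqual.

Note that you also need to check the return value of FileStream.Read, because it may read fewer bytes than you asked for.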
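Stream.Read returns the number of bytes it actually placed in the array, which can be less than the requested count, and it returns 0 at the end of the stream. One way to deal with that is a small loop like the sketch below. The helper name ReadUpTo and the class StreamReadHelper are hypothetical, not part of the framework.

```csharp
using System;
using System.IO;

public static class StreamReadHelper
{
    // Keeps reading until 'count' bytes have been stored in 'target'
    // or the stream ends. Returns the number of bytes actually read,
    // which is less than 'count' only when the stream ran out of data.
    public static int ReadUpTo(Stream stream, byte[] target, int offset, int count)
    {
        var total = 0;
        while (total < count)
        {
            var read = stream.Read(target, offset + total, count - total);
            if (read == 0)
                break; // end of stream
            total += read;
        }
        return total;
    }
}
```

On .NET 7 and later, the built-in Stream.ReadAtLeast and Stream.ReadExactly methods cover similar ground, so a helper like this is mainly useful on older targets.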

using System;
using System.Buffers;
using System.IO;

public static bool Contains(byte[] buffer, byte[] signiture, FileStream file)
{
    var sigSpan = signiture.AsSpan();

    // move backwards through the buffer and check whether the full signature is found
    for (var i = buffer.Length - signiture.Length; i >= 0; i--)
    {
        if (buffer.AsSpan(i, signiture.Length).SequenceEqual(sigSpan))
            return true;
    }

    // check whether the buffer ends with a prefix of the signature, longest prefix first;
    // x is capped at buffer.Length so the slice below cannot start at a negative index
    for (var x = Math.Min(signiture.Length - 1, buffer.Length); x >= 1; x--)
    {
        var sig = sigSpan.Slice(0, x);
        if (buffer.AsSpan(buffer.Length - x).SequenceEqual(sig))
        {
            var sigLen = signiture.Length;
            byte[] nextBytes = ArrayPool<byte>.Shared.Rent(sigLen - x);
            try
            {
                // store the number of bytes actually read; it can be less than requested
                var read = file.Read(nextBytes, 0, sigLen - x);
                var next = nextBytes.AsSpan(0, read);

                // no need to concat with the signature: the buffer tail already equals its
                // first x bytes, so check that read + x covers the full length and then
                // compare only the remainder
                return read + x == sigLen && sigSpan.Slice(x).SequenceEqual(next);
            }
            finally
            {
                // return the rented array even if Read throws
                ArrayPool<byte>.Shared.Return(nextBytes);
            }
        }
    }

    return false; // not found
}
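One limitation of the method above is that it returns as soon as the longest partial prefix has been tested, and the extra bytes it reads are consumed from the file. A different approach, sketched below under my own assumptions, is to carry the last signature.Length - 1 bytes of each chunk over to the front of the next one, so that a signature straddling a chunk boundary is found by an ordinary span IndexOf. The class name StreamSignatureSearch, the method shape, and the default chunk size of 1024 are all illustrative choices, not something from the original post.

```csharp
using System;
using System.IO;

public static class StreamSignatureSearch
{
    // Returns true if 'signature' occurs anywhere in 'stream', reading in chunks.
    public static bool Contains(Stream stream, byte[] signature, int chunkSize = 1024)
    {
        if (signature.Length == 0)
            return true; // an empty signature trivially matches
        if (chunkSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(chunkSize));

        ReadOnlySpan<byte> needle = signature;
        var overlap = signature.Length - 1;      // bytes carried between chunks
        var buffer = new byte[overlap + chunkSize];
        var carried = 0;                         // carried-over bytes at the start of buffer

        while (true)
        {
            // Read may return fewer bytes than chunkSize; 0 means end of stream.
            var read = stream.Read(buffer, carried, chunkSize);
            if (read == 0)
                return false;

            var valid = carried + read;
            ReadOnlySpan<byte> window = buffer.AsSpan(0, valid);
            if (window.IndexOf(needle) >= 0)
                return true;

            // Move the tail to the front so a match that straddles
            // the chunk boundary is visible in the next window.
            carried = Math.Min(overlap, valid);
            buffer.AsSpan(valid - carried, carried).CopyTo(buffer);
        }
    }
}
```

With this shape the caller passes the stream itself rather than a pre-filled buffer, for example StreamSignatureSearch.Contains(file, signiture, 512). The trade-off is one small copy per chunk in exchange for not needing any special handling of partial matches.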
