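c# - Locating a subarray in a byte array
Problem description
Summary
I read bytes from a file in chunks (the chunk size is not decided yet, somewhere between 128 and 1024 bytes) and want to search each buffer to see whether it contains the signature (pattern) given by another byte array. If only the beginning of the pattern is found at the very end of the buffer, the code should read the next few bytes from the file and check whether they complete the match.
What I've tried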
public static bool Contains(byte[] buffer, byte[] signiture, FileStream file)
{
    for (var i = buffer.Length - 1; i >= signiture.Length - 1; i--) // move backwards through the array, stop if fewer bytes remain than the signature
    {
        var found = true; // set found to true at start
        for (var j = signiture.Length - 1; j >= 0 && found; j--) // loop backwards through the signature
        {
            found = buffer[i - (signiture.Length - 1 - j)] == signiture[j]; // compare signature's element with corresponding element of buffer
        }
        if (found)
            return true; // if signature is found return true
    }
    // checking end of buffer for a partial signature
    for (var x = signiture.Length - 1; x >= 1; x--)
    {
        if (buffer.Skip(buffer.Length - x).Take(x).SequenceEqual(signiture.Skip(0).Take(x))) // check if the buffer's tail equals the start of the signature
        {
            byte[] nextBytes = new byte[signiture.Length - x];
            file.Read(nextBytes, 0, signiture.Length - x); // read next needed bytes from file
            if (!signiture.Skip(0).Take(x).ToArray().Concat(nextBytes).SequenceEqual(signiture))
                return false; // return false if not a match
            return true; // return true if a match
        }
    }
    return false; // if not found return false
}
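This works, but I've been told that LINQ is slow and that I should use Array.IndexOf() instead. I've tried that, but I can't figure out how to implement it.
Solution
You can use Span<T>, AsSpan and MemoryExtensions.SequenceEqual. The latter is not LINQ; it is optimized, especially for byte arrays. It unrolls the loop and uses unsafe code to essentially perform a memcmp.
If you are not on a framework that includes these types and methods by default (.NET Core 2.1+, .NET Standard 2.1), you can add the System.Memory NuGet package. Its implementation of SequenceEqual is a bit different (the so-called "slow version"), but it is still faster than using LINQ's SequenceEqual.
Note that you also need to check the return value of FileStream.Read, because it may read fewer bytes than requested.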
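To illustrate the point about FileStream.Read, here is a minimal sketch of a helper that keeps reading until the requested number of bytes has arrived or the stream ends. The class name StreamReadSketch and the method ReadUpTo are hypothetical names made up for this illustration; they are not part of the original answer or of any library.

```csharp
using System;
using System.IO;

public static class StreamReadSketch
{
    // Hypothetical helper (not from the original answer): keeps calling Read
    // until `count` bytes have been read or the stream reports end of data.
    // Returns the number of bytes actually placed into `target`.
    public static int ReadUpTo(Stream stream, byte[] target, int count)
    {
        var total = 0;
        while (total < count)
        {
            // Read may return fewer bytes than requested; 0 means end of stream.
            var read = stream.Read(target, total, count - total);
            if (read == 0)
                break;
            total += read;
        }
        return total;
    }
}
```

The caller still has to compare the returned count with the number of bytes it needed, exactly as the answer's code does with its `read` variable.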
using System;
using System.Buffers; // ArrayPool<T>
using System.IO;

public static bool Contains(byte[] buffer, byte[] signiture, FileStream file)
{
    var sigSpan = signiture.AsSpan();
    // move backwards through buffer and check if signature found
    for (var i = buffer.Length - signiture.Length; i >= 0; i--)
    {
        if (buffer.AsSpan(i, signiture.Length).SequenceEqual(sigSpan))
            return true;
    }
    // cap x at buffer.Length so AsSpan never gets a negative start index
    // when the signature is longer than the buffer
    for (var x = Math.Min(signiture.Length - 1, buffer.Length); x >= 1; x--)
    {
        var sig = sigSpan.Slice(0, x);
        if (buffer.AsSpan(buffer.Length - x).SequenceEqual(sig)) // check if the buffer's tail equals the start of the signature
        {
            var sigLen = signiture.Length;
            byte[] nextBytes = ArrayPool<byte>.Shared.Rent(sigLen - x);
            // need to store number of bytes read
            var read = file.Read(nextBytes, 0, sigLen - x); // read next needed bytes from file
            var next = nextBytes.AsSpan(0, read);
            // don't need to concat with signature, because obviously signature is going to
            // start with signature.Skip(0).Take(...)
            // just test that the number of bytes we read, plus the number we will skip equals
            // the actual length, then check the remainder
            var result = (read + x == signiture.Length
                && signiture.AsSpan(x).SequenceEqual(next));
            ArrayPool<byte>.Shared.Return(nextBytes);
            return result;
        }
    }
    return false; // if not found return false
}
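Since the question asked about an IndexOf-style search, the full-match loop can probably also be expressed with the span overload MemoryExtensions.IndexOf. The following is only a sketch under that assumption: the class SignatureSearchSketch and both method names are hypothetical, and the file-reading step is deliberately left out.

```csharp
using System;

public static class SignatureSearchSketch
{
    // Hypothetical helper: index of the first full occurrence of `signature`
    // inside `buffer`, or -1 if there is none.
    public static int IndexOfSignature(byte[] buffer, byte[] signature)
    {
        ReadOnlySpan<byte> buf = buffer;
        ReadOnlySpan<byte> sig = signature;
        return buf.IndexOf(sig);
    }

    // Hypothetical helper: length of the longest proper prefix of `signature`
    // that the buffer ends with, or 0 if the tail matches no prefix.
    public static int TrailingPartialLength(byte[] buffer, byte[] signature)
    {
        ReadOnlySpan<byte> buf = buffer;
        ReadOnlySpan<byte> sig = signature;
        // Never look at more bytes than the buffer actually holds.
        var max = Math.Min(sig.Length - 1, buf.Length);
        for (var x = max; x >= 1; x--)
        {
            if (buf.Slice(buf.Length - x).SequenceEqual(sig.Slice(0, x)))
                return x;
        }
        return 0;
    }
}
```

Unlike the backwards loop above, IndexOf reports the first occurrence rather than the last, which makes no difference for a yes/no Contains check. A non-zero result from TrailingPartialLength tells the caller how many signature bytes are already matched, so only the remaining signature.Length - x bytes need to be read from the file.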