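c - Searching for a byte pattern in a buffered data stream
Problem description
I want to search for a byte pattern in data that I receive in chunks (over a serial link) as it becomes available. For example, the byte pattern 0xbbffbbffbb. There is no guarantee that the pattern arrives complete within one chunk, so a simple strnstrn is probably not the solution. What algorithm can I use to find such a pattern? My approach is to look for the first byte (0xbb in this case), make sure I have 4 more bytes, and then compare them against the pattern. However, this fails if junk data follows after two bytes, for example 0xbbff01[bbffbbffbb].
My code (apologies if it is shabby) looks like this: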
char* pattern_search(char* buff, size_t *bytes_read)
{
    char* ptr = buff;
    uint16_t remaining_length = *bytes_read;
    while(1) {
        // look for one byte in the stream
        char* pattern_start = memmem((void*)ptr, remaining_length, 0xbb, 1);
        if (pattern_start == NULL) {
            // printf("nothing found\n");
            return NULL;
        }
        int pos = pattern_start - ptr;
        remaining_length = remaining_length - pos;
        ptr = pattern_start;
        // see if you have 5 bytes to compare, if not get more
        remaining_length += get_additional_bytes();
        // compare 5 bytes for pattern
        pattern_start = memmem((void*)ptr, remaining_length, start_flag, PATTERN_LEN);
        if (pattern_start == NULL) {
            // move one step and continue search
            ptr++;
            remaining_length--;
            // move these bytes back to beginning of the buffer
            memcpy(buff, ptr, remaining_length);
            ptr = buff;
            *bytes_read = remaining_length;
            if (remaining_length > 0) {
                continue;
            } else {
                return NULL;
            }
        } else {
            // found!
            printf("pattern found!\n");
            ptr = pattern_start;
            break;
        }
    }
    return ptr;
}
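Solution
There are certainly many different ways to solve this. One possibility:
- Specify the pattern as an array of unsigned char
- Call an "input_received" function with each received chunk of data and a pointer to a callback function, which is invoked whenever the pattern is found
It might look like this: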
#include <stdio.h>

/* The byte pattern to look for. */
static const unsigned char PATTERN[] = {0xbb, 0xff, 0xbb, 0xff, 0xbb};

/* Example callback: reports the stream offset at which a match starts. */
static void found(size_t pos) {
    printf("pattern found at index %zu\n", pos);
}

/* Feed one received chunk. The state lives in static variables, so a
   match may span any number of calls. */
static void input_received(const unsigned char *const data,
                           int n,
                           void (*callback)(size_t)) {
    static size_t match_count; /* pattern bytes matched so far */
    static size_t position;    /* absolute offset of data[i] in the stream */

    for (int i = 0; i < n; i++, position++) {
        if (data[i] == PATTERN[match_count]) {
            match_count++;
        } else {
            /* Mismatch: restart, counting this byte if it begins the pattern. */
            match_count = data[i] == PATTERN[0] ? 1 : 0;
        }
        if (match_count == sizeof PATTERN) {
            (*callback)(position - sizeof PATTERN + 1);
            match_count = 0;
        }
    }
}

int main(void) {
    unsigned char input[] = {0xff, 0x01, 0x02, 0xff, 0x00,
                             0xbb, 0xff, 0xbb, 0xff, 0xbb,
                             0xbb, 0xff, 0xbb, 0xff, 0xbb};

    /* Deliver the input in uneven chunks so that both matches span calls. */
    input_received(input, 2, found);
    input_received(&input[2], 3, found);
    input_received(&input[5], 2, found);
    input_received(&input[7], 2, found);
    input_received(&input[9], 5, found);
    input_received(&input[14], 1, found);
    return 0;
}
Test
This prints the following to the debug console:
pattern found at index 5
pattern found at index 10
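The restart rule in input_received (fall back to 1 or 0 matched bytes on a mismatch) is enough for 0xbbffbbffbb, but it can miss matches for other patterns. With the pattern "aab" and the input "aaab", for example, the third 'a' resets the count to 1 and the match starting at index 1 is lost. A possible generalization is to fall back through a Knuth-Morris-Pratt failure table instead. The following is only a sketch under stated assumptions: the names matcher, matcher_init, matcher_feed, match_log, log_match and the MAX_PATTERN limit are hypothetical and do not come from the answer above.

#include <stddef.h>

enum { MAX_PATTERN = 32 };  /* assumed upper bound on the pattern length */

typedef struct {
    const unsigned char *pattern;
    size_t len;
    size_t fail[MAX_PATTERN]; /* fail[i]: longest proper border of pattern[0..i] */
    size_t matched;           /* pattern bytes matched so far */
    size_t position;          /* absolute stream offset of the next byte */
} matcher;

/* Prepare the matcher and build the failure table. Returns 0 on success. */
static int matcher_init(matcher *m, const unsigned char *pattern, size_t len) {
    if (len == 0 || len > MAX_PATTERN) return -1;
    m->pattern = pattern;
    m->len = len;
    m->matched = 0;
    m->position = 0;
    m->fail[0] = 0;
    size_t k = 0;
    for (size_t i = 1; i < len; i++) {
        while (k > 0 && pattern[i] != pattern[k]) k = m->fail[k - 1];
        if (pattern[i] == pattern[k]) k++;
        m->fail[i] = k;
    }
    return 0;
}

/* Feed one chunk; the state in *m carries over between calls. */
static void matcher_feed(matcher *m, const unsigned char *data, size_t n,
                         void (*callback)(size_t start, void *ctx), void *ctx) {
    for (size_t i = 0; i < n; i++, m->position++) {
        /* On a mismatch, fall back to the longest border that still fits. */
        while (m->matched > 0 && data[i] != m->pattern[m->matched])
            m->matched = m->fail[m->matched - 1];
        if (data[i] == m->pattern[m->matched]) m->matched++;
        if (m->matched == m->len) {
            callback(m->position + 1 - m->len, ctx);
            /* Restart from scratch, as the answer does; using
               fail[len - 1] here would also report overlapping matches. */
            m->matched = 0;
        }
    }
}

/* Illustrative callback that records match offsets instead of printing. */
typedef struct { size_t starts[8]; size_t count; } match_log;

static void log_match(size_t start, void *ctx) {
    match_log *hits = ctx;
    if (hits->count < 8) hits->starts[hits->count++] = start;
}

Keeping the state in a struct rather than in static variables also allows several streams or patterns to be tracked at once. A caller would run matcher_init once and then pass every received chunk to matcher_feed, just as input_received is called in the example above.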