C - Searching for a byte pattern in a buffered data stream



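I want to search for a byte pattern in data that I receive in chunks (serially) as it becomes available. For example, the byte pattern 0xbbffbbffbb. There is no guarantee that the pattern arrives in one piece, so a simple strnstr is probably not the solution. What algorithm can I use to find this pattern? My approach is to look for the first byte (0xbb in this case), make sure I have 4 more bytes, and then compare those bytes against the pattern. However, it fails if there is garbage data after the first two bytes, as in 0xbbff01[bbffbbffbb].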

My code (sorry if it is shabby) looks like this:

char* pattern_search(char* buff, size_t *bytes_read)
{
    char* ptr = buff;
    uint16_t remaining_length = *bytes_read;
    while(1) {
        // look for one byte in the stream
        char* pattern_start = memmem((void*)ptr, remaining_length, 0xbb, 1);
        if (pattern_start == NULL) {
            // printf("nothing found\n");
            return NULL;
        }
        int pos = pattern_start - ptr;
        remaining_length = remaining_length - pos;
        ptr = pattern_start;
        // see if you have 5 bytes to compare, if not get more
        remaining_length += get_additional_bytes();
        // compare 5 bytes for pattern
        pattern_start = memmem((void*)ptr, remaining_length, start_flag, PATTERN_LEN);
        if (pattern_start == NULL) {
            // move one step and continue search
            ptr++;
            remaining_length--;
            // move these bytes back to beginning of the buffer
            memcpy(buff, ptr, remaining_length);
            ptr = buff;
            *bytes_read = remaining_length;
            if (remaining_length > 0) {
                continue;
            } else {
                return NULL;
            }
        } else {
            // found!
            printf("pattern found!\n");
            ptr = pattern_start;
            break;
        }
    }
    return ptr;
}

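Many different solutions are possible here. One possibility is:

  • Specify the pattern as an array of unsigned char
  • Call an "input_received" function with each received chunk of data and a pointer to a callback function, which is called whenever the pattern is found

It could look something like this: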

#include <stdio.h>

static const unsigned char PATTERN[] = {0xbb, 0xff, 0xbb, 0xff, 0xbb};

static void found(size_t pos) {
    printf("pattern found at index %zu\n", pos);
}

static void input_received(const unsigned char *const data,
                           int n,
                           void (*callback)(size_t)) {
    /* State persists across calls, so a match may span several chunks. */
    static int match_count;  /* pattern bytes matched so far */
    static size_t position;  /* absolute index of the current byte in the stream */

    for (int i = 0; i < n; i++, position++) {
        if (data[i] == PATTERN[match_count]) {
            match_count++;
        } else {
            /* Mismatch: restart, counting this byte if it begins the pattern. */
            match_count = data[i] == PATTERN[0] ? 1 : 0;
        }
        if (match_count == sizeof PATTERN) {
            (*callback)(position - sizeof PATTERN + 1);
            match_count = 0;
        }
    }
}

int main(void) {
    unsigned char input[] = {0xff, 0x01, 0x02, 0xff, 0x00,
                             0xbb, 0xff, 0xbb, 0xff, 0xbb,
                             0xbb, 0xff, 0xbb, 0xff, 0xbb};

    /* Feed the input in chunks that split both pattern occurrences. */
    input_received(input, 2, found);
    input_received(&input[2], 3, found);
    input_received(&input[5], 2, found);
    input_received(&input[7], 2, found);
    input_received(&input[9], 5, found);
    input_received(&input[14], 1, found);
    return 0;
}

Test

This prints the following to the debug console:

pattern found at index 5
pattern found at index 10
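
One caveat about input_received: on a mismatch it resets match_count to 0 or 1. That is enough for 0xbbffbbffbb, but for other patterns it can miss a match (for example the pattern "aab" in the input "aaab"). A possible generalization, not part of the answer above, is to fall back through a Knuth-Morris-Pratt failure table instead. The sketch below is only an illustration. The names stream_matcher, matcher_init, matcher_feed, record_match and run_demo, as well as the MAX_PATTERN_LEN limit, are assumptions made for this example. It also keeps the state in a struct rather than in static variables, so several streams could be scanned independently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_PATTERN_LEN 16 /* assumed upper bound, chosen for this sketch */

/* Hypothetical per-stream state; these names are not from the answer. */
typedef struct {
    const unsigned char *pattern;
    size_t len;
    size_t fail[MAX_PATTERN_LEN]; /* KMP failure table */
    size_t matched;               /* pattern bytes matched so far */
    size_t position;              /* absolute index of the next byte */
} stream_matcher;

/* Stores the pattern pointer and builds the failure table:
   fail[i] is the length of the longest proper prefix of
   pattern[0..i] that is also a suffix of it. */
static void matcher_init(stream_matcher *m,
                         const unsigned char *pattern, size_t len) {
    assert(len > 0 && len <= MAX_PATTERN_LEN);
    m->pattern = pattern;
    m->len = len;
    m->matched = 0;
    m->position = 0;
    m->fail[0] = 0;
    size_t k = 0;
    for (size_t i = 1; i < len; i++) {
        while (k > 0 && pattern[i] != pattern[k])
            k = m->fail[k - 1];
        if (pattern[i] == pattern[k])
            k++;
        m->fail[i] = k;
    }
}

/* Processes one chunk; the state in *m carries over to the next call. */
static void matcher_feed(stream_matcher *m, const unsigned char *data,
                         size_t n, void (*callback)(size_t)) {
    for (size_t i = 0; i < n; i++, m->position++) {
        /* On mismatch, fall back to the longest prefix that still fits. */
        while (m->matched > 0 && data[i] != m->pattern[m->matched])
            m->matched = m->fail[m->matched - 1];
        if (data[i] == m->pattern[m->matched])
            m->matched++;
        if (m->matched == m->len) {
            callback(m->position - m->len + 1);
            /* Unlike the answer, this also reports overlapping matches. */
            m->matched = m->fail[m->len - 1];
        }
    }
}

static size_t match_total;      /* number of matches reported so far */
static size_t last_match_index; /* stream index of the latest match */

static void record_match(size_t pos) {
    match_total++;
    last_match_index = pos;
    printf("pattern found at index %zu\n", pos);
}

/* Feeds "aaab" in two chunks and returns the number of "aab" matches. */
static size_t run_demo(void) {
    static const unsigned char pattern[] = {'a', 'a', 'b'};
    static const unsigned char input[] = {'a', 'a', 'a', 'b'};
    stream_matcher m;

    match_total = 0;
    matcher_init(&m, pattern, sizeof pattern);
    matcher_feed(&m, input, 3, record_match);     /* "aaa" */
    matcher_feed(&m, input + 3, 1, record_match); /* "b"   */
    return match_total;
}
```

Calling run_demo() from main should report one match at index 1, a case that the simpler reset logic would miss. Note the behavioral difference after a hit: because the state falls back to fail[len - 1] rather than 0, overlapping occurrences are reported too, which may or may not be what a framing protocol wants.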
