Perl sockets: parsing packets from a network stream



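I'm trying to find the proper way to parse a data stream using Perl. I've read many examples, docs, and questions, but I can't find how to cut a single "packet" out of the stream and process it. The situation is as follows:
- A data stream arrives from some IP at some IP and port.
- The stream contains some gibberish, followed by content between a start marker and an end marker, in which the data is separated by semicolons.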

My attempt so far is to have a socket listen on the port and process the $data variable:

#!/usr/bin/perl
use strict;
use warnings;
use IO::Socket::INET;

# auto-flush STDOUT
$| = 1;

# create a listening socket
my $socket = IO::Socket::INET->new(
    LocalHost => '127.0.0.1',
    LocalPort => '7070',
    Proto     => 'tcp',
    Listen    => 5,
    Reuse     => 1,
);
die "cannot create socket $!\n" unless $socket;
print "server waiting for client connection on port 7070\n";
while (1) {
    # wait for a new client connection
    my $client_socket = $socket->accept();
    # get information about the newly connected client
    my $client_address = $client_socket->peerhost();
    my $client_port    = $client_socket->peerport();
    print "connection from $client_address:$client_port\n";
    # read up to 1024 bytes from the connected client
    my $data = "";
    $client_socket->recv($data, 1024);
    print "received data: $data\n";
    my @data_array = split(/;/, $data);
    foreach (@data_array) {
        print "$_\n";
    }
    # write response data to the connected client
    $data = "ok";
    $client_socket->send($data);
    # notify the client that the response has been sent
    shutdown($client_socket, 1);
}
$socket->close();

This works, but as far as I understand it, it reads the stream up to the maximum size and only then processes it.

My question: how do I identify the part I need (from start to end), process it, and then move on to the next one?

I've never understood why people use recv to read from a stream socket.

Typically, a read loop looks like this:

my $buf = '';
while (1) {
   # append new data to whatever is already in $buf
   my $rv = sysread($socket, $buf, 64*1024, length($buf));
   if (!defined($rv)) {
      die("Can't read from socket: $!\n");
   }
   if (!$rv) {
      die("Can't read from socket: Premature EOF\n") if length($buf);
      last;
   }
   # extract every complete message currently in the buffer
   while (defined(my $msg = check_for_full_message_and_extract_it_from_buf($buf))) {
      process_msg($msg);
   }
}

(Keep in mind that sysread returns as soon as data is available, even if that is less than the amount requested.)
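The short-read behaviour can be seen without a network. The following is a minimal sketch that uses a local pipe in place of a socket (an assumption made only to keep the example self-contained): 64 KiB are requested, but sysread returns just the five bytes that are available.

```perl
use strict;
use warnings;

# A pipe stands in for the socket in this sketch.
pipe(my $reader, my $writer) or die "pipe failed: $!\n";

# Only five bytes are written ...
syswrite($writer, "hello") // die "write failed: $!\n";

# ... so a request for up to 64 KiB returns those five bytes right away.
my $buf = '';
my $n = sysread($reader, $buf, 64 * 1024);
die "read failed: $!\n" unless defined $n;

print "asked for ", 64 * 1024, " bytes, got $n\n";
```

This is why the read loop appends to $buf and checks for complete messages after every read instead of assuming one read equals one message.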

For example, the inner loop for sentinel-terminated data looks like this:

   while ($buf =~ s/^(.*)\n//) {
      process_msg("$1");
   }

For example, the inner loop for length-prefixed blocks looks like this:

   while (1) {
      last if length($buf) < 4;
      my $len = unpack('N', $buf);
      last if length($buf) < 4+$len;
      substr($buf, 0, 4, '');
      my $msg = substr($buf, 0, $len, '');
      process_msg($msg);
   }
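To try the length-prefixed loop in isolation, here is a small sketch that builds a buffer by hand. The frame_msg helper is hypothetical and only illustrates what a sender would have to do to match unpack('N'): prefix each payload with its length as a 32-bit big-endian integer.

```perl
use strict;
use warnings;

# Hypothetical sender-side helper: 4-byte big-endian length, then the payload.
sub frame_msg {
    my ($payload) = @_;
    return pack('N', length($payload)) . $payload;
}

# Two complete frames followed by the first 6 bytes of a third one.
my $buf = frame_msg('a;b;c') . frame_msg('') . substr(frame_msg('partial'), 0, 6);

my @msgs;
while (1) {
    last if length($buf) < 4;           # length prefix not complete yet
    my $len = unpack('N', $buf);
    last if length($buf) < 4 + $len;    # payload not complete yet
    substr($buf, 0, 4, '');             # drop the prefix
    push @msgs, substr($buf, 0, $len, '');
}

print scalar(@msgs), " complete message(s), ", length($buf), " byte(s) left over\n";
```

The incomplete third frame stays in $buf, so the next sysread can append the rest of it.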

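In your particular case, you would remove any data you want to ignore from the start of $buf until you reach the part you're interested in, and then start extracting that part. That's vague, but I only have a vague description of the protocol to work from.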
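As a rough sketch of that idea, assuming the frames are delimited by the literal strings `<START>` and `<END>` (the real markers are not given in the question, so both are placeholders, as is the extract_messages name):

```perl
use strict;
use warnings;

# Placeholder markers; substitute the real delimiters of the protocol.
my $START = '<START>';
my $END   = '<END>';

# Remove every complete START...END frame from the buffer and return the
# payloads, each split on ';'. Data before a START marker is discarded;
# an incomplete frame stays in the buffer until more data arrives.
sub extract_messages {
    my ($buf_ref) = @_;
    my @messages;
    while (1) {
        my $start = index($$buf_ref, $START);
        if ($start < 0) {
            # No START yet: keep only a tail that might be a partial marker.
            my $keep = length($START) - 1;
            substr($$buf_ref, 0, length($$buf_ref) - $keep, '')
                if length($$buf_ref) > $keep;
            last;
        }
        substr($$buf_ref, 0, $start, '');    # drop the leading gibberish
        my $end = index($$buf_ref, $END, length($START));
        last if $end < 0;                    # frame not complete yet
        my $payload = substr($$buf_ref, length($START), $end - length($START));
        substr($$buf_ref, 0, $end + length($END), '');
        push @messages, [ split /;/, $payload ];
    }
    return @messages;
}

# Simulate data arriving in arbitrary chunks, as it would from sysread.
my $buf = '';
my @all;
for my $chunk ('xx<STA', 'RT>a;b;c<END>junk<START>d;', 'e<END><ST') {
    $buf .= $chunk;
    push @all, extract_messages(\$buf);
}
print join(',', @$_), "\n" for @all;
```

Because the function only consumes complete frames, a chunk that ends in the middle of a marker or a payload is handled by the next call once more data has been appended.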

I solved this by using my original code and adding:

if ($data =~ /<START>>/) {
    print "\nFound start\n";
    $message .= $data;
    # keep reading until the end marker shows up
    while ($message !~ /END/) {
        $client_socket->recv($data, $message_length);
        $message .= $data;
        print "\nStill reading\n";
    }
    print "\nFound end\n"; # but may contain (part of) next START
}

I still need to handle the case where the chunk I read contains part of the next message, but I'll figure that out. Thanks for your help!
