Perl script to get certain words from a text file

I have a text file with data like the following:

#alert tcp $EXTERNAL_NET $HTTP_PORTS -> $HOME_NET any (msg:"ET ACTIVEX Microsoft Whale Intelligent Application Gateway ActiveX Buffer Overflow-1"; flow:established,to_client; file_data; **content:"8D9563A9-8D5F-459B-87F2-BA842255CB9A"**; nocase; **content:"CheckForUpdates"**; nocase; distance:0; pcre:"/<OBJECT\s+[^>]*classid\s*=\s*[\x22\x27]?\s*clsid\s*\x3a\s*\x7B?\s*8D9563A9-8D5F-459B-87F2-BA842255CB9A/si";reference:url,dev.metasploit.com/redmine/projects/framework/repository/entry/modules/exploits/windows/browser/mswhale_checkforupdates.rb; reference:url,www.kb.cert.org/vuls/id/789121; reference:url,doc.emergingthreats.net/2010562; classtype:web-application-attack; sid:2010562; rev:6; metadata:affected_product Windows_XP_Vista_7_8_10_Server_32_64_Bit, attack_target Client_Endpoint, created_at 2010_07_30, deployment Perimeter, signature_severity Major, tag ActiveX, updated_at 2016_07_01;)

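I need to extract all the words in the fields named "content" and store them in another text file. I found this code in Perl (I have no experience with it). Does it extract all the fields or only the first one?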

#!/local/bin/perl5 -w
# Description:
# Extract bit-pattern from content-part of Snort-rules.
# Choose rules that have only one content-part.
# Store distinct patterns only.
# Choose length of shortest and longest pattern to store.
$rulesdir = "/hom/geirni/www_docs/research/snort202_win32/Snort/rules";
@rulefiles = `ls $rulesdir/*.rules`;
$camfile = "camdata.txt";
#
$minLength = 4; # Bytes
$maxLength = 32;
#
# Find content-part of rules
for $rulefile(@rulefiles){
    #
    open(INFILE, "<".$rulefile) or die
        "Can't open ".$rulefile."\n";
    @rules = <INFILE>;
    close(INFILE);
    #
    for $rule(@rules){
        #
        $contentParts = 0;
        #
        if($rule =~ /content:/){
            @parts = split(/;/, $rule);
            for $part(@parts){
                if($part =~ /content:/){
                    $content = $part;
                    $contentParts++;
                    # Remove anything before content-part
                    $content =~ s/^.*content:.*?"//i;
                    # Remove anything after content-part
                    $content =~ s/"$.*//g;
                }
            }
        }
        #
        # Store content-part
        if ($contentParts == 1){
            push(@contents, $content);
        }
    }
}
#
#
#
# Convert content-strings to hex. Store only distinct patterns
for $content(@contents){
    #
    $pipe = 0; # hex patterns are limited by pipes; |00 bc 55|
    $char = ""; # Current character in content; ASCII or hex
    $pattern = ""; # Content converted to hex
    #
    # Loop through current content-string
    for ($i=0; $i<=length($content)-1; $i++){ # -1 for newline
        #
        $char = substr($content, $i, 1);
        #
        # Control over pipes
        if($char =~ /\|/){
            if(!$pipe){
                $pipe = 1;
            }
            else {
                $pipe = 0;
            }
            next; # Skip to next character
        }
        #
        # Convert to lowcase hex
        if(!$pipe){ # ASCII-value
            $pattern .= sprintf("%x", ord($char));
        }
        else { # hex-value
            $char =~ s/ //; # Remove blanks
            $pattern .= "\l$char";
        }
    }
    #
    # Store converted pattern
    if((length($pattern) >= $minLength*2) &&
       (length($pattern) <= $maxLength*2)){
        $hexPatterns{$pattern} = "dummyValue"; # Keys will be distinct
    }
}
#
#
#
# Print patterns, that have no subsets, to file
open(OUTFILE, ">".$camfile) or die
    "Can't open ".$camfile."\n";
#
@patterns = keys %hexPatterns;
$count = 0; # Count patterns that are written to file
#
HEXLOOP:
for($i=0; $i<=$#patterns; $i++){
    for($j=0; $j<=$#patterns; $j++){ # Search for subsets
        #
        next if($i==$j); # Do not compare a pattern with itself
        #
        next HEXLOOP if # Skip if subset is found
            ((length($patterns[$i]) <= length($patterns[$j])) &&
             ($patterns[$j] =~ /$patterns[$i]/));
    }
    print OUTFILE $patterns[$i]."\n";
    $count++;
}
#
close(OUTFILE);
#
#
#
# msg
print
    "\n".
    " Wrote ".$count." patterns to file: \"".$camfile."\"\n".
    "\n";

The following Perl script extracts the content data, line by line. To store the data, redirect the output to a file.

#!/usr/bin/env perl
#
# vim: ai ts=4 sw=4
use strict;
use warnings;
use feature 'say';
while( my $line = <> ) {
    my @array = $line =~ /content:"(.*?)"/g;
    say join "\t", @array;
}

Run the script as script.pl filename

Output

8D9563A9-8D5F-459B-87F2-BA842255CB9A    CheckForUpdates
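If each content value should land on its own line, and the result should be written straight to a second file instead of being redirected, the same regex can be wrapped roughly as follows. This is only a sketch: the sub name extract_contents and the two command-line arguments (input file, output file) are assumptions made for illustration, not part of the script above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: return every quoted value that follows "content:"
# in one rule line. Call it in list context to get all matches.
sub extract_contents {
    my ($line) = @_;
    return $line =~ /content:"(.*?)"/g;
}

# Assumed usage: script.pl rules.txt contents.txt
my ($in_file, $out_file) = @ARGV;

if (defined $in_file && defined $out_file) {
    open(my $in,  '<', $in_file)  or die "Can't open $in_file: $!\n";
    open(my $out, '>', $out_file) or die "Can't open $out_file: $!\n";
    while (my $line = <$in>) {
        # One content value per output line
        print {$out} "$_\n" for extract_contents($line);
    }
    close($in);
    close($out) or die "Can't close $out_file: $!\n";
}
```

Like the script above, this only matches the plain content:"..." form; a rule line without any content field contributes nothing to the output.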

"I found this code in Perl (I have no experience with it)"

Rather than using a Perl script whose comments you do not even dare to read, consider using:

grep -Po 'content:".*?"' <text >another_text

To remove content:, the quotes and the dashes, you can use:

grep -Po '(?<=content:").*?(?=")' <text | tr -d - >another_text

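@Armali I want to edit this part of the code so that it checks the same line again for any other content parts, extracts them, and prints them on separate lines: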

        #
        if($rule =~ /content:/){
            @parts = split(/;/, $rule);
            for $part(@parts){
                if($part =~ /content:/){
                    $content = $part;
                    $contentParts++;
                    # Remove anything before content-part
                    $content =~ s/^.*content:.*?"//i;
                    # Remove anything after content-part
                    $content =~ s/"$.*//g;
                }
            }
        }
        #
        # Store content-part
        if ($contentParts == 1){
            push(@contents, $content);
        }
    }
}
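One way to approach this: the excerpt keeps only the last content part it sees and stores it only when $contentParts == 1, which matches the script's own header comment ("Choose rules that have only one content-part"). A minimal sketch of the change is to push every part as it is found and drop that check. The sub name content_parts and the sample rule below are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: collect every content part of one rule,
# in order of appearance, instead of only rules with a single part.
sub content_parts {
    my ($rule) = @_;
    my @found;
    for my $part (split /;/, $rule) {
        # Capture the quoted value after "content:" (also content:!"...")
        if ($part =~ /content:.*?"(.*?)"/i) {
            push @found, $1;
        }
    }
    return @found;
}

# Made-up sample rule with two content parts
my $rule = 'alert tcp any any -> any any (msg:"demo"; '
         . 'content:"8D9563A9"; nocase; content:"CheckForUpdates"; sid:1;)';

my @contents;
push @contents, content_parts($rule);

# Print each content part on its own line
print "$_\n" for @contents;
```

For the sample rule this prints 8D9563A9 and CheckForUpdates on separate lines. Because the rule is still split on semicolons, as in the original script, a content value that itself contains a semicolon would be cut short.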
