Regular expression for matching server logs



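I need to identify server events in a log file, and I am using pattern matching to do so. My regular expression is not working. Please check whether the regex itself is wrong or whether the problem lies elsewhere.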

Sample input:

2009/12/14 11:49:20.55                  00 STARTUP  Distributed Access Infrastructure V1.1.0   
2009/12/14 11:49:20.55                  01 STARTUP    Tools Access Server initialization started   
2009/12/14 11:49:20.55 TAS#####EC05003E 00 STARTUP  Environment:    
2009/12/14 11:49:20.55 TAS#####EC05003E 01 STARTUP    Job.....DAITAS     System...EC05      ASID.....003E    
2009/12/14 11:49:20.55 TAS#####EC05003E 02 STARTUP    User....USRT001    Group....SYS1      JobNum...STC00079
2009/12/14 11:49:20.55 TAS#####EC05003E 03 STARTUP    Local...GMT-08     GMT......2009/12/14 19:49

My code is:

public void map(Object key, Text value, Context context) throws IOException , InterruptedException{
        String input=value.toString();
        String delimiter= "[\n]";
        String[] tokens=input.split(delimiter);
        String sample = null;
        Pattern pattern;
        String regex= " \\s+\\d+\\s+[a-z,A-Z]+\\s ";
        pattern=Pattern.compile(regex);


        for(int i=0;i<tokens.length;i++){
            sample=tokens[i];
            System.out.println(sample.toString());
            System.out.println("enter here");
            Matcher match=pattern.matcher(sample);
            boolean val = match.matches();
            System.out.println("the conditions" + val);
            System.out.println("enter here 2");
            if(val){
                System.out.println("the regex is found" + val);
                logEvent.set(sample);
                System.out.println("the value of logEvent is "+ logEvent);
            }
            else{
                logInformation.set(sample);
                System.out.println("the log informaTION" + logInformation);
            }
        }
        context.write(logEvent, logInformation);
}

I need to recognize the STARTUP event.

Thanks.

Try this:

try {
    Regex regexObj = new Regex(@"(?im)\s+(?<event>\d+\s+[a-z]+)\s+(?<details>[^\r\n]+)$");
    Match matchResults = regexObj.Match(subjectString);
    while (matchResults.Success) {
        for (int i = 1; i < matchResults.Groups.Count; i++) {
            Group groupObj = matchResults.Groups[i];
            if (groupObj.Success) {
                // matched text: groupObj.Value
                // match start: groupObj.Index
                // match length: groupObj.Length
            } 
        }
        matchResults = matchResults.NextMatch();
    } 
} catch (ArgumentException ex) {
    // Syntax error in the regular expression
}
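
Since the question is written in Java while the snippet above is C#, here is a rough Java port of the same pattern. It is only a sketch: the class name `LogEventExtractor` and the method `extract` are made up for illustration, and the named groups require Java 7 or later. Note that backslashes must be doubled inside a Java string literal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not part of the original question or answer.
class LogEventExtractor {
    // Same pattern as the C# answer: (?i) case-insensitive, (?m) makes $ match at line ends.
    private static final Pattern EVENT_LINE = Pattern.compile(
            "(?im)\\s+(?<event>\\d+\\s+[a-z]+)\\s+(?<details>[^\\r\\n]+)$");

    /** Returns one {event, details} pair for every line of the input that matches. */
    static List<String[]> extract(String input) {
        List<String[]> results = new ArrayList<>();
        Matcher m = EVENT_LINE.matcher(input);
        // find() scans for the next occurrence; it does not require the whole input to match.
        while (m.find()) {
            results.add(new String[] { m.group("event"), m.group("details").trim() });
        }
        return results;
    }

    public static void main(String[] args) {
        String log = "2009/12/14 11:49:20.55 TAS#####EC05003E 00 STARTUP  Environment:\n"
                   + "2009/12/14 11:49:20.55 TAS#####EC05003E 01 STARTUP    Job.....DAITAS     System...EC05\n";
        for (String[] pair : extract(log)) {
            System.out.println(pair[0] + " | " + pair[1]);
        }
    }
}
```

Inside the mapper, `extract(value.toString())` could replace the manual split on newlines, because the `(?m)` flag already lets the pattern work line by line.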

Regex explanation

@"
(?im)          # Match the remainder of the regex with the options: case insensitive (i); ^ and $ match at line breaks (m)
\s             # Match a single character that is a “whitespace character” (spaces, tabs, and line breaks)
   +              # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
(?<event>      # Match the regular expression below and capture its match into backreference with name “event”
   \d             # Match a single digit 0..9
      +              # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
   \s             # Match a single character that is a “whitespace character” (spaces, tabs, and line breaks)
      +              # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
   [a-z]          # Match a single character in the range between “a” and “z”
      +              # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
)
\s             # Match a single character that is a “whitespace character” (spaces, tabs, and line breaks)
   +              # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
(?<details>    # Match the regular expression below and capture its match into backreference with name “details”
   [^\r\n]        # Match a single character NOT present in the list below
                     # A carriage return character
                     # A line feed character
      +              # Between one and unlimited times, as many times as possible, giving back as needed (greedy)
)
$              # Assert position at the end of a line (at the end of the string or before a line break character)
"
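
Apart from the pattern itself, one likely reason the Java code in the question never reports a match is that `Matcher.matches()` succeeds only when the entire input matches the pattern, whereas `Matcher.find()` looks for the pattern anywhere in the input. The sketch below illustrates the difference on one of the sample lines. The class and method names are hypothetical, and the pattern is a simplified form of the asker's (the literal spaces at both ends and the comma inside the character class are dropped, which is an assumption about what was intended). The literal leading space in the original pattern would also demand at least two whitespace characters before the digits, which lines such as `TAS#####EC05003E 00 STARTUP` do not have.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original question or answer.
class MatchesVsFind {
    // Whitespace, digits, whitespace, letters, then one whitespace character.
    static final Pattern EVENT = Pattern.compile("\\s+\\d+\\s+[a-zA-Z]+\\s");

    /** True only if the whole line, from first to last character, is one match. */
    static boolean wholeLineMatches(String line) {
        return EVENT.matcher(line).matches();
    }

    /** True if the pattern occurs somewhere inside the line. */
    static boolean containsEvent(String line) {
        return EVENT.matcher(line).find();
    }

    public static void main(String[] args) {
        String line = "2009/12/14 11:49:20.55 TAS#####EC05003E 00 STARTUP  Environment:";
        System.out.println("matches(): " + wholeLineMatches(line)); // false: line starts with a date
        System.out.println("find():    " + containsEvent(line));    // true: " 00 STARTUP " is found
    }
}
```

Replacing `match.matches()` with `match.find()` in the mapper is therefore worth trying before changing anything else.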

Hope this helps.
