Pattern from a string

Question

I want to extract a pattern from a string, for example:

String x = "1234567 - israel.ekpo@massivelogdata.net cc55ZZ35 1789 Hello Grok";
The pattern it should generate is: "%{EMAIL:username} %{USERNAME:password} %{INT:yearOfBirth}"

Basically, I want to create patterns for the logs generated by a Java application. Any idea how to do that?

Tags: java, regex, logging

Answer


I recommend using the grok library to extract data from logs.

Example:

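import java.util.Map;

// Assumption: these are the package names of the org.aicer.grok library;
// adjust them to match the version you actually use.
import org.aicer.grok.dictionary.GrokDictionary;
import org.aicer.grok.util.Grok;

public final class GrokStage {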

  private static final void displayResults(final Map<String, String> results) {
    if (results != null) {
      for(Map.Entry<String, String> entry : results.entrySet()) {
        System.out.println(entry.getKey() + "=" + entry.getValue());
      }
    }
  }

  public static void main(String[] args) {

    final String rawDataLine1 = "1234567 - israel.ekpo@massivelogdata.net cc55ZZ35 1789 Hello Grok";

    final String expression = "%{EMAIL:username} %{USERNAME:password} %{INT:yearOfBirth}";

    final GrokDictionary dictionary = new GrokDictionary();

    // Load the built-in dictionaries
    dictionary.addBuiltInDictionaries();

    // Resolve all expressions loaded
    dictionary.bind();

    // Take a look at how many expressions have been loaded
    System.out.println("Dictionary Size: " + dictionary.getDictionarySize());

    Grok compiledPattern = dictionary.compileExpression(expression);

    displayResults(compiledPattern.extractNamedGroups(rawDataLine1));
  }
}

Output (following the "Dictionary Size" line):

username=israel.ekpo@massivelogdata.net
password=cc55ZZ35
yearOfBirth=1789

Notes:

These are the patterns used above:

  • EMAIL: %{\S+}@%{\b\w+\b}\.%{[a-zA-Z]+}
  • USERNAME: [a-zA-Z0-9._-]+
  • INT: (?:[+-]?(?:[0-9]+))
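To make it clearer what these three entries match, here is a minimal sketch that uses only java.util.regex, with no grok library involved. The class name PlainRegexSketch and the method extract are hypothetical, and the EMAIL constant is my own hand expansion of the entry above (the %{...} wrappers dropped), so it may not be exactly what the library compiles internally.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: plain-regex approximation of the grok expression
// "%{EMAIL:username} %{USERNAME:password} %{INT:yearOfBirth}".
final class PlainRegexSketch {

  // Hand-expanded versions of the dictionary entries (an assumption,
  // not necessarily the regex the grok library generates).
  private static final String EMAIL = "\\S+@\\b\\w+\\b\\.[a-zA-Z]+";
  private static final String USERNAME = "[a-zA-Z0-9._-]+";
  private static final String INT = "(?:[+-]?(?:[0-9]+))";

  // Each grok field becomes a named capturing group; fields are separated
  // by a single space, as in the original expression.
  private static final Pattern LINE = Pattern.compile(
      "(?<username>" + EMAIL + ") "
          + "(?<password>" + USERNAME + ") "
          + "(?<yearOfBirth>" + INT + ")");

  // Returns the named groups of the first match, or an empty map if none.
  static Map<String, String> extract(final String line) {
    final Map<String, String> results = new LinkedHashMap<>();
    final Matcher matcher = LINE.matcher(line);
    if (matcher.find()) {
      results.put("username", matcher.group("username"));
      results.put("password", matcher.group("password"));
      results.put("yearOfBirth", matcher.group("yearOfBirth"));
    }
    return results;
  }

  public static void main(String[] args) {
    final String rawDataLine1 =
        "1234567 - israel.ekpo@massivelogdata.net cc55ZZ35 1789 Hello Grok";
    for (Map.Entry<String, String> entry : extract(rawDataLine1).entrySet()) {
      System.out.println(entry.getKey() + "=" + entry.getValue());
    }
  }
}
```

The leading "1234567 - " is skipped because find() searches for the first position where the whole expression matches, instead of requiring a match from the start of the line. The advantage of grok over this approach is that the named sub-patterns live in a reusable dictionary, so you do not have to concatenate them by hand.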

More information about grok patterns: BuiltInDictionary.java

