java - Java regex format string
Problem description
I am new to Java and regular expressions. I am trying to extract data from the following string:
JAMMURI YA KENYA ¢)a(s REPUBLIC OF KENYA
sennc wnecs: 23085129 aShl e 31662252
FULL NAMES
JUDITH AWINO OWITI
DATE OF BIRTH
25. 10. 1992
SEX
FEMALE
DISTRICT OF BIRTH
MUHORONI
PLACE OF ISSUE
. NYANDO
DATE OF ISSUE
' 16.04. 2013 A
j HOLDER'S SIGN oo - Ve
Where: 23085129 = IDNumber, 31662252 = SerialNumber, JUDITH AWINO OWITI = Name, 25. 10. 1992 = DateOfBirth
Here is my code:
static HashMap<String, String> interpretText(String ocrText) {
    HashMap<String, String> result = new HashMap<String, String>();
    result.put("Text", ocrText);
    for (String text : ocrText.split("\\r?\\n")) {
        // Get ID only
        String pattern = "\\s+([0-9]{9})";
        Pattern r = Pattern.compile(pattern);
        Matcher m = r.matcher(text);
        if (m.find()) {
            System.out.println("IDNumber: " + m.group(0) );
            result.put("IDNumber", m.group(0));
        }
        // Get all other numbers
        pattern = "(\\d+)(.*)";
        r = Pattern.compile(pattern);
        m = r.matcher(text);
        if (m.find()) {
            System.out.println("Found value: " + m.group() );
        }
    }
    return result;
}
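My challenge is how to extract the key data items above from the string. My regex implementation does not work. 23085129 = IDNumber, 31662252 = SerialNumber, JUDITH AWINO OWITI = Name, 25. 10. 1992 = DateOfBirth
What is the correct regex implementation inside the loop?
Solution
Try this expression: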
(?:.*)\s(\d+)[\w\s]+\s(\d+)\WFULL NAMES\W([\w\s]+)\WDATE OF BIRTH\W([\d.\s]+)
You can see how it works here: https://regexr.com/51f5v
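The expression above can also be read piece by piece. The following is only a sketch: the class name PatternBreakdown, the constant ID_CARD, and the shortened sample text are made up for illustration, and the per-piece comments are my reading of the pattern, not the answerer's own explanation. The concatenated string is the same expression as above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class: the same expression, split into commented pieces.
class PatternBreakdown {
    static final Pattern ID_CARD = Pattern.compile(
          "(?:.*)\\s"            // any text on the same line, then one whitespace character
        + "(\\d+)"               // group 1: first run of digits (IDNumber)
        + "[\\w\\s]+\\s"         // word characters and whitespace between the two numbers
        + "(\\d+)"               // group 2: second run of digits (SerialNumber)
        + "\\WFULL NAMES\\W"     // the label, with one non-word character (a newline here) on each side
        + "([\\w\\s]+)"          // group 3: the name
        + "\\WDATE OF BIRTH\\W"  // the next label, again between two non-word characters
        + "([\\d.\\s]+)");       // group 4: digits, dots and whitespace (may end with a newline)

    public static void main(String[] args) {
        // Shortened sample, assumed to use "\n" line endings.
        String sample = "sennc wnecs: 23085129 aShl e 31662252\n"
                + "FULL NAMES\n"
                + "JUDITH AWINO OWITI\n"
                + "DATE OF BIRTH\n"
                + "25. 10. 1992\n"
                + "SEX\n";
        Matcher m = ID_CARD.matcher(sample);
        if (m.find()) {
            System.out.println("IDNumber: " + m.group(1));
            System.out.println("SerialNumber: " + m.group(2));
            System.out.println("Name: " + m.group(3));
            // trim() drops the newline that group 4 can pick up through \s
            System.out.println("DateOfBirth: " + m.group(4).trim());
        }
    }
}
```

Note that the expression spans several lines of the input, so it has to be applied to the whole text rather than to one line at a time.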
Java example:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// The OCR text from the question, with "\n" line endings
String data = "JAMMURI YA KENYA ¢)a(s REPUBLIC OF KENYA\n" +
        "sennc wnecs: 23085129 aShl e 31662252\n" +
        "FULL NAMES\n" +
        "JUDITH AWINO OWITI\n" +
        "DATE OF BIRTH\n" +
        "25. 10. 1992\n" +
        "SEX\n" +
        "FEMALE\n" +
        "DISTRICT OF BIRTH\n" +
        "MUHORONI\n" +
        "PLACE OF ISSUE\n" +
        ". NYANDO\n" +
        "DATE OF ISSUE\n" +
        "' 16.04. 2013 A\n" +
        "j HOLDER'S SIGN oo - Ve";
// Match against the whole text, not line by line: the expression spans several lines
Pattern pattern = Pattern.compile("(?:.*)\\s(\\d+)[\\w\\s]+\\s(\\d+)\\WFULL NAMES\\W([\\w\\s]+)\\WDATE OF BIRTH\\W([\\d.\\s]+)");
Matcher matcher = pattern.matcher(data);
if (matcher.find()) {
    System.out.println("IDNumber: " + matcher.group(1));
    System.out.println("SerialNumber: " + matcher.group(2));
    System.out.println("Name: " + matcher.group(3));
    // Group 4 also captures the trailing newline via \s, so trim it
    System.out.println("DateOfBirth: " + matcher.group(4).trim());
}
Output:
IDNumber: 23085129
SerialNumber: 31662252
Name: JUDITH AWINO OWITI
DateOfBirth: 25. 10. 1992
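To tie this back to the interpretText method from the question, here is a minimal sketch that applies the same expression to the whole OCR text in one pass instead of looping over lines. The class name IdCardParser and the constant FIELDS are made up for illustration. The sketch also assumes the OCR output may use Windows line endings, so it converts "\r\n" to "\n" first: each \W in the expression matches only a single character and would not get past a two-character line break.

```java
import java.util.HashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class around the method from the question.
class IdCardParser {
    // Same expression as in the answer, compiled once and reused.
    private static final Pattern FIELDS = Pattern.compile(
            "(?:.*)\\s(\\d+)[\\w\\s]+\\s(\\d+)\\WFULL NAMES\\W([\\w\\s]+)\\WDATE OF BIRTH\\W([\\d.\\s]+)");

    static HashMap<String, String> interpretText(String ocrText) {
        HashMap<String, String> result = new HashMap<>();
        result.put("Text", ocrText);

        // Assumption: line endings may be "\r\n"; normalize them to "\n" before matching.
        String normalized = ocrText.replace("\r\n", "\n");

        // One find() over the whole text; no per-line loop is needed.
        Matcher m = FIELDS.matcher(normalized);
        if (m.find()) {
            result.put("IDNumber", m.group(1));
            result.put("SerialNumber", m.group(2));
            result.put("Name", m.group(3).trim());
            // Group 4 can include a trailing newline, so trim it.
            result.put("DateOfBirth", m.group(4).trim());
        }
        return result;
    }

    public static void main(String[] args) {
        String ocrText = "sennc wnecs: 23085129 aShl e 31662252\n"
                + "FULL NAMES\n"
                + "JUDITH AWINO OWITI\n"
                + "DATE OF BIRTH\n"
                + "25. 10. 1992\n"
                + "SEX\n"
                + "FEMALE";
        HashMap<String, String> fields = interpretText(ocrText);
        for (String key : new String[] {"IDNumber", "SerialNumber", "Name", "DateOfBirth"}) {
            System.out.println(key + ": " + fields.get(key));
        }
    }
}
```

If the text does not match, the map contains only the "Text" entry, so callers should check for missing keys before using the values.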