Regex Expression on IDs of two lengths

Problem description

I'm using regex on a large block of text that contains several IDs I am trying to extract. Here are some examples:

476iDD5100A9E110A2FA
155i6F1388BE08C6940D
3155i6F1388BE08C6940D

"i" is always the 4th or 5th character. The strings are 20 characters long if the 4th character is an "i" and 21 characters long if the 5th character is an "i". 16 characters always follow the "i".

Here is how it looks in total in the line of text:

id="833i8E8BBB9BB1DA748D" size="large" sourcetype="new"

I wrote the following expression in .NET:

([0-9]{3,4}[i][0-Z]{16})+

It works well with the 20-character IDs, but the 21-character IDs lose their first digit and come back as 20 characters. How do I modify my expression to capture both the 20- and 21-character versions of these IDs?

Tags: .net, regex

Solution


You may try the regex below:

\b\d{3,4}i[0-9A-Za-z]{16}\b

Explanation of the above regex:

\b - Represents a word boundary.

\d{3,4} - Matches 3 or 4 digits.

i - Matches i literally.

[0-9A-Za-z]{16} - Matches exactly 16 ASCII letters or digits.
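
To check the pattern against both ID lengths, here is a minimal sketch in Python rather than .NET; the pattern uses only syntax that both regex engines share. The sample text is an assumption: it reuses the attribute layout and IDs quoted in the question, and the names ID_PATTERN and text are illustrative only.

```python
import re

# Pattern from the answer: 3 or 4 digits, a literal "i", then 16 ASCII
# letters or digits, with a word boundary on each side.
ID_PATTERN = re.compile(r"\b\d{3,4}i[0-9A-Za-z]{16}\b")

# Illustrative input assembled from the IDs shown in the question.
text = (
    'id="833i8E8BBB9BB1DA748D" size="large" sourcetype="new"\n'
    'id="3155i6F1388BE08C6940D" size="large" sourcetype="new"\n'
)

# findall returns the full match for each ID because the pattern has no groups.
ids = ID_PATTERN.findall(text)
for found in ids:
    print(len(found), found)
```

With this input, the first ID comes back with all 20 characters and the second with all 21, because the leading \b stops a match from starting in the middle of the digit run.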


