regex - A regex to get any price string
Problem description
I need to get the price from a string, but no other numbers. There are no restrictions on what the string can say, but it will always have a dollar amount in it. It's the dollar amount I need to get from the string.
The closest solution I've been able to find is \d{1,3}[,\\.]?(\\d{1,2})?
On an example string like "2 BED / 2 BATH for $120,000.00, what a deal!!!", the regex should only return $120,000.00 and no other numbers. The solution above will return 2, 2, and 120,000.00. An ideal solution should NOT match any digits outside of the dollar amount. It also needs to include the symbol immediately before the number, to account for all possible currency symbols (USD, GBP, EUR, etc.).
So, the price matched by the regex should look like $120,000.00, but it could also be something like €40,000.
Solution
If you want to match all currency symbols before a number with the number itself, you may combine the two expressions:
- Currency symbol regex:
\b(?:[BS]/\.|R(?:D?\$|p))|\b(?:[TN]T|[CJZ])\$|Дин\.|\b(?:Bs|Ft|Gs|K[Mč]|Lek|B[Zr]|k[nr]|[PQLSR]|лв|ден|RM|MT|lei|zł|USD|GBP|EUR|JPY|CHF|SEK|DKK|NOK|SGD|HKD|AUD|TWD|NZD|CNY|KRW|INR|CAD|VEF|EGP|THB|IDR|PKR|MYR|PHP|MXN|VND|CZK|HUF|PLN|TRY|ZAR|ILS|ARS|CLP|BRL|RUB|QAR|AED|COP|PEN|CNH|KWD|SAR)\b|\$[Ub]|[\p{Sc}ƒ]
- Number regex:
(?<!\d)(?<!\d\.)(?:\d{1,3}(?:,\d{3})*|\d+)(?:\.\d{1,2})?(?!\.?\d)
The currencies are taken from World Currency Symbols. The 3-letter currency codes used in the pattern are the most commonly used ones, but a comprehensive list can be compiled from the same data.
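To see why the number regex is not enough on its own, here is a minimal sketch that runs it against the example string from the question. Python is an assumption here (the question does not name a language), and the variable names are only illustrative.

```python
import re

# Number pattern from the answer, with its lookbehinds and lookahead intact.
NUM_REGEX = r"(?<!\d)(?<!\d\.)(?:\d{1,3}(?:,\d{3})*|\d+)(?:\.\d{1,2})?(?!\.?\d)"

text = "2 BED / 2 BATH for $120,000.00, what a deal!!!"

# The pattern has no capturing groups, so findall returns the full matches.
numbers = re.findall(NUM_REGEX, text)
print(numbers)  # ['2', '2', '120,000.00']
```

Every standalone number is matched, including the bedroom and bathroom counts, which is why the currency symbol has to be required in front of the number.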
The combined pattern is:
(?:\b(?:[BS]/\.|R(?:D?\$|p))|\b(?:[TN]T|[CJZ])\$|Дин\.|\b(?:Bs|Ft|Gs|K[Mč]|Lek|B[Zr]|k[nr]|[PQLSR]|лв|ден|RM|MT|lei|zł|USD|GBP|EUR|JPY|CHF|SEK|DKK|NOK|SGD|HKD|AUD|TWD|NZD|CNY|KRW|INR|CAD|VEF|EGP|THB|IDR|PKR|MYR|PHP|MXN|VND|CZK|HUF|PLN|TRY|ZAR|ILS|ARS|CLP|BRL|RUB|QAR|AED|COP|PEN|CNH|KWD|SAR)|\$[Ub]|[\p{Sc}ƒ])\s?(?:\d{1,3}(?:,\d{3})*|\d+)(?:\.\d{1,2})?(?!\.?\d)
See the regex demo
It is created like this: (?:CUR_SYM_REGEX)\s?NUM_REGEX, with the lookbehinds in the number regex stripped from the pattern since the left-hand context is already defined.
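The construction can be sketched in Python as follows. Two things here are assumptions of this example rather than part of the answer: the currency alternation is abbreviated to three codes to keep the sketch readable, and because Python's built-in re module does not support \p{Sc}, an equivalent character class is built from the Unicode database. The names CURRENCY_CHARS, PRICE_REGEX and find_prices are hypothetical.

```python
import re
import sys
import unicodedata

# Stand-in for \p{Sc}: every code point in the Unicode "Sc" category
# (Symbol, currency). re.escape protects "$" inside the character class.
CURRENCY_CHARS = "".join(
    re.escape(chr(cp))
    for cp in range(sys.maxunicode + 1)
    if unicodedata.category(chr(cp)) == "Sc"
)

# Abbreviated currency part: three 3-letter codes, or any currency symbol.
CUR_SYM_REGEX = rf"\b(?:USD|GBP|EUR)|[{CURRENCY_CHARS}ƒ]"

# Number part with the two lookbehinds removed; the currency prefix
# already fixes what may appear to the left of the digits.
NUM_REGEX = r"(?:\d{1,3}(?:,\d{3})*|\d+)(?:\.\d{1,2})?(?!\.?\d)"

# (?:CUR_SYM_REGEX)\s?NUM_REGEX
PRICE_REGEX = re.compile(rf"(?:{CUR_SYM_REGEX})\s?{NUM_REGEX}")


def find_prices(text):
    """Return every currency-prefixed amount found in text."""
    return [m.group(0) for m in PRICE_REGEX.finditer(text)]


print(find_prices("2 BED / 2 BATH for $120,000.00, what a deal!!!"))
# ['$120,000.00']
print(find_prices("now €40,000 or USD 99.5"))
# ['€40,000', 'USD 99.5']
```

In an engine that supports \p{Sc} directly (for example Java, .NET, PCRE, or the third-party regex module for Python), the full pattern from the answer can be used as written instead of the generated character class.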