How do I construct a regular expression to check for a valid path?

Problem description

I am trying to build a regular expression for the path below. Can you help me build it?

path  = "root-dir/document/2018/02/29/14/02-sample.txt"
pattern = '((a-z)+(-)(a-z)+/(a-z)+/(\d{4})/(\d{2})/(\d{2})/(\d{2})/(\d{2)(-)(a-z)+(.)(a-z)+)'
bool(re.match(pattern,path))

"root-dir/document/2015/01/25/13/01-sample.txt" //this should be accepted
"root-dir/2015/01/25/13/01-sample.txt" //this should not be accepted
"root-dir/document/201/01/25/13/01-sample.txt" //this should not be accepted as 201 part should be 4 digit
"root-dir/document/2015/01/2/13/01-sample.txt" //this should be not accepted as 2 part should be 2 digit
"root-dir/document/2015/01/25/13/sample.txt" //this should not be accepted as the last part should be something like this 03-sample.txt

Tags: python, regex

Solution


Your pattern is almost right. The corrected version is:

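import re
path = "root-dir/document/2018/02/29/14/02-sample.txt"
# Character classes use [ ], the {2} quantifier is closed with }, and the literal dot is escaped.
pattern = r'([a-z]+(-)[a-z]+/[a-z]+/(\d{4})/(\d{2})/(\d{2})/(\d{2})/(\d{2})(-)[a-z]+(\.)[a-z]+)'
print(bool(re.match(pattern, path)))  # True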
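
As a quick check against the sample paths from the question, here is a minimal sketch. The helper name `is_valid_path` is hypothetical, the capture groups are dropped because only a yes/no answer is needed, and `re.fullmatch` is used instead of `re.match` (an assumption beyond the answer above) so that extra characters after the extension are rejected as well.

```python
import re

# Same structure as the corrected pattern, without capture groups.
PATH_RE = re.compile(
    r'[a-z]+-[a-z]+/[a-z]+/\d{4}/\d{2}/\d{2}/\d{2}/\d{2}-[a-z]+\.[a-z]+'
)

def is_valid_path(path):
    """Return True if the whole string has the expected path layout."""
    # fullmatch requires the pattern to cover the entire string,
    # whereas match only anchors at the start.
    return PATH_RE.fullmatch(path) is not None

# Sample paths taken from the question.
samples = [
    "root-dir/document/2015/01/25/13/01-sample.txt",  # should be accepted
    "root-dir/2015/01/25/13/01-sample.txt",           # missing directory
    "root-dir/document/201/01/25/13/01-sample.txt",   # year has 3 digits
    "root-dir/document/2015/01/2/13/01-sample.txt",   # day has 1 digit
    "root-dir/document/2015/01/25/13/sample.txt",     # no NN- prefix
]

for sample in samples:
    print(sample, is_valid_path(sample))
```

This checks the layout only. It does not verify that the digits form a real date, so a path containing 2018/02/29 still passes.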

Your pattern '((a-z)+(-)(a-z)+/(a-z)+/(\d{4})/(\d{2})/(\d{2})/(\d{2})/(\d{2)(-)(a-z)+(.)(a-z)+)' does not work because:

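  • \d{2) - the { must be closed with }, not with ), so this should be \d{2}
  • a-z - this means "any lowercase letter" only inside square brackets [ ], not inside parentheses ( )
  • . versus \. - if you mean the literal character . (ASCII 46), you should write \. because in re an unescaped . matches any character except a newline. Also remember to use a raw string rather than a normal string.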
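
The second and third points can be seen in isolation with a small sketch. The test strings and variable names here are made up for illustration and do not come from the question.

```python
import re

# (a-z) is a group that matches the literal text "a-z";
# [a-z] is a character class that matches one lowercase letter.
group_on_word = re.fullmatch(r'(a-z)+', 'root')    # no match
group_on_literal = re.fullmatch(r'(a-z)+', 'a-z')  # matches the literal text
class_on_word = re.fullmatch(r'[a-z]+', 'root')    # matches

# An unescaped . matches any character except a newline;
# \. matches only a literal dot.
any_char = re.fullmatch(r'sample.txt', 'sampleXtxt')      # matches
literal_dot = re.fullmatch(r'sample\.txt', 'sampleXtxt')  # no match
real_dot = re.fullmatch(r'sample\.txt', 'sample.txt')     # matches

for name, result in [
    ('(a-z)+ on "root"', group_on_word),
    ('(a-z)+ on "a-z"', group_on_literal),
    ('[a-z]+ on "root"', class_on_word),
    ('sample.txt on "sampleXtxt"', any_char),
    ('sample\\.txt on "sampleXtxt"', literal_dot),
    ('sample\\.txt on "sample.txt"', real_dot),
]:
    print(name, bool(result))
```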
