Regex/R: extracting a string from a path with a version number

Problem description

I have a column of strings like this:

col = c("/abc/def/hdk/database/dbclient/ibm/DB2Client-V97FP02.v01/sqllib/lib64",
        "/abc/def/hdk/database/dbclient/ibm/DB2Client-V97FP02.v01/sqllib/misc", 
        "azn/external/curl-7.52.1/linux_g44.exe",
        "store/software/ep/rpg/external/python27-2.7.1/lib")

I want to extract the path component that carries a version number. I expect the following result:

result = c("DB2Client-V97FP02.v01","DB2Client-V97FP02.v01", "curl-7.52.1", "python27-2.7.1")

Using the regex "\\d+(\\.\\d+)" I can extract only a standard version number, but I don't know how to solve this problem.

Thanks

Tags: r, regex

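Solution

I suggest matching one or more non-/ characters, then a hyphen, then an optional group of 1+ word characters followed by .v, then 1+ digits, then zero or more repetitions of . followed by 1+ digits: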

regmatches(col, regexpr("[^/]+-(?:\\w+\\.v)?\\d+(?:\\.\\d+)*", col, perl=TRUE))
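To make the pattern easier to read, here is a minimal sketch that assembles the same regex from commented pieces. The names pattern and paths are illustrative only and are not part of the original answer.

```r
# Build the regex piece by piece; the resulting string is identical to the
# one-line pattern used in the answer.
pattern <- paste0(
  "[^/]+",          # one or more characters that are not "/"
  "-",              # a literal hyphen
  "(?:\\w+\\.v)?",  # optional group: word characters followed by ".v"
  "\\d+",           # one or more digits
  "(?:\\.\\d+)*"    # zero or more groups of "." followed by digits
)

# Two of the paths from the question, used here as sample input.
paths <- c("/abc/def/hdk/database/dbclient/ibm/DB2Client-V97FP02.v01/sqllib/lib64",
           "store/software/ep/rpg/external/python27-2.7.1/lib")

res <- regmatches(paths, regexpr(pattern, paths, perl = TRUE))
print(res)
## => [1] "DB2Client-V97FP02.v01" "python27-2.7.1"
```

Because regexpr() returns only the first match per string, earlier path components without a hyphen and version (such as abc or dbclient) are skipped.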

R demo

col <- c("/abc/def/hdk/database/dbclient/ibm/DB2Client-V97FP02.v01/sqllib/lib64", "/abc/def/hdk/database/dbclient/ibm/DB2Client-V97FP02.v01/sqllib/misc", "azn/external/curl-7.52.1/linux_g44.exe", "store/software/ep/rpg/external/python27-2.7.1/lib")
regmatches(col, regexpr("[^/]+-(?:\\w+\\.v)?\\d+(?:\\.\\d+)*", col, perl=TRUE))
## => [1] "DB2Client-V97FP02.v01" "DB2Client-V97FP02.v01" "curl-7.52.1"           "python27-2.7.1" 
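One thing to keep in mind: regmatches() combined with regexpr() drops elements that have no match, so the result can be shorter than the input. If the output has to line up with the original column, a sketch like the following may help. The helper name extract_versioned and the sample vector are hypothetical and not from the original answer.

```r
# Hypothetical helper: returns one value per input path,
# with NA where the pattern does not match.
extract_versioned <- function(paths) {
  pattern <- "[^/]+-(?:\\w+\\.v)?\\d+(?:\\.\\d+)*"
  m <- regexpr(pattern, paths, perl = TRUE)
  hit <- !is.na(m) & m != -1            # TRUE where a match was found
  out <- rep(NA_character_, length(paths))
  out[hit] <- regmatches(paths, m)      # regmatches() returns matches only
  out
}

# Made-up sample: the second path has no versioned component.
sample_paths <- c("azn/external/curl-7.52.1/linux_g44.exe", "no/version/here")
aligned <- extract_versioned(sample_paths)
print(aligned)
## => [1] "curl-7.52.1" NA
```

This keeps the result the same length as the input, which matters when assigning it back to a data frame column.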
