Extract only the first match from a Word document using RegEx from Excel VBA

Question

I have a document like this. I process 20 such documents every day, and they all look the same (that is, the structure is very consistent).

[Image: screenshot of a sample document]

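The goal of this macro is to extract only the first match from .ActiveDocument.Content. There are more matches throughout the document, but I need only the first one. The document being processed will be opened manually before the macro runs.

I am just a VBA beginner, so I would appreciate it if this could be written without arrays, collections, or dictionaries. There is only one item to extract, so ideally it would be loaded into the repNmbr string variable and then written out from there with ws.Range("G30").Value = repNmbr. The simpler the better.

I used this resource, Excel Regex Tutorial (Regular Expressions), which was very helpful, but I still don't know how to load the FIRST MATCH alone into my repNmbr string variable. I would like to do this without any loop, because I only want to load a single string into repNmbr.

Currently I have this code: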

Sub ExtractRepertor03()
    'Application.ScreenUpdating = False
    Dim WordApp As Word.Application
    Dim WordDoc As Word.Document
    Dim ExcelApp As Excel.Application
    Dim rng As Word.Range
    Dim ws As Worksheet
    Dim regEx As Object
    Dim matches As MatchCollection
    Dim match As String
    Dim repNmbr As String

    'Assigning object variables
    Set WordApp = GetObject(, "Word.Application")      'ActiveX can't create object is when
    Set ExcelApp = GetObject(, "Excel.Application")    'there is no Word document open;
    Set regEx = CreateObject("VBScript.RegExp")
    Set WordDoc = WordApp.ActiveDocument
    Set rng = WordApp.ActiveDocument.Content

    'Create the regular expression object
    regEx.Global = False    'because I need only the first match instead of all occurrences;
    regEx.IgnoreCase = True
    regEx.Pattern = "([0-9]{1,5})([ ]{0,4})([/])([0-9]{4})"
    'regEx.Pattern = "([0-9]{1,5})([\s]{0,4})(/[0-9]{4})"

    repNmbr = regEx.Execute(rng.text)      'here is something wrong but I don't know what;
                            'I'm trying to assign the first RegEx match to repNmbr variable;
    Debug.Print repNmbr
    repNmbr = Replace(repNmbr, " ", "") 
'    Set matches = regEx.Execute(rng.text)
'    Debug.Print regEx.Test(rng)
'    'Debug.Print regEx.Value
'        For Each match In matches    'I just want this macro run without the loop
'           Debug.Print match.Value   'Result: 9042 /2019
'           repNmbr = match.Value
'        Next match

    ExcelApp.Application.Visible = True
    ws.Range("G30").Value = repNmbr
End Sub

And this error:

[Image: screenshot of the error message]

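Can someone explain to me why Set matches = regEx.Execute(rng.text) works fine, but repNmbr = regEx.Execute(rng.text) returns the error "Wrong number of arguments or invalid property assignment"?

With regEx.Global = False set, RegEx finds only one value, so why does VBA refuse to assign this string to the repNmbr string variable?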

Tags: regex, vba, ms-word

Answer

As I said in your other question, you don't need the RegEx library. Stick with Word's wildcards! Try:

Sub Demo()
    Application.ScreenUpdating = False
    Dim WordApp As Word.Application
    'Attach to the Word instance that is already running
    Set WordApp = GetObject(, "Word.Application")
    With WordApp.ActiveDocument.Range
        With .Find
            'Wildcard pattern: 1 to 7 digits or spaces, a slash, then 4 digits
            .Text = "<[0-9 ]{1,7}/[0-9]{4}>"
            .MatchWildcards = True
            .Wrap = wdFindStop
            .Forward = True
            .Execute
        End With
        'After a successful Find, the range is redefined to the first match
        If .Find.Found = True Then ActiveSheet.Range("G30").Value = Replace(.Text, " ", "")
    End With
    Application.ScreenUpdating = True
End Sub

Note: I haven't bothered with any of:

Dim ExcelApp As Excel.Application
Dim rng As Word.Range
Dim ws As Worksheet
Dim regEx As Object
Dim matches As MatchCollection
Dim match As String
Dim repNmbr As String

because they are all superfluous; even your own code never assigns anything to ws.
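
As for why the original assignment fails: regEx.Execute returns a MatchCollection object even when Global = False, so it has to be assigned with Set. Assigning it directly to a String makes VBA try to evaluate the collection's default Item property without an index, which is where "Wrong number of arguments or invalid property assignment" comes from. If you would still rather use RegEx, a minimal late-bound sketch without a loop might look like the following. The Sub name ExtractFirstMatch is made up for illustration, the pattern is the question's pattern with the capture groups removed, and the code assumes it runs from Excel with the Word document already open:

```vba
Sub ExtractFirstMatch()    'hypothetical name, for illustration only
    Dim WordApp As Object
    Dim regEx As Object
    Dim matches As Object
    Dim repNmbr As String

    'Attach to the Word instance that is already running
    Set WordApp = GetObject(, "Word.Application")

    Set regEx = CreateObject("VBScript.RegExp")
    regEx.Global = False                          'stop after the first match
    regEx.Pattern = "[0-9]{1,5} {0,4}/[0-9]{4}"   'same pattern as in the question, without groups

    'Execute returns a collection object, so Set is required
    Set matches = regEx.Execute(WordApp.ActiveDocument.Content.Text)

    'Index 0 is the first (here, the only) match, so no loop is needed
    If matches.Count > 0 Then
        repNmbr = Replace(matches(0).Value, " ", "")
        ActiveSheet.Range("G30").Value = repNmbr
    End If
End Sub
```

The matches.Count check keeps the macro from raising an error when the document contains no match at all.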

