(Regex) MatchCollection without a For Each loop

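Question

Good day. I am currently building a scraper and have a question. I have already added threading and so on to make the code run faster, but it still runs too slowly for me.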

Public Sub ScrapeProxyDo(address As String)
    Dim wc As New Net.WebClient
    Dim matchCollection As MatchCollection
    Try
        Dim input As String = wc.DownloadString(address)
        matchCollection = REGEX.Matches(input)
        'nothing
        For Each obj As Object In matchCollection
            Dim match As Match = CType(obj, Match)
            Dim item As String = match.ToString()
            RichTextBox2.AppendText(item & Environment.NewLine)
        Next
    Catch ex As Exception
        'Nothing (the exception is swallowed)
    End Try
End Sub

The code is fairly simple: it checks whether the page contains IP addresses with ports. The regex for that is: Dim REGEX As Regex = New Regex("\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\:[0-9]{1,5}\b")

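Right now, however, it takes the matches out of the downloaded string one proxy at a time and appends each one to the RichTextBox separately, which of course takes time. Can this be changed so that it filters out all the proxies and inserts them into the RichTextBox in one go? That would be much faster than slowly working through them one by one.

Regards

Tags: regex, vb.net, richtextbox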

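Answer

As @Heinzi suggested in the comments, use a StringBuilder. Strings are immutable; a StringBuilder is mutable. The StringBuilder saves us from throwing away a String and creating a new one on every iteration. Use the .Value property of the Match. Do not use As Object unless you absolutely have to.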

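' Requires: Imports System.Text and Imports System.Text.RegularExpressions
Public Sub ScrapeProxyDo(address As String)
    Dim REGEX As Regex = New Regex("\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\:[0-9]{1,5}\b")
    Dim input As String
    ' WebClient is IDisposable, so release it as soon as the download finishes
    Using wc As New Net.WebClient
        input = wc.DownloadString(address)
    End Using
    Dim matchCollection = REGEX.Matches(input)
    Dim sb As New StringBuilder
    ' Typing the loop variable as Match avoids the As Object / CType round trip
    For Each m As Match In matchCollection
        sb.AppendLine(m.Value)
    Next
    'Assuming this is on the UI thread
    ' One assignment instead of one AppendText call per match
    RichTextBox2.Text = sb.ToString
End Sub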
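
Since the title asks for a version without a For Each loop, here is a minimal sketch of one way that might look, using LINQ and String.Join instead of a StringBuilder. The module name ProxyExtractor, the helper ExtractProxies, and the sample string are assumptions made for illustration and are not part of the original post; the pattern is the one from the question. It is written as a console program so it can be tried without a form.

```vbnet
Imports System
Imports System.Linq
Imports System.Text.RegularExpressions

Module ProxyExtractor
    ' Built once and reused across calls; the pattern is the one from the question.
    Private ReadOnly ProxyRegex As New Regex(
        "\b(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\:[0-9]{1,5}\b",
        RegexOptions.Compiled)

    ' Hypothetical helper (not in the original post):
    ' returns all matches as one string, one match per line.
    Public Function ExtractProxies(input As String) As String
        ' Cast(Of Match) turns the MatchCollection into a typed sequence for LINQ.
        Dim values = ProxyRegex.Matches(input).Cast(Of Match)().Select(Function(m) m.Value)
        Return String.Join(Environment.NewLine, values)
    End Function

    Sub Main()
        ' Made-up sample text containing two ip:port pairs.
        Dim sample As String = "a 10.0.0.1:8080 b 192.168.1.20:3128 c no-proxy-here"
        Console.WriteLine(ExtractProxies(sample))
    End Sub
End Module
```

In the form, the result could then be assigned in a single step, for example RichTextBox2.Text = ExtractProxies(input). If the download runs on a worker thread, as the question suggests, that assignment would need to be marshaled to the UI thread, for instance through Control.Invoke. Whether this is faster than the StringBuilder loop is not something the original answer claims; the main saving in both versions is updating the RichTextBox once rather than once per match.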
