Finding a folder with an approximate name

Problem description

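I currently have a routine that searches the directory where a file is saved and finds a folder named "$Fabrication Data". I am working on a new addition, to be swapped into my existing code, that allows for some human error, i.e. a slight misspelling or misformatting of that folder name. I want to check every folder directly inside the "Path" directory (but not its subfolders). Currently it returns a match for Path\SubFolder\$Fabrication Data rather than the path I want: Path\$Fabrication Data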

Bonus question: I currently return any folder with a match above 0.8. If more than one folder scores above 0.8, how do I return the closest match?


Dim Path As String = "N:\Stuff\More Stuff\More More Stuff\Project Folder"

For Each d In System.IO.Directory.GetDirectories(Path)
    For Each sDir In System.IO.Directory.GetDirectories(d)
        Dim sdirInfo As New System.IO.DirectoryInfo(sDir)
        Dim similarity As Single = GetSimilarity(sdirInfo.Name, "$Fabrication Data")
        If similarity > 0.8 Then
            sFDPath = Path & "\" & sdirInfo.Name
            MsgBox(sFDPath)
        End If
    Next
Next
End Sub

Public Function GetSimilarity(string1 As String, string2 As String) As Single
    Dim dis As Single = ComputeDistance(string1, string2)
    Dim maxLen As Single = string1.Length
    If maxLen < string2.Length Then
        maxLen = string2.Length
    End If
    If maxLen = 0.0F Then
        Return 1.0F
    Else
        Return 1.0F - dis / maxLen
    End If
End Function

Private Function ComputeDistance(s As String, t As String) As Integer
    Dim n As Integer = s.Length
    Dim m As Integer = t.Length
    Dim distance As Integer(,) = New Integer(n, m) {}
    ' matrix
    Dim cost As Integer = 0
    If n = 0 Then
        Return m
    End If
    If m = 0 Then
        Return n
    End If
    ' Initialize the first column and the first row of the matrix.
    For i As Integer = 0 To n
        distance(i, 0) = i
    Next
    For j As Integer = 0 To m
        distance(0, j) = j
    Next

    ' Fill in the matrix to find the minimum edit distance.
    For i As Integer = 1 To n
        For j As Integer = 1 To m
            cost = If(t.Substring(j - 1, 1) = s.Substring(i - 1, 1), 0, 1)
            distance(i, j) = Math.Min(distance(i - 1, j) + 1, Math.Min(distance(i, j - 1) + 1, distance(i - 1, j - 1) + cost))
        Next
    Next
    Return distance(n, m)
End Function

Tags: vb.net

Solution


You can keep track of each folder's rating with a simple class like this:

Public Class FolderRating

    Public Rating As Single
    Public Folder As String

    Public Sub New(folder As String, rating As Single)
        Me.Folder = folder
        Me.Rating = rating
    End Sub

End Class

Then, create a list:

Dim ratings As New List(Of FolderRating)

In your loop, when you find a rating above 0.8, add it to the list:

If similarity > 0.8 Then
    Dim sFDPath As String = Path & "\" & sdirInfo.Name
    ratings.Add(New FolderRating(sFDPath, similarity))
End If

Finally, sort the list by rating, in ascending order:

ratings.Sort(Function(x, y) x.Rating.CompareTo(y.Rating))

Then you can take the last value in the list, which will be your most similar folder (if any):

Dim bestMatch As FolderRating = ratings.LastOrDefault
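
The steps above pick the closest match, but the question's loop calls GetDirectories twice, so it scores the folders one level too deep. The block below is a minimal sketch of one way to handle both points together, assuming a single GetDirectories call over the root folder is what is wanted. It tracks the best score inside the loop instead of sorting a list, which is a swapped-in alternative to the list approach, not a replacement for it. The names FolderMatcher, FindBestFolder, rootPath, targetName and threshold are hypothetical and do not come from the original post; the similarity functions are compact restatements of the ones in the question.

```vbnet
Imports System
Imports System.IO

Module FolderMatcher

    ' Hypothetical helper: returns the full path of the immediate subfolder of
    ' rootPath whose name is most similar to targetName, or Nothing when no
    ' folder scores strictly above threshold.
    Public Function FindBestFolder(rootPath As String, targetName As String, threshold As Single) As String
        Dim bestPath As String = Nothing
        Dim bestScore As Single = threshold

        ' One GetDirectories call lists only the folders directly inside rootPath.
        For Each folderPath As String In Directory.GetDirectories(rootPath)
            Dim score As Single = GetSimilarity(New DirectoryInfo(folderPath).Name, targetName)
            ' Keep a folder only if it beats the best score seen so far.
            If score > bestScore Then
                bestScore = score
                bestPath = folderPath
            End If
        Next

        Return bestPath
    End Function

    ' Similarity in [0, 1]: 1 minus the edit distance divided by the longer length.
    Public Function GetSimilarity(a As String, b As String) As Single
        Dim maxLen As Integer = Math.Max(a.Length, b.Length)
        If maxLen = 0 Then Return 1.0F
        Return CSng(1.0 - ComputeDistance(a, b) / maxLen)
    End Function

    ' Levenshtein edit distance (case-sensitive, as in the question).
    Private Function ComputeDistance(s As String, t As String) As Integer
        Dim n As Integer = s.Length
        Dim m As Integer = t.Length
        If n = 0 Then Return m
        If m = 0 Then Return n

        Dim distance(n, m) As Integer
        For i As Integer = 0 To n
            distance(i, 0) = i
        Next
        For j As Integer = 0 To m
            distance(0, j) = j
        Next

        For i As Integer = 1 To n
            For j As Integer = 1 To m
                Dim cost As Integer = If(s(i - 1) = t(j - 1), 0, 1)
                distance(i, j) = Math.Min(Math.Min(distance(i - 1, j) + 1, distance(i, j - 1) + 1),
                                          distance(i - 1, j - 1) + cost)
            Next
        Next
        Return distance(n, m)
    End Function

End Module
```

With the question's variables, a call might look like `Dim sFDPath As String = FindBestFolder(Path, "$Fabrication Data", 0.8F)`, followed by a check for Nothing before the result is used. If two folders tie on score, this sketch keeps the first one returned by GetDirectories.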
