在 Excel VBA 项目中匹配相似但不完全相同的文本字符串

声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow 原文地址: http://stackoverflow.com/questions/13291313/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me): StackOverFlow

提示:将鼠标放在中文语句上可以显示对应的英文。显示中英文
时间:2020-09-08 14:23:19  来源:igfitidea点击:

Matching similar but not exact text strings in Excel VBA projects

stringexcelvbamergematch

提问by Ampi Severe

Okay, I have been trying to find a solution for this and I just don't seem to be able. I can't even break down the problem properly. This is the idea.

好的,我一直在尝试为此找到解决方案,但似乎无法解决。我什至无法正确分解问题。这就是想法。

I have two sheets with many rows (one with 800 and the other with 300,000). Each row contains a Name column and then several columns that contain information about this Name. Each sheet has different kinds of information.

我有两张多行的工作表(一张有 800 行,另一张有 300,000 行)。每行包含一个 Name 列,然后包含几个包含有关此 Name 的信息的列。每张纸都有不同种类的信息。

I want to consolidate these two sheets into a master sheet based on this Name column they both have, so the consolidate function is perfect for this. Now the problem is that the names don't match perfectly.

我想根据它们都有的 Name 列将这两个工作表合并到一个主工作表中,因此合并功能非常适合于此。现在的问题是名称不完全匹配。

For example Sheet1 contains

例如 Sheet1 包含

"Company B.V.", "Info #1"
"Company Total", "Info #2"
"Company Ltd", "Info #3"

“Company BV”、“Info #1”、
“Company Total”、“Info #2”、
“Company Ltd”、“Info #3”

and sheet 2 contains

和表 2 包含

"Company and Co.", "Info #4"
"Company and Co", "Info #5"

“公司和公司”,“信息#4”
“公司和公司”,“信息#5”

Sheet 1 contains all the names that are going to be used (around a 100 but in different forms as stated above) and sheet 2 contains all these 100 in multiple rows plus names that aren't in the 100 list and therefore I don't care about.

工作表 1 包含将要使用的所有名称(大约 100 个,但形式不同,如上所述),工作表 2 包含多行中的所有这 100 个名称以及不在 100 个列表中的名称,因此我没有关心。

How would I make a VBA code project where the end result would be something like this, Master sheet:

我将如何制作一个最终结果是这样的 VBA 代码项目,主表:

"Company", "Info #1", "Info #2", "Info #3", "Info #4", "Info #5"

“公司”、“信息#1”、“信息#2”、“信息#3”、“信息#4”、“信息#5”

for every single "Company" (the 100 list) in there??

对于那里的每个“公司”(100 家名单)??

I do hope there is a solution for this. I'm pretty new to VBA projects, but I have done some minimal coding before.

我确实希望有解决方案。我对 VBA 项目很陌生,但我以前做过一些最少的编码。

回答by Robert Ilbrink

I would place the macro in your PERSONAL section, this way the macro is available in all worksheets. Do this by recording a dummy macro and select to store it in Personal Macro workbook. Now you can manually add new macro's and functions in this personal workbook.

我会将宏放在您的个人部分,这样宏在所有工作表中都可用。通过录制虚拟宏并选择将其存储在个人宏工作簿中来执行此操作。现在您可以在这个个人工作簿中手动添加新的宏和函数。

I just tried this one (don't know the original source) and it works fine.

我刚试过这个(不知道原始来源),它工作正常。

The formula looks like this: =PERSONAL.XLSB!FuzzyFind(A1,B$1:B$20)

公式如下所示:=PERSONAL.XLSB!FuzzyFind(A1,B$1:B$20)

The code is here:

代码在这里:

Function FuzzyFind(lookup_value As String, tbl_array As Range) As String
Dim i As Integer, str As String, Value As String
Dim a As Integer, b As Integer, cell As Variant
For Each cell In tbl_array
  str = cell
  For i = 1 To Len(lookup_value)
    If InStr(cell, Mid(lookup_value, i, 1)) > 0 Then
      a = a + 1
      cell = Mid(cell, 1, InStr(cell, Mid(lookup_value, i, 1)) - 1) & Mid(cell, InStr(cell, Mid(lookup_value, i, 1)) + 1, 9999)
    End If
  Next i
  a = a - Len(cell)
  If a > b Then
    b = a
    Value = str
  End If
  a = 0
Next cell
FuzzyFind = Value
End Function

回答by Garima SP

I used Robert solution and it works fine for me. I am posting whole solution for people who are new for excel but knows coding:

我使用了 Robert 解决方案,对我来说效果很好。我正在为不熟悉 excel 但知道编码的人发布整个解决方案:

Though this thread is old but I took some code from another threads and tried and looks like solution is giving approx match. Here I am trying to match one column of sheet1 with one column of sheet2:

虽然这个线程很旧,但我从另一个线程中获取了一些代码并尝试过,看起来解决方案提供了近似匹配。在这里,我试图将 sheet1 的一列与 sheet2 的一列匹配:

  1. add command button in excel
  2. put following code and click/run button and function gives you result in selected column
  1. 在excel中添加命令按钮
  2. 放置以下代码并单击/运行按钮和函数为您提供所选列的结果
 Private Sub CommandButton21_Click()
     Dim ws As Worksheet
     Dim LRow As Long, i As Long, lval As String


   '~~> Change this to the relevant worsheet
    Set ws = ThisWorkbook.Sheets("Sheet1")

With ws
    '~~> Find Last Row in Col G which has data
    LRow = .Range("D" & .Rows.Count).End(xlUp).Row

    If LRow = 1 Then
        MsgBox "No data in column D"
    Else
        For i = 2 To LRow


             lval = "D"
            .Range("G" & i).Value = FuzzyFind(lval & i, .Range("PWC"))
        Next i
    End If
    End With

    End Sub


    Function FuzzyFind(lookup_value As String, tbl_array As Range) As String
    Dim i As Integer, str As String, Value As String
    Dim a As Integer, b As Integer, cell As Variant

    For Each cell In tbl_array
     str = cell
     For i = 1 To Len(lookup_value)
      If InStr(cell, Mid(lookup_value, i, 1)) > 0 Then
     a = a + 1
     cell = Mid(cell, 1, InStr(cell, Mid(lookup_value, i, 1)) - 1) & Mid   (cell, InStr(cell, Mid(lookup_value, i, 1)) + 1, 9999)
    End If
     Next i
     a = a - Len(cell)
     If a > b Then
       b = a
       Value = str
    End If
       a = 0
    Next cell
      If Value <> "" Then
         FuzzyFind = Value
      Else
         FuzzyFind = "None"
      End If
End Function

回答by Robert Ilbrink

You can Google Excel UDF Fuzzy lookup or Levensthein distance. There are some UDF's floating around and Microsoft does have a Fuzzy lookup/match add-on as well (when I used it, it was crash prone and not intuitive).

您可以使用 Google Excel UDF Fuzzy 查找或 Levensthein 距离。有一些 UDF 浮动,微软也有一个模糊查找/匹配插件(当我使用它时,它容易崩溃并且不直观)。

回答by Dick Kusleika

Take a look at the functions on this DDoE post. You could generate a longest common sequence string and compare the length to the original string. Feed it some known matches and some close non-matches and see if you can see a clear dividing line between them.

查看此 DDoE 帖子中的功能。您可以生成最长的公共序列字符串并将长度与原始字符串进行比较。给它一些已知的匹配和一些接近的非匹配,看看你是否能看到它们之间的清晰分界线。

These functions are used for diffing, not finding close matches, but they may work for you.

这些函数用于比较,而不是查找接近的匹配项,但它们可能对您有用。

回答by RIck_R

Not exactly on point but similar, and people dealing with myissue are likely to find thispage when searching.

不完全正确但相似,处理我的问题的人可能会在搜索时找到页面。

Task:A list of patients who have been in car wrecks, including street addresses. Find related accounts based on same street address. The list will be a maximum of maybe 120 records--so partialmanual review is realistic.

任务:经历过车祸的患者列表,包括街道地址。根据相同的街道地址查找相关帐户。该列表最多可能有 120 条记录——因此部分人工是现实的。

Problem:Street addresses are similar but not identical, e.g. 123 JONES LANE and 123 JONES LN or 72 MAIN STREET #32 and 72 MAIN STREET # 32.

问题:街道地址相似但不完全相同,例如 123 JONES LANE 和 123 JONES LN 或 72 MAIN STREET #32 和 72 MAIN STREET #32。

Partof the solution is to compare only the street numbers. With a list that size it's unusual to have two different addresses with the same street number (e.g., 123 JONES LANE and 123 MAIN STREET).

部分解决方案是仅比较街道号码。对于如此大的列表,具有相同街道编号的两个不同地址是不寻常的(例如,123 JONES LANE 和 123 MAIN STREET)。

Caution:You can't use VAL() to pull the street number. Try it with 167 E 13 ST. VBA sees that as 167^13 and will crash if you are outputting to Street_Num As Integer. So you have to use a loop to pull the digits into a new string and stop at the first non-digit character.

注意:您不能使用 VAL() 来提取街道号码。试试 167 E 13 ST。VBA 将其视为 167^13,如果您输出到 Street_Num As Integer,则会崩溃。因此,您必须使用循环将数字拉入新字符串并在第一个非数字字符处停止。